URL to Cloudstorage file on sdk - python

I am experimenting with Google Cloud Storage on Appengine. I installed the new cloudstorage python api code and have everything working well. I deployed my code, and it is also working well.
My ACL is (correctly) set to public-read and I can view my added files on http://commondatastorage.googleapis.com// just fine when calling cloudstorage on the appspot.
However, is there a similar path on the local sdk?
I am using this to upload images and create thumbnails. However, locally, I don't know how to serve up the URL to the thumbnail. I do see in the blobstore viewer that the blobs are created, but no filename is displayed in the blobinfo, and the URL uses the blobstore key rather than the filename I gave to the gs create call.

Yes, on the local development server you can use the following path:
/_ah/gcs/[bucket name]/[filename]
If you used the default bucket, it's called app_default_bucket.
I've tested it with images and it works well. With mp4 videos it seems to run into an error though.
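A minimal sketch of how the two URL forms could be chosen at runtime. The helper name public_gcs_url is hypothetical. It assumes the Python 2.7 runtime's SERVER_SOFTWARE environment variable starts with "Development" on the local dev server, and it uses the storage.googleapis.com host for public-read objects in production.

```python
import os


def public_gcs_url(bucket, filename, server_software=None):
    """Return a URL for a public-read object, locally or in production."""
    if server_software is None:
        server_software = os.environ.get("SERVER_SOFTWARE", "")
    if server_software.startswith("Development"):
        # Local dev server: relative path served by the SDK's GCS stub.
        return "/_ah/gcs/%s/%s" % (bucket, filename)
    # Deployed app: public-read objects are reachable on the storage host.
    return "https://storage.googleapis.com/%s/%s" % (bucket, filename)
```

For example, public_gcs_url("app_default_bucket", "thumb.png") gives a path you can put straight into an img tag while running locally.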

Related

Django + AWS: files not syncing to S3

I inherited a CMS system that was implemented using Django Suit. One of the forms is supposed to upload files to S3 but it's not happening (the files upload to the webserver - EC2, but not to S3).
What I determined so far:
The EC2 instance has full access to S3 (via a role)
The user set up in Django's config file has full access to S3
There is a CloudFront configured to point to the bucket, and it works when files are accessed via a URL. The configuration is working there
The previous developers used the following for handling the upload of files:
DEFAULT_FILE_STORAGE = 'fallback_storage.storage.FallbackStorage'
FALLBACK_STORAGES = (
'django.core.files.storage.FileSystemStorage',
'main.custom_storages.MediaStorage'
)
I looked into these 3 classes to see if I'm missing a configuration but everything looks good.
I'm not familiar with this way of syncing files between a web server and S3, so I may be missing something very obvious. Is there a cron job that needs to run in the background?
I found a blog post explaining how to use Django to upload files to S3 using FallbackStorage. That tutorial uses docker. In this case, docker is not used at all.
I'm lost at this point. There are thousands of classes spread across dozens of python libraries. It will take forever to do an exhaustive analysis of the code.
You should probably look at the FallbackStorage class. Typically, for file uploads to S3, the storage class would be S3BotoStorage (from django-storages), with AWS_STORAGE_BUCKET_NAME, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY set properly.
stored = models.FileField(storage=S3BotoStorage(bucket=AWS_BUCKET), upload_to='blog-uploads')
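For comparison, here is a minimal settings sketch that sends uploads straight to S3 with django-storages instead of going through FallbackStorage. It assumes the django-storages and boto3 packages are installed and an older Django that still reads DEFAULT_FILE_STORAGE (as the question's settings do). The bucket name and the MEDIA_BUCKET environment variable are placeholders. With an EC2 instance role, boto3 can find credentials on its own, so no access keys appear here.

```python
import os

# Assumption: django-storages' boto3-based S3 backend is installed.
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"

# Placeholder bucket name, overridable from the environment.
AWS_STORAGE_BUCKET_NAME = os.environ.get("MEDIA_BUCKET", "example-media-bucket")

# Let the bucket's own policy decide object visibility.
AWS_DEFAULT_ACL = None
```

If files reach S3 with this configuration but not with FallbackStorage, the problem is likely in how the fallback chain picks its backend rather than in permissions.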

How to upload a video file directly to Cloud Storage from a Flask form without BlobStore?

I am using Flask, and I have a form on my web app's index page, which requires users to upload MP4 videos. I expect my users to upload 30min long videos, so the video sizes are likely going to be in the hundreds of megabytes. The issue now is that I intend to deploy this Flask application to Google App Engine, and apparently I cannot work with any static file above 32MB. Somehow, when I try to upload any video in the deployed version that is above 32MB, I get a Request Too Large error.
I see that the BlobStore Python API used to be a recommended solution to work with really large files on the server in the past. But that was for Python 2.7: https://cloud.google.com/appengine/docs/standard/python/blobstore/
I'm using Python 3.7, and Google now recommends that files get uploaded directly to Cloud Storage, and I am not exactly sure how to do that.
Below is a snippet showing how I'm currently storing my users' uploaded videos through the form into Cloud Storage. Unfortunately, I'm still restricted from uploading large files because I get error messages. So again, my question is: How can I make my users upload their files directly to Cloud Storage in a way that won't let the server timeout or give me a Request Too Large error?
form = SessionForm()
blob_url = ""
if form.validate_on_submit():
    f = form.video.data
    video_string = f.read()
    filename = secure_filename(f.filename)
    try:
        # The custom function upload_session_video() uploads the file to a Cloud Storage bucket
        # It uses the Storage API's upload_from_string() method.
        blob_url = upload_session_video(video_string, filename)
    except FileNotFoundError as error:
        flash(error, 'alert')
    # Create the Cloud Storage bucket (same name as the video file)
    user_bucket = create_bucket(form.patient_name.data.lower())
You cannot upload files larger than 32 MB to Cloud Storage through Google App Engine because of its request size limit. However, you can work around that by uploading to Cloud Storage with resumable uploads; in Python, use the "google-resumable-media" library. Resumable uploads are useful when:
the size of the resource is not known (i.e. it is generated on the fly)
requests must be short-lived
the client has request size limitations
the resource is too large to fit into memory
Example code is included in the library's documentation.
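A rough sketch of a chunked upload with google-resumable-media follows. It assumes the google-auth, requests and google-resumable-media packages are installed and that application default credentials are available. The function names and the 1 MiB chunk size are illustrative choices, not part of the original answer.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB; resumable chunks must be a multiple of 256 KiB


def resumable_upload_url(bucket):
    """Build the JSON API endpoint that starts a resumable upload."""
    return (
        "https://storage.googleapis.com/upload/storage/v1/b/%s/o"
        "?uploadType=resumable" % bucket
    )


def upload_in_chunks(stream, bucket, blob_name, content_type="video/mp4"):
    """Send a seekable file-like object to Cloud Storage chunk by chunk."""
    # Imported lazily so the helpers above work without the libraries installed.
    import google.auth
    from google.auth.transport.requests import AuthorizedSession
    from google.resumable_media.requests import ResumableUpload

    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/devstorage.read_write"]
    )
    transport = AuthorizedSession(credentials)
    upload = ResumableUpload(resumable_upload_url(bucket), CHUNK_SIZE)
    # The object name goes in the metadata; the stream is read lazily.
    upload.initiate(transport, stream, {"name": blob_name}, content_type)
    while not upload.finished:
        upload.transmit_next_chunk(transport)
    return blob_name
```

As written, this chunks the transfer between the caller and Cloud Storage, so whether it avoids the request limit depends on where stream comes from.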

How to upload HTML file to an Azure Web App using Python?

I have a Python application which creates a HTML file which I then want to upload to an Azure Web Application.
What is the best way to do this?
I originally tried doing it over FTP and then switched to pushing with Git. Neither really felt right. How should I be doing this?
UPDATE
I have this 99% working. I'm using a Storage Account to host a static site (which feels like the right way to do this).
This is how I am uploading:
blob_service_client = BlobServiceClient.from_connection_string(az_string)
# Create a blob client using the local file name as the name for the blob
blob_client = blob_service_client.get_blob_client(container=container_name, blob=local_file_name)
print("\nUploading to Azure Storage as blob:\n\t" + local_file_name)
# Upload the created file
with open('populated.html', "rb") as data:
    blob_client.upload_blob(data)
The only problem that I have now, is that the file is downloading instead of opening in the browser. I think I need to set the content type somewhere.
Update 2
Working now, I added:
my_content_settings = ContentSettings(content_type='text/html')
test = blob_client.upload_blob(data, overwrite=True, content_settings=my_content_settings)
Cheers,
Mick
The best way to do this is up to you.
Generally, there are two ways to upload an HTML file to an Azure Web App for Windows:
Follow the Kudu wiki page "Accessing files via ftp" to upload the file via FTP.
Follow the VFS, Zip and Zip Deployment sections of the Kudu wiki page "REST API" to call the related PUT REST API and upload the file from an HTTP client.
However, as I understand your scenario, neither of these is simple. So I recommend using the Static website feature of Azure Blob Storage (on a general-purpose v2 storage account) to host the static HTML file generated by your Python application, and uploading files with the Azure Storage SDK for Python. I think that is simple enough for you, and you can even bind a custom domain to the static website's default host name via a DNS CNAME record.
The steps are below.
Refer to the official document "Host a static website in Azure Storage" to create a general-purpose v2 storage account and enable the Static website feature.
Refer to the other official document "Quickstart: Azure Blob storage client library v12 for Python" to write the upload code in your current Python application. The container named $web is created by default for hosting the static website; you just need to upload the files to it and then open the site's primary endpoint to see them.
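The upload step can be sketched as follows, assuming the azure-storage-blob v12 package and a storage account connection string. The helper names are hypothetical. The content type is guessed from the file extension so that HTML opens in the browser instead of downloading.

```python
import mimetypes


def guess_content_type(path, default="application/octet-stream"):
    """Guess a MIME type from the file extension, with a binary fallback."""
    content_type, _ = mimetypes.guess_type(path)
    return content_type or default


def upload_static_file(connection_string, local_path, blob_name):
    """Upload one file to the $web container of a static website."""
    # Imported lazily so guess_content_type works without the Azure SDK.
    from azure.storage.blob import BlobServiceClient, ContentSettings

    service = BlobServiceClient.from_connection_string(connection_string)
    blob_client = service.get_blob_client(container="$web", blob=blob_name)
    settings = ContentSettings(content_type=guess_content_type(local_path))
    with open(local_path, "rb") as data:
        # overwrite=True replaces the previous version of the page.
        blob_client.upload_blob(data, overwrite=True, content_settings=settings)
```

Calling upload_static_file(az_string, "populated.html", "index.html") would publish the generated page as the site's index, provided index.html is configured as the index document.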

ferris2-framework, python, google app engine, cloud storage -- uploading an image and making it public?

So the google ferris2 framework seems to exclusively use the blobstore api for the Upload component, making me question whether it's possible to make images uploaded to cloud storage public without having to write my own upload method and abandoning the use of the Upload component altogether, which also seems to create compatibility issues when using the cloud storage client library (python).
Backstory / context
Using: Google App Engine, Python, Cloud Storage client library
Requirements
We require that neither the blob information nor the file be stored in the model. We want a public cloud serving URL on the model and that is all. This seems to prevent us from using the normal Ferris approach for uploading to Cloud Storage.
Things I already know / road blocks
One of the big roadblocks is dealing with Ferris using cgi / the blobstore API for field storage on the form. This seems to cause problems because so far it hasn't allowed data to be sent to Cloud Storage through the Google Cloud Storage Python client.
Things we know about the google cloud storage python client and cgi:
To write data to Cloud Storage from our server, Cloud Storage needs to be called with cloudstorage.open("/bucket/object", "w", ...) (a Cloud Storage library method). However, it appears so far that a cgi.FieldStorage is returned from the post for the wtforms.fields.FileField() (as shown by a simple "print image" statement) before the data is applied to the model; after it is applied to the model, it is a blobstore instance.
I would like verification on this:
After a lot of research and testing, it seems that because Ferris is limited to the blobstore API for the Upload component, using the blobstore API and blob keys to handle uploads is basically unavoidable without creating a second upload function just for the Cloud Storage call. Blob instances seem not to be compatible with the Cloud Storage client library, and it seems there is no way to get anything but metadata from blob files (without actually making a call to Cloud Storage to get the original file). However, it appears that this will not require storing extra data on the server. Furthermore, I believe it may be possible to get around the public link issue by setting the entire bucket to have read permissions.
Clarifying Questions:
1. To make uploaded images available to the public via our application (any user, not just an authenticated user), will I have to use the cloudstorage Python client library, or is there a way to do this with the blobstore API?
2. Is there a way to get the original file from a blob key (on save with the add action method) without actually having to make a call to Cloud Storage first, so that the file can be uploaded using that library?
3. If not, is there a way to grab the file from the cgi.FieldStorage, then send it to Cloud Storage with the Python client library? It seems that cgi.FieldStorage.value is just metadata and not the file, and the same goes for cgi.FieldStorage.file.read().
1) You cannot use the GAE GCS client to update an ACL.
2) You can use the GCS json API after the blobstore upload to GCS and change the ACL to make it public. You do not have to upload again.
See this example code which inserts an acl.
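3) Or use cgi.FieldStorage to read the data (< 32 MB) and write it to GCS using the GAE GCS client.
import mimetypes

import cloudstorage as gcs
import webapp2
from google.appengine.api import app_identity

class UploadHandler(webapp2.RequestHandler):
    def post(self):
        file_data = self.request.get("file", default_value=None)
        filename = self.request.POST["file"].filename
        # Guess the MIME type from the uploaded file's name.
        content_type = mimetypes.guess_type(filename)[0]
        # GCS client paths look like /bucket/object; the app's default bucket is assumed here.
        gcs_path = '/%s/%s' % (app_identity.get_default_gcs_bucket_name(), filename)
        with gcs.open(gcs_path, 'w', content_type=content_type or b'binary/octet-stream',
                      options={b'x-goog-acl': b'public-read'}) as f:
            f.write(file_data)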
A third method: use a form post upload with a GCS signed url and a policy document to control the upload.
And you can always use a public download handler, which reads files from the blobstore or GCS.
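Option 2 above (changing the ACL through the JSON API after the blobstore upload) could look roughly like this. It assumes google-api-python-client is installed and that default credentials with access to the bucket are available. The function names are hypothetical.

```python
def public_read_acl():
    """Request body that grants read access to anyone."""
    return {"entity": "allUsers", "role": "READER"}


def make_object_public(bucket, object_name):
    """Insert a public-read ACL entry on an existing GCS object."""
    # Imported lazily; assumes application default credentials are set up.
    from googleapiclient import discovery

    service = discovery.build("storage", "v1")
    request = service.objectAccessControls().insert(
        bucket=bucket, object=object_name, body=public_read_acl()
    )
    return request.execute()
```

The object itself is not uploaded again; only its access control list changes.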
You can now specify the ACL when uploading a file from App Engine to Cloud Storage. Not sure how long it's been in place, just wanted to share:
filename = '/' + bucket_name + '/Leads_' + newUNID() + '.csv'
write_retry_params = gcs.RetryParams(backoff_factor=1.1)
gcs_file = gcs.open(filename,
                    'w',
                    content_type='text/csv',
                    options={'x-goog-acl': 'public-read'},
                    retry_params=write_retry_params)
docs: https://cloud.google.com/storage/docs/xml-api/reference-headers#standard

Upload a large blob from Appengine blobstore to Google Drive using Python using Drive SDK

The Python Drive API requires a "local file" to perform a resumable file upload to Google Drive. How can this be accomplished on Google App Engine, which only has blobs and no access to a local file system?
Under the old Documents List API (now deprecated) you could upload files from the App Engine blobstore to Google Drive using the code below:
CHUNK_SIZE = 524288
uploader = gdata.client.ResumableUploader(
    client, blob_info.open(), blob_info.content_type, blob_info.size,
    chunk_size=CHUNK_SIZE, desired_class=gdata.docs.data.DocsEntry)
The key part is using blob_info.open() rather than providing a reference to a local file.
How can we accomplish the same using the new Drive API?
Note the files are fairly big so a resumable upload is required, also I know this can be accomplished in Java but I am looking for a Python solution.
Many thanks,
Ian.
It looks like you are using the older GData client library and the Documents List API. If you use the new Drive SDK and the Google APIs Python client library, you can use the MediaIoBaseUpload class to create a media upload object from memory instead of from a file.
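A minimal sketch of that approach, assuming google-api-python-client and an already authorized Drive v3 service object passed in as drive_service (with Drive v2 the call would be files().insert with a "title" field instead). The function name is hypothetical. blob_info is the blobstore BlobInfo from the question.

```python
CHUNK_SIZE = 524288  # 512 KiB, as in the question; a multiple of 256 KiB


def upload_blob_to_drive(drive_service, blob_info):
    """Stream a blobstore blob to Drive with a resumable upload."""
    # Imported lazily; requires google-api-python-client.
    from googleapiclient.http import MediaIoBaseUpload

    media = MediaIoBaseUpload(
        blob_info.open(),  # file-like BlobReader, so no local file is needed
        mimetype=blob_info.content_type,
        chunksize=CHUNK_SIZE,
        resumable=True,
    )
    request = drive_service.files().create(
        body={"name": blob_info.filename}, media_body=media
    )
    response = None
    while response is None:
        # next_chunk() returns (status, response); response stays None until done.
        status, response = request.next_chunk()
    return response
```

As with the old ResumableUploader code, the key part is passing blob_info.open() rather than a path on disk.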
