Batch upload to Google Cloud Storage using Python

I am trying to upload multiple files at once using the Google Cloud Storage Python API, and am having trouble with it. Below is my code:
from google.cloud import storage

client = storage.Client.from_service_account_json("path_to_json")
bucket = client.get_bucket(bucket_name)
with client.batch():
    for i in range(10):
        try:
            blob = bucket.blob("my_blob")
            blob.upload_from_filename("path_to_file", content_type="image/jpeg")
        except Exception as e:
            raise e
However, this tells me that there are no deferred requests to make. I have tried doing it without client.batch(), and it works but is too slow. I was wondering if anyone has encountered this.
Thanks

According to this issue, the back-end API doesn't support batching "media" operations (uploads and downloads); client.batch() only defers JSON/metadata requests, which appears to be why your uploads leave it with no deferred requests to send.
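Since media requests can't be batched, a common workaround is to run the individual uploads in parallel instead. Below is a minimal sketch using a thread pool; bucket_name, the credentials path and the file_paths list are placeholders you would replace with your own values.
from concurrent.futures import ThreadPoolExecutor

from google.cloud import storage

client = storage.Client.from_service_account_json("path_to_json")
bucket = client.get_bucket(bucket_name)

def upload_one(path):
    # Give each file its own blob name and upload it; calls run in parallel threads.
    blob = bucket.blob(path.rsplit("/", 1)[-1])
    blob.upload_from_filename(path, content_type="image/jpeg")
    return path

file_paths = ["path_to_file_%d.jpg" % i for i in range(10)]
with ThreadPoolExecutor(max_workers=8) as pool:
    for done in pool.map(upload_one, file_paths):
        print("uploaded", done)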

Related

How to upload a video file directly to Cloud Storage from a Flask form without BlobStore?

I am using Flask, and I have a form on my web app's index page which requires users to upload MP4 videos. I expect my users to upload 30-minute-long videos, so the file sizes are likely going to be in the hundreds of megabytes. The issue is that I intend to deploy this Flask application to Google App Engine, and apparently I cannot handle any uploaded file above 32MB. When I try to upload any video larger than 32MB in the deployed version, I get a Request Too Large error.
I see that the BlobStore Python API used to be a recommended solution to work with really large files on the server in the past. But that was for Python 2.7: https://cloud.google.com/appengine/docs/standard/python/blobstore/
I'm using Python 3.7, and Google now recommends that files get uploaded directly to Cloud Storage, and I am not exactly sure how to do that.
Below is a snippet showing how I'm currently storing my users' uploaded videos through the form into Cloud Storage. Unfortunately, I'm still restricted from uploading large files because I get error messages. So again, my question is: How can I make my users upload their files directly to Cloud Storage in a way that won't let the server timeout or give me a Request Too Large error?
form = SessionForm()
blob_url = ""
if form.validate_on_submit():
    f = form.video.data
    video_string = f.read()
    filename = secure_filename(f.filename)
    try:
        # The custom function upload_session_video() uploads the file to a Cloud Storage bucket.
        # It uses the Storage API's upload_from_string() method.
        blob_url = upload_session_video(video_string, filename)
    except FileNotFoundError as error:
        flash(error, 'alert')
    # Create the Cloud Storage bucket (same name as the video file)
    user_bucket = create_bucket(form.patient_name.data.lower())
You cannot upload files larger than 32MB through Google App Engine due to a request size limitation. However, you can bypass that by uploading to Cloud Storage with resumable uploads; in Python, use the "google-resumable-media" package. A resumable upload is recommended when:
- the size of the resource is not known (i.e. it is generated on the fly)
- requests must be short-lived
- the client has request size limitations
- the resource is too large to fit into memory
Example code is included here.
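As a rough illustration, here is a minimal sketch of a chunked resumable upload with google-resumable-media; bucket_name, blob_name and video_bytes are placeholders for your own bucket, object name and file contents, and the credentials come from the default environment.
import io

import google.auth
from google.auth.transport.requests import AuthorizedSession
from google.resumable_media.requests import ResumableUpload

credentials, _ = google.auth.default()
transport = AuthorizedSession(credentials)

# Chunk size must be a multiple of 256 KB.
chunk_size = 1024 * 1024
upload_url = (
    "https://storage.googleapis.com/upload/storage/v1/"
    "b/{bucket}/o?uploadType=resumable".format(bucket=bucket_name)
)

upload = ResumableUpload(upload_url, chunk_size)
stream = io.BytesIO(video_bytes)
upload.initiate(transport, stream, {"name": blob_name}, "video/mp4")

# Send the file one chunk at a time, so each individual request stays small.
while not upload.finished:
    upload.transmit_next_chunk(transport)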

azure-storage-python - Exception handling with create_blob_from_path

If for some reason the create_blob_from_path method runs into issues, what would be the proper way to handle the exception? As in, when you are writing your 'except' block, it would be except ??what?? as e:? I want to be able to handle that exception when a blob upload fails so it can email me via sendmail, letting me know that a backup being offloaded to Azure Storage has failed.
Thank you!
I looked for & searched some exception names on the source codes of Azure Python Storage SDK, there are not any exceptions defined about failure of uploading blob. However, per my experience, Azure Storage SDK for Python wrapped REST APIs of Storage Services via the python package requests to do related operations. So any failure will cause the requests exceptions, you can try to catch the requests root exception requests.exceptions.RequestException as below to do the next action like sendmail.
import requests

try:
    blobservice.create_blob_from_path(container_name, blob_name, file_path)
except requests.exceptions.RequestException:
    sendmail(...)
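If you want to be defensive about the SDK's own error types as well, a hedged variant is to also catch azure.common.AzureException (a base exception class the SDK uses in several versions) and send the notification from the handler; send_failure_mail and the SMTP details below are illustrative placeholders, not part of the SDK.
import smtplib
from email.message import EmailMessage

import requests
from azure.common import AzureException

def send_failure_mail(error):
    # Illustrative helper: mail the error text to yourself via a local SMTP server.
    msg = EmailMessage()
    msg["Subject"] = "Azure blob upload failed"
    msg["From"] = "backup@example.com"
    msg["To"] = "me@example.com"
    msg.set_content(str(error))
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)

try:
    blobservice.create_blob_from_path(container_name, blob_name, file_path)
except (requests.exceptions.RequestException, AzureException) as e:
    send_failure_mail(e)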

Why do I get a 500 internal server error from the Google Drive API when adding users to a Google Sheet in Python?

I am writing a Python script using the Google Sheets API. It reads data, writes it to a new file, shares that file with a specified email address, and returns the ID of the new file.
def read_sheet(self, spreadsheetId):
    try:
        result = self.service.spreadsheets().get(
            spreadsheetId=spreadsheetId,
            includeGridData=True,
            fields='namedRanges,properties,sheets').execute()
        return result
    except apiclient.errors.HttpError as e:
        traceback.print_exc()
        print(e)
        sys.exit(1)

def create_spreadsheet(self, data, email):
    try:
        newid = self.service.spreadsheets().create(body=data, fields='spreadsheetId').execute()
        newid = newid.get('spreadsheetId')
        self.give_permissions(email, newid)
        return newid
    except apiclient.errors.HttpError as e:
        traceback.print_exc()
        print(e)
        sys.exit(1)
This code works very well, but not with 100% accuracy. Sometimes I get a 500 Internal Server Error, but the file is created in my account. I found a similar Stack Overflow question (Getting 500 Error when using Google Drive API to update permissions), but it didn't help. I want to know the exact reason for this. Can anyone help?
EDIT1:
This is the exact error message
https://www.googleapis.com/drive/v3/files/349hsadfhSindfSIins-rasdfisadfOsa3OQmE/permissions?sendNotificationEmail=true&alt=json&transferOwnership=false
returned "Internal Error. User message: "An internal error has occurred which prevented the sharing of these item(s): Template"">
As hinted at above in DaimTo's comment, the error is due to Google Drive still processing the create request while you're trying to add the permission that shares the (new) file. Remember, when you add a file to Drive, Google's servers are still working on the file creation as well as making it accessible globally. Once that activity settles down, adding additional users to the document shouldn't be a problem.
This Drive API documentation page describes the (500) error you received as well as the recommended course of action, which is to implement exponential backoff: pause a bit before trying again, and extend that delay each time you get the same error. He also pointed to another SO Q&A you can look at, and another resource is this descriptive blog post. If you don't want to implement it yourself, you can try the retrying or backoff packages.
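For illustration, here is a minimal sketch of such a backoff wrapper, assuming the same apiclient import as your code; execute_with_backoff is a hypothetical helper name, and you would call it with the permissions request instead of calling .execute() directly.
import random
import time

import apiclient

def execute_with_backoff(request, max_retries=5):
    # Retry a googleapiclient request, backing off exponentially on 5xx errors.
    for attempt in range(max_retries):
        try:
            return request.execute()
        except apiclient.errors.HttpError as e:
            if e.resp.status < 500 or attempt == max_retries - 1:
                raise
            time.sleep((2 ** attempt) + random.random())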
NOTE: You didn't show us all your code, but I changed the title of this question to more accurately reflect that you're using the Drive API for adding the permissions. While you've used the Sheets API to create the Sheet, realize that you can do all of this with the Drive API (and not use the Sheets API at all unless you're doing spreadsheet-oriented operations; the Drive API is for all file-related operations like sharing, copying, import/export, etc.).
Bottom line: you can create Sheets using either API, but if you're not doing anything else with the Sheets API, why bother making your app more complex? If you want to see how to create Sheets with both APIs, there's a short segment in my blog post that covers this... you'll see that they're nearly identical, but using the Drive API does require one more thing, the MIME type.
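As a quick hedged illustration of that last point, creating a Sheet with the Drive API is just a files().create call that specifies the Sheets MIME type; drive below is assumed to be an already-authorized Drive v3 service object.
# 'drive' is assumed to be built elsewhere, e.g. with
# googleapiclient.discovery.build('drive', 'v3', credentials=creds).
metadata = {
    'name': 'Template',
    'mimeType': 'application/vnd.google-apps.spreadsheet',
}
new_file = drive.files().create(body=metadata, fields='id').execute()
print(new_file['id'])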
If you want to learn more about both APIs, see this answer I gave to a related question that features additional learning resources I've created for both Drive and Sheets, most of which are Python-based.
I guess I'm late, but just in case: add a delay of a few seconds between the create request and the give-permissions one. For me, making the thread sleep for 10 seconds works. Try this:
import time

def create_spreadsheet(self, data, email):
    try:
        newid = self.service.spreadsheets().create(body=data, fields='spreadsheetId').execute()
        newid = newid.get('spreadsheetId')
        time.sleep(10)
        self.give_permissions(email, newid)
        return newid
    except apiclient.errors.HttpError as e:
        traceback.print_exc()
        print(e)
        sys.exit(1)

How to read HTTP response from Azure Python SDK

I'm trying to push a file (a Put Blob request) to the Azure CDN Blob storage using the Python SDK. It works with no problem; I just can't figure out how to read the header information in the response. According to the docs, it's supposed to send back a 201 status if it is successful.
http://msdn.microsoft.com/en-us/library/azure/dd179451.aspx
http://azure.microsoft.com/en-us/documentation/articles/storage-python-how-to-use-blob-storage/
from azure.storage import BlobService

blob_service = BlobService(account_name='accountnamehere', account_key='apikeyhere')
file_contents = open('path/to/image.jpg', 'rb').read()
blob_service.put_blob(CONTAINER, 'filename.jpg', file_contents, x_ms_blob_type='BlockBlob', x_ms_blob_content_type='image/jpeg')
Any help is greatly appreciated.
Thanks
You can't read the response code.
The source code for the SDK is available on GitHub, and in the current version the put_blob() function does not return anything.
Do you need to read it, though? If put_blob completes successfully, your code simply continues from the next statement. If it were to fail, the SDK will raise an exception which you can then catch.
You could verify your exception/error handling by using a wrong access key, for example.
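So rather than inspecting headers, a hedged sketch of the success/failure check just wraps the call; the broad except is deliberate because the exact exception type varies across versions of this legacy SDK.
from azure.storage import BlobService

blob_service = BlobService(account_name='accountnamehere', account_key='apikeyhere')

try:
    with open('path/to/image.jpg', 'rb') as f:
        blob_service.put_blob(CONTAINER, 'filename.jpg', f.read(),
                              x_ms_blob_type='BlockBlob',
                              x_ms_blob_content_type='image/jpeg')
    # Reaching this line without an exception means the blob was stored
    # (the service responded with 201).
except Exception as e:
    # Upload failed; handle or log the error here.
    print('put_blob failed:', e)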

Can I read from the AppEngine BlobStore using the remote api

I am trying to read (and subsequently save) blobs from the blobstore using the remote API. I get the error "No api proxy found for service "blobstore"" when I execute the read.
Here is the stub code:
for b in bs.BlobInfo.all().fetch(100):
    blob_reader = bs.BlobReader(str(b.key))
    file = blob_reader.read()
the error occurs on the line: file = blob_reader.read()
I am reading the file from my personal appspot via terminal with:
python tools/remote_api/blobstore_download.py --servername=myinstance.appspot.com --appid=myinstance
So, is reading from the blobstore possible via the remote API? Or is my code bad? Any suggestions?
We recently added blobstore support to remote_api. Make sure you're using the latest version of the SDK, and your error should go away.
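Once you're on a version of the SDK where remote_api supports the blobstore, a minimal sketch of the read-and-save loop could look like the following; the local filename scheme is just an illustrative choice.
from google.appengine.ext import blobstore as bs

# Fetch up to 100 blobs and write each one to a local file named after its key.
for b in bs.BlobInfo.all().fetch(100):
    blob_reader = bs.BlobReader(b.key())
    data = blob_reader.read()
    with open('blob_%s.bin' % b.key(), 'wb') as out:
        out.write(data)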
