python "azure-storage-blob" package upload creates empty files - python

I am using the "azure-storage-blob" package within fastAPI to upload a blob image to my Azure storage blob container. Aftera lot of trial and error I decided to just copy over a static file from my directory to the azure table storage. but everytime I upload the file it gets added as empty. If I write the file locally everything goes fine.
I am using the official documentation as decribed here:
https://pypi.org/project/azure-storage-blob/
I have the following code:
#app.post("/files/")
async def upload(incoming_file: UploadFile = File(...)):
fs = await incoming_file.read()
file_size = len(fs)
print(file_size)
if math.ceil(file_size / 1024) > 64:
raise HTTPException(400, detail="File must be smaller than 64kb.")
if incoming_file.content_type not in ["image/png", "image/jpeg"]:
raise HTTPException(400, detail="File type must either be JPEG or PNG.")
try:
blob = BlobClient.from_connection_string(conn_str=az_connection_string, container_name="app-store-logos",
blob_name="dockerLogo.png")
with open("./dockerLogo.png", "rb") as data:
blob.upload_blob(data)
except Exception as err :
return {"message": "There was an error uploading the file {0}".format(err)}
finally:
await incoming_file.close()
return {"message": f"Successfuly uploaded {incoming_file.filename}"}
When I upload the file to the storage container, the blob gets created but is empty.
If I change any filenames or storage names I do get an error, so the files exist and are in the right place, though it seems like the Azure storage SDK doesn't copy over the contents of the file.
If anyone has any pointers I would be grateful.
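One thing worth checking, as a minimal sketch rather than a confirmed fix: the endpoint already reads the upload into fs, and upload_blob accepts bytes as well as a file-like object, so the bytes that were just read can be uploaded directly instead of re-opening a file on disk (az_connection_string and the container name below are the ones from the code above):
    blob = BlobClient.from_connection_string(conn_str=az_connection_string,
                                             container_name="app-store-logos",
                                             blob_name=incoming_file.filename)
    # fs already holds the uploaded bytes from `await incoming_file.read()`;
    # overwrite=True replaces the blob if it already exists
    blob.upload_blob(fs, overwrite=True)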

Related

Error uploading file to google cloud storage

How should the files on my server be uploaded to google cloud storage?
The code I have tried is given below; however, it throws a type error, saying the expected type is not bytes, for:
blob.upload_from_file(file.file.read())
even though upload_from_file requires a binary type.
#app.post("/file/")
async def create_upload_file(files: List[UploadFile] = File(...)):
storage_client = storage.Client.from_service_account_json(path.json)
bucket_name = 'data'
try:
bucket = storage_client.create_bucket(bucket_name)
except Exception:
bucket = storage_client.get_bucket(bucket_name)
for file in files:
destination_file_name = f'{file.filename}'
new_data = models.Data(
path=destination_file_name
)
try:
blob = bucket.blob(destination_file_name)
blob.upload_from_file(file.file.read())
except Exception:
raise HTTPException(
status_code=500,
detail="File upload failed"
)
Option 1
As per the documentation, upload_from_file() supports a file-like object; hence, you could use the .file attribute of UploadFile (which represents a SpooledTemporaryFile instance). For example:
blob.upload_from_file(file.file)
Option 2
You could read the contents of the file and pass them to upload_from_string(), which supports data in bytes or string format. For instance:
blob.upload_from_string(file.file.read())
or, since you defined your endpoint with async def (see this answer for def vs async def):
contents = await file.read()
blob.upload_from_string(contents)
Option 3
For the sake of completeness, upload_from_filename() expects a filename which represents the path to the file. Hence, the No such file or directory error was thrown when you passed file.filename (as mentioned in your comment), as this is not a path to the file. To use that method (as a last resort), you should save the file contents to a NamedTemporaryFile, which "has a visible name in the file system" that "can be used to open the file", and once you are done with it, delete it. Example:
from tempfile import NamedTemporaryFile
import os

contents = file.file.read()
temp = NamedTemporaryFile(delete=False)
try:
    with temp as f:
        f.write(contents)
    blob.upload_from_filename(temp.name)
except Exception:
    return {"message": "There was an error uploading the file"}
finally:
    # temp.close()  # the `with` statement above takes care of closing the file
    os.remove(temp.name)
Note 1:
If you are uploading a rather large file to Google Cloud Storage that may require some time to completely upload, and have encountered a timeout error, please consider increasing the amount of time to wait for the server response, by changing the timeout value, which—as shown in upload_from_file() documentation, as well as all other methods described earlier—by default is set to timeout=60 seconds. To change that, use e.g., blob.upload_from_file(file.file, timeout=180), or you could also set timeout=None (meaning that it will wait until the connection is closed).
Note 2:
Since all the above methods from the google-cloud-storage package perform blocking I/O operations—as can be seen in the source code here, here and here—if you have decided to define your create_upload_file endpoint with async def instead of def (have a look at this answer for more details on def vs async def), you should rather run the "upload file" function in a separate thread to ensure that the main thread (where coroutines are run) does not get blocked. You can do that using Starlette's run_in_threadpool, which is also used by FastAPI internally (see here as well). For example:
await run_in_threadpool(blob.upload_from_file, file.file)
Alternatively, you can use asyncio's loop.run_in_executor, as described in this answer and demonstrated in this sample snippet too.
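Putting Note 2 together with Option 1, a minimal sketch of such an endpoint might look as follows (the service-account path is an illustrative placeholder; the bucket name mirrors the question's code):
    from fastapi import FastAPI, File, UploadFile
    from starlette.concurrency import run_in_threadpool
    from google.cloud import storage

    app = FastAPI()
    storage_client = storage.Client.from_service_account_json("service-account.json")  # placeholder path
    bucket = storage_client.get_bucket("data")

    @app.post("/file/")
    async def create_upload_file(file: UploadFile = File(...)):
        blob = bucket.blob(file.filename)
        # upload_from_file() performs blocking I/O, so run it in Starlette's threadpool
        await run_in_threadpool(blob.upload_from_file, file.file)
        return {"filename": file.filename}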
As for Option 3, where you need to open a NamedTemporaryFile and write the contents to it, you can do that using the aiofiles library, as demonstrated in Option 2 of this answer, that is, using:
async with aiofiles.tempfile.NamedTemporaryFile("wb", delete=False) as temp:
    contents = await file.read()
    await temp.write(contents)
    # ...
and again, run the "upload file" function in an external threadpool:
await run_in_threadpool(blob.upload_from_filename, temp.name)
Finally, have a look at the answers here and here on how to enclose the I/O operations in try-except-finally blocks, so that you can catch any possible exceptions, as well as close the UploadFile object properly. UploadFile is a temporary file that is deleted from the filesystem when it is closed. To find out where your system keeps the temporary files, see this answer. Note: Starlette, as described here, uses a SpooledTemporaryFile with 1MB max_size, meaning that the data is spooled in memory until the file size exceeds 1MB, at which point the contents are written to the temporary directory. Hence, you will only see the file you uploaded showing up in the temp directory, if it is larger than 1MB and if .close() has not yet been called.
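Tying Option 3 and the notes above together, a rough sketch could look like the following; the endpoint and message strings are illustrative, `bucket` is the object obtained as in the question's code, and aiofiles plus run_in_threadpool are used as described above:
    import os
    import aiofiles
    from starlette.concurrency import run_in_threadpool

    @app.post("/upload")
    async def upload(file: UploadFile = File(...)):
        try:
            contents = await file.read()
            # write the upload to a named temporary file, so that its path can be used later
            async with aiofiles.tempfile.NamedTemporaryFile("wb", delete=False) as temp:
                await temp.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file"}
        finally:
            await file.close()  # close the UploadFile, removing its SpooledTemporaryFile

        try:
            blob = bucket.blob(file.filename)
            # upload_from_filename() is blocking, so run it in an external thread
            await run_in_threadpool(blob.upload_from_filename, temp.name)
        except Exception:
            return {"message": "There was an error uploading the file to the bucket"}
        finally:
            os.remove(temp.name)  # delete the temporary file once the upload attempt is done
        return {"message": f"Successfully uploaded {file.filename}"}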

How to download a file from blob and send it as a file response

I have a client page which lists all the files in the container; on choosing a file, the filename along with the container name is sent to the server.
The server should initiate the file download and send the file as a response to the client request.
I tried with get_blob_to_stream
#app.route("/blobs/testDownload/")
def testDownload():
container_name =request.args.get("containerName")
print(container_name)
local_file_name= request.args.get("fileName")
with BytesIO() as input_blob:
with BytesIO() as output_blob:
# Download as a stream
block_blob_service.get_blob_to_stream(container_name, local_file_name, input_blob)
copyfileobj(input_blob, output_blob)
newFile = str(output_blob.getvalue())
with open("file.txt","a") as f:
f.write(newFile)
f.close()
return send_file('file.txt',attachment_filename='sample.txt',as_attachment=True,mimetype='text/plain')
But the file that gets downloaded is only in text file format. I want to download the file irrespective of its format, and I know this is not the right way to download a file via a Web API.
You're using a fixed file name, "file.txt", for all the blobs, which may be the reason. Using a stream seems unnecessary here; try get_blob_to_path() instead. Check out the following modified code:
--- // your code // ---
block_blob_service.get_blob_to_path(container_name, local_file_name, local_file_name)
# notice that I'm reusing the local_file_name here, hence no input/output blobs are required
return send_file(local_file_name,attachment_filename=local_file_name,as_attachment=True,mimetype='text/plain')
Complete Code:
#app.route("/blobs/testDownload/")
def testDownload():
container_name =request.args.get("containerName")
print(container_name)
local_file_name= request.args.get("fileName")
# Download as a file
block_blob_service.get_blob_to_path(container_name, local_file_name, local_file_name)
return send_file(local_file_name,attachment_filename=local_file_name,as_attachment=True,mimetype='text/plain')
See if that works!
Try not to hard-code the extension, as the extension is part of the blob name, whichever method you are using from the documentation. Have a look at the method get_blob_to_path, as you are downloading the file locally first; the local file name is the same as the filename in the blob container.
You can try to get the blob.name for each blob file in the container. The blob name contains the file extension (you just have to parse it), which you can use as a parameter for the method above, and that way you do not have to hard-code it.
Below you can find an example of how you can iterate through the files in the container and get the blob name, and you can just adjust it for your use-case:
block_blob_service = BlockBlobService(account_name=accountName, account_key=accountKey)

# create the container called 'batches' if it does not already exist
container_name = 'batches'
block_blob_service.create_container(container_name)

# Set the permission so the blobs are public.
block_blob_service.set_container_acl(container_name, public_access=PublicAccess.Container)

# list the blobs in the container and get each blob's name
blobs = block_blob_service.list_blobs(container_name)
for blob in blobs.items:
    file_name = blob.name
So now you can use file_name and the split method with '/', and the last item is filename.extension.
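For illustration (this snippet is not part of the original answer), that split could look like:
    # e.g. blob.name == 'some/folder/report.csv'
    file_name = blob.name.split('/')[-1]   # 'report.csv'
    extension = file_name.split('.')[-1]   # 'csv'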

S3 client partially uploads .png files

I am developing a Python app which consists of an uploading module; the core function pulls .png images from a queue and, using a Boto3 client, uploads them to a certain bucket.
The problem is that sometimes, not always, the images are only partially uploaded, e.g. when I download a defective image, it seems to be cropped.
When I upload the images manually (using an FTP/SSH client), they are uploaded perfectly.
The following is my core function; note that I'm using upload_fileobj() with a callback for progress bar mechanics.
def upload_file_aws(self):
    s3 = boto3.client('s3', aws_access_key_id=self.aws_access_key,
                      aws_secret_access_key=self.aws_secret_key)
    if not self.uploader.queue.empty():
        file = self.uploader.queue.get()
        with open(file, 'rb') as f:
            aws_format = '%s' % AppObject.file_path_dic.get(file)
            s3.upload_fileobj(f, self.bucket_name, aws_format, Callback=ProgressBarInit(file))
Has anyone encountered this problem before?
In Amazon's documentation they state that boto3 does not allow partial uploads.
There is a high chance that this is happening with larger images, more than 5 MB in size.
You should be using multipart upload for large objects.
Here is a basic code example of multipart upload.
import boto3
from boto3.s3.transfer import S3Transfer, TransferConfig

def upload_file(filename):
    session = boto3.Session()
    s3_client = session.client('s3')
    try:
        print("Uploading file:", filename)
        tc = TransferConfig()
        t = S3Transfer(client=s3_client, config=tc)
        t.upload_file(filename, 'my-bucket-name', 'name-in-s3.dat')
    except Exception as e:
        print("Error uploading: %s" % e)

Uploaded Files get overwritten in Dropbox

I am trying to upload users' files to Dropbox in Django. When I use the built-in open() function, it throws the following exception:
expected str, bytes or os.PathLike object, not TemporaryUploadedFile
When I don't, the file gets uploaded successfully but is blank (write mode).
UPLOAD HANDLER:
def upload_handler(DOC, PATH):
    dbx = dropbox.Dropbox(settings.DROPBOX_APP_ACCESS_TOKEN)
    with open(DOC, 'rb') as f:
        dbx.files_upload(f.read(), PATH)
    dbx.sharing_create_shared_link_with_settings(PATH)
How do I upload files or pass a mode to the Dropbox API without the file being overwritten?
To specify a write mode when uploading files to Dropbox, pass the desired WriteMode to the files_upload method as the mode parameter. That would look like this:
dbx.files_upload(f.read(), PATH, mode=dropbox.files.WriteMode('overwrite'))
This only controls how Dropbox commits the file (see the WriteMode docs for info); it doesn't control what data you're uploading. In your code, it is uploading whatever is returned by f.read(), so make sure that's what you expect it to be.
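As a hedged sketch that goes beyond the original answer: it combines the WriteMode suggestion with the assumption (mine, based on the TemporaryUploadedFile error above) that DOC is Django's uploaded file object rather than a filesystem path, so it is read directly instead of being passed to open():
    import dropbox
    from django.conf import settings

    def upload_handler(DOC, PATH):
        dbx = dropbox.Dropbox(settings.DROPBOX_APP_ACCESS_TOKEN)
        # DOC is assumed to be the uploaded file object, so read it directly;
        # WriteMode('add') lets Dropbox auto-rename on conflict instead of overwriting,
        # while WriteMode('overwrite') replaces an existing file at PATH.
        dbx.files_upload(DOC.read(), PATH, mode=dropbox.files.WriteMode('add'))
        dbx.sharing_create_shared_link_with_settings(PATH)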

Error 500 while Uploading CSV file to S3 bucket using boto3 and python flask

Kind of looked at all possible options.
I am using boto3 and Python 3.6 to upload a file to an S3 bucket. The funny thing is that while JSON and even .py files get uploaded fine, it throws Error 500 while uploading a CSV. On a successful upload I am returning a JSON to check all the values.
import boto3
from botocore.client import Config
#app.route("/upload",methods = ['POST','GET'])
def upload():
if request.method == 'POST':
file = request.files['file']
filename = secure_filename(file.filename)
s3 = boto3.resource('s3', aws_access_key_id= os.environ.get('AWS_ACCESS_KEY_ID'), aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),config=Config(signature_version='s3v4'))
s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=open(filename, 'rb'), ContentEncoding='text/csv')
return jsonify({'successful upload':filename, 'S3_BUCKET':os.environ.get('S3_BUCKET'), 'ke':os.environ.get('AWS_ACCESS_KEY_ID'), 'sec':os.environ.get('AWS_SECRET_ACCESS_KEY'),'filepath': "https://s3.us-east-2.amazonaws.com/"+os.environ.get('S3_BUCKET')+"/" +filename})
Please help!!
You are getting a FileNotFoundError for file xyz.csv because the file does not exist.
This could be because the code in upload() does not actually save the uploaded file, it merely obtains a safe name for it and immediately tries to open it - which fails.
That it works for other files is probably due to the fact that those files already exist, perhaps left over from testing, so there is no problem.
Try saving the file to the file system using save() after obtaining the safe filename:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
upload_file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
and then uploading it (assuming that you've configured an UPLOAD_FOLDER):
with open(os.path.join(app.config['UPLOAD_FOLDER'], filename), 'rb') as f:
    s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=f, ContentEncoding='text/csv')
    return jsonify({...})
There is no need to actually save the file to the file system; it can be streamed directly to your S3 bucket using the stream attribute of the upload_file object:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
s3 = boto3.resource('s3', aws_access_key_id='key', aws_secret_access_key='secret')
s3.Bucket('bucket').put_object(Key=filename, Body=upload_file.stream, ContentType=upload_file.content_type)
To make this more generic you should use the content_type attribute of the uploaded file as shown above.
