I am trying to use the Python requests package to upload an image file to my Amazon AWS S3 bucket.
The code I have opens the bucket, downloads an image file, resizes the image, saves the image locally, then tries to upload the saved image to the S3 bucket.
It all works fine except that the uploaded jpg file is corrupt in some way, inasmuch as it can no longer be viewed as an image. I have checked that the original file being uploaded is not corrupt.
My code is:
conn = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(settings.AWS_STORAGE_BUCKET_NAME)

for key in bucket.list(prefix='media/userphotos'):
    file_name = key.name
    full_path_filename = 'https://' + settings.AWS_STORAGE_BUCKET_NAME + '.s3.amazonaws.com/' + file_name
    fd_img = urlopen(full_path_filename)
    img = Image.open(fd_img)
    img = resizeimage.resize_width(img, 800)
    new_filename = full_path_filename.replace('userphotos', 'webversion')
    # Save temporarily before uploading to S3 bucket
    img.save('temp.jpg', img.format)
    the_file = {'media': open('temp.jpg', 'rb')}
    r = requests.put(new_filename, files=the_file, auth=S3Auth(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY))
    fd_img.close()
UPDATE
I have just noticed that while the jpg file cannot be opened with a web browser or with Preview on my Mac, it can be opened successfully with Adobe Photoshop! Clearly the image data is in the file, but something about the way requests.put() writes the file stops it from being readable by web browsers. Strange!
Do this instead:
requests.put(url, data=open(filename, 'rb'))
I noticed that using files=, as documented in the requests library, prepends a bunch of garbage to the file: it performs a multipart/form-data upload, so a boundary and part headers end up inside the object. You can inspect that with xxd <filename>.
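For reference, here is a minimal sketch of the corrected upload, assuming the question's new_filename URL and S3Auth credentials; the Content-Type header is an assumption added so that S3 serves the object as an image to browsers:

# Sketch only: PUT the raw bytes (data=), not a multipart form (files=)
with open('temp.jpg', 'rb') as f:
    r = requests.put(
        new_filename,
        data=f,
        headers={'Content-Type': 'image/jpeg'},  # assumption: helps browsers render the object
        auth=S3Auth(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY),
    )
r.raise_for_status()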
Here is my code for uploading the image to AWS S3:
@app.post("/post_ads")
async def create_upload_files(files: list[UploadFile] = File(description="Multiple files as UploadFile")):
    main_image_list = []
    for file in files:
        s3 = boto3.resource(
            's3',
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key
        )
        bucket = s3.Bucket(aws_bucket_name)
        bucket.upload_fileobj(file.file, file.filename, ExtraArgs={"ACL": "public-read"})
Is there any way to compress the image size and upload the image to a specific folder using boto3? I have this function for compressing the image, but I don't know how to integrate it into boto3.
for file in files:
    im = Image.open(file.file)
    im = im.convert("RGB")
    im_io = BytesIO()
    im = im.save(im_io, 'JPEG', quality=50)
    s3 = boto3.resource(
        's3',
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key
    )
    bucket = s3.Bucket(aws_bucket_name)
    bucket.upload_fileobj(file.file, file.filename, ExtraArgs={"ACL": "public-read"})
Update #1
After following Chris's recommendation, my problem has been resolved:
Here is Chris's solution:
im_io.seek(0)
bucket.upload_fileobj(im_io,file.filename,ExtraArgs={"ACL":"public-read"})
You seem to be saving the image bytes to a BytesIO stream, which is never used, as you upload the original file object to the s3 bucket instead, as shown in this line of your code:
bucket.upload_fileobj(file.file, file.filename, ExtraArgs={"ACL":"public-read"})
Hence, you need to pass the BytesIO object to upload_fileobj() function, and make sure to call .seek(0) before that, in order to rewind the cursor (or "file pointer") to the start of the buffer. The reason for calling .seek(0) is that im.save() method uses the cursor to iterate through the buffer, and when it reaches the end, it does not reset the cursor to the beginning. Hence, any future read operations would start at the end of the buffer. The same applies to reading from the original file, as described in this answer—you would need to call file.file.seek(0), if the file contents were read already and you needed to read from the file again.
An example of how to load the image into a BytesIO stream and use it to upload the file/image can be seen below. Please remember to properly close the UploadFile, Image and BytesIO objects, in order to release their memory (see this related answer as well).
from fastapi import HTTPException
from PIL import Image
import io

# ...

try:
    im = Image.open(file.file)
    if im.mode in ("RGBA", "P"):
        im = im.convert("RGB")
    buf = io.BytesIO()
    im.save(buf, 'JPEG', quality=50)
    buf.seek(0)  # rewind to the start of the buffer before uploading
    bucket.upload_fileobj(buf, 'out.jpg', ExtraArgs={"ACL": "public-read"})
except Exception:
    raise HTTPException(status_code=500, detail='Something went wrong')
finally:
    file.file.close()
    buf.close()
    im.close()
As for the URL, using ExtraArgs={"ACL":"public-read"} should work as expected and make your resource (file) publicly accessible. Hence, please make sure you are accessing the correct URL.
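As for uploading to a specific folder: S3 has no real directories, so a "folder" is just a prefix in the object key. A minimal sketch, assuming a hypothetical webversion/ prefix and the buf and file objects from the example above:

# Sketch only: "webversion/" is a hypothetical folder (key prefix), not part of the original answer
key = "webversion/" + file.filename
bucket.upload_fileobj(buf, key, ExtraArgs={"ACL": "public-read"})
# The object is then addressable as .../webversion/<filename> in the bucket.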
aws s3 sync s3://your-pics .
for file in $(find . -name "*.jpg"); do gzip "$file"; echo "$file"; done
aws s3 sync . s3://your-pics --content-encoding gzip --dryrun
This will download all the files in the s3 bucket to the machine (or EC2 instance), compress the image files, and upload them back to the s3 bucket.
This should help you.
So I am able to post Docx files to WordPress through the WP REST API, using the mammoth docx package in Python.
I am able to upload an image to WordPress.
But when there are images in the docx file, they are not uploaded to the WordPress media section.
Any input on this?
I am using Python for this.
Here is the code for Docx to HTML conversion
with open(file_path, "rb") as docx_file:
    # html = mammoth.extract_raw_text(docx_file)
    result = mammoth.convert_to_html(docx_file, convert_image=mammoth.images.img_element(convert_image))
    html = result.value  # The generated HTML
Kindly note that I am able to see the images in the actual published post, but they have a weird source image URL and do not appear in the WordPress media section.
Weird image source URL like
data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAEBAQEBAQEBAQEBAQECAgMCAgICAgQDAwIDBQQFBQUEBAQFBgcGBQUHBgQEBgkGBwgICAgIBQYJCgkICgcICAj/2wBDAQEBAQICAgQCAgQIBQQFCAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAj/wAARCAUABQADASIAAhEBAxEB/8QAHwAAAQMFAQEBAAAAAAAAAAAAAAUGBwMECAkKAgsB/8QAhxAAAQIEBAMEBQYHCAUOFggXAQIDAAQFEQYHEiETMUEIIlFhCRQ & so on
Also, huge thanks to the contributors of the Python to WordPress repo.
The mammoth CLI has a function that extracts images, saves them to a directory and inserts the file names into the img tags in the HTML. If you don't want to use mammoth on the command line, you could use this code:
import os
import mammoth
from mammoth.cli import ImageWriter, _write_output

output_dir = './output'
filename = 'filename.docx'

with open(filename, "rb") as docx_fileobj:
    convert_image = mammoth.images.img_element(ImageWriter(output_dir))
    output_filename = "{0}.html".format(os.path.basename(filename).rpartition(".")[0])
    output_path = os.path.join(output_dir, output_filename)
    result = mammoth.convert(
        docx_fileobj,
        convert_image=convert_image,
        output_format='html',
    )
    _write_output(output_path, result.value)
Note that you would still need to change the img links, as you'll be uploading the images to WordPress, but this solves your mapping issue. You might also want to change the ImageWriter class to save the images as something other than TIFF.
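If you'd rather not depend on ImageWriter at all, below is a minimal sketch of a custom image handler that writes each image to disk with an extension derived from its content type; the output_dir value and the file-naming scheme are assumptions, not part of mammoth:

import os
import mammoth

output_dir = './output'        # assumption: reuse the output directory from above
image_counter = {"count": 0}   # simple counter to generate unique file names

def write_image(image):
    # derive an extension from the image's MIME type, e.g. "image/png" -> "png"
    extension = image.content_type.partition("/")[2]
    image_counter["count"] += 1
    image_filename = "image-{0}.{1}".format(image_counter["count"], extension)
    with image.open() as image_bytes, open(os.path.join(output_dir, image_filename), "wb") as out:
        out.write(image_bytes.read())
    # the returned dict becomes the attributes of the generated <img> tag
    return {"src": image_filename}

with open('filename.docx', "rb") as docx_fileobj:
    result = mammoth.convert_to_html(docx_fileobj, convert_image=mammoth.images.img_element(write_image))
    html = result.value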
I have a client page which lists all the files in the container; on choosing a file, the filename along with the container name is sent to the server.
The server should initiate the file download and send the file back as the response to the client request.
I tried with get_blob_to_stream
@app.route("/blobs/testDownload/")
def testDownload():
    container_name = request.args.get("containerName")
    print(container_name)
    local_file_name = request.args.get("fileName")
    with BytesIO() as input_blob:
        with BytesIO() as output_blob:
            # Download as a stream
            block_blob_service.get_blob_to_stream(container_name, local_file_name, input_blob)
            copyfileobj(input_blob, output_blob)
            newFile = str(output_blob.getvalue())
            with open("file.txt", "a") as f:
                f.write(newFile)
                f.close()
            return send_file('file.txt', attachment_filename='sample.txt', as_attachment=True, mimetype='text/plain')
But the file that gets downloaded is only ever in text format; I want to download the file irrespective of its format. And I know this is not the right way to download a file via a Web API.
You're using a fixed file name "file.txt" for all the blobs, which may be the reason. Using a stream seems useless here. Try get_blob_to_path() instead; check out the following modified code:
--- // your code // ---
block_blob_service.get_blob_to_path(container_name, local_file_name, local_file_name)
# notice that I'm reusing the local_file_name here, hence no input/output blobs are required
return send_file(local_file_name, attachment_filename=local_file_name, as_attachment=True, mimetype='text/plain')
Complete Code:
@app.route("/blobs/testDownload/")
def testDownload():
    container_name = request.args.get("containerName")
    print(container_name)
    local_file_name = request.args.get("fileName")
    # Download as a file
    block_blob_service.get_blob_to_path(container_name, local_file_name, local_file_name)
    return send_file(local_file_name, attachment_filename=local_file_name, as_attachment=True, mimetype='text/plain')
See if that works!
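If you also want the response's MIME type to match the file's actual format instead of the hard-coded text/plain, here is a small sketch using the standard library's mimetypes module (an addition on top of the answer above, not part of it):

import mimetypes

# Guess the MIME type from the file name; fall back to a generic binary type
mime_type, _ = mimetypes.guess_type(local_file_name)
return send_file(local_file_name, attachment_filename=local_file_name, as_attachment=True, mimetype=mime_type or 'application/octet-stream')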
Try not to hard-code the extension, as the extension is part of the blob name, whichever method you are using from the documentation. Have a look at the method get_blob_to_path, as you are downloading the file locally first. The local file name is the same as the file name in the blob container.
You can try to get blob.name for each blob file in the container. The blob name contains the file extension (you just have to parse it), which you can use as a parameter for the method above; that way you do not have to hard-code it.
Below is an example of how you can iterate through the files in the container and get the blob name; you can just adjust it for your use case:
block_blob_service = BlockBlobService(account_name=accountName, account_key=accountKey)

# create the container called 'batches' if it does not exist
container_name = 'batches'
block_blob_service.create_container(container_name)

# Set the permission so the blobs are public.
block_blob_service.set_container_acl(container_name, public_access=PublicAccess.Container)

# Calculation
blobs = block_blob_service.list_blobs(container_name)
for blob in blobs.items:
    file_name = blob.name
So now you can use file_name with the split method on '/'; the last item is filename.extension.
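A tiny sketch of that split, using a hypothetical blob name:

file_name = "userphotos/2020/photo1.jpg"        # hypothetical blob name
local_file_name = file_name.split('/')[-1]      # -> 'photo1.jpg'
extension = local_file_name.rpartition('.')[2]  # -> 'jpg'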
I have kind of looked at all possible options.
I am using boto3 and Python 3.6 to upload a file to an S3 bucket. The funny thing is that while JSON and even .py files upload fine, it throws Error 500 while uploading a CSV. On successful upload I am returning a JSON to check all the values.
import boto3
from botocore.client import Config

@app.route("/upload", methods=['POST', 'GET'])
def upload():
    if request.method == 'POST':
        file = request.files['file']
        filename = secure_filename(file.filename)
        s3 = boto3.resource('s3', aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'), aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'), config=Config(signature_version='s3v4'))
        s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=open(filename, 'rb'), ContentEncoding='text/csv')
        return jsonify({'successful upload': filename, 'S3_BUCKET': os.environ.get('S3_BUCKET'), 'ke': os.environ.get('AWS_ACCESS_KEY_ID'), 'sec': os.environ.get('AWS_SECRET_ACCESS_KEY'), 'filepath': "https://s3.us-east-2.amazonaws.com/" + os.environ.get('S3_BUCKET') + "/" + filename})
Please help!!
You are getting a FileNotFoundError for file xyz.csv because the file does not exist.
This could be because the code in upload() does not actually save the uploaded file, it merely obtains a safe name for it and immediately tries to open it - which fails.
That it works for other files is probably due to the fact that those files already exist, perhaps left over from testing, so there is no problem.
Try saving the file to the file system using save() after obtaining the safe filename:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
upload_file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
and then uploading it (assuming that you've configured an UPLOAD_FOLDER):
with open(os.path.join(app.config['UPLOAD_FOLDER'], filename), 'rb') as f:
    s3.Bucket(os.environ.get('S3_BUCKET')).put_object(Key=filename, Body=f, ContentEncoding='text/csv')

return jsonify({...})
There is no need to actually save the file to the file system; it can be streamed directly to your S3 bucket using the stream attribute of the upload_file object:
upload_file = request.files['file']
filename = secure_filename(upload_file.filename)
s3 = boto3.resource('s3', aws_access_key_id='key', aws_secret_access_key='secret')
s3.Bucket('bucket').put_object(Key=filename, Body=upload_file.stream, ContentType=upload_file.content_type)
To make this more generic you should use the content_type attribute of the uploaded file as shown above.
The following code is successfully uploading an image file using the Bottle framework.
upload = bottle.request.files.get("filPhoto01")
if upload is not None:
    name, ext = os.path.splitext(upload.filename)
    if ext not in ('.png', '.jpg', '.jpeg'):
        return "File extension not allowed."
    save_path = "/tmp/abc".format(category=category)
    if not os.path.exists(save_path):
        os.makedirs(save_path)
    file_path = "{path}/{file}".format(path=save_path, file=upload.filename)
    with open(file_path, 'w') as open_file:
        open_file.write(upload.file.read())
However, when I try to open this file manually after upload, I can't open the file. I can see the icon of the uploaded file with the correct size (implying the whole image was uploaded), but I cannot view it in any application like MS paint, etc.
I also tried referencing the file in my web application, but it does not render there either. What could possibly be wrong?
Just a guess, but since it sounds like you're on Windows, you'll want to write the file in binary mode:
with open(file_path, 'wb') as open_file:
(Also, you didn't mention your Python version, but FYI in Python 3 you'd need to use binary mode even on Linux.)
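Putting it together, a minimal sketch of the corrected write, reusing the question's upload and file_path variables; Bottle's built-in FileUpload.save() is an equivalent shortcut:

# Open in binary mode so the image bytes are not mangled by newline translation
with open(file_path, 'wb') as open_file:
    open_file.write(upload.file.read())

# or, equivalently, let Bottle copy the stream for you:
# upload.save(file_path, overwrite=True)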