I am trying to resize a source image to multiple dimensions+extensions.
For example: when I upload a source image, say abc.jpg, I need to resize it to .jpg and .webp at different dimensions, like abc_320.jpg, abc_320.webp, abc_640.jpg, and abc_640.webp, via an S3 event trigger. With my current Python Lambda handler I can do this with multiple put_object calls to the destination bucket, but I want to optimize it, since the number of dimension+extension combinations may grow in the future. How can I store all the resized images in the destination bucket with one call?
Current Lambda Handler:
import json
import boto3
import os
from os import path
from io import BytesIO
from PIL import Image
# boto3 S3 initialization
s3_client = boto3.client("s3")
def lambda_handler(event, context):
    destination_bucket_name = 'destination-bucket'
    # event contains all information about the uploaded object
    print("Event:", event)
    # Bucket where the file was uploaded
    source_bucket_name = event['Records'][0]['s3']['bucket']['name']
    # Key (with path) of the uploaded object, and the prefix for resized copies
    dest_bucket_prefix = 'resized'
    file_key_name = event['Records'][0]['s3']['object']['key']
    image_obj = s3_client.get_object(Bucket=source_bucket_name, Key=file_key_name)
    image_obj = image_obj.get('Body').read()
    img = Image.open(BytesIO(image_obj))
    dimensions = [320, 640]
    # Check the source extension and map it to a Pillow format name
    img_extension = path.splitext(file_key_name)[1].lower()
    extension_dict = {".jpg": "JPEG", ".png": "PNG", ".jpeg": "JPEG"}
    extensions = ["WEBP"]
    if img_extension in extension_dict:
        extensions.append(extension_dict[img_extension])
    # Map Pillow format names back to file suffixes for the output keys
    suffix_dict = {"JPEG": "jpg", "PNG": "png", "WEBP": "webp"}
    for dimension in dimensions:
        WIDTH = HEIGHT = dimension
        for extension in extensions:
            resized_img = img.resize((WIDTH, HEIGHT))
            buffer = BytesIO()
            resized_img.save(buffer, extension)
            buffer.seek(0)
            # Build a distinct key per variant, e.g. resized/abc_320.jpg
            base_key = path.splitext(file_key_name.replace("upload", dest_bucket_prefix, 1))[0]
            dest_key = "{}_{}.{}".format(base_key, dimension, suffix_dict[extension])
            # I don't want to use this put_object in loop <<<---
            s3_client.put_object(Bucket=destination_bucket_name, Key=dest_key, Body=buffer)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from S3 events Lambda!')
    }
As you can see, I need to call put_object on every iteration of dimension+extension, which is costly. I have also thought about multi-threading and a zipped solution, but I'm looking for other possible approaches.
Amazon S3 API calls only allow one object to be uploaded per call.
However, you could modify your program for multi-threading and upload the objects in parallel.
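For example, here is a minimal sketch of that idea with concurrent.futures, reusing s3_client, img, dimensions, extensions, and destination_bucket_name from the handler above (the upload_variant helper and the "resized/abc_..." key format are just for illustration):
from concurrent.futures import ThreadPoolExecutor
from io import BytesIO

def upload_variant(key, body):
    # Each call still uploads exactly one object, but the calls run concurrently
    s3_client.put_object(Bucket=destination_bucket_name, Key=key, Body=body)

# Resize serially in memory, collect (key, buffer) pairs for every variant
variants = []
for dimension in dimensions:
    for extension in extensions:
        buffer = BytesIO()
        img.resize((dimension, dimension)).save(buffer, extension)
        buffer.seek(0)
        key = "resized/abc_{}.{}".format(dimension, extension.lower())  # illustrative key
        variants.append((key, buffer))

# boto3 clients are thread-safe, so the uploads can run in parallel
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(upload_variant, key, body) for key, body in variants]
    for future in futures:
        future.result()  # re-raise any upload errors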
Here is my code for uploading the image to AWS S3:
@app.post("/post_ads")
async def create_upload_files(files: list[UploadFile] = File(description="Multiple files as UploadFile")):
    main_image_list = []
    for file in files:
        s3 = boto3.resource(
            's3',
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key
        )
        bucket = s3.Bucket(aws_bucket_name)
        bucket.upload_fileobj(file.file, file.filename, ExtraArgs={"ACL": "public-read"})
Is there any way to compress the image and upload it to a specific folder using boto3? I have this code for compressing the image, but I don't know how to integrate it with the boto3 upload.
for file in files:
    im = Image.open(file.file)
    im = im.convert("RGB")
    im_io = BytesIO()
    im = im.save(im_io, 'JPEG', quality=50)
    s3 = boto3.resource(
        's3',
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key
    )
    bucket = s3.Bucket(aws_bucket_name)
    bucket.upload_fileobj(file.file, file.filename, ExtraArgs={"ACL": "public-read"})
Update #1
After following Chris's recommendation, my problem has been resolved:
Here is Chris's solution:
im_io.seek(0)
bucket.upload_fileobj(im_io,file.filename,ExtraArgs={"ACL":"public-read"})
You seem to be saving the image bytes to a BytesIO stream, which is never used, as you upload the original file object to the s3 bucket instead, as shown in this line of your code:
bucket.upload_fileobj(file.file, file.filename, ExtraArgs={"ACL":"public-read"})
Hence, you need to pass the BytesIO object to upload_fileobj() function, and make sure to call .seek(0) before that, in order to rewind the cursor (or "file pointer") to the start of the buffer. The reason for calling .seek(0) is that im.save() method uses the cursor to iterate through the buffer, and when it reaches the end, it does not reset the cursor to the beginning. Hence, any future read operations would start at the end of the buffer. The same applies to reading from the original file, as described in this answer—you would need to call file.file.seek(0), if the file contents were read already and you needed to read from the file again.
Example on how to load the image into BytesIO stream and use it to upload the file/image can be seen below. Please remember to properly close the UploadFile, Image and BytesIO objects, in order to release their memory (see related answer as well).
from fastapi import HTTPException
from PIL import Image
import io

# ...

try:
    im = Image.open(file.file)
    if im.mode in ("RGBA", "P"):
        im = im.convert("RGB")
    buf = io.BytesIO()
    im.save(buf, 'JPEG', quality=50)
    buf.seek(0)
    bucket.upload_fileobj(buf, 'out.jpg', ExtraArgs={"ACL": "public-read"})
except Exception:
    raise HTTPException(status_code=500, detail='Something went wrong')
finally:
    file.file.close()
    buf.close()
    im.close()
As for the URL, using ExtraArgs={"ACL":"public-read"} should work as expected and make your resource (file) publicly accessible. Hence, please make sure you are accessing the correct URL.
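Regarding the specific folder: S3 has no real directories; a "folder" is just a key prefix, so you can prepend it to the key you pass to upload_fileobj. A minimal sketch, reusing buf, bucket, and aws_bucket_name from above (the ads/ prefix and the us-east-1 region are illustrative assumptions):
# The "folder" is just part of the object key
key = 'ads/' + file.filename
bucket.upload_fileobj(buf, key, ExtraArgs={"ACL": "public-read"})

# With a public-read ACL the object is reachable at the usual
# virtual-hosted-style URL (adjust the region to your bucket's region)
url = "https://{}.s3.us-east-1.amazonaws.com/{}".format(aws_bucket_name, key)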
aws s3 sync s3://your-pics .
for file in $(find . -name "*.jpg"); do gzip "$file"; echo "$file"; done
aws s3 sync . s3://your-pics --content-encoding gzip --dryrun
This will download all files in the S3 bucket to the machine (or EC2 instance), compress the image files, and upload them back to the S3 bucket.
This should help you.
I'm struggling to download a JPG file from Amazon S3 using Python. I want to deploy this code to Heroku, so I need the image to be loaded into memory rather than written to disk.
The code I'm using is:
import boto3
s3 = boto3.client(
    "s3",
    aws_access_key_id=access_key,
    aws_secret_access_key=access_secret
)
s3.upload_fileobj(image_conv, bucket, Key = "image_3.jpg")
new_obj = s3.get_object(Bucket=bucket, Key="image_3.jpg")
image_dl = new_obj['Body'].read()
Image.open(image_dl)
I'm getting the error message:
File ..... line 2968, in open
fp = builtins.open(filename, "rb")
ValueError: embedded null byte
Inspecting image_dl shows a very long sequence of what I assume are bytes; one small section looks like the following:
f\xbc\xdc\x8f\xfe\xb5q\xda}\xed\xcb\xdcD\xab\xe6o\x1c;\xb7\xa0\xf5\xf5\xae\xa6)\xbe\xee\xe6\xc3vn\xdfLVW:\x96\xa8\xa3}\xa4\xd8\xea\x8f*\x89\xd7\xcc\xe8\xf0\xca\xb9\x0b\xf4\x1f\xe7\x15\x93\x0f\x83ty$h\xa6\x83\xc8\x99z<K\xc3c\xd4w\xae\xa4\xc2\xfb\xcb\xee\xe0
Before I uploaded the image to S3, printing it returned the below, and that's the form I'm trying to get the image back into. Is anyone able to help me see where I'm going wrong?
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1440x1440 at 0x7F2BB4005EB0>
Pillow's Image class needs either a filename to open, or a file-like object that it can call read on. Since you don't have a filename, you'll need to provide a stream. It's easiest to use BytesIO to turn the byte array into a stream:
import boto3
from PIL import Image
from io import BytesIO

bucket = "--example-bucket--"
s3 = boto3.client("s3")

with open("image.jpg", "rb") as image_conv:
    s3.upload_fileobj(image_conv, bucket, Key="image_3.jpg")

new_obj = s3.get_object(Bucket=bucket, Key="image_3.jpg")
image_dl = new_obj['Body'].read()
image = Image.open(BytesIO(image_dl))
print(image.width, image.height)
Try first loading the raw data into a BytesIO container (it must be BytesIO, not StringIO, since image data is binary):
from io import BytesIO
from PIL import Image

file_stream = BytesIO()
s3.download_fileobj(bucket, "image_3.jpg", file_stream)
img = Image.open(file_stream)
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.download_fileobj
I am calling an API that returns an iterator object containing image data. I'd like to iterate over the chunks and upload them to S3. I could convert them into .png or .jpeg either before or after uploading them to S3.
import boto3

# Download / open photo
img_obj = gmaps.places_photo(ph, max_width=500, max_height=400)
print(img_obj)
# prints: <generator object Response.iter_content.<locals>.generate at 0x7ffa2dsa7820>

s3 = boto3.client('s3')
with open('output_image_{}.png'.format(c), 'w') as data:
    for chunk in img_obj:
        s3.upload_fileobj(data, 'mybucket', 'img/{}'.format(chunk))
Error:
Input <_io.BufferedWriter name='output_image_1.png'> of type: <class '_io.BufferedWriter'> is not supported.
On my local machine, I am able to write the file:
with open("output_image_{}.png".format(c), "w") as fp:
    for chunk in img_obj:
        fp.write(chunk)
I'd like to directly save the img_obj on AWS S3.
This can be done by using s3fs to save the image in PNG format.
I could recreate the scenario and achieve it using the code below:
from PIL import Image
import s3fs

s3 = s3fs.S3FileSystem(client_kwargs=<aws_credentials>)
img_obj = Image.open('sample_file.png')
img_obj.save(s3.open('s3://<bucket_name>/sample_file.png', 'wb'), 'PNG')
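If you would rather not add the s3fs dependency, one alternative sketch is to buffer the generator's chunks into a BytesIO and hand that to boto3 directly (this assumes img_obj yields bytes, as requests' iter_content does, and reuses the bucket and key names from the question):
from io import BytesIO
import boto3

s3 = boto3.client('s3')

# Collect the streamed chunks in memory instead of writing a local file
buffer = BytesIO()
for chunk in img_obj:
    buffer.write(chunk)
buffer.seek(0)

# Upload the in-memory image under the desired key
s3.upload_fileobj(buffer, 'mybucket', 'img/output_image_{}.png'.format(c))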
I can successfully access the google cloud bucket from my python code running on my PC using the following code.
client = storage.Client()
bucket = client.get_bucket('bucket-name')
blob = bucket.get_blob('images/test.png')
Now, how do I retrieve and display the image from the blob without writing it to a file on the hard drive?
You could, for example, generate a temporary url
from gcloud import storage
client = storage.Client() # Implicit environ set-up
bucket = client.bucket('my-bucket')
blob = bucket.blob('my-blob')
url_lifetime = 3600 # Seconds in an hour
serving_url = blob.generate_signed_url(url_lifetime)
Otherwise you can set the image as public in your bucket and use the permanent link that you can find in your object details
https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME
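For example, a short sketch of making a single blob public and reading back that permanent link (this assumes the bucket allows per-object ACLs, i.e. uniform bucket-level access is turned off):
# Make this one object publicly readable
blob.make_public()

# Permanent link in the https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME form
print(blob.public_url)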
Download the image from GCS as bytes, wrap it in a BytesIO object to make the bytes file-like, then read it in as a PIL Image object.
from io import BytesIO
from PIL import Image
img = Image.open(BytesIO(blob.download_as_bytes()))
Then you can do whatever you want with img -- for example, to display it, use plt.imshow(img).
In Jupyter notebooks you can display the image directly with download_as_bytes:
from google.cloud import storage
from IPython.display import Image
client = storage.Client() # Implicit environment set up
# with explicit set up:
# client = storage.Client.from_service_account_json('key-file-location')
bucket = client.get_bucket('bucket-name')
blob = bucket.get_blob('images/test.png')
Image(blob.download_as_bytes())
I need to save my avatar to the "avatar" folder inside my Amazon S3 bucket.
Bucket
-Static
--Misc
-Media
--Originals
--Avatars
Currently, when I create the avatar, it is saved to the Originals "folder". My goal is to save it to the Avatars "folder".
Here is my code for creating and saving the avatar
def create_avatar(self):
    import os
    from PIL import Image
    from django.core.files.storage import default_storage as storage

    if not self.filename:
        return ""

    file_path = self.filename.name
    filename_base, filename_ext = os.path.splitext(file_path)
    thumb_file_path = "%s_thumb.jpg" % filename_base

    if storage.exists(thumb_file_path):
        return "exists"

    try:
        # resize the original image and return url path of the thumbnail
        f = storage.open(file_path, 'r')
        image = Image.open(f)
        width, height = image.size
        size = 128, 128
        image.thumbnail(size, Image.ANTIALIAS)
        f_thumb = storage.open(thumb_file_path, "w")
        image.save(f_thumb, "JPEG", quality=90)
        f_thumb.close()
        return "success"
    except:
        return "error"
I was able to save the avatar to the desired "folder" by rewriting the file path with a simple Python replace() call.
This did the trick, in case anyone else ever needs to "move" a file within an S3 bucket:
thumb_file_path = thumb_file_path.replace('originals/', 'avatar/')
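In context, that means rewriting the thumbnail key before opening it through the storage backend, roughly like this (a sketch based on the create_avatar method above, using the originals/ and avatar/ prefixes from the snippet):
thumb_file_path = "%s_thumb.jpg" % filename_base
# S3 has no real directories, so changing the key prefix is enough to
# "move" the thumbnail from the Originals folder to the Avatars folder
thumb_file_path = thumb_file_path.replace('originals/', 'avatar/')

f_thumb = storage.open(thumb_file_path, "w")
image.save(f_thumb, "JPEG", quality=90)
f_thumb.close()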