You would think it's quite simple, but the following code doesn't work as I would expect:
from hashlib import md5
from PIL import Image
im = Image.open("/tmp/original.jpg")
im.save("/tmp/new.jpg", quality="keep")
original = Image.open("/tmp/original.jpg")
new = Image.open("/tmp/new.jpg")
assert md5(original.tobytes()).hexdigest() == md5(new.tobytes()).hexdigest()
Why is it that when I'm simply saving an image as a new file, and keeping the quality settings the same, that the image data isn't identical? What am I missing?
Update (Explanation):
My problem is that I have a Pillow JpegImage object being handed to my code as part of a pipeline and I don't have control over the step at which the file is saved to disk:
<magic> → <my code> → <magic that saves to disk>
All I want my code to do is add/update/replace (any of these) the EXIF data for the to-be-saved jpeg image. As this info doesn't appear to be editable on an image object, the only way that I can figure to do this is to save the image to a temporary place (like BytesIO), save it and the re-open it with Image.open() before passing it to the next function in the chain.
Please tell me that there's a smarter, more efficient way to do this?
Related
I'm a beginner in python and I'm trying to send someone my small python program together with a picture that'll display when the code is run.
I tried to first convert the image to a binary file thinking that I'd be able to paste it in the source code but I'm not sure if that's even possible as I failed to successfully do it.
You can base64-encode your JPEG/PNG image which will make it into a regular (non-binary string) like this:
base64 -w0 IMAGE.JPG
Then you want to get the result into a Python variable, so repeat the command but copy the output to your clipboard:
base64 -w0 IMAGE.JPG | xclip -selection clipboard # Linux
base64 -w0 IMAGE.JPG | pbcopy # macOS
Now start Python and make a variable called img and paste the clipboard into it:
img = 'PASTE'
It will look like this:
img = '/9j/4AAQSk...' # if your image was JPEG
img = 'iVBORw0KGg...' # if your image was PNG
Now do some imports:
from PIL import Image
import base64
import io
# Make PIL Image from base64 string
pilImage = Image.open(io.BytesIO(base64.b64decode(img)))
Now you can do what you like with your image:
# Print its description and size
print(pilImage)
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=200x100>
# Save it to local disk
pilImage.save('result.jpg')
You can save a picture in byte format inside a variable in your program. You can then convert the bytes back into a file-like object using the BytesIO function of the io module and plot that object using the Image module from the Pillow library.
import io
import PIL.Image
with open("filename.png", "rb") as file:
img_binary = file.read()
img = PIL.Image.open(io.BytesIO(img_binary))
img.show()
To save the binary data inside your program without having to read from the source file you need to encode it with something like base64, use print() and then simply copy the output into a new variable and remove the file reading operation from your code.
That would look like this:
img_encoded = base64.encodebytes(img_binary)
print(img_binary)
img_encoded = " " # paste the output from the console into the variable
the output will be very long, especially if you are using a big image. I only used a very small png for testing.
This is how the program should look like at the end:
import io
import base64
import PIL.Image
# with open("filename.png", "rb") as file:
# img_binary = file.read()
# img_encoded = base64.encodebytes(img_binary)
img_encoded = b'iVBORw0KGgoAAAANSUhEUgAAADAAAAAwCAYAAABX[...]'
img = PIL.Image.open(io.BytesIO(base64.decodebytes(img_encoded)))
img.show()
You could perhaps have your Python program download the image from a site where you upload files such as Google Drive, Mega, or Imgur. That way, you can always access and view the image easily without the need of running the program or for example converting the binary back into the image in the method you mentioned.
Otherwise, you could always store the image as bytes in a variable and have your program read this variable. I'm assuming that you really wish to do it this way as it would be easier to distribute as there is only one file that needs to be downloaded and run.
Or you could take a look at pyinstaller which is made for python programs to be easily distributed across machines without the need to install Python by packaging it as an executable (.exe) file! That way you can include the image file together by embedding it into the program. There are plenty of tutorials for pyinstaller you could google up. Note: Include the '--onefile' in your parameters when running pyinstaller as this will package the executable into a single file that the person you're sending it to can easily open whoever it may be-- granted the executable file can run on the user's operating system. :)
I need to save an Image in Python (created as a Numpy array) as a JPEG file, while including a "comment" in the file with some specific metadata. This metadata will be used by another (third-party) application and is a simple ASCII string. I have a sample image including such a "comment", which I can read out using Pillow (PIL), via the image.info['comment'] or the image.app['COM'] property. However, when I try a simple round-trip, i.e. loading my sample image and save it again using a different file name, the comment is no longer preserved. Equally, I found no way to include a comment in a newly created image.
I am aware that EXIF tags are the preferred way to save metadata in JPEG images, but as mentioned, the third-party application only accepts this data as a "comment", not as EXIF, which I cannot change. After reading this question, I looked into the binary structure of my sample file and found the comment at the start of the file, after a few bytes of some other (meta)data. I do however not know a lot about binary file manipulation, and also I was wondering if there is a more elegant way, other than messing with the binary...
EDIT: minimum example:
from PIL import Image
img = Image.open(path) # where path is the path to the sample image
# this prints the desired metadata if it is correctly saved in loaded image
print(img.info["comment"])
img.save(new_path) # save with different file name
img.close()
# now open to see if it has been saved correctly
new_img = Image.open(new_path)
print(new_img.info['comment']) # now results in KeyError
I also tried img.save(new_path, info=img.info), but this does not seem to have an effect. Since img.info['comment'] appears identical to img.app['COM'], I tried img.save(new_path, app=img.app), again does not work.
Just been having a play with this and I couldn't see anything directly in Pillow to support this. I've found that the save() method supports a parameter called extra that can be used to pass arbitrary bytes to the output file.
We then just need a simple method to turn a comment into a valid JPEG segment, for example:
import struct
from PIL import Image
def make_jpeg_variable_segment(marker: int, payload: bytes) -> bytes:
"make a JPEG segment from the given payload"
return struct.pack('>HH', marker, 2 + len(payload)) + payload
def make_jpeg_comment_segment(comment: bytes) -> bytes:
"make a JPEG comment/COM segment"
return make_jpeg_variable_segment(0xFFFE, comment)
# open source image
with Image.open("foo.jpeg") as im:
# save out with new JPEG comment
im.save('bar.jpeg', extra=make_jpeg_comment_segment("hello world".encode()))
# read file back in to ensure comment round-trips
with Image.open('bar.jpeg') as im:
print(im.app['COM'])
print(im.info['comment'])
Note that in my initial attempts I tried appending the comment segment at the end of the file, but Pillow wouldn't load this comment even after calling the .load() method to force it to load the entire JPEG file.
Update: The upcoming version Pillow version 9.4.0 will support this by passing a comment parameter while saving, e.g.:
with Image.open("foo.jpeg") as im:
im.save('bar.jpeg', comment="hello world")
hopefully that makes things easier!
I want to remove exif from an image before uploading to s3. I found a similar question (here), but it saves as a new file (I don't want it). Then I found an another way (here), then I tried to implemented it, everything was ok when I tested it. But after I deployed to prod, some users reported they occasionally got a problem while uploading images with a size of 1 MB and above, so they must try it several times.
So, I just want to make sure is my code correct?, or maybe there is something I can improve.
from PIL import Image
# I got body from http Request
img = Image.open(body)
img_format = img.format
# Save it in-memory to remove EXIF
temp = io.BytesIO()
img.save(temp, format=img_format)
body = io.BytesIO(temp.getvalue())
# Upload to s3
s3_client.upload_fileobj(body, BUCKET_NAME, file_key)
*I'm still finding out if this issue is caused by other things.
You should be able to copy the pixel data and palette (if any) from an existing image to a new stripped image like this:
from PIL import Image
# Load existing image
existing = Image.open(...)
# Create new empty image, same size and mode
stripped = Image.new(existing.mode, existing.size)
# Copy pixels, but not metadata, across
stripped.putdata(existing.getdata())
# Copy palette across, if any
if 'P' in existing.mode: stripped.putpalette(existing.getpalette())
Note that this will strip ALL metadata from your image... EXIF, comments, IPTC, 8BIM, ICC colour profiles, dpi, copyright, whether it is progressive, whether it is animated.
Note also that it will write JPEG images with PIL's default quality of 75 when you save it, which may or may not be the same as your original image had - i.e. the size may change.
If the above stripping is excessive, you could just strip the EXIF like this:
from PIL import Image
im = Image.open(...)
# Strip just EXIF data
if 'exif' in im.info: del im.info['exif']
When saving, you could test if JPEG, and propagate the existing quality forward with:
im.save(..., quality='keep')
Note: If you want to verify what metadata is in any given image, before and after stripping, you can use exiftool or ImageMagick on macOS, Linux and Windows, as follows:
exiftool SOMEIMAGE.JPG
magick identify -verbose SOMEIMAGE.JPG
builded a healthcare project on python flask framework.I need to duplicate the image or the report that is uploaded by patient and keep the original file away and only visible that duplicated report to show to any unauthorised access attackers.ie The duplicate image should be similar one but not same a little changes must have for the duplicated image. I think now u get the idea what im trying to
Do you mean to load an image file into python array, and create a new copy of that array?
If that's your demand, try using copy():
from PIL import Image
img = Image.open("example.png")
copy_img = img.copy()
And you can do whatever you want with copy_img, which is independent from the original one.
Edit: 2021/6/8 22:35
If you want to copy an image located in a directory, use shutils:
import shutil
upload_directory = "file/upload/upload.jpg"
save_directory = "file/copy/upload_copy.jpg"
shutil.copy(upload_directory, save_directory)
Then you have an copied image file located in your destination directory.
Please let me know it that works.
If I have multiple base64 strings that are images (one string = one image). Is there a way to combine them and decode to a single image file? i.e. from multiple base64 strings, merge and output a single image file.
I'm not sure how I would approach this using Pillow (or if I even need it).
Further clarification:
The source images are TIFFs that are encoded into base64
When I say "merge", I mean turning multiple images into a multi-page image like you see in a multi-page PDF
I dug through the Pillow documentation (v5.3) and found something that seems to work. Basically, there are two phases to this:
Save encoded base64 strings as TIF
Append them together and save to disk
Example using Python 3.7:
from PIL import Image
import io
import base64
base64_images = ["asdfasdg...", "asdfsdafas..."]
image_files = []
for base64_string in base64_images:
buffer = io.BytesIO(base64.b64decode(base64_string))
image_file = Image.open(buffer)
image_files.append(image_file)
combined_image = images_files[0].save(
'output.tiff',
save_all=True,
append_images=image_files[1:]
)
In the above code, I first create PIL Image objects from a bytes buffers in order to do this whole thing in-memory but you can probably use .save() and create a bunch of tempfiles instead if I/O isn't a concern.
Once I have all the PIL Image objects, I choose the first image (assuming they were in desired order in base64_images list) and append the rest of the images with append_images flag. The resulting image has all the frames in one output file.
I assume this pattern is extensible to any image format that supports the save_all and append_images keyword arguments. The Pillow documentation should let you know if it is supported.