Store jpg, gif, png, etc. in gae-datastore - python

I found an example on how to store png in datastore:
img = images.Image(img_data)
# Basically, we just want to make sure it's a PNG
# since we don't have a good way to determine image type
# through the API, but the API throws an exception
# if you don't do any transforms, so go ahead and use im_feeling_lucky.
img.im_feeling_lucky()
png_data = img.execute_transforms(images.PNG)
img.resize(60, 100)
thumbnail_data = img.execute_transforms(images.PNG)
Picture(data=png_data,
        thumbnail_data=thumbnail_data).put()
This code is very confusing to me, but it works for PNG. However, what should I do to be able to store the most common formats (jpg, gif, tiff, etc.)?

The quick answer
You can store binary data of any file type by using db.BlobProperty() in your model.
If you use the Image API to manipulate the image data, you're limited to inputting .jpg, .png, .gif, .bmp, .tiff, and .ico types, and outputting to either .jpg or .png.
Storing images
To simply store the images in the data store, use db.BlobProperty() in your model, and have this store the binary data for the picture. This is how the data is stored in the example code you linked to (see Line 85).
Because db.BlobProperty is not a picture type per se, but can store any binary data, some discipline is needed; there's no easy way to programmatically enforce a pictures-only constraint. On the upside, this means that you can store data of any type you want, including .jpg, .gif, .tiff, etc. files in addition to the .png format used in the example.
You'll probably want to create a new class for the model, as they have in the example, and store certain metadata ("name", "filetype", etc.) for the files in addition to the image's binary data. You can see an example of this at Line 65 in the example you linked to.
To store the image in the BlobProperty, you'll want to use db.put() (or the entity's put() method) to save the data, the same as with any other model; see the code starting on Line 215 in the example code you linked to.
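As a rough illustration (the model and field names here are hypothetical, not taken from the linked example), a model holding the image bytes plus some metadata might look like this:
from google.appengine.ext import db

class Picture(db.Model):
    # Metadata about the uploaded file
    name = db.StringProperty()
    content_type = db.StringProperty()
    # The raw image bytes, whatever the format (jpg, gif, tiff, png, ...)
    data = db.BlobProperty()

# Storing an upload (image_bytes would come from the incoming request)
picture = Picture(name='holiday.jpg',
                  content_type='image/jpeg',
                  data=db.Blob(image_bytes))
picture.put()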
Manipulating images
If you have to manipulate the image, you can use the Images API package. From the Overview of the Images API we can see the following:
The service accepts image data in the JPEG, PNG, GIF (including animated GIF), BMP, TIFF and ICO formats.
It can return transformed images in the JPEG and PNG formats. If the input format and the output format are different, the service converts the input data to the output format before performing the transformation.
So even though you can technically store any type in the datastore, the valid input and output types are limited if you're using this API to manipulate the images.
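For example (a hedged sketch, not taken from the question's linked code), you could accept any of the supported input formats and pick the output encoding based on the uploaded MIME type:
from google.appengine.api import images

def make_thumbnail(image_bytes, mime_type):
    """Resize any supported upload; output JPEG for photos, PNG otherwise."""
    img = images.Image(image_bytes)  # input may be JPEG, PNG, GIF, BMP, TIFF or ICO
    img.resize(width=100, height=100)
    output_encoding = images.JPEG if mime_type == 'image/jpeg' else images.PNG
    return img.execute_transforms(output_encoding)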

models.py
class Profile(db.Model):
    avatar = db.BlobProperty()
views.py
image = self.request.get('MyFile')
if image:
    mime = self.request.POST['MyFile'].type
    mime = mime.split('/')
    icon_image = images.resize(image, 460, 460)
    prof.avatar = db.Blob(icon_image)
    if mime[1] in ('jpeg', 'jpg', 'gif', 'png'):
        prof.put()
display image
class disp_image(webapp.RequestHandler):
    def get(self):
        # profile is the Profile entity fetched for the requested user
        if profile.avatar is not None:
            image = profile.avatar
            self.response.headers['Content-Type'] = "image/png"
            self.response.out.write(image)
Templates
<img id="crop" src='/module/disp_image' alt="profile image" >

Related

Save JPEG comment using Pillow

I need to save an image in Python (created as a Numpy array) as a JPEG file, while including a "comment" in the file with some specific metadata. This metadata will be used by another (third-party) application and is a simple ASCII string. I have a sample image including such a "comment", which I can read out using Pillow (PIL) via the image.info['comment'] or the image.app['COM'] property. However, when I try a simple round-trip, i.e. loading my sample image and saving it again under a different file name, the comment is no longer preserved. Equally, I found no way to include a comment in a newly created image.
I am aware that EXIF tags are the preferred way to save metadata in JPEG images, but as mentioned, the third-party application only accepts this data as a "comment", not as EXIF, which I cannot change. After reading this question, I looked into the binary structure of my sample file and found the comment at the start of the file, after a few bytes of some other (meta)data. However, I don't know a lot about binary file manipulation, and I was wondering if there is a more elegant way, other than messing with the binary...
EDIT: minimum example:
from PIL import Image
img = Image.open(path) # where path is the path to the sample image
# this prints the desired metadata if it is correctly saved in loaded image
print(img.info["comment"])
img.save(new_path) # save with different file name
img.close()
# now open to see if it has been saved correctly
new_img = Image.open(new_path)
print(new_img.info['comment']) # now results in KeyError
I also tried img.save(new_path, info=img.info), but this does not seem to have an effect. Since img.info['comment'] appears identical to img.app['COM'], I tried img.save(new_path, app=img.app), again does not work.
Just been having a play with this and I couldn't see anything directly in Pillow to support this. I've found that the save() method supports a parameter called extra that can be used to pass arbitrary bytes to the output file.
We then just need a simple method to turn a comment into a valid JPEG segment, for example:
import struct
from PIL import Image

def make_jpeg_variable_segment(marker: int, payload: bytes) -> bytes:
    "make a JPEG segment from the given payload"
    return struct.pack('>HH', marker, 2 + len(payload)) + payload

def make_jpeg_comment_segment(comment: bytes) -> bytes:
    "make a JPEG comment/COM segment"
    return make_jpeg_variable_segment(0xFFFE, comment)

# open source image
with Image.open("foo.jpeg") as im:
    # save out with new JPEG comment
    im.save('bar.jpeg', extra=make_jpeg_comment_segment("hello world".encode()))

# read file back in to ensure comment round-trips
with Image.open('bar.jpeg') as im:
    print(im.app['COM'])
    print(im.info['comment'])
Note that in my initial attempts I tried appending the comment segment at the end of the file, but Pillow wouldn't load this comment even after calling the .load() method to force it to load the entire JPEG file.
Update: The upcoming Pillow version 9.4.0 will support this by passing a comment parameter while saving, e.g.:
with Image.open("foo.jpeg") as im:
    im.save('bar.jpeg', comment="hello world")
Hopefully that makes things easier!

Remove EXIF from Image Before Upload to S3 in Python

I want to remove EXIF data from an image before uploading it to S3. I found a similar question (here), but it saves the result as a new file (which I don't want). Then I found another way (here) and implemented it, and everything was fine when I tested it. But after I deployed to prod, some users reported that they occasionally hit a problem while uploading images of 1 MB and above, so they have to try several times.
So I just want to make sure: is my code correct, or is there something I can improve?
import io
from PIL import Image

# I got body from the http request
img = Image.open(body)
img_format = img.format

# Save it in-memory to remove EXIF
temp = io.BytesIO()
img.save(temp, format=img_format)
body = io.BytesIO(temp.getvalue())
# Upload to s3
s3_client.upload_fileobj(body, BUCKET_NAME, file_key)
*I'm still finding out if this issue is caused by other things.
You should be able to copy the pixel data and palette (if any) from an existing image to a new stripped image like this:
from PIL import Image
# Load existing image
existing = Image.open(...)
# Create new empty image, same size and mode
stripped = Image.new(existing.mode, existing.size)
# Copy pixels, but not metadata, across
stripped.putdata(existing.getdata())
# Copy palette across, if any
if 'P' in existing.mode: stripped.putpalette(existing.getpalette())
Note that this will strip ALL metadata from your image... EXIF, comments, IPTC, 8BIM, ICC colour profiles, dpi, copyright, whether it is progressive, whether it is animated.
Note also that it will write JPEG images with PIL's default quality of 75 when you save them, which may or may not match your original image's quality, i.e. the size may change.
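For instance (a hedged sketch; quality=95 is an arbitrary choice, not taken from the answer), you can set the JPEG quality explicitly when saving the stripped copy into the in-memory buffer from the question:
import io

# Save the stripped image into a fresh buffer, choosing the JPEG quality
# explicitly instead of relying on Pillow's default of 75.
temp = io.BytesIO()
stripped.save(temp, format='JPEG', quality=95)
body = io.BytesIO(temp.getvalue())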
If the above stripping is excessive, you could just strip the EXIF like this:
from PIL import Image
im = Image.open(...)
# Strip just EXIF data
if 'exif' in im.info: del im.info['exif']
When saving, you could test whether the image is a JPEG and propagate the existing quality forward with:
im.save(..., quality='keep')
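A minimal sketch of that test, reusing the img, img_format and temp names from the question's code (quality='keep' only applies to images that were loaded from a JPEG file):
# Re-save in memory, keeping the original JPEG quantization tables when possible
temp = io.BytesIO()
if img_format == 'JPEG':
    img.save(temp, format='JPEG', quality='keep')
else:
    img.save(temp, format=img_format)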
Note: If you want to verify what metadata is in any given image, before and after stripping, you can use exiftool or ImageMagick on macOS, Linux and Windows, as follows:
exiftool SOMEIMAGE.JPG
magick identify -verbose SOMEIMAGE.JPG

Adding custom extratags with tifffile

I'm trying to write a script to simplify my everyday life in the lab. I operate one ThermoFisher / FEI scanning electron microscope and I save all my pictures in the TIFF format.
The microscope software is adding an extensive custom TiffTag (code 34682) containing all the microscope / image parameters.
In my script, I would like to open an image, perform some manipulations and then save the data in a new file, including the original FEI metadata. To do so, I would like to use a python script using the tifffile module.
I can open the image file and perform the needed manipulations without problems. Retrieving the FEI metadata from the input file is also working fine.
I was thinking to use the imwrite function to save the output file and using the extratags optional argument to transfer to the output file the original FEI metadata.
This is an extract of the tifffile documentation about the extratags:
extratags : sequence of tuples
    Additional tags as [(code, dtype, count, value, writeonce)].
    code : int
        The TIFF tag Id.
    dtype : int or str
        Data type of items in 'value'. One of TIFF.DATATYPES.
    count : int
        Number of data values. Not used for string or bytes values.
    value : sequence
        'Count' values compatible with 'dtype'.
        Bytes must contain count values of dtype packed as binary data.
    writeonce : bool
        If True, the tag is written to the first page of a series only.
Here is a snippet of my code.
my_extratags = [(input_tags['FEI_HELIOS'].code,
                 input_tags['FEI_HELIOS'].dtype,
                 input_tags['FEI_HELIOS'].count,
                 input_tags['FEI_HELIOS'].value, True)]

tifffile.imwrite('output.tif', data, extratags=my_extratags)
This code does not work; it complains that the value of the extra tag should be 7-bit ASCII encoded. This already looks very strange to me, because I haven't touched the metadata and am just copying it to the output file.
If I convert the metadata tag value in a string as below:
my_extratags = [(input_tags['FEI_HELIOS'].code,
                 input_tags['FEI_HELIOS'].dtype,
                 input_tags['FEI_HELIOS'].count,
                 str(input_tags['FEI_HELIOS'].value), True)]

tifffile.imwrite('output.tif', data, extratags=my_extratags)
the code works, the image is saved, and the metadata tag corresponding to 'FEI_HELIOS' is created, but it is empty!
Can you help me find what I am doing wrong?
I don't need to use tifffile, but I would prefer to use Python rather than ImageJ because I already have several other Python scripts and I would like to integrate this new one with them.
Thanks a lot in advance!
toto
ps. I'm a frequent user of stackoverflow, but this is actually my first question!
In principle the approach is correct. However, tifffile parses the raw values of certain tags, including FEI_HELIOS, to dictionaries or other Python types. To get the raw tag value for rewriting, it needs to be read from file again. In these cases, use the internal TiffTag._astuple function to get an extratag compatible tuple of the tag, e.g.:
import tifffile
with tifffile.TiffFile('FEI_SEM.tif') as tif:
    assert tif.is_fei
    page = tif.pages[0]
    image = page.asarray()
    ...  # process image
    with tifffile.TiffWriter('copy1.tif') as out:
        out.write(
            image,
            photometric=page.photometric,
            compression=page.compression,
            planarconfig=page.planarconfig,
            rowsperstrip=page.rowsperstrip,
            resolution=(
                page.tags['XResolution'].value,
                page.tags['YResolution'].value,
                page.tags['ResolutionUnit'].value,
            ),
            extratags=[page.tags['FEI_HELIOS']._astuple()],
        )
This approach does not preserve Exif metadata, which tifffile cannot write.
Another approach, since FEI files seem to be written uncompressed, is to directly memory map the image data in the file to a numpy array and manipulate that array:
import shutil
import tifffile
shutil.copyfile('FEI_SEM.tif', 'copy2.tif')
image = tifffile.memmap('copy2.tif')
... # process image
image.flush()
Finally, consider tifftools for rewriting TIFF files where tifffile is currently failing, e.g. Exif metadata.

Merge multiple base64 images into one

If I have multiple base64 strings that are images (one string = one image). Is there a way to combine them and decode to a single image file? i.e. from multiple base64 strings, merge and output a single image file.
I'm not sure how I would approach this using Pillow (or if I even need it).
Further clarification:
The source images are TIFFs that are encoded into base64
When I say "merge", I mean turning multiple images into a multi-page image like you see in a multi-page PDF
I dug through the Pillow documentation (v5.3) and found something that seems to work. Basically, there are two phases to this:
Save encoded base64 strings as TIF
Append them together and save to disk
Example using Python 3.7:
from PIL import Image
import io
import base64

base64_images = ["asdfasdg...", "asdfsdafas..."]
image_files = []
for base64_string in base64_images:
    buffer = io.BytesIO(base64.b64decode(base64_string))
    image_file = Image.open(buffer)
    image_files.append(image_file)

image_files[0].save(
    'output.tiff',
    save_all=True,
    append_images=image_files[1:]
)
In the above code, I first create PIL Image objects from bytes buffers in order to do the whole thing in memory, but you can probably use .save() and create a bunch of temp files instead if I/O isn't a concern.
Once I have all the PIL Image objects, I choose the first image (assuming they were in desired order in base64_images list) and append the rest of the images with append_images flag. The resulting image has all the frames in one output file.
I assume this pattern is extensible to any image format that supports the save_all and append_images keyword arguments; the Pillow documentation should tell you whether a given format supports them.
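For example (a hedged sketch; the PDF output and the convert('RGB') step are my own assumptions, not part of the original answer), the same pattern can write a multi-page PDF:
# Pillow's PDF writer also accepts save_all / append_images.
# image_files is the list of PIL Image objects built above.
rgb_pages = [im.convert('RGB') for im in image_files]
rgb_pages[0].save(
    'output.pdf',
    save_all=True,
    append_images=rgb_pages[1:]
)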

Tensorflow: How to encode and read bmp images?

I am trying to read .bmp images, do some augmentation on these, save them to a .tfrecords file and then open the .tfrecords files and use the images for image classification. I know that there is a tf.image.encode_jpeg() and a tf.image.encode_png() function, but there is no tf.image.encode_bmp() function. I know that .bmp images are uncompressed, so I've tried to simply base64-encode, np.tostring() and np.tobytes() the images, but I get the following error when trying to decode these formats:
tensorflow.python.framework.errors_impl.InvalidArgumentError: channels attribute 3 does not match bits per pixel from file <some long number>
My take is that tensorflow, in its encoding to jpeg or png, does something extra with the byte encoding of the images; saving information about array dimensionality, etc. However, I am quite clueless about this, so any help would be great!
Some code to show what it is I am trying to achieve:
with tf.gfile.FastGFile(filename, 'rb') as f:
    image_data = f.read()

bmp_data = tf.placeholder(dtype=tf.string)
decode_bmp = tf.image.decode_bmp(bmp_data, channels=3)
augmented_bmp = <do some augmentation on decode_bmp>

sess = tf.Session()
np_img = sess.run(augmented_bmp, feed_dict={bmp_data: image_data})
byte_img = np_img.tostring()

# Write byte_img to file using tf.train.Example
writer = tf.python_io.TFRecordWriter(<output_tfrecords_filename>)
example = tf.train.Example(features=tf.train.Features(feature={
    'encoded_img': tf.train.Feature(bytes_list=tf.train.BytesList(value=[byte_img]))}))
writer.write(example.SerializeToString())
# Read img from file
dataset = tf.data.TFRecordDataset(<img_file>)
dataset = dataset.map(parse_img_fn)
The parse_img_fn may be condensed to the following:
def parse_img_fn(serialized_example):
    features = tf.parse_single_example(serialized_example, feature_map)
    image = features['encoded_img']
    image = tf.image.decode_bmp(image, channels=3)  # This is where the decoding fails
    features['encoded_img'] = image
    return features
In your comment, surely you mean encode instead of encrypt.
The BMP file format is quite simplistic, consisting of a bunch of headers and pretty much raw pixel data. This is why BMP images are so big. I suppose this is also why TensorFlow developers did not bother to write a function to encode arrays (representing images) into this format. Few people still use it. It is recommended to use PNG instead, which performs lossless compression of the image. Or, if you can deal with lossy compression, use JPG.
TensorFlow doesn't do anything special when encoding images. It just returns the bytes that represent the image in that format, similar to what matplotlib does when you call savefig (except MPL also writes the bytes to a file).
Suppose you produce a numpy array where the top rows are 0 and the bottom rows are 255. This is an array of numbers which, if you think of it as a picture, would represent 2 horizontal bands, the top one black and the bottom one white.
If you want to see this picture in another program (GIMP) you need to encode this information in a standard format, such as PNG. Encoding means adding some headers and metadata and, optionally, compressing the data.
Now that it is a bit more clear what encoding is, I recommend you work with PNG images.
with tf.gfile.FastGFile('image.png', 'rb') as f:
    # get the bytes representing the image
    # this is a 1D array (string) which includes header and stuff
    raw_png = f.read()

# decode the raw representation into an array
# so we have a 2D array representing the image (3D if colour)
image = tf.image.decode_png(raw_png)

# augment the image, e.g. with random brightness
# (max_delta is an arbitrary choice here)
augmented_img = tf.image.random_brightness(image, max_delta=0.2)

# convert the array back into a compressed representation
# by encoding it into png
# we now end up with a string again
augmented_png = tf.image.encode_png(augmented_img, compression=9)

# Write augmented_png to file using tf.train.Example
writer = tf.python_io.TFRecordWriter(<output_tfrecords_filename>)
example = tf.train.Example(features=tf.train.Features(feature={
    'encoded_img': tf.train.Feature(bytes_list=tf.train.BytesList(value=[augmented_png]))}))
writer.write(example.SerializeToString())
# Read img from file
dataset = tf.data.TFRecordDataset(<img_file>)
dataset = dataset.map(parse_img_fn)
There are a few important pieces of advice:
don't use numpy.tostring. This returns a HUUGE representation because each pixel is represented as a float, and they are all concatenated. No compression, nothing. Try and check the file size :)
no need to pass the data back into Python by using tf.Session. You can perform all the ops on the TF side. This way you have an input graph which you can reuse as part of an input pipeline; see the sketch after this list.
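A hedged sketch of that read side, assuming the 'encoded_img' PNG feature written above and the TF 1.x API used elsewhere in this answer (the feature_map contents and the file name are illustrative):
def parse_img_fn(serialized_example):
    # Describe the single bytes feature that was written into the TFRecord
    feature_map = {'encoded_img': tf.FixedLenFeature([], tf.string)}
    features = tf.parse_single_example(serialized_example, feature_map)
    # Decode the stored PNG bytes back into an image tensor inside the graph
    image = tf.image.decode_png(features['encoded_img'], channels=3)
    return image

dataset = tf.data.TFRecordDataset('augmented.tfrecords')
dataset = dataset.map(parse_img_fn)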
There is no encode_bmp in the main TensorFlow package, but if you import tensorflow_io (also an officially supported Google package) you can find the encode_bmp method there.
For the documentation see:
https://www.tensorflow.org/io/api_docs/python/tfio/image/encode_bmp
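A minimal sketch of that route (assuming the tensorflow-io package is installed; the file names are illustrative):
import tensorflow as tf
import tensorflow_io as tfio

# Read and decode the BMP with core TensorFlow
raw_bmp = tf.io.read_file('image.bmp')
image = tf.image.decode_bmp(raw_bmp, channels=3)

# ... augment the image here ...

# Encode back to BMP bytes with tensorflow_io and write them out
encoded = tfio.image.encode_bmp(image)
tf.io.write_file('augmented.bmp', encoded)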
