Basically, I want to add a few bytes to my new PNG image file. An example case is like the following code:
with open("example.png", "rb") as img:  # Open the image and read it as binary
    hex_data = img.read().hex()  # Read the file's bytes as a hexadecimal string
add_hex = hex_data + "7feab1e74a4bdb755cca"  # Append some bytes (as hex)
to_bytes_img = bytes.fromhex(add_hex)  # Convert the hex string back to bytes
with open("example2.png", "wb") as f:  # Write the new image file
    f.write(to_bytes_img)
But the problem is that I have a special case that requires me to perform the above operation using OpenCV (cv2), where cv2.imread() only reads and stores the pixels as a NumPy array (an array of pixels, not the whole file).
Then I want to write that image to a new file with cv2.imwrite(), which rebuilds the image and saves the PNG to disk. My question is: how do I add some bytes to the PNG image file (in a buffer/in memory) before the cv2.imwrite() operation?
I could probably do it using with open() as above, but opening the file and writing it to disk a second time would be very inefficient.
I have to add an image to a database, so I open the image as binary and it stores it this way:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01\x01\x03\x00\x00\x00%\xdbV\xca\x00\x00\x00\x03PLTE\x00\x00\x00\xa7z=\xda\x00\x00\x00\x01tRNS\x00@\xe6\xd8f\x00\x00\x00\nIDAT\x08\xd7c`\x00\x00\x00\x02\x00\x01\xe2!\xbc3\x00\x00\x00\x00IEND\xaeB`\x82'
However, I need it to be stored this way:
0x89504e470d0a1a0a0000000d494844520000000100000001010300000025db56ca00000003504c5445000000a77a3dda0000000174524e530040e6d8660000000a4944415408d76360000000020001e221bc330000000049454e44ae426082
It is my first ever time working with binary files so there is probably something basic I'm not understanding.
This is my code for opening the image in python:
with open("1x1.jpg", 'rb') as File:
    binaryData = File.read()
print(binaryData)
This is the image: (1x1 empty pixel, I changed the extension from png to jpg, the original image is from https://upload.wikimedia.org/wikipedia/commons/c/ca/1x1.png)
binaryData is a bytes object, so you need to convert it to its hexadecimal representation:
binaryData.hex()
returns
'89504e470d0a1a0a0000000d494844520000000100000001010300000025db56ca00000003504c5445000000a77a3dda0000000174524e530040e6d8660000000a4944415408d76360000000020001e221bc330000000049454e44ae426082'
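Note that hex() does not add the 0x prefix shown in the question. A minimal sketch of adding it and reversing the conversion, assuming the database really expects a 0x-prefixed hex literal (to_hex_literal is a hypothetical helper name, and the PNG signature stands in for the full file contents):

```python
def to_hex_literal(data: bytes) -> str:
    """Return the bytes as a lowercase hex string with a 0x prefix."""
    return "0x" + data.hex()


# The 8-byte PNG signature, used here in place of the whole file.
png_signature = b"\x89PNG\r\n\x1a\n"
literal = to_hex_literal(png_signature)  # '0x89504e470d0a1a0a'

# Strip the prefix to get the original bytes back.
restored = bytes.fromhex(literal[2:])
```

The round trip through bytes.fromhex() is a quick way to confirm nothing was lost in the conversion.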
I have an image in .img format and I want to open it in Python. How can I do it?
I have an interference pattern in *.img format and I need to process it. I tried to open it using GDAL, but I get an error:
ERROR 4: `frame_064_0000.img' not recognized as a supported file format.
If your image is 1,024 x 1,024 pixels, that would make 1048576 bytes, if the data are 8-bit. But your file is 2097268 bytes, which is just a bit more than double the expected size, so I guessed your data are 16-bit, i.e. 2 bytes per pixel. That means there are 2097268-(2*1024*1024), i.e. 116 bytes of other junk in the file. Folks normally store that extra stuff at the start of the file. So, I just took the last 2097152 bytes of your file and assumed that was a 16-bit greyscale image at 1024x1024.
You can do it at the command-line in Terminal with ImageMagick like this:
magick -depth 16 -size 1024x1024+116 gray:frame_064_0000.img -auto-level result.png
In Python, you could open the file, seek backwards 2097152 bytes from the end of the file and read that into a 1024x1024 np.array of uint16.
That will look something like this:
import numpy as np
from PIL import Image
filename = 'frame_064_0000.img'
# set width and height
w, h = 1024, 1024
with open(filename, 'rb') as f:
    # Seek backwards from end of file by 2 bytes per pixel
    f.seek(-w*h*2, 2)
    img = np.fromfile(f, dtype=np.uint16).reshape((h,w))
# Save as PNG, and retain 16-bit resolution
Image.fromarray(img).save('result.png')
# Alternative to line above - save as JPEG, but lose 16-bit resolution
Image.fromarray((img>>8).astype(np.uint8)).save('result.jpg')
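The size arithmetic used in this answer can be reproduced in a few lines. This is only a sketch: guess_layout is a hypothetical helper, and it assumes the file is a header followed by tightly packed pixel data, which is the same guess the answer makes.

```python
def guess_layout(file_size, width, height, candidates=(1, 2, 4)):
    """Return (bytes_per_pixel, header_bytes) for the largest sample size that fits.

    candidates must be in ascending order. Returns None if even the
    smallest candidate needs more bytes than the file contains.
    """
    pixels = width * height
    best = None
    for bytes_per_pixel in candidates:
        payload = pixels * bytes_per_pixel
        if payload <= file_size:
            # Whatever is left over is assumed to be header data.
            best = (bytes_per_pixel, file_size - payload)
    return best


# File size and dimensions quoted in the answer above.
layout = guess_layout(2097268, 1024, 1024)  # (2, 116)
```

A result of (2, 116) corresponds to 16-bit pixels preceded by 116 bytes of other data, matching the +116 offset in the ImageMagick command.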
Suppose I have multiple base64 strings that are images (one string = one image). Is there a way to combine them and decode them into a single image file, i.e. merge multiple base64 strings and output a single image file?
I'm not sure how I would approach this using Pillow (or if I even need it).
Further clarification:
The source images are TIFFs that are encoded into base64
When I say "merge", I mean turning multiple images into a multi-page image like you see in a multi-page PDF
I dug through the Pillow documentation (v5.3) and found something that seems to work. Basically, there are two phases to this:
Decode the base64 strings and open them as TIF images
Append them together and save to disk
Example using Python 3.7:
from PIL import Image
import io
import base64
base64_images = ["asdfasdg...", "asdfsdafas..."]
image_files = []
for base64_string in base64_images:
    buffer = io.BytesIO(base64.b64decode(base64_string))
    image_file = Image.open(buffer)
    image_files.append(image_file)
# save() returns None, so there is nothing useful to assign
image_files[0].save(
    'output.tiff',
    save_all=True,
    append_images=image_files[1:]
)
In the above code, I first create PIL Image objects from bytes buffers in order to do the whole thing in memory, but you can probably use .save() and create a bunch of temp files instead if I/O isn't a concern.
Once I have all the PIL Image objects, I take the first image (assuming they are in the desired order in the base64_images list) and append the rest of the images with the append_images keyword argument. The resulting image has all the frames in one output file.
I assume this pattern is extensible to any image format that supports the save_all and append_images keyword arguments. The Pillow documentation should let you know if it is supported.
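To check that the merge really produced a multi-page file, the output can be reopened and its frames counted. The sketch below is self-contained, so it generates its own tiny TIFFs; encode_tiff_base64 and merge_base64_tiffs are hypothetical helper names, and the output goes to an in-memory buffer rather than to 'output.tiff'.

```python
import base64
import io

from PIL import Image


def encode_tiff_base64(color):
    """Make a tiny single-page TIFF and return it as a base64 string."""
    buf = io.BytesIO()
    Image.new("RGB", (8, 8), color).save(buf, format="TIFF")
    return base64.b64encode(buf.getvalue()).decode("ascii")


def merge_base64_tiffs(base64_images):
    """Decode each string to an image and save them all as one multi-page TIFF."""
    pages = [Image.open(io.BytesIO(base64.b64decode(s))) for s in base64_images]
    out = io.BytesIO()
    pages[0].save(out, format="TIFF", save_all=True, append_images=pages[1:])
    return out.getvalue()


# Three stand-in images in place of real base64 input.
base64_images = [encode_tiff_base64(c) for c in ("red", "green", "blue")]
merged = merge_base64_tiffs(base64_images)

# n_frames reports how many pages the merged TIFF contains.
page_count = Image.open(io.BytesIO(merged)).n_frames
```

If page_count matches the number of input strings, every image ended up as its own page.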
I am trying to read .bmp images, do some augmentation on these, save them to a .tfrecords file and then open the .tfrecords files and use the images for image classification. I know that there is a tf.image.encode_jpeg() and a tf.image.encode_png() function, but there is no tf.image.encode_bmp() function. I know that .bmp images are uncompressed, so I've tried to simply base64-encode, np.tostring() and np.tobytes() the images, but I get the following error when trying to decode these formats:
tensorflow.python.framework.errors_impl.InvalidArgumentError: channels attribute 3 does not match bits per pixel from file <some long number>
My take is that tensorflow, in its encoding to jpeg or png, does something extra with the byte encoding of the images; saving information about array dimensionality, etc. However, I am quite clueless about this, so any help would be great!
Some code to show what it is I am trying to achieve:
with tf.gfile.FastGFile(filename, 'rb') as f:
    image_data = f.read()
bmp_data = tf.placeholder(dtype=tf.string)
decode_bmp = tf.image.decode_bmp(bmp_data, channels=3)
augmented_bmp = <do some augmentation on decode_bmp>
sess = tf.Session()
np_img = sess.run(augmented_bmp, feed_dict={bmp_data: image_data})
byte_img = np_img.tostring()
# Write byte_img to file using tf.train.Example
writer = tf.python_io.TFRecordWriter(<output_tfrecords_filename>)
example = tf.train.Example(features=tf.train.Features(feature={
    'encoded_img': tf.train.Feature(bytes_list=tf.train.BytesList(value=[byte_img]))}))
writer.write(example.SerializeToString())
# Read img from file
dataset = tf.data.TFRecordDataset(<img_file>)
dataset = dataset.map(parse_img_fn)
The parse_img_fn may be condensed to the following:
def parse_img_fn(serialized_example):
    features = tf.parse_single_example(serialized_example, feature_map)
    image = features['encoded_img']
    image = tf.image.decode_bmp(image, channels=3) # This is where the decoding fails
    features['encoded_img'] = image
    return features
In your comment, surely you mean encode rather than encrypt.
The BMP file format is quite simplistic, consisting of a bunch of headers and pretty much raw pixel data. This is why BMP images are so big. I suppose this is also why TensorFlow developers did not bother to write a function to encode arrays (representing images) into this format. Few people still use it. It is recommended to use PNG instead, which performs lossless compression of the image. Or, if you can deal with lossy compression, use JPG.
TensorFlow doesn't do anything special for encoding images. It just returns the bytes that represent the image in that format, similar to what matplotlib does when you call savefig (except that matplotlib also writes the bytes to a file).
Suppose you produce a numpy array where the top rows are 0 and the bottom rows are 255. This is an array of numbers which, if you think of it as a picture, would represent 2 horizontal bands, the top one black and the bottom one white.
If you want to see this picture in another program (GIMP) you need to encode this information in a standard format, such as PNG. Encoding means adding some headers and metadata and, optionally, compressing the data.
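To make "headers, metadata and compression" concrete, here is a rough sketch of a PNG encoder for the two-band picture, written with only the standard library. It is an illustration of the file layout, not how TensorFlow implements encode_png; the helper names are made up, and it handles only 8-bit greyscale with no filtering.

```python
import struct
import zlib
from typing import List

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, then CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def encode_gray_png(rows: List[bytes]) -> bytes:
    """Encode equal-length rows of 8-bit grey values as a complete PNG file."""
    height = len(rows)
    width = len(rows[0])
    # IHDR: width, height, bit depth 8, colour type 0 (greyscale),
    # compression method 0, filter method 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Every scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + row for row in rows)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(raw))
        + png_chunk(b"IEND", b"")
    )


# Two horizontal bands, 8 pixels wide: 4 black rows above 4 white rows.
rows = [bytes([0] * 8)] * 4 + [bytes([255] * 8)] * 4
png_bytes = encode_gray_png(rows)
```

Writing png_bytes to a file with a .png extension should give something GIMP can open, while the 64 raw pixel values alone would not be recognised as an image.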
Now that it is a bit more clear what encoding is, I recommend you work with PNG images.
with tf.gfile.FastGFile('image.png', 'rb') as f:
    # get the bytes representing the image
    # this is a 1D array (string) which includes header and stuff
    raw_png = f.read()
# decode the raw representation into an array
# so we have 2D array representing the image (3D if colour)
image = tf.image.decode_png(raw_png)
# augment the image using e.g. random brightness
# (max_delta is a required argument; 0.2 is an arbitrary example value)
augmented_img = tf.image.random_brightness(image, max_delta=0.2)
# convert the array back into a compressed representation
# by encoding it into png
# we now end up with a string again
augmented_png = tf.image.encode_png(augmented_img, compression=9)
# Write augmented_png to file using tf.train.Example
# (BytesList expects bytes, so the augmented_png tensor has to be evaluated first)
writer = tf.python_io.TFRecordWriter(<output_tfrecords_filename>)
example = tf.train.Example(features=tf.train.Features(feature={
    'encoded_img': tf.train.Feature(bytes_list=tf.train.BytesList(value=[augmented_png]))}))
writer.write(example.SerializeToString())
# Read img from file
dataset = tf.data.TFRecordDataset(<img_file>)
dataset = dataset.map(parse_img_fn)
There are a few important pieces of advice:
Don't use numpy.tostring. It returns the raw, uncompressed array buffer with no header describing the shape or dtype, so the result is large (even more so if the pixels are stored as floats) and is not something decode_bmp can parse. Try it and check the file size :)
There is no need to pass back into Python by using tf.Session. You can perform all the ops on the TF side. This way you have an input graph which you can reuse as part of an input pipeline.
There is no encode_bmp in the tensorflow main package, but if you import tensorflow_io (also a Google officially supported package) you can find the encode_bmp method there.
For the documentation see:
https://www.tensorflow.org/io/api_docs/python/tfio/image/encode_bmp
resized_image = Image.resize((100,200))
Image is an instance of Pillow's Image class, and I've used the resize method to resize the original image.
How do I find the new file size (in bytes) of resized_image without having to save it to disk and then read it again?
The file doesn't have to be written to disk. A file-like object does the trick:
from io import BytesIO
# do something that defines `image`...
img_file = BytesIO()
image.save(img_file, 'png')
print(img_file.tell())
This prints the size in bytes of the image saved in PNG format without saving to disk.
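The same idea can be wrapped in a small function to compare how large the image would be in different formats. This is a sketch only: encoded_size is a hypothetical helper, the solid-colour test image stands in for a real one, and quality=85 is an arbitrary JPEG setting.

```python
from io import BytesIO

from PIL import Image


def encoded_size(image, fmt, **save_kwargs):
    """Return the number of bytes the image occupies when saved as fmt."""
    buffer = BytesIO()
    image.save(buffer, format=fmt, **save_kwargs)
    return buffer.getbuffer().nbytes


# Stand-in for a real image loaded with Image.open().
image = Image.new("RGB", (400, 300), "navy")
resized_image = image.resize((100, 200))

png_size = encoded_size(resized_image, "PNG")
jpeg_size = encoded_size(resized_image, "JPEG", quality=85)
```

The two numbers generally differ, which is why the size only has a meaning once a format and its save options have been chosen.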
You can't. PIL deals with image manipulations in memory. There's no way of knowing the size it will have on disk in a specific format.
You can save it to a temp file and read the size using os.stat('/tmp/tempfile.jpg').st_size