Here's a script that reads a JPG image and then writes 2 JPG images:
import cv2
# https://github.com/opencv/opencv/blob/master/samples/data/baboon.jpg
input_path = './baboon.jpg'
# Read image
im = cv2.imread(input_path)
# Write image using default quality (95)
cv2.imwrite('./baboon_out.jpg', im)
# Write image using best quality
cv2.imwrite('./baboon_out_100.jpg', im, [cv2.IMWRITE_JPEG_QUALITY, 100])
after running the above script, here's what the files look like:
.
├── baboon.jpg
├── baboon_out.jpg
├── baboon_out_100.jpg
└── main.py
However, the MD5 checksums of the JPGs created by the script do not match the original:
>>> md5 ./baboon.jpg
MD5 (./baboon.jpg) = 9a7171af1d6c6f0901d36d04e1bd68ad
>>> md5 ./baboon_out.jpg
MD5 (./baboon_out.jpg) = 1014782b9e228e848bc63bfba3fb49d9
>>> md5 ./baboon_out_100.jpg
MD5 (./baboon_out_100.jpg) = dbadd2fadad900e289e285393778ad89
Is there any way to preserve the original image content with OpenCV? In which step is the data being modified?
It's a simplification, but if you're reading a JPEG, the checksum will almost always change: JPEG is a lossy format, so the image is re-encoded on save, and an avalanching hash turns even a one-bit difference into a completely different digest. The checksum will definitely change if you do any meaningful work on the image, even (especially) if you manipulate the bits of the file directly rather than loading it as an image.
If you want the hash to be the same, just copy the file!
If you're looking for pixel-perfect manipulation, try a lossless format like PNG, and decide whether you are comparing the pixels or meta attributes à la Exif (i.e. should the Google, GitHub (Microsoft), and Twitter-affiliated fields from your baboon.jpg be included in your comparison? What about color data or depth information?)
$ exiftool baboon.jpg | grep Description
Description : Open Source Computer Vision Library. Contribute to opencv/opencv development by creating an account on GitHub.
Twitter Description : Open Source Computer Vision Library. Contribute to opencv/opencv development by creating an account on GitHub.
Finally, watch out specifically when using MD5, as it is widely considered broken; a more modern hash like SHA-256 might help with unforeseen issues.
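If a bit-exact duplicate is all you need, a plain file copy preserves the checksum, and hashlib makes swapping MD5 for SHA-256 trivial. A minimal sketch (the throwaway file below just stands in for baboon.jpg):

```python
import hashlib
import os
import shutil
import tempfile

def file_digest(path, algo="sha256"):
    """Hash a file's raw bytes; defaults to SHA-256 instead of MD5."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# demo with a throwaway file standing in for baboon.jpg
src = os.path.join(tempfile.mkdtemp(), "baboon.jpg")
with open(src, "wb") as f:
    f.write(b"\xff\xd8\xff\xe0 fake jpeg bytes \xff\xd9")

dst = src.replace("baboon.jpg", "baboon_copy.jpg")
shutil.copyfile(src, dst)                    # byte-for-byte copy
assert file_digest(src) == file_digest(dst)  # checksums match
```

Unlike an imread/imwrite round trip, nothing here decodes or re-encodes the image, so the bytes (and therefore the hash) cannot change.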
There's a fundamental misconception here about the way images are stored and what lossy/lossless means.
Let's imagine we have a really simple 1-pixel grey image and that pixel has brightness 100. Let's make our own format for storing our image... I'm going to choose to store it as a Python script. Ok, here's our file:
print(100)
Now, let's do another lossless version, like PNG might:
print(10*10)
Now we have recreated our image losslessly, because we still get 100, but the MD5 hash is clearly going to be different. PNG does filtering before compressing and it checks whether each pixel is similar to the ones above and to the left in order to save space. One encoder might decide it can do better by looking at the pixel above, another one might think it can do better by looking at the pixel to the left. The result is the same.
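The encoder-choice point is easy to demonstrate with zlib, the DEFLATE implementation that PNG compression is built on: two encoders making different choices produce different bytes, yet decompress to identical data.

```python
import zlib

pixels = bytes(range(256)) * 64           # stand-in for raw scanline data

fast = zlib.compress(pixels, level=1)     # "encoder A": speed-oriented choices
best = zlib.compress(pixels, level=9)     # "encoder B": size-oriented choices

assert fast != best                       # the compressed bytes differ...
assert zlib.decompress(fast) == zlib.decompress(best) == pixels  # ...the data doesn't
```

Hash the two compressed outputs and you get different digests for the same underlying data, which is exactly what happens with two lossless PNG encoders.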
Now let's make a new version of the file that stores the image:
# Updated 2022-08-12T21:15:46.544Z
print(100)
Now there's a new comment. The metadata about the image has changed, and so has the MD5 hash. But no pixel data has changed, so it is a lossless change. PNG images often store the date/time in the file like this, so even two images with bit-for-bit identical pixel data will have different hashes if saved 1 second apart.
Now let's make our single pixel image like a JPEG:
print(99)
According to JPEG, that'll look pretty similar, and it's 1-byte shorter, so it's fine. How do you think the MD5 hash will compare?
Now here's a different JPEG library; it thinks this is pretty close too:
print(98)
Again, the image will appear indistinguishable from the original, but your MD5 hash will be different.
TL;DR
If you want an identical, bitwise perfect copy of your file, use the copy command.
Further food for thought... when you read an image with OpenCV, you actually get a Numpy array of pixels, don't you? Where is the comment that was in your image? The copyright? The GPS location? The camera make and model and shutter speed? The ICC colour profile? They are not in your Numpy array of pixels, are they? They are lost. So, even if the PNG/JPEG encoder made exactly the same decisions as the original encoder, your MD5 hash would still be different because all that other data is missing.
Related
I need a programmatic way to embed a clipping-path (saved as SVG file) to another image such as JPG/TIF/PSD. Using a tool such as Photoshop, this can be done easily and the path will be inserted in the image 8BIM profile, but it seems there is no way to do it programmatically. ImageMagick allows you to get a vector image for example by using the following command:
identify -format "%[8BIM:1999,2998:#1]" test.jpg > test.svg
But it seems not possible to do the reverse operation and add a vector image. Can anyone suggest any libraries which allow this operation?
It's a bit more code than I feel like writing for the moment, but it should be possible to put an 8BIM into a JPEG using the following information.
The anatomy of a JPEG is described here and here.
You can use PIL or OpenCV to encode a JPEG into a memory buffer and then locate and modify/add segments (such as an 8BIM) using code like this. Or you could just read() in an existing JPEG that you want to modify. To insert a segment, just write the first few segments to disk, then write your new one followed by the remaining segments from the existing file that you read at the start.
You can construct an 8BIM segment to insert using this answer.
You can use exiftool -v -v -v to see where an 8BIM appears in a JPEG created by Photoshop and then put yours in a similar place. You can also, obviously, equally use exiftool to see where/how your own attempt has landed.
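As a rough illustration of the splicing step, here is a minimal sketch that walks the initial APPn segments of a JPEG and inserts a new APP13 segment (the 0xFFED marker Photoshop uses for 8BIM data). The payload here is a dummy placeholder, not a valid 8BIM record, and the function name is made up; building the real 8BIM body is the part covered by the linked answer.

```python
import struct

def insert_app13(jpeg_bytes, payload):
    """Splice an APP13 (0xFFED) segment into a JPEG, right after the
    initial APPn segments. Sketch only: assumes a well-formed JPEG."""
    assert jpeg_bytes[:2] == b"\xff\xd8"           # SOI marker
    pos = 2
    # walk past any existing APPn segments (markers 0xFFE0-0xFFEF)
    while jpeg_bytes[pos] == 0xFF and 0xE0 <= jpeg_bytes[pos + 1] <= 0xEF:
        seg_len = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])[0]
        pos += 2 + seg_len                         # marker + length (incl. itself)
    # new segment = marker + 2-byte big-endian length (includes itself) + data
    segment = b"\xff\xed" + struct.pack(">H", len(payload) + 2) + payload
    return jpeg_bytes[:pos] + segment + jpeg_bytes[pos:]

# tiny fake JPEG (SOI + APP0 stub + EOI), just to exercise the splice
fake = b"\xff\xd8" + b"\xff\xe0" + struct.pack(">H", 4) + b"JF" + b"\xff\xd9"
out = insert_app13(fake, b"Photoshop 3.0\x008BIM...")
```

With a real file you would read() the whole JPEG into `jpeg_bytes`, splice, and write the result back out, then check the placement with exiftool -v as described above.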
I saved the image to the clipboard, and when I read the image information from the clipboard and saved it locally, the image quality changed. How can I save it to maintain the original high quality?
from PIL import ImageGrab
im = ImageGrab.grabclipboard()
im.save('somefile.png','PNG')
I tried adding the parameter quality=95 in im.save(), but it didn't work. The original image size is 131K, and the saved image is 112K.
The size of the file is not directly related to the quality of the image. It also depends on how efficiently the encoder does its job. As it is PNG, the process is lossless, so you don't need to worry - the quality is retained.
Note that the quality parameter has a different meaning when saving JPEG files versus PNG files:
With JPEG files, if you specify a lower quality you are effectively allowing the encoder to discard more information and give up image quality in return for a smaller file size.
With PNG, your encoding and decoding are lossless. The quality is a hint to the encoder about how much time to spend compressing the file (always losslessly) and about the types of filtering/encoding that may suit best. It is more akin to gzip's --best or --fast options.
Further information about PNG format is here on Wikipedia.
Without analysing the content of the two images it is impossible to say why the sizes differ - there could be many reasons:
One encoder may have noticed that the image contains fewer than 256 colours and so has decided to use a palette whereas the other may not have done. That could make the image sizes differ by a factor of 3, yet the quality would be identical.
One encoder may use a larger buffer and spend longer looking for repeating patterns in the image. For a simplistic example, imagine the image was 32,000 pixels wide and each line was the same as the one above. If one encoder uses an 8kB buffer, it can never spot that the image just repeats over and over down the page so it has to encode every single line in full, whereas an encoder with a 64kB buffer might just be able to use 1 byte per line and use the PNG filtering to say "same as line above".
One encoder might decide, on grounds of simplicity of code or for lack of code space, to always encode the data in a 16-bit version even if it could use just 8 bits.
One encoder might decide it is always going to store an alpha layer even if it is opaque because that may make the code/data cleaner and simpler.
One encoder may always elect to do no filtering, whilst the other has the code required to do sub, up, average or Paeth filtering.
One encoder may not have enough memory to hold the entire image, so it may have to use a simplistic approach to be assured that it can handle whatever turns up later in the image stream.
I just made these examples up - don't take them as gospel; I am just trying to illustrate some possibilities.
To reproduce an exact copy of file from a clipboard, the only way is if the clipboard contains a byte-for-byte copy of the original. This does not happen when the content comes from the "Copy" function in a program.
In theory a program could be created to do that by setting a blob-type object with a copy of the original file, but that would be highly inefficient and defeat the purpose of the clipboard.
Some points:
- When you copy into the clipboard using the file manager, the clipboard will have a reference to the original file (not the entire file, which can potentially be much larger than RAM)
- Most programs will set the clipboard contents to some "useful version" of the displayed or selected data. This is very much subject to interpretation by the creator of the program.
- Parsing the clipboard content when reading an image is again subject to the whims of the library used to process the data and pack it back into an image format.
Generally if you want to copy a file exactly you will be better off just copying the original file.
Having said that: Evaluate the purpose of the copy-paste process and decide whether the data you get from the clipboard is "good enough" for the intended purpose. This obviously depends on what you want to use it for.
I am doing image processing in a scientific context. Whenever I need to save an image to the hard drive, I want to be able to reopen it at a later time and get exactly the data that I had before saving it. I exclusively use the PNG format, having always been under the impression that it is a lossless format. Is this always correct, provided I am not using the wrong bit-depth? Should encoder and decoder play no role at all? Specifically, the images I save
are present as 2D numpy arrays
have integer values from 0 to 255
are encoded with the OpenCV imwrite() function, e.g. cv2.imwrite("image.png", array)
PNG is a lossless format by design:
Since PNG's compression is fully lossless--and since it supports up to 48-bit truecolor or 16-bit grayscale--saving, restoring and re-saving an image will not degrade its quality, unlike standard JPEG (even at its highest quality settings).
The encoder and decoder should not matter with regard to reading the images correctly (assuming, of course, they're not buggy).
And unlike TIFF, the PNG specification leaves no room for implementors to pick and choose what features they'll support; the result is that a PNG image saved in one app is readable in any other PNG-supporting application.
While PNG is lossless, this does not mean it is uncompressed by default.
You can specify the compression level using the IMWRITE_PNG_COMPRESSION flag. It varies from 0 (no compression) to 9 (maximum compression). So if you want an uncompressed png:
cv2.imwrite(filename, data, [cv2.IMWRITE_PNG_COMPRESSION, 0])
The more you compress, the longer it takes to save.
Link to docs
How would you scale/optimize/minimally output a PNG image so that it just falls below a certain maximum file size? (The input sources are various - PDF, JPEG, GIF, TIFF...)
I've looked in many places but can't find an answer to this question.
In ImageMagick a JPEG output can do this with extent (see e.g. ImageMagick: scale JPEG image with a maximum file-size), but there doesn't seem to be an equivalent for other file formats e.g. PNG.
I could use Wand or PIL in a loop (preference for python) until the filesize is below a certain value, but for 1000s of images this will have a large I/O overhead unless there's a way to predict/estimate the filesize without writing it out first. Perhaps this is the only option.
I could also wrap the various (macOS) command-line tools in python.
Additionally, I only want to do any compression at all where it's absolutely necessary (the source is mainly text), which leaves a choice of compression algorithms.
Thanks for all help.
PS Other relevant questions:
Scale image according a maximum file size
Compress a PNG image with ImageMagick
python set maximum file size when converting (pdf) to jpeg using e.g. Wand
Edit: https://stackoverflow.com/a/40588202/1021819 is quite close too - though the exact code there already (inevitably?) makes some choices about how to go about reducing the file size (resize in that case). Perhaps there is no generalized way to do this without a multi-dimensional search.
Also, since the input files are PDFs, can this even be done with PIL? The first choice is about rasterization, for which I have been using Wand.
https://stackoverflow.com/a/34618887/1021819 is also useful, in that it uses Wand, so putting that operation within the binary-chop loop seems to be a way forward.
With PNG there is no tradeoff of compression method and visual appearance because PNG is lossless. Just go for the smallest possible file, using
my "pngcrush" application, "optipng", "zopflipng" or the like.
If you need a smaller file than any of those can produce, try reducing the number of colors to 255 or fewer, which will allow the PNG codec to produce an indexed-color PNG (color-type 3) which is around 1/3 of the filesize of an RGB PNG. You can use ImageMagick's "-colors 255" option to do this. However, I recommend the "pngquant" application for this; it does a better job than IM does in most cases.
I want to write a Python script that reads a .jpg picture, alters some of its RGB components and saves it again, without changing the picture size.
I tried to load the picture using OpenCV and PyGame; however, when I tried a simple load/save using three different functions, the resulting images are greater in size than the initial image. This is the code I used.
>>> import cv, pygame # Importing OpenCV & PyGame libraries.
>>> image_opencv = cv.LoadImage('lena.jpg')
>>> image_opencv_matrix = cv.LoadImageM('lena.jpg')
>>> image_pygame = pygame.image.load('lena.jpg')
>>> cv.SaveImage('lena_opencv.jpg', image_opencv)
>>> cv.SaveImage('lena_opencv_matrix.jpg', image_opencv_matrix)
>>> pygame.image.save(image_pygame, 'lena_pygame.jpg')
The original size was 48.3K, and the resulting are 75.5K, 75.5K, 49.9K.
So, am I missing something that makes the original picture size change, even though all I did was a load/save?
And is there a better library to use than OpenCV or PyGame?
JPEG is a lossy image format. When you open and save one, you’re encoding the entire image again. You can adjust the quality settings to approximate the original file size, but you’re going to lose some image quality regardless. There’s no general way to know what the original quality setting was, but if the file size is important, you could guess until you get it close.
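The guessing can be made systematic: JPEG file size grows roughly monotonically with the quality setting, so a binary search finds the highest quality that fits a size budget in a handful of encodes. A sketch over a generic encode callable; the stand-in encoder below is fake, and in practice you would pass something like a PIL or OpenCV in-memory JPEG encode at quality q:

```python
def max_quality_under(encode, max_bytes, lo=1, hi=95):
    """Binary-search the highest quality whose encoded size fits max_bytes.
    `encode(q)` is any callable returning the encoded bytes at quality q.
    Assumes size grows (roughly) monotonically with quality; returns None
    if even the lowest quality is too big."""
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        data = encode(mid)
        if len(data) <= max_bytes:
            best, lo = data, mid + 1   # fits: try higher quality
        else:
            hi = mid - 1               # too big: try lower quality
    return best

# demo with a stand-in encoder whose output size grows with quality
fake_encode = lambda q: b"x" * (100 + 40 * q)
out = max_quality_under(fake_encode, max_bytes=2000)
assert len(out) <= 2000
```

Encoding to an in-memory buffer during the search (and writing to disk only once, at the chosen quality) also avoids the I/O overhead of writing candidate files.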
The size of a JPEG output depends on 3 things:
The dimensions of the original image. In your case these are the same for all 3 examples.
The color complexity within the image. An image with a lot of detail will be bigger than one that is totally blank.
The quality setting used in the encoder. In your case you used the defaults, which appear to be higher for OpenCV vs. PyGame. A better quality setting will generate a file that's closer to the original (less lossy) but larger.
Because of the lossy nature of JPEG some of this is slightly unpredictable. You can save an image with a particular quality setting, open that new image and save it again at the exact same quality setting, and it will probably be slightly different in size because of the changes introduced when you saved it the first time.