Need help creating JPEG generational degradation code - Python

I am currently writing generation-loss code for .jpeg images.
Theory: .jpg is a lossy compression format (for the most part), i.e. every time an image is converted to .jpg, some of the original image's data is lost in the process. This results in smaller files, but because of the lost data the image is of lower quality than the original. In most use cases the degradation is negligible. But if the process is repeated many times, the pixel data gets compressed (lost) so many times that we end up with just random noise.
I have tried doing it with PIL and cv2, but had no success.
What I tried: opening the image (let's say an image in .png format) and converting it to .jpg, then converting that .jpg back to .png so that the process can be repeated several times.
My reasoning is that, since we are converting the original image to JPEG, some data should be lost.
I am displaying the image using cv2.imshow() because the window stays active until it is destroyed explicitly or a cv2.destroyWindow()/cv2.destroyAllWindows() call is encountered.
I expected the image to show up and its quality to decrease gradually as the program runs, but for some reason the image stays the same. So I am hoping someone can help me create the code from scratch (as my current efforts have been in vain).
P.S.: The reason I didn't post any code is that it's more of a bodge than anything concrete and does nothing towards achieving the objective, so uploading it would only waste others' time analysing it.

The flaw in your theory is here:
every time the image is converted to .jpg some contents/data of the original image is lost in the process.
If you have already converted to JPEG and recompress with the same settings, you might not lose any further data.
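A minimal sketch of a generation-loss loop with Pillow, done entirely in memory, may make this concrete. The function names `reencode_jpeg` and `generation_loss` are made up for illustration. Shifting the image by one pixel before each re-encode is an assumption on my part (not something stated above): it is one way to stop the 8x8 block grid from lining up with the previous generation, so that repeated saves at the same settings keep changing the pixels.

```python
import io

from PIL import Image, ImageChops


def reencode_jpeg(image, quality):
    """Encode `image` as JPEG into a memory buffer and decode it again."""
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    decoded = Image.open(buffer)
    decoded.load()  # decode now, while the buffer is still alive
    return decoded


def generation_loss(image, generations=30, quality=60, shift=1):
    """Re-encode `image` repeatedly, rolling it by `shift` pixels each time.

    With shift=0 the loop re-saves with identical settings, which tends to
    settle quickly and then stop changing, as the answer above describes.
    """
    current = image.convert("RGB")
    width, height = current.size
    for _ in range(generations):
        if shift:
            # Roll the image (with wrap-around) so block boundaries move.
            current = ImageChops.offset(current, shift % width, shift % height)
        current = reencode_jpeg(current, quality)
    # Undo the accumulated roll so the result lines up with the input.
    total = generations * shift
    return ImageChops.offset(current, (-total) % width, (-total) % height)


# Hypothetical usage; "input.png" is a placeholder file name:
# degraded = generation_loss(Image.open("input.png"), generations=100)
# degraded.save("degraded.png")
```

Lowering `quality` or raising `generations` should make the effect more visible; the exact amount of degradation depends on the image and on the encoder.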

Related

Compressing PIL image without saving the file

I am having trouble compressing an image in Python without saving it to disk. The image has a save function as described here, which optimizes the image when saving it. Is it possible to apply the same procedure without saving the image? I want to use it like any other Python function.
image = image.quantize()  [this reduces the quality a lot]
Thanks in advance :)
In PIL or OpenCV the image is just a large matrix holding the values of its pixels. If you want to do something with the image (e.g. display it), the function needs to know all the pixel values, and thus needs the decompressed image.
However, there is a method to keep the image compressed in memory until you really need to do something with the image. Have a look at this answer: How can i load a image in Python, but keep it compressed?
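As a rough illustration of keeping the compressed form in memory, the sketch below saves into an `io.BytesIO` buffer instead of a file and decodes only when pixels are needed. The helper names `compress_in_memory` and `decode_bytes` are hypothetical, and JPEG at quality 70 is just an example setting.

```python
import io

from PIL import Image


def compress_in_memory(image, quality=70):
    """Return the JPEG-encoded bytes of `image` without touching the disk."""
    buffer = io.BytesIO()
    # JPEG has no alpha channel or palette, so normalise to RGB first.
    image.convert("RGB").save(buffer, format="JPEG", quality=quality, optimize=True)
    return buffer.getvalue()


def decode_bytes(data):
    """Decode compressed bytes back into a full in-memory image."""
    image = Image.open(io.BytesIO(data))
    image.load()
    return image


# The compressed bytes can be kept around cheaply and decoded on demand.
original = Image.new("RGB", (32, 32), (200, 30, 30))
compressed = compress_in_memory(original)
restored = decode_bytes(compressed)
```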

How to improve the quality of function ‘image.save()’?

I saved the image to the clipboard, and when I read the image information from the clipboard and saved it locally, the image quality changed. How can I save it to maintain the original high quality?
from PIL import ImageGrab
im = ImageGrab.grabclipboard()
im.save('somefile.png','PNG')
I tried adding the parameter 'quality=95' in im.save(), but it didn't work. The original image is 131K, and the saved image is 112K.
The size of the file is not directly related to the quality of the image. It also depends on how efficiently the encoder does its job. As it is PNG, the process is lossless, so you don't need to worry - the quality is retained.
Note that the quality parameter has a different meaning when saving JPEG files versus PNG files:
With JPEG files, if you specify a lower quality you are effectively allowing the encoder to discard more information and give up image quality in return for a smaller file size.
With PNG, your encoding and decoding are lossless. The quality is a hint to the encoder as to how much time to spend compressing the file (always losslessly) and about the types of filtering/encoding that may suit best. It is more akin to the parameter to gzip like --best or --fast.
Further information about PNG format is here on Wikipedia.
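The gzip comparison can be tried directly with Python's standard `zlib` module, which implements the same DEFLATE compression that PNG uses. This is only a sketch on made-up data (the `deflate_sizes` helper and the repeated "scanline" bytes are assumptions for illustration), not a PNG encoder: the point is that the level changes the output size while the decompressed bytes stay identical.

```python
import zlib


def deflate_sizes(raw, levels=(1, 6, 9)):
    """Compress `raw` at several zlib levels and return {level: size}."""
    sizes = {}
    for level in levels:
        packed = zlib.compress(raw, level)
        # Lossless: whatever the level, decompression restores the input.
        if zlib.decompress(packed) != raw:
            raise ValueError("round trip failed")
        sizes[level] = len(packed)
    return sizes


# Stand-in for image rows: one 256-byte "scanline" repeated 64 times.
fake_scanlines = bytes(range(256)) * 64
sizes = deflate_sizes(fake_scanlines)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (input was {len(fake_scanlines)})")
```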
Without analysing the content of the two images it is impossible to say why the sizes differ - there could be many reasons:
One encoder may have noticed that the image contains fewer than 256 colours and so has decided to use a palette, whereas the other may not have done. That could make the image sizes differ by a factor of 3, yet the quality would be identical.
One encoder may use a larger buffer and spend longer looking for repeating patterns in the image. For a simplistic example, imagine the image was 32,000 pixels wide and each line was the same as the one above. If one encoder uses an 8kB buffer, it can never spot that the image just repeats over and over down the page so it has to encode every single line in full, whereas an encoder with a 64kB buffer might just be able to use 1 byte per line and use the PNG filtering to say "same as line above".
One encoder might decide, on grounds of simplicity of code or for lack of code space, to always encode the data in a 16-bit version even if it could use just 8 bits.
One encoder might decide it is always going to store an alpha layer even if it is opaque, because that may make the code/data cleaner and simpler.
One encoder may always elect to do no filtering, whilst the other has the code required to do sub, up, average or Paeth filtering.
One encoder may not have enough memory to hold the entire image, so it may have to use a simplistic approach to be assured that it can handle whatever turns up later in the image stream.
I just made these examples up - don't take them as gospel - I am just trying to illustrate some possibilities.
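The palette case from the list above can be checked with a small Pillow sketch. The four-colour test card and the helper names `png_bytes` and `decode_rgb` are made up for illustration. The same pixels are written once as a palette PNG and once as a truecolour PNG; the two files generally differ in size, yet they decode to identical pixels.

```python
import io

from PIL import Image


def png_bytes(image):
    """Encode `image` as PNG in memory and return the file bytes."""
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")
    return buffer.getvalue()


def decode_rgb(data):
    """Decode PNG bytes and normalise the result to RGB for comparison."""
    return Image.open(io.BytesIO(data)).convert("RGB")


# Hypothetical test card: four vertical stripes, 16 pixels wide each.
width, height = 64, 64
indices = bytes((x // 16) % 4 for y in range(height) for x in range(width))
paletted = Image.frombytes("P", (width, height), indices)
paletted.putpalette([0, 0, 0, 255, 0, 0, 0, 255, 0, 0, 0, 255])
truecolour = paletted.convert("RGB")

palette_png = png_bytes(paletted)
rgb_png = png_bytes(truecolour)
same_pixels = decode_rgb(palette_png).tobytes() == decode_rgb(rgb_png).tobytes()
print(len(palette_png), len(rgb_png), same_pixels)
```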
To reproduce an exact copy of a file from the clipboard, the clipboard would have to contain a byte-for-byte copy of the original. This does not happen when the content comes from the "Copy" function in a program.
In theory a program could be created to do that by setting a blob-type object with a copy of the original file, but that would be highly inefficient and defeat the purpose of the clipboard.
Some points:
- When you copy into the clipboard using the file manager, the clipboard will hold a reference to the original file (not the entire file, which can potentially be much larger than RAM)
- Most programs will set the clipboard contents to some "useful version" of the displayed or selected data. This is very much subject to interpretation by the creator of the program.
- Parsing the clipboard content when reading an image is again subject to the whims of the library used to process the data and pack it back into an image format.
Generally if you want to copy a file exactly you will be better off just copying the original file.
Having said that: Evaluate the purpose of the copy-paste process and decide whether the data you get from the clipboard is "good enough" for the intended purpose. This obviously depends on what you want to use it for.

Reading a .JPG Image and Saving it without file size change

I want to write Python code that reads a .jpg picture, alters some of its RGB components and saves it again, without changing the file size.
I tried to load the picture using OpenCV and PyGame. However, when I tried a simple load/save using three different functions, the resulting images were larger than the initial image. This is the code I used.
>>> import cv, pygame # Importing OpenCV & PyGame libraries.
>>> image_opencv = cv.LoadImage('lena.jpg')
>>> image_opencv_matrix = cv.LoadImageM('lena.jpg')
>>> image_pygame = pygame.image.load('lena.jpg')
>>> cv.SaveImage('lena_opencv.jpg', image_opencv)
>>> cv.SaveImage('lena_opencv_matrix.jpg', image_opencv_matrix)
>>> pygame.image.save(image_pygame, 'lena_pygame.jpg')
The original size was 48.3K, and the resulting sizes are 75.5K, 75.5K and 49.9K.
So, am I missing something that makes the file size change, even though I only did a load/save?
And is there a better library to use than OpenCV or PyGame?
JPEG is a lossy image format. When you open and save one, you’re encoding the entire image again. You can adjust the quality settings to approximate the original file size, but you’re going to lose some image quality regardless. There’s no general way to know what the original quality setting was, but if the file size is important, you could guess until you get it close.
The size of a JPEG output depends on 3 things:
The dimensions of the original image. In your case these are the same for all 3 examples.
The color complexity within the image. An image with a lot of detail will be bigger than one that is totally blank.
The quality setting used in the encoder. In your case you used the defaults, which appear to be higher for OpenCV vs. PyGame. A better quality setting will generate a file that's closer to the original (less lossy) but larger.
Because of the lossy nature of JPEG some of this is slightly unpredictable. You can save an image with a particular quality setting, open that new image and save it again at the exact same quality setting, and it will probably be slightly different in size because of the changes introduced when you saved it the first time.
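The "guess until you get it close" idea can be sketched with Pillow as a simple scan over quality settings. The helper names `jpeg_bytes` and `closest_quality` are hypothetical, and the scan range 1 to 95 is an assumption. This only approximates the original file size; it does not recover the original encoder settings or avoid the re-encoding loss.

```python
import io

from PIL import Image


def jpeg_bytes(image, quality):
    """Encode `image` as JPEG in memory at the given quality."""
    buffer = io.BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    return buffer.getvalue()


def closest_quality(image, target_size, qualities=range(1, 96)):
    """Return (quality, size) whose encoded size is nearest `target_size`.

    A plain linear scan: one encode per candidate quality, which is fine
    for small images but slow for large ones.
    """
    best = None
    for quality in qualities:
        size = len(jpeg_bytes(image, quality))
        if best is None or abs(size - target_size) < abs(best[1] - target_size):
            best = (quality, size)
    return best


# Hypothetical usage; "lena.jpg" is the file name from the question:
# import os
# image = Image.open("lena.jpg")
# quality, size = closest_quality(image, os.path.getsize("lena.jpg"))
# image.save("lena_resaved.jpg", quality=quality)
```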

Can you reduce memory consumption by ReportLab when embedding very large images, or is there a Python PDF toolkit that can?

Right now reportlab is making PDFs most of the time. However when one file gets several large images (125 files with a total on disk size of 7MB), we end up running out of memory and crashing trying to build a PDF that should ultimately be smaller than 39MB. The problem stems from:
elif mode not in ('L','RGB','CMYK'):
im = im.convert('RGB')
self.mode = 'RGB'
Where nice b&w (bitonal) images are converted to RGB, and when you have images with sizes around 2595x3000, they consume a lot of memory. (Not sure why they consume 2GB, but that point is moot.) When we add them to reportlab our entire Python memory footprint is about 50MB, but when we call
doc.build(elements, canvasmaker=canvasmaker)
memory usage skyrockets as we go from bitonal PNGs to RGB and then render them onto the page.
While I try to see if I can figure out how to inject bitonal images into reportlab PDFs, I thought I would see if anyone else had an idea of how to fix this problem either in reportlab or with another tool.
We have a working PDF maker using PODOFO in C++, one of my possible solutions is to write a script/outline for that tool that will simply generate the PDF in a subprocess and then return that via a file or stdout.
Short of redoing PIL, you are out of luck. The images are converted internally in PIL to 24-bit color TIFs. This is not something you can easily change.
We switched to Podofo and generate the PDF outside of python.
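A back-of-the-envelope calculation suggests where a figure in the gigabyte range could come from. The sketch assumes (this is not confirmed in the question) that all 125 images are 2595x3000 and that every decoded RGB copy is held in memory at the same time; `raw_image_bytes` is a hypothetical helper that counts uncompressed pixel storage only.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes, with each row padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height


width, height, image_count = 2595, 3000, 125

bitonal = raw_image_bytes(width, height, 1)   # packed 1 bit per pixel
rgb = raw_image_bytes(width, height, 24)      # 3 bytes per pixel
total_rgb = rgb * image_count

print(f"bitonal, one image: {bitonal:,} bytes")    # 975,000 bytes
print(f"RGB, one image:     {rgb:,} bytes")        # 23,355,000 bytes
print(f"RGB, all images:    {total_rgb:,} bytes")  # 2,919,375,000 bytes
```

Under those assumptions the RGB copies alone come to roughly 2.9 GB, about 24 times the packed bitonal size, which is in the same range as the 2GB mentioned in the question.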

How to scale an image without occasionally inverting it (with the Python Imaging Library)

When resizing images along the lines shown in this question, the resulting image is occasionally inverted. About 1% of the images I resize are inverted; the rest are fine. So far I have been unable to find out what is different about these images.
See resized example and original image for examples.
Any suggestions on how to track down that problem?
I was finally able to find someone experienced in JPEG and with some additional knowledge was able to find a solution.
JPEG is a very underspecified format.
The second image is a valid JPEG but it is in CMYK color space, not in RGB color space.
Design minded tools (read: things from Apple) can process CMYK JPEGs, other stuff (Firefox, IE) can't.
CMYK JPEG is very underspecified, and the way Adobe Photoshop writes it to disk is borderline buggy.
Best of all, there is a patch to fix the issue.
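To track down which images are affected, the colour mode can be checked before resizing. The sketch below is not the patch mentioned above; it is a hedged illustration with current Pillow, where `resize_as_rgb` is a made-up helper that converts any non-RGB image (including CMYK JPEGs) to RGB first. Whether the conversion gets the colours right for a particular Adobe-written file depends on the Pillow version and on the file itself.

```python
import io

from PIL import Image


def resize_as_rgb(image, size):
    """Convert `image` to RGB if necessary, then resize it."""
    if image.mode != "RGB":
        # CMYK JPEGs report mode "CMYK"; logging image.mode here is a
        # cheap way to find the ~1% of files that behave differently.
        image = image.convert("RGB")
    return image.resize(size)


# Build a small CMYK JPEG in memory to stand in for a problem file.
buffer = io.BytesIO()
Image.new("CMYK", (40, 30), (0, 255, 255, 0)).save(buffer, format="JPEG")
buffer.seek(0)

problem_image = Image.open(buffer)
print(problem_image.mode)
thumbnail = resize_as_rgb(problem_image, (20, 15))
```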
Your original image won't display for me; Firefox says
The image “http://images.hudora.de/o/NIRV2MRR3XJGR52JATL6BOVMQMFSV54I01.jpeg”
cannot be displayed, because it contains errors.
This suggests that the problem arises when you attempt to resize a corrupted JPEG, and indeed your resized example shows what looks like JPEG corruption to my eye (Ever cracked open a JPEG image and twiddled a few bits to see what it does to the output? I have, and a few of my abominable creations looked like that). There are a few JPEG repair tools out there, but I've never seriously tried any of them and don't know if they might be able to help you out.
