PIL "IOError: image file truncated" with big images - python

I think this problem is not Zope-related. Nonetheless I'll explain what I'm trying to do:
I'm using a PUT_factory in Zope to upload images to the ZODB per FTP. The uploaded image is saved as a Zope Image inside a newly created container object. This works fine, but I want to resize the image if it exceeds a certain size (width and height). So I'm using the thumbnail function of PIL to resize them i.e. to 200x200. This works fine as long as the uploaded images are relatively small. I didn't check out the exact limit, but 976x1296px is still ok.
With bigger pictures I get:
Module PIL.Image, line 1559, in thumbnail
Module PIL.ImageFile, line 201, in load
IOError: image file is truncated (nn bytes not processed).
I tested a lot of jpegs from my camera. I don't think they are all truncated.
Here is my code:
if img and img.meta_type == 'Image':
pilImg = PIL.Image.open( StringIO(str(img.data)) )
elif imgData:
pilImg = PIL.Image.open( StringIO(imgData) )
pilImg.thumbnail((width, height), PIL.Image.ANTIALIAS)
As I'm using a PUT_factory, I don't have a file object, I'm using either the raw data from the factory or a previously created (Zope) Image object.
I've heard that PIL handles image data differently when a certain size is exceeded, but I don't know how to adjust my code. Or is it related to PIL's lazy loading?

I'm a little late to reply here, but I ran into a similar problem and I wanted to share my solution. First, here's a pretty typical stack trace for this problem:
Traceback (most recent call last):
...
File ..., line 2064, in ...
im.thumbnail(DEFAULT_THUMBNAIL_SIZE, Image.ANTIALIAS)
File "/Library/Python/2.7/site-packages/PIL/Image.py", line 1572, in thumbnail
self.load()
File "/Library/Python/2.7/site-packages/PIL/ImageFile.py", line 220, in load
raise IOError("image file is truncated (%d bytes not processed)" % len(b))
IOError: image file is truncated (57 bytes not processed)
If we look around line 220 (in your case line 201—perhaps you are running a slightly different version), we see that PIL is reading in blocks of the file and that it expects that the blocks are going to be of a certain size. It turns out that you can ask PIL to be tolerant of files that are truncated (missing some file from the block) by changing a setting.
Somewhere before your code block, simply add the following:
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
...and you should be good.
EDIT: It looks like this helps for the version of PIL bundled with Pillow ("pip install pillow"), but may not work for default installations of PIL

Here is what I did:
Edit LOAD_TRUNCATED_IMAGES = False line from /usr/lib/python3/dist-packages/PIL/ImageFile.py:40 to LOAD_TRUNCATED_IMAGES = True.
Editing the file requires root access though.
I encountered this error while running some pytorch which was maybe using the PIL library.
Do this fix only if you encounter this error, without directly using PIL.
Else please do
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

Best thing is that you can:
if img and img.meta_type == 'Image':
pilImg = PIL.Image.open( StringIO(str(img.data)) )
elif imgData:
pilImg = PIL.Image.open( StringIO(imgData) )
try:
pilImg.load()
except IOError:
pass # You can always log it to logger
pilImg.thumbnail((width, height), PIL.Image.ANTIALIAS)
As dumb as it seems - it will work like a miracle. If your image has missing data, it will be filled with gray (check the bottom of your image).
Note: usage of camel case in Python is discouraged and is used only in class names.

This might not be a PIL issue. It might be related to your HTTP Server setting. HTTP servers put a limit on the size of the entity body that will be accepted.
For eg, in Apache FCGI, the option FcgidMaxRequestLen determines the maximum size of file that can be uploaded.
Check that for your server - it might be the one that is limiting the upload size.

I was trapped with the same problem. However, setting ImageFile.LOAD_TRUNCATED_IMAGES = True is not suitable in my case, and I have checked that all my image files were unbroken, but big.
I read the images using cv2, and then converted it to PIL.Image to get round the problem.
img = cv2.imread(imgfile, cv2.IMREAD_GRAYSCALE)
img = Image.fromarray(img)

I had to change the tds version to 7.2 to prevent this from happening. Also works with tds version 8.0, however I had some other issues with 8.0.

When image was partly broken, OSError will be caused.
I use below code for check and save the wrong image list.
try:
img = Image.open(file_path)
img.load()
## do the work with the loaded image
except OSError as error:
print(f"{error} ({file_path})")
with open("./error_file_list.txt", "a") as error_log:
log.write(str(file_path))

Related

CV2.imread() Python image not loading and no error displayed

In my python flask program hosted on Ubuntu using Apache2, the program I wrote usually is able to use imread to compress and process an image. The code now is not being able to access the image even though the file path are correct and it is an existing file and does not return any error message. This seems to occur with any type of image not depending on the type of file and the code sometimes returns a none type is not subscript able if that is related to this issue. Thanks for any help in advance! The code:
img = cv2.imread(file_path)
height, width, _ = img.shape
roi = img[0: height, 0: width]
_, compressedimage = cv2.imencode(".jpg", roi, [1, 90])
file_bytes = io.BytesIO(compressedimage)
code sometimes returns a none type
It looks to me like the file path you are supplying might be incorrect. I am saying this because cv2.imread does not throw an error for a non-existent file path:
img = cv2.imread("/non/existent/image.jpeg")
Rather it returns None:
img is None
(returns True). Your code would then fail on img.shape line. So you could either check that the file path is correct before supplying to cv2.imread:
assert os.path.isfile(file_path)
or you could assert img is not None after doing the imread and before going further.
EDIT:
I see that you are claiming that you have checked that the file path is correct and the file exists. But then you are yourself saying
the code sometimes returns a none type is not subscript able
This is because of img[0: height, 0: width] where you are trying to index/subscript None.
You can check that if you type None[0], you get the same error TypeError: 'NoneType' object is not subscriptable. So I still feel that the image path is not correct or it is not a valid image file (has some other extension than the ones listed here). In both cases you get a None on imread.
Since the code snippet you've provided is a part of a larger app, the problem could be somewhere else such that sometime the file path is valid and sometime it is not? I am only guessing at this point.
You can't expect cv2.imshow() to work inside flask or apache.
When apache starts it generally switches user to a user and group called www so it doesn't have the same environment (PATH, HOME DISPLAY and other variables) as when you run something from your Terminal sitting in front of your screen.
Nor does the www user even have the same Python modules installed if you added them to your own user.
It also doesn't have a tty where it could apply cv2.waitKey() to tell if you had pressed a key - to quit, for example.
It also doesn't have permission to draw on your screen - or even know who you are or which one of the potentially many attached screens/sessions you are sitting at.
Maybe consider looking at the logging module to debug/monitor your script and log the image path and its file size prior to cv2.imread() and its shape and dtype afterwards.
I had the same issue.
The solution was to "clean" the path.
I had spaces and "+" in the filepath, and by replacing both with "_" the files are now properly read by cv2.imread().

Only one image from 5 is downloaded and it knocks out an error

import requests
from PIL import Image
url_shoes_for_choice = [
"https://content.adidas.co.in/static/Product-CM7531/Unisex_OUTDOOR_SANDALS_CM7531_1.jpg",
"https://cdn.shopify.com/s/files/1/0080/1374/2161/products/product-image-897958210_640x.jpg?v=1571713841",
"https://cdn.chamaripashoes.com/media/catalog/product/cache/9/image/9df78eab33525d08d6e5fb8d27136e95/1/_/1_8_3.jpg",
"https://ae01.alicdn.com/kf/HTB1EyKjaI_vK1Rjy0Foq6xIxVXah.jpg_q50.jpg",
"https://www.converse.com/dw/image/v2/BCZC_PRD/on/demandware.static/-/Sites-cnv-master-catalog/default/dwb9eb8c43/images/a_107/167708C_A_107X1.jpg"
]
def img():
for url in url_shoes_for_choice:
image = requests.get(url, stream=True).raw
out = Image.open(image)
out.save('image/image.jpg', 'jpg')
if __name__=="__main__":
img()
Error:
OSError: cannot identify image file <_io.BytesIO object at 0x7fa185c52d58>
The problem is that one of the images is making issues with the byte data returned by the requests.get(url, stream=True).raw, I'm not sure but I guess the data of the 3rd image is invalid byte data so instead of getting the raw data we can just fetch the content and then by using BytesIO we can fix the byte data.
I fixed one more thing from your original code, I added numbering to your images so each can be saved with different name.
from io import BytesIO
def img():
for count, url in enumerate(url_shoes_for_choice):
image = requests.get(url, stream=True)
with BytesIO(image.content) as f:
with Image.open(f) as out:
# out.show() # See the images
out.save('image/image{}.jpg'.format(count))
(Though this works fine but I'm not sure what was the main issue. If anyone knows exactly what is the issue please comment and explain.)
I opened the first link in my browser and saved the image. It's actually a webp file.
$ file Unisex_OUTDOOR_SANDALS_CM7531_1.webp
Unisex_OUTDOOR_SANDALS_CM7531_1.webp: RIFF (little-endian) data, Web/P image, VP8 encoding, 500x500, Scaling: [none]x[none], YUV color, decoders should clamp
You explicitly tell the image library that it should expect a jpg. When you remove that parameter and let it figure it out on its own using out.save('image/image.jpg') the first image successfully downloads for me.
The first two images work this way if you make sure to save each under a different name:
def img():
i = 0
for url in url_shoes_for_choice:
i+=1
image = requests.get(url, stream=True).raw
out = Image.open(image)
out.save('image{}.jpg'.format(i))
the third is a valid jpeg file, as well as the fourth, but using the JFIF standard 1.01 which I hear the first time of. I'm pretty sure you'll have to figure out support for different such filetypes.
It is worth noting that if I download the images in chrome and open those with python, nothing fails. So chrome might be adding information to the file.
The documentation of PIL/pillow explains here that you need a new enough version for animated images, but that is not your problem.
Support for animated WebP files will only be enabled if the system
WebP library is v0.5.0 or later. You can check webp animation support
at runtime by calling features.check(“webp_anim”).

PIL can not deal with images generated by uvccapture

I use uvccapture to take pictures and want to process them with the help of python and the python imaging library (PIL).
The problem is that PIL can not open those images. It throws following error message.
Traceback (most recent call last):
File "process.py", line 6, in <module>
im = Image.open(infile)
File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1980, in open
raise IOError("cannot identify image file")
IOError: cannot identify image file
My python code looks like this:
import Image
infile = "snap.jpg"
im = Image.open(infile)
I tried to save the images in different formats before processing them. But this does not help. Also changing file permissions and owners does not help.
The only thing that helps is to open the images, for example with jpegoptim, and overwriting the old image with the optimized one. After this process, PIL can deal with these images.
What is the problem here? Are the files generated by uvccapture corrupt?
//EDIT: I also found out, that it is not possible to open the images, generated with uvccapture, with scipy. Running the command
im = scipy.misc.imread("snap.jpg")
produces the same error.
IOError: cannot identify image file
I only found a workaround to this problem. I processed the captured pic with jpegoptim and afterwords PIL could deal with the optimized image.

Python PIL Paste

I want to paste a bunch of images together with PIL. For some reason, when I run the line blank.paste(img,(i*128,j*128)) I get the following error: ValueError: cannot determine region size; use 4-item box
I tried messing with it and using a tuple with 4 elements like it said (ex. (128,128,128,128)) but it gives me this error: SystemError: new style getargs format but argument is not a tuple
Each image is 128x and has a naming style of "x_y.png" where x and y are from 0 to 39. My code is below.
from PIL import Image
loc = 'top right/'
blank = Image.new("RGB", (6000,6000), "white")
for x in range(40):
for y in reversed(range(40)):
file = str(x)+'_'+str(y)+'.png'
img = open(loc+file)
blank.paste(img,(x*128,y*128))
blank.save('top right.png')
How can I get this to work?
This worked for me, I'm using Odoo v9 and I have pillow 4.0.
I did it steps in my server with ubuntu:
# pip uninstall pillow
# pip install Pillow==3.4.2
# /etc/init.d/odoo restart
You're not loading the image correctly. The built-in function open just opens a new file descriptor. To load an image with PIL, use Image.open instead:
from PIL import Image
im = Image.open("bride.jpg") # open the file and "load" the image in one statement
If you have a reason to use the built-in open, then do something like this:
fin = open("bride.jpg") # open the file
img = Image.open(fin) # "load" the image from the opened file
With PIL, "loading" an image means reading the image header. PIL is lazy, so it doesn't load the actual image data until it needs to.
Also, consider using os.path.join instead of string concatenation.
For me the methods above didn't work.
After checking image.py I found that image.paste(color) needs one more argument like image.paste(color, mask=original). It worked well for me by changing it to this:
image.paste(color, box=(0, 0) + original.size)

Python Image Library (PIL) error with floating point tiff

I am writing a script that changes the resolution of a floating point 2K (2048x2048) tiff image to 1024x1024.
But I get the following error:
File "C:\Python26\lib\site-packages\PIL\Image.py", line 1916, in open
IOError: cannot identify image file
My Code:
import Image
im = Image.open( inPath )
im = im.resize( (1024, 1024) , Image.ANTIALIAS )
im.save( outPath )
Any Ideas?
Download My Image From This Link
Also I'm using pil 1.1.6. The pil install is x64 same as the python install (2.6.6)
Try one of these two:
open the file in binary mode,
give the full path to the file.
HTH!
EDIT after testing the OP's image:
It definitively seems like is the image having some problem. I'm on GNU/Linux and couldn't find a single program being able to handle it. Among the most informative about what the problem is have been GIMP:
and ImageMagik:
display: roadnew_disp27-dm_u0_v0_hr.tif: invalid TIFF directory; tags are not sorted in ascending order. `TIFFReadDirectory' # warning/tiff.c/TIFFWarnings/703.
display: roadnew_disp27-dm_u0_v0_hr.tif: unknown field with tag 18 (0x12) encountered. `TIFFReadDirectory' # warning/tiff.c/TIFFWarnings/703.
I did not try it myself, but googling for "python tiff" returned the pylibtiff library, which - being specifically designed for TIFF files, it might perhaps offer some more power in processing this particular ones...
HTH!

Categories