Python Pillow - ValueError: Decompressed Data Too Large

I use the Pillow lib to create thumbnails. I have to create a lot of them, actually more than 10,000.
The program works fine, but after processing roughly 1,500 of them, I get the following error:
Traceback (most recent call last):
File "thumb.py", line 15, in <module>
im = Image.open('/Users/Marcel/images/07032017/' + infile)
File "/Users/Marcel/product-/PIL/Image.py", line 2339, in open
im = _open_core(fp, filename, prefix)
File "/Users/Marcel/product-/PIL/Image.py", line 2329, in _open_core
im = factory(fp, filename)
File "/Users/Marcel/product-/PIL/ImageFile.py", line 97, in __init__
self._open()
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 538, in _open
s = self.png.call(cid, pos, length)
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 136, in call
return getattr(self, "chunk_" + cid.decode('ascii'))(pos, length)
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 319, in chunk_iCCP
icc_profile = _safe_zlib_decompress(s[i+2:])
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 90, in _safe_zlib_decompress
raise ValueError("Decompressed Data Too Large")
ValueError: Decompressed Data Too Large
My program is very straightforward:
import os, sys
import PIL
from PIL import Image

size = 235, 210

reviewedProductsList = open('products.txt', 'r')
reviewedProducts = reviewedProductsList.readlines()
t = map(lambda s: s.strip(), reviewedProducts)

print "Thumbs to create: '%s'" % len(reviewedProducts)

for infile in t:
    outfile = infile
    try:
        im = Image.open('/Users/Marcel/images/07032017/' + infile)
        im.thumbnail(size, Image.ANTIALIAS)
        print "thumb created"
        im.save('/Users/Marcel/product-/thumbs/' + outfile, "JPEG")
    except IOError, e:
        print "cannot create thumbnail for '%s'" % infile
        print "error: '%s'" % e
I am performing this operation locally on my MacBook Pro.

This is to protect against a potential DoS attack on servers running Pillow caused by decompression bombs. It occurs when an image's decompressed metadata turns out to be too large. See http://pillow.readthedocs.io/en/4.0.x/handbook/image-file-formats.html?highlight=decompression#png
Here's the CVE report: https://www.cvedetails.com/cve/CVE-2014-9601/
From a recent issue:
If you set ImageFile.LOAD_TRUNCATED_IMAGES to true, it will suppress
the error (but still not read the large metadata). Alternately, you can
change the values here: https://github.com/python-pillow/Pillow/blob/master/PIL/PngImagePlugin.py#L74
https://github.com/python-pillow/Pillow/issues/2445
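For reference, here is a minimal sketch of the first option, assuming a thumbnail loop like the one in the question (the file name is just a placeholder):

from PIL import Image, ImageFile

# Suppress the "Decompressed Data Too Large" error; the oversized iCCP
# metadata is still skipped rather than read.
ImageFile.LOAD_TRUNCATED_IMAGES = True

size = 235, 210
im = Image.open('/Users/Marcel/images/07032017/example.png')  # placeholder file name
im.thumbnail(size, Image.ANTIALIAS)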

The following code should help you set the value the accepted answer mentions:
from PIL import PngImagePlugin
LARGE_ENOUGH_NUMBER = 100
PngImagePlugin.MAX_TEXT_CHUNK = LARGE_ENOUGH_NUMBER * (1024**2)
It's not documented how to set this value. I hope people find this useful.

Related

Converting Video File(s) to GIF

I want to make a converter based on Python 3.8.
I'm using the imageio API, version 2.6.1.
Here is some of my code; I think I did something wrong:
from tkinter import *
from tkinter import filedialog
import imageio
import os

root = Tk()
ftypes = [('All Files', "*.*"), ('Webm', "*.webm")]
ttl = "Select Files(s)"
dir1 = 'D:/My Pictures/9gag'
root.fileName = filedialog.askopenfilenames(filetypes=ftypes, initialdir=dir1, title=ttl)
lst = list(root.fileName)

def path_leaf(path):
    return path.strip('/').strip('\\').split('/')[-1].split('\\')[-1]

print([path_leaf(path) for path in lst])
lst2 = [path_leaf(path) for path in lst]
print(lst)

def gifMaker(inputPath, targetFormat):
    outputPath = os.path.splitext(inputPath)[0] + targetFormat
    print(f'converting {inputPath} \n to {outputPath}')
    reader = imageio.get_reader(inputPath)
    fps = reader.get_meta_data()['fps']
    writer = imageio.get_writer(outputPath, fps=fps)
    for frames in reader:
        writer.append_data(frames)
        print(f'Frame {frames}')
    print('Done!')
    writer.close()

for ad in lst2:
    gifMaker(ad, '.gif')
And the error is shown like this:
Traceback (most recent call last):
File "D:/My Pictures/GIF/GIF.py", line 41, in <module>
gifMaker(ad, '.gif')
File "D:/My Pictures/GIF/GIF.py", line 28, in gifMaker
reader = imageio.get_reader(inputPath)
File "C:\Python\Anaconda3\lib\site-packages\imageio\core\functions.py", line 173, in get_reader
request = Request(uri, "r" + mode, **kwargs)
File "C:\Python\Anaconda3\lib\site-packages\imageio\core\request.py", line 126, in __init__
self._parse_uri(uri)
File "C:\Python\Anaconda3\lib\site-packages\imageio\core\request.py", line 278, in _parse_uri
raise FileNotFoundError("No such file: '%s'" % fn)
FileNotFoundError: No such file: 'D:\My Pictures\GIF\a6VOVL2_460sv.mp4'
So, what am I missing, or what did I do wrong? I don't understand why the error says the file was not found. Can someone explain to me in detail how this error occurred?
There are several possibilities:
Maybe you mistyped the path/filename.
Maybe the space in the path is causing trouble.
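A minimal way to test the first possibility, assuming a gifMaker() like the one in the question: check that the path actually points at a file before handing it to imageio, and report what is missing.

import os
import imageio

def gifMaker(inputPath, targetFormat):
    # If the path does not point at a real file, report it instead of
    # letting imageio raise FileNotFoundError.
    if not os.path.isfile(inputPath):
        print(f'Not found: {inputPath!r} - check the spelling and that the full directory is included')
        return
    outputPath = os.path.splitext(inputPath)[0] + targetFormat
    reader = imageio.get_reader(inputPath)
    writer = imageio.get_writer(outputPath, fps=reader.get_meta_data()['fps'])
    for frame in reader:
        writer.append_data(frame)
    writer.close()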

Python OpenCV Error: "TypeError: Image data cannot be converted to float"

So I am trying to create a Python program to detect similar details in two images using Python's OpenCV. I have the two images; they are in my current directory and they exist (see the code in lines 6-17). But I am getting the following error when I try running it.
import numpy as np
import matplotlib.pyplot as plt
import cv2
import os

path1 = "WIN_20171207_13_51_33_Pro.jpg"
path2 = "WIN_20171207_13_51_43_Pro.jpg"

if os.path.isfile(path1):
    img1 = cv2.imread('WIN_20171207_13_51_33_Pro.jpeg', 0)
else:
    print ("The file " + path1 + " does not exist.")

if os.path.isfile(path2):
    img2 = cv2.imread('WIN_20171207_13_51_43_Pro.jpeg', 0)
else:
    print ("The file " + path2 + " does not exist.")

orb = cv2.ORB_create()
kpl1, des1 = orb.detectAndCompute(img1, None)
kpl2, des2 = orb.detectAndCompute(img2, None)

bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)
matches = sorted(matches, key=lambda x: x.distance)

img3 = cv2.drawMatches(img1, kpl1, img2, kpl2, matches[:10], None, flags=2)
plt.imshow(img3)
plt.show()
Here is the error I keep on getting...
Traceback (most recent call last):
File "C:\Users\jweir\source\repos\BruteForceFeatureDetection\BruteForceFeatureDetection\BruteForceFeatureDetection.py", line 31, in <module>
plt.imshow (img3)
File "C:\Program Files\Python36\lib\site-packages\matplotlib\pyplot.py", line 3080, in imshow
**kwargs)
File "C:\Program Files\Python36\lib\site-packages\matplotlib\__init__.py", line 1710, in inner
return func(ax, *args, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\matplotlib\axes\_axes.py", line 5194, in imshow
im.set_data(X)
File "C:\Program Files\Python36\lib\site-packages\matplotlib\image.py", line 600, in set_data
raise TypeError("Image data cannot be converted to float")
TypeError: Image data cannot be converted to float
Can someone please explain to me why I am getting this error, what it means, and how to fix it?
You're not actually reading in an image.
Check out what happens if you try to display None in matplotlib:
plt.imshow(None)
Traceback (most recent call last):
File ".../example.py", line 16, in <module>
plt.imshow(None)
File ".../matplotlib/pyplot.py", line 3157, in imshow
**kwargs)
File ".../matplotlib/__init__.py", line 1898, in inner
return func(ax, *args, **kwargs)
File ".../matplotlib/axes/_axes.py", line 5124, in imshow
im.set_data(X)
File ".../matplotlib/image.py", line 596, in set_data
raise TypeError("Image data can not convert to float")
TypeError: Image data can not convert to float
You're reading WIN_20171207_13_51_33_Pro.jpeg but you're checking if WIN_20171207_13_51_33_Pro.jpg exists. Note the different extensions. Why do you have the filename written twice (and differently)? Just write:
if os.path.isfile(path1):
    img1 = cv2.imread(path1, 0)
else:
    print ("The file " + path1 + " does not exist.")
Note that even if you put a bogus file into cv2.imread(), the resulting image will just be None, which doesn't error in any of the subsequent function calls until matplotlib tries to draw it. If you print(img1) after reading, you'll see it's None and not reading properly.
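A minimal sketch of that check, assuming the corrected paths above: fail fast if imread() returned None instead of letting matplotlib trip over it later.

import cv2

img1 = cv2.imread(path1, 0)
if img1 is None:
    # imread() does not raise on a bad path; it silently returns None.
    raise IOError("Could not read " + path1)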
I do not know whether it is relevant to your case, but since we assign the file path as a string, like cv2.imread("filepathHere"), escape sequences such as "\b" or "\r" occurring in the file path can cause the program to raise an error like this.
When I encountered such an error before, I changed the file name from brick.png to ibrick.png (so the path no longer contained "\b") and the problem was resolved.
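A small illustration of that pitfall, using a hypothetical Windows path: in a plain string "\b" is read as an escape character, while a raw string or forward slashes keep the path intact.

import cv2

# "C:\photos\brick.png" as a plain string would contain the escape "\b";
# either of these forms avoids that.
img_a = cv2.imread(r"C:\photos\brick.png", 0)   # raw string, hypothetical path
img_b = cv2.imread("C:/photos/brick.png", 0)    # forward slashes, hypothetical path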

Amazon S3 upload fails using boto + Python

Hi, I am unable to upload a file to S3 using boto. It fails with the following error message. Can someone help me? I am new to Python and boto.
from boto.s3 import connect_to_region
from boto.s3.connection import Location
from boto.s3.key import Key
import boto
import gzip
import os

AWS_KEY = ''
AWS_SECRET_KEY = ''
BUCKET_NAME = 'mybucketname'

conn = connect_to_region(Location.USWest2, aws_access_key_id=AWS_KEY,
                         aws_secret_access_key=AWS_SECRET_KEY,
                         is_secure=False, debug=2)

bucket = conn.lookup(BUCKET_NAME)
bucket2 = conn.lookup('unzipped-data')
rs = bucket.list()
rs2 = bucket2.list()

compressed_files = []
all_files = []
files_to_download = []
downloaded_files = []
path = "~/tmp/"

# Check if the file has already been decompressed
def filecheck():
    for filename in bucket.list():
        all_files.append(filename.name)
    for n in rs2:
        compressed_files.append(n.name)
    for file_name in all_files:
        if file_name.strip('.gz') in compressed_files:
            pass
        elif '.gz' in file_name and 'indeed' in file_name:
            files_to_download.append(file_name)

# Download necessary files
def download_files():
    for name in rs:
        if name.name in files_to_download:
            file_name = name.name.split('/')
            print('Downloading: ' + name.name).strip('\n')
            file_name = name.name.split('/')
            name.get_contents_to_filename(path + file_name[-1])
            print(' - Completed')
            # Decompressing the file
            print('Decompressing: ' + name.name).strip('\n')
            inF = gzip.open(path + file_name[-1], 'rb')
            outF = open(path + file_name[-1].strip('.gz'), 'wb')
            for line in inF:
                outF.write(line)
            inF.close()
            outF.close()
            print(' - Completed')
            # Uploading file
            print('Uploading: ' + name.name).strip('\n')
            full_key_name = name.name.strip('.gz')
            k = Key(bucket2)
            k.key = full_key_name
            k.set_contents_from_filename(path + file_name[-1].strip('.gz'))
            print('Completed')
    # Clean Up
    d_list = os.listdir(path)
    for d in d_list:
        os.remove(path + d)

# Function Calls
filecheck()
download_files()
Error message:
Traceback (most recent call last):
File "C:\Users\Siddartha.Reddy\workspace\boto-test\com\salesify\sid\decompress_s3.py", line 86, in <module>
download_files()
File "C:\Users\Siddartha.Reddy\workspace\boto-test\com\salesify\sid\decompress_s3.py", line 75, in download_files
k.set_contents_from_filename(path+file_name[-1].strip('.gz'))
File "C:\Python27\lib\site-packages\boto\s3\key.py", line 1362, in set_contents_from_filename
encrypt_key=encrypt_key)
File "C:\Python27\lib\site-packages\boto\s3\key.py", line 1293, in set_contents_from_file
chunked_transfer=chunked_transfer, size=size)
File "C:\Python27\lib\site-packages\boto\s3\key.py", line 750, in send_file
chunked_transfer=chunked_transfer, size=size)
File "C:\Python27\lib\site-packages\boto\s3\key.py", line 951, in _send_file_internal
query_args=query_args
File "C:\Python27\lib\site-packages\boto\s3\connection.py", line 664, in make_request
retry_handler=retry_handler
File "C:\Python27\lib\site-packages\boto\connection.py", line 1070, in make_request
retry_handler=retry_handler)
File "C:\Python27\lib\site-packages\boto\connection.py", line 1029, in _mexe
raise ex
socket.error: [Errno 10053] An established connection was aborted by the software in your host machine
I have no problem downloading the files, but the upload fails for some weird reason.
If the problem is the size of files (> 5GB), you should use multipart upload:
http://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html
Search for multipart_upload in the docs:
http://boto.readthedocs.org/en/latest/ref/s3.html#module-boto.s3.multipart
Also, see this question for a related issue:
How can I copy files bigger than 5 GB in Amazon S3?
The process is a little non-intuitive (see the sketch after these steps). You need to:
run initiate_multipart_upload(), storing the returned object
split the file into chunks (either on disk, or read from memory using CStringIO)
feed the parts sequentially into upload_part_from_file()
run complete_upload() on the stored object
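A minimal sketch of those four steps, assuming boto 2.x, the bucket2 object from the question, and a hypothetical local file big_file.gz; the part size and key name are examples only.

import math
import os
from cStringIO import StringIO

CHUNK_SIZE = 50 * 1024 * 1024          # parts must be at least 5 MB, except the last one

source_path = 'big_file.gz'            # hypothetical local file
source_size = os.stat(source_path).st_size
chunk_count = int(math.ceil(source_size / float(CHUNK_SIZE)))

# 1. initiate the upload and keep the returned object
mp = bucket2.initiate_multipart_upload('big_file')

# 2./3. split the file into chunks and feed them sequentially to upload_part_from_file()
with open(source_path, 'rb') as f:
    for part_num in range(1, chunk_count + 1):
        mp.upload_part_from_file(StringIO(f.read(CHUNK_SIZE)), part_num=part_num)

# 4. complete the upload on the stored object
mp.complete_upload()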

Memory error while parsing files in a folder

I'm using Python 2.7.
Here is my code to parse the files in a folder:
import linecache
import glob

path = r"G:\test\folder1"
Key = '''testresult="NOK"'''
Files = glob.glob(path + '\*.xml')

for FileName in Files:
    Loop_Count = 1
    while Loop_Count != 50:
        Line_Read = linecache.getline(FileName, Loop_Count)
        if (Key in Line_Read):
            a = FileName.split('\\')
            b = len(a) - 1
            print a[b]
            break
        elif (Loop_Count == 49):
            pass
        Loop_Count = Loop_Count + 1
print "Completed"
If folder1 has many files, I get a memory error:
Traceback (most recent call last):
File "C:\Users\whoKnows\Desktop\test_Check111.py", line 10, in <module> Line_Read = linecache.getline(FileName, Loop_Count)
File "C:\Python27\lib\linecache.py", line 14, in getline
lines = getlines(filename, module_globals)
File "C:\Python27\lib\linecache.py", line 40, in getlines
return updatecache(filename, module_globals)
File "C:\Python27\lib\linecache.py", line 128, in updatecache
lines = fp.readlines()
MemoryError
I think it's because I'm opening all the files for reading and not closing them. Can anyone please tell me how to close the files while using glob?
MemoryError means you have run out of memory. You are probably loading all the files into memory at once. Try deleting lines that are no longer needed with linecache.clearcache().
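A minimal sketch of that suggestion, mirroring the loop from the question: clear linecache's cache after each file so the cached lines of previous files are freed.

import glob
import linecache

path = r"G:\test\folder1"
Key = '''testresult="NOK"'''

for FileName in glob.glob(path + '\*.xml'):
    for Loop_Count in range(1, 50):
        Line_Read = linecache.getline(FileName, Loop_Count)
        if Key in Line_Read:
            print FileName.split('\\')[-1]
            break
    linecache.clearcache()   # drop the cached file before moving on to the next one
print "Completed"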

file handling using pickle

try.py:
import pickle

f = open('abc.dat', 'w')
x = [320, 315, 316]
y = pickle.load(f)
f.close()

f = open('abc.dat', 'w')
for i in x:
    y.append(i)
pickle.dump(y, f)
f.close()
use.py:
import pickle
import os

os.system('try.py')

f = open('abc.dat', 'r')
print "abc.dat = "
x = pickle.load(f)
print x
print "end of abc.dat"
f.close()

y = x[:]
for z in x:
    y.remove(z)
    print "removing " + str(z)
print str(y) + " and " + str(x)

f = open('abc.dat', 'w')
pickle.dump(y, f)
f.close()
error:
Traceback (most recent call last):
File "G:\parin\new start\use.py", line 7, in <module>
x=pickle.load(f)
File "C:\Python26\lib\pickle.py", line 1370, in load
return Unpickler(file).load()
File "C:\Python26\lib\pickle.py", line 858, in load
dispatch[key](self)
File "C:\Python26\lib\pickle.py", line 880, in load_eof
raise EOFError
EOFError
The error is in try.py:
f=open('abc.dat','w')
y=pickle.load(f)
Note that the 'w' mode resets the file to size 0 (i.e. deletes its content). Pass 'r' or nothing at all to open abc.dat for reading.
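A minimal sketch of try.py with the mode fixed, assuming abc.dat already contains a pickled list (see the next answer for the case where the file does not exist yet):

import pickle

f = open('abc.dat')            # 'r' is the default, so the existing data is not wiped
y = pickle.load(f)
f.close()

x = [320, 315, 316]
for i in x:
    y.append(i)

f = open('abc.dat', 'w')
pickle.dump(y, f)
f.close()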
The example doesn't work for me: try.py fails when the file doesn't exist.
My big recommendation, though, is to look at using JSON instead of pickle, as you'll have more cross-platform flexibility and the interface is more flexible.
For example, use this to create a file of JSON lines:
import json, random

with open("data.txt", "w") as f:
    for i in range(0, 10):
        info = {"line": i,
                "random": random.random()}
        f.write(json.dumps(info) + "\n")
(Make info whatever you want, obviously.)
Then use this to read them:
import json

for line in open("data.txt"):
    data = json.loads(line)
    print("data:" + str(data))
