file handling using pickle - python

try.py:
import pickle
f=open('abc.dat','w')
x=[320,315,316]
y=pickle.load(f)
f.close()
f=open('abc.dat','w')
for i in x:
    y.append(i)
pickle.dump(y,f)
f.close()
use.py:
import pickle
import os
os.system('try.py')
f=open('abc.dat', 'r')
print "abc.dat = "
x=pickle.load(f)
print x
print "end of abc.dat"
f.close();
y=x[:]
for z in x:
    y.remove(z)
    print "removing " + str(z)
    print str(y) + " and " + str(x)
f=open('abc.dat', 'w')
pickle.dump(y, f)
f.close()
error:
Traceback (most recent call last):
File "G:\parin\new start\use.py", line 7, in <module>
x=pickle.load(f)
File "C:\Python26\lib\pickle.py", line 1370, in load
return Unpickler(file).load()
File "C:\Python26\lib\pickle.py", line 858, in load
dispatch[key](self)
File "C:\Python26\lib\pickle.py", line 880, in load_eof
raise EOFError
EOFError

The error is in try.py:
f=open('abc.dat','w')
y=pickle.load(f)
Note that the 'w' mode resets the file to size 0 (i.e. deletes its content). Pass 'r' or nothing at all to open abc.dat for reading.

The example doesn't work for me: try.py fails when the file doesn't exist.
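A minimal corrected try.py along those lines, which also covers the missing-file case, could look like this (a sketch; using 'rb'/'wb' is my assumption so the pickled data also round-trips cleanly on Windows):
import os
import pickle

x = [320, 315, 316]

# Read the existing list, or start with an empty one if abc.dat is missing.
if os.path.exists('abc.dat'):
    f = open('abc.dat', 'rb')   # read mode, so the contents are not wiped
    y = pickle.load(f)
    f.close()
else:
    y = []

for i in x:
    y.append(i)

# Only now open for writing, which truncates the file.
f = open('abc.dat', 'wb')
pickle.dump(y, f)
f.close()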
My big recommendation, though, is to look at using JSON instead of pickle, as you'll get better cross-platform portability and a more flexible interface.
For example, use this to create a file of JSON lines:
import json,random
with open("data.txt","w") as f:
    for i in range(0,10):
        info = {"line":i,
                "random":random.random()}
        f.write(json.dumps(info)+"\n")
(Make info whatever you want, obviously.)
Then use this to read them:
import json
for line in open("data.txt"):
    data = json.loads(line)
    print("data:" + str(data))

Related

EOFError: Ran out of input when unpickling non-empty file

Traceback (most recent call last):
File "C:\Users\me\folder\project.py", line 999, in <module>
load_data()
File "C:\Users\me\folder\project.py", line 124, in load_data
globals()[var_name] = pickle.Unpickler(f).load()
EOFError: Ran out of input
I get this error when trying to unpickle a file, even though the file is non-empty. I've tried opening the file and printing its contents, and it is definitely not empty, but the unpickler still raises this error.
Anyone know why this may be happening?
The code here is as follows:
files_to_load = ['var1','var2',...]
def load_data():
    for var_name in files_to_load:
        path = '{}.txt'.format(var_name)
        if os.path.exists(path):
            with open(path, 'rb') as f:
                globals()[var_name] = pickle.Unpickler(f).load()
        else:
            globals()[var_name] = {}
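One way to narrow this down is to print each file's size and catch EOFError per file, so it is obvious exactly which pickle the loader chokes on. This is only a diagnostic sketch built around the same load_data loop, not a fix:
import os
import pickle

def load_data_debug():
    for var_name in files_to_load:
        path = '{}.txt'.format(var_name)
        if not os.path.exists(path):
            globals()[var_name] = {}
            continue
        # A zero-byte file here would explain the EOFError immediately.
        print(var_name, os.path.getsize(path), 'bytes on disk')
        try:
            with open(path, 'rb') as f:
                globals()[var_name] = pickle.Unpickler(f).load()
        except EOFError:
            # The stream ended before a complete object was read, e.g. the
            # file was truncated or never fully written and flushed.
            print('incomplete pickle:', path)
            globals()[var_name] = {}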

Python Pillow - ValueError: Decompressed Data Too Large

I use the Pillow lib to create thumbnails. I have to create a lot of them, actually more than 10,000.
The program works fine, but after processing roughly 1,500 images, I get the following error:
Traceback (most recent call last):
File "thumb.py", line 15, in <module>
im = Image.open('/Users/Marcel/images/07032017/' + infile)
File "/Users/Marcel/product-/PIL/Image.py", line 2339, in open
im = _open_core(fp, filename, prefix)
File "/Users/Marcel/product-/PIL/Image.py", line 2329, in _open_core
im = factory(fp, filename)
File "/Users/Marcel/product-/PIL/ImageFile.py", line 97, in __init__
self._open()
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 538, in _open
s = self.png.call(cid, pos, length)
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 136, in call
return getattr(self, "chunk_" + cid.decode('ascii'))(pos, length)
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 319, in chunk_iCCP
icc_profile = _safe_zlib_decompress(s[i+2:])
File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 90, in _safe_zlib_decompress
raise ValueError("Decompressed Data Too Large")
ValueError: Decompressed Data Too Large
My program is very straightforward:
import os, sys
import PIL
from PIL import Image

size = 235, 210

reviewedProductsList = open('products.txt', 'r')
reviewedProducts = reviewedProductsList.readlines()
t = map(lambda s: s.strip(), reviewedProducts)

print "Thumbs to create: '%s'" % len(reviewedProducts)

for infile in t:
    outfile = infile
    try:
        im = Image.open('/Users/Marcel/images/07032017/' + infile)
        im.thumbnail(size, Image.ANTIALIAS)
        print "thumb created"
        im.save('/Users/Marcel/product-/thumbs/' + outfile, "JPEG")
    except IOError, e:
        print "cannot create thumbnail for '%s'" % infile
        print "error: '%s'" % e
I am performing this operation locally on my MacBook Pro.
This is to protect against a potential DoS attack on servers running Pillow caused by decompression bombs. It occurs when a decompressed image turns out to have unexpectedly large metadata. See http://pillow.readthedocs.io/en/4.0.x/handbook/image-file-formats.html?highlight=decompression#png
Here's the CVE report: https://www.cvedetails.com/cve/CVE-2014-9601/
From a recent issue:
If you set ImageFile.LOAD_TRUNCATED_IMAGES to true, it will suppress the error (but still not read the large metadata). Alternatively, you can change the values here: https://github.com/python-pillow/Pillow/blob/master/PIL/PngImagePlugin.py#L74
https://github.com/python-pillow/Pillow/issues/2445
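For reference, setting that flag looks like this (a small sketch of the suggestion above; per that comment it suppresses the error while the oversized metadata is still skipped, so whether that is acceptable depends on your use case):
from PIL import ImageFile

# Tolerate truncated/oversized ancillary data instead of raising.
ImageFile.LOAD_TRUNCATED_IMAGES = True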
The following code should help you set the value that the accepted answer mentions.
from PIL import PngImagePlugin
LARGE_ENOUGH_NUMBER = 100
PngImagePlugin.MAX_TEXT_CHUNK = LARGE_ENOUGH_NUMBER * (1024**2)
It's not documented how to set this value. I hope people find this useful.

Get name attribute of IO_Bufferedreader

What I want to do is take the name of the current file (which comes from a generator), keep the first part of the name, and append .csv.
The buffered stream looks like this:
<_io.BufferedReader name='data/20160107W FM0.xml'>
I am having an issue with this code:
for file_to_read in roots:
    print(file_to_read)
    base = os.path.basename(file_to_read)
    print(base)
    name_to_write = os.path.splitext(file_to_read)[0]
    outname = str(name_to_write[0]) + ".csv"
    outdir = "output"
    with open(os.path.join(outdir, outname), 'w', newline='') as csvf:
I receive the error below, which I believe means I am trying to split the stream itself rather than the name attribute of the buffered stream.
$ python race.py data/ -e .xml
<_io.BufferedReader name='data/20160107W FM0.xml'>
Traceback (most recent call last):
File "race.py", line 106, in <module>
data_attr(rootObs)
File "race.py", line 40, in data_attr
base = os.path.basename(file_to_read)
File "C:\Users\Sayth\Anaconda3\lib\ntpath.py", line 232, in basename
return split(p)[1]
File "C:\Users\Sayth\Anaconda3\lib\ntpath.py", line 204, in split
d, p = splitdrive(p)
File "C:\Users\Sayth\Anaconda3\lib\ntpath.py", line 139, in splitdrive
if len(p) >= 2:
TypeError: object of type '_io.BufferedReader' has no len()
My expected output is:
20160107W FM0.csv
For a file you are reading/writing, this works:
filepath = '../data/test.txt'
with open(filepath, 'w') as file:
    print(file)       # -> <_io.TextIOWrapper name='../data/test.txt' mode='w' encoding='UTF-8'>
    print(file.name)  # -> ../data/test.txt
But the type here is <_io.TextIOWrapper name='../data/test.txt' mode='w' encoding='UTF-8'>, so I am not entirely sure how you open your file or end up with an _io.BufferedReader.
I assume they are both derived from io.FileIO and should therefore have a .name attribute.
Thanks to Ashwini Chaudhary's comment, I can recreate your exact situation:
from io import BufferedReader
filepath = '../data/test.txt'
with BufferedReader(open(filepath, 'r')) as file:
    print(file)       # -> <_io.BufferedReader name='../data/test.txt'>
    print(file.name)  # -> ../data/test.txt
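Putting that together with the loop from the question, the fix is presumably to take the basename from file_to_read.name rather than from the stream object itself (a sketch under that assumption):
import os

outdir = "output"
for file_to_read in roots:
    # file_to_read is an _io.BufferedReader; .name holds the path string
    base = os.path.basename(file_to_read.name)       # '20160107W FM0.xml'
    outname = os.path.splitext(base)[0] + ".csv"     # '20160107W FM0.csv'
    with open(os.path.join(outdir, outname), 'w', newline='') as csvf:
        ...  # write the CSV rows here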

python - Writing to csv file out of coroutine sink ... how to avoid closed file error?

My code (simplified):
import csv

def generate_record(downstream):
    try:
        while True:
            incoming = (yield)
            record = incoming.strip()
            for worker in downstream:
                worker.send(record)
    except GeneratorExit:
        for worker in downstream:
            worker.close()
        print('generate_record shutdown')

def file_writer(filename):
    l = list()
    try:
        while True:
            record = (yield)
            l.append(record)
    except GeneratorExit:
        with open(filename, 'w', newline=''):
            writer = csv.writer(f)
            writer.writerows(l)
        print('file_writer shutdown')

if __name__ == '__main__':
    sink = file_writer('C:/Users/Some User/Downloads/data.csv')
    next(sink)
    worker = generate_record([sink])
    next(worker)
    with open('C:/Users/Some User/Downloads/Energy.txt') as f:
        for line in f:
            worker.send(line)
    worker.close()
Generates the following error:
Traceback (most recent call last):
File "<ipython-input-43-ff97472f6399>", line 1, in <module>
runfile('C:/Users/Some User/Documents/Python Scripts/isii.py', wdir='C:/Users/Some User/Documents/Python Scripts')
File "C:\Users\Some User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 699, in runfile
execfile(filename, namespace)
File "C:\Users\Some User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 88, in execfile
exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
File "C:/Users/Some User/Documents/Python Scripts/isii.py", line 75, in <module>
worker.close()
File "C:/Users/Some User/Documents/Python Scripts/isii.py", line 49, in generate_record
worker.close()
File "C:/Users/Some User/Documents/Python Scripts/isii.py", line 63, in file_writer
writer.writerows(l)
ValueError: I/O operation on closed file.
What have I tried?
I've tried writing incrementally with writerow inside file_writer's try block, but that generates the same error.
The with statement in file_writer is missing the as f part; because of that, f refers to the global variable f instead, which is already closed at the time of writing. That is what causes the ValueError.
with open(filename, 'w', newline='') as f:
                                     ^^^^
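In other words, file_writer should bind the file it just opened; a minimal corrected version based on that fix:
def file_writer(filename):
    l = list()
    try:
        while True:
            record = (yield)
            l.append(record)
    except GeneratorExit:
        # Bind the newly opened file as f so csv.writer uses it,
        # not the already-closed global f from the main block.
        with open(filename, 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerows(l)
        print('file_writer shutdown')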
When you use with open(filename) as f:, it performs the operations inside the block and then closes the file. So you don't need to call worker.close(), since you would be trying to close a file that is already closed.
See: What is the python keyword "with" used for?
This should be a comment, but I do not seem to have enough reputation for that.

Loading a file as a JSON in python?

I'm quite new to Python and I'm trying to create a JSON file, then load it and organise it, but I keep getting a weird error.
This is my code to write the file:
def save(): # save to file
    with open(class_name, 'a') as f:
        data = [name, score]
        json.dump(data, f)
This is my code to load the file:
with open(class_name, 'r') as f:
    data2 = json.load(f)
This is my code to organise the file:
with open(class_name, 'r') as f:
    data2 = json.load(f)
    Alpha = sorted(data, key=str.lower)
    print(Alpha)
And this is my error:
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
viewscores()
File "C:\Users\Danny\Desktop\Task123.py", line 60, in viewscores
data2 = json.load(f)
File "C:\Python34\lib\json\__init__.py", line 268, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "C:\Python34\lib\json\__init__.py", line 318, in loads
return _default_decoder.decode(s)
File "C:\Python34\lib\json\decoder.py", line 346, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 13 - line 1 column 153 (char 12 - 152)
You are appending data to your file, creating multiple JSON documents in that file. The json.load() call, on the other hand, can only load one JSON document from a file, and it is now complaining that there is more data after that first document.
Either adjust your code to load those documents separately (insert a newline after each entry you append and load the JSON documents line by line), or replace everything in the file with a single new JSON object each time you save.
Using newlines as separators, then loading all the entries separately would look like:
def save(): # save to file
    with open(class_name, 'a') as f:
        data = [name, score]
        json.dump(data, f)
        f.write('\n')
and loading would be:
with open(class_name, 'r') as f:
    scores = []
    for line in f:
        entry = json.loads(line)
        scores.append(entry)
after which you could sort those entries if you so wish.
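For example, since each entry is a [name, score] pair as written by save(), sorting case-insensitively by name could look like this (a sketch assuming that layout):
alpha = sorted(scores, key=lambda entry: str(entry[0]).lower())
print(alpha)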
