pickle - putting more than 1 object in a file? [duplicate] - python

I have got a method which dumps a number of pickled objects (tuples, actually) into a file.
I do not want to put them into one list, I really want to dump several times into the same file.
My problem is, how do I load the objects again?
The first and second objects are just one line long, so this works with readlines.
But all the others are longer.
Naturally, if I try
myob = cPickle.load(g1.readlines()[2])
where g1 is the file, I get an EOFError because my pickled object is longer than one line.
Is there a way to get just my pickled object?

If you pass the filehandle directly into pickle you can get the result you want.
import pickle
# write several objects to the same file (binary mode is required in Python 3)
with open("example", "wb") as f:
    pickle.dump(["hello", "world"], f)
    pickle.dump([2, 3], f)
# read them back in the same order
with open("example", "rb") as f:
    value1 = pickle.load(f)
    value2 = pickle.load(f)
pickle.dump writes at the current position of the file object, so calling it several times on the same open file stores the values one after another.
pickle.load will read only enough from the file to get the first value, leaving the filehandle open and pointed at the start of the next object in the file. The second call will then read the second object, and leave the file pointer at the end of the file. A third call will fail with an EOFError as you'd expect.
Although I used plain old pickle in my example, this technique works just the same with cPickle.
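If you don't know in advance how many objects were dumped, the same technique can be wrapped in a loop that stops at EOFError. The following is a minimal Python 3 sketch; the helper name load_all and the temporary demo file are illustrative assumptions, not part of the original answer:

```python
import os
import pickle
import tempfile


def load_all(path):
    """Yield every object that was pickled back to back into the file at path."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more pickled objects in the file.
                return


# Demo: dump three objects one after another, then read them all back.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, "wb") as f:
    for obj in (("a", 1), ("b", 2), [3, 4, 5]):
        pickle.dump(obj, f)

loaded = list(load_all(demo_path))
os.remove(demo_path)
```

Because load_all is a generator, large files can be processed one object at a time instead of being loaded whole.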

I think the best way is to pack your data into a single object before you store it, and unpack it after loading it. Here's an example using a tuple as the container (you can use a dict also):
import pickle
a = [1,2]
b = [3,4]
with open("tmp.pickle", "wb") as f:
    pickle.dump((a,b), f)
with open("tmp.pickle", "rb") as f:
    a,b = pickle.load(f)
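For the dict variant mentioned above, a rough sketch could look like this. The key names and the temporary directory are assumptions made for the example; the point is only that named keys make the unpacking independent of order:

```python
import os
import pickle
import tempfile

a = [1, 2]
b = [3, 4]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tmp.pickle")

    # Store both lists under descriptive keys in a single pickled dict.
    with open(path, "wb") as f:
        pickle.dump({"a": a, "b": b}, f)

    # Load the dict back; look values up by name instead of by position.
    with open(path, "rb") as f:
        restored = pickle.load(f)
```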

Don't try reading them back as lines of the file; just pickle.load() the number of objects you want. See my answer to the question How to save an object in Python for an example of doing that.


How to test a function that loads a pickle file without actual IO operations [duplicate]

I have written a function that loads a pickled list of dictionaries and optionally filters the result:
import pickle
def load_pickled_list(path_to_file, filter_key=None):
    with open(path_to_file, "rb") as file:
        loaded_list = pickle.load(file)
    if filter_key is not None:
        loaded_list = [entry for entry in loaded_list if entry[filter_key] == filter_key]
    return loaded_list
How do I test this with pytest by providing two different lists of dictionaries in code? Especially, how do I implement a test double of pickle? I do not want to provide a file such as test_list.pkl so that the test would have to perform real disk IO operations.
If you have a sequence of bytes in mind, you can hard-code it directly into your program using io.BytesIO. The example in the module docs may be what you need, or at least provide a good starting point:
f = io.BytesIO(b"your pickle file here ")
loaded_list = pickle.load(f)
Better than hard-coding the data, have some part of your setup or a fixture generate it:
# make some objects
data = ...
f = io.BytesIO()
pickle.dump(data, f)
f.seek(0)
# now load `f`
You could also avoid the file interface entirely by using dumps/loads to work with bytes directly instead of doing I/O with dump/load.
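One way to combine these ideas in a test is to patch the built-in open so that it hands back an in-memory buffer. This is only a sketch under stated assumptions: load_pickled is a simplified stand-in for the function under test, load_with_fake_file is a hypothetical helper, and the path is a dummy that is never touched on disk:

```python
import io
import pickle
from unittest import mock


def load_pickled(path_to_file):
    """Simplified stand-in for the function under test."""
    with open(path_to_file, "rb") as file:
        return pickle.load(file)


def load_with_fake_file(data):
    """Run load_pickled() against an in-memory buffer instead of the disk."""
    fake_file = io.BytesIO(pickle.dumps(data))
    # Replace the built-in open() for the duration of the call.
    with mock.patch("builtins.open", return_value=fake_file) as fake_open:
        result = load_pickled("does/not/exist.pkl")
    # The function should have asked for the file in binary read mode.
    fake_open.assert_called_once_with("does/not/exist.pkl", "rb")
    return result


first = load_with_fake_file([{"name": "a"}, {"name": "b"}])
second = load_with_fake_file([])
```

io.BytesIO supports the with statement, so the function's own with open(...) block works unchanged. In a pytest suite, each call to load_with_fake_file would sit inside its own test function with an assert on the result.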

Access values outside with-block

Is there a way, in the code below, to access the variable utterances_dict outside of the with-block? The code below obviously returns the error: ValueError: I/O operation on closed file.
from csv import DictReader
utterances_dict = {}
utterance_file = 'toy_utterances.csv'
with open(utterance_file, 'r') as utt_f:
    utterances_dict = DictReader(utt_f)
for line in utterances_dict:
    print(line)
I am not an expert on the DictReader implementation, but its documentation leaves open the possibility that the reader parses the file lazily, after construction. That means the underlying file may have to remain open until you are done using the reader. In that case, using utterances_dict outside of the with block is a problem, because the underlying file is closed by then.
Even if the current implementation of DictReader does in fact parse the whole csv on construction, it doesn't mean their implementation won't change in the future.
DictReader does not read the whole file up front; it is an iterator that reads rows from the open file as you loop over it.
Convert the result to a list of dictionaries while the file is still open:
from csv import DictReader
utterances = []
utterance_file = 'toy_utterances.csv'
with open(utterance_file, 'r') as utt_f:
    utterances = [dict(row) for row in DictReader(utt_f)]
for line in utterances:
    print(line)
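To see the difference between the two approaches side by side, here is a small self-contained sketch. The CSV contents and column names are made up for the demonstration, and the behaviour shown is that of the current CPython csv module:

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_utterances.csv")
    with open(path, "w", newline="") as f:
        f.write("speaker,text\nA,hello\nB,goodbye\n")

    # Lazy: the reader is created inside the block but first used after it.
    with open(path, newline="") as f:
        lazy_reader = csv.DictReader(f)
    try:
        next(lazy_reader)
        failed_after_close = False
    except ValueError:
        # "I/O operation on closed file": nothing had been read yet.
        failed_after_close = True

    # Eager: every row is copied into a list while the file is still open.
    with open(path, newline="") as f:
        rows = [dict(row) for row in csv.DictReader(f)]
```

After the second block, rows is an ordinary list that no longer depends on the file.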

Saving the output of a python program [duplicate]

The output of my python program is a list (G) which has almost 100000 elements. I want to use these elements in the later part of the program. How can I save my list (G) so that I don’t have to run the program again and again?
You can write the list to a text file, one item per line:
items = [1, 2, 3, 4, 5, 6]
with open('test.txt', 'w') as thefile:
    for item in items:
        thefile.write("%s\n" % item)
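One thing to keep in mind with the text-file approach is that everything comes back as strings, so the reading side has to convert the values itself. A minimal round-trip sketch, assuming the items are integers and using a temporary directory for the demo:

```python
import os
import tempfile

items = [1, 2, 3, 4, 5, 6]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    # Write one item per line.
    with open(path, "w") as out_file:
        for item in items:
            out_file.write("%s\n" % item)

    # Read the lines back; each one is a string such as "1".
    with open(path) as in_file:
        raw = [line.rstrip("\n") for line in in_file]

# Convert explicitly, because the file does not record the original type.
restored = [int(text) for text in raw]
```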
pickle enables you to save your python object to your disk. Without running your first program, you can just load this pickle file and use it in another program by just calling the load function.
Personally I like to use pickle.
To store the object as a pickle:
import pickle
a = {'your_list': [1,2,3,4]}
with open('filename.pickle', 'wb') as handle:
    pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)
To read it back from the pickle file:
import pickle
with open('filename.pickle', 'rb') as handle:
    a = pickle.load(handle)
print(a) # a is now {'your_list': [1,2,3,4]}
If I understand your question correctly, you want to use the data or arrays in another program.
For that you do not have to write anything to disk: you can turn your first program into a module and import it from the second program. That saves the space taken by a separate data file, and the second program can pick out the specific values it needs by name. Note that importing a module runs its top-level code, so the list is still computed on each run. See: Python modules.
And if you want to store the data for another purpose, you can use local file storage or SQLite. See: File I/O in Python.

How to not have set written on my file- python 2

So I basically just want to have a list of all the pixel colour values that overlap written in a text file so I can then access them later.
The only problem is that the text file is having (set([ or whatever written with it.
Here's my code:
import cv2
import numpy as np
import time
om=cv2.imread('spectrum1.png')
om=om.reshape(1,-1,3)
om_list=om.tolist()
om_tuple={tuple(item) for item in om_list[0]}
om_set=set(om_tuple)
im=cv2.imread('RGB.png')
im=cv2.resize(im,(100,100))
im= im.reshape(1,-1,3)
im_list=im.tolist()
im_tuple={tuple(item) for item in im_list[0]}
ColourCount= om_set & set(im_tuple)
File= open('Weedlist', 'w')
File.write(str(ColourCount))
Also, if I run this program again but with a different picture for comparison, will it append the data or overwrite it? It's kinda hard to tell when just looking at numbers.
If you replace these lines:
im=cv2.imread('RGB.png')
File= open('Weedlist', 'w')
File.write(str(ColourCount))
with:
import sys
im=cv2.imread(sys.argv[1])
open(sys.argv[1]+'Weedlist', 'w').write(str(list(ColourCount)))
you will get a new file for each input file and also you don't have to overwrite the RGB.png every time you want to try something new.
Files opened with mode 'w' will be overwritten. You can use 'a' to append.
You opened the file with the 'w' mode, write mode, which will truncate (empty) the file when you open it. Use 'a' append mode if you want data to be added to the end each time
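The difference between the two modes is easy to check with a throwaway file. In this sketch the file contents are placeholders and the file lives in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Weedlist")

    # 'w' truncates: the second write replaces the first.
    with open(path, "w") as f:
        f.write("first run\n")
    with open(path, "w") as f:
        f.write("second run\n")
    with open(path) as f:
        after_write = f.read()

    # 'a' appends: existing content is kept and new data goes at the end.
    with open(path, "a") as f:
        f.write("third run\n")
    with open(path) as f:
        after_append = f.read()
```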
You are writing the str() conversion of a set object to your file:
ColourCount= om_set & set(im_tuple)
File= open('Weedlist', 'w')
File.write(str(ColourCount))
Don't use str to convert the whole object; format your data to a string you find easy to read back again. You probably want to add a newline too if you want each new entry to be added on a new line. Perhaps you want to sort the data too, since a set lists items in an order determined by implementation details.
If comma-separated works for you, use str.join(); your set contains tuples of integer numbers, and it sounds as if you are fine with the repr() output per tuple, so we can re-use that:
with open('Weedlist', 'a') as outputfile:
    output = ', '.join([str(tup) for tup in sorted(ColourCount)])
    outputfile.write(output + '\n')
I used with there to ensure that the file object is automatically closed again after you are done writing; see Understanding Python's with statement for further information on what this means.
Note that if you plan to read this data again, the above is not going to be all that efficient to parse again. You should pick a machine-readable format. If you need to communicate with an existing program, you'll need to find out what formats that program accepts.
If you are programming that other program as well, pick a format that the other programming language supports. JSON is widely supported, for example: use the json module and convert your set to a list first, so json.dump(sorted(ColourCount), fileobj) followed by fileobj.write('\n') produces newline-separated JSON objects.
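As a rough sketch of the newline-separated JSON idea, the following uses an in-memory io.StringIO buffer in place of a file opened in append mode, and made-up colour tuples in place of ColourCount. JSON has no tuple or set type, so the reader has to rebuild them:

```python
import io
import json

# Stand-ins for the sets produced by two separate runs of the program.
first_run = {(0, 0, 255), (255, 0, 0), (0, 255, 0)}
second_run = {(10, 20, 30)}

buffer = io.StringIO()  # plays the role of open('Weedlist', 'a')
for colour_set in (first_run, second_run):
    json.dump(sorted(colour_set), buffer)  # a set is not JSON serialisable
    buffer.write("\n")                     # one JSON document per line

# Reading: each line is a JSON list of [b, g, r] lists.
buffer.seek(0)
restored = [{tuple(colour) for colour in json.loads(line)} for line in buffer]
```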
If that other program is coded in Python, consider using the pickle module, which writes Python objects to a file efficiently in a format the same module can load again:
import pickle
with open('Weedlist', 'ab') as picklefile:
    pickle.dump(ColourCount, picklefile)
and reading is as easy as:
sets = []
with open('Weedlist', 'rb') as picklefile:
    while True:
        try:
            sets.append(pickle.load(picklefile))
        except EOFError:
            break
See Saving and loading multiple objects in pickle file? as to why I use a while True loop there to load multiple entries.
How would you like the data to be written? Replace the final line by
File.write(str(list(ColourCount)))
Maybe you like that more.
If you run that program, it will overwrite the previous content of the file. If you prefer to append the data, open the file with:
File= open('Weedlist', 'a')

In a pickle with pickling in python

I have gone through this website and many others, but no one seems to give me the simplest possible answer. In the script below there are two variables ('test1' and 'test2') that need to be placed into a single pickle, but I am wholly unable to get even the simpler one of the two to load. There are no error messages, and it does appear that something is being written to the pickle, but when I close the program, reopen it, and try to load the pickle, the value of 'test1' does not change.
The second question is how to save both to the same pickle. At first I tried using the allStuff variable to store both test1 and test2 and then dumping allStuff. The dump seems to succeed, but loading does nothing. I've tried a variation where you list each file that should be loaded, but this just caused a whole lot of errors and caused me to assault my poor old keyboard...
Please Help.
import pickle
class testing():
    test1 = 1000
    test2 = {'Dogs' : 0, 'Cats' : 0, 'Birds' : 0, 'Mive' : 0}
    def saveload():
        check = int(input(' 1. Save : 2. Load : 3. Print : 4. Add'))
        allStuff = testing.test1, testing.test2
        saveFile = 'TestingSaveLoad.data'
        if check == 1:
            f = open(saveFile, 'wb')
            pickle.dump(testing.test1, f)
            f.close()
            print()
            print('Saved.')
            testing.saveload()
        elif check == 2:
            f = open(saveFile, 'rb')
            pickle.load(f)
            print()
            print('Loaded.')
            testing.saveload()
        elif check == 3:
            print(allStuff)
            testing.saveload()
        else:
            testing.test1 += 234
            testing.saveload()
testing.saveload()
The pickle.load documentation states:
Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein.
So you would need something like this:
testing.test1 = pickle.load(f)
However, to save and load multiple objects, you can use
# to save
pickle.dump(allStuff, f)
# to load
allStuff = pickle.load(f)
testing.test1, testing.test2 = allStuff
Dump them as a tuple, and when loading, unpack the result back into the two variables.
pickle.dump((testing.test1,testing.test2), f)
and
testing.test1, testing.test2 = pickle.load(f)
Then change the print to print the two items and forget about allStuff, since you would have to keep updating allStuff every time you loaded/reassigned (depending on the type of item you are storing).
print(testing.test1, testing.test2)
I'd also remove the recursive call to saveload() and wrap whatever should be repeated in a while loop with an option to exit:
if check == 0:
    break
You aren't saving the reconstituted pickled object currently. The documentation states that pickle.load() returns the reconstituted object.
You should have something like:
f = open(saveFile, 'rb')
testing.test1 = pickle.load(f)
To save multiple objects, use the approach recommended in this answer:
If you need to save multiple objects, you can simply put them in a single list, or tuple
Also, I recommend using the with keyword to open the file. That will ensure the file is closed even if something goes wrong. An example of a final output:
with open(saveFile, 'wb') as f:
    pickle.dump((testing1, testing2), f)
...
with open(saveFile, 'rb') as f:
    testing1, testing2 = pickle.load(f) # Implicit unpacking of the tuple
You might also want a while loop instead of the multiple calls to saveload() - it will be a bit cleaner. Note that right now you have no way out of your loop, short of quitting the program.
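Putting the suggestions from these answers together, one possible shape for the program is sketched below. It is not the original poster's code: save_state, load_state and run_menu are hypothetical names, and the menu choices are passed in as a list instead of being read with input() so that the example runs without a keyboard:

```python
import os
import pickle
import tempfile


def save_state(path, test1, test2):
    """Dump both values as a single tuple."""
    with open(path, "wb") as f:
        pickle.dump((test1, test2), f)


def load_state(path):
    """Return the (test1, test2) tuple written by save_state()."""
    with open(path, "rb") as f:
        return pickle.load(f)


def run_menu(path, choices):
    """Handle menu choices in a while loop; choice 0 exits."""
    test1 = 1000
    test2 = {"Dogs": 0, "Cats": 0, "Birds": 0, "Mive": 0}
    pending = iter(choices)
    while True:
        check = next(pending, 0)  # running out of choices also exits
        if check == 0:
            break
        elif check == 1:
            save_state(path, test1, test2)
        elif check == 2:
            # Keep the return value; this is the step the question missed.
            test1, test2 = load_state(path)
        elif check == 3:
            print(test1, test2)
        else:
            test1 += 234
    return test1, test2


with tempfile.TemporaryDirectory() as tmp_dir:
    save_file = os.path.join(tmp_dir, "TestingSaveLoad.data")
    # Add, save, add again, load: the saved value replaces the newer one.
    final_test1, final_test2 = run_menu(save_file, [4, 1, 4, 2, 0])
```

In an interactive version, next(pending, 0) would be replaced by int(input(...)) as in the question.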
