append variables to a pickle file and read them - python

I'm trying to append several variables to a pickle file so I can read them back later, but it doesn't work as I expected. I would expect that at the end of this script c='A' and d='B', but instead it throws an error. Could you please explain why, and how to get what I want? Many thanks.
import pickle
filename = 'test.pkl'
a = 'A'
b = 'B'
with open(filename, 'wb') as handle:
    pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open(filename, 'ab') as handle:
    pickle.dump(b, handle)
with open(filename, 'rb') as filehandle:
    c,d = pickle.load(filehandle)

After running your code, I got ValueError: not enough values to unpack (expected 2, got 1).
If you run help(pickle.load), it will tell you that it reads and returns a single object from the file. If you have multiple objects in the file, you have to call pickle.load multiple times to read the objects sequentially.
Your issue is basically you stored them as 2 separate objects but are attempting to read them as a single tuple.
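A minimal sketch of the sequential reads described above, assuming Python 3. The temporary file path is a hypothetical stand-in for test.pkl so the example does not touch real data.

```python
import os
import pickle
import tempfile

# Hypothetical temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "test.pkl")

with open(path, "wb") as handle:   # the first dump creates the file
    pickle.dump("A", handle)
with open(path, "ab") as handle:   # the second dump is appended as a separate object
    pickle.dump("B", handle)

with open(path, "rb") as handle:
    c = pickle.load(handle)        # reads the first pickled object
    d = pickle.load(handle)        # continues from where the first load stopped

print(c, d)
```

Each load call consumes exactly one pickled object and leaves the file position at the start of the next one.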

The problem is that pickle.load(filehandle) only reads the first object in the file. The most common way to solve this problem is to use a tuple or list: you pickle a single container object that holds all of your values, and then unpack it after loading. So you would do this:
import pickle
filename = 'test.pkl'
a = 'A'
b = 'B'
values = (a, b)  # one tuple holding both values
with open(filename, 'wb') as handle:
    pickle.dump(values, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open(filename, 'rb') as filehandle:
    c,d = pickle.load(filehandle)

Short answer: Every load must correspond to a single dump in the pickling code. You can't have one load that gets the values from two separate dump calls; they need to match.
So you can either:
1. load twice to match the two dump calls:
    # dump code unchanged
    with open(filename, 'rb') as filehandle:
        c = pickle.load(filehandle) # Load first object
        d = pickle.load(filehandle) # Load second object
2. dump as a single object so it can be loaded as a single object:
    with open(filename, 'wb') as handle:
        # dump a simple anonymous tuple of both objects
        pickle.dump((a, b), handle, protocol=pickle.HIGHEST_PROTOCOL)
    # Original load code unchanged
Since you clearly know exactly how many objects must be dumped, either solution works; I'd choose #2 in most cases, unless the two objects are computed in wildly different places in the code and not preserved, requiring the first object to be serialized and discarded before the second exists.

Related

Python: csv to pickle representation, back to csv messes with file content

I am trying to pickle a csv file and then turn its pickled representation back into a csv file.
This is the code I came up with:
from pathlib import Path
import pickle, csv
csvFilePath = Path('/path/to/file.csv')
pathToSaveTo = Path('/path/to/newFile.csv')
csvFile = open(csvFilePath, 'r')
f = csvFile.read()
csvFile.close()
f_pickled = pickle.dumps(f)
f_unpickled = pickle.loads(f_pickled)
#save unpickled csv file
new_csvFile = open(pathToSaveTo, 'w')
csvWriter = csv.writer(new_csvFile)
csvWriter.writerow(f_unpickled)
new_csvFile.close()
newFile.csv is created however there are two problems with its content:
There is now a comma between every character.
There is now a pair of quotation marks after every line.
What would I have to change about my code to get an exact copy of file.csv?
The problem is that you are reading the raw text of the file with f = csvFile.read(). Then, on writing, you feed that data, which is a single lump of text in one string, through a CSV writer object. The CSV writer sees the string as an iterable and writes each of its elements (each character) to a separate CSV cell. Then there is no data for a second row, and the process ends.
The pickle dumps and loads you perform is just a no-operation: nothing happens there, and if there were any issue, it would rather be due to some unpickleable object reference in the object you are passing to dumps: you'd get an exception, and not differing data when loads is called.
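The two points above can be sketched as follows. This is an illustration only: the sample text stands in for the contents of file.csv, and an in-memory io.StringIO buffer replaces the output file.

```python
import csv
import io
import pickle

text = "a,b\n"  # stand-in for the raw contents of a CSV file

# Pickling and then unpickling a string gives back an equal string.
round_tripped = pickle.loads(pickle.dumps(text))

# writerow() iterates over its argument, so a string becomes one cell per character.
buffer = io.StringIO()
csv.writer(buffer).writerow(text)
written = buffer.getvalue()

print(repr(written))
```

The comma and the newline each become a cell of their own, and the writer quotes them because they are special characters in CSV. That would account for both the extra commas and the stray quotation marks seen in newFile.csv.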
Now, without knowing why you want to do this, and what intermediate steps you have planned for the data, it is hard to advise you: the whole cycle of reading a file, pickling and unpickling its contents, and writing those contents back to disk amounts to a no-op.
At which point do you need these data structured as rows, or as CSV cells? Just apply the proper transforms where you need it, and you are done.
If you want the whole "do nothing" cycle to actually go through having the CSV data separated into different elements in Python, you can do:
from pathlib import Path
import pickle, csv
csvFilePath = Path('file.csv')
pathToSaveTo = Path('newFile.csv')
with open(csvFilePath, newline='') as source:
    data = list(csv.reader(source))
# ^ consumes all iterations of the reader: each iteration is a row, composed of a list where each cell value is a list element
pickled_data = pickle.dumps(data)
restored_data = pickle.loads(pickled_data)
with open(pathToSaveTo, "w", newline='') as target:
    csv.writer(target).writerows(restored_data)
Note that in this snippet the data is read through csv.reader, not directly. Wrapping it in a list call causes all rows to be read and turned into list items, because the reader is otherwise a lazy iterator (and it would not be picklable, as one of the attributes its state depends on is an open file).
I believe the problem is in how you're attempting to write the CSV file; the pickling and unpickling is fine. If you compare f with f_unpickled:
if f==f_unpickled:
    print("Same")
This printed "Same" in my case. If you print the types, you'll see they are both strings.
The better option is to follow the style used in the documentation and write each row one at a time, rather than passing in the entire string including newlines. Something like this:
from pathlib import Path
import pickle, csv
csvFilePath = Path('file.csv')
pathToSaveTo = Path('newFile.csv')
rows = []
with open(csvFilePath, 'r') as file:
    reader = csv.reader(file)
    for row in reader:
        rows.append(row)
# pickle and unpickle
rows_pickled = pickle.dumps(rows)
rows_unpickled = pickle.loads(rows_pickled)
if rows==rows_unpickled:
    print("Same")
#save unpickled csv file
with open(pathToSaveTo, 'w', newline='') as csvfile:
    csvWriter = csv.writer(csvfile)
    for row in rows_unpickled:
        csvWriter.writerow(row)
This worked when I tested it--albeit it would take more finagling with line separators to get no empty line at the end.

Writing and reading a list from a file in Python

I want to save a list in Python to a file so that it can be read back later and assigned to a list variable.
As an example
list = [42,54,24,65]
This should be written to a file as
[42,54,24,65] or
list = [42,54,24,65]
It should then be possible to read it back from Python later and assign it to a list variable.
Right now I'm using the following code.
f = open('list_file', 'w')
f.write(values)
f.close()
This gives me an error
TypeError: write() argument must be str, not list
How can I fix this?
Thanks
You could also do it with pickle. It works similarly to json, but it can serialize a broader set of Python objects than json. JSON serializes to text and is human-readable, while pickle serializes to bytes and is not human-readable.
Consider this example:
import pickle, json
list_ = [42,54,24,65]
with open('list_file.pickle', 'wb') as fp, open('list_file.json', 'w') as fj:
    pickle.dump(list_, fp)
    json.dump(list_, fj)
with open('list_file.pickle', 'rb') as fp, open('list_file.json', 'r') as fj:
    list_unpickled = pickle.load(fp)
    list_from_json = json.load(fj)
print(list_unpickled) #[42, 54, 24, 65]
print(list_from_json) #[42, 54, 24, 65]
Notice that with pickle you have to open the files with the 'b' for binary reading/writing.
A side note: do not give variables the same name as Python built-ins, like list.
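The difference in coverage between the two modules can be sketched like this. The sample dictionary is a made-up value chosen because it contains a set and a tuple, two types JSON has no direct representation for.

```python
import json
import pickle

# Hypothetical sample data containing a set and a tuple.
data = {"tags": {"a", "b"}, "point": (1, 2)}

# pickle round-trips the object with its types intact.
restored = pickle.loads(pickle.dumps(data))

# json refuses the set outright.
try:
    json.dumps(data)
    json_ok = True
except TypeError:
    json_ok = False

# json accepts the tuple, but it comes back as a list.
as_json = json.loads(json.dumps({"point": (1, 2)}))

print(restored, json_ok, as_json)
```

For a plain list of integers, as in the question, both modules give back an equal list, so the choice mostly comes down to whether the file should be human-readable.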
According to 12.1.4 in the documentation:
The following types can be pickled:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).
If you just have a simple list, then you can use JSON and the json module.
import json
data = [42,54,24,65]
with open('output.txt', 'w') as f_out:
    json.dump(data, f_out)
with open('output.txt', 'r') as f_in:
    data2 = json.load(f_in)
print(data2) # [42, 54, 24, 65]
And the contents of output.txt look like
[42, 54, 24, 65]
Map all values in the list to strings first; the write method only supports strings.
E.g. values = list(map(str, values))
Also, calling a variable "list" is bad practice because it shadows the built-in list type; use something like "values" or any other name that differs from Python's built-ins. If you want to read the data back later, you can just delimit the values using spaces. Write it with f.write(" ".join(values)). Then, to read it back into a list, do values = f.readline().split(). This, however, keeps the values in the list as strings; to get them back to ints, map again: values = list(map(int, values))
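A minimal sketch of the space-delimited round trip described above, assuming Python 3. The variable name values and the temporary file path are placeholders chosen for this example.

```python
import os
import tempfile

values = [42, 54, 24, 65]

# Hypothetical temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "list_file")

# Convert each number to a string and join them with single spaces.
with open(path, "w") as f:
    f.write(" ".join(map(str, values)))

# Split the line on whitespace and convert each piece back to an int.
with open(path) as f:
    restored = [int(item) for item in f.readline().split()]

print(restored)
```

This only works for flat lists of values that contain no spaces themselves; nested or mixed data is better served by json or pickle.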
According to the error, you are passing a list to f.write(); you need to pass a string.
I assume you want to write one value per line. Try the code below; it should work.
f = open('list_file', 'w')
for value in values:
    f.write(str(value) + "\n")
f.close()
To read it later, you can open the file again and read it using this code:
f = open('list_file', 'r')
for line in f:
    print(line.strip())
f.close()
Turning my comment into an answer:
Try Saving and loading objects and using pickle:
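import pickle
# banana stands for whatever object you want to save
filehandler = open("Fruits.obj", "wb")
pickle.dump(banana, filehandler)
filehandler.close()
To load the data, use:
file = open("Fruits.obj", 'rb')  # pickle files must be opened in binary mode
object_file = pickle.load(file)
file.close()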

Pickle dump replaces current file data

When I use pickle, it works fine and I can dump any load.
The problem is if I close the program and try to dump again, it replaces the old file data with the new dumping. Here is my code:
import pickle
import os
import time
dictionary = dict()
def read():
    with open('test.txt', 'rb') as f:
        a = pickle.load(f)
    print(a)
    time.sleep(2)
def dump():
    chs = raw_input('name and number')
    n = chs.split()
    dictionary[n[0]] = n[1]
    with open('test.txt', 'wb') as f:
        pickle.dump(dictionary, f)
Inpt = raw_input('Option : ')
if Inpt == 'read':
    read()
else:
    dump()
When you open a file in w mode (or wb), that tells it to write a brand-new file, erasing whatever was already there.
As the docs say:
The most commonly-used values of mode are 'r' for reading, 'w' for writing (truncating the file if it already exists), and 'a' for appending…
In other words, you want to use 'ab', not 'wb'.
However, when you append new dumps to the same file, you end up with a file made up of multiple separate values. If you only call load once, it's just going to load the first one. If you want to load all of them, you need to write code that does that. For example, you can load in a loop until EOFError.
Really, it looks like what you're trying to do is not to append to the pickle file, but to modify the existing pickled dictionary.
You could do that with a function that loads and merges all of the dumps together, like this:
def Load():
    d = {}
    with open('test.txt', 'rb') as f:
        while True:
            try:
                a = pickle.load(f)
            except EOFError:
                break
            else:
                d.update(a)
    # do stuff with d
But that's going to get slower and slower the more times you run your program, as you pile on more and more copies of the same values. To do that right you need to load the old dictionary, modify that, and then dump the modified version. And for that, you want w mode.
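The load, modify, dump cycle described above might look like the following sketch. It assumes Python 3 (the question's code is Python 2), and add_entry and the temporary path are hypothetical names introduced for illustration.

```python
import os
import pickle
import tempfile


def add_entry(path, name, number):
    """Load the stored dict (if any), add one entry, and rewrite the file."""
    try:
        with open(path, "rb") as f:
            entries = pickle.load(f)
    except FileNotFoundError:
        entries = {}               # first run: nothing saved yet
    entries[name] = number
    with open(path, "wb") as f:    # 'wb' is right here: the whole dict is rewritten
        pickle.dump(entries, f)
    return entries


# Hypothetical temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
add_entry(path, "alice", "123")
result = add_entry(path, "bob", "456")
print(result)
```

Because the file always holds exactly one pickled dictionary, a single load call is enough to read everything back.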
However, a much better way to persist a dictionary, at least if the keys are strings, is to use dbm (if the values are also strings) or shelve (otherwise) instead of a dictionary in the first place.
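As a rough sketch of the shelve suggestion, assuming Python 3.4 or later (where a shelf can be used as a context manager): the file name and the stored entries are made up for this example.

```python
import os
import shelve
import tempfile

# Hypothetical temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "phonebook")

# First "run" of the program: store one entry.
with shelve.open(path) as db:
    db["alice"] = "123"

# Second "run": the earlier entry is still there, and a new one is added.
with shelve.open(path) as db:
    db["bob"] = "456"

# Read everything back as an ordinary dict.
with shelve.open(path) as db:
    stored = dict(db)

print(stored)
```

A shelf behaves like a dictionary that lives on disk, so there is no need to load and rewrite the whole structure for each change. Note that shelve uses pickle internally for the values.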
Opening a file in "wb" mode truncates the file -- that is, it deletes the contents of the file, and then allows you to work on it.
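Usually, you'd open the file in append ("ab") mode to add data at the end. With pickle, however, each dump to an appended file is stored as a separate object, and a single pickle.load call returns only the first one, so appending does not update the dictionary you saved earlier. One workaround is to save your data to a new file (come up with a different file name -- ask the user or use a command-line parameter such as -o test.txt?) each time the program is run.
On a related topic, consider avoiding pickle. It's unsafe: unpickling data from an untrusted source can execute arbitrary code. Consider using JSON instead (it's in the standard lib -- import json).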

In a pickle with pickling in python

I have gone through this website and many others, but no one seems to give me the simplest possible answer. In the script below there are 2 different variables that need to be placed into a single pickle (namely 'test1' and 'test2'), but I am wholly unable to get even the simpler one of the two to load. There are no error messages or anything, and it does appear that something is being written to the pickle, but then I close the 'program', reopen it, try to load the pickle, and the value of 'test1' does not change.
The second question is how to save both to the same pickle. At first I tried using the allStuff variable to store both test1 and test2 and then dumping allStuff... the dump seems to be a success, but loading does nothing. I've tried a variation where you list each file that should be loaded, but this just caused a whole lot of errors and caused me to assault my poor old keyboard...
Please Help.
import pickle
class testing():
    test1 = 1000
    test2 = {'Dogs' : 0, 'Cats' : 0, 'Birds' : 0, 'Mive' : 0}
    def saveload():
        check = int(input(' 1. Save : 2. Load : 3. Print : 4. Add'))
        allStuff = testing.test1, testing.test2
        saveFile = 'TestingSaveLoad.data'
        if check == 1:
            f = open(saveFile, 'wb')
            pickle.dump(testing.test1, f)
            f.close()
            print()
            print('Saved.')
            testing.saveload()
        elif check == 2:
            f = open(saveFile, 'rb')
            pickle.load(f)
            print()
            print('Loaded.')
            testing.saveload()
        elif check == 3:
            print(allStuff)
            testing.saveload()
        else:
            testing.test1 += 234
            testing.saveload()
testing.saveload()
The pickle.load documentation states:
Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein.
So you would need something like this:
testing.test1 = pickle.load(f)
However, to save and load multiple objects, you can use
# to save
pickle.dump(allStuff, f)
# to load
allStuff = pickle.load(f)
testing.test1, testing.test2 = allStuff
Dump them as a tuple, and when loading, unpack the result back into the two variables.
pickle.dump((testing.test1,testing.test2), f)
and
testing.test1, testing.test2 = pickle.load(f)
Then change the print to print the two items and forget about allStuff, since you would have to keep updating allStuff every time you loaded/reassigned (depending on the type of item you are storing).
print(testing.test1, testing.test2)
I'd also remove the recursive call to saveload() and wrap whatever should be repeated in a while loop with an option to exit:
if check == 0:
    break
You aren't saving the reconstituted pickled object currently. The documentation states that pickle.load() returns the reconstituted object.
You should have something like:
f = open(saveFile, 'rb')
testing.test1 = pickle.load(f)
To save multiple objects, use the approach recommended in this answer:
If you need to save multiple objects, you can simply put them in a single list, or tuple
Also, I recommend using the with keyword to open the file. That will ensure the file is closed even if something goes wrong. An example of a final output:
with open(saveFile, 'wb') as f:
    pickle.dump((testing.test1, testing.test2), f)
...
with open(saveFile, 'rb') as f:
    testing.test1, testing.test2 = pickle.load(f) # Implicit unpacking of the tuple
You might also want a while loop instead of the multiple calls to saveload() - it will be a bit cleaner. Note that right now you have no way out of your loop, short of quitting the program.
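One possible shape for the loop-based version is sketched below. It is not the asker's program: the state dictionary replaces the class attributes, and run, read_choice and the scripted choices are hypothetical names used so the example can run without keyboard input. An interactive version could pass a function that calls int(input(...)) instead.

```python
import os
import pickle
import tempfile

# Stand-in for the class attributes in the question.
state = {"test1": 1000, "test2": {"Dogs": 0, "Cats": 0}}


def run(read_choice, save_file):
    """Repeat the menu in a while loop; 0 exits, 1 saves, 2 loads, other values add."""
    while True:
        check = read_choice()
        if check == 0:
            break                                   # explicit way out of the loop
        elif check == 1:
            with open(save_file, "wb") as f:
                pickle.dump((state["test1"], state["test2"]), f)
        elif check == 2:
            with open(save_file, "rb") as f:
                state["test1"], state["test2"] = pickle.load(f)
        else:
            state["test1"] += 234


# Hypothetical temporary location and a scripted sequence of menu choices.
save_file = os.path.join(tempfile.mkdtemp(), "TestingSaveLoad.data")
scripted = iter([4, 1, 4, 2, 0])   # add, save, add, load, exit
run(lambda: next(scripted), save_file)
print(state["test1"])
```

The load step assigns the unpickled tuple back to the stored values, which is the part missing from the original code, and the loop replaces the recursive calls.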

pickle - putting more than 1 object in a file? [duplicate]

This question already has answers here:
Saving and loading multiple objects in pickle file?
(8 answers)
Closed 6 years ago.
I have got a method which dumps a number of pickled objects (tuples, actually) into a file.
I do not want to put them into one list, I really want to dump several times into the same file.
My problem is, how do I load the objects again?
The first and second object are just one line long, so this works with readlines.
But all the others are longer.
Naturally, if I try
myob = cpickle.load(g1.readlines()[2])
where g1 is the file, I get an EOF error because my pickled object is longer than one line.
Is there a way to get just my pickled object?
If you pass the filehandle directly into pickle you can get the result you want.
import pickle
# write a file (binary mode is required for pickle data)
f = open("example", "wb")
pickle.dump(["hello", "world"], f)
pickle.dump([2, 3], f)
f.close()
# read the objects back in the order they were written
f = open("example", "rb")
value1 = pickle.load(f)
value2 = pickle.load(f)
f.close()
pickle.dump will append to the end of the file, so you can call it multiple times to write multiple values.
pickle.load will read only enough from the file to get the first value, leaving the filehandle open and pointed at the start of the next object in the file. The second call will then read the second object, and leave the file pointer at the end of the file. A third call will fail with an EOFError as you'd expect.
Although I used plain old pickle in my example, this technique works just the same with cPickle.
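The read behaviour described above, including the EOFError on the extra call, can be checked with a small sketch. It uses an in-memory io.BytesIO buffer in place of a real file, which is an assumption made only to keep the example self-contained.

```python
import io
import pickle

# In-memory stand-in for a file opened in binary mode.
buffer = io.BytesIO()
pickle.dump(["hello", "world"], buffer)
pickle.dump([2, 3], buffer)

buffer.seek(0)                    # rewind, as reopening the file for reading would
value1 = pickle.load(buffer)      # first object
value2 = pickle.load(buffer)      # second object

# A third load finds no more data and raises EOFError.
try:
    pickle.load(buffer)
    hit_eof = False
except EOFError:
    hit_eof = True

print(value1, value2, hit_eof)
```

Catching EOFError in a loop is therefore a workable way to read back an unknown number of objects.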
I think the best way is to pack your data into a single object before you store it, and unpack it after loading it. Here's an example using a tuple as the container (you can also use a dict):
import pickle
a = [1,2]
b = [3,4]
with open("tmp.pickle", "wb") as f:
    pickle.dump((a,b), f)
with open("tmp.pickle", "rb") as f:
    a,b = pickle.load(f)
Don't try reading them back as lines of the file, just pickle.load() the number of objects you want. See my answer to the question How to save an object in Python for an example of doing that.
