Saving a Dictionary of Pointers to File - python

I'm trying to create a family tree in python, and I want to create a dictionary of Persons, where the key is the person's first name and the value is a Person object for that name.
I can create the dictionary fine, and I can save it fine using the below code.
import os, ast
myFile = open( FILE, "r+" ) # Opens the file for reading and writing
tree = myFile.read()
if tree == "":
    tree = {}
else:
    tree = ast.literal_eval(tree)
def save():
    myFile.write(str(tree))
    myFile.close()
However, when I reload my program and try to read in the dictionary, I get the following error:
  File "<unknown>", line 1
    {'Charlie': <__main__.Person object at 0x00000000032DB860>}
                ^
SyntaxError: invalid syntax
I suspect that the evaluator can't recognize the object because once the program closes the pointer no longer exists. Is there a way I can save my dictionary so that I can reload it and have access to all my Person objects' data without losing it each time my program closes?

ast.literal_eval is not meant for evaluating and understanding custom objects. It safely evaluates strings that contain only Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, and None). You need to serialize your data by using something like pickle.
>>> import pickle
>>> class Person(object):
...     def __init__(self, name):
...         self.name = name
...
>>> persons = {'Charlie': Person('Charlie')}
>>> with open(FILE, "wb") as my_file:
...     pickle.dump(persons, my_file)
...
>>> with open(FILE, "rb") as f:
...     result = pickle.load(f)
...
>>> result
{'Charlie': <__main__.Person object at 0x1598bd0>}
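When you have large (or many) objects in Python 2, you can use cPickle, in which the pickling is done in C instead of Python, providing a major speed increase. In Python 3, the pickle module uses the C implementation automatically when it is available.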
If object serialization is something you have not heard about before, please read this.
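To see why the approach in the question fails, here is a small sketch. The strings are made up for illustration. ast.literal_eval can parse text that contains only literals, but the default repr of an object is not valid Python syntax.

```python
import ast

# A string holding only literals round-trips through str() and literal_eval.
plain = {'Charlie': 42}
restored = ast.literal_eval(str(plain))

# The default repr of a custom object is not valid Python syntax,
# so parsing fails before anything is evaluated.
text = "{'Charlie': <__main__.Person object at 0x00000000032DB860>}"
try:
    ast.literal_eval(text)
    parse_failed = False
except SyntaxError:
    parse_failed = True

print(restored)
print(parse_failed)
```

The memory address in the repr is only a label for the live object. None of the object's data is in that string, which is why a real serializer is needed.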

Related

Writing and reading a list from a file in Python

I want to save a list in python to a file which should be able to read later and added to a list variable in later use.
As an example
list = [42,54,24,65]
This should be written to a file as
[42,54,24,65] or
list = [42,54,24,65]
And should be able to read later from python for a later use and assign it to a list variable
Right now I'm using the following code.
f = open('list_file', 'w')
f.write(values)
f.close()
This gives me an error
TypeError: write() argument must be str, not list
How can I fix this?
Thanks
You could do it also with pickle, it works similarly to json, but it can serialize a broader set of Python objects than json. Json serializes text, and is human readable, while pickle serializes bytes, not human readable.
Consider this example:
import pickle, json
list_ = [42,54,24,65]
with open('list_file.pickle', 'wb') as fp, open('list_file.json', 'w') as fj:
    pickle.dump(list_, fp)
    json.dump(list_, fj)
with open('list_file.pickle', 'rb') as fp, open('list_file.json', 'r') as fj:
    list_unpickled = pickle.load(fp)
    list_from_json = json.load(fj)
print(list_unpickled) #[42, 54, 24, 65]
print(list_from_json) #[42, 54, 24, 65]
Notice that with pickle you have to open the files with the 'b' for binary reading/writing.
A side note: do not give variables the same name as Python built-ins, like list.
According to 12.1.4 in the documentation:
The following types can be pickled:
None, True, and False
integers, floating point numbers, complex numbers
strings, bytes, bytearrays
tuples, lists, sets, and dictionaries containing only picklable objects
functions defined at the top level of a module (using def, not lambda)
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).
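A quick way to check the list above is to try it. This is only a sketch with made-up sample data: built-in containers round-trip, while a lambda is rejected.

```python
import pickle

# Built-in containers holding only picklable objects round-trip fine.
data = {'numbers': [42, 54, 24, 65], 'pair': (1, 2.5), 'tags': {'a', 'b'}, 'none': None}
copy = pickle.loads(pickle.dumps(data))

# A lambda has no importable name at module top level, so pickle
# cannot look it up and refuses to serialize it.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(copy == data)
print(lambda_pickled)
```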
If you just have a simple list, then you can use JSON and the json module.
import json
data = [42,54,24,65]
with open('output.txt', 'w') as f_out:
    json.dump(data, f_out)
with open('output.txt', 'r') as f_in:
    data2 = json.load(f_in)
print(data2) # [42, 54, 24, 65]
And the contents of output.txt look like
[42, 54, 24, 65]
Map all values in the list to strings first; the write method only supports strings.
E.g. ls = list(map(str, ls))
Also, calling a variable "list" is bad practice because it shadows the built-in list type; use something like "ls" or any other name that differs from Python's built-in names. If you want to read the values back later, you can delimit them with spaces. Just write it like f.write(" ".join(ls)). Then, to read it back into a list, do ls = f.readline().split(). This, however, will keep the values in the list as strings; to get them back to ints, map again like ls = list(map(int, ls))
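Put together, the space-delimited approach could look like the sketch below. The temporary directory and the variable names are assumptions for the demo, not part of the question.

```python
import os
import tempfile

values = [42, 54, 24, 65]
path = os.path.join(tempfile.mkdtemp(), 'list_file')  # hypothetical location

# write() accepts only strings, so convert each int before joining.
with open(path, 'w') as f:
    f.write(" ".join(map(str, values)))

# split() yields strings; map them back to int when reading.
with open(path, 'r') as f:
    loaded = list(map(int, f.readline().split()))

print(loaded)
```

This only works for flat lists of simple values. For nested data, json or pickle is the less fragile choice.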
According to the error, you are passing a list to f.write(); you need to pass a string.
I'm assuming you want to write one value per line. Try the code below; it should work.
f = open('list_file', 'w')
for value in values:
    f.write(str(value) + "\n")
f.close()
To read it back later, open the file again and read it with this code:
f = open('list_file', 'r')
for line in f:
    print(line.strip())
f.close()
Turning my comment into an answer:
Try Saving and loading objects and using pickle:
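import pickle
filehandler = open("Fruits.obj", "wb")
pickle.dump(banana, filehandler)  # banana is whatever object you want to save
filehandler.close()
To load the data, use:
file = open("Fruits.obj", 'rb')  # pickle files must be read in binary mode
object_file = pickle.load(file)
file.close()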

Pickle dump replaces current file data

When I use pickle, it works fine and I can dump any load.
The problem is if I close the program and try to dump again, it replaces the old file data with the new dumping. Here is my code:
import pickle
import os
import time
dictionary = dict()
def read():
    with open('test.txt', 'rb') as f:
        a = pickle.load(f)
    print(a)
    time.sleep(2)
def dump():
    chs = raw_input('name and number')
    n = chs.split()
    dictionary[n[0]] = n[1]
    with open('test.txt', 'wb') as f:
        pickle.dump(dictionary, f)
Inpt = raw_input('Option : ')
if Inpt == 'read':
    read()
else:
    dump()
When you open a file in w mode (or wb), that tells it to write a brand-new file, erasing whatever was already there.
As the docs say:
The most commonly-used values of mode are 'r' for reading, 'w' for writing (truncating the file if it already exists), and 'a' for appending…
In other words, you want to use 'ab', not 'wb'.
However, when you append new dumps to the same file, you end up with a file made up of multiple separate values. If you only call load once, it's just going to load the first one. If you want to load all of them, you need to write code that does that. For example, you can load in a loop until EOFError.
Really, it looks like what you're trying to do is not to append to the pickle file, but to modify the existing pickled dictionary.
You could do that with a function that loads and merges all of the dumps together, like this:
def Load():
    d = {}
    with open('test.txt', 'rb') as f:
        while True:
            try:
                a = pickle.load(f)
            except EOFError:
                break
            else:
                d.update(a)
    # do stuff with d
But that's going to get slower and slower the more times you run your program, as you pile on more and more copies of the same values. To do that right you need to load the old dictionary, modify that, and then dump the modified version. And for that, you want w mode.
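A minimal sketch of that load, modify, dump cycle might look like this. The helper names load_dict and add_entry, the file name, and the sample entries are all made up for illustration.

```python
import os
import pickle
import tempfile

def load_dict(path):
    """Return the pickled dictionary, or an empty one if no file exists yet."""
    if not os.path.exists(path):
        return {}
    with open(path, 'rb') as f:
        return pickle.load(f)

def add_entry(path, name, number):
    """Load the old dictionary, modify it, and overwrite the file with the result."""
    d = load_dict(path)
    d[name] = number
    with open(path, 'wb') as f:  # 'wb' is right here: the whole dict is rewritten
        pickle.dump(d, f)

path = os.path.join(tempfile.mkdtemp(), 'test.pickle')  # hypothetical file
add_entry(path, 'alice', '555-0100')
add_entry(path, 'bob', '555-0199')
print(load_dict(path))
```

Because the file always holds exactly one dictionary, a single load call is enough and the file does not grow with duplicate copies.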
However, a much better way to persist a dictionary, at least if the keys are strings, is to use dbm (if the values are also strings) or shelve (otherwise) instead of a dictionary in the first place.
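For the shelve route, a sketch could look like the following. It assumes Python 3.4 or later (for the with statement support), and the shelf name and entries are hypothetical.

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'phonebook')  # hypothetical shelf name

# Each run opens the same shelf and assigns keys as with a dict;
# entries written earlier stay on disk.
with shelve.open(path) as shelf:
    shelf['alice'] = {'number': '555-0100'}

with shelve.open(path) as shelf:
    shelf['bob'] = {'number': '555-0199'}

with shelve.open(path) as shelf:
    names = sorted(shelf.keys())
    alice = shelf['alice']

print(names)
print(alice)
```

Each key is stored separately, so adding one entry does not require loading and rewriting everything else.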
Opening a file in "wb" mode truncates the file -- that is, it deletes the contents of the file, and then allows you to work on it.
Usually, you'd open the file in append ("ab") mode to add data at the end. However, a single pickle.load call reads back only one dumped object, so appending will not extend the dictionary you saved earlier. A simple workaround is to save your data to a new file (come up with a different file name -- ask the user or use a command-line parameter such as -o test.txt?) each time the program is run.
On a related topic, avoid Pickle where you can. It's unsafe: unpickling data from an untrusted source can execute arbitrary code. Consider using JSON instead (it's in the standard lib -- import json).

In a pickle with pickling in python

I have gone through this website and many others but no one seems to give me the simplest possible answer. In the script below there are 2 different variables that need to be placed into a single pickle (aka 'test1' and 'test2'), but I am wholly unable to get even the simpler one of the two to load. There are no error messages or anything, and it does appear that something is being written to the pickle, but then I close the 'program', reopen it, try to load the pickle, and the value of 'test1' does not change.
The second question is how to save both to the same pickle? At first I tried using the allStuff variable to store both test1 and test2 and then dumping allStuff... the dump seems to be a success but loading does jack. I've tried a variation where you list each file that should be loaded but this just caused a whole lot of errors and caused me to assault my poor old keyboard...
Please Help.
import pickle
class testing():
    test1 = 1000
    test2 = {'Dogs' : 0, 'Cats' : 0, 'Birds' : 0, 'Mive' : 0}
    def saveload():
        check = int(input(' 1. Save : 2. Load : 3. Print : 4. Add'))
        allStuff = testing.test1, testing.test2
        saveFile = 'TestingSaveLoad.data'
        if check == 1:
            f = open(saveFile, 'wb')
            pickle.dump(testing.test1, f)
            f.close()
            print()
            print('Saved.')
            testing.saveload()
        elif check == 2:
            f = open(saveFile, 'rb')
            pickle.load(f)
            print()
            print('Loaded.')
            testing.saveload()
        elif check == 3:
            print(allStuff)
            testing.saveload()
        else:
            testing.test1 += 234
            testing.saveload()
testing.saveload()
The pickle.load documentation states:
Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein.
So you would need something like this:
testing.test1 = pickle.load(f)
However, to save and load multiple objects, you can use
# to save
pickle.dump(allStuff, f)
# to load
allStuff = pickle.load(f)
testing.test1, testing.test2 = allStuff
Dump them as a tuple, and when loading, unpack the result back into the two variables.
pickle.dump((testing.test1,testing.test2), f)
and
testing.test1, testing.test2 = pickle.load(f)
Then change the print to print the two items and forget about allStuff, since you would have to keep updating allStuff every time you loaded/reassigned (depending on the type of item you are storing).
print(testing.test1, testing.test2)
I'd also remove the recursive call to saveload() and wrap whatever should be repeated in a while loop with an option to exit
if check == 0:
    break
You aren't saving the reconstituted pickled object currently. The documentation states that pickle.load() returns the reconstituted object.
You should have something like:
f = open(saveFile, 'rb')
testing.test1 = pickle.load(f)
To save multiple objects, use the approach recommended in this answer:
If you need to save multiple objects, you can simply put them in a single list, or tuple
Also, I recommend using the with keyword to open the file. That will ensure the file is closed even if something goes wrong. An example of a final output:
with open(saveFile, 'wb') as f:
    pickle.dump((testing1, testing2), f)
...
with open(saveFile, 'rb') as f:
    testing1, testing2 = pickle.load(f) # Implicit unpacking of the tuple
You might also want a while loop instead of the multiple calls to saveload() - it will be a bit cleaner. Note that right now you have no way out of your loop, short of quitting the program.
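Here is one way the loop version could look. It is a sketch only: run_menu and the state dict are made-up names, and a list of choices stands in for the repeated input() calls so the example runs without a keyboard. In the real program you would use while True and read check from input().

```python
import os
import pickle
import tempfile

def run_menu(choices, save_file, state):
    """Process menu choices in a loop; 0 exits. `choices` stands in for input()."""
    for check in choices:
        if check == 0:
            break                               # explicit way out of the loop
        elif check == 1:
            with open(save_file, 'wb') as f:    # save both values as one tuple
                pickle.dump((state['test1'], state['test2']), f)
        elif check == 2:
            with open(save_file, 'rb') as f:    # keep what load() returns
                state['test1'], state['test2'] = pickle.load(f)
        elif check == 3:
            print(state['test1'], state['test2'])
        else:
            state['test1'] += 234
    return state

save_file = os.path.join(tempfile.mkdtemp(), 'TestingSaveLoad.data')
state = {'test1': 1000, 'test2': {'Dogs': 0, 'Cats': 0}}
# Save, add 234 in memory, then load: the saved value replaces the edited one.
run_menu([1, 4, 2, 3, 0], save_file, state)
```

The important line is the assignment in the load branch: without it the unpickled values are thrown away, which is what happened in the question.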

Store a dictionary in a file for later retrieval

I've had a search around but can't find anything regarding this...
I'm looking for a way to save a dictionary to file and then later be able to load it back into a variable at a later date by reading the file.
The contents of the file don't have to be "human readable" it can be as messy as it wants.
Thanks
- Hyflex
EDIT
import cPickle as pickle
BDICT = {}
## Automatically generated START
name = "BOB"
name_title = name.title()
count = 5
BDICT[name_title] = count
name = "TOM"
name_title = name.title()
count = 5
BDICT[name_title] = count
name = "TIMMY JOE"
name_title = name.title()
count = 5
BDICT[name_title] = count
## Automatically generated END
if BDICT:
    with open('DICT_ITEMS.txt', 'wb') as dict_items_save:
        pickle.dump(BDICT, dict_items_save)
BDICT = {} ## Wiping the dictionary
## Usually in a loop
firstrunDICT = True
if firstrunDICT:
    with open('DICT_ITEMS.txt', 'rb') as dict_items_open:
        dict_items_read = dict_items_open.read()
        if dict_items_read:
            BDICT = pickle.load(dict_items_open)
    firstrunDICT = False
    print BDICT
Error:
Traceback (most recent call last):
File "C:\test3.py", line 35, in <module>
BDICT = pickle.load(dict_items_open)
EOFError
A few people have recommended shelve - I haven't used it, and I'm not knocking it. I have used pickle/cPickle and I'll offer the following approach:
How to use Pickle/cPickle (the abridged version)...
There are many reasons why you would use Pickle (or its noticeably faster Python 2 variant, cPickle). Put tersely, Pickle is a way to store objects outside of your process.
Pickle not only lets you store objects outside your python process, but also does so in a serialized fashion: each object is converted to a byte stream, and if you dump several objects to one file, successive load() calls return them in the order they were written (first in, first out).
import pickle
## I am making up a dictionary here to show you how this works...
## Because I want to store this outside of this single run, it could be that this
## dictionary is dynamic and user based - so persistence beyond this run has
## meaning for me.
myMadeUpDictionary = {"one": "banana", "two": "banana", "three": "banana", "four": "no-more"}
with open("mySavedDict.txt", "wb") as myFile:
    pickle.dump(myMadeUpDictionary, myFile)
So what just happened?
Step1: imported a module named 'pickle'
Step2: created my dictionary object
Step3: used a context manager to handle the opening/closing of a new file...
Step4: dump() the contents of the dictionary (which is referenced as 'pickling' the object) and then write it to a file (mySavedDict.txt).
If you then go into the file that was just created (located now on your filesystem), you can see the contents. It's messy - ugly - and not very insightful.
nammer#crunchyQA:~/workspace/SandBox/POSTS/Pickle & cPickle$ cat mySavedDict.txt
(dp0
S'four'
p1
S'no-more'
p2
sS'three'
p3
S'banana'
p4
sS'two'
p5
g4
sS'one'
p6
g4
s.
So what's next?
To bring that BACK into our program we simply do the following:
import pickle
with open("mySavedDict.txt", "rb") as myFile:
    myNewPulledInDictionary = pickle.load(myFile)
print myNewPulledInDictionary
Which provides the following return:
{'four': 'no-more', 'one': 'banana', 'three': 'banana', 'two': 'banana'}
cPickle vs Pickle
You won't see many people use the pure-Python pickle these days (in Python 2) - I can't think off the top of my head why you would want to use the first implementation of pickle, especially when there is cPickle which does the same thing (more or less) but a lot faster! (In Python 3 there is no separate cPickle module; pickle picks up the C implementation automatically.)
So you can be lazy and do:
import cPickle as pickle
Which is great if you have something already built that uses pickle... but I argue that this is a bad recommendation and I fully expect to get scolded for even recommending that! You should really look at your old implementation that used the original pickle and see if you need to change anything to follow cPickle patterns. That said, if you have legacy code or production code you are working with, the alias saves you time refactoring (finding/replacing all instances of pickle with cPickle).
Otherwise, just:
import cPickle
and everywhere you see a reference to the pickle library, just replace accordingly. They have the same load() and dump() method.
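If the same script has to run on both Python 2 and Python 3, a common pattern is to try cPickle first and fall back to pickle. This is a sketch with made-up sample data, not something from the original post.

```python
try:
    import cPickle as pickle   # Python 2: the C implementation
except ImportError:
    import pickle              # Python 3: cPickle no longer exists as a module

data = {"one": "banana", "two": "banana"}
restored = pickle.loads(pickle.dumps(data))
print(restored == data)
```

Either way the rest of the code only ever refers to the name pickle, so nothing else has to change.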
Warning: I don't want to write this post any longer than it is, but I seem to have this painful memory of not making a distinction between load() and loads(), and dump() and dumps(). Damn... that was stupid of me! The short answer is that load()/dump() work with a file-like object, whereas loads()/dumps() do the same thing with a string-like object (read more about it in the API, here).
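The difference is easy to see side by side. This sketch assumes Python 3, where dumps() returns bytes (in Python 2 it returns a str). io.BytesIO stands in for a real file opened in binary mode.

```python
import io
import pickle

data = {"one": "banana", "four": "no-more"}

# dumps()/loads(): work with an in-memory bytes object.
blob = pickle.dumps(data)
from_bytes = pickle.loads(blob)

# dump()/load(): work with a file-like object opened in binary mode.
buffer = io.BytesIO()          # stands in for open(..., "wb")
pickle.dump(data, buffer)
buffer.seek(0)                 # rewind before reading back
from_file = pickle.load(buffer)

print(from_bytes == from_file == data)
```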
Again, I haven't used shelve, but if it works for you (or others) - then yay!
RESPONSE TO YOUR EDIT
You need to remove the dict_items_read = dict_items_open.read() from your context-manager at the end. That read() consumes the whole file and leaves the file position at the end, so the pickle.load() that follows finds nothing left to read and raises EOFError. You don't read it in like you would a text file to pull out strings... it's storing pickled python objects. It's not meant for eyes! It's meant for load().
Your code modified... works just fine for me (copy/paste and run the code below and see if it works). Notice near the bottom I've removed your read() of the file object.
import cPickle as pickle
BDICT = {}
## Automatically generated START
name = "BOB"
name_title = name.title()
count = 5
BDICT[name_title] = count
name = "TOM"
name_title = name.title()
count = 5
BDICT[name_title] = count
name = "TIMMY JOE"
name_title = name.title()
count = 5
BDICT[name_title] = count
## Automatically generated END
if BDICT:
    with open('DICT_ITEMS.txt', 'wb') as dict_items_save:
        pickle.dump(BDICT, dict_items_save)
BDICT = {} ## Wiping the dictionary
## Usually in a loop
firstrunDICT = True
if firstrunDICT:
    with open('DICT_ITEMS.txt', 'rb') as dict_items_open:
        BDICT = pickle.load(dict_items_open)
    firstrunDICT = False
    print BDICT
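The EOFError from the question can be reproduced in a few lines. This is a sketch: io.BytesIO stands in for the opened DICT_ITEMS.txt, and the dictionary contents are made up.

```python
import io
import pickle

f = io.BytesIO(pickle.dumps({'Bob': 5, 'Tom': 5}))  # stands in for the opened file

f.read()                      # consumes everything; position is now at the end
try:
    pickle.load(f)
    hit_eof = False
except EOFError:
    hit_eof = True            # nothing left for load() to read

f.seek(0)                     # rewinding makes the data readable again
bdict = pickle.load(f)

print(hit_eof)
print(bdict)
```

So if you really need to check for an empty file first, either rewind with seek(0) afterwards or catch EOFError around the load() call.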
Python has the shelve module for this. It can store many objects in a file that can be opened up later and read in as objects, but the underlying file format is operating system-dependent.
import shelve
dict1 = {'a': 1}  # placeholder: your first dictionary
dict2 = {'b': 2}  # placeholder: your second dictionary
#flags:
# c = open for reading and writing, creating the shelf if it doesn't exist (default)
# n = always create a new, empty shelf
# r = read only
# w = read and write an existing shelf
shelf = shelve.open("filename", flag="c")
shelf['key1'] = dict1
shelf['key2'] = dict2
shelf.close()
#reading:
shelf = shelve.open("filename", flag='r')
for key in shelf.keys():
    newdict = shelf[key]
    #do something with it
shelf.close()
You can also use Pickle for this task. Here's a blog post that explains how to do it.
What you are looking for is shelve.
Two functions: one saves a dictionary to a file, the other loads a previously saved dictionary for use again.
import pickle
def SaveDictionary(dictionary, File):
    with open(File, "wb") as myFile:  # the with block closes the file
        pickle.dump(dictionary, myFile)
def LoadDictionary(File):
    with open(File, "rb") as myFile:
        loaded = pickle.load(myFile)
    return loaded
These functions can be called through :
SaveDictionary(mylib.Members,"members.txt") # saved dict. in a file
members = LoadDictionary("members.txt") # opened dict. of members

Python - 'str' object has no attribute 'close'

I am having a great time trying to figure out why there doesn't need to be a closing attribute for these few lines of code I wrote:
from sys import argv
from os.path import exists
script, from_file, to_file = argv
file_content = open(from_file).read()
new_file = open(to_file, 'w').write(file_content)
new_file.close()
file_content.close()
I read some things and other people's posts about this, but their scripts were a lot more complicated than what I'm currently learning, so I couldn't figure out why.
I am doing Learning Python the Hard Way and would appreciate any help.
file_content is a string variable, which contains the contents of the file -- it has no relation to the file object. The file you open with open(from_file) is closed automatically in CPython: the file object is closed once nothing references it any more (in this case, immediately after .read()). Other Python implementations may close it later, so it is safer to close files explicitly.
open(...) returns a reference to a file object. Calling read on that reads the file and returns a string object; calling write writes to it and returns None (in Python 2; Python 3 returns the number of characters written). Neither return value has a close attribute.
>>> help(open)
Help on built-in function open in module __builtin__:
open(...)
open(name[, mode[, buffering]]) -> file object
Open a file using the file() type, returns a file object. This is the
preferred way to open a file.
>>> a = open('a', 'w')
>>> help(a.read)
read(...)
read([size]) -> read at most size bytes, returned as a string.
If the size argument is negative or omitted, read until EOF is reached.
Notice that when in non-blocking mode, less data than what was requested
may be returned, even if no size parameter was given.
>>> help(a.write)
Help on built-in function write:
write(...)
write(str) -> None. Write string str to file.
Note that due to buffering, flush() or close() may be needed before
the file on disk reflects the data written.
There are a couple of ways of remedying this:
>>> file = open(from_file)
>>> content = file.read()
>>> file.close()
or with python >= 2.5
>>> with open(from_file) as f:
...     content = f.read()
The with will make sure the file is closed.
When you do file_content = open(from_file).read(), you set file_content to the contents of the file (as read by read). You can't close this string. You need to save the file object separately from its contents, something like:
theFile = open(from_file)
file_content = theFile.read()
# do whatever you need to do
theFile.close()
You have a similar problem with new_file. You should separate the open(to_file) call from the write.
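Applied to the whole script, the fix could look like the sketch below. The copy_file helper and the temporary paths are made up for the demo; in the original script the two names come from argv.

```python
import os
import tempfile

def copy_file(from_file, to_file):
    """Copy text from one file to another, closing both files reliably."""
    with open(from_file) as src:
        file_content = src.read()       # a str; the file object is src
    with open(to_file, 'w') as dst:
        dst.write(file_content)         # write() does not return a file object
    return file_content

# Hypothetical paths for demonstration.
folder = tempfile.mkdtemp()
source = os.path.join(folder, 'from.txt')
target = os.path.join(folder, 'to.txt')
with open(source, 'w') as f:
    f.write('hello\n')
copy_file(source, target)
with open(target) as f:
    copied = f.read()
print(repr(copied))
```

With the with statement there is nothing left to call close() on by hand, which removes both of the failing lines from the question.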
