Unpickle class instances by iterating over a list - python

I am trying to unpickle various class instances which are saved in separate .pkl files by iterating over a list containing all the class instances (each class instance appends itself to the appropriate list when instantiated).
This works:
# LOAD IN INGREDIENT INSTANCES
for each in il:
    with open('Ingredients/{}.pkl'.format(each), 'rb') as f:
        globals()[each] = pickle.load(f)
For example, one ingredient is Aubergine:
print(Aubergine)
output:
Name: Aubergine
Price: £1.00
Portion Size: 1
However, this doesn't work:
# LOAD IN RECIPE INSTANCES
for each in rl:
    with open('Recipes/{}.pkl'.format(each.name), 'rb') as f:
        globals()[each] = pickle.load(f)
I can only assume that the issue stems from each.name being used for the file names of the recipes, whereas each is used for the ingredient file names. This is intentional, however, as the name attribute of the recipes is formatted for the end-user (i.e. it contains white space etc.). I think this may be the issue, but I am not sure.
Both the ingredient and recipe classes use:
def __repr__(self):
    return self.name
For example:
I have a recipe class instance SausageAubergineRagu, for which self.name is 'Sausage & Aubergine Ragu', and this is inside the list rl. I have tried testing this individually:
input:
rl
output:
[Sausage & Aubergine Ragu]
So I believe that this code:
# LOAD IN RECIPE INSTANCES
for each in rl:
    with open('Recipes/{}.pkl'.format(each.name), 'rb') as f:
        globals()[each] = pickle.load(f)
...should result in this:
with open('Recipes/Sausage & Aubergine Ragu.pkl', 'rb') as f:
    globals()[SausageAubergineRagu] = pickle.load(f)
But attempting to access the recipe class instances results in a NameError.
One final note - please don't ask why I am doing things this way. Instead help me to address and solve the problem, so I can make it work, and understand what is going on. Appreciated :)

The NameError you are getting is Python telling you that you are trying to use a variable that hasn't been defined yet.
You aren't defining SausageAubergineRagu before you use it in this line:
globals()[SausageAubergineRagu] = pickle.load(f)
In your first example, you are adding keys and values to globals. You are using instances of recipes (each) as keys, and the pickled data as values.
In your second example, you are attempting to do the same thing, but instead of using instances of recipes (each) as keys, you are using SausageAubergineRagu, which is undefined.
How is Python supposed to know what SausageAubergineRagu is? If you want that line to work, you will need to define it first, or use something that is already defined, like each, which is what you do in your other snippet.
Honestly, using instances of custom classes as keys in globals seems bizarre to me anyway (usually people use strings), but since you apparently want to make it work, the answer is simple:
Define SausageAubergineRagu before attempting to use it as a key in a dictionary.
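As a minimal sketch of the string-key alternative mentioned above, the loop could fill an ordinary dictionary keyed by each recipe's display name instead of writing into globals(). The Recipe class, the temporary folder and the load_recipes helper are hypothetical stand-ins for illustration, not the asker's actual code.

```python
import os
import pickle
import tempfile


class Recipe:
    """Hypothetical stand-in for the asker's recipe class."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def load_recipes(folder, names):
    """Load '<folder>/<name>.pkl' for every name into a dict keyed by that name."""
    loaded = {}
    for name in names:
        with open(os.path.join(folder, "{}.pkl".format(name)), "rb") as f:
            loaded[name] = pickle.load(f)
    return loaded


# Demo: write one pickle into a temporary folder, then load it back.
folder = tempfile.mkdtemp()
ragu = Recipe("Sausage & Aubergine Ragu")
with open(os.path.join(folder, "{}.pkl".format(ragu.name)), "wb") as f:
    pickle.dump(ragu, f)

recipes = load_recipes(folder, ["Sausage & Aubergine Ragu"])
print(recipes["Sausage & Aubergine Ragu"])
```

Because the keys are plain strings, nothing has to be defined before the loop runs, and the display name with its spaces and ampersand is a perfectly valid key.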

Related

Why shouldn't one dynamically generate variable names in python?

Right now I am learning Python and struggling with a few concepts of OOP, one of that being how difficult it is (to me) to dynamically initialize class instances and assign them to a dynamically generated variable name and why I am reading that I shouldn't do that in the first place.
In most threads with a similar direction, the answer seems to be that it is un-Pythonic to do that.
For example generating variable names on fly in python
Could someone please elaborate?
Take the typical OOP learning case:
LOE = ["graham", "eric", "terry_G", "terry_J", "john", "carol"]
class Employee():
    def __init__(self, name, job="comedian"):
        self.name = name
        self.job = job
Why is it better to do this:
employees = []
for name in LOE:
    emp = Employee(name)
    employees.append(emp)
and then
for emp in employees:
    if emp.name == "eric":
        print(emp.job)
instead of this
for name in LOE:
    globals()[name] = Employee(name)
and
print(eric.job)
Thanks!
If you dynamically generate variable names, you don't know what names exist, and you can't use them in code.
globals()[some_unknown_name] = Foo()
Well, now what? You can't safely do this:
eric.bar()
Because you don't know whether eric exists. You'll end up having to test for eric's existence using dictionaries/lists anyway:
if 'eric' in globals(): ...
So just store your objects in a dictionary or list to begin with:
people = {}
people['eric'] = Foo()
This way you can also safely iterate one data structure to access all your grouped objects without needing to sort them from other global variables.
globals() gives you a dict which you can put names into. But you can equally make your own dict and put the names there.
So it comes down to the idea of "namespaces," that is the concept of isolating similar things into separate data structures.
You should do this:
employees = {}
employees['alice'] = ...
employees['bob'] = ...
employees['chuck'] = ...
Now if you have another part of your program where you describe parts of a drill, you can do this:
drill['chuck'] = ...
And you won't have a name collision with Chuck the person. If everything were global, you would have a problem. Chuck could even lose his job.
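To make the namespace idea concrete, here is a small sketch that reuses the Employee class from the question. The drill dictionary and its contents are made up for illustration.

```python
class Employee:
    def __init__(self, name, job="comedian"):
        self.name = name
        self.job = job


names = ["graham", "eric", "terry_G", "terry_J", "john", "carol"]

# One dictionary per kind of thing acts as its own small namespace.
employees = {name: Employee(name) for name in names}
drill = {"chuck": "three-jaw chuck"}  # hypothetical drill part

# Direct lookup by key: no scan over a list and no global variable needed.
print(employees["eric"].job)

# A safe lookup that does not touch globals(); there is no employee "chuck".
missing = employees.get("chuck")
print(missing, drill["chuck"])
```

The lookup employees["eric"] is as short as the global eric, but the set of valid names is now data you can inspect, iterate over, or test with "in".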

Names of instances and loading objects from a database

I got for example the following structure of a class.
class Company(object):
    Companycount = 0
    _registry = {}
    def __init__(self, name):
        Company.Companycount +=1
        self._registry[Company.Companycount] = [self]
        self.name = name
k = Company("a firm")
b = Company("another firm")
Whenever I need the objects I can access them by using
Company._registry
which gives out a dictionary of all instances.
Do I need reasonable names for my objects since the name of the company is a class attribute, and I can iterate over Company._registry?
When loading the data from the database does it matter what the name of the instance (here k and b) is? Or can I just use arbitrary strings?
Both your Company._registry and the names k and b are just references to your actual instances. Neither plays any role in what you'd store in the database.
Python's object model has all objects living on a big heap, and your code interacts with the objects via such references. You can make as many references as you like, and objects automatically are deleted when there are no references left. See the excellent Facts and myths about Python names and values article by Ned Batchelder.
You need to decide, for yourself, if the Company._registry structure needs to have names or not. Iterating over a list is slow if you already have the name of the company you want to access, but a dictionary keyed by that name gives you direct access.
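A rough sketch of that trade-off, assuming the registry is keyed by company name (a variation on the class in the question, not the asker's exact code):

```python
class Company:
    # Class-level registry shared by all instances, keyed by company name.
    _registry = {}

    def __init__(self, name):
        self.name = name
        Company._registry[name] = self


# No variable names are kept; the registry holds the only references.
Company("a firm")
Company("another firm")

# Dictionary lookup: direct access by name.
by_dict = Company._registry["another firm"]

# Equivalent linear scan, which inspects entries one by one until it matches.
by_scan = next(c for c in Company._registry.values() if c.name == "another firm")

print(by_dict is by_scan)
```

Both lookups return the same object, which also shows that names such as k and b are optional: the instances stay alive as long as the registry refers to them.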
If you are going to use an ORM, then you don't really need that structure anyway. Leave it to the ORM to help you find your objects, or give you a sequence of all objects to iterate over. I recommend using SQLAlchemy for this.
The name doesn't matter, but if you are going to initialize a lot of objects you will still want to keep the names reasonable somehow.

Saving and Loading a class instance from file

An essential part of my project is being able to save and load class instances to a file. For further context, my class has both a set of attributes as well as a few methods.
So far, I've tried using pickle, but it's not working quite as expected. For starters, it's not loading the methods, nor is it letting me add attributes that I've defined initially; in other words, it's not really making a copy of the class I can work with.
Relevant Code:
class Brick(object):
    def __init__(self, name, filename=None, areaMin=None, areaMax=None, kp=None):
        self.name = name
        self.filename = filename
        self.areaMin = areaMin
        self.areaMax = areaMax
        self.kp = kp
        self.__kpsave = None
        if filename != None:
            self.__logfile = file(filename, 'w')
    def __getstate__(self):
        f = self.__logfile
        self.__kpsave = []
        for point in self.kp:
            temp = (point.pt, point.size, point.angle, point.response, point.octave, point.class_id)
            self.__kpsave.append(temp)
        return (self.name, self.areaMin, self.areaMax, self.__kpsave,
                f.name, f.tell())
    def __setstate__(self, state):
        self.value, self.areaMin, self.areaMax, self.__kpsave, name, position = state
        f = file(name, 'w')
        f.seek(position)
        self.__logfile = f
        self.filename = name
        self.kp = []
        for point in self.__kpsave:
            temp = cv2.KeyPoint(x=point[0][0], y=point[0][1], _size=point[1], _angle=point[2], _response=point[3],
                                _octave=point[4], _class_id=point[5])
            self.kp.append(temp)
    def calculateORB(self, img):
        pass #I've omitted the actual method here
(There are a few more attributes and methods, but they're not relevant)
Now, this class definition works just fine when creating new instances: I can make a new Brick with just the name, I can then set areaMin or any other attribute, and I can use pickle(cPickle) to dump the current instance to a file just fine (I'm using those getstate and setstate because pickle won't work with OpenCV's Keypoint elements).
The problem comes, of course, when I do load the instance: using pickle load() I can load the instance from a file, and the values I set previously will be there (ie I can access areaMin just fine if I did set a value for it) but I can't access either methods or add values to any of the other attributes if I never changed their values. I've noticed that I don't need to import my class definition either if I'm simply pickling from a completely different source file.
Since all I want to do is build a "database" of sorts from my class objects, what's the best way to approach this? I know something that should work is to simply write a .Save() method that writes a .py source file where I essentially create an instance of the class, so I can then .Load() which will do exec and eval as appropriate, however, this seems like the worst possible way to do this, so, how should I actually do this?
Thanks.
You should not try to do I/O inside your __getstate__ and __setstate__ methods - those are called by pickle, and the expected result is just an in-memory object that can be further pickled.
Moreover, if your "Point" class in the "self.kp" attribute is just a regular Python class, there is no need for you to customize pickling at all.
What you have to worry about is dealing with the I/O at the point where you call pickle. If you really need to load different instances independently, you could resort to the "shelve" module, or, better yet, use pickle.dumps and store the resulting string in a DBMS (which can be the built-in sqlite).
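If the key points really cannot be pickled directly, a sketch of __getstate__ and __setstate__ that stays purely in memory could look like this. The KeyPoint class below is a hypothetical stand-in for cv2.KeyPoint that deliberately refuses to be pickled, and Brick is trimmed down to two attributes.

```python
import pickle


class KeyPoint:
    """Hypothetical stand-in for cv2.KeyPoint; refuses to be pickled."""

    def __init__(self, x, y, size):
        self.pt = (x, y)
        self.size = size

    def __reduce__(self):
        # Simulate an object that pickle cannot handle.
        raise TypeError("KeyPoint cannot be pickled")


class Brick:
    def __init__(self, name, kp=None):
        self.name = name
        self.kp = kp or []

    def __getstate__(self):
        # Copy the instance dict and swap the key points for plain tuples.
        state = self.__dict__.copy()
        state["kp"] = [(p.pt[0], p.pt[1], p.size) for p in self.kp]
        return state

    def __setstate__(self, state):
        # Rebuild the key points from the tuples; no files are opened here.
        kp = [KeyPoint(x, y, size) for x, y, size in state["kp"]]
        self.__dict__.update(state)
        self.kp = kp

    def describe(self):
        return "{} with {} key point(s)".format(self.name, len(self.kp))


brick = Brick("red", [KeyPoint(1.0, 2.0, 3.0)])
clone = pickle.loads(pickle.dumps(brick))
print(clone.describe())
```

Methods such as describe are available on the loaded object because pickle stores only the state and looks the class up by name when loading, so the class definition has to be importable wherever you unpickle.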
All in all:
class Point(object):
    ...
class Brick(object):
    def __init__(self, point, ...):
        self.kp = point
Then, to save a single object to a file:
with open("filename.pickle", "wb") as file_:
    pickle.dump(my_brick, file_, -1)
and restore with:
with open("filename.pickle", "rb") as file_:
    my_brick = pickle.load(file_)
To store several instances and recover them all at once, you could just dump them in sequence to the same open file, and then read them one by one until you get an EOFError at the end of the file - or you can simply add all objects you want to save to a list, and pickle the whole list at once.
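A short sketch of the first variant, dumping objects back to back and reading until pickle.load raises EOFError. Plain dictionaries stand in for Brick instances here, and the temporary path is only for the demo.

```python
import os
import pickle
import tempfile

# Plain dicts stand in for Brick instances in this demo.
bricks = [{"name": "red", "areaMin": 10}, {"name": "blue", "areaMin": 20}]
path = os.path.join(tempfile.mkdtemp(), "bricks.pickle")

# Write the objects one after another into the same file.
with open(path, "wb") as f:
    for brick in bricks:
        pickle.dump(brick, f, pickle.HIGHEST_PROTOCOL)

# Read them back until pickle signals that no data is left.
loaded = []
with open(path, "rb") as f:
    while True:
        try:
            loaded.append(pickle.load(f))
        except EOFError:
            break

print(loaded == bricks)
```

Pickling the whole list in a single dump is simpler still; the sequential form is mainly useful when objects are appended over time.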
To save and retrieve arbitrary objects by some attribute like "name" or "id", you can resort to the shelve module: https://docs.python.org/3/library/shelve.html or use a real database if you need complex queries and such. Trying to write your own ad hoc binary format to allow for searching the required instance is a horrible idea - you'd have to implement the whole protocol for that file: reading, writing, safeguards, corner cases, and such.
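For the database route, one possible sketch stores the output of pickle.dumps in the built-in sqlite3 module, keyed by name. The table layout, the save_brick and load_brick helpers, and the trimmed-down Brick class are assumptions made for this example.

```python
import pickle
import sqlite3


class Brick:
    def __init__(self, name, areaMin=None):
        self.name = name
        self.areaMin = areaMin


# In-memory database for the demo; pass a file path for real persistence.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bricks (name TEXT PRIMARY KEY, data BLOB)")


def save_brick(brick):
    """Pickle the brick and store the bytes under its name."""
    blob = pickle.dumps(brick, pickle.HIGHEST_PROTOCOL)
    conn.execute("INSERT OR REPLACE INTO bricks VALUES (?, ?)", (brick.name, blob))
    conn.commit()


def load_brick(name):
    """Return the brick stored under name, or None if there is none."""
    row = conn.execute("SELECT data FROM bricks WHERE name = ?", (name,)).fetchone()
    return pickle.loads(row[0]) if row else None


save_brick(Brick("red", areaMin=10))
save_brick(Brick("blue", areaMin=20))
print(load_brick("blue").areaMin)
```

Each instance can then be loaded independently by name without reading the others.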

python load from shelve - can I retain the variable name?

I'm teaching myself how to write a basic game in python (text based - not using pygame). (Note: I haven't actually gotten to the "game" part per-se, because I wanted to make sure I have the basic core structure figured out first.)
I'm at the point where I'm trying to figure out how I might implement a save/load scenario so a game session could persist beyond a single running of the program. I did a bit of searching and everything seems to point to pickling or shelving as the best solutions.
My test scenario is for saving and loading a single instance of a class. Specifically, I have a class called Characters(), and (for testing's sake) a single instance of that class assigned to a variable called pc. Instances of the Characters class have an attribute called name which is originally set to "DEFAULT", but will be updated based on user input at the initial setup of a new game. For example:
class Characters(object):
    def __init__(self):
        self.name = "DEFAULT"
pc = Characters()
pc.name = "Bob"
I also have (or will have) a large number of functions that refer to various instances using the variables they are assigned to. A made-up, simplified example might be:
def print_name(character):
    print character.name
def run():
    print_name(pc)
run()
I plan to have a save function that will pack up the pc instance (among other info) with their current info (ex: with the updated name). I also will have a load function that would allow a user to play a saved game instead of starting a new one. From what I read, the load could work something like this:
*assuming info was saved to a file called "save1"
*assuming the pc instance was shelved with "pc" as the key
import shelve
mysave = shelve.open("save1")
pc = mysave["pc"]
My question is, is there a way for the shelve load to "remember" the variable name associated with the instance, and automatically do that << pc = mysave["pc"] >> step? Or a way for me to store that variable name as a string (e.g. as the key) and somehow use that string to create the variable with the correct name (pc)?
I will need to "save" a LOT of instances, and can automate that process with a loop, but I don't know how to automate the unloading to specific variable names. Do I really have to re-assign each one individually and explicitly? I need to assign the instances back to the appropriate variable names because I have a bunch of core functions that refer to specific instances using variable names (like the example I gave above).
Ideas? Is this possible, or is there an entirely different solution that I'm not seeing?
Thanks!
~ribs
Sure, it's possible to do something like that. Since a shelf itself is like a dictionary, just save all the character instances in a real dictionary instance inside it using their variable's name as the key. For example:
class Character(object):
    def __init__(self, name="DEFAULT"):
        self.name = name

pc = Character("Bob")

def print_name(character):
    print character.name

def run():
    print_name(pc)

run()

import shelve
mysave = shelve.open("save1")
# save all Character instances without the default name
mysave["all characters"] = {varname:value for varname,value in
                            globals().iteritems() if
                            isinstance(value, Character) and
                            value.name != "DEFAULT"}
mysave.close()

del pc

mysave = shelve.open("save1")
globals().update(mysave["all characters"])
mysave.close()

run()
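A variation on the same idea, written for Python 3 and keeping the characters in an explicit dictionary rather than in globals(). This is only a sketch: the save_game and load_game helpers, the "characters" key and the temporary save path are made up for the example.

```python
import os
import shelve
import tempfile


class Character:
    def __init__(self, name="DEFAULT"):
        self.name = name


def save_game(path, characters):
    # Store the whole name -> instance mapping under a single shelf key.
    with shelve.open(path) as shelf:
        shelf["characters"] = characters


def load_game(path):
    # Return the mapping exactly as it was saved.
    with shelve.open(path) as shelf:
        return shelf["characters"]


save_path = os.path.join(tempfile.mkdtemp(), "save1")
save_game(save_path, {"pc": Character("Bob"), "npc": Character("Alice")})

characters = load_game(save_path)
print(characters["pc"].name)
```

Functions that previously referred to the global pc would take the dictionary (or characters["pc"]) as an argument, so nothing has to be re-assigned name by name after loading.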

How to directly access class instances through class dictionary in Python

I need a way of accessing directly an instance of a class through an ID number.
As I tried to explain here, I am importing a .csv file and I want to create an instance of my class Person() for every line in the .csv file, plus I want to be able to directly access these instances using as a key a unique identifier, already present in the .csv file.
What I have done so far, thanks to the help of user433831, is this:
from sys import argv
from csv import DictReader
from person import Person
def generateAgents(filename):
    reader = DictReader(open(filename, 'rU'))
    persons = [Person(**entry) for entry in reader]
    return persons
where person is just the module where I define the class Person() as:
class Person:
    def __init__(self, **kwds):
        self.__dict__.update(kwds)
Now I have a list of my person instances, namely persons, which is already something neat.
But now I need to create a network among these persons, using the networkx module, and I definitely need a way to access directly every person (at present my instances don't have any name).
For example, every person has an attribute called "father_id", which is the unique ID of the father. Persons not having a father alive in the current population have a "father_id" equal to "-1".
Now, to link every person to his/her father, I'd do something like:
import networkx as nx
G=nx.Graph()
for person in persons:
    G.add_edge(person, person_with_id_equal_to_father_id)
My problem is that I am unable to access directly this "person_with_id_equal_to_father_id".
Keep in mind that I will need to do this direct access many many times, so I would need a pretty efficient way of doing it, and not some form of searching in the list (also considering that I have around 150000 persons in my population).
It would be great to implement something like a dictionary feature in my class Person(), with the key of every instance being a unique identifier. This unique identifier is already present in my csv file and therefore I already have it as an attribute of every person.
Thank you for any help, as always greatly appreciated. Also please keep in mind I am a total python newbie (as you can probably tell... ;) )
Simply use a dictionary:
persons_by_id = {p.id: p for p in persons}
This requires a recent version of Python. If yours doesn't support this syntax, use the following:
persons_by_id = dict((p.id, p) for p in persons)
Having done either of the above, you can locate the person by their id like so:
persons_by_id[id]
The networkx example becomes:
import networkx as nx
G=nx.Graph()
for person in persons:
    # DictReader yields strings, so the "no father" marker is the string "-1"
    if person.father_id != "-1":
        G.add_edge(person, persons_by_id[person.father_id])
Here's an efficient way to do what @Toote suggests:
def generateAgents(filename):
    with open(filename, 'rU') as input:
        reader = DictReader(input)
        # key each Person by its own unique id (assumes the CSV column is named 'id')
        persons = dict((entry['id'], Person(**entry)) for entry in reader)
    return persons
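Putting the pieces together without networkx, here is a self-contained sketch that builds the id lookup straight from CSV data and collects the child-to-father pairs. The CSV content and the column names "id", "name" and "father_id" are assumptions for the demo.

```python
import csv
import io


class Person:
    def __init__(self, **kwds):
        self.__dict__.update(kwds)


# Hypothetical CSV content; the column names are assumed, not from the question.
data = io.StringIO("id,name,father_id\n1,Tom,-1\n2,Sam,1\n3,Max,1\n")

# One pass over the rows builds the lookup table keyed by each person's id.
persons_by_id = {row["id"]: Person(**row) for row in csv.DictReader(data)}

# csv yields strings, so ids and the "-1" marker are compared as strings.
edges = [
    (person.id, persons_by_id[person.father_id].id)
    for person in persons_by_id.values()
    if person.father_id != "-1"
]
print(edges)
```

Each pair in edges could be passed to G.add_edge in the networkx loop; every father lookup is a single dictionary access rather than a search through the list of persons.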
