Python pickle load doesn't work when program is run again - python

I'm trying to store a class instance that contains one variable using pickle.
class Scm1:
keys = {'b':0, 'i':0, 's':0}
The first thing I do in my program is check if the pickled file is exists. If it does, I attempt to load the data using pickle load. If it doesn't (this only happens the very first time the program is run), I create two instances of this class t1 = Scm1() and t2 = Scm1(). Then, in my program, I modify the entries in the keys field. At the end, I attempt to store the instances to a file. For this, I add the two instances to a dictionary -- tmpDict = {'t1':t1, 't2':t2} and execute a pickle dump using tmpDict as the object. When I load the data using pickle load right after the dump, I get what I expect (the data is set to what it was during the program). However, when I run the program again (this time the file exists) and load the data, all entries in the keys field for the two objects (t1 and t2) are 0. Why is it that when I'm able to get the correct results when I do pickle load prior to my program ending and not again when I rerun the program. I'm new to python and so I'm not sure if I'm expecting pickle to work the right way. Sorry for not being able to paste more code snippets as it's for a school assignment.

class Scm1:
keys = {'b':0, 'i':0, 's':0}
keys is a class variable, so all instances of the class share it. Class variables are not pickled when you store the instances, as they do not belong to the instances.
You should use instance attributes instead:
class Scm1:
def __init__(self):
self.keys = {'b':0, 'i':0, 's':0}

Related

Hook every variable reference and execute code

I have a Python app split across different files. One of them, models.py, contains, among PyQt5 table models, several maps referred from several PyQt5 form files:
# first lines:
agents_id_map = \
{agent.name:agent.id for agent in db.session.query(db.Agent, db.Agent.id)}
# ....
# 2000 thousand lines
I want to keep this kind of maps centralized in a single point. I'm using SQLAlchemy also. Agent class is defined in a db.py file. I use these maps to fulfill the foreign key in another object, say, an invoice, like:
invoice = db.Invoice()
# Here is a reference
invoice.agent_id = models.agents_id_map[agent_combo.currentText()]
ยทยทยทยท
db.session.add(invoice)
db.session.commit()
The problem is that the model.py module gets cached and several parts of the application access old data, and, if another running instance A of the app creates a new agent, and a running instance B wants to create a new invoice, the B running instance won't see the new Agent created by A unless restarts the app. This also happens if a user in the same running instance creates an agent and then he wants to create an invoice. My solutions are:
Reload the module, to get the whole code executed again, but this could be very expensive.
Isolate the code building those maps in another file, say maps.py, which would be less expensive to reload and change all code that references it through refactoring.
Is there a solution that would allow me to touch only the code building those maps and the rest of the application remains ignorant of the change, and every time the map is referenced from another module or even the same, the code gets executed, effectively re-building maps with fresh data?
Is there a solution that would allow me to touch only the code building those maps and the rest of the application remains ignorant of the change, and every time the map is referenced from another module or even the same, the code gets executed, effectively re-building maps with fresh data?
Certainly: put you maps inside a function, or even better, a class.
If I understand this problem correctly, you have stateful data (maps) which need regenerating under some condition (every time they are accessed? Or just every time the db is updated?). I would do something like this:
class Mappings:
def __init__(self, db):
self._db = db
... # do any initial db stuff you need to here
def id_map(self, thing):
db_thing = getattr(self._db, thing.title)
return {x.name:x.id for x in self._db.session.query(db_thing, db_thing.id)}
def other_property_map(self, prop):
... # etc
mapping = Mapping(db)
mapping.id_map("agent")
This assumes that the mapping example you've given is your major use-case, but this model could easily be adapted for almost any other mapping you might want.
You would write a method of every kind of 'mapping' you need, and it would return the desired dictionary. Note that here I've assumed you handle setting up the db elsewhere and pass a fully initialised db access object to the class, which is probably what you want to do---this class is just about encapsulating mapper state, not re-inventing your orm.
Caching
I have not provided any caching. But if you have complete control over the db, it is easy enough to run a hook before you do any db commits looking to see if you've touched any particular model, and then state that those need rebuilding. Something like this:
class DbAccess(Mappings):
def __init__(self, db, models):
super().init(db)
self._cached_map = {model: {} for model in models}
def db_update(model: str, params: dict):
try:
self._cached_map[model] = {} # wipe cache
except KeyError:
pass
self._db.update_with_model(model, params) # dummy fn
def id_map(self, thing: str):
try:
return self._cached_map[thing]["id"]
except KeyError:
self._cached_map[thing]["id"] = super().id_map(thing)
return self._cached_map[thing]["id"]
I don't really think DbAccess should inherit from Mappings---put it all in one class, or have a DB class and a Mappings mixin and inherit from both. I just didn't want to write everything out again.
I've not written any real db access routines, (hence my dummy fn) as I don't know how you're doing it (but clearly using an ORM). But the basic idea is just to handle the caching yourself, by storing the mapping every time, but deleting all the stored mappings every time you do any commit transactions involving the model in question (thus rebuilding the cache as needed).
Aside
Note that if you really do have 2,000 lines of manually declared mappings of the form thing.name: thing.id you really should generate them at runtime anyhow. Declarative is all very well and good, but writing out 2,000 permutations of the same thing isn't declarative, it's just time-consuming---and doing the job a simple loop putting the data in ram could do for you at startup.

Saving and Loading a class instance from file

an essential part of my project is being able to save and load class instances to a file. For further context, my class has both a set of attributes as well as a few methods.
So far, I've tried using pickle, but it's not working quite as expected. For starters, it's not loading the methods, nor it's letting me add attributes that I've defined initially; in other words, it's not really making a copy of the class I can work with.
Relevant Code:
class Brick(object):
def __init__(self, name, filename=None, areaMin=None, areaMax=None, kp=None):
self.name = name
self.filename = filename
self.areaMin = areaMin
self.areaMax = areaMax
self.kp = kp
self.__kpsave = None
if filename != None:
self.__logfile = file(filename, 'w')
def __getstate__(self):
f = self.__logfile
self.__kpsave = []
for point in self.kp:
temp = (point.pt, point.size, point.angle, point.response, point.octave, point.class_id)
self.__kpsave.append(temp)
return (self.name, self.areaMin, self.areaMax, self.__kpsave,
f.name, f.tell())
def __setstate__(self, state):
self.value, self.areaMin, self.areaMax, self.__kpsave, name, position = state
f = file(name, 'w')
f.seek(position)
self.__logfile = f
self.filename = name
self.kp = []
for point in self.__kpsave:
temp = cv2.KeyPoint(x=point[0][0], y=point[0][1], _size=point[1], _angle=point[2], _response=point[3],
_octave=point[4], _class_id=point[5])
self.kp.append(temp)
def calculateORB(self, img):
pass #I've omitted the actual method here
(There are a few more attributes and methods, but they're not relevant)
Now, this class definition works just fine when creating new instances: I can make a new Brick with just the name, I can then set areaMin or any other attribute, and I can use pickle(cPickle) to dump the current instance to a file just fine (I'm using those getstate and setstate because pickle won't work with OpenCV's Keypoint elements).
The problem comes, of course, when I do load the instance: using pickle load() I can load the instance from a file, and the values I set previously will be there (ie I can access areaMin just fine if I did set a value for it) but I can't access either methods or add values to any of the other attributes if I never changed their values. I've noticed that I don't need to import my class definition either if I'm simply pickling from a completely different source file.
Since all I want to do is build a "database" of sorts from my class objects, what's the best way to approach this? I know something that should work is to simply write a .Save() method that writes a .py source file where I essentially create an instance of the class, so I can then .Load() which will do exec and eval as appropriate, however, this seems like the worst possible way to do this, so, how should I actually do this?
Thanks.
You should not try to do I/O inside your __getstate__ and __setstate__ methods - those are called by Pickle, and the expted result is just an in-memory object that can be further pickled.
Moreover, if your "Point" class in the "self.kp" attribute is just a regular Python class, there is no need for you to customize pickling at all -
What you have to worry about is to deal with the I/O at the point you call Pickle. If you really need to load different instances independently, you could resort to the "shelve" module, or, better yet, use pickle.dumps and store the resulting string in a DBMS (which can be the built-in sqlite).
All in all:
class Point(object):
...
class Brick(object):
def __init__(self, point, ...):
self.kp = point
Then, to save a single object to a file:
with open("filename.pickle", "wb") as file_:
pickle.dump(my_brick, file_, -1)
and restore with:
my_brick = pickle.load(open("filename.pickle", "rb", -1)
To store several instances and recover all at once, you could just dump then in sequence to the same open file, and them read one by one until you got a fault due to "empty file" - or ou can simply add all objects you want to save to a List, and pickle the whole list at once.
To save and retrieve arbitrary objects that you can retrieve giving some attrbute like "name" or "id" - you can resort to the shelve module: https://docs.python.org/3/library/shelve.html or use a real database if you need complex queries and such. Trying to write your own ad hoc binary format to allow for searching the required instance is an horrible idea - as you'd have to implement all the protocol for that file 0 reading, writting, safeguards, corner cases, and such.

Migrate cPickled objects to a new class with the same __getstate__ format

I'm using cPickle to serialize and deserialize instances of a class. I've recently moved a few classes into different packages and now I'm noticing that cPickle/pickle stores the class name and the package it comes from.
>>> class A(object):
pass
>>> dumps(A())
'ccopy_reg\n_reconstructor\np1\n(c__main__\nA\np2\nc__builtin__\nobject\np3\nNtRp4\n.'
Notice how __main__ is stored because I ran that code in the main python interpreter.
If I try to unpickle the many objects I have stored this way, I'll get an ImportError complaining that since I moved the classes around, the old class doesn't exist in the location where it is expected.
I haven't changed the format of __getstate__ or __setstate__. I've only changed the location of the class that needs to be deserialized. Is there a way to migrate these objects so that I don't run into problems?
If you want to migrate your data, you must provide a reference to the new object at the old location, and dump the data again:
old_location.A = new_location.A
data = loads(pickle_data)
pickle_data = dumps(data)
and now, pickle_data contains a reference to new_location.A.

python load from shelve - can I retain the variable name?

I'm teaching myself how to write a basic game in python (text based - not using pygame). (Note: I haven't actually gotten to the "game" part per-se, because I wanted to make sure I have the basic core structure figured out first.)
I'm at the point where I'm trying to figure out how I might implement a save/load scenario so a game session could persist beyond a signle running of the program. I did a bit of searching and everything seems to point to pickling or shelving as the best solutions.
My test scenario is for saving and loading a single instance of a class. Specifically, I have a class called Characters(), and (for testing's sake) a sigle instance of that class assigned to a variable called pc. Instances of the Character class have an attribute called name which is originally set to "DEFAULT", but will be updated based on user input at the initial setup of a new game. For ex:
class Characters(object):
def __init__(self):
self.name = "DEFAULT"
pc = Characters()
pc.name = "Bob"
I also have (or will have) a large number of functions that refer to various instances using the variables they are asigned to. For example, a made up one as a simplified example might be:
def print_name(character):
print character.name
def run():
print_name(pc)
run()
I plan to have a save function that will pack up the pc instance (among other info) with their current info (ex: with the updated name). I also will have a load function that would allow a user to play a saved game instead of starting a new one. From what I read, the load could work something like this:
*assuming info was saved to a file called "save1"
*assuming the pc instance was shelved with "pc" as the key
import shelve
mysave = shelve.open("save1")
pc = mysave["pc"]
My question is, is there a way for the shelve load to "remember" the variable name assotiated with the instance, and automatically do that << pc = mysave["pc"] >> step? Or a way for me to store that variable name as a string (ex as the key) and somehow use that string to create the variable with the correct name (pc)?
I will need to "save" a LOT of instances, and can automate that process with a loop, but I don't know how to automate the unloading to specific variable names. Do I really have to re-asign each one individually and explicitly? I need to asign the instances back to the apropriate variable names bc I have a bunch of core functions that refer to specific instances using variable names (like the example I gave above).
Ideas? Is this possible, or is there an entirely different solution that I'm not seeing?
Thanks!
~ribs
Sure, it's possible to do something like that. Since a shelf itself is like a dictionary, just save all the character instances in a real dictionary instance inside it using their variable's name as the key. For example:
class Character(object):
def __init__(self, name="DEFAULT"):
self.name = name
pc = Character("Bob")
def print_name(character):
print character.name
def run():
print_name(pc)
run()
import shelve
mysave = shelve.open("save1")
# save all Character instances without the default name
mysave["all characters"] = {varname:value for varname,value in
globals().iteritems() if
isinstance(value, Character) and
value.name != "DEFAULT"}
mysave.close()
del pc
mysave = shelve.open("save1")
globals().update(mysave["all characters"])
mysave.close()
run()

How to store the current state of InteractiveInterpreter Object in a database?

I am trying to build an online Python Shell. I execute commands by creating an instance of InteractiveInterpreter and use the command runcode. For that I need to store the interpreter state in the database so that variables, functions, definitions and other values in the global and local namespaces can be used across commands. Is there a way to store the current state of the object InteractiveInterpreter that could be retrieved later and passed as an argument local to InteractiveInterpreter constructor or If I can't do this, what alternatives do I have to achieve the mentioned functionality?
Below is the pseudo code of what I am trying to achieve
def fun(code, sessionID):
session = Session()
# get the latest state of the interpreter object corresponding to SessionID
vars = session.getvars(sessionID)
it = InteractiveInterpreter(vars)
it.runcode(code)
#save back the new state of the interpreter object
session.setvars(it.getState(),sessionID)
Here, session is an instance of table containing all the necessary information.
I believe the pickle package should work for you. You can use pickle.dump or pickle.dumps to save the state of most objects. (then pickle.load or pickle.loads to get it back)
Okay, if pickle doesn't work, you might try re-instantiating the class, with a format (roughly) like below:
class MyClass:
def __init__(self, OldClass):
for d in dir(OldClass):
# go through the attributes from the old class and set
# the new attributes to what you need

Categories