How to overwrite an existing dictionary Python 3 - python

Sorry if this is worded badly; I hope you can understand or edit my question to make it clearer.
I've been using Python's pickle module to pickle/unpickle the state of the objects in a game (I do understand this is probably storage-inefficient, generally inefficient and lazy, but it's only while I'm learning more Python). However, I encounter errors when doing this with the classes for presenting information.
The root issue, I believe, is that when I unpickle the save data to load, it overwrites the existing dictionaries but the objects' storage locations change, so the information class is trying to detect a room that the player can no longer enter since the data was overwritten.
I've made a snippet to reproduce the issue I have:
import pickle
class A(object):
    def __init__(self):
        pass
obj_dict = {
    'a': A(),
    'b': A(),
    ## etc.
}
d = obj_dict['a']
f = open('save', 'wb')
pickle.Pickler(f, 2).dump(obj_dict)
f.close()
f = open('save', 'rb')
obj_dict = pickle.load(f)
f.close()
if d == obj_dict['a']:
    print('success')
else:
    print(str(d) + '\n' + str(obj_dict['a']))
I understand this is probably to be expected when rewriting variables like this, but is there a way around it? Many thanks

Is your issue that you want d == obj_dict['a'] to evaluate to True?
By default, the == equality check above compares the identities of the two objects, i.e. do d and obj_dict['a'] point to the same chunk of memory?
When you unpickle your object, it is created as a new object in a new chunk of memory, and thus your equality check fails.
You need to override how your equality check behaves to get the behavior you want. The methods you need to override are: __eq__ and __hash__.
In order to track your object through repeated pickling and un-pickling, you'll need to assign a unique id to the object on creation:
import uuid
class A:
    def __init__(self):
        self.id = uuid.uuid4()  # assign a unique, random id
Now you must override the methods mentioned above (inside class A):
    def __eq__(self, other):
        # is the other object also an instance of class A, and does it have the same id?
        return isinstance(other, A) and self.id == other.id
    def __hash__(self):
        return hash(self.id)
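Putting the pieces together, a minimal sketch of the full round trip might look like the following. It pickles to an in-memory bytes object with pickle.dumps instead of the 'save' file from the question (an assumption made only to keep the example self-contained), and the name restored is introduced here for illustration.

```python
import pickle
import uuid


class A:
    def __init__(self):
        self.id = uuid.uuid4()  # unique, random id that survives pickling

    def __eq__(self, other):
        return isinstance(other, A) and self.id == other.id

    def __hash__(self):
        return hash(self.id)


obj_dict = {'a': A(), 'b': A()}
d = obj_dict['a']

# Round-trip through pickle in memory (stands in for the save file).
restored = pickle.loads(pickle.dumps(obj_dict, protocol=2))

print(d == restored['a'])  # compares ids, not memory addresses
print(d is restored['a'])  # identity check: these are still two separate objects
print(d == restored['b'])  # a different id
```

Note that d and restored['a'] remain separate objects: changing an attribute on one does not change the other, so code that kept a reference to the old object may still need to re-fetch it from the reloaded dictionary.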

Related

How do I merge two objects into one?

How can we create dynamically (procedurally.... at run-time) one new object from two old objects so that operations performed on the new object are performed on both of the two old objects?
As just one example, we might have two streams:
a string stream ... str_strm = io.StringIO()
a file stream ... fl_strm = open("test_file.txt", "w")
In the example, we might want it to be that any time we try to write a string to the new stream, copies of that string are written to the two older streams.
import io
class MergedStream:
    def __init__(self, strm1, strm2):
        self._strm1 = strm1
        self._strm2 = strm2
    def write(self, msg: str):
        self._strm1.write(msg)
        self._strm2.write(msg)
        return None
str_strm = io.StringIO()
fl_strm = open("test_file.txt", "w")
ms = MergedStream(str_strm, fl_strm)
I can NOT guarantee that the two old objects both have a method named write.
Our solution should be general and dynamically generate the new merged object no matter what the two old objects are.
How might we create a new object in such a way that anytime we try to act upon the new object, we perform that same action upon the two older objects?
Something like the following is a step in the right direction, but not a complete solution:
class MergedObject:
    @classmethod
    def merge(cls, left, right):
        if left == right:
            return left
        else:
            return cls(left, right)
    def __call__(self, *args, **kwargs):
        try:
            r1 = self._lefty(*args, **kwargs)
            r2 = self._righty(*args, **kwargs)
            return type(self).merge(r1, r2)
        except AttributeError:
            raise AttributeError()
    def __init__(self, lefty, righty):
        self._lefty = lefty
        self._righty = righty
        reserved = dir(self)  # reserved attribute names
        attributes = dict()
        attribute_names = set(dir(lefty)).union(dir(righty)).difference(reserved)
        for attribute_name in attribute_names:
            latter = getattr(lefty, attribute_name)  # `latter` is the `left-attribute`
            ratter = getattr(righty, attribute_name)  # `ratter` is the `right-attribute`
            setattr(self, attribute_name, type(self).merge(latter, ratter))
SOME WRINKLES TO IRON OUT
I need to be able to merge two objects which have no overloaded == operator. That is, even if there is no code inside the class definition which says def __eq__():, I need to test whether two things are equal. Suppose I merge two objects which both have a method named insert() which returns -1. I do not want to carry around duplicate values. I do not want a merged object holding two copies of the number 10, so that x + 5 computes 10 + 5 twice.
There are some issues with overriding Python's "magic" methods (dunder methods) such as __len__ or __mul__. If the two old objects both have a __len__ method, the new merged object might fail to have a method named __len__.
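One possible direction, sketched here under assumptions and not a complete answer, is to forward attribute access lazily with __getattr__ instead of copying attributes in __init__. The class name Broadcast and its behaviour of returning a list of results are hypothetical choices made for this illustration; it does not de-duplicate equal results.

```python
import io


class Broadcast:
    """Hypothetical fan-out proxy: attribute access and calls go to every target."""

    def __init__(self, *targets):
        self._targets = targets

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, so _targets is found directly.
        if name.startswith('_'):
            raise AttributeError(name)
        return Broadcast(*(getattr(target, name) for target in self._targets))

    def __call__(self, *args, **kwargs):
        # Call every target and hand back the individual results as a list.
        return [target(*args, **kwargs) for target in self._targets]


first = io.StringIO()
second = io.StringIO()
merged = Broadcast(first, second)
merged.write("hello")  # forwarded to first.write and second.write
print(first.getvalue(), second.getvalue())
```

This sketch does not solve the dunder-method wrinkle: implicit special-method calls such as len(merged) are looked up on the type, not the instance, so they bypass __getattr__ and would need to be defined on the class explicitly.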

Iteratively create subclass and store objects as class attribute

I have a class that does some complex calculation and generates some result, MyClass.myresults.
MyClass.myresults is actually a class itself with different attributes (e.g. MyClass.myresults.mydf1, MyClass.myresults.mydf2).
Now, I need to run MyClass iteratively following a list of scenarios (scenarios = [1, 2, [2, 4], 5]).
This happens with a simple loop:
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    results.StoreScenario(myclass.myresults, iter)
and at the end of each iteration store MyClass.myresults.
I would like to create a separate class (Results) that at each iteration creates a subclass scenario_1, scenario_2, scenario_2_4 and stores within it MyClass.myresults.
class Results:
    # no initialization, is an empty container to which I would like to add attributes iteratively
    class StoreScenario:
        def __init__(self, myresults, iter):
            self.'scenario_'.join(str(iter)) = myresults  # just a guess, I am assuming this is wrong
Suggestions on different approaches are more than welcome, I am quite new to classes and I am not sure if this is an acceptable approach or if I am doing something awful (clunky, memory inefficient, or else).
There are two problems with this approach. The first is that the separate Results class only stores modified values of MyClass, so the two should really be the same class.
The second problem is memory efficiency: at each iteration you create the same object twice, once for the actual values and once for the modified values.
The suggested approach is to use a hash map, i.e. a dictionary in Python. With a dictionary you can store copies of the modified object efficiently, and there is no need to create another class.
class MyClass:
    def __init__(self):
        # some attributes ...
        self.scenarios_result = {}
superObject = MyClass()
for iter in scenarios:
    iter = [iter] if isinstance(iter, int) else iter
    myclass = MyClass()  # Initialize MyClass
    myclass.DoStuff(someInput)  # Do stuff and get results
    # results.StoreScenario(myclass.myresults, iter)
    superObject.scenarios_result[tuple(iter)] = myclass  # lists are unhashable, so key by tuple
So I solved it using setattr:
class Results:
    def __init__(self):
        self.scenario_results = type('ScenarioResults', (), {})  # create an empty container class
    def store_scenario(self, data, scenarios):
        scenario_key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, scenario_key,
                subclass_store_scenario(data))
class subclass_store_scenario:
    def __init__(self, data):
        self.some_stuff = data.result1.__dict__
        self.other_stuff = data.result2.__dict__
This allows me to call things like:
results.scenario_results.scenario_1.some_stuff.something
results.scenario_results.scenario_1.some_stuff.something_else
This is necessary for me as I need to compute other measures, summary or scenario-specific, which I can then iteratively assign using again setattr:
def construct_measures(self, some_data, configuration):
    for name, scenario in vars(self.scenario_results).items():
        if not name.startswith('scenario_'):
            continue  # skip class internals such as __dict__ and __doc__
        # scenario is a reference to one stored scenario object,
        # so we can simply add attributes to it
        setattr(scenario, 'some_measure',
                self.computeSomething(
                    some_data.input1, some_data.input2))
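For comparison, here is a minimal self-contained sketch of the same setattr idea using types.SimpleNamespace as the empty container. The add_measure method, the data attribute and the use of sum() as a stand-in for real results are all assumptions made for illustration; they are not part of the code above.

```python
from types import SimpleNamespace


class Results:
    def __init__(self):
        self.scenario_results = SimpleNamespace()  # empty attribute container

    def store_scenario(self, data, scenarios):
        key = 'scenario_' + '_'.join(str(x) for x in scenarios)
        setattr(self.scenario_results, key, SimpleNamespace(data=data))

    def add_measure(self, name, func):
        # vars() on a SimpleNamespace yields only the attributes we stored.
        for scenario in vars(self.scenario_results).values():
            setattr(scenario, name, func(scenario.data))


results = Results()
for scenario in [1, 2, [2, 4], 5]:
    scenario = [scenario] if isinstance(scenario, int) else scenario
    results.store_scenario(sum(scenario), scenario)  # sum() stands in for real results

results.add_measure('doubled', lambda value: value * 2)
print(results.scenario_results.scenario_2_4.doubled)
```

Because a SimpleNamespace instance carries no class internals in its __dict__, iterating over vars() needs no name filtering.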

How to decrease the memory footprint of dictionary?

In my application, I need a fast look up of attributes. Attributes are in this case a composition of a string and a list of dictionaries. These attributes are stored in a wrapper class. Let's call this wrapper class Plane:
class Plane(object):
    def __init__(self, name, properties):
        self.name = name
        self.properties = properties
    @classmethod
    def from_idx(cls, idx):
        if idx == 0:
            return cls("PaperPlane", [{"canFly": True}, {"isWaterProof": False}])
        if idx == 1:
            return cls("AirbusA380", [{"canFly": True}, {"isWaterProof": True}, {"hasPassengers": True}])
To make this class easier to play with, I added a simple classmethod that constructs instances from an integer.
So now in my application I have many Planes, of the order of 10,000,000. Each of these planes can be accessed by a universal unique id (uuid). What I need is a fast lookup: given an uuid, what is the Plane. The natural solution is a dict. A simple class to generate planes with uuids in a dict and to store this dict in a file may look like this:
import gzip
import pickle
import uuid
import numpy as np
class PlaneLookup(object):
    def __init__(self):
        self.plane_dict = {}
    def generate(self, n_planes):
        for i in range(n_planes):
            plane_id = uuid.uuid4().hex
            self.plane_dict[plane_id] = Plane.from_idx(np.random.randint(0, 2))
    def save(self, filename):
        with gzip.open(filename, 'wb') as f:
            pickle.dump(self.plane_dict, f, pickle.HIGHEST_PROTOCOL)
    @classmethod
    def from_disk(cls, filename):
        pl = cls()
        with gzip.open(filename, 'rb') as f:
            pl.plane_dict = pickle.load(f)
        return pl
So what happens if I generate some planes?
pl = PlaneLookup()
pl.generate(1000000)
What happens is that a lot of memory gets consumed! If I check the size of my pl object with the getsize() method from this question, I get a value of 1,087,286,831 bytes on my 64-bit machine. Looking at htop, my memory demand seems to be even higher (around 2 GB).
This question explains quite well why Python dictionaries need so much memory.
However, I think this does not have to be the case in my application. The Plane object created in the PlaneLookup.generate() method very often contains the same attributes (i.e. the same name and the same properties). So it should be possible to save this object once in the dict and, whenever the same object (same name, same properties) is created again, store only a reference to the already existing dict entry. As a simple Plane object has a size of 1147 bytes (according to the getsize() method), just saving references may save a lot of memory!
The question is now: how do I do this? In the end I need a function that takes a uuid as input and returns the corresponding Plane object as fast as possible with as little memory as possible.
Maybe lru_cache can help?
Here is again the full code to play with:
https://pastebin.com/iTZyQQAU
Did you think about having another dictionary mapping idx -> plane? Then in self.plane_dict[plane_uuid] you would just store the idx instead of the object. This will save memory and speed up your app, though you'd need to modify the lookup method.
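A rough sketch of that suggestion, under a few assumptions: the names PROTOTYPES, idx_by_uuid and get are made up for illustration, and random.randint from the standard library replaces np.random.randint to keep the example dependency-free.

```python
import random
import uuid


class Plane:
    def __init__(self, name, properties):
        self.name = name
        self.properties = properties


# One shared Plane per distinct configuration (idx -> Plane).
PROTOTYPES = {
    0: Plane("PaperPlane", [{"canFly": True}, {"isWaterProof": False}]),
    1: Plane("AirbusA380", [{"canFly": True}, {"isWaterProof": True}, {"hasPassengers": True}]),
}


class PlaneLookup:
    def __init__(self):
        self.idx_by_uuid = {}  # uuid hex -> small int, instead of uuid -> Plane

    def generate(self, n_planes):
        for _ in range(n_planes):
            self.idx_by_uuid[uuid.uuid4().hex] = random.randint(0, 1)

    def get(self, plane_id):
        # Two dict lookups: uuid -> idx, then idx -> shared Plane.
        return PROTOTYPES[self.idx_by_uuid[plane_id]]


lookup = PlaneLookup()
lookup.generate(1000)
some_id = next(iter(lookup.idx_by_uuid))
print(lookup.get(some_id).name)
```

The trade-off is that every uuid with the same idx now returns the very same Plane object, so mutating one plane would change it for all of them. This only fits if planes are treated as read-only.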

are user defined classes mutable

Say I want to create a class for car, tractor and boat. All these classes have an instance of engine and I want to keep track of all the engines in a single list. If I understand correctly, if the motor object is mutable I can store it as an attribute of car and also store the same instance in a list.
I can't track down any solid info on whether user-defined classes are mutable, or whether you get to choose when you define them. Can anybody shed some light?
User classes are considered mutable. Python doesn't have (absolutely) private attributes, so you can always change a class by reaching into the internals.
For using your class as a key in a dict or storing them in a set, you can define a .__hash__() method and a .__eq__() method, making a promise that your class is immutable. You generally design your class API to not mutate the internal state after creation in such cases.
For example, if your engines are uniquely defined by their id, you can use that as the basis of your hash:
class Engine(object):
    def __init__(self, id):
        self.id = id
    def __hash__(self):
        return hash(self.id)
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.id == other.id
        return NotImplemented
Now you can use instances of class Engine in sets:
>>> eng1 = Engine(1)
>>> eng2 = Engine(2)
>>> eng1 == eng2
False
>>> eng1 == eng1
True
>>> eng1 == Engine(1)
True
>>> engines = set([eng1, eng2])
>>> engines
set([<__main__.Engine object at 0x105ebef10>, <__main__.Engine object at 0x105ebef90>])
>>> engines.add(Engine(1))
>>> engines
set([<__main__.Engine object at 0x105ebef10>, <__main__.Engine object at 0x105ebef90>])
In the above sample I add another Engine(1) instance to the set, but it is recognized as already present and the set didn't change.
Note that as far as lists are concerned, the .__eq__() implementation is the important one; lists don't care if an object is mutable or not, but with the .__eq__() method in place you can test if a given engine is already in a list:
>>> Engine(1) in [eng1, eng2]
True
All objects (with the exception of a few in the standard library, some that implement special access mechanisms using things like descriptors and decorators, or some implemented in C) are mutable. This includes instances of user defined classes, classes themselves, and even the type objects that define the classes. You can even mutate a class object at runtime and have the modifications manifest in instances of the class created before the modification. By and large, things are only immutable by convention in Python if you dig deep enough.
I think you're confusing mutability with how python keeps references -- Consider:
class Foo(object):
    pass
t = (1, 2, Foo())  # t is a tuple, so t is immutable
b = t[2]  # b is an instance of Foo
b.foo = "Hello"  # b is mutable. (I just changed it)
print(hash(b))  # b is hashable -- although the default hash isn't very useful
d = {b: 3}  # since b is hashable, it can be used as a key in a dictionary (or set).
c = t  # even though t is immutable, we can create multiple references to it.
a = [t]  # here we add another reference to t in a list.
Now to your question about getting/storing a list of engines globally -- There are a few different ways to do this, here's one:
class Engine(object):
    def __init__(self, make, model):
        self.make = make
        self.model = model
class EngineFactory(object):
    def __init__(self, **kwargs):
        self._engines = kwargs
    def all_engines(self):
        return self._engines.values()
    def __call__(self, make, model):
        """ Return the same object for each make, model combination requested """
        if (make, model) in self._engines:
            return self._engines[(make, model)]
        else:
            a = self._engines[(make, model)] = Engine(make, model)
            return a
engine_factory = EngineFactory()
engine1 = engine_factory('cool_engine', 1.0)
engine2 = engine_factory('cool_engine', 1.0)
engine1 is engine2  # True !!! They're the same engine. Changing engine1 changes engine2
The example above could be improved a little bit by having the EngineFactory._engines dict store weakref.ref objects instead of actually storing real references to the objects. In that case, you'd check to make sure the reference is still alive (hasn't been garbage collected) before you return a new reference to the object.
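A sketch of that weak-reference variant might look like this. Note one substitution: it uses weakref.WeakValueDictionary instead of storing weakref.ref objects by hand, because the dictionary drops dead entries on its own. The immediate cleanup shown at the end relies on CPython's reference counting, with gc.collect() added as a precaution.

```python
import gc
import weakref


class Engine:
    def __init__(self, make, model):
        self.make = make
        self.model = model


class EngineFactory:
    def __init__(self):
        # Values are held weakly: an entry vanishes once nothing else references the engine.
        self._engines = weakref.WeakValueDictionary()

    def all_engines(self):
        return list(self._engines.values())

    def __call__(self, make, model):
        engine = self._engines.get((make, model))
        if engine is None:  # never created, or already garbage collected
            engine = Engine(make, model)
            self._engines[(make, model)] = engine
        return engine


engine_factory = EngineFactory()
engine1 = engine_factory('cool_engine', 1.0)
engine2 = engine_factory('cool_engine', 1.0)
same = engine1 is engine2
count_while_alive = len(engine_factory.all_engines())
print(same, count_while_alive)

del engine1, engine2  # drop the only strong references
gc.collect()
print(len(engine_factory.all_engines()))
```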
EDIT: This is conceptually wrong, The immutable object in python can shed some light as to why.
class Engine():
    def __init__(self, sn):
        self.sn = sn
a = Engine(42)
b = a
print(a is b)
prints True.

sharing a string between two objects

I want two objects to share a single string object. How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other? I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work.
However, I have a tendency to overthink problems, so undoubtedly there is an easier way. Or maybe sharing the string is the wrong way to go? Keep in mind that I want both objects to be able to edit the string. Any ideas?
Here is an example of a solution I could use:
class Buffer(object):
    def __init__(self):
        self.data = ""
    def assign(self, value):
        self.data = str(value)
    def __getattr__(self, name):
        return getattr(self.data, name)
class Descriptor(object):
    def __get__(self, instance, owner):
        return instance._buffer.data
    def __set__(self, instance, value):
        if not hasattr(instance, "_buffer"):
            if isinstance(value, Buffer):
                instance._buffer = value
                return
            instance._buffer = Buffer()
        instance._buffer.assign(value)
class First(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def read(self, size=-1):
        if size < 0:
            size = len(self.data)
        data = self.data[:size]
        self.data = self.data[size:]
        return data
class Second(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def add(self, newdata):
        self.data += newdata
    def reset(self):
        self.data = ""
    def spawn(self):
        return First(self._buffer)
s = Second("stuff")
f = s.spawn()
f.data == s.data
#True
f.read(2)
#"st"
f.data
# "uff"
f.data == s.data
#True
s.data
#"uff"
s._buffer == f._buffer
#True
Again, this seems like absolute overkill for what seems like a simple problem. As well, it requires the use of the Buffer class, a descriptor, and the descriptor's impositional _buffer variable.
An alternative is to put one of the objects in charge of the string and then have it expose an interface for making changes to the string. Simpler, but not quite the same effect.
"I want two objects to share a single string object."
They will, if you simply pass the string -- Python doesn't copy unless you tell it to copy.
"How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other?"
There can never be any change made to a string object (it's immutable!), so your requirement is trivially met (since a false precondition implies anything).
"I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work."
You could use (assuming this is Python 2 and you want a string of bytes) an array.array with a typecode of c. Arrays are mutable, so you can indeed alter them (with mutating methods -- and some operators, which are a special case of methods since they invoke special methods on the object). They don't have the myriad non-mutating methods of strings, so, if you need those, you'll indeed need a simple wrapper (delegating said methods to the str(...) of the array that the wrapper also holds).
It doesn't seem there should be any special complexity, unless of course you want to do something truly weird as you seem to given your example code (have an assignment, i.e., a rebinding of a name, magically affect a different name -- that has absolutely nothing to do with whatever object was previously bound to the name you're rebinding, nor does it change that object in any way -- the only object it "changes" is the one holding the attribute, so it's obvious that you need descriptors or other magic on said object).
You appear to come from some language where variables (and particularly strings) are "containers of data" (like C, Fortran, or C++). In Python (like, say, in Java), names (the preferred way to call what others call "variables") always just refer to objects, they don't contain anything except exactly such a reference. Some objects can be changed, some can't, but that has absolutely nothing to do with the assignment statement (see note 1) (which doesn't change objects: it rebinds names).
(note 1): except of course that rebinding an attribute or item does alter the object that "contains" that item or attribute -- objects can and do contain, it's names that don't.
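To make the distinction between rebinding a name and mutating an object concrete, here is a small sketch. It uses bytearray as a Python 3 stand-in for the array.array with typecode c mentioned above (that typecode no longer exists in Python 3), and the Holder class is a hypothetical placeholder for the two objects that want to share data.

```python
# Rebinding: the name t is pointed at a new str object; s is untouched.
s = "stuff"
t = s
t = t + "!"
print(s, t)

# Mutation: both names refer to the same bytearray, so a change is visible via either.
shared = bytearray(b"stuff")
alias = shared
alias.extend(b"!")
print(shared, shared is alias)


class Holder:
    def __init__(self, buf):
        self.buf = buf  # stores a reference, not a copy


first, second = Holder(shared), Holder(shared)
del first.buf[:2]  # "read" two bytes off the front, in place
print(second.buf)
```

As long as both holders only mutate the buffer in place (slice deletion, extend, item assignment) and never rebind self.buf, they keep seeing each other's changes without any descriptor machinery.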
Just put your value to be shared in a list, and assign the list to both objects.
class A(object):
    def __init__(self, strcontainer):
        self.strcontainer = strcontainer
    def upcase(self):
        self.strcontainer[0] = self.strcontainer[0].upper()
    def __str__(self):
        return self.strcontainer[0]
# create a string, inside a shareable list
shared = ['Hello, World!']
x = A(shared)
y = A(shared)
# both objects have the same list
print(id(x.strcontainer))
print(id(y.strcontainer))
# change value in x
x.upcase()
# show how value is changed in both x and y
print(str(x))
print(str(y))
Prints:
10534024
10534024
HELLO, WORLD!
HELLO, WORLD!
I am not a great expert in Python, but I think that if you declare a variable in a module and add a getter/setter to the module for this variable, you will be able to share it this way.
