My problem seems simple, but I am unable to solve it. When I insert objects into a list, all elements of the list change whenever I change one of them (I think they all point to the same object in memory). I want to unlink them so the list is not full of exactly the same object with the same values, i.e. avoid the linking/mutability. I think the problem is how I initialize the objects, but I am not sure how to fix it. Here is my code.
from typing import List, Tuple
class State:
#think of State as some kind of coordinates
def __init__(self, z:float, angle:float):
self.z = z
self.angle = angle
class ListOfStates:
#this should be an object with a list containing DIFFERENT (unlinked) State objects
def __init__(self, list_of_states : List[State]):
self.list_of_states = list_of_states
class StateSettings:
#a bigger object to encapsulate previous objects
def __init__(self, state : State, list_of_states : ListOfStates):
self.state = state
self.list_of_states = list_of_states
some_number = 42
# my try #1
state_settings = StateSettings
#create a list of State objects to be used later
state_settings.list_of_states = [State for i in range(some_number)]
state_settings.state = State
for i in range(some_number):
state_settings.list_of_states[i].angle = i
And state_settings.list_of_states then contains the same object 42 times, e.g.
print(state_settings.list_of_states[0].angle)
print(state_settings.list_of_states[1].angle)
print(state_settings.list_of_states[2].angle)
prints
41
41
41
I also tried different ways to initialize, but with no luck.
# my try #2
state_settings = StateSettings(
state = State(
z = 0,
angle = 0),
list_of_states = [State for i in range(some_number)]
)
for i in range(some_number):
state_settings.list_of_states[i].angle = i
or
# my try 3
from copy import deepcopy
state_settings = StateSettings
state_settings.list_of_states = [deepcopy(State) for i in range(some_number)]
state_settings.state = deepcopy(State)
for i in range(some_number):
state_settings.list_of_states[i].angle = i
My question, as far as I know, is not solved by answers such as Changing a single object within an array of objects changes all, even in a different array or List of Objects changes when the object that was input in the append() function changes.
There are some fundamental mistakes in your code. Let me try to shed some light on those first, using your lines of code.
# my try #1
state_settings = StateSettings
In the line above, you assigned the class StateSettings itself to the state_settings variable. You never created an object here.
#create a list of State objects to be used later
state_settings.list_of_states = [State for i in range(some_number)]
Here you did the same thing: you created a list of references to the State class, not a list of State objects. So all the values in the list are the same.
state_settings.state = State
Here you set the attribute state on the StateSettings class, not on an object.
for i in range(some_number):
state_settings.list_of_states[i].angle = i
Here you set the attribute angle on the class State. Since every entry in the list is the same reference to State, the value is the same everywhere.
To summarize the issues above:
When you assign an attribute to the class name, the attribute is added to the class itself. Anywhere you hold a reference to that class, you will see the same attribute value.
When you create an object and then set an attribute on that object, the attribute lives only in that object. It is not reflected in any other objects you create.
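A minimal illustration of that difference, using a throwaway class rather than the code from the question:
class Point:
    pass

# Assigning through the class name: every reference to the class sees the same value.
Point.angle = 41
a = Point
b = Point
print(a.angle, b.angle)   # 41 41 -- a and b are both the class object itself

# Assigning on instances: each instance keeps its own value.
p = Point()
q = Point()
p.angle = 1
q.angle = 2
print(p.angle, q.angle)   # 1 2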
A simple update to the code you wrote is below, which I believe works the way you want.
from typing import List
class State:
# think of State as some kind of coordinates
    # Use default values, so you don't need to provide a value in __init__
def __init__(self, z: float = None, angle: float = None):
self.z = z
self.angle = angle
class ListOfStates:
# this should be an object with a list containing DIFFERENT (unlinked) State objects
    # Use default values, so you don't need to provide a value in __init__
def __init__(self, list_of_states: List[State] = None):
self.list_of_states = list_of_states
class StateSettings:
# a bigger object to encapsulate previous objects
    # Use default values, so you don't need to provide a value in __init__
def __init__(self, state: State = None, list_of_states: ListOfStates = None):
self.state = state
self.list_of_states = list_of_states
some_number = 42
# my try #1
state_settings = StateSettings()
# create a list of State objects to be used later
state_settings.list_of_states = [State() for i in range(some_number)]
state_settings.state = State()
for i in range(some_number):
state_settings.list_of_states[i].angle = i
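With that change every element of the list is a separate State instance, so the checks from the question now print different values (a quick sanity check against the code above):
print(state_settings.list_of_states[0].angle)  # 0
print(state_settings.list_of_states[1].angle)  # 1
print(state_settings.list_of_states[2].angle)  # 2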
Related
I have a situation where one party (Alice) has a complex custom object, whose attributes are complicated and can involve circular references. Alice then sends this object to two separate parties, Bob and Claire, by pickling and sending through an (encrypted) socket. They each then modify one attribute of the object each, but what they change includes complex references to the object they received from Alice. Both Bob and Claire then pickle their own modified object themselves, and send it back to Alice.
The question is, how can Alice combine the changes made by both Bob and Claire? Because object identity is lost on pickling/unpickling, the naive idea of copying the attribute that Bob or Claire created onto the original object doesn't work. I am aware of how persistent_id() and persistent_load() work in pickling, but I would very much like to avoid having to manually write rules for every single attribute in the object that Alice creates. Partly because it's a big pile of nested and circularly referenced objects (some 10,000+ lines of them), and partly because I want the flexibility to modify the rest of the code without having to change how I pickle/unpickle every time (and the difficulty of properly testing that).
Can this be done? Or do I have to swallow the bitter pill and deal with the pickling "manually"?
Here's a minimal concrete example. Obviously, the circular references could be easily removed here, or Bob and Claire could just send their value over to Alice, but not so in my real case.
import pickle
class Shared:
pass
class Bob:
pass
class Claire:
pass
class Alice:
def __init__(self):
self.shared = Shared()
self.bob = Bob()
self.claire = Claire()
def add_some_data(self, x, y):
self.shared.bob = self.bob
self.shared.claire = self.claire
self.shared.x = x
self.shared.y = y
def bob_adds_data(self, extra):
self.bob.value = self.shared.x + self.shared.y + extra
def claire_adds_data(self, extra):
self.claire.value = self.shared.x * self.shared.y * extra
# Done on Alice's side
original = Alice()
original.add_some_data(2, 3)
outgoing = pickle.dumps(original)
# Done on Bob's side
bobs_copy = pickle.loads(outgoing)
bobs_copy.bob_adds_data(4)
bobs_reply = pickle.dumps(bobs_copy)
# Done on Claire's side
claires_copy = pickle.loads(outgoing)
claires_copy.claire_adds_data(5)
claires_reply = pickle.dumps(claires_copy)
# Done on Alice's side
from_bob = pickle.loads(bobs_reply)
from_claire = pickle.loads(claires_reply)
original.bob = from_bob.bob
original.claire = from_claire.claire
# If the circular references were maintained, these two lines would print the same values
# instead, the attributes on the bottom line do not exist because the reference is broken
print(original.bob.value, original.claire.value)
print(original.shared.bob.value, original.shared.claire.value)
Partial Solution
I have a partial solution that works with some restrictions on the problem case.
Restrictions
The restriction is that Alice's object holds only a single reference to Bob's object and to Claire's, each in a known place. Those latter two, however, can have arbitrarily complex references to themselves and to Alice, including circular, nested and recursive structures. The other requirement is that Bob doesn't have any reference to Claire and vice versa: this is very natural if we require both of those objects to be updated independently and in any order.
In other words, Alice receives something from Bob which gets placed in a single tidy place. The difficulty is in making the references contained within Bob match up to the correct objects that Alice contains, but nothing else in Alice itself needs to be changed. This is the use case I need, and it isn't clear to me that the more general case is possible if Bob and Claire can make arbitrary changes to Alice.
Idea
This works by having a base class that creates a persistent id which does not change over the object's lifetime, is preserved through pickling/unpickling, and is unique. Any object whose references are to be maintained in this scenario must inherit from this class. When Bob sends his changes to Alice, he pickles using a dictionary mapping every object he received from Alice to its persistent id, so that all references to pre-existing objects are encoded by their persistent id. On the other side, Alice does the same: she unpickles what Bob sent her using a dictionary mapping persistent id to object for everything she sent to Bob earlier. Thus, while Alice and Bob have different instances of everything, the persistent ids of some objects are the same, so those objects can be "swapped out" when pickling between different parties.
This can be made to work with existing code pretty easily. It just consists of adding a base class to all custom classes that we want to make persistent, and a small addition every time we pickle/unpickle.
Module
import io
import time
import pickle
class Persistent:
def __init__(self):
"""Both unique and unchanging, even after modifying or pickling/unpickling object
Potential problem if clocks are not in sync"""
self.persistent_id = str(id(self)) + str(time.time())
def make_persistent_memo(obj):
"""Makes two dictionaries (one reverse of other) linking every instance of Persistent found
in the attributes and collections of obj recursively, with the persistent id of that instant.
Can cope with circular references and recursively nested objects"""
def add_to_memo(item, id_to_obj, obj_to_id, checked):
# Prevents checking the same object multiple times
if id(item) in checked:
return id_to_obj, obj_to_id, checked
else:
checked.add(id(item))
if isinstance(item, Persistent):
id_to_obj[item.persistent_id] = item
obj_to_id[item] = item.persistent_id
try: # Try to add attributes of item to memo, recursively
for sub_item in vars(item).values():
add_to_memo(sub_item, id_to_obj, obj_to_id, checked)
except TypeError:
pass
try: # Try to add iterable elements of item to memo, recursively
for sub_item in item:
add_to_memo(sub_item, id_to_obj, obj_to_id, checked)
except TypeError:
pass
return id_to_obj, obj_to_id, checked
return add_to_memo(obj, {}, {}, set())[:2]
class PersistentPickler(pickle.Pickler):
""" Normal pickler, but it takes a memo of the form {obj: persistent id}
any object in that memo is pickled as its persistent id instead"""
#staticmethod # Because dumps is not defined for custom Picklers
def dumps(obj_to_id_memo, obj):
with io.BytesIO() as file:
PersistentPickler(file, obj_to_id_memo).dump(obj)
file.seek(0)
return file.read()
def __init__(self, file, obj_to_id_memo):
super().__init__(file)
self.obj_to_id_memo = obj_to_id_memo
def persistent_id(self, obj):
try:
if obj in self.obj_to_id_memo and obj:
return self.obj_to_id_memo[obj]
except TypeError: # If obj is unhashable
pass
return None
class PersistentUnPickler(pickle.Unpickler):
""" Normal pickler, but it takes a memo of the form {persistent id: obj}
used to undo the effects of PersistentPickler"""
#staticmethod # Because loads is not defined for custom Unpicklers
def loads(id_to_obj_memo, pickled_data):
with io.BytesIO(pickled_data) as file:
obj = PersistentUnPickler(file, id_to_obj_memo).load()
return obj
def __init__(self, file, id_to_obj_memo):
super().__init__(file)
self.id_to_obj_memo = id_to_obj_memo
def persistent_load(self, pid):
if pid in self.id_to_obj_memo:
return self.id_to_obj_memo[pid]
else:
super().persistent_load(pid)
Use Example
class Alice(Persistent):
""" Must have a single attribute saved as bob or claire """
def __init__(self):
super().__init__()
self.shared = Shared()
self.bob = Bob()
self.claire = Claire()
def add_some_data(self, x, y):
self.nested = [self]
self.nested.append(self.nested)
self.shared.x = x
self.shared.y = y
class Bob(Persistent):
""" Can have arbitrary reference to itself and to Alice but must not touch Claire """
def make_changes(self, alice, extra):
self.value = alice.shared.x + alice.shared.y + extra
self.attribute = alice.shared
self.collection = [alice.bob, alice.shared]
self.collection.append(self.collection)
self.new = Shared()
class Claire(Persistent):
""" Can have arbitrary reference to itself and to Alice but must not touch Bob """
def make_changes(self, alice, extra):
self.value = alice.shared.x * alice.shared.y * extra
self.attribute = alice
self.collection = {"claire": alice.claire, "shared": alice.shared}
self.collection["collection"] = self.collection
class Shared(Persistent):
pass
# Done on Alice's side
alice = Alice()
alice.add_some_data(2, 3)
outgoing = pickle.dumps(alice)
# Done on Bob's side
bobs_copy = pickle.loads(outgoing)
# Create a memo of the persistent_id of the received objects that are *not* being modified
_, bob_pickling_memo = make_persistent_memo(bobs_copy)
bob_pickling_memo.pop(bobs_copy.bob)
# Make changes and send everything back to Alice
bobs_copy.bob.make_changes(bobs_copy, 4)
bobs_reply = PersistentPickler.dumps(bob_pickling_memo, bobs_copy.bob)
# Same on Claire's side
claires_copy = pickle.loads(outgoing)
_, claire_pickling_memo = make_persistent_memo(claires_copy)
claire_pickling_memo.pop(claires_copy.claire)
claires_copy.claire.make_changes(claires_copy, 5)
claires_reply = PersistentPickler.dumps(claire_pickling_memo, claires_copy.claire)
# Done on Alice's side
alice_unpickling_memo, _ = make_persistent_memo(alice)
alice.bob = PersistentUnPickler.loads(alice_unpickling_memo, bobs_reply)
alice.claire = PersistentUnPickler.loads(alice_unpickling_memo, claires_reply)
# Check that Alice has received changes from Bob and Claire
print(alice.bob.value == bobs_copy.bob.value == 9,
alice.claire.value == claires_copy.claire.value == 30)
# Check that all references match up as expected
print("Alice:", alice is alice.nested[0] is alice.nested[1][0] is alice.claire.attribute)
print("Bob:", (alice.bob is alice.nested[0].bob is alice.bob.collection[0] is
alice.bob.collection[2][0]))
print("Claire:", (alice.claire is alice.nested[0].claire is alice.claire.collection["claire"] is
alice.claire.collection["collection"]["claire"]))
print("Shared:", (alice.shared is alice.bob.attribute is alice.bob.collection[1] is
alice.bob.collection[2][1] is alice.claire.collection["shared"] is
alice.claire.collection["collection"]["shared"] is not alice.bob.new))
Output
C>python test.py
True True
Alice: True
Bob: True
Claire: True
Shared: True
All exactly as required
Follow up
It feels like I'm reinventing the wheel here by doing my own nested introspection, can this be done better with existing tools?
My code feels rather inefficient, with a lot of introspection, can this be improved?
Can I be sure that add_to_memo() is not missing some references out?
The use of time.time() to create the persistent id feels rather clunky, is there a better alternative?
I would simply like to make a list of kinds of coffee, but I get an error stating that the list is not defined. Do I have to use self in the constructor when referencing a class variable?
I have tried changing the return statement to return self.coffelist.append(name), but then get another error: 'Function' object has no attribute 'append'.
class coffe:
coffelist = []
def __init__(self,name,origin,price):
self.name = name
self.origin = origin
self.price = price
return (self.coffelist.append(self.name))
def coffelist(self):
print(coffelist)
c1=coffe("blackcoffe","tanz",55)
c2=coffe("fineroasted","ken",60)
This is because you named one of your methods coffelist, which shadows the coffelist class attribute of the same name.
I think this shows how to do what you want. I also modified your code to follow the PEP 8 - Style Guide for Python Code and corrected some misspelled words.
class Coffee:  # Class names should be capitalized.
    coffeelist = []  # Class attribute to track instance names.
    def __init__(self, name, origin, price):
self.name = name
self.origin = origin
self.price = price
self.coffeelist.append(self.name)
def print_coffeelist(self):
print(self.coffeelist)
c1 = Coffee("blackcoffee", "tanz", 55)
c1.print_coffeelist() # -> ['blackcoffee']
c2 = Coffee("fineroasted", "ken", 60)
c1.print_coffeelist() # -> ['blackcoffee', 'fineroasted']
# Can also access attribute directly through the class:
print(Coffee.coffeelist) # -> ['blackcoffee', 'fineroasted']
Yes, thanks, that's exactly what I wanted!
I wasn't sure. I thought you could do two things simultaneously in the return statement: both return and append. I guess a lot of the time Python is very flexible and sometimes not. Thanks.
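As a side note on the "return the append" idea: list.append mutates the list in place and returns None, and __init__ has to return None anyway, so there is nothing useful to return there. A quick check:
coffeelist = []
result = coffeelist.append("blackcoffee")
print(result)      # None -- append modifies the list in place
print(coffeelist)  # ['blackcoffee']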
I'm pretty new to Python, so I'm sure this isn't the most efficient way to code this. The problem I'm having is that I have a second for loop that runs inside another for loop. It works fine the first time, but on the second iteration the inner for loop doesn't register the data and skips over it, so it never runs again. I use a zipped tuple, and it looks like it loses the value completely.
class Model:
def predict(self, data):
prediction = []
distances = []
for item in data:
distances.clear()
for trainedItem in self.Train_Data:
distances.append([(abs((item[0] - trainedItem[0][3])) + abs((item[1] - trainedItem[0][1])) + abs((item[2] - trainedItem[0][2])) + abs((item[3] - trainedItem[0][3]))), trainedItem[1]])
distances.sort()
targetNeighbors = []
for closest in distances[:self.K]:
targetNeighbors.append(closest[1])
prediction.append(Counter(targetNeighbors).most_common()[0][0])
return prediction
class HardcodedClassifier:
def fit(X_Train, Y_Train, k):
Model.Train_Data = zip(X_Train, Y_Train)
Model.K = k
        return Model
The iterator was depleted: zip() returns a one-shot iterator, so after the first pass over it in the inner loop there is nothing left to consume. Try Model.Train_Data = list(zip(X_Train, Y_Train)) so it can be iterated every time in the inner for loop.
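To illustrate the difference (a small standalone check, not part of the original code): a zip object is a one-shot iterator, while a list can be traversed any number of times.
pairs = zip([1, 2, 3], ['a', 'b', 'c'])
print(list(pairs))  # [(1, 'a'), (2, 'b'), (3, 'c')]
print(list(pairs))  # [] -- the iterator is already exhausted

pairs = list(zip([1, 2, 3], ['a', 'b', 'c']))
print(list(pairs))  # [(1, 'a'), (2, 'b'), (3, 'c')]
print(list(pairs))  # same result on every pass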
Based on what I see, you are using the Model class itself instead of instantiating a Model object and accessing its data. When you declare a class, the declaration only creates a class object, which acts as a constructor: calling it returns a new object of the type you defined.
class Bacon:
    tasty = True
    def __init__(self):
        self.salty = True

Bacon
>> <class '__main__.Bacon'>
Bacon.tasty
>> True
Bacon.salty
>> AttributeError: type object 'Bacon' has no attribute 'salty'
baconstrip = Bacon()
baconstrip
>> <__main__.Bacon object at #memoryaddress>
baconstrip.tasty
>> True
baconstrip.salty
>> True
The baconstrip object is of type Bacon and has its own namespace allocated to it to store variables. The name Bacon refers to the class (the constructor); you can access it like an object too, but it is not an actual instance of itself.
For your code:
class HardcodedClassifier:
def __init__(self, model): # to initialize the class, provide a model.
self.model = model
    def fit(self, X_Train, Y_Train, k):
        self.model.Train_Data = zip(X_Train, Y_Train)
        self.model.K = k
# no need to return a value. The state of the object is preserved.
mymodel = Model()
myclassifier = HardcodedClassifier(mymodel)
I have read extensively about immutable and mutable objects in Python for a couple of months now, and I seem to be starting to understand the concept. Still, I cannot spot why my code below produces memory leaks. The dicts function as references to immutable records of a specific type. In many cases I get an update of an existing record; in that case, the existing record is only updated if the two records (oldrecord and newrecord) are not equal. However, I have the feeling that newrecord never gets deleted if oldrecord and newrecord match, even though all references to it appear to cease to exist in that case.
My question:
Is the code below good practice for selecting a reference to a dict based on record type or should I do it differently (e.g. through dictSwitcher)?
class myRecordDicts():
def __init__(self, type1Dict=dict(), type2Dict=dict(),
type3Dict=dict(),type4Dict=dict(),type5Dict=dict(),type6Dict=dict(), type7Dict=dict()):
self.type1Dict = type1Dict
self.type2Dict = type2Dict
self.type3Dict = type3Dict
self.type4Dict = type4Dict
self.type5Dict = type5Dict
self.type6Dict = type6Dict
self.type7Dict = type7Dict
def dictSelector(self, record):
dictSwitcher = {
myCustomRecordType1().name: self.type1Dict,
myCustomRecordType2().name: self.type2Dict,
myCustomRecordType3().name: self.type3Dict,
myCustomRecordType4().name: self.type4Dict,
myCustomRecordType5().name: self.type5Dict,
myCustomRecordType6().name: self.type6Dict,
myCustomRecordType7().name: self.type7Dict,
}
return dictSwitcher.get(record.name)
def AddRecordToDict(self, newrecord):
dict = self.dictSelector(newrecord)
recordID = newrecord.id
if recordID in dict:
oldrecord = dict[recordID]
self.MergeExistingRecords(oldrecord,newrecord)
else:
dict[recordID] = newrecord
def MergeExistingRecords(self, oldrecord, newrecord):
# Basic Compare function
oldRecordString = oldrecord.SerializeToString()
newRecordString = newrecord.SerializeToString()
# no need to do anything if same length
if not len(oldRecordString) == len(newRecordString):
oldrecord.CustomMergeFrom(newrecord)
Well, it always seems to go like that: I was working on this problem for hours and could not make progress. Five minutes after formulating the question properly on Stack Exchange, I found my issue:
I needed to remove the default dict() arguments from __init__, since I was never passing dicts when instantiating myRecordDicts(). The following code does not leak memory:
class myRecordDicts():
def __init__(self):
self.type1Dict = dict()
self.type2Dict = dict()
self.type3Dict = dict()
self.type4Dict = dict()
self.type5Dict = dict()
self.type6Dict = dict()
self.type7Dict = dict()
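The underlying pitfall is that default argument values are evaluated once, when the function is defined, so every instance created without explicit arguments shares the same dictionaries. A minimal sketch of the behaviour, using a stripped-down stand-in class rather than the original one:
class Holder:
    def __init__(self, data=dict()):  # the default dict is created once, at definition time
        self.data = data

a = Holder()
b = Holder()
a.data['key'] = 'value'
print(b.data)            # {'key': 'value'} -- both instances share the same dict
print(a.data is b.data)  # True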
My class:
class ManagementReview:
"""Class describing ManagementReview Object.
"""
# Class attributes
id = 0
Title = 'New Management Review Object'
fiscal_year = ''
region = ''
review_date = ''
date_completed = ''
prepared_by = ''
__goals = [] # List of <ManagementReviewGoals>.
__objectives = [] # List of <ManagementReviewObjetives>.
__actions = [] # List of <ManagementReviewActions>.
__deliverables = [] # List of <ManagementReviewDeliverable>.
__issues = [] # List of <ManagementReviewIssue>.
__created = ''
__created_by = ''
__modified = ''
__modified_by = ''
The __modified attribute is a datetime string in isoformat. I want that attribute to be updated automatically to datetime.now().isoformat() every time one of the other attributes is updated. For each of the other attributes I have a setter like:
def setObjectives(self,objectives):
mro = ManagementReviewObjective(args)
self.__objectives.append(mro)
So, is there an easier way than adding a line like:
self.__modified = datetime.now().isoformat()
to every setter?
Thanks! :)
To update __modified when instance attributes are modified (as in your example of self.__objectives), you could override __setattr__.
For example, you could add this to your class:
def __setattr__(self, name, value):
    # set the value like usual and then update the modified attribute too
    self.__dict__[name] = value
    # note: inside the class, __modified is name-mangled to _ManagementReview__modified
    self.__dict__['_ManagementReview__modified'] = datetime.now().isoformat()
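A minimal, self-contained sketch of how that behaves (simplified, with a hypothetical get_modified helper rather than the full class from the question):
from datetime import datetime

class ManagementReview:
    def __setattr__(self, name, value):
        # write to __dict__ directly so we don't recurse into __setattr__
        self.__dict__[name] = value
        self.__dict__['_ManagementReview__modified'] = datetime.now().isoformat()

    def get_modified(self):
        return self.__dict__.get('_ManagementReview__modified')

mr = ManagementReview()
mr.Title = 'Q3 review'    # any attribute assignment...
print(mr.get_modified())  # ...also refreshes the timestamp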
Perhaps adding a decorator before each setter?
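That could look roughly like the following sketch, using a hypothetical touches_modified decorator and a plain _modified attribute (the double-underscore name from the question would be name-mangled, which complicates access from a standalone decorator):
from datetime import datetime
from functools import wraps

def touches_modified(setter):
    # Hypothetical decorator: run the original setter, then refresh the timestamp.
    @wraps(setter)
    def wrapper(self, *args, **kwargs):
        result = setter(self, *args, **kwargs)
        self._modified = datetime.now().isoformat()
        return result
    return wrapper

class ManagementReview:
    def __init__(self):
        self._modified = datetime.now().isoformat()
        self._objectives = []

    @touches_modified
    def setObjectives(self, objective):
        self._objectives.append(objective)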
If you have a method that commits the changes made to these attributes to a database (like a save() or update_record() method, something like that), you could just add the line
self.__modified = datetime.now().isoformat()
just before it is all committed, since that's the only time it really matters anyway.