Recombining object after pickling-unpickling - python

I have a situation where one party (Alice) has a complex custom object whose attributes are complicated and can involve circular references. Alice sends this object to two separate parties, Bob and Claire, by pickling it and sending it through an (encrypted) socket. Each of them then modifies one attribute of the object, but what they change includes complex references to the object they received from Alice. Bob and Claire then each pickle their own modified object and send it back to Alice.
The question is: how can Alice combine the changes made by both Bob and Claire? Because object identity is lost on pickling/unpickling, the naive idea of copying the attribute that Bob or Claire created onto the original object doesn't work. I am aware of how persistent_id() and persistent_load() work in pickling, but I would very much like to avoid manually writing rules for every single attribute in the object that Alice creates. Partly because it's a big pile of nested and circularly referenced objects (some 10,000+ lines of them), and partly because I want the flexibility to modify the rest of the code without having to change how I pickle/unpickle every time (and the difficulty of properly testing that).
Can this be done? Or do I have to swallow the bitter pill and deal with the pickling "manually"?
Here's a minimal concrete example. Obviously, the circular references could be easily removed here, or Bob and Claire could just send their value over to Alice, but not so in my real case.
import pickle

class Shared:
    pass

class Bob:
    pass

class Claire:
    pass

class Alice:
    def __init__(self):
        self.shared = Shared()
        self.bob = Bob()
        self.claire = Claire()

    def add_some_data(self, x, y):
        self.shared.bob = self.bob
        self.shared.claire = self.claire
        self.shared.x = x
        self.shared.y = y

    def bob_adds_data(self, extra):
        self.bob.value = self.shared.x + self.shared.y + extra

    def claire_adds_data(self, extra):
        self.claire.value = self.shared.x * self.shared.y * extra

# Done on Alice's side
original = Alice()
original.add_some_data(2, 3)
outgoing = pickle.dumps(original)

# Done on Bob's side
bobs_copy = pickle.loads(outgoing)
bobs_copy.bob_adds_data(4)
bobs_reply = pickle.dumps(bobs_copy)

# Done on Claire's side
claires_copy = pickle.loads(outgoing)
claires_copy.claire_adds_data(5)
claires_reply = pickle.dumps(claires_copy)

# Done on Alice's side
from_bob = pickle.loads(bobs_reply)
from_claire = pickle.loads(claires_reply)
original.bob = from_bob.bob
original.claire = from_claire.claire

# If the circular references were maintained, these two lines would print the same values.
# Instead, the attributes on the bottom line do not exist because the reference is broken.
print(original.bob.value, original.claire.value)
print(original.shared.bob.value, original.shared.claire.value)
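The underlying behaviour can be shown in isolation. The following is only a minimal sketch using plain dictionaries (the names `shared`, `original` and `copy` are made up for the illustration): references inside one pickle stay consistent with each other, but every call to pickle.loads() builds brand-new objects, so nothing in a reply is identical to anything Alice still holds.

```python
import pickle

shared = {"x": 2, "y": 3}
original = {"shared": shared, "also_shared": shared}
original["self"] = original          # circular reference

copy = pickle.loads(pickle.dumps(original))

# Within one pickle, circular and shared references survive...
internal_ok = copy["self"] is copy and copy["shared"] is copy["also_shared"]
# ...but the unpickled objects are not the objects that were pickled.
same_as_original = copy["shared"] is original["shared"]

print(internal_ok, same_as_original)  # True False
```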

Partial Solution
I have a partial solution that works with some restrictions on the problem case.
Restrictions
The restriction is that Alice's object has only a single reference to Bob's and Claire's in a known place. Those latter two, however, can have arbitrarily complex references to themselves and to Alice, including circular, nested and recursive structures. The other requirement is that Bob doesn't have any reference to Claire and vice-versa: this is very natural if we require both of those objects to be updated independently and in any order.
In other words, Alice receives something from Bob which gets placed in a single tidy place. The difficulty is in making the references contained within Bob match up to the correct objects that Alice contains, but nothing else in Alice itself needs to be changed. This is the use case I need, and it isn't clear to me that the more general case is possible if Bob and Claire can make arbitrary changes to Alice.
Idea
This works by having a base class that creates a persistent id which does not change over the object's lifetime, is maintained by pickling/unpickling, and is unique. Any object whose references are to be maintained in this scenario must inherit from this class. When Bob sends his changes to Alice, he pickles them using a dictionary that maps every object he received from Alice to its persistent id, so that all references to pre-existing objects are encoded by their persistent id. On the other side, Alice does the reverse: she unpickles what Bob sent her using a dictionary that maps persistent ids to the objects she sent to Bob earlier. Thus, while Alice and Bob have different instances of everything, the persistent ids of some objects are the same, so they can be "swapped out" when pickling between different parties.
This can be made to work with existing code pretty easily. It just consists of adding a base class to all custom classes that we want to make persistent, and a small addition every time we pickle/unpickle.
Module
import io
import time
import pickle

class Persistent:
    def __init__(self):
        """Both unique and unchanging, even after modifying or pickling/unpickling object
        Potential problem if clocks are not in sync"""
        self.persistent_id = str(id(self)) + str(time.time())

def make_persistent_memo(obj):
    """Makes two dictionaries (one reverse of other) linking every instance of Persistent found
    in the attributes and collections of obj recursively, with the persistent id of that instance.
    Can cope with circular references and recursively nested objects"""
    def add_to_memo(item, id_to_obj, obj_to_id, checked):
        # Prevents checking the same object multiple times
        if id(item) in checked:
            return id_to_obj, obj_to_id, checked
        else:
            checked.add(id(item))
            if isinstance(item, Persistent):
                id_to_obj[item.persistent_id] = item
                obj_to_id[item] = item.persistent_id
        try:  # Try to add attributes of item to memo, recursively
            for sub_item in vars(item).values():
                add_to_memo(sub_item, id_to_obj, obj_to_id, checked)
        except TypeError:
            pass
        try:  # Try to add iterable elements of item to memo, recursively
            for sub_item in item:
                add_to_memo(sub_item, id_to_obj, obj_to_id, checked)
        except TypeError:
            pass
        return id_to_obj, obj_to_id, checked
    return add_to_memo(obj, {}, {}, set())[:2]

class PersistentPickler(pickle.Pickler):
    """ Normal pickler, but it takes a memo of the form {obj: persistent id}
    any object in that memo is pickled as its persistent id instead"""
    @staticmethod  # Because dumps is not defined for custom Picklers
    def dumps(obj_to_id_memo, obj):
        with io.BytesIO() as file:
            PersistentPickler(file, obj_to_id_memo).dump(obj)
            file.seek(0)
            return file.read()

    def __init__(self, file, obj_to_id_memo):
        super().__init__(file)
        self.obj_to_id_memo = obj_to_id_memo

    def persistent_id(self, obj):
        try:
            if obj in self.obj_to_id_memo and obj:
                return self.obj_to_id_memo[obj]
        except TypeError:  # If obj is unhashable
            pass
        return None

class PersistentUnPickler(pickle.Unpickler):
    """ Normal unpickler, but it takes a memo of the form {persistent id: obj}
    used to undo the effects of PersistentPickler"""
    @staticmethod  # Because loads is not defined for custom Unpicklers
    def loads(id_to_obj_memo, pickled_data):
        with io.BytesIO(pickled_data) as file:
            obj = PersistentUnPickler(file, id_to_obj_memo).load()
        return obj

    def __init__(self, file, id_to_obj_memo):
        super().__init__(file)
        self.id_to_obj_memo = id_to_obj_memo

    def persistent_load(self, pid):
        if pid in self.id_to_obj_memo:
            return self.id_to_obj_memo[pid]
        else:
            super().persistent_load(pid)
Use Example
class Alice(Persistent):
    """ Must have a single attribute saved as bob or claire """
    def __init__(self):
        super().__init__()
        self.shared = Shared()
        self.bob = Bob()
        self.claire = Claire()

    def add_some_data(self, x, y):
        self.nested = [self]
        self.nested.append(self.nested)
        self.shared.x = x
        self.shared.y = y

class Bob(Persistent):
    """ Can have arbitrary reference to itself and to Alice but must not touch Claire """
    def make_changes(self, alice, extra):
        self.value = alice.shared.x + alice.shared.y + extra
        self.attribute = alice.shared
        self.collection = [alice.bob, alice.shared]
        self.collection.append(self.collection)
        self.new = Shared()

class Claire(Persistent):
    """ Can have arbitrary reference to itself and to Alice but must not touch Bob """
    def make_changes(self, alice, extra):
        self.value = alice.shared.x * alice.shared.y * extra
        self.attribute = alice
        self.collection = {"claire": alice.claire, "shared": alice.shared}
        self.collection["collection"] = self.collection

class Shared(Persistent):
    pass

# Done on Alice's side
alice = Alice()
alice.add_some_data(2, 3)
outgoing = pickle.dumps(alice)

# Done on Bob's side
bobs_copy = pickle.loads(outgoing)
# Create a memo of the persistent_id of the received objects that are *not* being modified
_, bob_pickling_memo = make_persistent_memo(bobs_copy)
bob_pickling_memo.pop(bobs_copy.bob)
# Make changes and send everything back to Alice
bobs_copy.bob.make_changes(bobs_copy, 4)
bobs_reply = PersistentPickler.dumps(bob_pickling_memo, bobs_copy.bob)

# Same on Claire's side
claires_copy = pickle.loads(outgoing)
_, claire_pickling_memo = make_persistent_memo(claires_copy)
claire_pickling_memo.pop(claires_copy.claire)
claires_copy.claire.make_changes(claires_copy, 5)
claires_reply = PersistentPickler.dumps(claire_pickling_memo, claires_copy.claire)

# Done on Alice's side
alice_unpickling_memo, _ = make_persistent_memo(alice)
alice.bob = PersistentUnPickler.loads(alice_unpickling_memo, bobs_reply)
alice.claire = PersistentUnPickler.loads(alice_unpickling_memo, claires_reply)

# Check that Alice has received changes from Bob and Claire
print(alice.bob.value == bobs_copy.bob.value == 9,
      alice.claire.value == claires_copy.claire.value == 30)
# Check that all references match up as expected
print("Alice:", alice is alice.nested[0] is alice.nested[1][0] is alice.claire.attribute)
print("Bob:", (alice.bob is alice.nested[0].bob is alice.bob.collection[0] is
               alice.bob.collection[2][0]))
print("Claire:", (alice.claire is alice.nested[0].claire is alice.claire.collection["claire"] is
                  alice.claire.collection["collection"]["claire"]))
print("Shared:", (alice.shared is alice.bob.attribute is alice.bob.collection[1] is
                  alice.bob.collection[2][1] is alice.claire.collection["shared"] is
                  alice.claire.collection["collection"]["shared"] is not alice.bob.new))
Output
C>python test.py
True True
Alice: True
Bob: True
Claire: True
Shared: True
All exactly as required
Follow up
It feels like I'm reinventing the wheel here by doing my own nested introspection, can this be done better with existing tools?
My code feels rather inefficient, with a lot of introspection, can this be improved?
Can I be sure that add_to_memo() is not missing some references out?
The use of time.time() to create the persistent id feels rather clunky, is there a better alternative?
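On the last point, one candidate worth considering is the standard library's uuid module. This is only a sketch, not a tested drop-in replacement: it assumes that a random 128-bit identifier is unique enough for the use case, and it swaps uuid.uuid4() in for the id()/time.time() combination so the id no longer depends on memory addresses or synchronized clocks. Because the id is still stored as an ordinary string attribute, it would be pickled along with the rest of the instance as before.

```python
import uuid

class Persistent:
    def __init__(self):
        # uuid4() is random, so it does not depend on clocks or on id(self)
        self.persistent_id = uuid.uuid4().hex

a = Persistent()
b = Persistent()

# Two instances get different ids; each id is a 32-character hex string.
print(a.persistent_id != b.persistent_id)
print(len(a.persistent_id))  # 32
```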

Related

Coding a group of interrelated objects which are updated upon addition or removal of an object in the group

So I have a group of N persons each having their own unique id. Each person has a randomized opinion of each already existing person ranging from 0 to 100. Upon the addition of a new person, I'd like all existing persons to acquire a randomized opinion of this new person. Upon removal of an existing person, I'd like all remaining persons to remove their opinion of the removed person.
Here's what I have up to now:
import random

persons = {}

class Person():
    def __init__(self, idn):
        self.idn = idn
        self.opinions = {}
        for i in persons:
            self.opinions[i] = random.randrange(100)
        persons[idn] = self
        for i in persons:
            persons[i].update()

    def update(self):
        pass

for i in range(20):
    person_i = Person(i)
Now clearly the problem here is that only the last created object has opinions of all other persons. I was tinkering with creating a Person.update() function, but I have no clue how to proceed.
I was thinking, perhaps there is already somewhere a framework created to deal with this type of situation? (I would eventually hope to make even more complicated interrelations). The idea is having an object that holds a relationship to every other object in its group, and vice-versa for each other object in the group.
Any help is appreciated, especially resources to learn. I am a beginner at python.
Here is one approach for your reference. It only supports a single group of Person objects; if you need several groups, you have to specify a group key for each person. To remove a person, call person.delete() first.
import random

class Person():
    table = {}

    def __init__(self):
        self.key = self.get_key()
        self.opinions = {}
        for key in Person.table:
            self.opinions[key] = random.randrange(100)
        for person in Person.table.values():
            person.opinions[self.key] = random.randrange(100)
        Person.table[self.key] = self

    def get_key(self):
        i = 0
        while i in Person.table:
            i += 1
        return i

    def delete(self):
        del Person.table[self.key]
        for person in Person.table.values():
            del person.opinions[self.key]
        del self  # only unbinds the local name; callers must drop their own references

persons = [Person() for i in range(20)]
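If several independent groups are needed, one possible variation is to move the table out of the class and into a container object. The sketch below is an assumption-laden illustration rather than part of the answer above: the `Group` class and its `add`/`remove` methods are hypothetical names, and `Person` is reduced to a key plus an opinions dictionary.

```python
import random

class Person:
    def __init__(self, key):
        self.key = key
        self.opinions = {}         # key of another person -> opinion

class Group:
    """Hypothetical container: each group keeps its own table of members."""
    def __init__(self):
        self.members = {}          # key -> Person
        self._next_key = 0

    def add(self):
        person = Person(self._next_key)
        self._next_key += 1
        # The newcomer and every existing member form opinions of each other.
        for other in self.members.values():
            person.opinions[other.key] = random.randrange(100)
            other.opinions[person.key] = random.randrange(100)
        self.members[person.key] = person
        return person

    def remove(self, person):
        del self.members[person.key]
        # Everyone who remains forgets their opinion of the removed person.
        for other in self.members.values():
            del other.opinions[person.key]

friends = Group()
people = [friends.add() for _ in range(5)]
friends.remove(people[0])
print(sorted(people[1].opinions))  # [2, 3, 4]
```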

Python,unable to call initial arguments outside the class

The Python class below has an empty dictionary as an instance attribute. Calling createAccount() successfully adds data to the dictionary, but I can't access the dictionary outside the class.
What changes should I make to the code below to access the newly created account details?
Please note that the error occurs on the last line of the code.
class SavingsAccount():
    def __init__(self):
        self.savingsAccounts = {}

    def createAccount(self, name, initialDeposit):
        print()
        self.accountNumber = int(12345)
        self.savingsAccounts[self.accountNumber] = [name, initialDeposit]
        print("Account creation has been successful. Your account number is ", self.accountNumber)

SavingsAccount().createAccount(name = 'a',initialDeposit=4)
print(SavingsAccount().savingsAccounts[12345]) # getting error here
You should initialize your object using __init__:
class SavingsAccount:
    def __init__(self, name, initial_deposit):
        self.accountNumber = 12345
        self.savingsAccounts = {self.accountNumber : [name, initial_deposit] }
        print("Account creation has been successful. Your account number is ", self.accountNumber)

saving_account = SavingsAccount(name='a', initial_deposit=4)
print(saving_account.savingsAccounts)
Also, most Pythonistas prefer snake_case when naming variables.
You are creating a new instance of SavingsAccount with every call. After your call to createAccount completes, that instance is garbage-collected, as there are no references to it stored anywhere.
s = SavingsAccount()
s.createAccount(name='a', initialDeposit=4)
print(s.savingsAccounts[12345])
(See Taohidul Islam's answer for how you should be defining the class, though.)
The line that gives the error does the following:
Calls SavingsAccount.__init__() to create a new object
Asks for item 12345 in that object's dictionary (which was just created, so it's empty)
You should structure your code in a different way. You should have a single list of accounts (or a similar container) that is unique, and then insert the accounts you create into it.
You must first create an instance of your SavingsAccount class:
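As a rough sketch of that restructuring (the `Bank` class, the `create_account` method and the starting number are hypothetical, chosen only for illustration), a single long-lived object can own the dictionary of accounts, so data added by one call is still there for the next:

```python
class Bank:
    """Hypothetical single registry that owns every account."""
    def __init__(self):
        self.accounts = {}             # account number -> [name, balance]
        self._next_number = 12345      # assumed starting account number

    def create_account(self, name, initial_deposit):
        number = self._next_number
        self._next_number += 1
        self.accounts[number] = [name, initial_deposit]
        return number

bank = Bank()                          # created once and kept in a variable
number = bank.create_account(name="a", initial_deposit=4)
print(bank.accounts[number])           # ['a', 4]
```

The key difference from the failing code is that `bank` is stored in a variable, so the second line works on the same instance that the first line modified.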
#initialize savings account object
s = SavingsAccount()
#call created account method
s.createAccount(name="a", initialDeposit=4)
#print the account
print(s.savingsAccounts[12345])
That said, your data structure is confusing. Why not have one instance of a savings account object represent one individual's account? Then you could just assign member variables for the values you want to track.
class SavingsAccount:
    def __init__(self, name, initial_deposit):
        self.account_name = name
        self.bal = initial_deposit

    def deposit(self, val):
        self.bal += val

    def return_account(self):
        return self.__dict__

Now you can use it more simply:
s = SavingsAccount("name", 500)
s.deposit(500)
acc = s.return_account()
print(acc)
>> {'account_name': 'name', 'bal': 1000}

Pass request to different class based on condition Python

I am designing an API which deals with different type of vehicles for example cars, buses and vans. There is an endpoint POST /vehicle which will take a defined body.
{
'registration': 'str',
'date_of_purchase': 'datetime str',
'vehicle_type': 'car'
...
}
The first thing that we do is load an object based on the vehicle_type.
if body['vehicle_type'] == 'car':
    passed_in_vehicle = Car(body)
elif body['vehicle_type'] == 'bus':
    passed_in_vehicle = Bus(body)
...
Within the python program there are multiple classes including:
vehicles.py
cars.py
vans.py
buses.py
The API entry point goes to vehicles.py which does some logic and then depending on the input will route it to one of the other classes to do more specific logic.
Each class has the same set of base methods.
update_vehicle_specs
register_vehicle
At the moment in the bottom of vehicles.py I have
if isinstance(passed_in_vehicle, Car):
    carService = Cars()
    carService.register_vehicle(passed_in_vehicle)
elif isinstance(passed_in_vehicle, Van):
    vanService = Vans()
    vanService.register_vehicle(passed_in_vehicle)
...
However, this does not scale. What is the correct way to route to specific classes based on a condition?
The way it looks to me, you can improve design a bit. Since each vehicle 'knows' where it is serviced, you should probably do in each something like
def __init__(self, ..., service):
    super().__init__(...)
    ...
    ...
    service.register_vehicle(self)
this is coming from the perspective that "the car drives to the garage to register" rather than the "garage looks for new cars".
When you initialize a vehicle, pass it its servicing class, as in
new_car = Car(..., my_car_service)
where
my_car_service = Cars()
somewhere before. This is since we assume each new car needs to register, and we do not make a new service for each new car. Another approach, is for the class to contain (composition) its service object from the get go:
class Car:
    service = Cars()

    def __init__(...):
        ...
        Car.service.register_vehicle(self)
so the class contains the servicing object always, and all initialized objects share the same one through a class variable. I actually prefer this.
First initialization:
With regard to your initial initialization of the vehicle, while using locals might be a nifty solution for that, it might be a bit more type safe to have something like what you have. In Python I like to use the "Python switch dictionary" (TM) for stuff like that:
{'car': Car,
 'bus': Bus,
 ...
 'jet': Jet
}[body['vehicle_type']](body)
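Put together, the dictionary lookup and the shared service object might look like the sketch below. Everything here is an assumption made for illustration: the stub `Car` and `Bus` classes, the generic `Service` class standing in for Cars/Buses, and the `handle_post` function are hypothetical names, not part of the API described in the question.

```python
class Car:
    def __init__(self, body):
        self.registration = body["registration"]

class Bus:
    def __init__(self, body):
        self.registration = body["registration"]

class Service:
    """Hypothetical stand-in for the Cars/Buses service classes."""
    def __init__(self):
        self.registered = []

    def register_vehicle(self, vehicle):
        self.registered.append(vehicle.registration)

# One table replaces both if/elif chains: type name -> (class, shared service).
VEHICLE_TYPES = {
    "car": (Car, Service()),
    "bus": (Bus, Service()),
}

def handle_post(body):
    try:
        vehicle_class, service = VEHICLE_TYPES[body["vehicle_type"]]
    except KeyError:
        raise ValueError("unknown vehicle_type: {!r}".format(body.get("vehicle_type")))
    vehicle = vehicle_class(body)
    service.register_vehicle(vehicle)
    return vehicle

car = handle_post({"registration": "ABC123", "vehicle_type": "car"})
print(type(car).__name__, VEHICLE_TYPES["car"][1].registered)  # Car ['ABC123']
```

Adding a new vehicle type then means adding one entry to the table rather than extending two conditionals.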
I don't know if using locals would be ideal, but at least I believe it would scale better than your current solution:
Use the string name to look up the class in the dictionary returned by the built-in locals(), which also contains any imported names:
class Car(object):
    def print(self):
        print("I'm a car")

class Bus(object):
    def print(self):
        print("I'm a bus")

vehicle_type = "Car"
currV_class = locals()[vehicle_type]
currV_instance = currV_class()
#currV_instance.register_vehicle
currV_instance.print()

vehicle_type = "Bus"
currV_class = locals()[vehicle_type]
currV_instance = currV_class()
#currV_instance.register_vehicle
currV_instance.print()
OUTPUT:
I'm a car
I'm a bus

How to print actual name of variable class type in function?

I'm trying to return the variable name, but I keep getting this:
<classes.man.man object at (some numbers (as example:0x03BDCA50))>
Below is my code:
from classes.man import man

def competition(guy1, guy2, counter1=0, counter2=0):
    .......................
    some *ok* manipulations
    .......................
    if counter1>counter2:
        return guy1

bob = man(172, 'green')
bib = man(190, 'brown')
print(competition(bob , bib ))
Epilogue
If anyone is willing, please explain what I can write instead of __class__ in the example below to get the variable name.
def __repr__(self):
    return self.__class__.__name__
Anyway, thank you for all of your support
There are different ways to approach your problem.
The simplest I can fathom is if you can change the class man, make it accept an optional name in its __init__ and store it in the instance. This should look like this:
class man:
    def __init__(self, number, color, name="John Doe"):
        self.name = name
        # rest of your code here
That way, in your function you could just do:
return guy1.name
Additionally, if you want to go an extra step, you could define a __str__ method in your class man so that when you pass it to str() or print(), it shows the name instead:
# Inside class man
def __str__(self):
    return self.name
That way your function could just do:
return guy1
And when you print the return value of your function it actually prints the name.
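A self-contained sketch of this first approach follows. The meaning of the two positional arguments is not given in the question, so the parameter names `height` and `color` are assumptions, and the comparison inside `competition` is a placeholder rule standing in for the elided logic.

```python
class man:
    def __init__(self, height, color, name="John Doe"):
        self.height = height       # assumed meaning of the first argument
        self.color = color
        self.name = name

    def __str__(self):
        return self.name

def competition(guy1, guy2):
    # Placeholder rule (an assumption): the taller one wins, None on a tie.
    if guy1.height > guy2.height:
        return guy1
    if guy2.height > guy1.height:
        return guy2
    return None

bob = man(172, "green", name="bob")
bib = man(190, "brown", name="bib")
winner = competition(bob, bib)
print(winner)  # bib
```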
If you cannot alter class man, here is an extremely convoluted and costly suggestion, that could probably break depending on context:
import inspect

def competition(guy1, guy2, counter1=0, counter2=0):
    guy1_name = ""
    guy2_name = ""
    # Search the variables of the outermost frame for names bound to each object
    for name, value in inspect.stack()[-1].frame.f_locals.items():
        if value is guy1:
            guy1_name = name
        elif value is guy2:
            guy2_name = name
    if counter1 > counter2:
        return guy1_name
    elif counter2 > counter1:
        return guy2_name
    else:
        return "Noone"
Valentin's answer - the first part of it at least (adding a name attribute to man) - is of course the proper, obvious solution.
Now, regarding the second part (the inspect.stack hack): it's brittle at best. The "variable names" we're interested in might not necessarily be defined in the frame being inspected, and they could just as well come from a dict, etc.
Also, it's definitely not the competition() function's responsibility to care about this (don't mix the domain layer with the presentation layer), and it's unnecessary since the calling code can easily solve this part by itself:
def competition(guy1, guy2, counter1=0, counter2=0):
    .......................
    some *ok* manipulations
    .......................
    if counter1>counter2:
        return guy1

def main():
    bob = man(172, 'green')
    bib = man(190, 'brown')
    winner = competition(bob, bib)
    if winner is bob:
        print("bob wins")
    elif winner is bib:
        print("bib wins")
    else:
        print("tie!")
By default, Python prints the type and memory location of an object passed to the print() function. If you want prettier output for a class, you need to define the __repr__(self) method for that class; it should return the string that is printed when an object is passed to print(). Then you can just return guy1.
__repr__ is the method that defines the name in your case.
By default it gives you the object type information. If you want to print a more suitable name, you should override the __repr__ method.
See the code below for an example:
class class_with_overrided_repr:
    def __repr__(self):
        return "class_with_overrided_repr"

class class_without_overrided_repr:
    pass

x = class_with_overrided_repr()
print(x) # class_with_overrided_repr
x = class_without_overrided_repr()
print(x) # e.g. <__main__.class_without_overrided_repr object at 0x7f06002aa368>
Let me know if this is what you want.

Redefine Class Instances in Python

I am migrating a project from being littered with global variables to having a structure defined by classes in a separate module. This is my first time really using OOP, so I want to understand whether it is safe to re-define an instance of a class or whether my code is missing something.
At the top of my code, I import my module -
import configparser
import NHLGameEvents

config = configparser.ConfigParser()
config.read('config.ini')
TEAM_BOT = config['DEFAULT']['TEAM_NAME']
I then build two Team objects (defined in my NHLGameEvents module).
game_today, game_info = is_game_today(get_team(TEAM_BOT))
awayteam_info = game_info["teams"]["away"]["team"]
awayteamobj_name = awayteam_info["name"]
awayteamobj_shortname = awayteam_info["teamName"]
awayteamobj_tri = awayteam_info["abbreviation"]
away_team_obj = NHLGameEvents.Team(
awayteamobj_name, awayteamobj_shortname, awayteamobj_tri, "away")
game_obj.register_team(away_team_obj, "away")
hometeam_info = game_info["teams"]["home"]["team"]
hometeamobj_name = hometeam_info["name"]
hometeamobj_shortname = hometeam_info["teamName"]
hometeamobj_tri = hometeam_info["abbreviation"]
home_team_obj = NHLGameEvents.Team(
hometeamobj_name, hometeamobj_shortname, hometeamobj_tri, "home")
game_obj.register_team(home_team_obj, "home")
home_team_obj.preferred = bool(home_team_obj.team_name == TEAM_BOT)
away_team_obj.preferred = bool(away_team_obj.team_name == TEAM_BOT)
In some instances, I want to reference these Team objects as preferred and other as opposed to home / away so I use a method defined in my Game class to retrieve that. Since my Game object knows about both of my Teams, the method in my Game class that returns this Tuple is -
def register_team(self, team, key):
    """Registers a team to the instance of the Game."""
    if key not in ('home', 'away'):
        raise AttributeError(
            "Key '{}' is not valid - Team key can only be home or away.".format(key))
    if len(self.teams) > 1:
        raise ValueError(
            "Too many teams! Cannot register {} for {}".format(team, self))
    self.teams[key] = team
    team.game = self
    team.tv_channel = self.broadcasts[key]

def get_preferred_team(self):
    """Returns a Tuple of team objects of the preferred & other teams."""
    if self.teams["home"].preferred is True:
        return (self.teams["home"], self.teams["away"])
    return (self.teams["away"], self.teams["home"])
I can then retrieve that information from anywhere in my script.
preferred_team_obj, other_team_obj = game_obj.get_preferred_team()
Is it safe to redefine these class instances (ex - home_team_obj also known as preferred_team_obj) or should I just use an if statement whenever I want to reference these, such as -
if home_team_obj.preferred:
    # Do something with home_team_obj
else:
    # Do something with away_team_obj
As a follow-up to this question: it seems that it is totally safe to assign an object to another name and use that name later in the code (as in the example below).
preferred_team = game.preferred_team
preferred_homeaway = preferred_team.home_away
on_ice = json_feed["liveData"]["boxscore"]["teams"][preferred_homeaway]["onIce"]
players = json_feed["gameData"]["players"]
if recent_event(play):
    get_lineup(game, event_period, on_ice, players)
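The reason this is safe can be checked directly: assignment in Python binds another name to the same object and copies nothing. The sketch below uses a stripped-down, hypothetical `Team` class (not the one from NHLGameEvents) and an invented `goals` attribute purely to demonstrate that.

```python
class Team:
    """Hypothetical minimal stand-in for NHLGameEvents.Team."""
    def __init__(self, name, preferred=False):
        self.name = name
        self.preferred = preferred

home_team_obj = Team("Home", preferred=True)
away_team_obj = Team("Away")

# Same selection rule as get_preferred_team(): this only binds two extra
# names to the existing objects; nothing is copied or redefined.
if home_team_obj.preferred:
    preferred_team_obj, other_team_obj = home_team_obj, away_team_obj
else:
    preferred_team_obj, other_team_obj = away_team_obj, home_team_obj

preferred_team_obj.goals = 3           # mutate through one name...
# ...and the change is visible through the other name.
print(preferred_team_obj is home_team_obj, home_team_obj.goals)  # True 3
```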
