How to dynamically convert non-existing class member functions to existing ones - python

I have a class:
class DataReader:
    def get_data(self, name):
        ...  # fetch the data for the given name

It's fine to use it as follows:
reader = DataReader()
a = reader.get_data('a')
b = reader.get_data('b')
c = reader.get_data('c')
...
Is it possible to write code like the following instead:
a = reader.get_a()
b = reader.get_b()
c = reader.get_c()
With the current code this fails, since class DataReader has no method named get_a(). What I want is to make DataReader support methods like get_a and automatically translate them to self.get_data('a'), without actually defining the get_xxx methods one by one.
Here, a, b, c can be any string, and I cannot know all of them when defining the DataReader class. So let me ask my question another way: is there some shortcut to make DataReader support all (infinitely many) get_xxx methods (where xxx can be any string), as if I had defined infinitely many methods like:
class DataReader:
    def get_a(self): return self.get_data('a')
    def get_b(self): return self.get_data('b')
    ...
    def get_z(self): return self.get_data('z')
    def get_aa(self): return self.get_data('aa')
    ...
    def get_asdf(self): return self.get_data('asdf')
    ...
    def get_okjgoke(self): return self.get_data('okjgoke')
    ...

One approach is to have DataReader define the __getattr__ special method (which is invoked only when normal attribute lookup fails):
class DataReader:
    def __init__(self, data):
        self.items = data.copy()

    def __getattr__(self, attr):
        if attr.startswith('get_'):
            return lambda: self.items[attr.split('get_')[-1]]
        raise AttributeError('Attribute "{}" not found.'.format(attr))

d = DataReader({'a': 1, 'b': 2})
print(d.get_a())
print(d.get_b())
Prints:
1
2
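Names that don't match the get_ pattern still fail loudly, as usual:

>>> d.foo
Traceback (most recent call last):
  ...
AttributeError: Attribute "foo" not found.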

Your approach of passing the name to get_data seems pretty reasonable to me. But if you insist on attribute-based lookups, you can override __getattr__ and delegate to get_data in there, e.g.:
class DataReader:
    def __getattr__(self, attr):
        parts = attr.partition('_')
        if parts[0] == 'get' and parts[-1] != 'data':
            return self.get_data(parts[-1])
        raise AttributeError(attr)  # plain object has no __getattr__ to defer to

    def get_data(self, name):
        return name
Now you can use DataReader().get_a to get DataReader().get_data('a').
If you want to get the value from a call like DataReader().get_a() instead of the bare attribute DataReader().get_a, you can tuck in a lambda:
class DataReader:
    def __getattr__(self, attr):
        parts = attr.partition('_')
        if parts[0] == 'get' and parts[-1] != 'data':
            return lambda: self.get_data(parts[-1])
        raise AttributeError(attr)

    def get_data(self, name):
        return name
Now you can do DataReader().get_a().
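A variation (my sketch, not part of the original answer): return functools.partial instead of a lambda, so the bound name shows up in the callable's repr:

import functools

class DataReader:
    def __getattr__(self, attr):
        parts = attr.partition('_')
        if parts[0] == 'get' and parts[-1] != 'data':
            # freeze the name into a callable, like the lambda above
            return functools.partial(self.get_data, parts[-1])
        raise AttributeError(attr)

    def get_data(self, name):
        return name  # stand-in for the real lookup

print(DataReader().get_a())  # a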

Related

Serialize and deserialize objects from user-defined classes

Suppose I have a class hierarchy like this:

class SerializableWidget(object):
    # some code

class WidgetA(SerializableWidget):
    # some code

class WidgetB(SerializableWidget):
    # some code
I want to be able to serialize instances of WidgetA and WidgetB (and potentially other widgets) to text files as json. Then, I want to be able to deserialize those without knowing their specific class beforehand:

some_widget = deserialize_from_file(file_path)  # pseudocode, doesn't have to be exactly a method like this

and some_widget needs to be constructed from the precise subclass of SerializableWidget. How do I do this? What methods exactly do I need to override/implement in each of the classes of my hierarchy?
Assume all fields of the above classes are primitive types. Do I override some __to_json__ and __from_json__ methods, something like that?
You can solve this in many ways. One example is to use the object_hook and default parameters of json.load and json.dump respectively.
All you need is to store the class name together with the serialized version of the object; then, when loading, you use a mapping of which class goes with which name.
The example below uses a dispatcher class decorator to store the class name alongside the object when serializing, and to look the class up later when deserializing. All you need is an _as_dict method on each class to convert the data to a dict:
import json

@dispatcher
class Parent(object):
    def __init__(self, name):
        self.name = name

    def _as_dict(self):
        return {'name': self.name}

@dispatcher
class Child1(Parent):
    def __init__(self, name, n=0):
        super().__init__(name)
        self.n = n

    def _as_dict(self):
        d = super()._as_dict()
        d['n'] = self.n
        return d

@dispatcher
class Child2(Parent):
    def __init__(self, name, k='ok'):
        super().__init__(name)
        self.k = k

    def _as_dict(self):
        d = super()._as_dict()
        d['k'] = self.k
        return d
Now for the tests. First let's create a list with three objects of different types.
>>> obj = [Parent('foo'), Child1('bar', 15), Child2('baz', 'works')]
Serializing it will yield the data with the class name in each object:
>>> s = json.dumps(obj, default=dispatcher.encoder_default)
>>> print(s)
[
{"__class__": "Parent", "name": "foo"},
{"__class__": "Child1", "name": "bar", "n": 15},
{"__class__": "Child2", "name": "baz", "k": "works"}
]
And loading it back generates the correct objects:
obj2 = json.loads(s, object_hook=dispatcher.decoder_hook)
print(obj2)
[
<__main__.Parent object at 0x7fb6cd561cf8>,
<__main__.Child1 object at 0x7fb6cd561d68>,
<__main__.Child2 object at 0x7fb6cd561e10>
]
Finally, here's the implementation of dispatcher:

class _Dispatcher:
    def __init__(self, classname_key='__class__'):
        self._key = classname_key
        self._classes = {}  # to keep a reference to the classes used

    def __call__(self, class_):  # decorate a class
        self._classes[class_.__name__] = class_
        return class_

    def decoder_hook(self, d):
        classname = d.pop(self._key, None)
        if classname:
            return self._classes[classname](**d)
        return d

    def encoder_default(self, obj):
        d = obj._as_dict()
        d[self._key] = type(obj).__name__
        return d

dispatcher = _Dispatcher()
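Since the question mentions text files: the same two hooks plug straight into json.dump and json.load for a file round-trip (a small usage sketch; widgets.json is just an example path):

with open('widgets.json', 'w') as f:
    json.dump(obj, f, default=dispatcher.encoder_default)

with open('widgets.json') as f:
    obj2 = json.load(f, object_hook=dispatcher.decoder_hook)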
I really liked @nosklo's answer, but I wanted to customize how the model type gets saved, so I extended his code a little to add a sub-annotation.
(I know this isn't directly related to the question, but you can use this to serialize to json too, since it produces dict objects. Note that your base class must use the @dataclass decorator to serialize correctly - otherwise you could adjust this code to define an _as_dict method as in @nosklo's answer.)
data.csv:
model_type, prop1
sub1, testfor1
sub2, testfor2
test.py:

import csv
from abc import ABC
from dataclasses import dataclass

from polymorphic import polymorphic

@polymorphic(keyname="model_type")
@dataclass
class BaseModel(ABC):
    prop1: str

@polymorphic.subtype_when_(keyval="sub1")
class SubModel1(BaseModel):
    pass

@polymorphic.subtype_when_(keyval="sub2")
class SubModel2(BaseModel):
    pass

with open('data.csv') as csvfile:
    reader = csv.DictReader(csvfile, skipinitialspace=True)
    for row_data_dict in reader:
        price_req = BaseModel.deserialize(row_data_dict)
        print(price_req, '\n\tre-serialized: ', price_req.serialize())
polymorphic.py:

import dataclasses
import functools
from abc import ABC
from typing import Type

# https://stackoverflow.com/a/51976115
class _Polymorphic:
    def __init__(self, keyname='__class__'):
        self._key = keyname
        self._class_mapping = {}

    def __call__(self, abc: Type[ABC]):
        functools.update_wrapper(self, abc)
        setattr(abc, '_register_subtype', self._register_subtype)
        setattr(abc, 'serialize', lambda self_subclass: self.serialize(self_subclass))
        setattr(abc, 'deserialize', self.deserialize)
        return abc

    def _register_subtype(self, keyval, cls):
        self._class_mapping[keyval] = cls

    def serialize(self, self_subclass) -> dict:
        my_dict = dataclasses.asdict(self_subclass)
        my_dict[self._key] = next(
            keyval for keyval, cls in self._class_mapping.items()
            if cls == type(self_subclass)
        )
        return my_dict

    def deserialize(self, data: dict):
        classname = data.pop(self._key, None)
        if classname:
            return self._class_mapping[classname](**data)
        raise ValueError(f'Invalid data: {self._key} was not found or it referred to an unrecognized class')

    @staticmethod
    def subtype_when_(*, keyval: str):
        def register_subtype_for(_cls):  # _cls is the subclass being decorated
            nonlocal keyval
            if not keyval:
                keyval = _cls.__name__
            _cls._register_subtype(keyval, _cls)

            @functools.wraps(_cls)
            def construct_original_subclass(*args, **kwargs):
                return _cls(*args, **kwargs)

            return construct_original_subclass
        return register_subtype_for

polymorphic = _Polymorphic
Sample console output:
SubModel1(prop1='testfor1')
re-serialized: {'prop1': 'testfor1', 'model_type': 'sub1'}
SubModel2(prop1='testfor2')
re-serialized: {'prop1': 'testfor2', 'model_type': 'sub2'}

Use metaclass to allow forward declarations

I want to do something decidedly unpythonic. I want to create a class that allows for forward declarations of its class attributes. (If you must know, I am trying to make some sweet syntax for parser combinators.)
This is the kind of thing I am trying to make:
a = 1

class MyClass(MyBaseClass):
    b = a      # Refers to something outside the class
    c = d + b  # Here's a forward declaration to 'd'
    d = 1      # Declaration resolved
My current direction is to make a metaclass so that when d is not found I catch the NameError exception and return an instance of some dummy class I'll call ForwardDeclaration. I take some inspiration from AutoEnum, which uses metaclass magic to declare enum values with bare identifiers and no assignment.
Below is what I have so far. The missing piece is: how do I continue normal name resolution and catch the NameErrors:
class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    def __getitem__(self, key):
        try:
            ### WHAT DO I PUT HERE ??? ###
            # How do I continue name resolution to see if the
            # name already exists in the scope of the class?
            ...
        except NameError:
            if key in self._forward_declarations:
                return self._forward_declarations[key]
            else:
                new_forward_declaration = ForwardDeclaration()
                self._forward_declarations[key] = new_forward_declaration
                return new_forward_declaration

class MyMeta(type):
    @classmethod  # note: __prepare__ must be a classmethod to receive these arguments
    def __prepare__(mcs, name, bases):
        return MetaDict()

class MyBaseClass(metaclass=MyMeta):
    pass

class ForwardDeclaration:
    # Minimal behavior
    def __init__(self, value=0):
        self.value = value

    def __add__(self, other):
        return ForwardDeclaration(self.value + other)
To start with:
def __getitem__(self, key):
    try:
        return super().__getitem__(key)
    except KeyError:
        ...
But that won't allow you to retrieve the global variables outside the class body.
You can also use the __missing__ method, which is reserved exactly for subclasses of dict:
class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    # Just leave __getitem__ as it is on "dict"

    def __missing__(self, key):
        if key in self._forward_declarations:
            return self._forward_declarations[key]
        else:
            new_forward_declaration = ForwardDeclaration()
            self._forward_declarations[key] = new_forward_declaration
            return new_forward_declaration
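As a quick sanity check, here is this MetaDict wired into the question's ForwardDeclaration, MyMeta and MyBaseClass (a minimal sketch; it relies on the @classmethod fix to __prepare__ shown in the question's code above):

class MyClass(MyBaseClass):
    c = d + 1  # 'd' is unknown here, so __missing__ hands back a ForwardDeclaration
    d = 1

print(type(MyClass.c).__name__, MyClass.c.value)  # ForwardDeclaration 1
print(MyClass.d)                                  # 1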
As you can see, that is not that unpythonic - advanced Python projects such as SymPy and SQLAlchemy have to resort to this kind of behavior to do their nice magic - just be sure to get it very well documented and tested.
Now, to allow for global (module) variables, you have to go a little out of your way - and use something that may not be available in all Python implementations - namely, introspecting the frame where the class body is being executed to get its globals:
import sys
...

class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    # Just leave __getitem__ as it is on "dict"

    def __missing__(self, key):
        class_body_globals = sys._getframe().f_back.f_globals
        if key in class_body_globals:
            return class_body_globals[key]
        if key in self._forward_declarations:
            return self._forward_declarations[key]
        else:
            new_forward_declaration = ForwardDeclaration()
            self._forward_declarations[key] = new_forward_declaration
            return new_forward_declaration
Now that you are here - your special dictionaries are good enough to avoid NameErrors, but your ForwardDeclaration objects are far from smart enough - when running:
a = 1

class MyClass(MyBaseClass):
    b = a      # Refers to something outside the class
    c = d + b  # Here's a forward declaration to 'd'
    d = 1
What happens is that c becomes a ForwardDeclaration object, but summed with the instantaneous value of d, which is zero. On the next line, d is simply overwritten with the value 1 and is no longer a lazy object. So you might just as well have declared c = 0 + b.
To overcome this, ForwardDeclaration has to be designed in a smart way, so that its values are always lazily evaluated and it behaves as in the "reactive programming" approach: i.e. updates to a value cascade into all other values that depend on it. I think a full implementation of a working "reactive"-aware ForwardDeclaration class falls outside the scope of this question - I have some toy code to do that on github at https://github.com/jsbueno/python-react , though.
Even with a proper "reactive" ForwardDeclaration class, you have to fix your dictionary again so that the d = 1 assignment works:
class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    def __setitem__(self, key, value):
        if key in self._forward_declarations:
            self._forward_declarations[key] = value
            # Trigger your reactive update here if your approach is not
            # automatic
            return None
        return super().__setitem__(key, value)

    def __missing__(self, key):
        # as above
        ...
And finally, there is a way to avoid having to implement a fully reactive-aware class - you can resolve all pending forward declarations in the __new__ method of the metaclass, so that your ForwardDeclaration objects are manually "frozen" at class creation time and there are no further worries.
Something along these lines:
from functools import reduce

sentinel = object()

class ForwardDeclaration:
    # Minimal behavior
    def __init__(self, value=sentinel, dependencies=None):
        self.dependencies = dependencies or []
        self.value = value

    def __add__(self, other):
        if isinstance(other, ForwardDeclaration):
            return ForwardDeclaration(dependencies=self.dependencies + [self])
        return ForwardDeclaration(self.value + other)

class MyMeta(type):
    def __new__(metacls, name, bases, attrs):
        for key, value in list(attrs.items()):
            if not isinstance(value, ForwardDeclaration):
                continue
            if any(v.value is sentinel for v in value.dependencies):
                continue
            attrs[key] = reduce(lambda a, b: a + b.value, value.dependencies, 0)
        return super().__new__(metacls, name, bases, attrs)

    @classmethod
    def __prepare__(mcs, name, bases):
        return MetaDict()
And, depending on your class hierarchy and what exactly you are doing, remember to also update one class's _forward_declarations dict with the _forward_declarations created on its ancestors.
AND if you need any operator other than +, as you will have noted, you will have to keep information about the operator itself - at that point, you might as well just use SymPy.

Unclear descriptor caller reference evaluation

I am using Python descriptors to create complex interfaces on host objects.
I don't get the behaviour I would intuitively expect when I run code such as this:
class Accessor(object):
    def __get__(self, inst, instype):
        self._owner = inst
        return self

    def set(self, value):
        self._owner._val = value

    def get(self):
        if hasattr(self._owner, '_val'):
            return self._owner._val
        else:
            return None

class TestClass(object):
    acc = Accessor()

source = TestClass()
destination = TestClass()

source.acc.set('banana')
destination.acc.set('mango')
destination.acc.set(source.acc.get())

print destination.acc.get()
# Result: mango
I would expect in this case for destination.acc.get() to return 'banana', not 'mango'.
However, the intention (to copy _val from 'source' to 'destination') works if the code is refactored like this:
val = source.acc.get()
destination.acc.set(val)
print destination.acc.get()
# Result: banana
What is it that breaks down the 'client' reference passed through get when the descriptor calls are chained in a single line versus broken into separate lines? Is there a way to get the behaviour I would intuitively expect?
Many thanks in advance.
K
Your implementation ALMOST works. The problem with it comes up with destination.acc.set(source.acc.get()). What happens is that it first looks up destination.acc, which will set _owner to destination, but before it can call set(), it has to resolve the parameter, source.acc.get(), which will end up setting acc's _owner to source.
Since destination.acc and source.acc are the same object (descriptors are stored on the class, not the instance), you're calling set() on it after its _owner is set to source. That means you're setting source._val, not destination._val.
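You can see the aliasing directly: both attribute lookups hand back the very same descriptor instance.

>>> source.acc is destination.acc
True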
The way to get the behavior you would intuitively expect is to get rid of your get() and set() and replace them with __get__() and __set__(), so that your descriptor can be used the way a descriptor is meant to be used.
class Accessor(object):
    def __get__(self, instance, owner):  # you should use the conventional parameter names
        if instance is None:
            return self
        else:
            return instance._val

    def __set__(self, instance, value):
        instance._val = value
Then you could rewrite your client code as
source = TestClass()
destination = TestClass()
source.acc = 'banana'
destination.acc = 'mango'
destination.acc = source.acc
print destination.acc
The point of descriptors is to replace explicit getter and setter calls with implicit ones that look like simple attribute use. If you still want to be using your getters and setters on Accessor, then don't make it a descriptor. Do this instead:
class Accessor(object):
    def get(self):
        if hasattr(self, '_val'):
            return self._val
        else:
            return None

    def set(self, val):
        self._val = val
Then rewrite TestClass to look more like this:
class TestClass(object):
    def __init__(self):
        self.acc = Accessor()
After that, your original client code would work.
I already said why it's not working in my other post. So, here's a way to use a descriptor while still retaining your get() and set() methods.
class Accessor(object):
    def __get__(self, instance, owner):
        if instance is None:
            return self
        elif not hasattr(instance, '_val'):
            setattr(instance, '_val', Acc())
        return getattr(instance, '_val')

class Acc(object):
    def get(self):
        if hasattr(self, '_val'):
            return self._val
        else:
            return None

    def set(self, val):
        self._val = val

class TestClass(object):
    acc = Accessor()
The trick is to move the get() and set() methods to a new class that is returned instead of returning self from the descriptor.
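With that version, the question's original one-liner behaves as intuitively expected (print called with parentheses so the sketch runs on Python 2 or 3):

source = TestClass()
destination = TestClass()

source.acc.set('banana')
destination.acc.set('mango')
destination.acc.set(source.acc.get())

print(destination.acc.get())
# Result: banana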

Dynamically generate method from string?

I have a dict of different types for which I want to add a simple getter based on the name of the actual parameter.
For example, for three storage parameters, let's say:
self.storage = {'total':100,'used':88,'free':1}
I am looking now for a way (if possible?) to generate a function on the fly with some meta-programming magic.
Instead of
class spaceObj(object):
    def getSize(self, what='total'):
        return self.storage[what]
or hard coding
@property
def getSizeTotal(self):
    return self.storage['total']
but
class spaceObj(object):
    # manipulating the object's index with some magic
    @property
    def getSize:
        return ???

so that calling mySpaceObj.getSizeFree would be derived - with getSize only defined once in the object and related functions derived from it by manipulating the object's function list.
Is something like that possible?
While it is certainly possible to expose an unknown attribute from a class as if it were a property, this is not a very pythonic approach (__getattr__ magic methods are rather rubyist):
class spaceObj(object):
    storage = None

    def __init__(self):  # this is for testing only
        self.storage = {'total': 100, 'used': 88, 'free': 1}

    def __getattr__(self, item):
        if item[:7] == 'getSize':  # check if an undefined attribute starts with this
            return self.getSize(item[7:])
        raise AttributeError(item)  # fail normally for everything else

    def getSize(self, what='total'):
        return self.storage[what.lower()]

print(spaceObj().getSizeTotal)  # 100
You can put the values into the object as properties:
class SpaceObj(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

storage = {'total': 100, 'used': 88, 'free': 1}
o = SpaceObj(**storage)
print o.total
or
o = SpaceObj(total=100, used=88, free=1)
print o.total
or using __getattr__:
class SpaceObj(object):
    def __init__(self, **kwargs):
        self.storage = kwargs

    def __getattr__(self, name):
        return self.storage[name]

o = SpaceObj(total=100, used=88, free=1)
print o.total
The latter approach takes a bit more code, but it's safer: if you have a method foo and someone creates an instance with SpaceObj(foo=1), the method would be shadowed by the first approach.
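To make that shadowing concrete (foo here is a hypothetical method name):

class SpaceObj(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    def foo(self):
        return 'method'

o = SpaceObj(foo=1)
print(o.foo)  # 1 -- the instance attribute shadows the method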
Another option (a Python 2 session; the new module no longer exists in Python 3) is to compile a function from a string at runtime:
>>> import new
>>> funcstr = "def wat(): print \"wat\";return;"
>>> funcbin = compile(funcstr,'','exec')
>>> ns = {}
>>> exec funcbin in ns
>>> watfunction = new.function(ns["wat"].func_code,globals(),"wat")
>>> globals()["wat"]=watfunction
>>> wat()
wat
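A rough Python 3 equivalent of the same trick, using types.FunctionType instead of the removed new module (my sketch):

import types

funcstr = 'def wat():\n    print("wat")\n'
ns = {}
exec(compile(funcstr, '<generated>', 'exec'), ns)
wat = types.FunctionType(ns['wat'].__code__, globals(), 'wat')
wat()  # prints: wat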

Exposing dict values via properties

I have this (Py2.7.2):
class MyClass(object):
    def __init__(self, dict_values):
        self.values = dict_values
        self.changed_values = {}  # this should track changes done to the values
        ...
I can use it like this:
var = MyClass()
var.values['age'] = 21
var.changed_values['age'] = 21
But I want to use it like this:
var.age = 21
print var.changed_values #prints {'age':21}
I suspect I can use properties to do that, but how?
UPDATE:
I don't know the dict contents at design time; they will only be known at run time. And the dict will likely not be empty.
You can create a class that inherits from dict and override the needed methods:
class D(dict):
    def __init__(self):
        self.changed_values = {}
        self.__initialized = True

    def __setitem__(self, key, value):
        self.changed_values[key] = value
        super(D, self).__setitem__(key, value)

    def __getattr__(self, item):
        """Maps values to attributes.
        Only called if there *isn't* an attribute with this name.
        """
        try:
            return self.__getitem__(item)
        except KeyError:
            raise AttributeError(item)

    def __setattr__(self, item, value):
        """Maps attributes to values, but only once we are initialised."""
        if '_D__initialized' not in self.__dict__:
            # this test allows attributes to be set in the __init__ method
            return dict.__setattr__(self, item, value)
        elif item in self.__dict__:
            # any normal attributes are handled normally
            dict.__setattr__(self, item, value)
        else:
            self.__setitem__(item, value)
a = D()
a['hi'] = 'hello'
print a.hi
print a.changed_values
a.hi = 'wow'
print a.hi
print a.changed_values
a.test = 'test1'
print a.test
print a.changed_values
output
>>hello
>>{'hi': 'hello'}
>>wow
>>{'hi': 'wow'}
>>test1
>>{'hi': 'wow', 'test': 'test1'}
Properties (descriptors, really) will only help if the set of attributes to monitor is bounded. Simply file the new value away in the __set__() method of the descriptor.
If the set of attributes is arbitrary or unbounded, then you will need to override MyClass.__setattr__() instead.
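A minimal sketch of that __setattr__ route (my sketch; it assumes, as in the question, that values doubles as the backing store):

class MyClass(object):
    def __init__(self, dict_values):
        # bypass our own __setattr__ while bootstrapping
        object.__setattr__(self, 'values', dict(dict_values))
        object.__setattr__(self, 'changed_values', {})

    def __setattr__(self, name, value):
        self.values[name] = value
        self.changed_values[name] = value

    def __getattr__(self, name):
        try:
            return self.values[name]
        except KeyError:
            raise AttributeError(name)

var = MyClass({'age': 20})
var.age = 21
print(var.changed_values)  # {'age': 21}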
You can use the property() built-in function. This is often preferred to overriding __getattr__ and __setattr__.
class MyClass:
    def __init__(self):
        self.values = {}
        self.changed_values = {}

    def set_age(self, nr):
        self.values['age'] = nr
        self.changed_values['age'] = nr

    def get_age(self):
        return self.values['age']

    age = property(get_age, set_age)
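Since the dict contents are only known at run time (per the UPDATE), one possible sketch is to attach a property per key once the dict is known. Note that setattr on type(self) mutates the class, so all instances end up sharing the generated properties:

class MyClass(object):
    def __init__(self, dict_values):
        self.values = dict(dict_values)
        self.changed_values = {}
        for key in self.values:
            setattr(type(self), key, self._make_property(key))

    @staticmethod
    def _make_property(key):
        def getter(self):
            return self.values[key]

        def setter(self, value):
            self.values[key] = value
            self.changed_values[key] = value

        return property(getter, setter)

var = MyClass({'age': 20})
var.age = 21
print(var.changed_values)  # {'age': 21}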
