Class instance attributes derived from other instance attributes - python

I apologise if the title is cryptic; I could not think of a way to describe my problem in a sentence. I am building some code in Python 2.7, which I describe below.
Minimal working example
My code has a Parameter class that implements attributes such as name and value. It looks something like this:
class Parameter(object):
    def __init__(self, name, value=None, error=None, dist=None, prior=None):
        self.name = name
        self._value = value  # given value for the parameter; changed very often in an MCMC sampler
        self.error = error   # initial estimate of the error for the parameter; only set once
        self._dist = dist    # a distribution for the parameter; only set once
        self.prior = prior

    @property
    def value(self):
        return self._value

    @property
    def dist(self):
        return self._dist
The class also has several properties that return the mean, median, etc. of Parameter.dist if a distribution is given.
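For illustration, one such property might look like this inside the Parameter class above (a sketch of mine, not the actual code, assuming dist holds an array-like sample of draws and numpy is imported as np):

    @property
    def mean(self):
        # None until a distribution has been set
        return np.mean(self._dist) if self._dist is not None else None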
I have another class, e.g. ParameterSample, that creates a population of different Parameter objects. Some of these Parameter objects have their attributes (e.g. value, error) set using the ParameterSample.set_parameter() method, but some other Parameter objects are not explicitly set; instead, their value and dist attributes depend on some of the Parameter objects that are set:
class ParameterSample(object):
    def __init__(self):
        varied_parameters = ('a', 'b')  # parameter names whose `value` attribute is varied
        derived_parameters = ('c',)     # parameter names whose `value` attribute is varied, but depends on `a.value` and `b.value`
        parameter_names = varied_parameters + derived_parameters

        # create `Parameter` objects for each parameter name
        for name in parameter_names:
            setattr(self, name, Parameter(name))

    def set_parameter(self, name, **kwargs):
        for key, val in kwargs.items():
            if key == 'value':
                key = '_'.join(['', key])  # add underscore to set `Parameter._value`
            setattr(getattr(self, name), key, val)  # basically does e.g. `self.a.value = 1`
I can now create a ParameterSample and use it like this:
parobj = ParameterSample()
parobj.set_parameter('a', value=1, error=0.1)
parobj.set_parameter('b', value=2, error=0.5)
parobj.a.value
>>> 1
parobj.b.error
>>> 0.5
parobj.set_parameter('b', value=3)
parobj.b.value
>>> 3
parobj.b.error
>>> 0.5
What I want
What I ultimately want is to use the c parameter the same way. For example:
parobj.c.value
>>> 4 # returns parobj.a.value + parobj.b.value
parobj.c.dist
>>> None # returns a.dist + b.dist, but since they are not currently set it is None
c therefore needs to be a Parameter object with all the same attributes as a and b, but where its value and dist are updated according to the current attributes of a and b.
However, I should also mention that I want to be able to set the allowed prior ranges for parameter c, e.g. parobj.set_parameter('c', prior=(0,10)) before making any calls to its value -- so c needs to be an already defined Parameter object upon the creation of the ParameterSample object.
How would I implement this into my ParameterSample class?
What I've tried
I have tried looking into writing my own decorators, but I am not sure that is the way to go, since I don't fully understand how I would use them.
I've also considered adding a @property to c that creates a new Parameter object every time it is called, but I feel that is not the way to go since it may slow down the code.
I should also note that the ParameterSample class above is going to be inherited by a different class, so whatever the solution is, it should work in this setting:
class Companion(ParameterSample):
    def __init__(self, name):
        self.name = name
        super(Companion, self).__init__()
comp = Companion(name='Earth')
comp.set_parameter('a', value=1)
comp.set_parameter('b', value=3)
comp.c.value
>>> 4

I could not get this to work in Python 2 - the setattr calls never seemed to propagate the attributes to the child classes (Companion would have no c attribute).
I was more successful with Python 3 though. Since you have two parameter types (varied vs. derived), it makes sense IMO to have two classes to implement the behavior, instead of treating them all as one.
I added a DerivedParameter class, inheriting from Parameter, that takes a dependents argument (along with its parent class' args/kwargs) and redefines value and dist to give the dependent behavior:
class DerivedParameter(Parameter):
    def __init__(self, name, dependents, **kwargs):
        self._dependents = dependents
        super().__init__(name, **kwargs)

    @property
    def value(self):
        try:
            return sum(x._value for x in self._dependents if x is not None)
        except TypeError:
            return None

    @property
    def dist(self):
        try:
            return sum(x._dist for x in self._dependents if x is not None)
        except TypeError:
            return None
Then I adjusted how your parameter objects are added:
class ParameterSample:
    def __init__(self):
        # Store as instance attributes to reference later
        self.varied_params = ('a', 'b')  # parameter names whose `value` attribute is varied
        self.derived_params = ('c',)     # parameter names whose `value` depends on `a.value` and `b.value`
        # No more combined names

        # create `Parameter` objects for each varied parameter name
        for name in self.varied_params:
            setattr(self, name, Parameter(name))

        # Create `DerivedParameter` objects for each derived parameter.
        # Derived parameters depend on all `Parameter` objects. It wasn't
        # clear if this was the desired behavior though.
        params = [v for _, v in self.__dict__.items() if isinstance(v, Parameter)]
        for name in self.derived_params:
            setattr(self, name, DerivedParameter(name, params))

    def set_parameter(self, name, **kwargs):
        for key, val in kwargs.items():
            if key == 'value':
                key = '_'.join(['', key])  # add underscore to set `Parameter._value`
            setattr(getattr(self, name), key, val)  # basically does e.g. `self.a.value = 1`
From this, I could then replicate the desired behavior from your example:
>>> comp = Companion(name='Earth')
>>> comp.set_parameter('a', value=1)
>>> comp.set_parameter('b', value=3)
>>> print(comp.c.value)
>>> print(comp.c.dist)
4
None
>>> comp.set_parameter('c', prior=(0,10))
>>> print(comp.c.prior)
(0, 10)
As I pointed out in the comments, the design above ends up causing all derived parameters to use all varied parameters as their dependents - effectively making c and a potential d identical. You should be able to fix this fairly easily with some parameters/conditions, e.g. as sketched below.
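For instance, a minimal sketch of one way to do it (my own assumption, not from the original answer): declare each derived parameter's dependents explicitly instead of collecting every Parameter on the instance.

    class ParameterSample:
        def __init__(self):
            self.varied_params = ('a', 'b')
            # map each derived name to the names it depends on
            self.derived_params = {'c': ('a', 'b')}

            for name in self.varied_params:
                setattr(self, name, Parameter(name))

            # each DerivedParameter now receives only its declared dependents
            for name, deps in self.derived_params.items():
                setattr(self, name,
                        DerivedParameter(name, [getattr(self, d) for d in deps]))

        # set_parameter as before

A hypothetical d could then declare, say, ('a',) as its only dependent without becoming identical to c.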
Overall, I would have to agree with @Error - Syntactical Remorse though. This is a pretty complicated way to go about designing classes and would make maintenance confusing at best. I would strongly encourage you to reconsider your design and try to find an adaptable, general solution that doesn't involve dynamic creation of attributes like this.

Related

Python simple lazy loading

I'm trying to clean up some logic and remove duplicate values in some code and am looking for a way to introduce some very simple lazy-loading to handle settings variables. Something that would work like this:
FOO = {'foo': 1}
BAR = {'test': FOO['foo'] }
# ...complex logic here which ultimately updates the value of FOO['foo']...
FOO['foo'] = 2
print(BAR['test']) # Outputs 1 but would like to get 2
Update:
My question may not have been clear based on the initial responses. I'm looking to replace the value being set for test in BAR with a lazy-loaded substitute. I know a way I can do this, but it seems unnecessarily complex for what it is; I'm wondering if there's a simpler approach.
Update #2:
Okay, here's a solution that works. Is there any built-in type that can do this out of the box?
FOO = {'foo': 1}

import types

class LazyDict(dict):
    def __getitem__(self, item):
        value = super().__getitem__(item)
        return value if not isinstance(value, types.LambdaType) else value()

BAR = LazyDict({'test': lambda: FOO['foo']})
# ...complex logic here which ultimately updates the value of FOO['foo']...
FOO['foo'] = 2
print(BAR['test']) # Outputs 2
As I stated in the comment above, what you are seeking is some of the facilities of the reactive programming paradigm (not to be confused with the JavaScript library that borrows its name from it).
It is possible to instrument objects in Python to do so - I think the minimum setup here would be a specialized target mapping, plus a special object type that you set as the values in it, which would fetch the target value on access.
Python can do this in more straightforward ways with direct attribute access (using the dot notation: myinstance.value) than with the key-retrieval notation used in dictionaries (mydata['value']), because a class is already a template for a certain data group, and class attributes can define mechanisms for accessing each instance's attribute values. That is called the "descriptor protocol" and is built into the language model itself.
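As a quick illustration of that protocol (a minimal sketch of my own, not from the original answer; Doubled and Point are made-up names), a descriptor's __get__ runs on every dotted access, so the computed value always reflects the current state:

    class Doubled:
        # descriptor: recomputes from another attribute on each access
        def __init__(self, source_attr):
            self.source_attr = source_attr
        def __get__(self, instance, owner):
            if instance is None:
                return self
            return getattr(instance, self.source_attr) * 2

    class Point:
        twice_x = Doubled('x')  # descriptors must live on the class
        def __init__(self, x):
            self.x = x

    p = Point(3)
    print(p.twice_x)  # 6
    p.x = 5
    print(p.twice_x)  # 10 - recomputed from the current p.x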
Nonetheless, a minimalist Mapping-based version can be implemented like this:
FOO = {'foo': 1}

from collections.abc import MutableMapping

class LazyValue:
    def __init__(self, source, key):
        self.source = source
        self.key = key

    def get(self):
        return self.source[self.key]

    def __repr__(self):
        return f"<LazyValue {self.get()!r}>"

class LazyDict(MutableMapping):
    def __init__(self, *args, **kw):
        self.data = dict(*args, **kw)

    def __getitem__(self, key):
        value = self.data[key]
        if isinstance(value, LazyValue):
            value = value.get()
        return value

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __iter__(self):
        return iter(self.data)

    def __len__(self):
        return len(self.data)

    def __repr__(self):
        return repr({key: value for key, value in self.items()})

BAR = LazyDict({'test': LazyValue(FOO, 'foo')})
# ...complex logic here which ultimately updates the value of FOO['foo']...
FOO['foo'] = 2
print(BAR['test']) # Outputs 2
The reason this much code is needed is that there are several ways to retrieve data from a dictionary or mapping (.values, .items, .get, .setdefault), and simply inheriting from dict and implementing __getitem__ would "leak" the special lazy object through any of the other methods. Going through this MutableMapping approach ensures a single point of reading the value, in the __getitem__ method - and the resulting instance can be used reliably anywhere a mapping is expected.
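To see that leak concretely (a small demonstration of my own, reusing FOO and LazyValue from above; LeakyDict is a made-up name):

    class LeakyDict(dict):
        # only __getitem__ resolves lazy values...
        def __getitem__(self, key):
            value = super().__getitem__(key)
            return value.get() if isinstance(value, LazyValue) else value

    leaky = LeakyDict({'test': LazyValue(FOO, 'foo')})
    print(leaky['test'])         # 2 - resolved
    print(leaky.get('test'))     # <LazyValue 2> - the raw object leaks
    print(list(leaky.values()))  # [<LazyValue 2>] - and leaks here too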
However, notice that if you are using normal classes and instances rather than dictionaries, this can be much simpler - you can just use a plain Python property and have a getter that fetches the value. The main factor you should ponder is whether your referenced data keys are fixed and can be hard-coded when writing the source code, or whether they are dynamic and the keys that work as lazy references are only known at runtime. In the latter case, the custom mapping approach above will usually be better:
FOO = {'foo': 1}

class LazyStuff:
    def __init__(self, source):
        self.source = source

    @property
    def test(self):
        return self.source["foo"]

BAR = LazyStuff(FOO)
FOO["foo"] = 2
print(BAR.test)
Note that this way you have to hardcode the keys "foo" and "test" in the class body, but it is just plain code, with no need for the intermediary LazyValue class. Also, if you need this data as a dictionary, you could add an .as_dict method to LazyStuff that collects all the attributes at the moment it is called and yields a snapshot of those values as a dictionary; a sketch of that is below.
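One way such an .as_dict method could look (my sketch of that suggestion, not code from the original answer):

    class LazyStuff:
        def __init__(self, source):
            self.source = source

        @property
        def test(self):
            return self.source["foo"]

        def as_dict(self):
            # snapshot the current value of every property defined on the class
            return {name: getattr(self, name)
                    for name, attr in type(self).__dict__.items()
                    if isinstance(attr, property)}

    FOO = {'foo': 1}
    BAR = LazyStuff(FOO)
    FOO['foo'] = 2
    print(BAR.as_dict())  # {'test': 2}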
You can try using lambdas and calling the value on retrieval, like this:
FOO = {'foo': 1}
BAR = {'test': lambda: FOO['foo'] }
FOO['foo'] = 2
print(BAR['test']()) # Outputs 2
If you're only one level deep, you may wish to try ChainMap, e.g.:
>>> from collections import ChainMap
>>> defaults = {'foo': 42}
>>> myvalues = {}
>>> result = ChainMap(myvalues, defaults)
>>> result['foo']
42
>>> defaults['foo'] = 99
>>> result['foo']
99

Usage of setattr method in python

I have a question about the usage of the setattr method in Python.
I have a python class with around 20 attributes, which can be initialized in the below manner:
class SomeClass():
    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        # initialize some attributes (attribute_A to attribute_Z) in a similar manner
        if 'column_A' in pd_df_row.columns:
            self.attribute_A = pd_df_row['column_A']
        else:
            self.attribute_A = np.nan
        ....
        if 'column_Z' in pd_df_row.columns:
            self.attribute_Z = pd_df_row['column_Z']
        else:
            self.attribute_Z = np.nan

        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])

    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
Is it advisable to write the class in the below way instead, making use of the setattr method and a for loop?
class SomeClass():
    # A static list storing the mapping between attribute names and column names
    # that can be initialized using similar logic. The mapping does not cover all
    # columns in the input pd_df_row, nor all attributes of the class, because not
    # all columns are read and stored in the same way. (This mapping will be
    # hardcoded; its initialization cannot be further simplified with a loop,
    # because the attribute names and corresponding column names do not follow
    # any particular pattern.)
    ATTR_LIST = [('attribute_A', 'column_A'), ('attribute_B', 'column_B'), ..., ('attribute_Z', 'column_Z')]

    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        # initialize some attributes (attribute_A to attribute_Z) in a loop
        for attr_name, col_name in SomeClass.ATTR_LIST:
            if col_name in pd_df_row.columns:
                setattr(self, attr_name, pd_df_row[col_name])
            else:
                setattr(self, attr_name, np.nan)

        # initialize some other attributes based on some other columns in pd_df_row
        self.other_attribute = pre_process(pd_df_row['column_123'])

    # some other methods
    def compute_something(self):
        return self.attribute_A + self.attribute_B
The second way of writing this class seems to shorten the code. However, it also seems to make the structure of the class a bit confusing, by creating the static list of attribute-to-column mappings (which will be used to initialize only some, but not all, of the attributes). Also, I noticed that code auto-completion will not work for the second piece of code, as the editor won't know which attributes are created until runtime. My question therefore is: is it advisable to use setattr() in this way? In which cases should I write my code this way, and in which cases should I avoid doing so?
In addition, does creating the static mapping in the class violate object-oriented programming principles? Should I create and store this mapping somewhere else instead?
Thank you.
You could, but I would consider having a dict of attributes rather than separate similarly named attributes.
class SomeClass():
    def __init__(self, pd_df_row):  # pd_df_row is one row from a dataframe
        self.attributes = {}
        for x in ['A', ..., 'Z']:
            column = f'column_{x}'
            if column in pd_df_row:
                self.attributes[x] = pd_df_row[column]
            else:
                self.attributes[x] = np.nan

        # initialize some other attributes
        self.other_attribute = some_other_values

    # some other methods
    def compute_something(self):
        return self.attributes['A'] + self.attributes['B']

Python setattr() to function takes initial function name

I do understand how setattr() works in Python, but my question is: when I try to dynamically set an attribute and give it an unbound function as a value, so that the attribute is callable, the attribute ends up reporting the name of the unbound function when I call attr.__name__, instead of the name of the attribute.
Here's an example:
I have a Filter class:
class Filter:
    def __init__(self, column=['poi_id', 'tp.event'], access=['con', 'don']):
        self.column = column
        self.access = access
        self.accessor_column = dict(zip(self.access, self.column))
        self.set_conditions()

    def condition(self, name):
        # I want to be able to get the name of the dynamically set
        # function and check `self.accessor_column` for a value, but when
        # I do `setattr(self, 'accessor', self.condition)`, the function
        # name is always set to `condition` rather than `accessor`
        return name

    def set_conditions(self):
        mapping = list(zip(self.column, self.access))
        for i in mapping:
            poi_column = i[0]
            accessor = i[1]
            setattr(self, accessor, self.condition)
In the class above, the set_conditions method dynamically sets attributes (con and don) on the Filter instance and assigns them a callable, but they retain the original name of the function.
When I run this:
>>> f = Filter()
>>> print(f.con('linux'))
>>> print(f.con.__name__)
Expected:
linux
con (which should be the name of the dynamically set attribute)
I get:
linux
condition (name of the value (unbound self.condition) of the attribute)
But I expect f.con.__name__ to return the name of the attribute (con) and not the name of the unbound function (condition) assigned to it.
Can someone please explain why this behaviour occurs and how I can get around it?
Thanks.
function.__name__ is the name under which the function was initially defined; it has nothing to do with the name under which it is accessed. Actually, the whole point of function.__name__ is to correctly identify the function whatever name is used to access it. You definitely want to read this for more on what Python's "names" are.
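A two-line illustration of that (mine, not from the original answer):

    def condition(name):
        return name

    accessor = condition      # a second name bound to the same function object
    print(accessor.__name__)  # 'condition' - __name__ records the definition name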
One possible solution here is to replace the static definition of condition with a closure:
class Filter(object):
    def __init__(self, column=['poi_id', 'tp.event'], access=['con', 'don']):
        self.column = column
        self.access = access
        self.accessor_column = dict(zip(self.access, self.column))
        self.set_conditions()

    def set_conditions(self):
        mapping = list(zip(self.column, self.access))
        for column_name, accessor_name in mapping:
            # bind the loop variables as default arguments so each closure
            # keeps its own values (avoids the late-binding closure gotcha)
            def accessor(name, accessor_name=accessor_name, column_name=column_name):
                print("in {}.accessor '{}' for column '{}'".format(self, accessor_name, column_name))
                return name
            # this is now technically useless but helps with inspection
            accessor.__name__ = accessor_name
            setattr(self, accessor_name, accessor)
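With that in place, the example from the question behaves as hoped (my usage sketch):

    f = Filter()
    print(f.con('linux'))   # prints the accessor message, then: linux
    print(f.con.__name__)   # con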
As a side note (totally unrelated, but I thought you may want to know this): using mutable objects as function argument defaults is one of the most infamous Python gotchas and may yield totally unexpected results, e.g.:
>>> f1 = Filter()
>>> f2 = Filter()
>>> f1.column
['poi_id', 'tp.event']
>>> f2.column
['poi_id', 'tp.event']
>>> f2.column.append("WTF")
>>> f1.column
['poi_id', 'tp.event', 'WTF']
EDIT:
thank you for your answer, but it doesn't touch my issue here. My problem is not how functions are named or defined; my problem is that when I use setattr() to set an attribute and give it a function as its value, I can access the value and perform what the value does, but since it's a function, why doesn't it return its name as the function name?
Because, as I already explained above, the function's __name__ attribute and the name of the Filter instance attribute(s) referring to this function are totally unrelated, and the function knows absolutely nothing about the names of variables or attributes that reference it, as explained in the reference article I linked to.
Actually, the fact that the object you're passing to setattr is a function is totally irrelevant; from the object's POV it's just a name and an object, period. And the fact that you're binding this object (function or any other object) to an instance attribute (whether directly or using setattr(), it works just the same) instead of a plain variable is also totally irrelevant - none of those operations has any impact on the object that is bound (except for increasing its ref counter, but that's a CPython implementation detail - other implementations may implement garbage collection differently).
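A quick demonstration of that point (my own; Holder and func are made-up names):

    class Holder(object):
        pass

    def func():
        pass

    h = Holder()
    setattr(h, 'alias', func)  # binds the name 'alias' on the instance
    print(h.alias is func)     # True - same object, completely unchanged
    print(h.alias.__name__)    # 'func' - still the definition name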
May I suggest this:
from types import SimpleNamespace

class Filter:
    def __init__(self, column=['poi_id', 'tp.event'], access=['con', 'don']):
        self.column = column
        self.access = access
        self.accessor_column = dict(zip(self.access, self.column))
        self.set_conditions()

    def set_conditions(self):
        for i in self.access:
            setattr(self, i, SimpleNamespace(name=i, func=lambda name: name))
f = Filter()
print(f.con.func('linux'))
>>> linux
print(f.con.name)
>>> con
[edited after bruno desthuilliers's comment.]

Implementing **<class>?

Edit:
This question has been marked as a duplicate, but I don't think it is. Implementing the suggested answer, that is, using the Mapping ABC, does not have the behavior I would like:
from collections import Mapping

class data(Mapping):
    def __init__(self, params):
        self.params = params

    def __getitem__(self, k):
        print "in __getitem__", k
        return self.params[k]

    def __len__(self):
        return len(self.params)

    def __iter__(self):
        return (k for k in self.params.keys())

def func(*args, **kwargs):
    print "In func"
    return None

ps = data({"p1": 1., "p2": 2.})

print "\ncalling...."
func(ps)

print "\ncalling...."
func(**ps)
Output:
calling....
In func
calling....
in __getitem__ p2
in __getitem__ p1
In func
Which, as mentioned in the question, is not what I want.
The other solution, given in the comments, is to modify the routines that are causing problems. That will certainly work; however, I was looking for a quick (lazy?) fix!
Question:
How can I implement the ** operator for a class, other than via __getitem__? For example, I would like to be able to do this:
def func(**kwargs):
    <do some clever stuff>

x = some_generic_class()
func(**x)
without an implicit call to some_generic_class.__getitem__(). In my application I have already implemented __getitem__ with some data logging which I do not want to perform when the class is referenced as above.
If it's not possible to overload the ** operator, is it possible to detect when __getitem__ is being called as a result of the class being passed to a function, rather than explicitly?
Background:
I am working on a physics model that is built out of a set of packages chosen according to user input at runtime. The flexible structure of the model means that I rarely know the required parameters, so I pass a dict of parameter names and values between the models. In order to make this more user friendly, I am now trying to develop a class paramlist that overloads dict functionality with a set of routines that do some consistency checking, set default values, etc. The idea is that I pass an instance of paramlist rather than a dict. One of the more important aims is to keep a log of which members of paramlist have been referenced by the physics packages and which have not. A stripped-out version is below, which aims to maintain a second dict that logs whether a parameter has been referenced:
from copy import copy

class paramlist(object):
    def __init__(self, params):
        self.params = copy(params)
        self.used = {k: False for k in self.params}

    def __getitem__(self, k):
        try:
            v = self.params[k]
        except KeyError:
            raise KeyError("Parameter {} not in parameter list".format(k))
        else:
            self.used[k] = True
            return v

    def __setitem__(self, k, v):
        self.params[k] = v
        self.used[k] = False
Which does not have the behaviour I want:
ps = paramlist({"p1": 1.})

def donothing(*args, **kwargs):
    return None

donothing(ps)
print ps.used["p1"]
donothing(**ps)
print ps.used["p1"]
Output:
False
True
I would like the used dict to remain False in both cases, so that I can tell the user that one of their parameters was not used (implying that they screwed up and a default value has been used instead). I presume that the ** case has the effect of calling __getitem__ on every entry in the paramlist.
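That presumption can be checked directly (a small probe of my own, not from the question; in CPython, calling f(**obj) with a non-dict mapping first calls keys() and then __getitem__ for each key):

    class Probe(object):
        def __init__(self, data):
            self.data = data
        def keys(self):
            print "keys() called"
            return self.data.keys()
        def __getitem__(self, k):
            print "__getitem__ called for", k
            return self.data[k]

    def f(**kwargs):
        pass

    f(**Probe({"p1": 1.}))
    # keys() called
    # __getitem__ called for p1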

Python : Argument based Singleton

I'm following this link and trying to make a singleton class, but taking the arguments passed when instantiating the class into account, so that the same object is returned if the arguments are the same.
So, instead of storing the class name/class reference as a dict key, I want to store the passed arguments as keys in the dict. But there could be unhashable arguments as well (like a dict or a set itself).
What is the best way to store class arguments and class objects mapping? So that I can return an object corresponding to the arguments.
Thanks anyways.
EDIT-1:
A little more explanation. Let's say there is a class as follows:
class A:
    __metaclass__ = Singleton

    def __init__(self, arg1, arg2):
        pass
Now, A(1, 2) should always return the same object, but it should be different from A(3, 4).
I think the arguments very much define the functioning of a class. Let's say the class makes Redis connections. I might want to create two singleton objects with different Redis hosts as parameters, even though the underlying class/code is common.
As theheadofabroom and I already mentioned in the comments, there are some pitfalls when relying on non-hashable values for instance caching or memoization. Therefore, if you still want to do exactly that, the following example does not hide the memoization in the __new__ or __init__ method. (A self-memoizing class would be hazardous because the memoization criterion can be fooled by code that you don't control.)
Instead, I provide the function memoize, which returns a memoizing factory function for a class. Since there is no generic way to tell from non-hashable arguments whether they will result in an instance that is equivalent to an already existing instance, the memoization semantics have to be provided explicitly. This is achieved by passing the keyfunc function to memoize. keyfunc takes the same arguments as the class' __init__ method and returns a hashable key, whose equality relation (__eq__) determines memoization.
The proper use of the memoization is the responsibility of the calling code (providing a sensible keyfunc and using the factory), since the class to be memoized is not modified and can still be instantiated normally.
def memoize(cls, keyfunc):
    memoized_instances = {}

    def factory(*args, **kwargs):
        key = keyfunc(*args, **kwargs)
        if key in memoized_instances:
            return memoized_instances[key]
        instance = cls(*args, **kwargs)
        memoized_instances[key] = instance
        return instance

    return factory
class MemoTest1(object):
    def __init__(self, value):
        self.value = value

factory1 = memoize(MemoTest1, lambda value: value)

class MemoTest2(MemoTest1):
    def __init__(self, value, foo):
        MemoTest1.__init__(self, value)
        self.foo = foo

factory2 = memoize(MemoTest2, lambda value, foo: (value, frozenset(foo)))

m11 = factory1('test')
m12 = factory1('test')
assert m11 is m12

m21 = factory2('test', [1, 2])
lst = [1, 2]
m22 = factory2('test', lst)
lst.append(3)
m23 = factory2('test', lst)
assert m21 is m22
assert m21 is not m23
I only included MemoTest2 as a subclass of MemoTest1 to show that there is no magic involved in using regular class inheritance.
