I want to create some kind of descriptor on a class that returns a proxy object. The proxy object, when indexed retrieves members of the object and applies the index to them. Then it returns the sum.
E.g.,
class NDArrayProxy:
    def __array__(self, dtype=None):
        retval = self[:]
        if dtype is not None:
            return retval.astype(dtype, copy=False)
        return retval

class ArraySumProxy(NDArrayProxy):
    def __init__(self, arrays):
        self.arrays = arrays

    @property
    def shape(self):
        return self.arrays[0].shape

    def __getitem__(self, indices):
        return np.sum([a[indices] for a in self.arrays],
                      axis=0)
This solution worked fine while I had actual arrays as member variables:
class CompartmentCluster(Cluster):
    """
    Base class for clusters that manage evidence.
    """
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.variable_evidence = ArraySumProxy([])

class BasicEvidenceTargetCluster(CompartmentCluster):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.basic_in = np.zeros(self.size)
        self.variable_evidence.arrays.append(self.basic_in)

class ExplanationTargetCluster(CompartmentCluster):
    """
    These clusters accept explanation evidence.
    """
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.explanation_in = np.zeros(self.size)
        self.variable_evidence.arrays.append(self.explanation_in)

class X(BasicEvidenceTargetCluster, ExplanationTargetCluster):
    pass
Now I've changed my arrays into Python descriptors (cluster_signal implements the descriptor protocol returning a numpy array):
class CompartmentCluster(Cluster):
    """
    Base class for clusters that manage evidence.
    """
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.variable_evidence = ArraySumProxy([])

class BasicEvidenceTargetCluster(CompartmentCluster):
    # This class variable creates a Python object named basic_in on the
    # class, which implements the descriptor protocol.
    basic_in = cluster_signal(text="Basic (in)",
                              color='bright orange')

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.variable_evidence.arrays.append(self.basic_in)

class ExplanationTargetCluster(CompartmentCluster):
    """
    These clusters accept explanation evidence.
    """
    explanation_in = cluster_signal(text="Explanation (in)",
                                    color='bright yellow')

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.variable_evidence.arrays.append(self.explanation_in)

class X(BasicEvidenceTargetCluster, ExplanationTargetCluster):
    pass
This doesn't work, because the append statements append the result of the descriptor call. What I need is to append either a bound method or a similar proxy. What's the nicest way to modify my solution? In short: the variables basic_in and explanation_in used to be numpy arrays; they're now descriptors. I would like to develop a version of ArraySumProxy that works with descriptors rather than requiring actual arrays.
When you access a descriptor, it is evaluated and you only get the value. Since your descriptor does not always return the same object (I guess you cannot avoid that?), you don't want to access the descriptor when you are initializing your proxy.
The simplest way to avoid accessing it, is to just remember its name, so instead of:
self.variable_evidence.arrays.append(self.basic_in)
you do:
self.variable_evidence.arrays.append((self, 'basic_in'))
Then, of course, variable_evidence has to be aware of that and do getattr(obj, name) to access it.
Another option is to make the descriptor return a proxy object which is evaluated later. I don't know what you are doing, but that might be too many proxies for good taste...
EDIT
Or... you can store the getter:
self.variable_evidence.arrays.append(lambda: self.basic_in)
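To make the getter idea concrete, here is a sketch of a proxy variant that stores zero-argument callables instead of arrays (the name CallableSumProxy is my own, not from the original code); both lambda: self.basic_in and lambda: getattr(obj, 'basic_in') fit this interface:

```python
import numpy as np

class CallableSumProxy:
    """Like ArraySumProxy, but holds zero-argument getters instead of arrays."""
    def __init__(self, getters=None):
        self.getters = list(getters) if getters is not None else []

    @property
    def shape(self):
        return self.getters[0]().shape

    def __getitem__(self, indices):
        # Evaluate each getter at access time, so descriptors are
        # re-read on every indexing operation.
        return np.sum([g()[indices] for g in self.getters], axis=0)

# usage: append getters, not arrays
a = np.array([1.0, 2.0])
b = np.array([10.0, 20.0])
proxy = CallableSumProxy()
proxy.getters.append(lambda: a)
proxy.getters.append(lambda: b)
print(proxy[:])  # [11. 22.]
```

Because the getters are only called inside __getitem__, a descriptor that returns a fresh array on each access is re-evaluated every time the proxy is indexed.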
Related
Is there a way to access a class (where function is defined as a method) before there is an instance of that class?
class MyClass:
    def method(self):
        print("Calling me")

m1 = MyClass.method
instance = MyClass()
m2 = instance.method
print(m2.__self__.__class__)  # <class 'MyClass'>
# how to access `MyClass` from `m1`?
For example, I have the m1 variable somewhere in my code and want a reference to MyClass, the same way I can reach it from the bound method via m2.__self__.__class__.
print(m1.__qualname__)  # 'MyClass.method'
The only option I was able to find is __qualname__, which is a string containing the name of the class.
The attribute __self__ is set by Python when the function is bound to an instance and becomes a method. (That happens in the function's __get__ code, when it is passed an instance other than None.)
So, as people pointed out, you have the option of getting the class name as a string by going through __qualname__. Otherwise, if the functions/methods for which you will need this feature are known beforehand, it is possible to create a decorator that will annotate their class when they are retrieved as a class attribute (in contrast to the native annotation, which only takes place when retrieving them as an instance attribute):
class unboundmethod:
    def __init__(self, func, cls):
        self.__func__ = func
        self.class_ = cls
        self.__self__ = None

    def __call__(self, instance, *args, **kw):
        if not isinstance(instance, self.class_):
            # This check is actually optional fancy stuff, since we are here! :-)
            raise TypeError(f"First parameter of {self.__func__.__name__} must be an instance of {self.class_}")
        return self.__func__(instance, *args, **kw)

    def __repr__(self):
        return f"Unbound method {self.__func__!r} related to {self.class_}"

class clsbind:
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        if instance is None:
            # the function is being retrieved from the class:
            return unboundmethod(self.func, owner)
        # return control to the usual method-creation codepath:
        return self.func.__get__(instance, owner)

class MyClass:
    @clsbind
    def method(self):
        print("Calling me")
And on the REPL you can have this:
In [136]: m1 = MyClass.method
In [137]: m1.class_
Out[137]: __main__.MyClass
In [138]: m1(MyClass())
Calling me
You can get the class object using __qualname__:
my_class = eval(m1.__qualname__.split('.')[-2])
print(my_class)
Not the most generic or safest approach, but it should work for this simple scenario.
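A safer variation on the same idea, avoiding eval: every function carries its defining module's namespace in __globals__, so the class name taken from __qualname__ can be looked up there instead (this still assumes the class is defined at module level):

```python
class MyClass:
    def method(self):
        print("Calling me")

m1 = MyClass.method

# __qualname__ is 'MyClass.method'; look the class name up in the
# function's own global namespace instead of calling eval()
cls_name = m1.__qualname__.split('.')[-2]
my_class = m1.__globals__[cls_name]
print(my_class is MyClass)  # True
```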
Suppose there is a class A and a factory function make_A
class A():
    ...

def make_A(*args, **kwargs):
    # returns an object of type A
    ...
both defined in some_package.
Suppose also that I want to expand the functionality of A by subclassing it, without overriding the constructor:
from some_package import A, make_A
class B(A):
    def extra_method(self, ...):
        # adds extra functionality
What I also need is to write a new factory function make_B for subclass B.
The solution I have found so far is
def make_B(*args, **kwargs):
    """
    Same as make_A, except that it returns an object of type B.
    """
    out = make_A(*args, **kwargs)
    out.__class__ = B
    return out
This seems to work, but I am a bit worried about directly modifying the __class__ attribute, as it feels like a hack. I am also worried about unexpected side effects this modification may have. Is this the recommended solution, or is there a cleaner pattern to achieve the same result?
I think I finally found something that is not verbose yet still works. For this you need to replace inheritance with composition: this lets B consume an A object via self.a = ....
To mimic the methods of A, you can overload __getattr__ to delegate those methods (and fields) to self.a.
The following snippet works for me:
class A:
    def __init__(self, val):
        self.val = val

    def method(self):
        print(f"A={self.val}")

def make_A():
    return A(42)

class B:
    def __init__(self, *args, consume_A=None, **kwargs):
        if consume_A is None:
            self.a = A(*args, **kwargs)
        else:
            self.a = consume_A

    def __getattr__(self, name):
        return getattr(self.a, name)

    def my_extension(self):
        print(f"B={self.val * 100}")

def make_B(*args, **kwargs):
    return B(consume_A=make_A(*args, **kwargs))

b = make_B()
b.method()        # A=42
b.my_extension()  # B=4200
What makes this approach preferable to yours is that modifying __class__ is probably not harmless. On the other hand, __getattr__ and __getattribute__ are specifically provided as the mechanisms for resolving attribute lookups on an object. For more details, see this tutorial.
Make your original factory function more general by accepting a class as a parameter: remember, everything is an object in Python, even classes.
def make(class_type, *args, **kwargs):
    return class_type(*args, **kwargs)

a = make(A)
b = make(B)
Since B has the same parameters as A, you don't need to make an A and then turn it into B: B inherits from A, so it "is an A" and will have the same functionality, plus the extra method that you added.
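If you still want dedicated make_A/make_B names, you can partially apply the generic factory. A small sketch with functools.partial, using illustrative stand-ins for A and B (the val attribute and extra_method are assumptions for the demo):

```python
from functools import partial

class A:
    def __init__(self, val=0):
        self.val = val

class B(A):
    def extra_method(self):
        return self.val * 2

def make(class_type, *args, **kwargs):
    return class_type(*args, **kwargs)

# partial() bakes the class in, leaving the original signature intact
make_A = partial(make, A)
make_B = partial(make, B)

b = make_B(21)
print(type(b).__name__, b.extra_method())  # B 42
```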
Original problem description
The problem arises when I implement some machine learning algorithms with numpy. I want a new class ludmo which works the same as numpy.ndarray, but with a few more properties, for example a new property ludmo.foo. I've tried several methods below, but none is satisfactory.
1. Wrapper
First I created a wrapper class for numpy.ndarray, as
import numpy as np

class ludmo(object):
    def __init__(self):
        self.foo = None
        self.data = np.array([])
But when I use some function (in scikit-learn, which I cannot modify) to manipulate a list of np.ndarray instances, I have to first extract the data field of each ludmo object and collect them into a list. After that the list is sorted, and I lose the correspondence between the data and the original ludmo objects.
2. Inheritance
Then I tried to make ludmo a subclass of numpy.ndarray, as
import numpy as np

class ludmo(np.ndarray):
    def __init__(self, shape, dtype=float, buffer=None, offset=0, strides=None, order=None):
        super().__init__(shape, dtype, buffer, offset, strides, order)
        self.foo = None
But another problem arises: the most common way to create a numpy.ndarray object is numpy.array(some_list), which returns a numpy.ndarray object that I then have to convert to a ludmo object. So far I have found no good way to do this; simply changing the __class__ attribute results in an error.
I'm new to Python and numpy, so there must be some elegant way that I don't know. Any advice is appreciated.
It would be even better if someone could give a generic solution, one that applies not only to numpy.ndarray but to all kinds of classes.
As explained in the docs, you can add your own methods to np.ndarray by subclassing it:
import numpy as np

class Ludmo(np.ndarray):
    def sumcols(self):
        return self.sum(axis=1)

    def sumrows(self):
        return self.sum(axis=0)

    def randomize(self):
        self[:] = np.random.rand(*self.shape)
and then creating the instances using the np.ndarray.view() method:
a = np.random.rand(4,5).view(Ludmo)
And use the __array_finalize__() method to define new attributes:
    def __array_finalize__(self, arr):
        self.foo = 'foo'
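Putting both pieces together, a small self-contained sketch (note that __array_finalize__ also runs on view casting and slicing, which is what keeps the attribute alive on slices; the getattr fallback is my own convention for giving explicit constructions a default):

```python
import numpy as np

class Ludmo(np.ndarray):
    def __array_finalize__(self, obj):
        # called on view casting and slicing; obj is the source
        # array (or None), so copy its foo if it has one
        self.foo = getattr(obj, 'foo', 'foo')

    def sumcols(self):
        return self.sum(axis=1)

    def sumrows(self):
        return self.sum(axis=0)

a = np.arange(6.0).reshape(2, 3).view(Ludmo)
print(a.foo)        # 'foo'
print(a.sumrows())  # [3. 5. 7.]
print(a[0].foo)     # attribute survives slicing: 'foo'
```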
Since you ask about a generic solution, here's a generic wrapper class that you can use (from http://code.activestate.com/recipes/577555-object-wrapper-class/):
class Wrapper(object):
    '''
    Object wrapper class.
    This is a wrapper for objects. It is initialised with the object
    to wrap and then proxies unhandled attribute lookups to it.
    Other classes are to inherit from it.
    '''
    def __init__(self, obj):
        '''
        Wrapper constructor.
        @param obj: object to wrap
        '''
        # wrap the object
        self._wrapped_obj = obj

    def __getattr__(self, attr):
        # see if this object has attr
        # NOTE do not use hasattr, it goes into
        # infinite recursion
        if attr in self.__dict__:
            # this object has it
            return getattr(self, attr)
        # proxy to the wrapped object
        return getattr(self._wrapped_obj, attr)
The way this works is: when e.g. scikit would call ludmo.data, Python actually calls
ludmo.__getattribute__('data')
If ludmo doesn't have the 'data' attribute, Python will then call
ludmo.__getattr__('data')
By overriding the __getattr__ function you intercept this call, check whether your ludmo has the attribute itself (again, to avoid the recursion mentioned above), and forward the call to your internal object. So you should have covered every possible call to your internal numpy object.
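To see the proxying in action, here is a compact, self-contained variant of the wrapper around a numpy array (simplified: since __getattr__ only fires after normal attribute lookup fails, the explicit __dict__ check can be dropped; the LudmoLike name is illustrative):

```python
import numpy as np

class Wrapper(object):
    '''Proxies unhandled attribute lookups to the wrapped object.'''
    def __init__(self, obj):
        self._wrapped_obj = obj

    def __getattr__(self, attr):
        # only reached when normal lookup on the wrapper fails,
        # so the wrapper's own attributes always win
        return getattr(self._wrapped_obj, attr)

class LudmoLike(Wrapper):
    def __init__(self, data):
        super().__init__(data)
        self.foo = 'bar'   # extra attribute lives on the wrapper

x = LudmoLike(np.array([1.0, 2.0, 3.0]))
print(x.foo)    # our attribute: 'bar'
print(x.sum())  # proxied to the numpy array: 6.0
print(x.shape)  # proxied attribute: (3,)
```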
update:
You would also have to implement __setattr__ the same way, or you would get this:
>>> class bla(object):
...     def __init__(self):
...         self.a = 1
...     def foo(self):
...         print(self.a)
...
>>> d = Wrapper(bla())
>>> d.a
1
>>> d.foo()
1
>>> d.a = 2
>>> d.a
2
>>> d.foo()
1
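For completeness, a hedged sketch of what that __setattr__ could look like (the underscore-prefix rule for deciding which attributes stay on the wrapper is my own convention, not part of the original recipe):

```python
class Wrapper(object):
    def __init__(self, obj):
        # bypass our own __setattr__ while setting up
        object.__setattr__(self, '_wrapped_obj', obj)

    def __getattr__(self, attr):
        return getattr(self._wrapped_obj, attr)

    def __setattr__(self, attr, value):
        # keep wrapper-private names on the wrapper itself,
        # forward everything else to the wrapped object
        if attr.startswith('_'):
            object.__setattr__(self, attr, value)
        else:
            setattr(self._wrapped_obj, attr, value)

class Bla:
    def __init__(self):
        self.a = 1
    def foo(self):
        return self.a

d = Wrapper(Bla())
d.a = 2
print(d.a)      # 2
print(d.foo())  # 2 -- the wrapped object saw the assignment this time
```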
You probably also want to set a new metaclass that intercepts calls to magic methods of new-style classes (for the full class see https://github.com/hpcugent/vsc-base/blob/master/lib/vsc/utils/wrapper.py ; for background see How can I intercept calls to python's "magic" methods in new style classes?).
However, this is only needed if you still want to be able to access x.__name__ or x.__file__ and get the magic attribute from the wrapped class, not from your wrapper class.
# create proxies for wrapped object's double-underscore attributes
class __metaclass__(type):
    def __init__(cls, name, bases, dct):
        def make_proxy(name):
            def proxy(self, *args):
                return getattr(self._obj, name)
            return proxy

        type.__init__(cls, name, bases, dct)
        if cls.__wraps__:
            ignore = set("__%s__" % n for n in cls.__ignore__.split())
            for name in dir(cls.__wraps__):
                if name.startswith("__"):
                    if name not in ignore and name not in dct:
                        setattr(cls, name, property(make_proxy(name)))
I'm trying to understand inheritance in Python. I have 4 different kinds of logs that I want to process: cpu, ram, net and disk usage.
I decided to implement this with classes, as the logs are formally the same except for the log-file reading and the data type of the data. I have the following code (log is an instance of a custom logging class):
class LogFile():
    def __init__(self, log_file):
        self._log_file = log_file
        self.validate_log()

    def validate_log(self):
        try:
            with open(self._log_file) as dummy_log_file:
                pass
        except IOError as e:
            log.log_error(str(e[0]) + ' ' + e[1] + ' for log file ' + self._log_file)

class Data(LogFile):
    def __init__(self, log_file):
        LogFile.__init__(self, log_file)
        self._data = ''

    def get_data(self):
        return self._data

    def set_data(self, data):
        self._data = data

    def validate_data(self):
        if self._data == '':
            log.log_debug("Empty data list")

class DataCPU(Data):
    def read_log(self):
        self.validate_log()
        # reading and writing to LIST stuff
        return LIST

class DataRAM(Data):
    def read_log(self):
        self.validate_log()
        # reading and writing to LIST stuff
        return LIST

class DataNET(Data):
Now I want my DataNET class to be a Data object with some more attributes, in particular a dictionary for every one of the interfaces. How can I override the __init__() method to be the same as Data.__init__() but also add self.dict = {}, without copying the Data constructor? That is, without explicitly specifying that DataNET objects have a ._data attribute, but inheriting it from Data.
Just call the Data.__init__() method from DataNET.__init__(), then set self._data = {}:
class DataNET(Data):
    def __init__(self, logfile):
        Data.__init__(self, logfile)
        self._data = {}
Now whatever Data.__init__() does to self happens first, leaving your DataNET initializer to add new attributes or override attributes set by the parent initializer.
In Python 3 classes are already new-style, but if this is Python 2, I'd add object as a base class to LogFile() to make it new-style too:
class LogFile(object):
after which you can use super() to automatically look up the parent __init__ method to call; this has the advantage that in a more complex cooperative inheritance scheme the right methods are invoked in the right order:
class Data(LogFile):
    def __init__(self, log_file):
        super(Data, self).__init__(log_file)
        self._data = ''

class DataNET(Data):
    def __init__(self, logfile):
        super(DataNET, self).__init__(logfile)
        self._data = {}
super() provides you with bound methods, so you don't need to pass in self as an argument to __init__ in that case. In Python 3, you can omit the arguments to super() altogether:
class Data(LogFile):
    def __init__(self, log_file):
        super().__init__(log_file)
        self._data = ''

class DataNET(Data):
    def __init__(self, logfile):
        super().__init__(logfile)
        self._data = {}
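To check the chain end to end, here is a minimal self-contained version with the file validation and logging stubbed out (the stubbing is mine, so the snippet runs without an actual log file):

```python
class LogFile:
    def __init__(self, log_file):
        self._log_file = log_file
        # the real code would call self.validate_log() here

class Data(LogFile):
    def __init__(self, log_file):
        super().__init__(log_file)
        self._data = ''

class DataNET(Data):
    def __init__(self, log_file):
        super().__init__(log_file)
        self._data = {}

net = DataNET('net.log')
print(net._log_file)    # set by LogFile.__init__: net.log
print(type(net._data))  # overridden after Data.__init__ ran: <class 'dict'>
```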
Use new-style classes (inherit from object) by changing the definition of LogFile to:
class LogFile(object):
and the __init__ method of Data to:
    def __init__(self, log_file):
        super(Data, self).__init__(log_file)
        self._data = ''
Then you can define DataNET as:
class DataNET(Data):
    def __init__(self, log_file):
        super(DataNET, self).__init__(log_file)
        self.dict = {}
I'm building a simulator, which will model various types of entities. So I've got a base class, ModelObject, and will use subclasses for all the different entities. Each entity will have a set of properties that I want to keep track of, so I've also got a class called RecordedDetail, that keeps tracks of changes (basically builds a list of (time_step, value) pairs) and each ModelObject has a dict to store these in. So I've got, effectively,
class ModelObject(object):
    def __init__(self):
        self.details = {}
        self.time_step = 0

    def get_detail(self, d_name):
        """Get the current value of the specified RecordedDetail."""
        return self.details[d_name].current_value()

    def set_detail(self, d_name, value):
        """Set the current value of the specified RecordedDetail."""
        self.details[d_name].set_value(value, self.time_step)

class Widget(ModelObject):
    def __init__(self):
        super().__init__()
        self.details["level"] = RecordedDetail()
        self.details["angle"] = RecordedDetail()

    @property
    def level(self):
        return self.get_detail("level")

    @level.setter
    def level(self, value):
        self.set_detail("level", value)

    @property
    def angle(self):
        return self.get_detail("angle")

    @angle.setter
    def angle(self, value):
        self.set_detail("angle", value)
This gets terribly repetitious, and I can't help thinking there must be a way of automating it using a descriptor, but I can't work out how. I end up with
class RecordedProperty(object):
    def __init__(self, p_name):
        self.p_name = p_name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.get_detail(self.p_name)

    def __set__(self, instance, value):
        instance.set_detail(self.p_name, value)

class Widget(ModelObject):
    level = RecordedProperty("level")
    angle = RecordedProperty("angle")

    def __init__(self):
        super().__init__()
        self.details["level"] = RecordedDetail()
        self.details["angle"] = RecordedDetail()
which is a bit of an improvement, but still a lot of typing.
So, a few questions.
Can I just add the descriptor stuff (__get__, __set__ etc) into the RecordedDetail class? Would there be any advantage to doing that?
Is there any way of typing the new property name (such as "level") fewer than three times, in two different places?
or
Am I barking up the wrong tree entirely?
The last bit of code is on the right track. You can make the process less nasty by using a metaclass to create a named RecordedProperty and a matching RecordedDetail for every item in a list. Here's a simple example:
class WidgetMeta(type):
    def __new__(cls, name, parents, dct):
        '''
        Automate the creation of the class: add a RecordedProperty
        for every name listed in _ATTRIBS.
        '''
        for item in dct.get('_ATTRIBS', ()):
            dct[item] = RecordedProperty(item)
        return super().__new__(cls, name, parents, dct)

class Widget(ModelObject, metaclass=WidgetMeta):
    _ATTRIBS = ['level', 'angle']

    def __init__(self, *args, **kwargs):
        super().__init__()
        for detail in self._ATTRIBS:
            self.details[detail] = RecordedDetail()
Subclasses would then just need to have different data in _ATTRIBS.
As an alternative (I think it's more complex) you could use the metaclass to customize __init__ in the same way you customize __new__, creating the RecordedDetails from the _ATTRIBS list.
A third option would be to create a RecordedDetail in every instance on first access. That would work fine as long as you don't have code that expects a RecordedDetail for every property even if the RecordedDetail has not been touched.
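Expanding on that third option: on Python 3.6+ the descriptor can both learn its own attribute name via __set_name__ (so the name is typed only once, which also answers question 2) and create the matching RecordedDetail lazily on first access. A sketch, with RecordedDetail reduced to a minimal stand-in for the original class:

```python
class RecordedDetail:
    """Minimal stand-in for the original RecordedDetail."""
    def __init__(self):
        self.history = []

    def set_value(self, value, time_step):
        self.history.append((time_step, value))

    def current_value(self):
        return self.history[-1][1]

class RecordedProperty:
    def __set_name__(self, owner, name):
        # Python 3.6+: the descriptor is told its attribute name,
        # so it never has to be spelled out twice
        self.p_name = name

    def _detail(self, instance):
        # lazily create the RecordedDetail on first access
        return instance.details.setdefault(self.p_name, RecordedDetail())

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._detail(instance).current_value()

    def __set__(self, instance, value):
        self._detail(instance).set_value(value, instance.time_step)

class Widget:
    level = RecordedProperty()
    angle = RecordedProperty()

    def __init__(self):
        self.details = {}
        self.time_step = 0

w = Widget()
w.level = 5
print(w.level)            # 5
print(sorted(w.details))  # only touched details exist: ['level']
```

As the caveat below notes, the drawback is that untouched properties have no RecordedDetail entry until first use.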
Caveat: I'm not super familiar with Python 3; I've used the above pattern often in 2.7.x.