raise flag if attribute is changed results in too many properties? - python

For my application I have an object which has some set of attributes. One set of attributes, the parameters, can be accessed and adjusted by the user. The other set of attributes, the outputs, should be accessible by the user but are calculated using internal methods. Furthermore, if any of the parameter attributes are adjusted the outputs must also be re-calculated from the internal methods and adjusted accordingly. However, as these calculations may be costly I do not want to needlessly run them unless (or until) they are requested.
Currently I can implement this by making each "parameter" attribute a property and including a self.calculated flag which is raised whenever any of the parameters are changed and also making each "output" attribute a property which checks the self.calculated flag and accordingly either returns the output directly if no calculation is needed or performs the calculation, lowers the flag, and returns the output.
See code
class Rectangle(object):
    def __init__(self, length=1, width=1):
        self._length = length
        self._width = width
        self._area = self.calc_area()
        self._perim = self.calc_perim()
        self.calculated = True
    @property
    def length(self):
        return self._length
    @length.setter
    def length(self, value):
        if value != self._length:
            self._length = value
            self.calculated = False
    @property
    def width(self):
        return self._width
    @width.setter
    def width(self, value):
        if value != self._width:
            self._width = value
            self.calculated = False
    @property
    def area(self):
        if self.calculated is True:
            return self._area
        else:
            self.recalculate()
            return self._area
    @property
    def perim(self):
        if self.calculated is True:
            return self._perim
        else:
            self.recalculate()
            return self._perim
    def calc_area(self):
        return self.length * self.width
    def calc_perim(self):
        return 2 * (self.length + self.width)
    def recalculate(self):
        self._area = self.calc_area()
        self._perim = self.calc_perim()
        self.calculated = True
    def double_width(self):
        self.width = 2 * self.width
This gives the desired behavior but seems like an excessive proliferation of properties which would be especially problematic if there got to be a large number of parameters and outputs.
Is there a cleaner way to implement this attribute change/recalculation structure? I have found a couple of posts where a solution is presented involving writing a __setattr__ method for the class, but I'm not sure if that would be straightforward to implement in my case, since the behavior should be different depending on the particular attribute being set. I guess this could be handled with a check in the __setattr__ method about whether the attribute is a parameter or output...
Decorating a class to monitor attribute changes
How to identify when an attribute's attribute is being set?
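The __setattr__ idea mentioned above could be sketched roughly as follows. This is only an illustration under assumptions: the class-level names _parameters and _outputs are made up for the example, and outputs are cached in the instance __dict__ so that __getattr__ (which Python only calls when normal lookup fails) runs a calculation only when no cached value exists.

```python
class Rectangle:
    # Hypothetical registries for this sketch: user-settable parameters and
    # the functions that compute each output from them.
    _parameters = ("length", "width")
    _outputs = {
        "area": lambda self: self.length * self.width,
        "perim": lambda self: 2 * (self.length + self.width),
    }

    def __init__(self, length=1, width=1):
        self.length = length
        self.width = width

    def __setattr__(self, name, value):
        if name in self._outputs:
            raise AttributeError(name + " is calculated and cannot be set")
        if name in self._parameters and getattr(self, name, None) != value:
            # A parameter really changed: drop every cached output.
            for output in self._outputs:
                self.__dict__.pop(output, None)
        super().__setattr__(name, value)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, i.e. the output is not cached.
        if name in self._outputs:
            value = self._outputs[name](self)
            self.__dict__[name] = value  # cache without going through __setattr__
            return value
        raise AttributeError(name)
```

With this layout, adding a parameter or an output means adding one entry to a registry rather than writing another property pair.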

There are several different options, and which to use is highly dependent on the use case. Clearly, the options below can be quite bad in many use cases.
Delete outputs when inputs change
An example of implementing this:
@width.setter
def width(self, value):
    if value != self._width:
        self._width = value
        (self._area, self._perim) = (None, None)
@property
def perim(self):
    # compare against None so that a legitimate perimeter of 0 stays cached
    if self._perim is None:
        self._perim = self.calc_perim()
    return self._perim
This doesn't address most of your concern, but it does get rid of the calculated flag (and while your code recalculates all of the outputs when any of them are requested after an update, this code just calculates the requested one).
Update values when inputs change
When you increase the width by x, the perimeter increases by 2*x and the area increases by x*length. In some cases, applying formulae such as these to update the values as the inputs change can be more efficient than calculating the outputs from scratch every time the inputs change.
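As a rough sketch of this incremental approach (only the width setter is shown; a length setter would be analogous, and the attribute names follow the question's Rectangle):

```python
class Rectangle:
    def __init__(self, length=1, width=1):
        self._length = length
        self._width = width
        self._area = length * width
        self._perim = 2 * (length + width)

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        delta = value - self._width
        self._width = value
        # Apply the update formulae instead of recomputing from scratch.
        self._perim += 2 * delta
        self._area += delta * self._length

    @property
    def area(self):
        return self._area

    @property
    def perim(self):
        return self._perim
```

For a rectangle this saves nothing, of course; the pattern only pays off when the full calculation is much more expensive than the update, and with floats repeated increments can accumulate rounding error.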
Keep track of last values
Whenever you calculate the outputs, keep track not only of what results you got, but what inputs you used to calculate them. Then next time an object is asked what its outputs are, it can check whether the outputs were calculated according to its current attributes. Obviously, this requires multiplying the input storage space.
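A minimal sketch of this idea, assuming the parameters are plain attributes with no properties at all. The names _area_inputs and calculations are invented for the illustration; calculations is only there to show how often the real work runs.

```python
class Rectangle:
    def __init__(self, length=1, width=1):
        self.length = length
        self.width = width
        self._area = None
        self._area_inputs = None  # the (length, width) pair _area was computed from
        self.calculations = 0     # illustration only

    @property
    def area(self):
        inputs = (self.length, self.width)
        if inputs != self._area_inputs:
            # The stored result is stale (or missing): recompute and remember
            # which inputs it belongs to.
            self._area = self.length * self.width
            self._area_inputs = inputs
            self.calculations += 1
        return self._area
```

The comparison of inputs happens on every access, so this trades the setter bookkeeping for a small per-read cost.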
Memoization
Going even further than the previous option, create a dictionary where the keys are tuples of attributes, and the values are output. If you currently have a function calculate_output(attributes), replace all calls to the function with
def output_lookup(attributes):
    if attributes not in output_dict:
        output_dict[attributes] = calculate_output(attributes)
    return output_dict[attributes]
You should use this option if you expect particular combinations of attributes to be repeated often, calculating the outputs is expensive, and/or memory is cheap. This can be shared across the class, so if you have several instances of rectangles that have the same length and width, you can have one (_perim,_area) value stored, rather than duplicating it across each instance. So for some use cases, this can be more efficient.
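The standard library's functools.lru_cache does this bookkeeping for you. A sketch, assuming the inputs are hashable; here calculate_output is a stand-in for the expensive calculation and returns a (perimeter, area) pair:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def calculate_output(length, width):
    # Stand-in for an expensive calculation; returns (perimeter, area).
    return 2 * (length + width), length * width


class Rectangle:
    def __init__(self, length=1, width=1):
        # Plain attributes: no setters or flags are needed, because the
        # cache is keyed on the current input values.
        self.length = length
        self.width = width

    @property
    def perim(self):
        return calculate_output(self.length, self.width)[0]

    @property
    def area(self):
        return calculate_output(self.length, self.width)[1]
```

Two rectangles with the same length and width share one cache entry. With maxsize=None the cache grows without bound, so a finite maxsize may be the better choice if many distinct inputs are expected.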
Note that your issue ultimately derives from the fact that you are trying to engage in some memoization (you want to save the results from your calculations, so that when someone accesses an object's outputs, you don't have to calculate the outputs if they've already been calculated for the current inputs), but you need to keep track of when to "invalidate the cache", so to speak. If you were to simply treat the area and perimeter as methods rather than attributes, or you were to treat instances as immutable and require resetting attributes be done by creating a new instance with the new values, you would eliminate the complexities that you've added to the length and width. You can't have it all: you can't have cached values from mutable attributes without some overhead.
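The immutable variant could look something like this sketch, which assumes Python 3.8+ for functools.cached_property. Because the inputs can never change, a cached output never needs invalidating; "changing" a parameter means building a new instance with dataclasses.replace.

```python
from dataclasses import dataclass, replace
from functools import cached_property


@dataclass(frozen=True)
class Rectangle:
    length: float = 1
    width: float = 1

    @cached_property
    def area(self):
        # Computed on first access, then stored on the instance.
        return self.length * self.width

    @cached_property
    def perim(self):
        return 2 * (self.length + self.width)


r = Rectangle(2, 3)
wider = replace(r, width=6)  # new instance; r itself is unchanged
```

Assigning to r.width raises an error here, which is the point: the overhead moves from cache invalidation to object creation.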
P.S. The "is True" in "if self.calculated is True:" is redundant.

Related

Updating Dependent Attributes After Mutation

Let's say I have the following classes:
import math
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self.origin = origin
        self.termination = termination
        self.length = self.calculate_length()
    def calculate_length(self):
        return math.sqrt(
            (self.origin.x - self.termination.x) ** 2
            + (self.origin.y - self.termination.y) ** 2
        )
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
An object of the LineSegment class is composed of two objects of the Point class. Now, let's say I initialize an object as so:
this_origin = Point(x=0, y=0)
this_termination = Point(x=1, y=1)
this_line_segment = LineSegment(origin=this_origin, termination=this_termination)
Note: The initialization of the line segment automatically calculates its length. This is critical to other parts of the codebase, and cannot be changed. I can see its length like this:
print(this_line_segment.length) # This prints "1.4142135623730951" to the console.
Now, I need to mutate one parameter of this_line_segment's sub-objects:
this_line_segment.origin.x = 1
However, this_line_segment's length attribute does not update based on the new origin's x coordinate:
print(this_line_segment.length) # This still prints "1.4142135623730951" to the console.
What is the pythonic way to implement updating a class's attributes when one of the attributes they are dependent upon changes?
Option 1: Getter and Setter Methods
In other object-oriented programming languages, the behavior you desire, adding additional logic when accessing the value of an instance variable, is typically implemented by "getter" and "setter" methods on all instance variables in the object:
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self._origin = origin
        self._termination = termination
    # getter method for origin
    def get_origin(self):
        return self._origin
    # setter method for origin
    def set_origin(self, new_origin):
        self._origin = new_origin
    # getter method for termination
    def get_termination(self):
        return self._termination
    # setter method for termination
    def set_termination(self, new_termination):
        self._termination = new_termination
    def get_length(self):
        # Calls the getters here, rather than the instance vars, in case
        # getter logic is added in the future
        return math.sqrt(
            (self.get_origin().x - self.get_termination().x) ** 2
            + (self.get_origin().y - self.get_termination().y) ** 2
        )
So that the extra length calculation is performed every time you get() the length variable, and instead of this_line_segment.origin.x = 1, you do:
new_origin = this_line_segment.get_origin()
new_origin.x = 1
this_line_segment.set_origin(new_origin)
print(this_line_segment.get_length())
(Note that I use _ in front of variables to denote that they are private and should only be accessed via getters and setters. For example, the variable length should never be set by the user--only through the LineSegment class.)
However, explicit getters and setters are clearly a clunky way to manage variables in Python, where the lenient access protections make accessing them directly more transparent.
Option 2: The @property decorator
A more Pythonic way to add getting and setting logic is the @property decorator, as @progmatico points out in their comment, which calls decorated getter and setter methods when an instance variable is accessed. Since all we need to do is calculate the length whenever it is needed, we can leave the other instance variables public for now:
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self.origin = origin
        self.termination = termination
    # getter method for length
    @property
    def length(self):
        return math.sqrt(
            (self.origin.x - self.termination.x) ** 2
            + (self.origin.y - self.termination.y) ** 2
        )
And usage:
this_line_segment = LineSegment(origin=Point(x=0,y=0),
termination=Point(x=1,y=1))
print(this_line_segment.length) # Prints 1.4142135623730951
this_line_segment.origin.x = 1
print(this_line_segment.length) # Prints 1.0
Tested in Python 3.7.7.
Note: We must do the length calculation in the length getter and not upon initialization of the LineSegment. We can't do the length calculation in the setter methods for the origin and termination instance variables and thus also in the initialization because the Point object is mutable, and mutating it does not call LineSegment's setter method. Although we could do this in Option 1, it would lead to an antipattern, in which we would have to recalculate every other instance variable in the setter for each instance variable of an object in the cases for which the instance variables depend on one another.
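To illustrate why a setter on origin cannot catch the mutation, here is a small sketch. The setter_calls counter is invented purely for the demonstration.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class LineSegment:
    def __init__(self, origin, termination):
        self.setter_calls = 0  # illustration only: counts calls to the origin setter
        self.origin = origin   # goes through the setter once
        self.termination = termination

    @property
    def origin(self):
        return self._origin

    @origin.setter
    def origin(self, new_origin):
        self.setter_calls += 1
        self._origin = new_origin


segment = LineSegment(Point(0, 0), Point(1, 1))
segment.origin.x = 1                       # only the getter runs; the Point is mutated
calls_after_mutation = segment.setter_calls
segment.origin = Point(2, 2)               # rebinding the attribute does run the setter
calls_after_rebinding = segment.setter_calls
```

Here calls_after_mutation is still 1 (from __init__), and only the rebinding raises it to 2. Anything computed inside the setter would therefore be stale after segment.origin.x = 1.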

Do all object variables need to be defined in the constructor?

I was wondering whether every self. attribute has to be defined in __init__. For example, I have this code:
class Colour:
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._rgb = (self._red, self._green, self._blue)
    def luminosity(self):
        self._luminosity = 0.5 * ((max(self._red, self._green, self._blue))/255)+((min(self._red, self._green, self._blue))/255)
        return self._luminosity
Am I right to define self._luminosity in the method def luminosity(self), or should I define it in __init__?
In this case, you don't need to define it, because it's only set and then returned when you could directly return the calculated value from your method!
Additionally, you can simplify the calculation a little, though I am not sure it is really luminosity, as there are a variety of interpretations different to yours
def luminosity(self):
    return 0.5 * (
        max(self._red, self._green, self._blue)
        + min(self._red, self._green, self._blue)
    ) / 255
If instead, you were caching the value (which may make sense if you do a more complex calculation or call the luminosity method many times), it would make sense to set it in __init__() and check before calculating (effectively caching the last call)
As @laol suggests, you can also use @property to simplify some of its use
And finally, you can take advantage of your combined RGB for the calculation
class Colour():
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._luminosity = None
    @property
    def rgb(self):
        return (self._red, self._green, self._blue)
    @property
    def luminosity(self):
        if self._luminosity is None:
            self._luminosity = 0.5 * (max(self.rgb) + min(self.rgb)) / 255
        return self._luminosity
c = Colour(128,100,100)
print(c.luminosity)
0.44705882352941173
Extending this even further, setting new values for the color components can set the cached value back to None, triggering re-calculation on the next call (rather than immediately, saving some calculation if many changes are made before the value is wanted), but this is left as an exercise to the reader
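One possible shape for that exercise, as a sketch only: the helper _component is a made-up name that builds a property whose setter stores the value and resets the cached luminosity to None.

```python
def _component(name):
    # Hypothetical helper: builds a property for one colour component whose
    # setter also invalidates the cached luminosity.
    private = "_" + name

    def getter(self):
        return getattr(self, private)

    def setter(self, value):
        setattr(self, private, value)
        self._luminosity = None  # recalculated lazily on next access

    return property(getter, setter)


class Colour:
    red = _component("red")
    green = _component("green")
    blue = _component("blue")

    def __init__(self, r, g, b):
        self._luminosity = None
        self.red, self.green, self.blue = r, g, b

    @property
    def rgb(self):
        return (self._red, self._green, self._blue)

    @property
    def luminosity(self):
        if self._luminosity is None:
            self._luminosity = 0.5 * (max(self.rgb) + min(self.rgb)) / 255
        return self._luminosity
```

Several component changes in a row cost nothing extra; the calculation runs once, the next time luminosity is read.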
I suggest to define it as a property:
@property
def luminosity(self):
    return 0.5 * ((max(self._red, self._green, self._blue))/255)+((min(self._red, self._green, self._blue))/255)
By this you can directly return it from any Colour c by
c.luminosity
No, instance variables do not need to be defined in __init__. Instance variables are completely dynamic and can be added any time either in a method or outside of the object (see note). However, if you don't define them, you have created an object access protocol that needs to be managed. Suppose another method is added:
def half_luminosity(self):
    return self._luminosity/2
It is an error to call it before luminosity. This code will raise AttributeError if it's called at the wrong time. You could assign self._luminosity = None in __init__ and check it
def half_luminosity(self):
    if self._luminosity is None:
        raise ValueError("Attempt to use luminosity before set")
    return self._luminosity/2
but that's not much different than
def half_luminosity(self):
    if not hasattr(self, '_luminosity'):
        raise ValueError("Attempt to use luminosity before set")
    return self._luminosity/2
If you have a class that is set up in more than one step, either way will do. PEP8 favors the first because it's easier for a future reader to see what's going on.
NOTE: Classes that use __slots__ or one of the getattr methods can change the rules as can C extensions.
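As a small illustration of the __slots__ case (a sketch; the slot list deliberately leaves out _luminosity), the lazily created attribute from the question now fails:

```python
class Colour:
    __slots__ = ("_red", "_green", "_blue")  # _luminosity is deliberately missing

    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b

    def luminosity(self):
        # Fails: there is no slot (and no instance __dict__) to hold _luminosity.
        self._luminosity = 0.5 * (
            max(self._red, self._green, self._blue)
            + min(self._red, self._green, self._blue)
        ) / 255
        return self._luminosity


error = None
try:
    Colour(128, 100, 100).luminosity()
except AttributeError as exc:
    error = exc  # the assignment inside luminosity() was rejected
```

With __slots__, every instance attribute has to be declared up front, so "define it later in a method" stops being an option unless the name is listed.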

Override all binary operators (or other way to respect physics dimensions) in python?

I'm building classes inherited from float to respect a dimension in some chemical calculations. e.g.:
class volume(float):
    def __init__(self, value):
        float._init_(value)
Now my goal is:
to raise an error whenever + or - is used between a normal float and an instance of volume
return an instance of volume, whenever + or - is used between two instances of volume
return an instance of volume, whenever * is used (from both sides) and / is used (from left)
raise an error when / is used from right (a normal float divided by an instance of volume)
return a float whenever two instances of volume are divided.
Right now, I'm going to override all these four operators, from left and right (e.g. __add__ and __radd__):
err = 'Only objects of type volume can be added or subtracted together'
def __add__(self, volume2):
    if isinstance(volume2, volume):
        return volume(float(self) + volume2)
    else:
        raise TypeError(err)
def __radd__(self, volume2):
    if isinstance(volume2, volume):
        return volume(float(self) + volume2)
    else:
        raise TypeError(err)
Is there any easier way to access all of them, or at least an expression to include both left and right uses of the operator?
It seems that this question is primarily about avoiding code duplication. Regarding the multiply and divide cases, you have slightly different functionality and may have to write separate methods explicitly, but for the addition and subtraction related methods, the following technique would work. It is essentially monkey-patching the class, and it is okay to do this, although you should not attempt similarly to monkey-patch instances in Python 3.
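The reason patching instances does not help is that Python 3 looks up special methods such as __add__ on the type, not on the instance, when evaluating an operator. A short sketch of that behaviour:

```python
class Volume(float):
    pass


v = Volume(1.0)
# Store a replacement __add__ on the instance only.
v.__add__ = lambda other: "patched"

result = v + Volume(2.0)           # the + operator ignores the instance attribute
explicit = v.__add__(Volume(2.0))  # an explicit attribute lookup does find it
```

Here result is the ordinary float sum, while explicit is the string returned by the lambda, so the operator behaviour has to be changed on the class.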
I have called the class Volume with capital V in accordance with convention.
class Volume(float):
    pass
def _make_wrapper(name):
    def wrapper(self, other):
        if not isinstance(other, Volume):
            raise ValueError
        return Volume(getattr(float, name)(self, other))
    setattr(Volume, name, wrapper)
for _method in ('__add__', '__radd__',
                '__sub__', '__rsub__',):
    _make_wrapper(_method)
The equivalent explicit method in these cases looks like the following, so adapt as required for the multiply / divide cases, but note the explicit use of float.__add__(self, other) rather than self + other as the question suggests that you intended to use (where the question mentions volume(self+volume2)), which would lead to infinite recursion.
def __add__(self, other):
    if not isinstance(other, Volume):
        raise ValueError
    return Volume(float.__add__(self, other))
Regarding the __init__, I have now removed it above, because if all it does is call float.__init__, then it is not necessary to define it at all (let it simply inherit __init__ from the base class). If you want to have an __init__ method in order to initialise something else, then yes you will also need to include the explicit call to float.__init__ as you do in the question (although note the double underscores -- in the question you are trying to call float._init_).
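For the multiply / divide cases, one way to read the rules listed in the question is sketched below. This is my interpretation rather than tested production code: * always gives a Volume, Volume / number gives a Volume, Volume / Volume gives a plain float, and number / Volume raises.

```python
class Volume(float):
    def __mul__(self, other):
        result = float.__mul__(self, other)
        if result is NotImplemented:  # e.g. other is a str
            return NotImplemented
        return Volume(result)

    __rmul__ = __mul__  # float multiplication is commutative

    def __truediv__(self, other):
        result = float.__truediv__(self, other)
        if result is NotImplemented:
            return NotImplemented
        # Volume / Volume is a dimensionless ratio: return the plain float.
        return result if isinstance(other, Volume) else Volume(result)

    def __rtruediv__(self, other):
        # Reached for number / Volume, which the question wants to forbid.
        raise TypeError("a plain number cannot be divided by a Volume")
```

As with addition, the float methods are called explicitly rather than writing self * other, which would recurse into the overridden operator.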
A metaclass is a way to control how classes are constructed.
You can use metaclasses to overload all the math operators like this:
err = 'Only objects of type volume can be added or subtracted together'
class OverLoadMeta(type):
    def __new__(meta, name, bases, dct):
        # Build one wrapper per float method, so that __sub__ really
        # subtracts instead of reusing float.__add__.
        def make_op(method_name):
            float_op = getattr(float, method_name)
            def op(self, volume2):
                if isinstance(volume2, Volume):
                    return Volume(float_op(self, volume2))
                else:
                    raise TypeError(err)
            return op
        # you can overload whatever method you want here
        for method in ('__add__', '__radd__', '__sub__'):
            dct[method] = make_op(method)
        return super(OverLoadMeta, meta).__new__(meta, name, bases, dct)
class Volume(float, metaclass=OverLoadMeta):
    pass
# you can use it like this:
a = Volume(1)
b = Volume(2)
c = a + b
print(c.__class__)
# class will be <class '__main__.Volume'>
a + 1
# raises TypeError: Only objects of type volume can be added or subtracted together

How to set up a class with all the methods of and functions like a built in such as float, but holds onto extra data?

I am working with 2 data sets on the order of ~ 100,000 values. These 2 data sets are simply lists. Each item in the list is a small class.
class Datum(object):
    def __init__(self, value, dtype, source, index1=None, index2=None):
        self.value = value
        self.dtype = dtype
        self.source = source
        self.index1 = index1
        self.index2 = index2
For each datum in one list, there is a matching datum in the other list that has the same dtype, source, index1, and index2, which I use to sort the two data sets such that they align. I then do various work with the matching data points' values, which are always floats.
Currently, if I want to determine the relative values of the floats in one data set, I do something like this.
minimum = min([x.value for x in data])
for datum in data:
    datum.value -= minimum
However, it would be nice to have my custom class inherit from float, and be able to act like this.
minimum = min(data)
data = [x - minimum for x in data]
I tried the following.
class Datum(float):
    def __new__(cls, value, dtype, source, index1=None, index2=None):
        new = float.__new__(cls, value)
        new.dtype = dtype
        new.source = source
        new.index1 = index1
        new.index2 = index2
        return new
However, doing
data = [x - minimum for x in data]
removes all of the extra attributes (dtype, source, index1, index2).
How should I set up a class that functions like a float, but holds onto the extra data that I instantiate it with?
UPDATE: I do many types of mathematical operations beyond subtraction, so rewriting all of the methods that work with a float would be very troublesome, and frankly I'm not sure I could rewrite them properly.
I suggest subclassing float and using a couple decorators to "capture" the float output from any method (except for __new__ of course) and returning a Datum object instead of a float object.
First we write the method decorator (which really isn't being used as a decorator below, it's just a function that modifies the output of another function, AKA a wrapper function):
def mydecorator(f, cls):
    #f is the method being modified, cls is its class (in this case, Datum)
    def func_wrapper(*args, **kwargs):
        #*args and **kwargs are all the arguments that were passed to f
        newvalue = f(*args, **kwargs)
        #newvalue now contains the output float would normally produce
        ##Now get cls instance provided as part of args (we need one
        ##if we're going to reattach instance information later):
        try:
            self = args[0]
            ##Now check to make sure new value is an instance of some numerical
            ##type, but NOT a bool or a cls type (which might lead to recursion)
            ##Including ints so things like modulo and round will work right
            if (isinstance(newvalue, float) or isinstance(newvalue, int)) and not isinstance(newvalue, bool) and type(newvalue) != cls:
                ##If newvalue is a float or int, now we make a new cls instance using the
                ##newvalue for value and using the previous self instance information (args[0])
                ##for the other fields
                return cls(newvalue, self.dtype, self.source, self.index1, self.index2)
        #IndexError raised if no args provided, AttributeError raised if self isn't a cls instance
        except (IndexError, AttributeError):
            pass
        ##If newvalue isn't numerical, or we don't have a self, just return what
        ##float would normally return
        return newvalue
    #the function has now been modified and we return the modified version
    #to be used instead of the original version, f
    return func_wrapper
The first decorator only applies to a method to which it is attached. But we want it to decorate all (actually, almost all) the methods inherited from float (well, those that appear in the float's __dict__, anyway). This second decorator will apply our first decorator to all of the methods in the float subclass except for those listed as exceptions (see this answer):
def for_all_methods_in_float(decorator, *exceptions):
    def decorate(cls):
        for attr in float.__dict__:
            if callable(getattr(float, attr)) and attr not in exceptions:
                setattr(cls, attr, decorator(getattr(float, attr), cls))
        return cls
    return decorate
Now we write the subclass much the same as you had before, but decorated, and excluding __new__ from decoration (I guess we could also exclude __init__ but __init__ doesn't return anything, anyway):
@for_all_methods_in_float(mydecorator, '__new__')
class Datum(float):
    def __new__(klass, value, dtype="dtype", source="source", index1="index1", index2="index2"):
        return super(Datum, klass).__new__(klass, value)
    def __init__(self, value, dtype="dtype", source="source", index1="index1", index2="index2"):
        self.value = value
        self.dtype = dtype
        self.source = source
        self.index1 = index1
        self.index2 = index2
        super(Datum, self).__init__()
Here are our testing procedures; iteration seems to work correctly:
d1 = Datum(1.5)
d2 = Datum(3.2)
d3 = d1+d2
assert d3.source == 'source'
L=[d1,d2,d3]
d4=max(L)
assert d4.source == 'source'
L = [i for i in L]
assert L[0].source == 'source'
assert type(L[0]) == Datum
minimum = min(L)
assert [x - minimum for x in L][0].source == 'source'
Notes:
I am using Python 3. Not certain if that will make a difference for you.
This approach effectively overrides EVERY method of float other than the exceptions, even the ones for which the result isn't modified. There may be side effects to this (subclassing a built-in and then overriding all of its methods), e.g. a performance hit or something; I really don't know.
This will also decorate nested classes.
This same approach could also be implemented using a metaclass.
The problem is when you do :
x - minimum
in terms of types you are doing either :
datum - float, or datum - integer
Either way python doesn't know how to do either of them, so what it does is look at parent classes of the arguments if it can. since datum is a type of float, it can easily use float - and the calculation ends up being
float - float
which will obviously result in a 'float' - python has no way of knowing how to construct your datum object unless you tell it.
To solve this you either need to implement the mathematical operators so that python knows how to do datum - float or come up with a different design.
Assuming that 'dtype', 'source', index1 & index2 need to stay the same after a calculation - then as an example your class needs :
def __sub__(self, other):
    # assumes the instance keeps its number in self.value, as the original Datum class does
    return Datum(self.value - other, self.dtype, self.source, self.index1, self.index2)
this should work - not tested
and this will now allow you to do this
d = Datum(23.0, dtype="float", source="me", index1=1)
e = d - 16
print(e.value, e.dtype, e.source, e.index1, e.index2)
which should result in :
7.0 float me 1 None

sharing a string between two objects

I want two objects to share a single string object. How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other? I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work.
However, I have a tendency to overthink problems, so undoubtedly there is an easier way. Or maybe sharing the string is the wrong way to go? Keep in mind that I want both objects to be able to edit the string. Any ideas?
Here is an example of a solution I could use:
class Buffer(object):
    def __init__(self):
        self.data = ""
    def assign(self, value):
        self.data = str(value)
    def __getattr__(self, name):
        return getattr(self.data, name)
class Descriptor(object):
    def __get__(self, instance, owner):
        return instance._buffer.data
    def __set__(self, instance, value):
        if not hasattr(instance, "_buffer"):
            if isinstance(value, Buffer):
                instance._buffer = value
                return
            instance._buffer = Buffer()
        instance._buffer.assign(value)
class First(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def read(self, size=-1):
        if size < 0:
            size = len(self.data)
        data = self.data[:size]
        self.data = self.data[size:]
        return data
class Second(object):
    data = Descriptor()
    def __init__(self, data):
        self.data = data
    def add(self, newdata):
        self.data += newdata
    def reset(self):
        self.data = ""
    def spawn(self):
        return First(self._buffer)
s = Second("stuff")
f = s.spawn()
f.data == s.data
#True
f.read(2)
#"st"
f.data
# "uff"
f.data == s.data
#True
s.data
#"uff"
s._buffer == f._buffer
#True
Again, this seems like absolute overkill for what seems like a simple problem. As well, it requires the use of the Buffer class, a descriptor, and the descriptor's impositional _buffer variable.
An alternative is to put one of the objects in charge of the string and then have it expose an interface for making changes to the string. Simpler, but not quite the same effect.
"I want two objects to share a single string object."
They will, if you simply pass the string -- Python doesn't copy unless you tell it to copy.
"How do I pass the string object from the first to the second such that any changes applied by one will be visible to the other?"
There can never be any change made to a string object (it's immutable!), so your requirement is trivially met (since a false precondition implies anything).
"I am guessing that I would have to wrap the string in a sort of buffer object and do all sorts of complexity to get it to work."
You could use (assuming this is Python 2 and you want a string of bytes) an array.array with a typecode of c. Arrays are mutable, so you can indeed alter them (with mutating methods -- and some operators, which are a special case of methods since they invoke special methods on the object). They don't have the myriad non-mutating methods of strings, so, if you need those, you'll indeed need a simple wrapper (delegating said methods to the str(...) of the array that the wrapper also holds).
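The 'c' typecode exists only in Python 2. In Python 3 a bytearray plays a similar role for byte data, since it is mutable and supports in-place edits. A rough sketch of two objects sharing one (the class names Reader and Writer are invented for the example):

```python
class Reader:
    def __init__(self, buffer):
        self.buffer = buffer

    def read(self, size):
        data = bytes(self.buffer[:size])
        del self.buffer[:size]  # mutates the shared bytearray in place
        return data


class Writer:
    def __init__(self, buffer):
        self.buffer = buffer

    def add(self, newdata):
        self.buffer.extend(newdata)  # in-place as well, no rebinding


shared = bytearray(b"stuff")
writer = Writer(shared)
reader = Reader(shared)
first_two = reader.read(2)  # consumes b"st" from the shared buffer
writer.add(b"!")            # the reader sees this too
```

Both objects hold a reference to the same bytearray, so neither a descriptor nor a wrapper class is needed as long as every change is a mutation rather than an assignment to self.buffer.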
It doesn't seem there should be any special complexity, unless of course you want to do something truly weird as you seem to given your example code (have an assignment, i.e., a rebinding of a name, magically affect a different name -- that has absolutely nothing to do with whatever object was previously bound to the name you're rebinding, nor does it change that object in any way -- the only object it "changes" is the one holding the attribute, so it's obvious that you need descriptors or other magic on said object).
You appear to come from some language where variables (and particularly strings) are "containers of data" (like C, Fortran, or C++). In Python (like, say, in Java), names (the preferred way to call what others call "variables") always just refer to objects, they don't contain anything except exactly such a reference. Some objects can be changed, some can't, but that has absolutely nothing to do with the assignment statement (see note 1) (which doesn't change objects: it rebinds names).
(note 1): except of course that rebinding an attribute or item does alter the object that "contains" that item or attribute -- objects can and do contain, it's names that don't.
Just put your value to be shared in a list, and assign the list to both objects.
class A(object):
    def __init__(self, strcontainer):
        self.strcontainer = strcontainer
    def upcase(self):
        self.strcontainer[0] = self.strcontainer[0].upper()
    def __str__(self):
        return self.strcontainer[0]
# create a string, inside a shareable list
shared = ['Hello, World!']
x = A(shared)
y = A(shared)
# both objects have the same list
print(id(x.strcontainer))
print(id(y.strcontainer))
# change value in x
x.upcase()
# show how value is changed in both x and y
print(str(x))
print(str(y))
Prints:
10534024
10534024
HELLO, WORLD!
HELLO, WORLD!
I am not a great expert in Python, but I think that if you declare a variable in a module and add a getter/setter to the module for this variable, you will be able to share it this way.
