I was wondering whether every self. attribute has to be defined in __init__. For example, I have this code:
class Colour:
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._rgb = (self._red, self._green, self._blue)

    def luminosity(self):
        self._luminosity = 0.5 * ((max(self._red, self._green, self._blue))/255)+((min(self._red, self._green, self._blue))/255)
        return self._luminosity
Am I right to define self._luminosity in the luminosity method, or should I define it in __init__?
In this case, you don't need to define it, because it's only set and then returned when you could directly return the calculated value from your method!
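Additionally, you can simplify the calculation a little. Note that, because of operator precedence, your original expression multiplies only the max term by 0.5, whereas the version below halves the sum of max and min. I am also not sure it is really luminosity, as there are several interpretations that differ from yours.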
def luminosity(self):
    return 0.5 * (
        max(self._red, self._green, self._blue)
        + min(self._red, self._green, self._blue)
    ) / 255
If instead, you were caching the value (which may make sense if you do a more complex calculation or call the luminosity method many times), it would make sense to set it in __init__() and check before calculating (effectively caching the last call)
As @laol suggests, you can also use @property to simplify some of its use
And finally, you can take advantage of your combined RGB for the calculation
class Colour():
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._luminosity = None  # cache; None means "not calculated yet"

    @property
    def rgb(self):
        return (self._red, self._green, self._blue)

    @property
    def luminosity(self):
        # calculate on first access, then reuse the cached value
        if self._luminosity is None:
            self._luminosity = 0.5 * (max(self.rgb) + min(self.rgb)) / 255
        return self._luminosity

c = Colour(128,100,100)
print(c.luminosity)
0.44705882352941173
Extending this even further, setting new values for the color components can set the cached value back to None, triggering re-calculation on the next call (rather than immediately, saving some calculation if many changes are made before the value is wanted), but this is left as an exercise to the reader
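One possible sketch of that exercise, for what it's worth. The red setter here is my own assumption about how the components would be exposed (green and blue would follow the same pattern); it is not part of the original class.

```python
class Colour:
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._luminosity = None  # cache; None means "not calculated yet"

    @property
    def red(self):
        return self._red

    @red.setter
    def red(self, value):
        # hypothetical setter: changing a component invalidates the cache
        self._red = value
        self._luminosity = None

    @property
    def luminosity(self):
        # recalculated lazily, only when the cache has been invalidated
        if self._luminosity is None:
            rgb = (self._red, self._green, self._blue)
            self._luminosity = 0.5 * (max(rgb) + min(rgb)) / 255
        return self._luminosity


c = Colour(128, 100, 100)
first = c.luminosity    # calculated and cached
c.red = 255             # cache reset, nothing recalculated yet
second = c.luminosity   # recalculated on this access
```

Several component changes in a row only reset the cache; the calculation runs once, on the next read.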
I suggest defining it as a property:
@property
def luminosity(self):
    return 0.5 * ((max(self._red, self._green, self._blue))/255)+((min(self._red, self._green, self._blue))/255)
This way you can read it directly from any Colour c with
c.luminosity
No, instance variables do not need to be defined in __init__. Instance variables are completely dynamic and can be added any time either in a method or outside of the object (see note). However, if you don't define them, you have created an object access protocol that needs to be managed. Suppose another method is added:
def half_luminosity(self):
    return self._luminosity/2
It is an error to call it before luminosity. This code will raise AttributeError if it's called at the wrong time. You could assign self._luminosity = None in __init__ and check it
def half_luminosity(self):
    if self._luminosity is None:
        raise ValueError("Attempt to use luminosity before set")
    return self._luminosity/2
but that's not much different than
def half_luminosity(self):
    if not hasattr(self, '_luminosity'):
        raise ValueError("Attempt to use luminosity before set")
    return self._luminosity/2
If you have a class that is set up in more than one step, either way will do. PEP8 favors the first because it's easier for a future reader to see what's going on.
NOTE: Classes that use __slots__ or one of the getattr methods can change the rules as can C extensions.
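To illustrate the note with a minimal sketch (the class names Open and Slotted are made up for this example): an ordinary instance accepts new attributes at any time, while a class that declares __slots__ rejects names it did not list.

```python
class Open:
    pass


class Slotted:
    __slots__ = ("x",)  # only "x" may be set on instances


o = Open()
o.anything = 1  # fine: ordinary instances accept new attributes

s = Slotted()
s.x = 1         # fine: declared in __slots__
try:
    s.y = 2     # not declared, so this raises AttributeError
    slotted_rejected = False
except AttributeError:
    slotted_rejected = True
```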
Let's say I have the following classes:
import math

class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self.origin = origin
        self.termination = termination
        self.length = self.calculate_length()

    def calculate_length(self):
        return math.sqrt(
            (self.origin.x - self.termination.x) ** 2
            + (self.origin.y - self.termination.y) ** 2
        )

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
An object of the LineSegment class is composed of two objects of the Point class. Now, let's say I initialize an object as so:
this_origin = Point(x=0, y=0)
this_termination = Point(x=1, y=1)
this_line_segment = LineSegment(origin=this_origin, termination=this_termination)
Note: The initialization of the line segment automatically calculates its length. This is critical to other parts of the codebase, and cannot be changed. I can see its length like this:
print(this_line_segment.length) # This prints "1.4142135623730951" to the console.
Now, I need to mutate one parameter of this_line_segment's sub-objects:
this_line_segment.origin.x = 1
However, this_line_segment's length attribute does not update based on the new origin's x coordinate:
print(this_line_segment.length) # This still prints "1.4142135623730951" to the console.
What is the pythonic way to implement updating a class's attributes when one of the attributes they are dependent upon changes?
Option 1: Getter and Setter Methods
In other object-oriented programming languages, the behavior you desire, adding additional logic when accessing the value of an instance variable, is typically implemented by "getter" and "setter" methods on all instance variables in the object:
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self._origin = origin
        self._termination = termination

    # getter method for origin
    def get_origin(self):
        return self._origin

    # setter method for origin
    def set_origin(self, new_origin):
        self._origin = new_origin

    # getter method for termination
    def get_termination(self):
        return self._termination

    # setter method for termination
    def set_termination(self, new_termination):
        self._termination = new_termination

    def get_length(self):
        # Calls the getters here, rather than the instance vars, in case
        # getter logic is added in the future
        return math.sqrt(
            (self.get_origin().x - self.get_termination().x) ** 2
            + (self.get_origin().y - self.get_termination().y) ** 2
        )
So that the extra length calculation is performed every time you get() the length variable, and instead of this_line_segment.origin.x = 1, you do:
new_origin = this_line_segment.get_origin()
new_origin.x = 1
this_line_segment.set_origin(new_origin)
print(this_line_segment.get_length())
(Note that I use _ in front of variables to denote that they are private and should only be accessed via getters and setters. For example, the variable length should never be set by the user--only through the LineSegment class.)
However, explicit getters and setters are clearly a clunky way to manage variables in Python, where the lenient access protections make accessing them directly more transparent.
Option 2: The @property decorator
A more Pythonic way to add getting and setting logic is the @property decorator, as @progmatico points out in their comment, which calls decorated getter and setter methods when an instance variable is accessed. Since all we need to do is calculate the length whenever it is needed, we can leave the other instance variables public for now:
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self.origin = origin
        self.termination = termination

    # getter method for length
    @property
    def length(self):
        return math.sqrt(
            (self.origin.x - self.termination.x) ** 2
            + (self.origin.y - self.termination.y) ** 2
        )
And usage:
this_line_segment = LineSegment(origin=Point(x=0,y=0),
termination=Point(x=1,y=1))
print(this_line_segment.length) # Prints 1.4142135623730951
this_line_segment.origin.x = 1
print(this_line_segment.length) # Prints 1.0
Tested in Python 3.7.7.
Note: We must do the length calculation in the length getter and not upon initialization of the LineSegment. We can't do the length calculation in the setter methods for the origin and termination instance variables and thus also in the initialization because the Point object is mutable, and mutating it does not call LineSegment's setter method. Although we could do this in Option 1, it would lead to an antipattern, in which we would have to recalculate every other instance variable in the setter for each instance variable of an object in the cases for which the instance variables depend on one another.
I've found that I have two unrelated functions that implement identical behavior in different ways. I'm now wondering if there's a way, via decorators probably, to deal with this efficiently, to avoid writing the same logic over and over if the behavior is added elsewhere.
Essentially I have two functions in two different classes that have a flag called exact_match. Both functions check for some type of equivalence in the objects that they are members of. The exact_match flag forces the function to compare floats exactly instead of with a tolerance. You can see how I do this below.
def is_close(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

def _equal(val_a, val_b):
    """Wrapper for equality test to send in place of is_close."""
    return val_a == val_b

@staticmethod
def get_equivalence(obj_a, obj_b, check_name=True, exact_match=False):
    equivalence_func = is_close
    if exact_match:
        # If we're looking for an exact match, change the function we use to the equality tester.
        equivalence_func = _equal
    if check_name:
        return obj_a.name == obj_b.name
    # Check minimum resolutions if they are specified
    if 'min_res' in obj_a and 'min_res' in obj_b and not equivalence_func(obj_a['min_res'], obj_b['min_res']):
        return False
    return False
As you can see, standard procedure has us use the function is_close when we don't need an exact match, but we swap out the function call when we do. Now another function needs this same logic, swapping out the function. Is there a way to use decorators or something similar to handle this type of logic when I know a specific function call may need to be swapped out?
No decorator needed; just pass the desired function as an argument to get_equivalence (which is now little more than a wrapper that applies
the argument).
def make_eq_with_tolerance(rel_tol=1e-09, abs_tol=0.0):
    def _(a, b):
        return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
    return _

# This is just operator.eq, by the way
def _equal(val_a, val_b):
    return val_a == val_b

def same_name(a, b):
    return a.name == b.name
Now get_equivalence takes three arguments: the two objects to compare
and a function that gets called on those two arguments.
@staticmethod
def get_equivalence(obj_a, obj_b, equivalence_func):
    return equivalence_func(obj_a, obj_b)
Some example calls:
get_equivalence(a, b, make_eq_with_tolerance())
get_equivalence(a, b, make_eq_with_tolerance(rel_tol=1e-12)) # Really tight tolerance
get_equivalence(a, b, _equal)
get_equivalence(a, b, same_name)
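As a side note, the standard library already covers both comparators: operator.eq for the exact test and math.isclose (Python 3.5+) for the tolerance test, which uses the same formula as the hand-written version. A small sketch, where the names exact and loose and the sample values are mine, chosen only for illustration:

```python
import math
import operator
from functools import partial


def get_equivalence(obj_a, obj_b, equivalence_func):
    return equivalence_func(obj_a, obj_b)


exact = operator.eq
# partial pre-binds the tolerances, playing the role of make_eq_with_tolerance
loose = partial(math.isclose, rel_tol=1e-9, abs_tol=0.0)

a, b = 1.0, 1.0 + 1e-12
exact_result = get_equivalence(a, b, exact)  # the two floats differ slightly
loose_result = get_equivalence(a, b, loose)  # within the relative tolerance
```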
I came up with an alternative solution that is perhaps less correct but lets me solve the problem as I originally wanted to.
My solution uses a utility class that can be used as a member of a class or as a mixin for the class to provide the utility functions in a convenient way. Below, the functions _equals and is_close are defined elsewhere, as their implementations are beside the point.
class EquivalenceUtil(object):
    def __init__(self, equal_comparator=_equals, inexact_comparator=is_close):
        self.equals = equal_comparator
        self.default_comparator = inexact_comparator

    def check_equivalence(self, obj_a, obj_b, exact_match=False, **kwargs):
        return self.equals(obj_a, obj_b, **kwargs) if exact_match else self.default_comparator(obj_a, obj_b, **kwargs)
It's a simple class that can be used like so:
class BBOX(object):
    _equivalence = EquivalenceUtil()

    def __init__(self, **kwargs):
        ...

    @classmethod
    def are_equivalent(cls, bbox_a, bbox_b, exact_match=False):
        """Test for equivalence between two BBOX's."""
        bbox_list = bbox_a.as_list
        other_list = bbox_b.as_list
        for _index in range(0, 3):
            if not cls._equivalence.check_equivalence(bbox_list[_index],
                                                      other_list[_index],
                                                      exact_match=exact_match):
                return False
        return True
This solution is more opaque to the user about how things are checked behind the scenes, which is important for my project. Additionally it is pretty flexible and can be reused within a class in multiple places and ways, and easily added to a new class.
In my original example the code can turn into this:
class TileGrid(object):
    _equivalence = EquivalenceUtil()

    def __init__(self, **kwargs):
        ...

    # classmethod rather than staticmethod, because the body uses cls
    @classmethod
    def are_equivalent(cls, grid1, grid2, check_name=False, exact_match=False):
        if check_name:
            return grid1.name == grid2.name
        # Check minimum resolutions if they are specified
        if 'min_res' in grid1 and 'min_res' in grid2 and not cls._equivalence.check_equivalence(grid1['min_res'], grid2['min_res'], exact_match=exact_match):
            return False
        # Compare the bounding boxes of the two grids if they exist in the grid
        if 'bbox' in grid1 and 'bbox' in grid2:
            return BBOX.are_equivalent(grid1.bbox, grid2.bbox, exact_match=exact_match)
        return False
I can't recommend this approach in the general case, because I can't help but feel there's some code smell to it, but it does exactly what I need it to and will solve a great many problems for my current codebase. We have specific requirements, this is a specific solution. The solution by chepner is probably best for the general case of letting the user decide how a function should test equivalence.
I'm using the property and setter decorators in the following way:
import numpy as np

class PCAModel(object):
    def __init__(self):
        self.M_inv = None

    @property
    def M_inv(self):
        return self.__M_inv

    @M_inv.setter
    def M_inv(self):
        M = self.var * np.eye(self.W.shape[1]) + np.matmul(self.W.T, self.W)
        self.__M_inv = np.linalg.inv(M)
This generates an error in the __init__ function because my setter is not taking an argument:
TypeError: M_inv() takes 1 positional argument but 2 were given
I don't want to set the M_inv with an argument, since the calculations of M_inv rely solely on other properties of the class object. I could put a dummy argument in the setter:
@M_inv.setter
def M_inv(self, foo):
    M = self.var * np.eye(self.W.shape[1]) + np.matmul(self.W.T, self.W)
    self.__M_inv = np.linalg.inv(M)
but that feels dirty. Is there a better way to get around this?
You are missing the point of setters and getters, although the names are pretty self-explanatory. If your parameter is calculated independently of whatever value would be assigned (which is why you want to omit the argument in the setter), then a setter is simply not needed. Since all you want to do is calculate this parameter for each instance, just calculate and return the value in the getter, so you get the correct, newly calculated value each time you access it.
@property
def M_inv(self):
    M = self.var * np.eye(self.W.shape[1]) + np.matmul(self.W.T, self.W)
    return np.linalg.inv(M)
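A getter-only property also protects the derived value from accidental assignment. Here is a minimal sketch of that behaviour; Circle is a made-up stand-in for PCAModel so the example runs without numpy.

```python
import math


class Circle:
    """Hypothetical example class; area is derived, so it has no setter."""

    def __init__(self, radius):
        self.radius = radius

    @property
    def area(self):
        # recomputed from current state on every access
        return math.pi * self.radius ** 2


c = Circle(1.0)
before = c.area
c.radius = 2.0
after = c.area      # reflects the new radius automatically

try:
    c.area = 10     # no setter defined, so this raises AttributeError
    assignment_allowed = True
except AttributeError:
    assignment_allowed = False
```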
For my application I have an object which has some set of attributes. One set of attributes, the parameters, can be accessed and adjusted by the user. The other set of attributes, the outputs, should be accessible by the user but are calculated using internal methods. Furthermore, if any of the parameter attributes are adjusted the outputs must also be re-calculated from the internal methods and adjusted accordingly. However, as these calculations may be costly I do not want to needlessly run them unless (or until) they are requested.
Currently I can implement this by making each "parameter" attribute a property and including a self.calculated flag which is raised whenever any of the parameters are changed and also making each "output" attribute a property which checks the self.calculated flag and accordingly either returns the output directly if no calculation is needed or performs the calculation, lowers the flag, and returns the output.
See code
class Rectangle(object):
    def __init__(self, length=1, width=1):
        self._length = length
        self._width = width
        self._area = self.calc_area()
        self._perim = self.calc_perim()
        self.calculated = True

    @property
    def length(self):
        return self._length

    @length.setter
    def length(self, value):
        if value != self._length:
            self._length = value
            self.calculated = False

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        if value != self._width:
            self._width = value
            self.calculated = False

    @property
    def area(self):
        if self.calculated is True:
            return self._area
        else:
            self.recalculate()
            return self._area

    @property
    def perim(self):
        if self.calculated is True:
            return self._perim
        else:
            self.recalculate()
            return self._perim

    def calc_area(self):
        return self.length * self.width

    def calc_perim(self):
        return 2 * (self.length + self.width)

    def recalculate(self):
        self._area = self.calc_area()
        self._perim = self.calc_perim()
        self.calculated = True

    def double_width(self):
        self.width = 2 * self.width
This gives the desired behavior but seems like an excessive proliferation of properties which would be especially problematic if there got to be a large number of parameters and outputs.
Is there a cleaner way to implement this attribute change/recalculation structure? I have found a couple of posts where a solution is presented involving writing a __setattr__ method for the class, but I'm not sure if that would be straightforward to implement in my case, since the behavior should be different depending on the particular attribute being set. I guess this could be handled with a check in the __setattr__ method about whether the attribute is a parameter or output...
Decorating a class to monitor attribute changes
How to identify when an attribute's attribute is being set?
There are several different options, and which to use is highly dependent on the use case. Clearly, the options below can be quite bad in many use cases.
Delete outputs when inputs change
An example of implementing this:
@width.setter
def width(self, value):
    if value != self._width:
        self._width = value
        (self._area, self._perim) = (None, None)

@property
def perim(self):
    # "is None" rather than "not", so a legitimate value of 0 is not recalculated
    if self._perim is None:
        self._perim = self.calc_perim()
    return self._perim
This doesn't address most of your concern, but it does get rid of the calculated flag (and while your code recalculates all of the outputs when any of them are requested after an update, this code just calculates the requested one).
Update values when inputs change
When you increase the width by x, the perimeter increases by 2*x and the area increases by x*length. In some cases, applying formulae such as these to update the values as the inputs change can be more efficient than calculating the outputs from scratch every time the inputs change.
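A rough sketch of the incremental idea, trimmed to the width setter only (length would need the mirror-image treatment). Storing area and perim as plain attributes is my simplification here, not something from the question.

```python
class Rectangle:
    def __init__(self, length=1, width=1):
        self._length = length
        self._width = width
        self.area = length * width
        self.perim = 2 * (length + width)

    @property
    def width(self):
        return self._width

    @width.setter
    def width(self, value):
        delta = value - self._width
        self._width = value
        # apply the formulae from the text instead of recomputing from scratch
        self.area += delta * self._length
        self.perim += 2 * delta


r = Rectangle(length=3, width=2)
r.width = 5  # area grows by 3 * 3, perimeter by 2 * 3
```

For a rectangle this saves nothing, of course; it only pays off when the full calculation is much more expensive than the update.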
Keep track of last values
Whenever you calculate the outputs, keep track not only of what results you got, but what inputs you used to calculate them. Then next time an object is asked what its outputs are, it can check whether the outputs were calculated according to its current attributes. Obviously, this requires multiplying the input storage space.
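Something along these lines, as a sketch: the parameters stay plain attributes, and the output property compares the current inputs with the ones it last used. The calc_count attribute is only there to show when a real calculation happens; it is not needed in practice.

```python
class Rectangle:
    def __init__(self, length=1, width=1):
        self.length = length        # plain attributes, no setters needed
        self.width = width
        self._area = None
        self._area_inputs = None    # inputs used for the stored result
        self.calc_count = 0         # illustration only

    @property
    def area(self):
        current = (self.length, self.width)
        if current != self._area_inputs:
            # inputs changed since the last calculation, so redo it
            self._area = self.length * self.width
            self._area_inputs = current
            self.calc_count += 1
        return self._area


r = Rectangle(3, 2)
a1 = r.area   # calculated
a2 = r.area   # served from the stored value
r.width = 5
a3 = r.area   # inputs differ, calculated again
```

This trades the per-parameter properties for one comparison per read.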
Memoization
Going even further than the previous option, create a dictionary where the keys are tuples of attributes, and the values are output. If you currently have a function calculate_output(attributes), replace all calls to the function with
def output_lookup(attributes):
    if attributes not in output_dict:
        output_dict[attributes] = calculate_output(attributes)
    return output_dict[attributes]
You should use this option if you expect particular combinations of attributes to be repeated often, calculating the outputs is expensive, and/or memory is cheap. This can be shared across the class, so if you have several instances of rectangles that have the same length and width, you can have one (_perim,_area) value stored, rather than duplicating it across each instance. So for some use cases, this can be more efficient.
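In Python 3 the dictionary bookkeeping can be handed to functools.lru_cache instead of being written by hand. A sketch under the assumption that the inputs are hashable; calculate_output here is a stand-in taking length and width, not the asker's real function.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def calculate_output(length, width):
    # stand-in for an expensive calculation; returns (area, perimeter)
    return (length * width, 2 * (length + width))


class Rectangle:
    def __init__(self, length=1, width=1):
        self.length = length
        self.width = width

    @property
    def area(self):
        return calculate_output(self.length, self.width)[0]

    @property
    def perim(self):
        return calculate_output(self.length, self.width)[1]


a = Rectangle(3, 2)
b = Rectangle(3, 2)
area_a = a.area     # first call for (3, 2): calculated
perim_b = b.perim   # same inputs on another instance: cache hit
info = calculate_output.cache_info()
```

With maxsize=None the cache grows without bound, so a finite maxsize may be the safer choice if the inputs vary a lot.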
Note that your issue ultimately derives from the fact that you are trying to engage in some memoization (you want to save the results from your calculations, so that when someone accesses an object's outputs, you don't have to calculate the outputs if they've already been calculated for the current inputs), but you need to keep track of when to "invalidate the cache", so to speak. If you were to simply treat the area and perimeter as methods rather than attributes, or you were to treat instances as immutable and require that resetting attributes be done by creating a new instance with the new values, you would eliminate the complexities that you've added to the length and width. You can't have it all: you can't have cached values from mutable attributes without some overhead.
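The immutable route might look roughly like this in Python 3.8+, using a frozen dataclass with functools.cached_property. This is one way to do it under my own assumptions, not the only one.

```python
from dataclasses import dataclass, replace
from functools import cached_property


@dataclass(frozen=True)
class Rectangle:
    length: float = 1
    width: float = 1

    @cached_property
    def area(self):
        # calculated once per instance; safe because the inputs cannot change
        return self.length * self.width


r1 = Rectangle(3, 2)
r2 = replace(r1, width=5)   # "changing" a parameter creates a new instance

try:
    r1.width = 10           # frozen: raises FrozenInstanceError
    mutated = True
except AttributeError:      # FrozenInstanceError subclasses AttributeError
    mutated = False
```

Since no instance ever changes, there is no cache to invalidate.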
PS: "is True" is redundant in "if self.calculated is True:".
I have started learning python classes some time ago, and there is something that I do not understand when it comes to usage of self.variables inside of a class. I googled, but couldn't find the answer. I am not a programmer, just a python hobbyist.
Here is an example of a simple class, with two ways of defining it:
1) First way:
class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def firstMethod(self):
        self.d = self.a + 1
        self.e = self.b + 2

    def secondMethod(self):
        self.f = self.c + 3

    def addMethod(self):
        return self.d + self.e + self.f

myclass = Testclass(10,20,30)
myclass.firstMethod()
myclass.secondMethod()
addition = myclass.addMethod()
2) Second way:
class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def firstMethod(self):
        d = self.a + 1
        e = self.b + 2
        return d, e

    def secondMethod(self):
        f = self.c + 3
        return f

    def addMethod(self, d, e, f):
        return d + e + f

myclass = Testclass(10,20,30)
d, e = myclass.firstMethod()
f = myclass.secondMethod()
addition = myclass.addMethod(d, e, f)
What confuses me is which of these two is valid?
Is it better to always define the variables inside the methods (the variables we expect to use later) as self.variables (which would make them global inside of class) and then just call them inside some other method of that class (that would be the 1st way in upper code)?
Or is it better not to define variables inside methods as self.variables, but simply as regular variables, then return at the end of the method. And then "reimport" them back into some other method as its arguments (that would be 2nd way in upper code)?
EDIT: just to make it clear, I do not want to define the self.d, self.e, self.f or d, e, f variables in the __init__ method. I want to define them in some other methods, as shown in the code above.
Sorry for not mentioning that.
Both are valid approaches. Which one is right completely depends on the situation.
E.g.
Where you are 'really' getting the values of a, b, c from
Do you want/need to use them multiple times
Do you want/need to use them within other methods of the class
What does the class represent
Are a b and c really 'fixed' attributes of the class, or do they depend on external factors?
In the example you give in the comment below:
Let's say that a,b,c depend on some outer variables (for example a = d+10, b = e+20, c = f+30, where d,e,f are supplied when instantiating a class: myclass = Testclass("hello",d,e,f)). Yes, let's say I want to use a,b,c (or self.a,self.b,self.c) variables within other methods of the class too.
So in that case, the 'right' approach depends mainly on whether you expect a, b, c to change during the life of the class instance. For example, if you have a class where the attributes (a, b, c) will never or rarely change, but you use the derived attributes (d, e, f) heavily, then it makes sense to calculate them once and store them. Here's an example:
class Tiger(object):
    def __init__(self, num_stripes):
        self.num_stripes = num_stripes
        # derived values are calculated once and stored
        self.num_black_stripes = self.get_black_stripes()
        self.num_orange_stripes = self.get_orange_stripes()

    def get_black_stripes(self):
        return self.num_stripes / 2

    def get_orange_stripes(self):
        return self.num_stripes / 2

big_tiger = Tiger(num_stripes=200)
little_tiger = Tiger(num_stripes=30)
# Now we can do logic without having to keep re-calculating values
if big_tiger.num_black_stripes > little_tiger.num_orange_stripes:
    print("Big tiger has more black stripes than little tiger has orange")
This works well because each individual tiger has a fixed number of stripes. If we change the example to use a class for which instances will change often, then our approach changes too:
class BankAccount(object):
    def __init__(self, customer_name, balance):
        self.customer_name = customer_name
        self.balance = balance

    def get_interest(self):
        # calculated on demand, because balance changes often
        return self.balance / 100

my_savings = BankAccount("Tom", 500)
print("I would get %d interest now" % my_savings.get_interest())
# Deposit some money
my_savings.balance += 100
print("I added more money, my interest changed to %d" % my_savings.get_interest())
So in this (somewhat contrived) example, a bank account balance changes frequently - therefore there is no value in storing interest in a self.interest variable - every time balance changes, the interest amount will change too. Therefore it makes sense to calculate it every time we need to use it.
There are a number of more complex approaches you can take to get some benefit from both of these. For example, you can make your program 'know' that interest is linked to balance and then it will temporarily remember the interest value until the balance changes (this is a form of caching - we use more memory but save some CPU/computation).
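A minimal sketch of that caching idea, written for Python 3. Turning balance into a property and the _interest attribute are my additions for illustration.

```python
class BankAccount:
    def __init__(self, customer_name, balance):
        self.customer_name = customer_name
        self._balance = balance
        self._interest = None  # remembered value; None means "stale"

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        self._balance = value
        self._interest = None  # balance changed, so forget the old interest

    @property
    def interest(self):
        if self._interest is None:
            self._interest = self._balance / 100
        return self._interest


my_savings = BankAccount("Tom", 500)
before = my_savings.interest   # calculated and remembered
my_savings.balance += 100      # goes through the setter, clears the cache
after = my_savings.interest    # recalculated for the new balance
```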
Unrelated to original question
A note about how you declare your classes. If you're using Python 2, it's good practice to make your own classes inherit from python's built in object class:
class Testclass(object):
def __init__(self, printHello):
Ref NewClassVsClassicClass - Python Wiki:
Python 3 uses these new-style classes by default, so you don't need to explicitly inherit from object if using py3.
EDITED:
If you want to preserve the values inside the object after performing addMethod, for example if you want to call addMethod again, then use the first way. If you just want to use some internal values of the class to perform the addMethod, use the second way.
You really can't draw any conclusions on this sort of question in the absence of a concrete and meaningful example, because it's going to depend on the facts and circumstances of what you're trying to do.
That being said, in your first example, firstMethod() and secondMethod() are just superfluous. They serve no purpose at all other than to compute values that addMethod() uses. Worse, to make addMethod() function, the user has to first make two inexplicable and apparently unrelated calls to firstMethod() and secondMethod(), which is unquestionably bad design. If those two methods actually did something meaningful it might make sense (but probably doesn't) but in the absence of a real example it's just bad.
You could replace the first example by:
class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def addMethod(self):
        return self.a + self.b + self.c + 6

myclass = Testclass(10,20,30)
addition = myclass.addMethod()
The second example is similar, except firstMethod() and secondMethod() actually do something, since they return values. If there was some reason you'd want these values separately for some reason other than passing them to addMethod(), then again, it might make sense. If there wasn't, then again you could define addMethod() as I just did, and dispense with those two additional functions altogether, and there wouldn't be any difference between the two examples.
But this is all very unsatisfactory in the absence of a concrete example. Right now all we can really say is that it's a slightly silly class.
In general, objects in the OOP sense are conglomerates of data (instance variables) and behavior (methods). If a method doesn't access instance variables - or doesn't need to - then it generally should be a standalone function, and not be in a class at all. Once in a while it will make sense to have a class or static method that doesn't access instance variables, but in general you should err towards preferring standalone functions.