I'm using the property and setter decorators in the following way:
import numpy as np

class PCAModel(object):
    def __init__(self):
        self.M_inv = None

    @property
    def M_inv(self):
        return self.__M_inv

    @M_inv.setter
    def set_M_inv(self):
        M = self.var * np.eye(self.W.shape[1]) + np.matmul(self.W.T, self.W)
        self.__M_inv = np.linalg.inv(M)
This generates an error in the __init__ function because my setter is not taking an argument:
TypeError: M_inv() takes 1 positional argument but 2 were given
I don't want to set the M_inv with an argument, since the calculations of M_inv rely solely on other properties of the class object. I could put a dummy argument in the setter:
@M_inv.setter
def set_M_inv(self, foo):
    M = self.var * np.eye(self.W.shape[1]) + np.matmul(self.W.T, self.W)
    self.__M_inv = np.linalg.inv(M)
but that feels dirty. Is there a better way to get around this?
You are missing the point of setters and getters, although the names are pretty self-explanatory. If the value is calculated independently of whatever would be assigned (which is why you want to omit the argument in the setter), then a setter is not needed at all. Since all you want to do is calculate this value for each instance, just calculate and return it in the getter, so you get the correct, newly calculated value each time you access the attribute.
@property
def M_inv(self):
    M = self.var * np.eye(self.W.shape[1]) + np.matmul(self.W.T, self.W)
    return np.linalg.inv(M)
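To make the getter-only behaviour concrete, here is a minimal sketch that avoids NumPy by using a scalar stand-in for the matrix expression. The class name ScalarModel and its attributes var and w are hypothetical and chosen only for illustration; the point is that the value is recomputed on every access and that assignment is rejected because no setter exists.

```python
class ScalarModel:
    """Hypothetical scalar analogue of the PCA model above."""

    def __init__(self, var, w):
        self.var = var
        self.w = w  # list of floats standing in for the matrix W
        # Note: no `self.m_inv = None` here; the property has no setter.

    @property
    def m_inv(self):
        # Recomputed from the current state on every access.
        m = self.var + sum(x * x for x in self.w)
        return 1.0 / m


model = ScalarModel(var=1.0, w=[1.0, 1.0])
first = model.m_inv       # 1 / (1 + 2)
model.var = 2.0
second = model.m_inv      # 1 / (2 + 2), reflects the new var

try:
    model.m_inv = 5.0     # assignment is rejected: no setter defined
    assignment_failed = False
except AttributeError:
    assignment_failed = True
```

If the inverse is expensive to compute, caching it is a separate design decision; this sketch deliberately recomputes every time.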
Let's say I have the following classes:
import math

class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self.origin = origin
        self.termination = termination
        self.length = self.calculate_length()

    def calculate_length(self):
        return math.sqrt(
            (self.origin.x - self.termination.x) ** 2
            + (self.origin.y - self.termination.y) ** 2
        )

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
An object of the LineSegment class is composed of two objects of the Point class. Now, let's say I initialize an object as so:
this_origin = Point(x=0, y=0)
this_termination = Point(x=1, y=1)
this_line_segment = LineSegment(origin=this_origin, termination=this_termination)
Note: The initialization of the line segment automatically calculates its length. This is critical to other parts of the codebase, and cannot be changed. I can see its length like this:
print(this_line_segment.length) # This prints "1.4142135623730951" to the console.
Now, I need to mutate one parameter of this_line_segment's sub-objects:
this_line_segment.origin.x = 1
However, this_line_segment's length attribute does not update based on the new origin's x coordinate:
print(this_line_segment.length) # This still prints "1.4142135623730951" to the console.
What is the pythonic way to implement updating a class's attributes when one of the attributes they are dependent upon changes?
Option 1: Getter and Setter Methods
In other object-oriented programming languages, the behavior you desire, adding additional logic when accessing the value of an instance variable, is typically implemented by "getter" and "setter" methods on all instance variables in the object:
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self._origin = origin
        self._termination = termination

    # getter method for origin
    def get_origin(self):
        return self._origin

    # setter method for origin
    def set_origin(self, new_origin):
        self._origin = new_origin

    # getter method for termination
    def get_termination(self):
        return self._termination

    # setter method for termination
    def set_termination(self, new_termination):
        self._termination = new_termination

    def get_length(self):
        # Calls the getters here, rather than the instance vars, in case
        # getter logic is added in the future
        return math.sqrt(
            (self.get_origin().x - self.get_termination().x) ** 2
            + (self.get_origin().y - self.get_termination().y) ** 2
        )
This way, the length calculation is performed every time you call get_length(), and instead of this_line_segment.origin.x = 1, you do:
new_origin = this_line_segment.get_origin()
new_origin.x = 1
this_line_segment.set_origin(new_origin)
print(this_line_segment.get_length())
(Note that I use _ in front of variables to denote that they are private and should only be accessed via getters and setters. For example, the variable length should never be set by the user--only through the LineSegment class.)
However, explicit getters and setters are clearly a clunky way to manage variables in Python, where the lenient access protections make accessing them directly more transparent.
Option 2: The @property decorator
A more Pythonic way to add getting and setting logic is the @property decorator, as @progmatico points out in their comment, which calls decorated getter and setter methods when an instance variable is accessed. Since all we need to do is calculate the length whenever it is needed, we can leave the other instance variables public for now:
class LineSegment:
    def __init__(
        self,
        origin,
        termination,
    ):
        self.origin = origin
        self.termination = termination

    # getter method for length
    @property
    def length(self):
        return math.sqrt(
            (self.origin.x - self.termination.x) ** 2
            + (self.origin.y - self.termination.y) ** 2
        )
And usage:
this_line_segment = LineSegment(origin=Point(x=0, y=0),
                                termination=Point(x=1, y=1))
print(this_line_segment.length) # Prints 1.4142135623730951
this_line_segment.origin.x = 1
print(this_line_segment.length) # Prints 1.0
Tested in Python 3.7.7.
Note: We must do the length calculation in the length getter and not upon initialization of the LineSegment. We can't do it in setter methods for the origin and termination instance variables (and therefore in the initialization) because the Point object is mutable, and mutating it does not call LineSegment's setter method. Although we could do this in Option 1, it would lead to an antipattern: whenever instance variables depend on one another, the setter for each variable would have to recalculate every variable that depends on it.
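The claim that mutating a Point bypasses any setter on the segment can be checked with a small sketch. TrackedSegment and its setter_calls counter are hypothetical names introduced only for this demonstration; they are not part of the answer's code.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class TrackedSegment:
    """Hypothetical variant that counts how often its origin setter runs."""

    def __init__(self, origin):
        self.setter_calls = 0
        self.origin = origin  # goes through the setter below

    @property
    def origin(self):
        return self._origin

    @origin.setter
    def origin(self, value):
        self.setter_calls += 1
        self._origin = value


segment = TrackedSegment(Point(0, 0))
calls_after_init = segment.setter_calls      # setter ran once in __init__

segment.origin.x = 1                         # mutates the Point in place
calls_after_mutation = segment.setter_calls  # unchanged: setter not involved

segment.origin = Point(2, 2)                 # rebinding does call the setter
calls_after_rebind = segment.setter_calls
```

Only rebinding the attribute goes through the setter, which is why a length cached inside that setter would go stale after an in-place mutation.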
I was wondering whether every self. attribute has to be defined in __init__. For example, I have this code:
class Colour:
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._rgb = (self._red, self._green, self._blue)

    def luminosity(self):
        self._luminosity = 0.5 * ((max(self._red, self._green, self._blue))/255)+((min(self._red, self._green, self._blue))/255)
        return self._luminosity
Am I right to define self._luminosity in the function def luminosity(self), or should I define it in __init__?
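In this case you don't need to define it at all: the attribute is only set and then immediately returned, so you can return the calculated value directly from your method.
Additionally, you can simplify the calculation a little, though I am not sure it is really luminosity, as there are a variety of definitions that differ from yours. Note that the version below applies the factor 0.5 to both the maximum and the minimum; in your expression, operator precedence applies it to the maximum only: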
def luminosity(self):
    return 0.5 * (
        max(self._red, self._green, self._blue)
        + min(self._red, self._green, self._blue)
    ) / 255
If instead you were caching the value (which may make sense if you do a more complex calculation or call the luminosity method many times), it would make sense to set it in __init__() and check before calculating (effectively caching the last call).
As @laol suggests, you can also use @property to simplify its use.
And finally, you can take advantage of your combined RGB for the calculation:
class Colour():
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._luminosity = None

    @property
    def rgb(self):
        return (self._red, self._green, self._blue)

    @property
    def luminosity(self):
        if self._luminosity is None:
            self._luminosity = 0.5 * (max(self.rgb) + min(self.rgb)) / 255
        return self._luminosity
c = Colour(128,100,100)
print(c.luminosity)
0.44705882352941173
Extending this even further, setting new values for the color components can set the cached value back to None, triggering recalculation on the next access (rather than immediately, saving some calculation if many changes are made before the value is wanted), but this is left as an exercise for the reader.
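One possible shape for that cache invalidation is sketched below. It is only an illustration under assumptions: the public red property and its setter are names introduced here, and only the red component is shown (green and blue would follow the same pattern).

```python
class Colour:
    def __init__(self, r, g, b):
        self._red = r
        self._green = g
        self._blue = b
        self._luminosity = None  # cache; None means "needs recalculation"

    @property
    def red(self):
        return self._red

    @red.setter
    def red(self, value):
        self._red = value
        self._luminosity = None  # invalidate the cache, do not recompute yet

    @property
    def rgb(self):
        return (self._red, self._green, self._blue)

    @property
    def luminosity(self):
        if self._luminosity is None:
            self._luminosity = 0.5 * (max(self.rgb) + min(self.rgb)) / 255
        return self._luminosity


c = Colour(128, 100, 100)
before = c.luminosity   # computed and cached
c.red = 200             # cache reset, nothing recomputed yet
after = c.luminosity    # recomputed on demand
```

The recalculation is deferred until luminosity is next read, so several component changes in a row cost only one recomputation.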
I suggest defining it as a property:
@property
def luminosity(self):
    return 0.5 * ((max(self._red, self._green, self._blue))/255)+((min(self._red, self._green, self._blue))/255)
This way you can read it directly from any Colour instance c with
c.luminosity
No, instance variables do not need to be defined in __init__. Instance variables are completely dynamic and can be added any time either in a method or outside of the object (see note). However, if you don't define them, you have created an object access protocol that needs to be managed. Suppose another method is added:
def half_luminosity(self):
    return self._luminosity/2
It is an error to call it before luminosity. This code will raise AttributeError if it's called at the wrong time. You could assign self._luminosity = None in __init__ and check it
def half_luminosity(self):
    if self._luminosity is None:
        raise ValueError("Attempt to use luminosity before set")
    return self._luminosity/2
but that's not much different than
def half_luminosity(self):
    if not hasattr(self, '_luminosity'):
        raise ValueError("Attempt to use luminosity before set")
    return self._luminosity/2
If you have a class that is set up in more than one step, either way will do. PEP8 favors the first because it's easier for a future reader to see what's going on.
NOTE: Classes that use __slots__ or one of the getattr methods can change the rules as can C extensions.
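As a brief illustration of how __slots__ changes the rule, the sketch below contrasts an ordinary class with one that declares __slots__. The class names Open and Closed are hypothetical and exist only for this example.

```python
class Open:
    """Ordinary class: instances accept new attributes at any time."""


class Closed:
    """Class with __slots__: only the listed attribute names are allowed."""
    __slots__ = ("red",)


o = Open()
o.anything = 1            # fine, the attribute is created dynamically

c = Closed()
c.red = 255               # fine, "red" is a declared slot
try:
    c.green = 0           # not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True
```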
I've been reading about OOP and trying to grasp the concept of self and __init__, and I think I found an explanation that makes sense (to me at least). This is an article on building a linear regression estimator using OOP concepts.
Article Link
class MyLinearRegression:
    def __init__(self, fit_intercept=True):
        self.coef_ = None
        self.intercept_ = None
        self._fit_intercept = fit_intercept
The layman explanation is as follows:
At a high level, __init__ provides a recipe for how to build an instance of MyLinearRegression ...
Since an instance of MyLinearRegression can take on any name a user gives it,
we need a way to link the user’s instance name back to the class so we can accomplish certain tasks.
Think of self as a variable whose sole job is to learn the name of a particular instance
So I think this makes sense. What I don't get is why self is used again when defining new methods.
def predict(self, X):
    """
    Output model prediction.

    Arguments:
    X: 1D or 2D numpy array
    """
    # check if X is 1D or 2D array
    if len(X.shape) == 1:
        X = X.reshape(-1,1)
    return self.intercept_ + np.dot(X, self.coef_)
In this version, what is self referring to?
self (or, more generally, the first parameter of an instance method; the name self is only a convention) refers to the instance whose method has been called. In your example, the intercept_ attribute of that specific instance is accessed in the return statement.
Consider the following example:
class C:
    def m(self):
        print(self.a)

c1 = C()
c1.a = 1
c2 = C()
c2.a = 2
c1.m() # prints 1, value of "c1.a"
c2.m() # prints 2, value of "c2.a"
We have a class C and we instantiate two objects, c1 and c2. We assign a different value to attribute a of each instance and then call method m, which accesses attribute a of its instance and prints it.
When you create an instance of the class MyLinearRegression i.e.
linear_regression = MyLinearRegression(fit_intercept=True)
Your linear_regression object has been initialised with the following attributes:
linear_regression.coef_ = None
linear_regression.intercept_ = None
linear_regression._fit_intercept = fit_intercept
Notice here how the "self" in the class definition refers to the object instance that we created (i.e. linear_regression)
The instance method "predict" can be called as follows:
linear_regression.predict(X)
Here Python adds syntactic sugar, so under the hood the function call above is transformed as follows:
MyLinearRegression.predict(linear_regression, X)
This takes the instance "linear_regression" and inserts it in place of "self".
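The equivalence of the two call forms can be checked without NumPy. The following is a small sketch using a hypothetical Greeter class, introduced only for illustration:

```python
class Greeter:
    """Hypothetical class used only to illustrate the call equivalence."""

    def __init__(self, name):
        self.name = name

    def greet(self, other):
        return f"{self.name} greets {other}"


g = Greeter("Ada")
via_instance = g.greet("Bob")           # instance inserted automatically
via_class = Greeter.greet(g, "Bob")     # instance passed explicitly as self
```

Both calls run the same function with the same arguments, so they return the same string.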
Note: For additional reference you are able to see all of the attributes/methods for any object via the following:
print(dir(<insert_object_here>))
I hope this helped.
If you use self as the first parameter of a function defined in a class, the function expects to be called on an instance of that class. The functions in a class can be classified as class methods, instance methods and static methods.
class method: a method that can be called on both the class and an instance. It is usually used with variables that belong to the class rather than to an instance.
instance method: a method that is called on an instance of the class. It is usually used with variables that belong to the instance.
static method: a method that can be called on both the class and an instance. It is usually used with variables that belong neither to the class nor to the instance.
class X:
    x = 2

    def __init__(self):
        self.x = 1

    def instance_method(self):
        return self.x

    @classmethod
    def class_method(cls):
        return cls.x

print(X.instance_method()) # raises a TypeError
print(X().instance_method()) # does not raise a TypeError, prints 1
print(X.class_method()) # does not raise a TypeError, prints 2
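The example above does not include a static method. A possible extension is sketched here; the static_method name and its two-argument signature are assumptions made for illustration, and the TypeError is caught so the snippet runs from top to bottom.

```python
class X:
    x = 2  # class attribute

    def __init__(self):
        self.x = 1  # instance attribute shadows the class attribute

    def instance_method(self):
        return self.x

    @classmethod
    def class_method(cls):
        return cls.x

    @staticmethod
    def static_method(a, b):
        # Receives neither the instance nor the class.
        return a + b


from_class = X.static_method(1, 2)       # callable on the class
from_instance = X().static_method(1, 2)  # and on an instance

try:
    X.instance_method()   # no instance supplied for `self`
    raised = False
except TypeError:
    raised = True
```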
I think it may help to refer to what the python docs have to say about self in the random remarks of the page on classes:
Often, the first argument of a method is called self. This is nothing more than a convention: the name self has absolutely no special meaning to Python. Note, however, that by not following the convention your code may be less readable to other Python programmers, and it is also conceivable that a class browser program might be written that relies upon such a convention.
This is an important distinction to make because there's a difference depending on whether predict is in a class or not. Let's revisit an expanded version of your example:
class MyLinearRegression:
    def __init__(self, fit_intercept=True):
        self.coef_ = None
        self.intercept_ = None
        self._fit_intercept = fit_intercept

    def predict(self, X):
        """
        Output model prediction.

        Arguments:
        X: 1D or 2D numpy array
        """
        # check if X is 1D or 2D array
        if len(X.shape) == 1:
            X = X.reshape(-1,1)
        return self.intercept_ + np.dot(X, self.coef_)

mlr = MyLinearRegression()
mlr.predict(SomeXData)
When mlr.predict() is called, the mlr instance is passed in as the first parameter of the function predict. This is so that the predict function can refer to the instance it was called on. It's important to note that __init__ is not special with respect to self: every instance method receives, as its first parameter, a reference to the instance it was called on.
This is not the only approach. Consider this alternate example:
class MyLinearRegression:
    def __init__(self, fit_intercept=True):
        self.coef_ = None
        self.intercept_ = None
        self._fit_intercept = fit_intercept

# predict is now a module-level function, not a method of the class
def predict(self, X):
    """
    Output model prediction.

    Arguments:
    X: 1D or 2D numpy array
    """
    # check if X is 1D or 2D array
    if len(X.shape) == 1:
        X = X.reshape(-1,1)
    return self.intercept_ + np.dot(X, self.coef_)

mlr = MyLinearRegression()
predict(mlr, SomeXData)
The signature of predict hasn't changed, only the way the function is called. This is why self isn't special as a parameter name. We could pass any object to predict and it would still be called, although it would probably raise errors if the object lacks the expected attributes.
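A NumPy-free sketch of the same idea follows. The names describe, Line and stand_in are hypothetical, and the first parameter is deliberately called model rather than self to show that the name carries no special meaning.

```python
from types import SimpleNamespace


def describe(model):                 # first parameter is not named `self`
    return model.intercept_ + model.coef_ * 2


class Line:
    def __init__(self, intercept, coef):
        self.intercept_ = intercept
        self.coef_ = coef

    # Reuse the module-level function as a method of the class.
    describe = describe


line = Line(1.0, 3.0)
as_method = line.describe()          # instance is bound to `model`
as_function = describe(line)         # instance passed explicitly

# Any object with the right attributes works when called as a function.
stand_in = SimpleNamespace(intercept_=0.0, coef_=1.0)
from_stand_in = describe(stand_in)
```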
I know the first argument of a Python method will be an instance of the class, so we need to use "self" as the first argument in methods. But should we also prefix attributes (variables) inside the method with "self."?
My method works even if I don't put self in front of its variables:
class Test:
    def y(self, x):
        c = x + 3
        print(c)

t = Test()
t.y(2)
5
and
class Test:
    def y(self, x):
        self.c = x + 3
        print(self.c)

t = Test()
t.y(2)
5
Why would I need to write an attribute in a method as "self.a" instead of just "a"?
In which cases will the first example not work while the second will? I would like to see a situation that shows a real difference between the two, because right now they behave the same from my point of view.
The reason you write self.attribute_name in a method is to work with that instance's attribute, as opposed to an unrelated local variable. For example:
class Car:
    def __init__(self, size):
        self.size = size

    def can_accomodate(self, number_of_people):
        return self.size > number_of_people

    def change_size(self, new_size):
        self.size = new_size

    # works but bad practice
    def can_accomodate_v2(self, size, number_of_people):
        return size > number_of_people

c = Car(5)
print(c.can_accomodate(2))
print(c.can_accomodate_v2(4, 2))
In the above example, can_accomodate uses self.size, while can_accomodate_v2 takes size as a parameter, which is bad practice because it ignores the state stored on the instance. Both will work, but v2 should not be used. You can still pass arguments that are not related to the instance or class into a method, for example number_of_people in the can_accomodate function.
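To show a case where the two versions from the question really differ, here is a minimal sketch. The method names with_local, with_attribute and double_c are hypothetical; the difference appears as soon as a second method needs the value.

```python
class Test:
    def with_local(self, x):
        c = x + 3          # local variable: gone when the method returns
        return c

    def with_attribute(self, x):
        self.c = x + 3     # stored on the instance: survives the call
        return self.c

    def double_c(self):
        return self.c * 2  # needs the value saved by with_attribute


t = Test()
t.with_local(2)
has_c_after_local = hasattr(t, "c")       # nothing was stored on t

t.with_attribute(2)
has_c_after_attribute = hasattr(t, "c")   # now t.c exists
doubled = t.double_c()
```

Calling double_c after only with_local would raise AttributeError, because the local c was never attached to the instance.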
Hope this helps.
I have a class which represents a mathematical model. Within this class, I have methods to solve the model, methods to print the results, etc.
The model can be solved for different functions. I am wondering what is the best way to pass a method when I initialize a python object.
To give an example, suppose I had the class Foo and the method f1. My goal is that instead of being defined inside the class definition, f1 would be passed as a parameter.
So, I know I can do this:
class Foo:
    def __init__(self, x1):
        self.x1 = x1

    def f1(self):
        return self.x1

bar = Foo(10)
print(bar.f1())
# Result: 10
But is there a way to do this:
class Foo:
    def __init__(self, x1, f1):
        self.x1 = x1
        self.f1 = f1

def f_outside(self):
    return self.x1

bar = Foo(10, f_outside)
print(bar.f1())
# Result: 10
The last code example does not work. The error is: missing 1 required positional argument: 'self'
Yes, you can do that, because methods are just bound functions.
When you look up a method on an instance (such as bar.f1, note, no call yet) and Python finds a function by the name f1 on the class (not the instance itself), then the function is bound, resulting in a method object. Python uses the __get__ method for this; functions define that method, and calling it with the right arguments produces a method.
When you store a function on an instance however, that doesn't happen. It's already part of an instance, it doesn't need to be bound, right? So when you use self.f1 = f1, no binding takes place. Calling bar.f1() will then fail to pass in self so you get an error.
But nothing stops you from binding the function yourself:
class Foo:
    def __init__(self, x1, f1):
        self.x1 = x1
        self.f1 = f1.__get__(self)
Now bar.f1() works, because f1 has been bound to bar:
>>> class Foo:
...     def __init__(self, x1, f1):
...         self.x1 = x1
...         self.f1 = f1.__get__(self)
...
>>> def f_outside(self):
...     return self.x1
...
>>> bar = Foo(10, f_outside)
>>> bar.f1()
10
There are other ways of achieving the same thing; you could store a function created with lambda to explicitly pass in self:
self.f1 = lambda: f1(self)
or you could use a functools.partial() object to have it pass in self:
self.f1 = partial(f1, self)
or you could create a method object by using the type object for methods directly; there is a reference to the type via types.MethodType:
self.f1 = MethodType(f1, self)
but that last one is going to achieve the exact same thing as f1.__get__(self).
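The four approaches can be compared side by side in one runnable sketch. The attribute names via_get, via_lambda, via_partial and via_method_type are hypothetical labels chosen for this comparison; the imports are functools.partial and types.MethodType from the standard library.

```python
from functools import partial
from types import MethodType


def f_outside(self):
    return self.x1


class Foo:
    def __init__(self, x1, f1):
        self.x1 = x1
        self.via_get = f1.__get__(self)          # descriptor protocol
        self.via_lambda = lambda: f1(self)       # closure over self
        self.via_partial = partial(f1, self)     # pre-filled first argument
        self.via_method_type = MethodType(f1, self)


bar = Foo(10, f_outside)
results = [
    bar.via_get(),
    bar.via_lambda(),
    bar.via_partial(),
    bar.via_method_type(),
]
```

All four calls return the same value; only the first and last produce an actual bound method object, while the lambda and partial variants are ordinary callables that happen to remember the instance.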
If you want to dive into the deep end and learn more about binding, then you want to read the descriptor HOWTO.
You hopefully have better names in production code. Also, it'd probably make more sense to pass "x1" to the call, not to the initializer.
…
    def __call__(self):
        return self.f1(self.x1)
…
print(bar())
You can make your second thing work if you pass self to the callable as in self.f1(self) (again, design wise that'd be possibly on the smelly side).
For a better overall structure, look into the "template method" design pattern.
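As a rough idea of what the template method pattern could look like for this use case, consider the sketch below. The class names Model, IdentityModel and SquareModel and the solve method are assumptions made for illustration, not part of the question's code.

```python
class Model:
    """Hypothetical base class: fixes the workflow, defers one step."""

    def __init__(self, x1):
        self.x1 = x1

    def solve(self):
        # Template method: the overall algorithm lives here.
        return f"result={self.f1()}"

    def f1(self):
        # Hook that subclasses are expected to override.
        raise NotImplementedError


class IdentityModel(Model):
    def f1(self):
        return self.x1


class SquareModel(Model):
    def f1(self):
        return self.x1 ** 2


identity_result = IdentityModel(10).solve()
square_result = SquareModel(10).solve()
```

Here the varying function is supplied by subclassing instead of being passed to the initializer, so it is bound as a normal method and no manual binding is needed.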