When I write a class, I declare some variables within the __init__ method and others in the other methods. So I usually end up with something like this:
class Foo:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def foo1(self, c, d):
        sum = self.a + self.b + c + d
        self.foo2(sum)

    def foo2(self, sum):
        print("The sum is ", sum)
I find this way a bit messy because it gets difficult to keep track of all the variables. By contrast, managing variables becomes easier when they are all declared within the __init__ method. So, instead of the previous form, we would have:
class Foo:
    def __init__(self, a, b, c, d, sum):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self.sum = sum

    def foo1(self):
        self.sum = self.a + self.b + self.c + self.d
        self.foo2()

    def foo2(self):
        print("The sum is ", self.sum)
Which one would you choose and why? Do you think declaring all the variables of all functions of the class in the __init__ method would be a better practice?
Your sum is a good candidate for a computed value, i.e. a method that acts like an attribute rather than a method. This can be done with @property:
class Foo(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    @property
    def sum(self):
        # Recomputed on every access, so it always reflects a and b.
        return self.a + self.b

f = Foo(1, 2)
print('The sum is %d' % f.sum)
http://docs.python.org/2/library/functions.html#property
There are at least three aspects to this:
From a design standpoint, your constructor should only define variables that are used in at least two methods of your class or that convey an essential characteristic of the thing you are trying to model.
From a performance point of view, you should use variables with the smallest scope possible, since that saves lookups.
Keeping the variables local keeps the cognitive load low.
Usually, to refer to an instance variable, the variable name must be preceded with self, as in,
class A:
    def __init__(self, x: int):
        self.x = x

    def print_x(self):
        print(self.x)
However, I noticed that if the instance variable is an object, this is not necessary. That is, I can do,
class B:
    pass

class A:
    def __init__(self, b: B):
        pass

    def print_b(self):
        print(b)

b = B()
a = A(b)
a.print_b()
and calling print_b from an A object will print the memory address of the B object, without raising an error.
Is this equivalent to explicitly declaring b to be an instance variable, via self.b = b in __init__ and referring to b as self.b thereafter? And if so, is this proper convention?
And if so, is this proper convention?
In both situations, the design may suffer from tight coupling. The general rule of thumb is to always depend on abstractions, not on concretions.
Only if you share the actual code (MCVE) and provide some context can we suggest a practical and better solution to your problem.
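For what it's worth, the second snippet in the question runs only because print_b finds the module-level name b at call time; nothing is stored on the instance. Below is a minimal sketch of the explicit alternative, where each A keeps its own B. The method name get_b is made up for illustration.

```python
class B:
    pass


class A:
    def __init__(self, b):
        # Store the argument on the instance so it belongs to this object.
        self.b = b

    def get_b(self):
        # Always returns this instance's own B, whatever the globals hold.
        return self.b


first, second = B(), B()
a1, a2 = A(first), A(second)
```

With this version, a1.get_b() and a2.get_b() return different objects, which the version relying on a global name cannot do.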
I'm working on an abstraction layer to a database, and I have a super class defined similar to this:
class Test():
    def __init__(self, object):
        self.obj = object

    @classmethod
    def find_object(cls, **kwargs):
        # Code to search for object to put in parameter using kwargs.
        return cls(found_object)
I then break down that superclass into subclasses that are more specific to the objects they represent.
class Test_B(Test):
    # Subclass defining more specific version of Test.
    pass
Now, each separate subclass of Test has predefined search criteria. For example, Test_B needs an object with a = 10, b = 30, c = "Pie".
Which would be more "Pythonic"? Using the find_object method from the super class:
testb = Test_B.find_object(a=10, b=30, c="Pie")
or to override the find_object method to expect a, b, and c as parameters:
@classmethod
def find_object(cls, a, b, c):
    return super().find_object(a=a, b=b, c=c)

testb = Test_B.find_object(10, 30, "Pie")
First one. "Explicit is better than implicit" - Zen of Python: line 2
Test.find_object isn't intended to be used directly, so I would name it
@classmethod
def _find_object(cls, **kwargs):
    ...
then have each child class call it to implement its own find_object:
@classmethod
def find_object(cls, a, b, c):
    return super()._find_object(a=a, b=b, c=c)
When using super, it's a good idea to preserve the signature of a method if overriding it, because you can never be certain for which class super will return a proxy.
skilsuper - you're right about
Explicit is better than implicit
However, that doesn't mean the first answer is better - you can still apply the same principle to the second solution: find_object(10, 30, "Pie") is implicit, but nothing is stopping you from using find_object(a=10, b=30, c="Pie") (you should use it).
The first solution is problematic, because you might forget an argument (for example, find_object(a=10, b=30)). In that case, the first solution will let it slide, but the second solution will issue a TypeError saying that you're missing an argument.
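To make the difference concrete, here is a small sketch under some assumptions: the real database lookup is replaced by a stand-in that just wraps the search criteria, and the class names Test and TestB mirror the question only loosely.

```python
class Test:
    def __init__(self, obj):
        self.obj = obj

    @classmethod
    def _find_object(cls, **kwargs):
        # Stand-in for the real lookup (an assumption): wrap the criteria.
        return cls(dict(kwargs))


class TestB(Test):
    @classmethod
    def find_object(cls, a, b, c):
        # Explicit parameters: a missing criterion fails immediately.
        return super()._find_object(a=a, b=b, c=c)


found = TestB.find_object(a=10, b=30, c="Pie")

try:
    TestB.find_object(a=10, b=30)  # c is missing
    error = None
except TypeError as exc:
    error = exc
```

The keyword-style call still works with the explicit signature, while the incomplete call raises TypeError instead of silently searching with fewer criteria.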
Suppose I have the following classes:
class base(object):
    def __init__(self, name):
        self.name = name
        self.last_x = 0.0

    def calc(self, x):
        return x

class A(base):
    def calc(self, x):
        return f_A(x)

class B(base):
    def calc(self, x):
        return f_B(x)
...
Each of the lettered classes is basically a wrapper for a corresponding lettered function f_A, f_B. The class instances include a state variable self.last_x, and the lettered functions are assumed to be state-dependent (i.e. a Markov-chain-type process).
What I would like to do is to define dependency chains between instances of these classes in order to try out different functional convolutions. For example, if we wanted to calculate a chain [a, b] on a numerical input value x we would have to do
a = A('firstnode')
b = B('secondnode')
res = b.calc(a.calc(x))
The goal is to do this with arbitrarily long chains, while also being able to access results from each intermediate calculation. I.e. if the chain is [a, b, c] I would like to make accessible results of [a] and [a, b] as well (which is why I included a name string for each node in my current implementation).
What would be the right way to setup my classes and data structures for this use case?
So far I have a fairly heavy-handed solution involving multiple dictionaries to keep track of things, but it feels inelegant and I think I might be missing out on something obvious.
Unfortunately you're improperly reusing names (thus hiding their previous values). E.g, after:
a = A('firstnode')
calling a.calc will try to call this instance (because the assignment rebinds the name a, which previously referred to a function) and fail. It would be best to use more sensible naming. If for some reason that's not practical, you need to bind the function names internally at class definition time:
class A(base):
    def calc(self, x, a=a):
        return a(x)
where the a=a does the trick, and so forth.
Having passed that hurdle, the second one is that you want the last result of each class to be saved, but, you don't save it. So, change the code to e.g
class A(base):
    def calc(self, x, a=a):
        self.last_result = a(x)
        return self.last_result
Once that is done, performing your desired operation on a list of class instances is the least of your problems. E.g
def doit(instances, x):
    curr = x
    for inst in instances:
        curr = inst.calc(curr)
    return curr
and after this
[inst.last_result for inst in instances]
will give you the intermediate results you're looking for.
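Putting those pieces together, a self-contained sketch might look like the following. It is only one possible arrangement: the Node class, the run_chain helper and the three lambda functions are invented for illustration and stand in for the lettered classes and functions from the question.

```python
class Node:
    """Wraps a function and remembers its most recent result."""

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.last_result = None

    def calc(self, x):
        self.last_result = self.func(x)
        return self.last_result


def run_chain(nodes, x):
    """Feed x through every node in order and return the final value."""
    current = x
    for node in nodes:
        current = node.calc(current)
    return current


chain = [
    Node("firstnode", lambda x: x + 1),
    Node("secondnode", lambda x: x * 2),
    Node("thirdnode", lambda x: x - 3),
]
final = run_chain(chain, 5)

# Intermediate results, keyed by node name.
intermediate = {node.name: node.last_result for node in chain}
```

Passing the function into the constructor avoids the name-clash problem entirely, and the name attribute gives a natural key for looking up the result of each partial chain.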
I have started learning python classes some time ago, and there is something that I do not understand when it comes to usage of self.variables inside of a class. I googled, but couldn't find the answer. I am not a programmer, just a python hobbyist.
Here is an example of a simple class, with two ways of defining it:
1)first way:
class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def firstMethod(self):
        self.d = self.a + 1
        self.e = self.b + 2

    def secondMethod(self):
        self.f = self.c + 3

    def addMethod(self):
        return self.d + self.e + self.f

myclass = Testclass(10, 20, 30)
myclass.firstMethod()
myclass.secondMethod()
addition = myclass.addMethod()
2)second way:
class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def firstMethod(self):
        d = self.a + 1
        e = self.b + 2
        return d, e

    def secondMethod(self):
        f = self.c + 3
        return f

    def addMethod(self, d, e, f):
        return d + e + f

myclass = Testclass(10, 20, 30)
d, e = myclass.firstMethod()
f = myclass.secondMethod()
addition = myclass.addMethod(d, e, f)
What confuses me is which of these two is valid?
Is it better to always define the variables inside the methods (the variables we expect to use later) as self.variables (which would make them accessible everywhere inside the class) and then just use them inside some other method of that class (that would be the 1st way in the code above)?
Or is it better not to define variables inside methods as self.variables, but simply as regular variables, return them at the end of the method, and then pass them back into some other method as its arguments (that would be the 2nd way in the code above)?
EDIT: just to make it clear, I do not want to define the self.d, self.e, self.f or d, e, f variables under the __init__ method. I want to define them in some other methods, as shown in the code above.
Sorry for not mentioning that.
Both are valid approaches. Which one is right completely depends on the situation.
E.g.
Where are you 'really' getting the values of a, b, c from?
Do you want/need to use them multiple times?
Do you want/need to use them within other methods of the class?
What does the class represent?
Are a, b and c really 'fixed' attributes of the class, or do they depend on external factors?
In the example you give in the comment below:
Let's say that a,b,c depend on some outer variables (for example a = d+10, b = e+20, c = f+30, where d,e,f are supplied when instantiating a class: myclass = Testclass("hello",d,e,f)). Yes, let's say I want to use a,b,c (or self.a,self.b,self.c) variables within other methods of the class too.
So in that case, the 'right' approach depends mainly on whether you expect a, b, c to change during the life of the class instance. For example, if you have a class where the attributes (a, b, c) will never or rarely change, but you use the derived attributes (d, e, f) heavily, then it makes sense to calculate them once and store them. Here's an example:
class Tiger(object):
    def __init__(self, num_stripes):
        self.num_stripes = num_stripes
        self.num_black_stripes = self.get_black_stripes()
        self.num_orange_stripes = self.get_orange_stripes()

    def get_black_stripes(self):
        return self.num_stripes // 2

    def get_orange_stripes(self):
        return self.num_stripes // 2

big_tiger = Tiger(num_stripes=200)
little_tiger = Tiger(num_stripes=30)

# Now we can do logic without having to keep re-calculating values
if big_tiger.num_black_stripes > little_tiger.num_orange_stripes:
    print("Big tiger has more black stripes than little tiger has orange")
This works well because each individual tiger has a fixed number of stripes. If we change the example to use a class for which instances will change often, then our approach changes too:
class BankAccount(object):
    def __init__(self, customer_name, balance):
        self.customer_name = customer_name
        self.balance = balance

    def get_interest(self):
        return self.balance / 100

my_savings = BankAccount("Tom", 500)
print("I would get %d interest now" % my_savings.get_interest())

# Deposit some money
my_savings.balance += 100
print("I added more money, my interest changed to %d" % my_savings.get_interest())
So in this (somewhat contrived) example, a bank account balance changes frequently - therefore there is no value in storing interest in a self.interest variable - every time balance changes, the interest amount will change too. Therefore it makes sense to calculate it every time we need to use it.
There are a number of more complex approaches you can take to get some benefit from both of these. For example, you can make your program 'know' that interest is linked to balance and then it will temporarily remember the interest value until the balance changes (this is a form of caching - we use more memory but save some CPU/computation).
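One way such caching could look is sketched below. This is an illustration rather than a recommended design: the private names _balance and _interest, and the computations counter (there only to show when a recalculation happens), are assumptions added for the example.

```python
class BankAccount:
    def __init__(self, customer_name, balance):
        self.customer_name = customer_name
        self._balance = balance
        self._interest = None   # cached value; None means "not computed yet"
        self.computations = 0   # illustration only: counts recalculations

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        self._balance = value
        self._interest = None   # balance changed, so drop the cached value

    def get_interest(self):
        if self._interest is None:
            self.computations += 1
            self._interest = self._balance / 100
        return self._interest


account = BankAccount("Tom", 500)
account.get_interest()
account.get_interest()          # served from the cache
account.balance += 100          # goes through the setter, clearing the cache
latest = account.get_interest()
```

The interest is recalculated only after the balance has actually changed, at the cost of one extra attribute per account.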
Unrelated to original question
A note about how you declare your classes. If you're using Python 2, it's good practice to make your own classes inherit from Python's built-in object class:
class Testclass(object):
    def __init__(self, printHello):
Ref NewClassVsClassicClass - Python Wiki:
Python 3 uses these new-style classes by default, so you don't need to explicitly inherit from object if using py3.
EDITED:
If you want to preserve the values inside the object after performing addMethod, for example because you want to call addMethod again, then use the first way. If you just want to use some internal values of the class to perform addMethod, use the second way.
You really can't draw any conclusions on this sort of question in the absence of a concrete and meaningful example, because it's going to depend on the facts and circumstances of what you're trying to do.
That being said, in your first example, firstMethod() and secondMethod() are just superfluous. They serve no purpose at all other than to compute values that addMethod() uses. Worse, to make addMethod() function, the user has to first make two inexplicable and apparently unrelated calls to firstMethod() and secondMethod(), which is unquestionably bad design. If those two methods actually did something meaningful it might make sense (but probably doesn't) but in the absence of a real example it's just bad.
You could replace the first example by:
class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def addMethod(self):
        return self.a + self.b + self.c + 6

myclass = Testclass(10, 20, 30)
addition = myclass.addMethod()
The second example is similar, except firstMethod() and secondMethod() actually do something, since they return values. If there was some reason you'd want these values separately for some reason other than passing them to addMethod(), then again, it might make sense. If there wasn't, then again you could define addMethod() as I just did, and dispense with those two additional functions altogether, and there wouldn't be any difference between the two examples.
But this is all very unsatisfactory in the absence of a concrete example. Right now all we can really say is that it's a slightly silly class.
In general, objects in the OOP sense are conglomerates of data (instance variables) and behavior (methods). If a method doesn't access instance variables - or doesn't need to - then it generally should be a standalone function, and not be in a class at all. Once in a while it will make sense to have a class or static method that doesn't access instance variables, but in general you should err towards preferring standalone functions.
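As a rough illustration of that last point, assuming the +1/+2/+3 offsets from the question: the computation that needs no instance state becomes a plain function, and the class keeps only a thin method that supplies its own data. The names add_offsets and add_method are invented here.

```python
def add_offsets(a, b, c):
    """Standalone function: depends only on its arguments."""
    return (a + 1) + (b + 2) + (c + 3)


class Testclass:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def add_method(self):
        # A method is justified here because it reads instance variables.
        return add_offsets(self.a, self.b, self.c)


direct = add_offsets(10, 20, 30)
via_instance = Testclass(10, 20, 30).add_method()
```

Both calls give the same result; the function can also be tested and reused without creating an instance at all.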
Edit: There was some confusion, but I want to ask a general question about object oriented design in Python.
Consider a class that lets you map data values to counts or frequencies:
class DataMap(dict):
    pass
Now consider a subclass that allows you to construct a histogram from a list of data:
class Histogram(DataMap):
    def __init__(self, list_of_values):
        # 1. Put appropriate super(...) call here if necessary
        # 2. Build the map of values to counts in self
        pass
Now consider a class that lets you make a smoothed probability mass table rather than a Histogram.
class ProbabilityMass(DataMap):
    pass
What is the best way to allow a ProbabilityMass to be constructed from either a Histogram or a list of values?
I "grew up" programming in C++, and in this case I would use an overloaded constructor. In Python I've thought of doing this with:
The constructor takes multiple arguments (all but one of these should == None)
I define from_Histogram and from_list methods
In the second case (which I believe is better), what is the best way to allow the from_list method to use the shared code from the Histogram constructor? A ProbabilityMass table is nearly identical to a Histogram table, but it is scaled so that the sum of all values is 1.0.
If you have come across a similar problem, please share your expertise!
To start with, if you think you want @staticmethod, you almost always don't. Either the function is not part of the class, in which case it should just be a free function, or it is part of the class, but not tied to an instance, and it should be a @classmethod. Your named constructor is a good candidate for a @classmethod.
Also note that you should invoke A.__init__ from B via super(), otherwise multiple inheritance can bite you bad.
class A(object):  # new-style class, so super() also works on Python 2
    def __init__(self, data):
        self.values_to_counts = {}
        for val in data:
            if val in self.values_to_counts:
                self.values_to_counts[val] += 1
            else:
                self.values_to_counts[val] = 1

    @classmethod
    def from_values_to_counts(cls, values_to_counts):
        self = cls([])
        self.values_to_counts = values_to_counts
        return self

class B(A):
    # parameter has a default so the inherited cls([]) call still works
    def __init__(self, data, parameter=None):
        super(B, self).__init__(data)
        self.parameter = parameter

    def print_parameter(self):
        print(self.parameter)
In this case, you don't need a B.from_values_to_counts, it inherits from A, and it will return an instance of B, since that's how it was called.
If you need to do more complex initialization in B, you can, using super(), which looks very similar to the way it does when you use it with instances. After all, a classmethod really isn't anything more complex than an instance method whose im_self attribute is assigned to the class itself.
class A(object):  # new-style class, so super() also works on Python 2
    def __init__(self, data):
        self.values_to_counts = {}
        for val in data:
            if val in self.values_to_counts:
                self.values_to_counts[val] += 1
            else:
                self.values_to_counts[val] = 1

    @classmethod
    def from_values_to_counts(cls, values_to_counts):
        self = cls([])
        self.values_to_counts = values_to_counts
        return self

class B(A):
    # parameter has a default so the inherited cls([]) call still works
    def __init__(self, data, parameter=None):
        super(B, self).__init__(data)
        self.parameter = parameter

    def print_parameter(self):
        print(self.parameter)

    @classmethod
    def from_values_to_counts(cls, values_to_counts):
        self = super(B, cls).from_values_to_counts(values_to_counts)
        do_more_initialization(self)
        return self
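Applied to the classes from the question, the named-constructor idea might be sketched as follows. This is written for Python 3 and makes a few assumptions: Histogram counts with collections.Counter, the constructor names from_histogram and from_list are taken from the question's wording, and the "smoothing" is reduced to plain normalization so that the values sum to 1.0.

```python
from collections import Counter


class DataMap(dict):
    pass


class Histogram(DataMap):
    def __init__(self, list_of_values=()):
        # Build the map of values to counts in self.
        super().__init__(Counter(list_of_values))


class ProbabilityMass(DataMap):
    @classmethod
    def from_histogram(cls, histogram):
        # Scale every count so that the values sum to 1.0.
        total = sum(histogram.values())
        return cls((value, count / total) for value, count in histogram.items())

    @classmethod
    def from_list(cls, list_of_values):
        # Reuse the counting logic that lives in Histogram.
        return cls.from_histogram(Histogram(list_of_values))


pmf = ProbabilityMass.from_list(["a", "b", "a", "a"])
```

Because both constructors go through cls, a subclass of ProbabilityMass would get instances of itself back without overriding either method.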