I have a question about multiple class inheritance in Python. I think I have already implemented it correctly, but it is somehow not in line with my usual understanding of inheritance (the usage of super(), specifically), and I am not really sure whether this could lead to errors or to certain attributes not being updated, etc.
So, let me try to describe the basic problem clearly:
I have three classes Base, First and Second
Both First and Second need to inherit from Base
Second also inherits from First
Base comes from an external module and has certain base methods needed for First and Second to function correctly
First is a base class for Second; it contains methods that I would otherwise have to write out repeatedly in Second
Second is the actual class that I use. It implements additional methods and attributes. Second's design may vary a lot, so I want to be able to change it flexibly without having all the code from First duplicated in Second.
The most important point about Second is the following: as visible below, in Second's init I first want to inherit from Base and perform some operations that require methods from Base. Then, after that, I would like to launch the operations in the init of First, which manipulate some of the parameters instantiated in Second. For that, I inherit from First at the end of Second's init body.
You can see how the variable a is manipulated throughout the initialization of Second. The current behavior is as I wish, but the structure of my code looks somewhat weird, which is why I am asking.
Why the hell do I want to do this? Think of the First class having many methods and also performing many operations (on parameters from Second) in its init body. I don't want to have all these methods in the body of Second and all these operations in the init of Second (here, the parameter a). First is a class that will rarely change, so it is better for clarity and compactness to move it into another file, at least in my opinion ^^. Also, due to the sequence of calls in Second's init, I did not find another way to realize it.
Now the code:
class Base():
    def __init__(self):
        pass

    def base_method1(self):
        print('Base Method 1')

    def base_method2(self):
        pass
    # ...
class First(Base):
    def __init__(self):
        super().__init__()
        print('Add in Init')
        self.first_method1()

    def first_method1(self):
        self.a += 1.5

    def first_method2(self):
        pass
    # ...
class Second(First):
    def __init__(self, a):
        # set parameters
        self.a = a
        # inherit from Base class
        Base.__init__(self)
        # some operations that rely on Base-methods
        self.base_method1()
        print(self.a)
        # inherit from First and perform operations in First-init
        # that must follow AFTER the body of Second-init
        First.__init__(self)
        print(self.a)
        # checking whether Second has inherited the method(s) from First
        print('Add by calling method')
        self.first_method1()
        print(self.a)

sec = Second(0)
The output of the statement sec = Second(0) prints:
Base Method 1
0
Add in Init
1.5
Add by calling method
3.0
I hope it is more or less clear; if not, I am glad to clarify!
Thanks, I appreciate any comment!
Best, JZ
So - the basic problem here is that you are trying something for which multiple inheritance is not a solution on its own; however, there are ways to structure your code so that it works.
When using multiple inheritance properly in Python, each defined method should call super().method() at most once, and you should not call specific versions of a method by hardcoding an ancestor, as you do with Base.__init__() in the Second class. Just for a start, with this design as is, Base.__init__() will run twice each time you instantiate Second.
The main problem in your assumptions lies in
I firstly want to inherit from Base and perform some operations that require methods from Base. Then, after that, I would like to launch the operations in the init of First, which manipulate some of the parameters that are instantiated in Second. For that, I inherit from First at the end of Second's init-body.
So - if you call super(), as you will have perceived, the method in First runs before the method in Base, and you cannot change that by writing Second as class Second(Base, First): Python would reject that class definition outright, because it cannot build a consistent ordering. This happens because Python linearizes all ancestor classes when there is multiple inheritance, so that there is always a predictable, deterministic order, in which each ancestor class shows up exactly once, used to find attributes and call methods (the "MRO", or Method Resolution Order). The only possible order with your arrangement is Second -> First -> Base.
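The linearization is easy to inspect via the `__mro__` attribute. A standalone sketch mirroring the hierarchy in the question:

```python
# Sketch: the same three-level hierarchy as in the question, bodies omitted.
class Base:
    pass

class First(Base):
    pass

class Second(First):
    pass

# Python's C3 linearization gives one fixed, deterministic order:
print([cls.__name__ for cls in Second.__mro__])
# -> ['Second', 'First', 'Base', 'object']

# Trying to force Base ahead of First, e.g. class Second2(Base, First),
# raises a TypeError because no consistent MRO exists.
```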
Now, if you really want to perform initialization of certain parts in Base prior to running the initialization in First, that is where "more than multiple inheritance" comes into play: you have to define slots in your class. That is a design decision you make yourself, not something ready-made in the language: agree on a convention of methods that Base.__init__ will call at various stages of the initialization, so that they are activated at the proper time.
This way you could even skip having a First.__init__ method entirely: just have some methods you know Base.__init__ will call on the child, and override those as you need.
The language itself uses this strategy when it offers both a __new__ and an __init__ method: each runs at a different stage of the initialization of a new instance.
Note that by doing it this way, you don't even need multiple inheritance for this case:
class Base():
    def __init__(self):
        # initial step
        self.base_method1()
        ...
        # first initialization step:
        self.init_1()
        # intermediate common initialization
        ...
        # second initialization step
        self.init_2()
        ...

    def init_1(self):
        pass

    def init_2(self):
        pass

    def base_method1(self):
        pass
    # ...
class First(Base):
    # __init__ actually unneeded here:
    # def __init__(self):
    #     super().__init__()

    def init_1(self):
        print('Add in Init')
        self.first_method1()
        return super().init_1()

    # init_2 not used

    def first_method1(self):
        self.a += 1.5

    def first_method2(self):
        pass
    # ...
class Second(First):
    def __init__(self, a):
        # set parameters
        self.a = a
        # some operations that rely on Base-methods
        super().__init__()  # will call Base.__init__, which should call "base_method1"
        # if you need access to self.a _before_ it is changed by First, you implement
        # the "init_1" slot in this "Second" class
        # At this point, the code in First that updates the attribute has already run
        print(self.a)

    def init_1(self):
        # this part of the code will be called by Base.__init__
        # _before_ First.init_1 is executed:
        print("Initial value of self.a", self.a)
        # delegate to the "init_1" stage on First:
        return super().init_1()

sec = Second(0)
I'm sorry for the rather vague formulation of the question. I'll try to clarify what I mean through a simple example below. I have a class I want to use as a base for other classes:
class parent:
    def __init__(self, constant):
        self.constant = constant

    def addconstant(self, number):
        return self.constant + number
The self.constant parameter is paramount for the usability of the class, as the addconstant method depends on it. Therefore the __init__ method takes care of forcing the user to define the value of this parameter. So far so good.
Now, I define a child class like this:
class child(parent):
    def __init__(self):
        pass
Nothing stops me from creating an instance of this class, but when I try to use the addconstant, it will obviously crash.
b = child()
b.addconstant(5)
AttributeError: 'child' object has no attribute 'constant'
Why does Python allow this!? I feel that Python is "too flexible" and somehow breaks the point of OOP and encapsulation. If I want to extend a class to take advantage of inheritance, I must be careful and know certain details of the implementation of the class. In this case, I have to know that forcing the user to set the parameter constant is fundamental to not breaking the usability of the class. Doesn't this somehow break the encapsulation principle?
Because Python is a dynamic language. It doesn't know what attributes are on a parent instance until parent's initializer puts them there. In other words, the attributes of parent are determined when you instantiate parent and not an instant before.
In a language like Java, the attributes of parent would be established at compile time. But Python class definitions are executed, not compiled.
In practice, this isn't really a problem. Yes, if you forget to call the parent class's initializer, you will have trouble. But calling the parent class's initializer is something you pretty much always do, in any language, because you want the parent class's behavior. So, don't forget to do that.
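For instance, a minimal sketch using the question's parent/child pair (the value 10 passed to the parent initializer is just an arbitrary choice for this example):

```python
class parent:
    def __init__(self, constant):
        self.constant = constant

    def addconstant(self, number):
        return self.constant + number

class child(parent):
    def __init__(self):
        # delegate to the parent initializer so that self.constant exists
        super().__init__(10)

b = child()
print(b.addconstant(5))  # -> 15
```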
A sometimes-useful technique is to define a reasonable default on the class itself.
class parent:
    constant = 0

    def __init__(self, constant):
        self.constant = constant

    def addconstant(self, number):
        return self.constant + number
Python falls back to accessing the class's attribute if you haven't defined it on an instance. So, this would provide a fallback value in case you forget to do that.
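A quick sketch of that fallback in action, reusing the question's child class that forgets to call the parent initializer:

```python
class parent:
    constant = 0  # class-level default, used when no instance attribute exists

    def __init__(self, constant):
        self.constant = constant

    def addconstant(self, number):
        return self.constant + number

class child(parent):
    def __init__(self):
        pass  # never sets self.constant

b = child()
print(b.addconstant(5))  # -> 5, via the class attribute fallback
```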
I just can't see why we need to use @staticmethod. Let's start with an example.
class test1:
    def __init__(self, value):
        self.value = value

    @staticmethod
    def static_add_one(value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

a = test1(3)
print(a.new_val)  ## >>> 4
class test2:
    def __init__(self, value):
        self.value = value

    def static_add_one(self, value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

b = test2(3)
print(b.new_val)  ## >>> 4
In the example above, the method static_add_one in the two classes does not require the instance of the class (self) in its calculation.
The method static_add_one in the class test1 is decorated with @staticmethod and works properly.
But at the same time, the method static_add_one in the class test2, which has no @staticmethod decoration, also works properly, by using the trick of accepting a self argument that it never uses.
So what is the benefit of using @staticmethod? Does it improve performance? Or is it just due to the Zen of Python, which states that "explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one --- you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
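A sketch of that behavioral difference under Python 3, using trimmed-down versions of the question's classes:

```python
class test1:
    @staticmethod
    def static_add_one(value):
        return value + 1

class test2:
    def static_add_one(self, value):
        return value + 1

# The staticmethod works with or without an instance:
print(test1.static_add_one(1))    # -> 2
print(test1().static_add_one(1))  # -> 2

# The regular method misbinds when called on the class: 1 is taken
# as self, and the value parameter is then missing.
try:
    test2.static_add_one(1)
except TypeError as exc:
    print('TypeError:', exc)
```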
Today I suddenly found a benefit of using @staticmethod.
If you create a staticmethod within a class, you don't need to create an instance of the class before using it.
For example,
class File1:
    def __init__(self, path):
        out = self.parse(path)

    def parse(self, path):
        ..parsing works..
        return x

class File2:
    def __init__(self, path):
        out = self.parse(path)

    @staticmethod
    def parse(path):
        ..parsing works..
        return x

if __name__ == '__main__':
    path = 'abc.txt'
    File1.parse(path)  # TypeError: unbound method parse() ....
    File2.parse(path)  # Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, sometimes this parse method may also be useful elsewhere. If you want to use it via File1, you must create an instance of File1 before calling parse. With staticmethod in the class File2, you can call the method directly with the syntax File2.parse.
This makes your work more convenient and natural.
I will add something the other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at another point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type-based dispatching). So if you put that function outside the class, you will be precluding subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C; you have three options:
Put it outside the class. But we just decided against this.
Do nothing new: while unused, still keep the self parameter.
Declare that you are not using the self parameter, while still letting other C methods call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because it's the more general case). But you may still prefer to be explicit about self not being used (and the interpreter could even reward you with some optimization, not having to partially apply the function to self). In that case, you pick option 3 and add @staticmethod on top of your function.
Use @staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
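A sketch of that pattern (the model class and the slug rules here are illustrative, not Django's actual implementation):

```python
import re

class Article:
    def __init__(self, title):
        self.title = title
        self.slug = self.make_slug(title)

    @staticmethod
    def make_slug(title):
        # lowercase, then collapse runs of non-alphanumerics into hyphens
        slug = re.sub(r'[^a-z0-9]+', '-', title.lower())
        return slug.strip('-')

# Usable with an instance or directly on the class:
print(Article('Hello, World!').slug)          # -> hello-world
print(Article.make_slug('A URL-Safe Title'))  # -> a-url-safe-title
```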
I have a program that must continuously create thousands of objects off of a class that has about 12–14 methods. Will the fact that they are of a complex class cause a performance hit over creating a simpler object like a list or dictionary, or even another object with fewer methods?
Some details about my situation:
I have a bunch of “text” objects that continuously create and refresh “prints” of their contents. The print objects have many methods but only a handful of attributes. The print objects can’t be contained within the text objects because the text objects need to be “reusable” and make multiple independent copies of their prints, so that rules out just swapping out the print objects’ attributes on refresh.
Am I better off,
Continuously creating the new print objects with all their methods as the application refreshes?
Unraveling the class and turning the print objects into simple structs and the methods into independent functions that take the objects as arguments?
This I assume would depend on whether or not there is a large cost associated with generating new objects with all the methods included in them, versus having to import all the independent functions to wherever they would have been called as object methods.
It doesn't matter how complex the class is; when you create an instance, you only store a reference to the class with the instance. All methods are accessed via this one reference.
No, it should not make a difference.
Consider that when you do the following:
a = Foo()
a.bar()
The call to the bar method is in fact translated under the covers to:
Foo.bar(a)
I.e. bar is "static" under the class definition, and there exists only one instance of the function. Looked at this way, it is clear that the number of methods will have no significant impact: the methods are created once, when the class definition is executed, not each time you create an object.
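That equivalence can be checked directly (minimal sketch):

```python
class Foo:
    def bar(self):
        return self.x * 2

a = Foo()
a.x = 21

# The bound call and the explicit class-level call hit the same function:
print(a.bar())     # -> 42
print(Foo.bar(a))  # -> 42
assert Foo.bar is Foo.__dict__['bar']  # one function object, shared by all instances
```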
I did some testing.
I have the following function:
import time

def call_1000000_times(f):
    start = time.time()
    for i in range(1000000):  # xrange in the original Python 2 code
        f(a=i, b=10000 - i)
    return time.time() - start
As you can see, this function takes another function, calls it 1000000 times, and returns how long that took, in seconds.
I also created two classes:
A small class:
class X(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
And a rather large one:
class Y(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def foo(self): pass
    def bar(self): pass
    def baz(self): pass

    def anothermethod(self):
        pass

    @classmethod
    def hey_a_class_method(cls, argument):
        pass

    def thisclassdoeswaytoomuch(self): pass
    def thisclassisbecomingbloated(self): pass
    def almostattheendoftheclass(self): pass
    def imgonnahaveacouplemore(self): pass
    def somanymethodssssss(self): pass
    def should_i_add_more(self): pass
    def yes_just_a_couple(self): pass
    def just_for_lolz(self): pass
    def coming_up_with_good_method_names_is_hard(self): pass
The results:
>>> call_1000000_times(dict)
0.2680389881134033
>>> call_1000000_times(X)
0.6771988868713379
>>> call_1000000_times(Y)
0.6260080337524414
As you can see, the difference between a large class and a small class is very small, with the large class even being faster in this case. I assume that if you ran this function multiple times with the same type, and averaged the numbers, they'd be even closer, but it's 3AM and I need sleep, so I'm not going to set that up right now.
On the other hand, just calling dict was about 2.5x faster, so if your bottleneck is instantiation, this might be a place to optimize things.
Be wary of premature optimization though. Classes, by holding data and code together, can make your code easier to understand and build upon (functional programming lovers, this is not the place for an argument). It might be a good idea to use the python profiler or other performance measuring tools to find out what parts of your code are slowing it down.
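A Python 3 variant of the timing harness above, using timeit so the loop and clock handling are done for you (absolute numbers will vary by machine):

```python
import timeit

class X:
    def __init__(self, a, b):
        self.a = a
        self.b = b

def time_constructions(f, n=100_000):
    # time n calls of f(a=..., b=...) and return the total seconds
    return timeit.timeit(lambda: f(a=1, b=2), number=n)

print('dict:', time_constructions(dict))
print('X:   ', time_constructions(X))
```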
What if anything is the important difference between the following uses of calling the superclass initiation function?
class Child_1(Parent):
    def __init__(self):
        super(Child, self).__init__()

class Child_2(Parent):
    def __init__(self):
        super(Parent, self).__init__()

class Child_3(Parent):
    def __init__(self):
        Parent.__init__(self)
The first form (though you'd fix the typo and make it Child_1 in the call to super) is what you'd generally want. It looks up the correct method in the inheritance hierarchy.
With the second form, you're looking for parent classes of Parent that implement this method, and you'd need a very special use case (if you want to skip a parent, don't derive from it) in order to want to do that.
The third in many cases winds up doing the same as the first, though without seeing the code for Parent it's hard to be sure. The advantage of the first form over the third is that you can change the base class of the child and the right method will still be called.
Also, the first form allows for cooperative multiple inheritance. See this post or this writeup to understand the cases where this would be useful or necessary.
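A minimal sketch of cooperative multiple inheritance, where the super() form lets each class in a diamond run exactly once:

```python
class Root:
    def __init__(self):
        self.log = ['Root']

class A(Root):
    def __init__(self):
        super().__init__()
        self.log.append('A')

class B(Root):
    def __init__(self):
        super().__init__()
        self.log.append('B')

class C(A, B):
    def __init__(self):
        # one super() call walks the whole MRO: Root, then B, then A
        super().__init__()
        self.log.append('C')

print(C().log)  # -> ['Root', 'B', 'A', 'C']
# Hardcoding A.__init__(self) and B.__init__(self) here instead
# would run Root.__init__ twice.
```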
I've got a unittest test file containing four test classes, each of which is responsible for running tests on one specific class. Each test class makes use of exactly the same set-up and teardown methods. The set-up method is relatively large, initiating about 20 different variables, while the teardown method simply resets these twenty variables to their initial state.
Up to now I have been putting the twenty variables in each of the four setUp methods. This works, but is not very easily maintained; if I decide to change one variable, I must change it in all four setUp methods. My search for a more elegant solution has failed, however. Ideally I'd just like to enter my twenty variables once, call them up in each of my four setUp methods, then tear them down after each of my test methods. With this end in mind I tried putting the variables in a separate module and importing this in each setUp, but of course the variables are then only available in the setUp method (plus, though I couldn't put my finger on the exact reasons, this felt like a potentially problem-prone way of doing it).
from unittest import TestCase

class Test_Books(TestCase):
    def setUp(self):
        # a quick and easy way of making my variables available at the class level
        # without typing them all in
        ...

    def test_method_1(self):
        # setup variables available here in their original state
        # ... mess about with the variables ...
        # reset variables to original state
        ...

    def test_method_2(self):
        # setup variables available here in their original state
        # etc...
        ...

    def tearDown(self):
        # reset variables to original state without having to type them all in
        ...

class Books():
    def method_1(self):
        pass

    def method_2(self):
        pass
An alternative is to put the twenty variables into a separate class, set the values in that class's __init__, and then access the data as attributes of an instance of it; that way the only place to set the variables is the __init__, and the code is not duplicated.
class Data:
    def __init__(self):
        self.x = ...
        self.y = ...

class Test_Books(TestCase):
    def setUp(self):
        self.data = Data()

    def test_method_1(self):
        value = self.data.x  # get the data from the variable
This solution makes more sense if the twenty pieces of data are related to each other. Also if you have twenty pieces of data I would expect them to be related and so they should be combined in the real code not just in test.
What I would do is make the four test classes each a subclass of one base test class, which is itself a subclass of TestCase. Then put setUp and tearDown in the base class and the rest in the subclasses.
e.g.
class AbstractBookTest(TestCase):
    def setUp(self):
        ...

class Test_Book1(AbstractBookTest):
    def test_method_1(self):
        ...
An alternative is just to make one class instead of the four you have, which seems a bit more logical here unless you have a reason for the split.