I'm learning Python and I've found something about how Python constructs a sub class which confuses me.
I have a class that inherits from the list class as follows.
class foo(list):
    def __init__(self, a_bar):
        list.__init__([])
        self.bar = a_bar
I know that list.__init__([]) needs to be there but I'm confused about it. It seems to me that this line will just create a new list object and then assign it to nothing, so I would suspect that it would just get garbage collected. How does Python know that this list is part of my object? I suspect that there is something happening behind the scenes and I'd like to know what it is.
The multiple-inheritance-safe way of doing it is:
class foo(list):
    def __init__(self, a_bar):
        super(foo, self).__init__()
        ...
which, perhaps, makes it clearer that you're calling the baseclass ctor.
You usually do this when subclassing and overriding the __init__() function:
list.__init__(self)
If you're using Python 3, you can make use of super():
super().__init__()
The actual object is not created with __init__ but with __new__. __init__ is not for creating the object itself but for initializing it --- that is, adding attributes, etc. By the time __init__ is called, __new__ has already been called, so in your example the list was already created before your code even runs. __init__ shouldn't return anything because it's supposed to initialize the object "in-place" (by mutating it), so it works by side-effects. (See a previous question and the documentation.)
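To see that ordering for yourself, here is a minimal sketch. The class name Traced and the events list are made up for illustration; they are not from the question.

```python
# Hypothetical example: record the order in which __new__ and __init__ run.
events = []

class Traced(list):
    def __new__(cls, *args, **kwargs):
        events.append("__new__")
        # list.__new__ creates the (still empty) list object itself
        return super().__new__(cls)

    def __init__(self, items=()):
        # By the time we get here, self already exists and is a list
        events.append(f"__init__ on a list: {isinstance(self, list)}")
        super().__init__(items)  # fills the existing object in place

t = Traced([1, 2, 3])
print(events)
print(t)
```

The object is already a list when __init__ starts; the base-class initializer only fills it.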
You're partly right:
list.__init__([])
"creates a new list object." But this code is wrong. The correct code should be:
list.__init__(self)
The reason you need it to be there is that you're inheriting from list, which has its own __init__() method where it (presumably) does important work to initialize itself. When you define your own __init__() method, you're effectively overriding the inherited method of the same name. In order to make sure that the parent class's __init__() code is executed as well, you need to call that parent class's __init__().
There are several ways of doing this:
#explicitly calling the __init__() of a specific class
#"list"--in this case
list.__init__(self, *args, **kwargs)
#a little more flexible. If you change the parent class, this doesn't need to change
super(foo, self).__init__(*args, **kwargs)
For more on super() see this question, for guidance on the pitfalls of super, see this article.
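A small sketch of the difference, assuming you also want to pass initial items to the list. The class names Wrong and Right and the items parameter are hypothetical additions to the question's example.

```python
class Wrong(list):
    def __init__(self, items, a_bar):
        # Initializes a throwaway list, not self, so the items are lost
        list.__init__([], items)
        self.bar = a_bar

class Right(list):
    def __init__(self, items, a_bar):
        # Initializes *this* object with the given items
        super().__init__(items)
        self.bar = a_bar

w = Wrong([1, 2], "x")
r = Right([1, 2], "x")
print(list(w), list(r))
```

The Wrong version still yields a usable (empty) list only because the list object itself was already created before __init__ ran.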
I'm currently learning tkinter from sentdex's tutorial and to me it seems that I'm writing to run __init__ in its own definition, what does a line like that mean? Is it tKinter's __init__ function?
class seaOfBTCapp(tk.Tk):
    def __init__(self, *args, **kwargs):
        tk.Tk.__init__(self, *args, **kwargs)
It's invoking another class's constructor on itself.
This is a fun quirk of python's object-oriented design. "Instance methods" are really just class methods that take the current instance as an implicit parameter. You can, in fact, call them as class methods and provide the object explicitly:
ex = [1, 2, 3, 4, 5]
# the following are equivalent:
ex.pop(0) # call the method on the instance, passing it implicitly
list.pop(ex, 0) # call the method on the class `list`, passing the instance explicitly
The same behavior is being invoked here. You're taking the __init__ method of the tk.Tk class, and passing self in as the "instance". This is an uncommon, but valid, way of accessing methods in the superclass that have been overridden in your subclass (for example, the constructor).
As in @Barmar's answer, a better solution is using super(), which produces something resembling an instance of the superclass, which you then call __init__ on to get the superclass's implementation of __init__(), passing self implicitly, as you would expect.
What that line does is call your parent class's __init__ method. That's what would have happened if you didn't define your own method, so if you're not doing anything else in your __init__, you should probably just skip it and let the inherited method run normally.
It's also probably better to call super().__init__(*args, **kwargs), rather than naming the parent class explicitly (and needing to pass self by hand). This is particularly the case if you might ever use this class in a situation involving multiple inheritance, where explicitly naming the next class to be called can get the MRO wrong. If you're just starting in programming, don't worry too much about this, multiple inheritance is a pretty advanced topic (though it's easier to get right in Python than in many other languages).
I think this is equivalent to the more modern:
class seaOfBTCapp(tk.Tk):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
I am a beginner in Python and using Lutz's book to understand OOPS in Python. This question might be basic, but I'd appreciate any help. I researched SO and found answers on "how", but not "why."
As I understand from the book, if Sub inherits Super then one need not call superclass' (Super's) __init__() method.
Example:
class Super:
    def __init__(self,name):
        self.name=name
        print("Name is:",name)
class Sub(Super):
    pass
a = Sub("Harry")
a.name
Above code does assign attribute name to the object a. It also prints the name as expected.
However, if I modify the code as:
class Super:
    def __init__(self,name):
        print("Inside Super __init__")
        self.name=name
        print("Name is:",name)
class Sub(Super):
    def __init__(self,name):
        Super(name) #Call __init__ directly
a = Sub("Harry")
a.name
The above code doesn't work fine. By fine, I mean that although Super.__init__() does get called (as seen from the print statements), there is no attribute attached to a. When I run a.name, I get an error, AttributeError: 'Sub' object has no attribute 'name'
I researched this topic on SO, and found the fix on Chain-calling parent constructors in python and Why aren't superclass __init__ methods automatically invoked?
These two threads talk about how to fix it, but they don't provide a reason for why.
Question: Why do I need to call Super's __init__ using Super.__init__(self, name) OR super(Sub, self).__init__(name) instead of a direct call Super(name)?
In Super.__init__(self, name) and Super(name), we see that Super's __init__() gets called, (as seen from print statements), but only in Super.__init__(self, name) we see that the attribute gets attached to Sub class.
Wouldn't Super(name) automatically pass self (child) object to Super? Now, you might ask that how do I know that self is automatically passed? If I modify Super(name) to Super(self,name), I get an error message that TypeError: __init__() takes 2 positional arguments but 3 were given. As I understand from the book, self is automatically passed. So, effectively, we end up passing self twice.
I don't know why Super(name) doesn't attach name attribute to Sub even though Super.__init__() is run. I'd appreciate any help.
For reference, here's the working version of the code based on my research from SO:
class Super:
    def __init__(self,name):
        print("Inside __init__")
        self.name=name
        print("Name is:",name)
class Sub(Super):
    def __init__(self,name):
        #Super.__init__(self, name) #One way to fix this
        super(Sub, self).__init__(name) #Another way to fix this
a = Sub("Harry")
a.name
PS: I am using Python-3.6.5 under Anaconda Distribution.
As I understand from the book, if Sub inherits Super then one need not call superclass' (Super's) __init__() method.
This is misleading. It's true that you aren't required to call the superclass's __init__ method—but if you don't, whatever it does in __init__ never happens. And for normal classes, all of that needs to be done. It is occasionally useful, usually when a class wasn't designed to be inherited from, like this:
import codecs

class Rot13Reader:
    def __init__(self, filename):
        self.file = open(filename)
    def close(self):
        self.file.close()
    def dostuff(self):
        # read the next line from the wrapped file and ROT13-encode it
        line = next(self.file)
        return codecs.encode(line, 'rot13')
Imagine that you want all the behavior of this class, but with a string rather than a file. The only way to do that is to skip the open:
import io

class LocalRot13Reader(Rot13Reader):
    def __init__(self, s):
        # don't call super().__init__, because we don't have a filename to open
        # instead, set up self.file with something else
        self.file = io.StringIO(s)
Here, we wanted to avoid the self.file assignment in the superclass. In your case—as with almost all classes you're ever going to write—you don't want to avoid the self.name assignment in the superclass. That's why, even though Python allows you to not call the superclass's __init__, you almost always call it.
Notice that there's nothing special about __init__ here. For example, we can override dostuff to call the base class's version and then do extra stuff:
def dostuff(self):
    result = super().dostuff()
    return result.upper()
… or we can override close and intentionally not call the base class:
def close(self):
    # do nothing, including no super, because we borrowed our file
    pass
The only difference is that good reasons to avoid calling the base class tend to be much more common in normal methods than in __init__.
Question: Why do I need to call Super's __init__ using Super.__init__(self, name) OR super(Sub, self).__init__(name) instead of a direct call Super(name)?
Because these do very different things.
Super(name) constructs a new Super instance, calls __init__(name) on it, and returns it to you. And you then ignore that value.
In particular, Super.__init__ does get called one time either way—but the self it gets called with is that new Super instance, that you're just going to throw away, in the Super(name) case, while it's your own self in the super(Sub, self).__init__(name) case.
So, in the first case, it sets the name attribute on some other object that gets thrown away, and nobody ever sets it on your object, which is why self.name later raises an AttributeError.
It might help you understand this if you add something to both class's __init__ methods to show which instance is involved:
class Super:
    def __init__(self,name):
        print(f"Inside Super __init__ for {self}")
        self.name=name
        print("Name is:",name)
class Sub(Super):
    def __init__(self,name):
        print(f"Inside Sub __init__ for {self}")
        # line you want to experiment with goes here.
If that last line is super().__init__(name), super(Sub, self).__init__(name), or Super.__init__(self, name), you will see something like this:
Inside Sub __init__ for <__main__.Sub object at 0x10f7a9e80>
Inside Super __init__ for <__main__.Sub object at 0x10f7a9e80>
Notice that it's the same object, the Sub at address 0x10f7a9e80, in both cases.
… but if that last line is Super(name):
Inside Sub __init__ for <__main__.Sub object at 0x10f7a9ea0>
Inside Super __init__ for <__main__.Super object at 0x10f7a9ec0>
Now we have two different objects, at different addresses 0x10f7a9ea0 and 0x10f7a9ec0, and with different types.
If you're curious about what the magic all looks like under the covers, Super(name) does something like this (oversimplifying a bit and skipping over some steps [1]):
_newobj = Super.__new__(Super)
if isinstance(_newobj, Super):
    Super.__init__(_newobj, name)
… while super(Sub, self).__init__(name) does something like this:
_basecls = magically_find_next_class_in_mro(Sub)
_basecls.__init__(self, name)
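The two sketches above can be turned into something runnable. This is only an approximation under stated assumptions: the real super() searches every remaining class in the MRO for the attribute rather than just taking the next class, and the names mro and next_cls are mine, not Python's.

```python
class Super:
    def __init__(self, name):
        self.name = name

class Sub(Super):
    def __init__(self, name):
        # Roughly what super(Sub, self).__init__(name) does:
        mro = type(self).__mro__
        next_cls = mro[mro.index(Sub) + 1]  # class after Sub in the MRO
        next_cls.__init__(self, name)

# Roughly what Super("Harry") does:
obj = Super.__new__(Super)
if isinstance(obj, Super):
    Super.__init__(obj, "Harry")

sub = Sub("Sally")
print(obj.name, sub.name)
```

In the first case a brand-new Super object gets the attribute; in the second, the existing Sub instance does.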
As a side note, if a book is telling you to use super(Sub, self).__init__(name) or Super.__init__(self, name), it's probably an obsolete book written for Python 2.
In Python 3, you just do this:
super().__init__(name): Calls the correct next superclass by method resolution order. You almost always want this.
super(Sub, self).__init__(name): Calls the correct next superclass—unless you make a mistake and get Sub wrong there. You only need this if you're writing dual-version code that has to run in 2.7 as well as 3.x.
Super.__init__(self, name): Calls Super, whether it's the correct next superclass or not. You only need this if the method resolution order is wrong and you have to work around it. [2]
If you want to understand more, it's all in the docs, but it can be a bit daunting:
__new__
__init__
super (also see Raymond Hettinger's blog post)
method invocation (also see the HOWTO)
The original introduction to super, __new__, and all the related features was very helpful to me in understanding all of this. I'm not sure if it'll be as helpful to someone who's not coming at this already understanding old-style Python classes, but it's pretty well written, and Guido (obviously) knows what he's talking about, so it might be worth reading.
1. The biggest cheat in this explanation is that super actually returns a proxy object that acts like _basecls bound to self in the same way methods are bound, which can be used to bind methods, like __init__. This is useful/interesting knowledge if you know how methods work, but probably just extra confusion if you don't.
2. … or if you're working with old-style classes, which don't support super (or proper method-resolution order). This never comes up in Python 3, which doesn't have old-style classes. But, unfortunately, you will see it in lots of tkinter examples, because the best tutorial is still Effbot's, which was written for Python 2.3, when Tkinter was all old-style classes, and has never been updated.
Super(name) is not a "direct call" to the superclass __init__. After all, you called Super, not Super.__init__.
Super.__init__ takes an uninitialized Super instance and initializes it. Super creates and initializes a new, completely separate instance from the one you wanted to initialize (and then you immediately throw the new instance away). The instance you wanted to initialize is untouched.
Super(name) creates a new instance of Super. Think of this example:
def __init__(self, name):
    x1 = Super(name)
    x2 = Super("some other name")
    assert x1 is not self
    assert x2 is not self
In order to explicitly call Super's constructor on the current instance, you'd have to use the following syntax:
def __init__(self, name):
    Super.__init__(self, name)
Now, maybe you don't want to read further if you are a beginner.
If you do, you will see that there is a good reason to use super(Sub, self).__init__(name) (or super().__init__(name) in Python 3) instead of Super.__init__(self, name).
Super.__init__(self, name) works fine, as long as you are certain that Super is in fact the next class in line. But in fact, you never know that for sure.
You could have the following code:
class Super:
    def __init__(self):
        print('Super __init__')
class Sub(Super):
    def __init__(self):
        print('Sub __init__')
        Super.__init__(self)
class Sub2(Super):
    def __init__(self):
        print('Sub2 __init__')
        Super.__init__(self)
class SubSub(Sub, Sub2):
    pass
You would now expect that SubSub() ends up calling all of the above constructors, but it does not:
>>> x = SubSub()
Sub __init__
Super __init__
>>>
To correct it, you'd have to do:
class Super:
    def __init__(self):
        print('Super __init__')
class Sub(Super):
    def __init__(self):
        print('Sub __init__')
        super().__init__()
class Sub2(Super):
    def __init__(self):
        print('Sub2 __init__')
        super().__init__()
class SubSub(Sub, Sub2):
    pass
Now it works:
>>> x = SubSub()
Sub __init__
Sub2 __init__
Super __init__
>>>
The reason is that, although the declared superclass of Sub is Super, super() follows the method resolution order (MRO) of the instance's class rather than the declared parent. With multiple inheritance in SubSub, Python's MRO linearizes the hierarchy as SubSub, then Sub, then Sub2, then Super, then object, so the class after Sub is Sub2, not Super.
You can test that:
>>> SubSub.__mro__
(<class '__main__.SubSub'>, <class '__main__.Sub'>, <class '__main__.Sub2'>, <class '__main__.Super'>, <class 'object'>)
Now, the super() call in constructors of each of the classes finds the next class in the MRO so that the constructor of that class can be called.
See https://www.python.org/download/releases/2.3/mro/
I am new to Python. I came across Python code in an OpenFlow controller that I am working on.
class SimpleSwitch(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_0.OFP_VERSION]
    def __init__(self, *args, **kwargs):
        super(SimpleSwitch, self).__init__(*args, **kwargs)
        self.mac_to_port = {}
My questions are as follows.
1. Is __init__ the constructor for a class?
2. Is self the same as C++'s this pointer?
3. Does super(SimpleSwitch, self).__init__(*args, **kwargs) mean calling the constructor of the parent/super class?
4. Can you add a new member to self as mac_to_port? Or has that already been added and is just being initialized here?
1. __init__ is the initialiser; __new__ is the constructor. See e.g. this question.
2. Effectively yes: the first argument to instance methods in Python, called self by convention, is the instance itself.
3. Calling the parent class's initialiser, yes.
4. It's adding a new attribute, an empty dictionary, to the SimpleSwitch instance, in addition to whatever the parent class's initialiser already set up.
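A quick way to check the last point without Ryu installed is a sketch like the following. BaseApp is a hypothetical stand-in for app_manager.RyuApp, and its name attribute is invented for the demo; the real class does more.

```python
class BaseApp:
    """Hypothetical stand-in for app_manager.RyuApp; not the real Ryu class."""
    def __init__(self, *args, **kwargs):
        self.name = kwargs.get("name", "base")

class SimpleSwitch(BaseApp):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # parent sets self.name
        self.mac_to_port = {}              # new attribute on this instance

a = SimpleSwitch(name="sw1")
b = SimpleSwitch(name="sw2")
a.mac_to_port["aa:bb"] = 1
print(vars(a))
print(vars(b))
```

Each instance gets its own dictionary, because the assignment runs once per instance inside __init__.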
Super in Python is not like C++'s super. I have not used C++, but I can tell you that super in python does not act the same. Instead of calling the parent, python super calls the children of the class in which super is called, then moves in an interesting chain. Think of a three-tiered class system where there is a single base class, two subclasses to that base class, and two subclasses to those subclasses. Calling super on the bottom tier would call the parent immediately above it, but calling super in one of the second-tier classes would call their children first, then it looks to the side and calls the other classes on its own tier, then the children of that same-tier class are called. Once all of the same-tier classes and all of their children are called, then super calls the parent of the middle-tier classes.
It's hard to explain in words. Watch Raymond Hettinger's "super considered super" talk from PyCon. He gives a very good explanation of how it works, and why python's super should not be called 'super'.
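If it helps, here is a small sketch of what actually decides the order: super() moves to the next class in the method resolution order (MRO) of the instance's class, which can be a sibling rather than a parent. All class names and the who method are hypothetical.

```python
class Base:
    def who(self):
        return ["Base"]

class Left(Base):
    def who(self):
        return ["Left"] + super().who()

class Right(Base):
    def who(self):
        return ["Right"] + super().who()

class Bottom(Left, Right):
    def who(self):
        return ["Bottom"] + super().who()

print(Left().who())    # on a Left instance, Left's super() reaches Base
print(Bottom().who())  # on a Bottom instance, Left's super() reaches Right
print([c.__name__ for c in Bottom.__mro__])
```

The same super() call inside Left ends up in different classes depending on the type of the instance it runs on.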
I am writing a class with multiple constructors using @classmethod. Now I would like both the __init__ constructor and the classmethod constructor to call some routine of the class to set initial values before doing other stuff.
From __init__ this is usually done with self:
def __init__(self, name="", revision=None):
    self._init_attributes()
def _init_attributes(self):
    self.test = "hello"
From a classmethod constructor, I would call another classmethod instead, because the instance (i.e. self) is not created until I leave the classmethod with return cls(...). Now, I can call my _init_attributes() method as
@classmethod
def from_file(cls, filename=None):
    cls._init_attributes()
    # do other stuff like reading from file
    return cls()
and this actually works (in the sense that I don't get an error and I can actually see the test attribute after executing c = Class.from_file()). However, if I understand things correctly, then this will set the attributes on the class level, not on the instance level. Hence, if I initialize an attribute with a mutable object (e.g. a list), then all instances of this class would use the same list, rather than their own instance list. Is this correct? If so, is there a way to initialize "instance" attributes in classmethods, or do I have to write the code in such a way that all the attribute initialisation is done in __init__?
Hmmm. Actually, while writing this: I may even have greater trouble than I thought because __init__ will be called upon return from the classmethod, won't it? So what would be a proper way to deal with this situation?
Note: Article [1] discusses a somewhat similar problem.
Yes, you're understanding things correctly: cls._init_attributes() will set class attributes, not instance attributes.
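Here is a minimal sketch of that sharing problem. The class name Thing and the items attribute are hypothetical, and _init_attributes is written as a classmethod, which is what calling it on cls amounts to.

```python
class Thing:
    @classmethod
    def _init_attributes(cls):
        cls.items = []  # stored on the class, not on any instance

    @classmethod
    def from_file(cls, filename=None):
        cls._init_attributes()
        return cls()

a = Thing.from_file()
b = Thing.from_file()
a.items.append("x")

print(b.items)               # ['x'] because both names resolve to Thing.items
print("items" in vars(a))    # False: the instance has no attribute of its own
```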
Meanwhile, it's up to your alternate constructor to construct and return an instance. In between constructing it and returning it, that's when you can call _init_attributes(). In other words:
@classmethod
def from_file(cls, filename=None):
    obj = cls()
    obj._init_attributes()
    # do other stuff like reading from file
    return obj
However, you're right that the only obvious way to construct and return an instance is to just call cls(), which will call __init__.
But this is easy to get around: just have the alternate constructors pass some extra argument to __init__ meaning "skip the usual initialization, I'm going to do it later". For example:
def __init__(self, name="", revision=None, _skip_default_init=False):
    # blah blah
@classmethod
def from_file(cls, filename=""):
    # blah blah setup
    obj = cls(_skip_default_init=True)
    # extra initialization work
    return obj
If you want to make this less visible, you can always take **kwargs and check it inside the method body… but remember, this is Python; you can't prevent people from doing stupid things, all you can do is make it obvious that they're stupid. And the _skip_default_init should be more than enough to handle that.
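A runnable version of the flag idea might look like this. It is only a sketch: the class name Document, the lines attribute, and the from_text constructor (used instead of from_file so that no file is needed) are all hypothetical.

```python
class Document:
    def __init__(self, name="", revision=None, _skip_default_init=False):
        self.name = name
        self.revision = revision
        if not _skip_default_init:
            self._init_attributes()

    def _init_attributes(self):
        self.lines = []  # default per-instance state

    @classmethod
    def from_text(cls, text):
        # Ask __init__ to skip the defaults, then do our own setup
        obj = cls(_skip_default_init=True)
        obj.lines = text.splitlines()
        return obj

d1 = Document("a")
d2 = Document.from_text("x\ny")
print(d1.lines, d2.lines)
```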
If you really want to, you can override __new__ as well. Constructing an object doesn't call __init__ unless __new__ returns an instance of cls or some subclass thereof. So, you can give __new__ a flag that tells it to skip over __init__ by munging obj.__class__, then restore the __class__ yourself. This is really hacky, but could conceivably be useful.
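The rule that __init__ is skipped unless __new__ returns an instance of cls can be checked with a small sketch. This does not implement the __class__ trick itself; the class names Odd and Normal and the calls list are hypothetical.

```python
calls = []

class Odd:
    def __new__(cls, value):
        # Returns something that is not an instance of cls
        return int(value)

    def __init__(self, value):
        calls.append("Odd.__init__")  # never reached

class Normal:
    def __new__(cls, value):
        return super().__new__(cls)

    def __init__(self, value):
        calls.append("Normal.__init__")

x = Odd("3")
y = Normal("3")
print(type(x).__name__, calls)
```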
A much cleaner solution—but for some reason even less common in Python—is to borrow the "class cluster" idea from Smalltalk/ObjC: Create a private subclass that has a different __init__ that doesn't super (or intentionally skips over its immediate base and supers from there), and then have your alternate constructor in the base class just return an instance of that subclass.
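A rough sketch of the class-cluster idea, assuming a private subclass that sets everything up itself. Config, _ConfigFromText, from_text, and the source attribute are hypothetical names.

```python
class Config:
    def __init__(self, name=""):
        self.name = name
        self.source = "defaults"

    @classmethod
    def from_text(cls, text):
        # Alternate constructor returns an instance of the private subclass
        return _ConfigFromText(text)

class _ConfigFromText(Config):
    def __init__(self, text):
        # Deliberately no super().__init__(): this class does its own setup
        self.name = text.strip()
        self.source = "text"

c = Config.from_text("  demo  ")
print(isinstance(c, Config), c.name, c.source)
```

Callers still get something that passes isinstance checks against the public class, while the usual __init__ never runs.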
Alternatively, if the only reason you don't want to call __init__ is so you can do the exact same thing __init__ would have done… why? DRY stands for "don't repeat yourself", not "bend over backward to find ways to force yourself to repeat yourself", right?
As might be familiar to most of you, this is from Mark Pilgrim's book DIP, chapter 5
class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename
Well I am new to python, coming from basic C background and having confusion understanding it. Stating what I understand, before what I don't understand.
Statement 0: FileInfo is inheriting from class UserDict
Statement 1: __init__ is not a constructor, however after the class instantiates, this is the first method that is defined.
Statement 2: self is almost like this
Now the trouble:
as per St1 init is defined as the first function.
UserDict.__init__(self)
Now within the same function __init__ why is the function being referenced, there is no inherent recursion I guess. Or is it trying to override the __init__ method of the class UserDict which the class FileInfo has inherited and put an extra parameter(key value pair) of filename and reference it to the filename being passed to __init__ method.
I am partly sure, I have answered my question, however as you can sense there is confusion, would be great if someone can explain me how to rule this confusion out with some more advanced use case and detailed example of how generally code is written.
You're correct, the __init__ method is not a constructor, it's an initializer called after the object is instantiated.
In the code you've presented, the __init__ method on the FileInfo class is extending the functionality of the __init__ method of the base class, UserDict. By calling the base __init__ method, it executes any code in the base class's initialization, and then adds its own. Without a call to the base class's __init__ method, only the code explicitly added to FileInfo's __init__ method would be called.
The conventional way to do this is by using the super function.
class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        super(FileInfo, self).__init__()
        self["name"] = filename
A common use case is returning extra values or adding additional functionality. In Django's class based views, the method get_context_data is used to get the data dictionary for rendering templates. So in an extended method, you'd get whatever values are returned from the base method, and then add your own.
class MyView(TemplateView):
    def get_context_data(self, **kwargs):
        context = super(MyView, self).get_context_data(**kwargs)
        context['new_key'] = self.some_custom_method()
        return context
This way you do not need to reimplement the functionality of the base method when you want to extend it.
Creating an object in Python is a two-step process:
__new__(cls, ...) # constructor
__init__(self, ...) # initializer
__new__ has the responsibility of creating the object, and is used primarily when the object is supposed to be immutable.
__init__ is called after __new__, and does any further configuration needed. Since most objects in Python are mutable, most classes don't override __new__ and simply rely on the inherited one.
self refers to the object in question. For example, if you have d = dict(); d.keys() then in the keys method self would refer to d, not to dict.
When a subclass has a method of the same name as its parent class, Python calls the subclass' method and ignores the parent's; so if the parent's method needs to be called, the subclass method must call it.
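To make that concrete, here is a sketch using Python 3's collections.UserDict (the book's example uses the older Python 2 UserDict module, so this is an assumption on my part). The Broken class is hypothetical and shows what happens when the parent initializer is skipped.

```python
from collections import UserDict

class Broken(UserDict):
    def __init__(self, filename=None):
        # Parent initializer skipped: the underlying self.data is never created
        self["name"] = filename

class FileInfo(UserDict):
    def __init__(self, filename=None):
        super().__init__()        # creates the underlying self.data dict
        self["name"] = filename

try:
    Broken("a.txt")
except AttributeError as exc:
    print("Broken failed:", exc)

print(FileInfo("a.txt"))
```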
"Or is it trying to override the init method of the class UserDict which the class FileInfo has inherited and put an extra parameter(key value pair) of filename and reference it to the filename being passed to init method."
It's exactly that. UserDict.__init__(self) calls the superclass's __init__ method.
Since you come from C, maybe you're not well experienced with OOP, so you could read this article : http://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming) to understand the inheritance principle better (and the "superclass" term I used).
The self variable represents the instance of the object itself. In Python this is not a hidden parameter as in other languages. You have to declare it explicitly. When you create an instance of the FileInfo class and call its methods, it will be passed automatically.
The __init__ method is roughly what represents a constructor in Python.
The __init__ method of FileInfo is overriding the __init__ method of UserDict.
Then FileInfo.__init__ calls UserDict.__init__ on the newly created FileInfo instance (self). This way all properties and magic available to UserDict are now available to that FileInfo instance (ie. they are inherited from UserDict).
The last line is the reason for overriding UserDict.__init__: UserDict does not create the wanted "name" entry (self["name"]) on its own.
When you define an __init__ method in a class that inherits from a base class, you are generally customizing the ancestor's behavior: you extend the ancestor's __init__ method and call it with the proper arguments.
__init__ is not a constructor, however after the class instantiates, this is the first method that is defined.
This method is called when an instance is being initialized, after __new__ (i.e. when you call ClassName()). I'm not sure what difference there is as opposed to a constructor.
Statement 2: self is almost like this
Yes but it is not a language construct. The name self is just convention. The first parameter passed to an instance method is always a reference to the class instance itself, so writing self there is just to name it (assign it to variable).
UserDict.__init__(self)
Here you are calling UserDict's __init__ method and passing it a reference to the new instance. Because you are not calling it as self.method_name, the instance is not passed automatically. You cannot call an inherited class's constructor without referencing its name or using super. So what you are doing is initializing your object the same way any UserDict object would be initialized.