understanding python self and init - python

This is from chapter 5 of Mark Pilgrim's book Dive Into Python (DIP), which may be familiar to most of you:
class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename
I am new to Python, coming from a basic C background, and I am having trouble understanding it. I'll state what I understand before what I don't understand.
Statement 0: FileInfo is inheriting from class UserDict
Statement 1: __init__ is not a constructor, however after the class instantiates, this is the first method that is defined.
Statement2: self is almost like this
Now the trouble:
as per St1 init is defined as the first function.
UserDict.__init__(self)
Now within the same function __init__ why is the function being referenced, there is no inherent recursion I guess. Or is it trying to override the __init__ method of the class UserDict which the class FileInfo has inherited and put an extra parameter(key value pair) of filename and reference it to the filename being passed to __init__ method.
I am partly sure I have answered my own question. However, as you can sense, there is still confusion, so it would be great if someone could explain how to clear it up with a more advanced use case and a detailed example of how such code is generally written.

You're correct, the __init__ method is not a constructor, it's an initializer called after the object is instantiated.
In the code you've presented, the __init__ method on the FileInfo class is extending the functionality of the __init__ method of the base class, UserDict. By calling the base __init__ method, it executes any code in the base class's initialization, and then adds its own. Without a call to the base class's __init__ method, only the code explicitly added to FileInfo's __init__ method would be called.
The conventional way to do this is by using the super function. Its first argument is the class the method is defined in (FileInfo here), not the base class:
class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        super(FileInfo, self).__init__()
        self["name"] = filename
A common use case is returning extra values or adding additional functionality. In Django's class based views, the method get_context_data is used to get the data dictionary for rendering templates. So in an extended method, you'd get whatever values are returned from the base method, and then add your own.
class MyView(TemplateView):
    def get_context_data(self, **kwargs):
        # get the base class's context first, then add to it
        context = super(MyView, self).get_context_data(**kwargs)
        context['new_key'] = self.some_custom_method()
        return context
This way you do not need to reimplement the functionality of the base method when you want to extend it.
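To make the FileInfo pattern concrete, here is a minimal runnable sketch. It assumes Python 3, where UserDict lives in the collections module and super() can be called without arguments. The TaggedFileInfo subclass and its "tag" key are hypothetical and exist only to show a second level of extension.

```python
from collections import UserDict


class FileInfo(UserDict):
    "store file metadata"

    def __init__(self, filename=None):
        # Run UserDict's own initialization first (it sets up the storage).
        super().__init__()
        # Then add the behavior specific to FileInfo.
        self["name"] = filename


class TaggedFileInfo(FileInfo):
    "hypothetical subclass that extends FileInfo in the same way"

    def __init__(self, filename=None, tag=None):
        # Reuse everything FileInfo.__init__ does, then add one more key.
        super().__init__(filename)
        self["tag"] = tag


info = TaggedFileInfo("song.mp3", tag="music")
print(dict(info))
```

Each level only adds its own piece and relies on the level above for the rest, which is the point of calling the base __init__.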

Creating an object in Python is a two-step process:
__new__(cls, ...) # constructor
__init__(self, ...) # initializer
__new__ has the responsibility of creating the object, and is overridden primarily when the object is supposed to be immutable.
__init__ is called after __new__, and does any further configuration needed. Since most objects in Python are mutable, overriding __new__ is usually skipped.
self refers to the object in question. For example, if you have d = dict(); d.keys() then in the keys method self would refer to d, not to dict.
When a subclass has a method of the same name as its parent class, Python calls the subclass' method and ignores the parent's; so if the parent's method needs to be called, the subclass method must call it.
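As an illustration of when __new__ matters, here is a small sketch with an immutable built-in as the base class. The Celsius class and its unit attribute are hypothetical names chosen for the example.

```python
class Celsius(float):
    """Hypothetical immutable value: a float rounded to one decimal place."""

    def __new__(cls, value):
        # A float's value is fixed at creation time, so it has to be
        # chosen here; __init__ would be too late to change it.
        return super().__new__(cls, round(value, 1))

    def __init__(self, value):
        # By now the object exists; only extra configuration is left.
        self.unit = "C"


t = Celsius(21.456)
print(t, t.unit)
```

Both methods receive the same arguments from Celsius(21.456); __new__ gets the class first, __init__ gets the freshly created instance.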

"Or is it trying to override the init method of the class UserDict which the class FileInfo has inherited and put an extra parameter(key value pair) of filename and reference it to the filename being passed to init method."
It's exactly that. UserDict.__init__(self) calls the superclass init method.
Since you come from C, maybe you're not well experienced with OOP, so you could read this article : http://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming) to understand the inheritance principle better (and the "superclass" term I used).

The self variable represents the instance of the object itself. In Python this is not a hidden parameter as in other languages; you have to declare it explicitly. When you create an instance of the FileInfo class and call its methods, it will be passed automatically.
The __init__ method is roughly what represents a constructor in Python.

The __init__ method of FileInfo is overriding the __init__ method of UserDict.
Then FileInfo.__init__ calls UserDict.__init__ on the newly created FileInfo instance (self). This way all properties and magic available to UserDict are now available to that FileInfo instance (i.e. they are inherited from UserDict).
The last line is the reason for overriding UserDict.__init__: UserDict does not create the wanted entry self["name"].

When you define an __init__ method for a class that inherits from a base class, you are generally customizing the ancestor class: as part of that customization, you extend the ancestor's __init__ method and call it with the proper arguments.

__init__ is not a constructor, however after the class instantiates, this is the first method that is defined.
This method is called when an instance is being initialized, after __new__ (i.e. when you call ClassName()). I'm not sure what difference there is as opposed to a constructor.
Statement2: self is almost like this
Yes but it is not a language construct. The name self is just convention. The first parameter passed to an instance method is always a reference to the class instance itself, so writing self there is just to name it (assign it to variable).
UserDict.__init__(self)
Here you are calling the UserDict's __init__ method and passing it a reference to the new instance (because you are not calling it with self.method_name, it is not passed automatically. You cannot call an inherited class's constructor without referencing its name, or using super). So what you are doing is initializing your object the same way any UserDict object would be initialized.
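A short sketch of both points: the name of the first parameter is only a convention, and a method can be called either through the instance or through the class. The Greeter class is a hypothetical example.

```python
class Greeter:
    def __init__(this, name):   # "this" works too; "self" is the convention
        this.name = name

    def greet(self):
        return "hello " + self.name


g = Greeter("world")
via_instance = g.greet()        # the instance is passed automatically
via_class = Greeter.greet(g)    # the instance is passed by hand
print(via_instance == via_class)
```

The second call form is exactly what UserDict.__init__(self) uses.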

Related

How to decorate a python class and override a method?

I have a class
class A:
    def sample_method(self):
        ...
I would like to decorate class A sample_method() and override the contents of sample_method()
class DecoratedA(A):
    def sample_method(self):
        ...
The setup above resembles inheritance, but I need to keep the preexisting instance of class A when the decorated function is used.
a # preexisting instance of class A
decorated_a = DecoratedA(a)
decorated_a.functionInClassA() #functions in Class A called as usual with preexisting instance
decorated_a.sample_method() #should call the overwritten sample_method() defined in DecoratedA
What is the proper way to go about this?
There isn't a straightforward way to do what you're asking. Generally, after an instance has been created, it's too late to mess with the methods its class defines.
There are two options you have, as far as I see it. Either you create a wrapper or proxy object for your pre-existing instance, or you modify the instance to change its behavior.
A proxy defers most behavior to the object itself, while only adding (or overriding) some limited behavior of its own:
class Proxy:
    def __init__(self, obj):
        self.obj = obj
    def overridden_method(self): # add your own limited behavior for a few things
        do_stuff()
    def __getattr__(self, name): # and hand everything else off to the other object
        return getattr(self.obj, name)
__getattr__ isn't perfect here, it can only work for regular methods, not special __dunder__ methods that are often looked up directly in the class itself. If you want your proxy to match all possible behavior, you probably need to add things like __add__ and __getitem__, but that might not be necessary in your specific situation (it depends on what A does).
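Here is a runnable sketch of the proxy idea, including the limitation with special methods. The class A below, with its sample_method, other_method and __len__, is a hypothetical stand-in for the class in the question.

```python
class A:
    def sample_method(self):
        return "original"

    def other_method(self):
        return "untouched"

    def __len__(self):
        return 3


class Proxy:
    def __init__(self, obj):
        self.obj = obj

    def sample_method(self):
        # The one behavior the proxy overrides.
        return "overridden"

    def __getattr__(self, name):
        # Only called when normal lookup fails: delegate to the wrapped object.
        return getattr(self.obj, name)


a = A()                      # the pre-existing instance
p = Proxy(a)
results = (p.sample_method(), p.other_method())

# len() looks up __len__ on type(p), bypassing __getattr__, so it fails
# unless Proxy defines __len__ itself.
try:
    len(p)
    len_works = True
except TypeError:
    len_works = False

print(results, len_works)
```

If the wrapped object needs to support len(), indexing, arithmetic and so on through the proxy, each of those special methods has to be written out on Proxy explicitly.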
As for changing the behavior of the existing object, one approach is to write your subclass, and then change the existing object's class to be the subclass. This is a little sketchy, since you won't have ever initialized the object as the new class, but it might work if you're only modifying method behavior.
class ModifiedA(A):
    def overridden_method(self): # do the override in a normal subclass
        do_stuff()
def modify_obj(obj): # then change an existing object's type in place!
    obj.__class__ = ModifiedA # this is not terribly safe, but it can work
You could also consider adding an instance variable that would shadow the method you want to override, rather than modifying __class__. Writing the function could be a little tricky, since it won't get bound to the object automatically when called (that only happens for functions that are attributes of a class, not attributes of an instance), but you could probably do the binding yourself (with partial or lambda if you need to access self).
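One way to do that binding by hand is types.MethodType from the standard library, used here as an alternative to partial or lambda. This is only a sketch; the class A and the replacement function are hypothetical.

```python
import types


class A:
    def sample_method(self):
        return "original"


def replacement(self):
    # A plain function; it only receives self because we bind it below.
    return "replaced on " + type(self).__name__


a = A()
b = A()

# Bind the function to this one instance and store it as an instance
# attribute, which shadows the method defined on the class.
a.sample_method = types.MethodType(replacement, a)

print(a.sample_method(), b.sample_method())
```

Only a is affected; b and every other instance of A still use the method from the class.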
First, why not just define it from the beginning, how you want it, instead of decorating it?
Second, why not decorate the method itself?
To answer the question:
You can reassign it
class A:
    def sample_method(self): ...
    pass
A.sample_method = DecoratedA.sample_method
but that affects every instance.
Another solution is to reassign the method for just one object.
import functools
a.sample_method = functools.partial(DecoratedA.sample_method, a)
Another solution is to (temporarily) change the type of an existing object.
a = A()
a.__class__ = DecoratedA
a.sample_method()
a.__class__ = A

Getting private attribute in parent class using super(), outside of a method

I have a class with a private constant _BAR = object().
In a child class, outside of a method (no access to self), I want to refer to _BAR.
Here is a contrived example:
class Foo:
    _BAR = object()
    def __init__(self, bar: object = _BAR):
        ...
class DFoo(Foo):
    """Child class where I want to access private class variable from parent."""
    def __init__(self, baz: object = super()._BAR):
        super().__init__(baz)
Unfortunately, this doesn't work. One gets an error: RuntimeError: super(): no arguments
Is there a way to use super outside of a method to get a parent class attribute?
The workaround is to use Foo._BAR, I am wondering though if one can use super to solve this problem.
Inside of DFoo, you cannot refer to Foo._BAR without referring to Foo. Python variables are searched in the local, enclosing, global and built-in scopes (and in this order, it is the so called LEGB rule) and _BAR is not present in any of them.
Let's ignore an explicit Foo._BAR.
Further, it gets inherited: DFoo._BAR will be looked up first in DFoo, and when not found, in Foo.
What other means are there to get the Foo reference? Foo is a base class of DFoo. Can we use this relationship? Yes and no. Yes at execution time and no at definition time.
The problem is when the DFoo is being defined, it does not exist yet. We have no start point to start following the inheritance chain. This rules out an indirect reference (DFoo -> Foo) in a def method(self, ....): line and in a class attribute _DBAR = _BAR.
It is possible to work around this limitation using a class decorator. Define the class and then modify it:
def deco(cls):
    cls._BAR = cls.__mro__[1]._BAR * 2 # __mro__[0] is the class itself
    return cls
class Foo:
    _BAR = 10
@deco
class DFoo(Foo):
    pass
print(Foo._BAR, DFoo._BAR) # 10 20
A similar effect can be achieved with a metaclass.
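A minimal sketch of the metaclass variant, under the same single-base-class assumption as the decorator. The metaclass name DoublingMeta is hypothetical.

```python
class DoublingMeta(type):
    """Hypothetical metaclass: derive _BAR from the first base class."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:
            # The class object exists now, so its parent can be reached.
            cls._BAR = bases[0]._BAR * 2
        return cls


class Foo(metaclass=DoublingMeta):
    _BAR = 10


class DFoo(Foo):
    pass


print(Foo._BAR, DFoo._BAR)
```

Unlike the decorator, the metaclass is inherited, so every further subclass of Foo gets the same treatment without being marked individually.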
The last option to get a reference to Foo is at execution time. We have the object self, its type is DFoo, and its parent type is Foo and there exists the _BAR. The well known super() is a shortcut to get the parent.
I have assumed only one base class for simplicity. If there were several base classes, super() returns only one of them. The example class decorator does the same. To understand how several bases are sorted to a sequence, see how the MRO works (Method Resolution Order).
My final thought is that I could not think up a use-case where such access as in the question would be required.
Short answer: you can't!
I'm not going into much details about super class itself here. (I've written a pure Python implementation in this gist if you like to read.)
But now let's see how we can call super:
1- Without arguments:
From PEP 3135:
This PEP proposes syntactic sugar for use of the super type to
automatically construct instances of the super type binding to the
class that a method was defined in, and the instance (or class object
for classmethods) that the method is currently acting upon.
The new syntax:
super()
is equivalent to:
super(__class__, <firstarg>)
...and <firstarg> is the first parameter of the method
So this is not an option because you don't have access to the "instance".
(Body of the function/methods is not executed unless it gets called, so no problem if DFoo doesn't exist yet inside the method definition)
2- super(type, instance)
From documentation:
The zero argument form only works inside a class definition, as the
compiler fills in the necessary details to correctly retrieve the
class being defined, as well as accessing the current instance for
ordinary methods.
What were those necessary details mentioned above? A "type" and A "instance":
We can pass neither the "instance" nor the "type", which is DFoo here. The first is unavailable because we are not inside a method, so we don't have access to the instance (self). The second is DFoo itself: while the body of the DFoo class is being executed there is no reference to DFoo, because it doesn't exist yet. The body of the class is executed inside a namespace, which is a dictionary. After that, a new instance of type type, here named DFoo, is created using that populated dictionary and added to the global namespace. That's roughly what the class keyword does in its simple form.
3- super(type, type):
If the second argument is a type, issubclass(type2, type) must be
true
The same reason mentioned above about accessing DFoo applies here.
4- super(type):
If the second argument is omitted, the super object returned is
unbound.
If you have an unbound super object, you can't do attribute lookup through it (except for the super object's own attributes). Remember that a super() object is a descriptor. You can turn an unbound object into a bound object by calling __get__ and passing the instance:
class A:
    a = 1
class B(A):
    pass
class C(B):
    sup = super(B)
    try:
        sup.a
    except AttributeError as e:
        print(e) # 'super' object has no attribute 'a'
obj = C()
print(obj.sup.a) # 1
obj.sup automatically calls the __get__.
And again, the same reason about accessing the DFoo type mentioned above applies; nothing changes. This form is just added for the record. These are the ways we can call super.
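The point made under option 2, that DFoo does not exist while its own body is executing, can be checked directly. This sketch also shows the rough equivalent of the class statement using the three-argument form of type; the marker attribute is hypothetical.

```python
class Foo:
    _BAR = object()


# Referring to DFoo inside its own body fails: the name is bound only
# after the whole body has been executed.
try:
    class DFoo(Foo):
        parent = DFoo.__mro__[1]
except NameError as exc:
    error = str(exc)

# Roughly what the class statement does: run the body into a dict,
# then build the class object from it and bind the name.
namespace = {"marker": 1}
DFoo = type("DFoo", (Foo,), namespace)

print(error)
print(DFoo._BAR is Foo._BAR)
```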

Default call to __init__ when creating an instance

Just wanted some help understanding these lines of code:
class Parent:
    def __init__(self):
        print("instance created")
parent1=Parent()
parent2=Parent.__init__(parent1)
output
instance created
instance created
I am trying to understand how a constructor is called in OOP for python.
In the first line the method __init__ is called by default and the self argument that is passed is somehow parent1?
The second line is the more traditional way I would've thought methods would be called. Since __init__ takes an instance of the parent class as an argument I passed parent1 and it works. I get what is happening in the second line, just wanted to ask what the computer is doing to create the instance parent1 in the first line.
__init__ is not a constructor, it's an initializer. When Python creates an object, it's actually created in __new__ (usually left as the default, which just makes an empty object of the right class), which receives a reference to the class, and returns an instance (typically empty; no attributes set). The resulting instance is passed implicitly as the self in __init__, which then establishes the instance attributes.
Typically, you don't call special methods like __init__ directly (aside from cases involving super() with cooperative inheritance), you just let Python do it for you. The only way to avoid calling __init__ would be to explicitly invoke the class's __new__ (which is also extremely unusual).
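For illustration only, this is what explicitly invoking __new__ looks like and what it skips. The ready attribute is a hypothetical example.

```python
class Parent:
    def __init__(self):
        self.ready = True


normal = Parent()               # __new__ followed by __init__
bare = Parent.__new__(Parent)   # __new__ only; __init__ never runs

print(hasattr(normal, "ready"), hasattr(bare, "ready"))
```

The bare object is a genuine Parent instance, but it has none of the attributes that __init__ would have set.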
__init__ is the equivalent to a constructor in Python. Think of an object oriented language as one that has a mandatory argument for functions that represent object methods, so you always have access to that object in that function. Most languages don't make you type out the way you pass in this. Python uses self, and makes you type it out for every method. It's the same thing, it's just not doing extra work for you.
So when Python instantiates a class, it passes the class to the class's __new__ function, generates an object, and then passes that object to the class's __init__ function as the first argument.
You are correct that __init__() works like a constructor: it runs automatically when an object is instantiated (as would happen with a Java constructor, if that helps). Although you can call __init__ directly, you shouldn't call functions/methods starting with _ or __; they are meant to be called from within the class/object.
When self appears as a parameter of a method in a class, you don't have to supply the object yourself; Python will figure it out. So the second line above (parent2 = ...) is not recommended.
See the documentation:
object.__init__(self[, ...])
Called after the instance has been created (by __new__()), but before it is returned to the caller. The arguments are those passed to the class constructor expression.
object.__new__(cls[, ...])
Called to create a new instance of class cls. __new__() is a static method (special-cased so you need not declare it as such) that takes the class of which an instance was requested as its first argument. The remaining arguments are those passed to the object constructor expression
So under the hood in parent1 = Parent(), Python's basically doing this:
_temp_new_parent = Parent.__new__(Parent) # Inherited from "object.__new__"
Parent.__init__(_temp_new_parent)
parent1 = _temp_new_parent
(_temp_new_parent doesn't really exist, I'm just using it as an abstraction.)
Note that __init__() doesn't return anything, so in your code, parent2 is None. And if __init__() had set instance attributes, it would have set them on parent1 since that's what you passed in.
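Both remarks can be verified with a variation of the code from the question. The calls counter is a hypothetical attribute added so the effect of the second call is visible.

```python
class Parent:
    def __init__(self):
        # Count how many times this instance has been initialized.
        self.calls = getattr(self, "calls", 0) + 1


parent1 = Parent()                    # __init__ runs once, implicitly
parent2 = Parent.__init__(parent1)    # runs again, on the same object

print(parent2, parent1.calls)
```

No second object is created: the explicit call re-initializes parent1 and hands back the return value of __init__, which is None.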

running __init__ inside its own definition

I'm currently learning tkinter from sentdex's tutorial, and to me it seems that I'm writing code to run __init__ in its own definition. What does a line like that mean? Is it tkinter's __init__ function?
class seaOfBTCapp(tk.Tk):
    def __init__(self,*args,**kwargs):
        tk.Tk.__init__(self,*args,**kwargs)
It's invoking another class's constructor on itself.
This is a fun quirk of Python's object-oriented design. "Instance methods" are really just functions defined on the class that take the current instance as their first parameter, which is normally passed implicitly. You can, in fact, call them through the class and provide the object explicitly:
ex = [1, 2, 3, 4, 5]
# the following are equivalent:
ex.pop(0) # call the method on the instance, passing it implicitly
list.pop(ex, 0) # call the method on the class `list`, passing the instance explicitly
The same behavior is being invoked here. You're taking the __init__ method of the tk.Tk class, and passing self in as the "instance". This is an uncommon, but valid, way of accessing methods in the superclass that have been overridden in your subclass (for example, the constructor).
As in @Barmar's answer, a better solution is using super(), which produces something resembling an instance of the superclass, which you then call __init__ on to get the superclass's implementation of __init__() passing self implicitly, as you would expect.
What that line does is call your parent class's __init__ method. That's what would have happened if you didn't define your own method, so if you're not doing anything else in your __init__, you should probably just skip it and let the inherited method run normally.
It's also probably better to call super().__init__(*args, **kwargs), rather than naming the parent class explicitly (and needing to pass self by hand). This is particularly the case if you might ever use this class in a situation involving multiple inheritance, where explicitly naming the next class to be called can get the MRO wrong. If you're just starting in programming, don't worry too much about this, multiple inheritance is a pretty advanced topic (though it's easier to get right in Python than in many other languages).
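For the curious, here is a small sketch of why super() matters under multiple inheritance. The class names and the calls list are hypothetical; the list just records the order in which the __init__ methods run.

```python
calls = []


class Base:
    def __init__(self):
        calls.append("Base")


class Left(Base):
    def __init__(self):
        calls.append("Left")
        super().__init__()   # next class in the MRO, not necessarily Base


class Right(Base):
    def __init__(self):
        calls.append("Right")
        super().__init__()


class Child(Left, Right):
    def __init__(self):
        calls.append("Child")
        super().__init__()


Child()
mro_names = [c.__name__ for c in Child.__mro__]
print(calls)
print(mro_names)
```

With super(), every __init__ runs exactly once, in MRO order. If Left and Right each called Base.__init__(self) by name instead, Left would jump straight to Base and Right's initializer would never run.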
I think this is equivalent to the more modern:
class seaOfBTCapp(tk.Tk):
    def __init__(self,*args,**kwargs):
        super().__init__(*args,**kwargs)

Multiple constructors in python calling the same routine

I am writing a class with multiple constructors using @classmethod. Now I would like both the __init__ constructor and the classmethod constructor to call some routine of the class to set initial values before doing other stuff.
From __init__ this is usually done with self:
def __init__(self, name="", revision=None):
    self._init_attributes()
def _init_attributes(self):
    self.test = "hello"
From a classmethod constructor, I would call another classmethod instead, because the instance (i.e. self) is not created until I leave the classmethod with return cls(...). Now, I can call my _init_attributes() method as
@classmethod
def from_file(cls, filename=None):
    cls._init_attributes()
    # do other stuff like reading from file
    return cls()
and this actually works (in the sense that I don't get an error and I can actually see the test attribute after executing c = Class.from_file()). However, if I understand things correctly, then this will set the attributes on the class level, not on the instance level. Hence, if I initialize an attribute with a mutable object (e.g. a list), then all instances of this class would use the same list, rather than their own instance list. Is this correct? If so, is there a way to initialize "instance" attributes in classmethods, or do I have to write the code in such a way that all the attribute initialisation is done in __init__?
Hmmm. Actually, while writing this: I may even have greater trouble than I thought because init will be called upon return from the classmethod, won't it? So what would be a proper way to deal with this situation?
Note: Article [1] discusses a somewhat similar problem.
Yes, you're understanding things correctly: cls._init_attributes() will set class attributes, not instance attributes.
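A quick sketch of the sharing problem, assuming _init_attributes is written as a classmethod as the question describes. The Record class and its items list are hypothetical names.

```python
class Record:
    @classmethod
    def _init_attributes(cls):
        cls.items = []          # stored on the class, not on any instance

    @classmethod
    def from_file(cls, filename=None):
        cls._init_attributes()
        return cls()


first = Record.from_file()
second = Record.from_file()
first.items.append("x")

# Neither instance owns an "items" attribute; both see the class's list.
print(second.items, first.items is second.items)
```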
Meanwhile, it's up to your alternate constructor to construct and return an instance. In between constructing it and returning it, that's when you can call _init_attributes(). In other words:
@classmethod
def from_file(cls, filename=None):
    obj = cls()
    obj._init_attributes()
    # do other stuff like reading from file
    return obj
However, you're right that the only obvious way to construct and return an instance is to just call cls(), which will call __init__.
But this is easy to get around: just have the alternate constructors pass some extra argument to __init__ meaning "skip the usual initialization, I'm going to do it later". For example:
def __init__(self, name="", revision=None, _skip_default_init=False):
    # blah blah
@classmethod
def from_file(cls, filename=""):
    # blah blah setup
    obj = cls(_skip_default_init=True)
    # extra initialization work
    return obj
If you want to make this less visible, you can always take **kwargs and check it inside the method body… but remember, this is Python; you can't prevent people from doing stupid things, all you can do is make it obvious that they're stupid. And the _skip_default_init should be more than enough to handle that.
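Filled in as a runnable sketch, the flag approach could look like this. The Record class, the items attribute and the from_text constructor are hypothetical; from_text stands in for from_file so the example does not need a real file.

```python
class Record:
    def __init__(self, name="", revision=None, _skip_default_init=False):
        self.name = name
        self.revision = revision
        if not _skip_default_init:
            self._init_attributes()

    def _init_attributes(self):
        # Runs on an instance, so every object gets its own list.
        self.test = "hello"
        self.items = []

    @classmethod
    def from_text(cls, text):
        # Build the instance without the default setup, then do it here.
        obj = cls(_skip_default_init=True)
        obj._init_attributes()
        obj.items.extend(text.split())
        return obj


a = Record.from_text("x y")
b = Record()
print(a.items, b.items)
```

Because _init_attributes is an ordinary method called on obj, the attributes land on the instance and nothing is shared between a and b.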
If you really want to, you can override __new__ as well. Constructing an object doesn't call __init__ unless __new__ returns an instance of cls or some subclass thereof. So, you can give __new__ a flag that tells it to skip over __init__ by munging obj.__class__, then restore the __class__ yourself. This is really hacky, but could conceivably be useful.
A much cleaner solution—but for some reason even less common in Python—is to borrow the "class cluster" idea from Smalltalk/ObjC: Create a private subclass that has a different __init__ that doesn't super (or intentionally skips over its immediate base and supers from there), and then have your alternate constructor in the base class just return an instance of that subclass.
Alternatively, if the only reason you don't want to call __init__ is so you can do the exact same thing __init__ would have done… why? DRY stands for "don't repeat yourself", not "bend over backward to find ways to force yourself to repeat yourself", right?
