I'm learning python and the construct method __init__ makes me confused.
class Test(object):
def __init__(self):
pass
As known, python implicitly pass a parameter representing the object itself at the first place. It is acceptable in common member functions. But what makes me confused is that the constructor requests an object itself.
It looks like the c++ code below. This code does not make sense! The parameter thiz does not even exit before the constructor!
class Test {
public:
Test(Test *thiz);
};
So, my question is, why python needs a "non-existent" self object in constructor __init__?
__init__ behaves like any other normal function in a class. It gets called by the interpreter at a special time if it is defined, but there is nothing special about it. You can call it any time yourself, just like a regular method.
Here are some facts to help you see the picture:
Function objects are non-data descriptors in python. That means that function objects have a __get__ method, but not __set__. And if course they have a __call__ method to actually run the function.
When you put a def statement in a class body, it creates a regular function, just like it would elsewhere. You can see this by looking at type(Test.__init__). To call Test.__init__, you would have to manually pass in a self parameter.
The magic happens when you call __init__ on an instance of Test. For example:
a = Test()
a.__init__()
This code actually calls __init__ twice (which we'll get into momentarily). The explicit second call does not pass in a parameter to the method. That's because a.__init__ has special meaning when the name __init__ is present in the class but not the instance. If __init__ is a descriptor, the interpreter will not return it directly, but rather will call __get__ on it and return the result. This is called binding.
a.__init__() is roughly equivalent to type(a).__init__.__get__(a, None)(). The call .__get__(a, None) returns a callable object that is a bound method. It is like a special type of partial function, in which the first positional parameter is set to a. You can see this by checking type(a.__init__).
The self parameter is passed to methods as the first positional parameter. As such, it does not have to be called self. The name is just a convention. def __init__(this): or def __init__(x) would work just as well.
Finally, let's discuss how a = Test() ends up calling __init__. Class objects are callable in python. They have a class themselves, called a metaclass (usually just type), which actually defines a __call__ method. When you do Test(), you are literally calling the class.
By default, the __call__ method of a class looks something (very roughly approximately) like this:
def __call__(cls, *args, **kwargs):
self = cls.__new__(cls, *args, **kwargs)
if isinstance(self, cls):
type(self).__init__(self, *args, **kwargs)
So a new instance is only created by __new__, not __init__. The latter is an initializer, not a constructor or allocator. In fact, as I mentioned before, you can call __init__ as often as you like on most classes. Notable exceptions include immutable built-in classes like tuple and int.
As an initializer, __init__ obviously needs access to the object it is initializing. Your example does nothing with self, but it is pretty standard to set instance attributes and do something things to prep the object. For cases like yours, an explicit __init__ is not necessary at all, since a missing function will be found in the base class hierarchy.
Related
I'm learning overloading in Python 3.X and to better understand the topic, I wrote the following code that works in 3.X but not in 2.X. I expected the below code to fail since I've not defined __call__ for class Test. But to my surprise, it works and prints "constructor called". Demo.
class Test:
def __init__(self):
print("constructor called")
#Test.__getitem__() #error as expected
Test.__call__() #this works in 3.X(but not in 2.X) and prints "constructor called"! WHY THIS DOESN'T GIVE ERROR in 3.x?
So my question is that how/why exactly does this code work in 3.x but not in 2.x. I mean I want to know the mechanics behind what is going on.
More importantly, why __init__ is being used here when I am using __call__?
In 3.x:
About attribute lookup, type and object
Every time an attribute is looked up on an object, Python follows a process like this:
Is it directly a part of the actual data in the object? If so, use that and stop.
Is it directly a part of the object's class? If so, hold onto that for step 4.
Otherwise, check the object's class for __getattr__ and __getattribute__ overrides, look through base classes in the MRO, etc. (This is a massive simplification, of course.)
If something was found in step 2 or 3, check if it has a __get__. If it does, look that up (yes, that means starting over at step 1 for the attribute named __get__ on that object), call it, and use its return value. Otherwise, use what was returned directly.
Functions have a __get__ automatically; it is used to implement method binding. Classes are objects; that's why it's possible to look up attributes in them. That is: the purpose of the class Test: block is to define a data type; the code creates an object named Test which represents the data type that was defined.
But since the Test class is an object, it must be an instance of some class. That class is called type, and has a built-in implementation.
>>> type(Test)
<class 'type'>
Notice that type(Test) is not a function call. Rather, the name type is pre-defined to refer to a class, which every other class created in user code is (by default) an instance of.
In other words, type is the default metaclass: the class of classes.
>>> type
<class 'type'>
One may ask, what class does type belong to? The answer is surprisingly simple - itself:
>>> type(type) is type
True
Since the above examples call type, we conclude that type is callable. To be callable, it must have a __call__ attribute, and it does:
>>> type.__call__
<slot wrapper '__call__' of 'type' objects>
When type is called with a single argument, it looks up the argument's class (roughly equivalent to accessing the __class__ attribute of the argument). When called with three arguments, it creates a new instance of type, i.e., a new class.
How does type work?
Because this is digging right at the core of the language (allocating memory for the object), it's not quite possible to implement this in pure Python, at least for the reference C implementation (and I have no idea what sort of magic is going on in PyPy here). But we can approximately model the type class like so:
def _validate_type(obj, required_type, context):
if not isinstance(obj, required_type):
good_name = required_type.__name__
bad_name = type(obj).__name__
raise TypeError(f'{context} must be {good_name}, not {bad_name}')
class type:
def __new__(cls, name_or_obj, *args):
# __new__ implicitly gets passed an instance of the class, but
# `type` is its own class, so it will be `type` itself.
if len(args) == 0: # 1-argument form: check the type of an existing class.
return obj.__class__
# otherwise, 3-argument form: create a new class.
try:
bases, attrs = args
except ValueError:
raise TypeError('type() takes 1 or 3 arguments')
_validate_type(name, str, 'type.__new__() argument 1')
_validate_type(bases, tuple, 'type.__new__() argument 2')
_validate_type(attrs, dict, 'type.__new__() argument 3')
# This line would not work if we were actually implementing
# a replacement for `type`, as it would route to `object.__new__(type)`,
# which is explicitly disallowed. But let's pretend it does...
result = super().__new__()
# Now, fill in attributes from the parameters.
result.__name__ = name_or_obj
# Assigning to `__bases__` triggers a lot of other internal checks!
result.__bases__ = bases
for name, value in attrs.items():
setattr(result, name, value)
return result
del __new__.__get__ # `__new__`s of builtins don't implement this.
def __call__(self, *args):
return self.__new__(self, *args)
# this, however, does have a `__get__`.
What happens (conceptually) when we call the class (Test())?
Test() uses function-call syntax, but it's not a function. To figure out what should happen, we translate the call into Test.__class__.__call__(Test). (We use __class__ directly here, because translating the function call using type - asking type to categorize itself - would end up in endless recursion.)
Test.__class__ is type, so this becomes type.__call__(Test).
type contains a __call__ directly (type is its own class, remember?), so it's used directly - we don't go through the __get__ descriptor. We call the function, with Test as self, and no other arguments. (We have a function now, so we don't need to translate the function call syntax again. We could - given a function func, func.__class__.__call__.__get__(func) gives us an instance of an unnamed builtin "method wrapper" type, which does the same thing as func when called. Repeating the loop on the method wrapper creates a separate method wrapper that still does the same thing.)
This attempts the call Test.__new__(Test) (since self was bound to Test). Test.__new__ isn't explicitly defined in Test, but since Test is a class, we don't look in Test's class (type), but instead in Test's base (object).
object.__new__(Test) exists, and does magical built-in stuff to allocate memory for a new instance of the Test class, make it possible to assign attributes to that instance (even though Test is a subtype of object, which disallows that), and set its __class__ to Test.
Similarly, when we call type, the same logical chain turns type(Test) into type.__class__.__call__(type, Test) into type.__call__(type, Test), which forwards to type.__new__(type, Test). This time, there is a __new__ attribute directly in type, so this doesn't fall back to looking in object. Instead, with name_or_obj being set to Test, we simply return Test.__class__, i.e., type. And with separate name, bases, attrs arguments, type.__new__ instead creates an instance of type.
Finally: what happens when we call Test.__call__() explicitly?
If there's a __call__ defined in the class, it gets used, since it's found directly. This will fail, however, because there aren't enough arguments: the descriptor protocol isn't used since the attribute was found directly, so self isn't bound, and so that argument is missing.
If there isn't a __call__ method defined, then we look in Test's class, i.e., type. There's a __call__ there, so the rest proceeds like steps 3-5 in the previous section.
In Python 3.x, every class is implicitely a child of the builtin class object. And at least in the CPython implementation, the object class has a __call__ method which is defined in its metaclass type.
That means that Test.__call__() is exactly the same as Test() and will return a new Test object, calling your custom __init__ method.
In Python 2.x classes are by default old-style classes and are not child of object. Because of that __call__ is not defined. You can get the same behaviour in Python 2.x by using new style classes, meaning by making an explicit inheritance on object:
# Python 2 new style class
class Test(object):
...
Why it's required to use self keyword as an argument when calling the parent method from the child method?
Let me give an example,
class Account:
def __init__(self,filepath):
self.filepath = filepath
with open(self.filepath,"r") as file:
self.blanace = int(file.read())
def withDraw(self,amount):
self.blanace = self.blanace - amount
self.commit()
def deposite(self,amount):
self.blanace = self.blanace + amount
self.commit()
def commit(self):
with open(self.filepath,"w") as file:
file.write(str(self.blanace))
class Checking(Account):
def __init__(self,filepath):
Account.__init__(sellf,filepath) ######## I'm asking about this line.
Regarding this code,
I understand that self is automatically passed to the class when declaring a new object, so,
I expect when I declare new object, python will set self = the declared object, so now the self keyword will be available in the "'init'" child method, so no need to write it manually again like
Account.__init__(sellf,filepath) ######## I'm asking about this line.
All instance methods are just function-valued class attributes. If you access the attribute via an instance, some behind-the-scenes "magic" (known as the descriptor protocol) takes care of changing foo.bar() to type(foo).bar(foo). __init__ itself is also just another instance method, albeit one you usually only call explicitly when overriding __init__ in a child.
In your example, you are explicitly invoking the parent class's __init__ method via the class, so you have to pass self explicitly (self.__init__(filepath) would result in infinite recursion).
One way to avoid this is to not refer to the parent class explicitly, but to let a proxy determine the "closest" parent for you.
super().__init__(filepath)
There is some magic here: super with no arguments determines, with some help from the Python implementation, which class it statically occurs in (in this case, Checking) and passes that, along with self, as the implicit arguments to super. In Python 2, you always had to be explicit: super(Checking, self).__init__(filepath). (In Python 3, you can still pass argument explicitly, because there are some use cases, though rare, for passing arguments other than the current static class and self. Most commonly, super(SomeClass) does not get self as an implicit second argument, and handles class-level proxying.)
Specifically, the function class defines a __get__ method; if the result of an attribute lookup defines __get__, the return value of that method is returned instead of the attribute value itself. In other words,
foo.bar
becomes
foo.__dict__['bar'].__get__(foo, type(foo))
and that return value is an object of type method. Calling a method instance simply causes the original function to be called, with its first argument being the instance that __get__ took as its first argument, and its remaining arguments are whatever other arguments were passed to the original method call.
Generally speaking, I would tally this one up to the Zen of Python -- specifically, the following statements:
Explicit is better than implicit.
Readability counts.
In the face of ambiguity, refuse the temptation to guess.
... and so on.
It's the mantra of Python -- this, along with many other cases may seem redundant and overly simplistic, but being explicit is one of Python's key "goals." Perhaps another user can give more explicit examples, but in this case, I would say it makes sense to not have arguments be explicitly defined in one call, then vanish -- it might make things unclear when looking at a child function without also looking at its parent.
I'm trying to develop an understanding of OOP programming in python and one thing that has confused me is how a function becomes a bound method. What I think I understand so far is
In python 3.x "unbound methods" no longer exist. There's only functions and (bound) methods.
When an instance invokes a class function obj.func descriptor protocol is used and func.__get__ is invoked
The object returned is a bound method to the obj instance where the first parameter (usually self) is the obj being passed
What happens in between step 2 and 3?
It seems like something along the lines of
obj.func=type(obj).func.__get__(func, obj)
Where somewhere in __get__ there's something like a decorator that will then return the bound method.
While executing the following code:
class Test():
def __init__(self):
self.hi_there()
self.a = 5
def hi_there(self):
print(self.a)
new_object = Test()
new_object.hi_there()
I have received an error:
Traceback (most recent call last):
File "/root/a.py", line 241, in <module>
new_object = Test()
File "/root/a.py", line 233, in __init__
self.hello()
File "/root/a.py", line 238, in hello
print(self.a)
AttributeError: 'Test' object has no attribute 'a'
Why do we need to specify the self inside the function while the object is not initialized yet? The possibility to call hi_there() function means that the object is already set, but how come if other variables attributed to this instances haven't been initialized yet?
What is the self inside the __init__ function if it's not a "full" object yet?
Clearly this part of code works:
class Test():
def __init__(self):
#self.hi_there()
self.a = 5
self.hi_there()
def hi_there(self):
print(self.a)
new_object = Test()
new_object.hi_there()
I come from C++ world, there you have to declare the variables before you assign them.
I fully understand your the use of self. Although I don't understand what is the use of self inside__init__() if the self object is not fully initialized.
There is no magic. By the time __init__ is called, the object is created and its methods defined, but you have the chance to set all the instance attributes and do all other initialization. If you look at execution in __init__:
def __init__(self):
self.hi_there()
self.a = 5
def hi_there(self):
print(self.a)
the first thing that happens in __init__ is that hi_there is called. The method already exists, so the function call works, and we drop into hi_there(), which does print(self.a). But this is the problem: self.a isn't set yet, since this only happens in the second line of __init__, but we called hi_there from the first line of __init__. Execution hasn't reached the line where you set self.a = 5, so there's no way that the method call self.hi_there() issued before this assignment can use self.a. This is why you get the AttributeError.
Actually, the object has already been created when __init__ is called. That's why you need self as a parameter. And because of the way Python works internally, you don't have access to the objects without self (Bear in mind that it doesn't need to be called self, you can call it anything you want as long as it is a valid name. The instance is always the first parameter of a method, whatever it's name is.).
The truth is that __init__ doesn't create the object, it just initializes it. There is a class method called __new__, which is in charge of creating the instance and returning it. That's where the object is created.
Now, when does the object get it's a attribute. That's in __init__, but you do have access to it's methods inside of __init__. I'm not completely knowledable about how the creation of the objects works, but methods are already set once you get to that point. That doesn't happen with values, so they are not available until you define them yourself in __init__.
Basically Python creates the object, gives it it's methods, and then gives you the instance so you can initialize it's attributes.
EDIT
Another thing I forgot to mention. Just like you define __init__, you can define __new__ yourself. It's not very common, but you do it when you need to modify the actual object's creation. I've only seen it when defining metaclasses (What are metaclasses in Python?). Another method you can define in that case is __call__, giving you even more control.
Not sure what you meant here, but I guess the first code sample should call an hello() function instead of the hi_there() function.
Someone corrects me if I'm wrong, but in Python, defining a class, or a function is dynamic. By this I mean, defining a class or a function happens at runtime: these are regular statements that are executed just like others.
This language feature allows powerful thing such as decorating the behavior of a function to enrich it with extra functionality (see decorators).
Therefore, when you create an instance of the Test class, you try to call the hello() function before you have set explicitly the value of a. Therefore, the Test class is not YET aware of its a attribute. It has to be read sequentially.
I would like to have a function pointer ptr that can point to either:
a function,
the method of an object instance, or
the constructor of the object.
In the latter case, the execution of ptr() should instantiate the class.
def function(argument) :
print("Function called with argument: "+str(argument))
class C(object) :
def __init__(self,argument) :
print("C's __init__ method called with argument: "+str(argument))
def m(self,argument) :
print("C's method 'm' called with argument: "+str(argument))
## works
ptr = function
ptr('A')
## works
instance = C('asdf')
ptr = instance.m
ptr('A')
## fails
constructorPtr = C.__init__
constructorPtr('A')
This produces as output:
Function called with argument: A
C's __init__ method called with argument: asdf
C's method 'm' called with argument: A
Traceback (most recent call last): File "tmp.py", line 24, in <module>
constructorPtr('A')
TypeError: unbound method __init__() must be called with C instance as first argument (got str instance instead)
showing that the first two ptr() calls worked, but the last did not.
The reason this doesn't work is that the __init__ method isn't a constructor, it's an initializer.*
Notice that its first argument is self—that self has to be already constructed before its __init__ method gets called, otherwise, where would it come from.
In other words, it's a normal instance method, just like instance.m is, but you're trying to call it as an unbound method—exactly as if you'd tried to call C.m instead of instance.m.
Python does have a special method for constructors, __new__ (although Python calls this a "creator" to avoid confusion with languages with single-stage construction). This is a static method that takes the class to construct as its first argument and the constructor arguments as its other arguments. The default implementation that you've inherited from object just creates an instance of that class and passes the arguments along to its initializer.** So:
constructor = C.__new__
constructor(C, 'A')
Or, if you prefer:
from functools import partial
constructor = partial(C.__new__, C)
constructor('A')
However, it's incredibly rare that you'll ever want to call __new__ directly, except from a subclass's __new__. Classes themselves are callable, and act as their own constructors—effectively that means that they call the __new__ method with the appropriate arguments, but there are some subtleties (and, in every case where they differ, C() is probably what you want, not C.__new__(C)).
So:
constructor = C
constructor('A')
As user2357112 points out in a comment:
In general, if you want a ptr that does whatever_expression(foo) when you call ptr(foo), you should set ptr = whatever_expression
That's a great, simple rule of thumb, and Python has been carefully designed to make that rule of thumb work whenever possible.
Finally, as a side note, you can assign ptr to anything callable, not just the cases you described:
a function,
a bound method (your instance.m),
a constructor (that is, a class),
an unbound method (e.g., C.m—which you can call just fine, but you'll have to pass instance as the first argument),
a bound classmethod (e.g., both C.cm and instance.cm, if you defined cm as a #classmethod),
an unbound classmethod (harder to construct, and less useful),
a staticmethod (e.g., both C.sm and instance.sm, if you defined sm as a #staticmethod),
various kinds of implementation-specific "builtin" types that simulate functions, methods, and classes.
an instance of any type with a __call__ method,
And in fact, all of these are just special cases of the last one—the type type has a __call__ method, as do types.FunctionType and types.MethodType, and so on.
* If you're familiar with other languages like Smalltalk or Objective-C, you may be thrown off by the fact that Python doesn't look like it has two-stage construction. In ObjC terms, you rarely implement alloc, but you call it all the time: [[MyClass alloc] initWithArgument:a]. In Python, you can pretend that MyClass(a) means the same thing (although really it's more like [MyClass allocWithArgument:a], where allocWithArgument: automatically calls initWithArgument: for you).
** Actually, this isn't quite true; the default implementation just returns an instance of C, and Python automatically calls the __init__ method if isinstance(returnvalue, C).
I had a hard time finding the answer to this problem online, but I figured it out, so here is the solution.
Instead of pointing constructorPtr at C.__init__, you can just point it at C, like this.
constructorPtr = C
constructorPtr('A')
which produces as output:
C's __init__ method called with argument: A