Why is Python 3.x's super() magic?

In Python 3.x, super() can be called without arguments:
class A(object):
    def x(self):
        print("Hey now")

class B(A):
    def x(self):
        super().x()

>>> B().x()
Hey now
In order to make this work, some compile-time magic is performed, one consequence of which is that the following code (which rebinds super to super_) fails:
super_ = super

class A(object):
    def x(self):
        print("No flipping")

class B(A):
    def x(self):
        super_().x()

>>> B().x()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in x
RuntimeError: super(): __class__ cell not found
Why is super() unable to resolve the superclass at runtime without assistance from the compiler? Are there practical situations in which this behaviour, or the underlying reason for it, could bite an unwary programmer?
... and, as a side question: are there any other examples in Python of functions, methods etc. which can be broken by rebinding them to a different name?

The new magic super() behaviour was added to avoid violating the D.R.Y. (Don't Repeat Yourself) principle, see PEP 3135. Having to explicitly name the class by referencing it as a global is also prone to the same rebinding issues you discovered with super() itself:
class Foo(Bar):
    def baz(self):
        return super(Foo, self).baz() + 42

Spam = Foo
Foo = something_else()
Spam().baz()  # liable to blow up
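To see how it blows up, here's a minimal runnable sketch (Bar and the rebinding value are illustrative stand-ins; the exact error message varies by Python version):
class Bar(object):
    def baz(self):
        return 0

class Foo(Bar):
    def baz(self):
        # super(Foo, self) looks up the *global* name Foo at call time
        return super(Foo, self).baz() + 42

Spam = Foo
Foo = "something else entirely"  # rebind the global name

Spam().baz()
# TypeError: super() argument 1 must be a type, not str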
The same applies to using class decorators where the decorator returns a new object, which rebinds the class name:
@class_decorator_returning_new_class
class Foo(Bar):
    def baz(self):
        # Now `Foo` is a *different class*
        return super(Foo, self).baz() + 42
The magic super() __class__ cell sidesteps these issues nicely by giving you access to the original class object.
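By contrast, here's the same sketch using the zero-argument form, which keeps working after the rebinding because it reads the __class__ cell instead of the global name:
class Bar(object):
    def baz(self):
        return 0

class Foo(Bar):
    def baz(self):
        # super() reads the compiler-provided __class__ cell, not the name Foo
        return super().baz() + 42

Spam = Foo
Foo = "something else entirely"  # now harmless

print(Spam().baz())  # 42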
The PEP was kicked off by Guido, who initially envisioned super becoming a keyword; making it a keyword was part of the first draft of the PEP. The idea of using a cell to look up the current class was also his.
However, it was in fact Guido himself who then stepped away from the keyword idea as 'too magical', proposing the current implementation instead. He anticipated that using a different name for super() could be a problem:
My patch uses an intermediate solution: it assumes you need __class__
whenever you use a variable named 'super'. Thus, if you (globally)
rename super to supper and use supper but not super, it won't work
without arguments (but it will still work if you pass it either
__class__ or the actual class object); if you have an unrelated
variable named super, things will work but the method will use the
slightly slower call path used for cell variables.
So, in the end, it was Guido himself that proclaimed that using a super keyword did not feel right, and that providing a magic __class__ cell was an acceptable compromise.
I agree that the magic, implicit behaviour of the implementation is somewhat surprising, but super() is one of the most misapplied functions in the language. Just take a look at all the misapplied super(type(self), self) or super(self.__class__, self) invocations found on the Internet; if any of that code were ever called from a derived class, you'd end up with an infinite recursion error. At the very least the simplified super() call, without arguments, avoids that problem.
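To make that failure mode concrete, here's a minimal sketch (illustrative class names) of the super(type(self), self) anti-pattern recursing as soon as the method is inherited:
class Base(object):
    def greet(self):
        print("Base.greet")

class Middle(Base):
    def greet(self):
        # Wrong: type(self) is the *runtime* class. When Leaf inherits this
        # method, type(self) is Leaf, so super(Leaf, self) resolves right
        # back to Middle.greet, which calls itself again, forever.
        super(type(self), self).greet()

class Leaf(Middle):
    pass

Leaf().greet()  # RecursionError: maximum recursion depth exceeded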
As for the renamed super_: just reference __class__ in your method as well and it'll work again. The cell is created if you reference either the super or __class__ names in your method:
>>> super_ = super
>>> class A(object):
...     def x(self):
...         print("No flipping")
...
>>> class B(A):
...     def x(self):
...         __class__  # just referencing it is enough
...         super_().x()
...
>>> B().x()
No flipping


python property decorator for __name__ attr in class

I found a good description of Python properties in this link:
How does the @property decorator work in Python?
The example below shows how it works, but I found an exception for the class attribute __name__.
Now I have a reload function which raises an error.
@property
def foo(self): return self._foo
really means the same thing as
def foo(self): return self._foo
foo = property(foo)
Here is my example:
class A(object):
    @property
    def __name__(self):
        return 'dd'

a = A()
print(a.__name__)
dd
This works; however, the code below does not:
class B(object):
    pass

def test(self):
    return 'test'

B.t = property(test)
print(B.t)  # <property object at 0x7f71dc5e1180>
B.__name__ = property(test)
Traceback (most recent call last):
  File "<string>", line 23, in <module>
TypeError: can only assign string to B.__name__, not 'property'
Does anyone know why the builtin __name__ attribute is different? It works if I use the normal property decorator, but not with the second way. I have a requirement to reload the function when code changes, and this error blocks the reload procedure. Can anyone help? Thanks.
The short answer is: __name__ is deep magic in CPython.
So, first, let's get the technicalities out of the way. To quote what you said
@property
def foo(self): return self._foo
really means the same thing as
def foo(self): return self._foo
foo = property(foo)
This is correct. But it can be a bit misleading. You have this A class
class A(object):
    @property
    def __name__(self):
        return 'dd'
And you claim that it's equivalent to this B class
class B(object):
    pass

def test(self):
    return 'test'

B.__name__ = property(test)
which is not correct. It's actually equivalent to this
def test(self):
    return 'test'

class B(object):
    __name__ = property(test)
which works and does what you expect it to. And you're also correct that, for most names in Python, your B and my B would be the same. What difference does it make whether I'm assigning to a name inside the class or immediately after its declaration? Replace __name__ with ravioli in the above snippets and either will work. So what makes __name__ special?
That's where the magic comes in. When you define a name inside the class, you're working directly on the class' internal dictionary, so
class A:
    foo = 1
    def bar(self):
        return 1
This defines two things on the class A. One happens to be a number and the other happens to be a function (which will likely be called as a bound method). Now we can access these.
A.foo # Returns 1, simple access
A.bar # Returns the function object bar
A().foo # Returns 1
A().bar # Returns a bound method object
When we look up the names directly on A, we simply access the slots like we would on any object. However, when we look them up on A() (an instance of A), a multi-step process happens:
1. Look up the name on the instance's __dict__ directly.
2. If that fails, look up the name on the class' __dict__.
3. If we find it on the class, see if there's a __get__ on the result and call it.
That third step is what allows bound method objects to work, and it's also the mechanism underlying the property decorators in Python.
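As a quick sketch, those three steps can be replayed by hand for an ordinary method:
class A(object):
    def bar(self):
        return 1

a = A()
raw = A.__dict__['bar']    # step 2: the plain function stored on the class
bound = raw.__get__(a, A)  # step 3: the descriptor protocol binds it to a
print(bound())             # 1, exactly what a.bar() does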
Let's go through this whole process with a property called ravioli. No magic here.
class A(object):
    @property
    def ravioli(self):
        return 'dd'
When we do A().ravioli, first we see if there's a ravioli on the instance we just made. There isn't, so we check the class' __dict__, and indeed we find a property object at that position. That property object has a __get__, so we call it, and it returns 'dd', so indeed we get the string 'dd'.
>>> A().ravioli
'dd'
Now I would expect that, if I do A.ravioli, we will simply get the property object. Since we're not calling it on an instance, we don't call __get__.
>>> A.ravioli
<property object at 0x7f5bd3690770>
And indeed, we get the property object, as expected.
Now let's do the exact same thing but replace ravioli with __name__.
class A(object):
    @property
    def __name__(self):
        return 'dd'
Great! Now let's make an instance.
>>> A().__name__
'dd'
Sensible, we looked up __name__ on A's __dict__ and found a property, so we called its __get__. Nothing weird.
Now
>>> A.__name__
'A'
Um... what? If we had just found the property on A's __dict__, then we should see that property here, right?
Well, no, not always. See, in the abstract, foo.bar normally looks in foo.__dict__ for a field called bar. But it doesn't do that if the type of foo defines a __getattribute__. If it defines that, then that method is always called instead.
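Here's a minimal sketch of that rule with a made-up class: because Loud is the type of obj, Loud's __getattribute__ intercepts every dotted lookup on obj before the instance's __dict__ is consulted directly:
class Loud(object):
    def __getattribute__(self, name):
        print('looking up', repr(name))
        return object.__getattribute__(self, name)

obj = Loud()
obj.x = 3
obj.x
# prints: looking up 'x'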
Now, the type of A is type, the type of all Python types. Read that sentence a few times and make sure it makes sense. And if we do a bit of spelunking into the CPython source code, we see that type actually defines __getattribute__ and __setattr__ for the following names:
__name__
__qualname__
__bases__
__module__
__abstractmethods__
__dict__
__doc__
__text_signature__
__annotations__
That explains how __name__ can serve double duty as a property on the class instances and also as an accessible field on the same class. It also explains why you get that highly specialized error message when reassigning to B.__name__: the line
B.__name__ = property(test)
is actually equivalent to
type.__setattr__(B, '__name__', property(test))
which is calling our special-case checker in CPython.
For any other type in Python, in particular for user-defined types, we could get around this with object.__setattr__. Unfortunately,
>>> object.__setattr__(B, '__name__', property(test))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't apply this __setattr__ to type object
There's a really specific check to make sure we don't do exactly this, and the comment reads
/* Reject calls that jump over intermediate C-level overrides. */
We also can't use metaclasses to override __setattr__ and __getattribute__, because the instance lookup procedure specifically doesn't call those (in the above examples, __getattribute__ was called in every case except the one we care about for property purposes). I even tried subclassing str to trick __setattr__ into accepting our made-up value
class NameProperty(str):
    def __new__(cls, value, **kwargs):
        return str.__new__(cls, value)
    def __init__(self, value, method):
        self.method = method
    def __get__(self, instance, owner):
        return self.method(instance)

B.__name__ = NameProperty(B.__name__, method=test)
This actually passes the __setattr__ check, but it doesn't assign to B.__dict__ (since the __setattr__ still assigns to the actual CPython-level name, not to B.__dict__['__name__']), so the property lookup doesn't work.
So... that's how I reached my conclusion of: __name__ is deep magic in CPython. All of the usual Python metaprogramming techniques have failed, and all of the methods getting called are written deep down in C. My advice to you is: Stop using __name__ for things it's not intended for, or be prepared to write some C code and hack on CPython directly.

A function in a class without any decorator or `self`

I have the following class with a function:
class A:
    def myfn():
        print("In myfn method.")
Here, the function does not have self as an argument, nor does it have @classmethod or @staticmethod as a decorator. However, it works when called via the class:
A.myfn()
Output:
In myfn method.
But it gives an error when called from an instance:
a = A()
a.myfn()
Error output:
Traceback (most recent call last):
  File "testing.py", line 16, in <module>
    a.myfn()
TypeError: myfn() takes 0 positional arguments but 1 was given
probably because self was also sent as an argument.
What kind of function will this be called? Will it be a static function? Is it advisable to use function like this in classes? What is the drawback?
Edit: This function works only when called with class and not with object/instance. My main question is what is such a function called?
Edit2: It seems from the answers that this type of function, despite being the simplest form, is not accepted as legal. However, as no serious drawback is mentioned in any of the many answers, I find this can be a useful construct, especially to group my own static functions in a class that I can call as needed. I would not need to create any instance of this class. At the least, it saves me from typing @staticmethod every time and makes the code look less complex. It also gets derived neatly for someone to extend my class. Although all such functions can be kept at the top/global level, keeping them in a class is more modular. However, I feel there should be a specific name for such a simple construct which works in this specific way, and it should be recognized as legal. It may also help beginners understand why the self argument is needed for usual functions in a Python class. This will only add to the simplicity of this great language.
The function type implements the descriptor protocol, which means when you access myfn via the class or an instance of the class, you don't get the actual function back; you get instead the result of that function's __get__ method. That is,
A.myfn == A.myfn.__get__(None, A)
Here, myfn is an instance method, though one that hasn't been defined properly to be used as such. When accessed via the class, though, the return value of __get__ is simply the function object itself, and the function can be called the same as a static method.
Access via an instance results in a different call to __get__. If a is an instance of A, then
a.myfn() == A.myfn.__get__(a, A)
Here, __get__ returns, essentially, a partial application of myfn to a, but because myfn doesn't take any arguments, calling the result fails.
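A quick sketch makes that explicit: the bind itself succeeds, and it's the call through the bound method that fails:
class A:
    def myfn():
        print("In myfn method.")

a = A()
bound = A.myfn.__get__(a, A)  # binding works fine
print(bound)                  # <bound method A.myfn of <__main__.A object at ...>>
bound()                       # TypeError: myfn() takes 0 positional arguments but 1 was given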
You might ask, what is a static method? staticmethod is a type that wraps a function and defines its own __get__ method. That method returns the underlying function whether the attribute is accessed via the class or via an instance. Otherwise, there is very little difference between a static method and an ordinary function.
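A rough pure-Python approximation of that __get__ behaviour (the real staticmethod is implemented in C; this is only a sketch):
class my_staticmethod(object):
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, objtype=None):
        # Ignore the instance entirely and hand back the raw function,
        # whether looked up on the class or on an instance.
        return self.func

class A:
    @my_staticmethod
    def a():
        return 1

print(A.a())    # 1
print(A().a())  # 1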
This is not a true method. Correctly declared instance methods should have a self argument (the name is only a convention and can be changed if you want hard-to-read code), and classmethods and staticmethods should be introduced by their respective decorators.
But at a lower level, def in a class declaration just creates a function and assigns it to a class member. That is exactly what happens here: A.myfn is a function and can successfully be called as A.myfn().
But as it was not declared with @staticmethod, it is not a true static method and it cannot be applied to an A instance. Python sees a member of that name that happens to be a function which is neither a static nor a class method, so it prepends the current instance to the list of arguments and tries to execute it.
To answer your exact question, this is not a method but just a function that happens to be assigned to a class member.
Such a function isn't the same as what @staticmethod provides, but it is indeed a static method of sorts.
With @staticmethod you can also call the static method on an instance of the class. If A is a class and A.a is a static method, you'll be able to do both A.a() and A().a(). Without the decorator, only the first example works, because for the second one, as you correctly noticed, "self [will] also [be] sent as an argument":
class A:
    @staticmethod
    def a():
        return 1
Running this:
>>> A.a() # `A` is the class itself
1
>>> A().a() # `A()` is an instance of the class `A`
1
On the other hand:
class B:
    def b():
        return 2
Now, the second version doesn't work:
>>> B.b()
2
>>> B().b()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: b() takes 0 positional arguments but 1 was given
Further to @chepnet's answer, if you define a class whose objects implement the descriptor protocol, like:
class Descr:
    def __get__(self, obj, type=None):
        print('get', obj, type)
    def __set__(self, obj, value):
        print('set', obj, value)
    def __delete__(self, obj):
        print('delete', obj)
you can embed an instance of this in a class and invoke various operations on it:
class Foo:
    foo = Descr()

Foo.foo
obj = Foo()
obj.foo
which outputs:
get None <class '__main__.Foo'>
get <__main__.Foo object at 0x106d4f9b0> <class '__main__.Foo'>
As functions also implement the descriptor protocol, we can replay this by doing:
def bar():
    pass

print(bar)
print(bar.__get__(None, Foo))
print(bar.__get__(obj, Foo))
which outputs:
<function bar at 0x1062da730>
<function bar at 0x1062da730>
<bound method bar of <__main__.Foo object at 0x106d4f9b0>>
Hopefully that complements @chepnet's answer, which I found a little terse/opaque.

Python - calling ancestor methods when multiple inheritance is involved

Edit: I'm using Python 3 (some people asked).
I think this is just a syntax question, but I want to be sure there's nothing I'm missing. Notice the syntax difference in how Foo and Bar are implemented. They achieve the same thing and I want to make sure they're really doing the same thing. The output suggests that there are just two ways to do the same thing. Is that the case?
Code:
class X:
    def some_method(self):
        print("X.some_method called")

class Y:
    def some_method(self):
        print("Y.some_method called")

class Foo(X, Y):
    def some_method(self):
        X().some_method()
        Y().some_method()
        print("Foo.some_method called")

class Bar(X, Y):
    def some_method(self):
        X.some_method(self)
        Y.some_method(self)
        print("Bar.some_method called")

print("=== Fun with Foo ===")
foo_instance = Foo()
foo_instance.some_method()
print("=== Fun with Bar ===")
bar_instance = Bar()
bar_instance.some_method()
Output:
=== Fun with Foo ===
X.some_method called
Y.some_method called
Foo.some_method called
=== Fun with Bar ===
X.some_method called
Y.some_method called
Bar.some_method called
PS - Hopefully it goes without saying but this is just an abstract example, let's not worry about why I'd want to call some_method on both ancestors, I'm just trying to understand the syntax and mechanics of the language here. Thanks all!
You should be using new-style classes. If this is Python 3, you are; if you are using Python 2, you should inherit from object (or some other new-style class).
The usual way to invoke ancestor methods is using super. Read about it in the standard docs, and the other excellent articles on how it operates. It is never recommended to invoke the methods in the way you are doing because (a) it will be fragile in the face of further inheritance; and (b) you increase the maintenance effort by hardcoding references to classes.
Update: Here is an example showing how to use super to achieve this: http://ideone.com/u3si2
Also look at: http://rhettinger.wordpress.com/2011/05/26/super-considered-super/
Update 2: Here's a little library for python 2 that adds a __class__ variable and a no-args super to every method to avoid hardcoding the current name: https://github.com/marcintustin/superfixer
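In case the links rot, here's a sketch (reconstructed, not the exact linked code) of a cooperative super-based version of the question's example; a shared root class ends the chain:
class Root:
    def some_method(self):
        pass  # end of the cooperative chain: deliberately no super() call

class X(Root):
    def some_method(self):
        print("X.some_method called")
        super().some_method()

class Y(Root):
    def some_method(self):
        print("Y.some_method called")
        super().some_method()

class Foo(X, Y):
    def some_method(self):
        super().some_method()  # walks the MRO: X, then Y, then Root
        print("Foo.some_method called")

Foo().some_method()
# X.some_method called
# Y.some_method called
# Foo.some_method called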
They aren't the same. X() creates an object of class X. When you do X().some_method() you create a new object and then call the method on that object, not on self. X.some_method(self) is what you want, since that calls the inherited method on the same object.
You will see the difference if your method actually does anything to the self object. For instance, if you put self.blah = 8 into your method, then after X.some_method(self) the object you call it on will have the blah attribute set, but after X().some_method() it will not. (Instead, you will have created a new object, set blah on that, and then thrown away that new object without using it, leaving the original object untouched.)
Here is a simple example modifying your code:
>>> class X:
...     def some_method(self):
...         print("X.some_method called on", self)
...
>>> class Y:
...     def some_method(self):
...         print("Y.some_method called on", self)
...
>>> class Foo(X, Y):
...     def some_method(self):
...         X().some_method()
...         Y().some_method()
...         print("Foo.some_method called on", self)
...
>>> class Bar(X, Y):
...     def some_method(self):
...         X.some_method(self)
...         Y.some_method(self)
...         print("Bar.some_method called on", self)
>>> Foo().some_method()
('X.some_method called on', <__main__.X instance at 0x0142F3C8>)
('Y.some_method called on', <__main__.Y instance at 0x0142F3C8>)
('Foo.some_method called on', <__main__.Foo instance at 0x0142F3A0>)
>>> Bar().some_method()
('X.some_method called on', <__main__.Bar instance at 0x0142F3C8>)
('Y.some_method called on', <__main__.Bar instance at 0x0142F3C8>)
('Bar.some_method called on', <__main__.Bar instance at 0x0142F3C8>)
Note that when I use Foo, the objects printed are not the same; one is an X instance, one is a Y instance, and the last is the original Foo instance that I called the method on. When Bar is used, it is the same object in each method call.
(You can also use super in some cases to avoid naming the base classes explicitly; e.g., super(Foo, self).some_method(), or in Python 3 just super().some_method(). However, if you need to directly call inherited methods from two base classes, super might not be a good fit. It is generally aimed at cases where each method calls super just once, passing control to the next version of the method in the inheritance chain, which then passes it along to the next, and so on.)

Using class methods as simple functions

I have a class with many methods. How can I modify my methods so that they can also be accessed directly as functions, without creating an object of that class? Is it possible?
The methods will be "unbound" (meaning, essentially, that they have no self to work with). If the functions do not operate upon self, you can turn them into static methods (which do not take a self first argument) and then assign them to variables to be used like functions.
Like so:
class MyClass(object):
    @staticmethod
    def myfunc():
        return "It works!"

myfunc = MyClass.myfunc
print(myfunc())  # prints "It works!"
Essentially, you need to ask yourself "What data does my method need to (er) function?" Depending on your answer, you can use @staticmethod or @classmethod, or you may find that you do in fact need a self, in which case you will need to create an object before trying to use its methods.
That final case would look something like:
myobj = MyClass()
del MyClass  # This is a singleton class
myfunc = myobj.myfunc
All of that aside, if you find that all of your methods are actually staticmethods, then it's better style to refactor them out of the class into plain-old functions, which they really are already. You may have learned this "class as namespace" style from Java, but that isn't correct in Python. Python namespaces are represented by modules.
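For example, a sketch of that refactoring (module and function names are made up):
# stringutils.py -- the module itself is the namespace; no class wrapper needed
def shout(text):
    """Upper-case text and add an exclamation mark."""
    return text.upper() + "!"

# client code:
#   from stringutils import shout
#   shout("hello")  # 'HELLO!'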
Unbound Methods
To create an unbound method (i.e., one whose first parameter isn't self), you can decorate the method using the @staticmethod built-in decorator. If decorators or any of that is not making sense, check out the Wiki, this simple explanation, decorators as syntactic sugar, and learn how to write a good one.
>>> class foo(object):
...     @staticmethod
...     def bar(blah_text):
...         print "Unbound method `bar` of Class `foo`"
...         return blah_text
...
>>> foobar = foo.bar
>>> foobar("We are the Knights who say 'Ni'!")
Unbound method `bar` of Class `foo`
"We are the Knights who say 'Ni'!"
Bound Methods
These methods are not technically 'bound', but are meant to be bound when called. You just have to point a reference at them and, voilà, you now have a reference to that method. Now you just have to pass a valid instance of that class:
>>> class foo:
...     def __init__(self, bar_value='bar'):
...         self.bar_value = bar_value
...     def bar(self, blah_text):
...         return self.bar_value + blah_text
...
>>> bar = foo.bar
>>> bar(foo('We are the Knights who say '), "'Ni'")
"We are the Knights who say 'Ni'"
Edit: As is pointed out in the comments, it seems my usage of 'binding' is wrong. Could somebody with knowledge of it edit/correct my post?
You can call the function with the class name as a parameter, if you do not want to lose self:
class something:
    def test(self):
        print('Hello')

something.test(something)
# prints "Hello"

When to inline definitions of metaclass in Python?

Today I have come across a surprising definition of a metaclass in Python here, with the metaclass definition effectively inlined. The relevant part is
class Plugin(object):
    class __metaclass__(type):
        def __init__(cls, name, bases, dict):
            type.__init__(cls, name, bases, dict)
            registry.append((name, cls))
When does it make sense to use such an inline definition?
Further Arguments:
An argument one way would be that the created metaclass is not reusable elsewhere using this technique. A counter-argument is that a common pattern in using metaclasses is defining a metaclass, using it in one class, and then inheriting from that. For example, in a conservative metaclass the definition
class DeclarativeMeta(type):
    def __new__(meta, class_name, bases, new_attrs):
        cls = type.__new__(meta, class_name, bases, new_attrs)
        cls.__classinit__.im_func(cls, new_attrs)
        return cls

class Declarative(object):
    __metaclass__ = DeclarativeMeta
    def __classinit__(cls, new_attrs): pass
could have been written as
class Declarative(object):  # code not tested!
    class __metaclass__(type):
        def __new__(meta, class_name, bases, new_attrs):
            cls = type.__new__(meta, class_name, bases, new_attrs)
            cls.__classinit__.im_func(cls, new_attrs)
            return cls
    def __classinit__(cls, new_attrs): pass
Any other considerations?
Like every other form of nested class definition, a nested metaclass may be more "compact and convenient" (as long as you're OK with not reusing that metaclass except by inheritance) for many kinds of "production use", but can be somewhat inconvenient for debugging and introspection.
Basically, instead of giving the metaclass a proper, top-level name, you're going to end up with all custom metaclasses defined in a module being indistinguishable from each other based on their __module__ and __name__ attributes (which is what Python uses to form their repr if needed). Consider:
>>> class Mcl(type): pass
...
>>> class A: __metaclass__ = Mcl
...
>>> class B:
...     class __metaclass__(type): pass
...
>>> type(A)
<class '__main__.Mcl'>
>>> type(B)
<class '__main__.__metaclass__'>
IOW, if you want to examine "which type is class A" (a metaclass is the class's type, remember), you get a clear and useful answer -- it's Mcl in the main module. However, if you want to examine "which type is class B", the answer is not all that useful: it says it's __metaclass__ in the main module, but that's not even true:
>>> import __main__
>>> __main__.__metaclass__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute '__metaclass__'
>>>
...there is no such thing, actually; that repr is misleading and not very helpful;-).
A class's repr is essentially '%s.%s' % (c.__module__, c.__name__) -- a simple, useful, and consistent rule -- but in many cases, such as when the class statement is not unique at module scope, or is not at module scope at all (but rather within a function or class body), or doesn't even exist as such (classes can of course be built without a class statement, by explicitly calling their metaclass), this can be somewhat misleading (and the best solution is to avoid, as far as possible, those peculiar cases, except when substantial advantage can be obtained by using them). For example, consider:
>>> class A(object):
...     def foo(self): print('first')
...
>>> x = A()
>>> class A(object):
...     def foo(self): print('second')
...
>>> y = A()
>>> x.foo()
first
>>> y.foo()
second
>>> x.__class__
<class '__main__.A'>
>>> y.__class__
<class '__main__.A'>
>>> x.__class__ is y.__class__
False
With two class statements at the same scope, the second one rebinds the name (here, A), but existing instances refer to the first binding of the name by object, not by name -- so both class objects remain, one accessible only through the type (or __class__ attribute) of its instances (if any -- if none, the first object disappears). The two classes have the same name and module (and therefore the same representation), but they're distinct objects. Classes nested within class or function bodies, or created by directly calling the metaclass (including type), may cause similar confusion if debugging or introspection is ever called for.
So, nesting the metaclass is OK if you'll never need to debug or otherwise introspect that code, and can be lived with if whoever does so understands these quirks (though it will never be as convenient as a nice, real name, of course -- just as debugging a function coded with lambda can never be as convenient as debugging one coded with def). By analogy with lambda vs def, you can reasonably claim that an anonymous, "nested" definition is OK for metaclasses which are so utterly simple, such no-brainers, that no debugging or introspection will ever conceivably be required.
In Python 3, the "nested definition" just doesn't work -- there, a metaclass must be passed as a keyword argument to the class, as in class A(metaclass=Mcl):, so defining __metaclass__ in the body has no effect. I believe this also suggests that a nested metaclass definition in Python 2 code is probably appropriate only if you know for sure that code will never need to be ported to Python 3 (since you're making that port so much harder, and will need to de-nest the metaclass definition for the purpose) -- "throwaway" code, in other words, which won't be around in a few years when some version of Python 3 acquires huge, compelling advantages of speed, functionality, or third-party support, over Python 2.7 (the last ever version of Python 2).
Code that you expect to be throwaway, as the history of computing shows us, has an endearing habit of surprising you utterly, and being still around 20 years later (while perhaps the code you wrote around the same time "for the ages" is utterly forgotten;-). This would certainly seem to suggest avoiding nested definition of metaclasses.
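For reference, here's a sketch of how the question's registry example would be spelled in Python 3 (the metaclass name is reconstructed; the keyword-argument syntax is the point):
registry = []

class PluginMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        registry.append((name, cls))

class Plugin(metaclass=PluginMeta):
    pass

class MyPlugin(Plugin):
    pass

print(registry)
# [('Plugin', <class '__main__.Plugin'>), ('MyPlugin', <class '__main__.MyPlugin'>)]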
