How do I know which magic methods are called in python statements? - python

The documentation lists many magic methods. But I don't think it is enough. It does not tell me what methods are called when I do for x in c.
To do this, I tried a simple code snippet to print each attribute reference:
class Print(object):
    def __getattribute__(self, item):
        print(item)
        return super().__getattribute__(item)
a = Print()
Sometimes it works:
import pickle
pickle.dumps(a)
# print the following and then raise an error
__reduce_ex__
__reduce__
__getstate__
__class__
So I know that pickle.dumps looks up these magic methods.
But sometimes it does not work:
for x in a:
    continue
# direct error, no print
Is there any way to tell which magic methods are called in a Python statement?
Update:
It seems CPython bypasses __getattribute__ calls for some special methods as a speedup. Check the "Special method lookup" section of the documentation for details.
Therefore, the answer seems to be no. We cannot catch each attribute reference.
Just take this example:
class C:
    pass
c = C()
c.__len__ = lambda: 5
len(c)
# TypeError: object of type 'C' has no len()
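A minimal sketch of how the bypass can be observed, assuming CPython 3. The class name Traced and the calls list are made up for illustration: the special methods are defined on the class and record each call, next to a __getattribute__ hook that records ordinary attribute access.

```python
calls = []

class Traced:
    def __iter__(self):
        calls.append("__iter__")
        return iter([1, 2])

    def __len__(self):
        calls.append("__len__")
        return 2

    def __getattribute__(self, name):
        # Only reached by explicit attribute access such as t.__len__
        calls.append("getattribute:" + name)
        return super().__getattribute__(name)

t = Traced()
for _ in t:      # implicit lookup of __iter__ on type(t)
    pass
len(t)           # implicit lookup of __len__ on type(t)
implicit_calls = list(calls)

t.__len__()      # explicit access goes through __getattribute__ first
explicit_calls = calls[len(implicit_calls):]
```

Here implicit_calls records only the special methods themselves, while explicit_calls also contains the __getattribute__ entry. Tracing on the class works; tracing attribute access on the instance does not.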

Related

Why is `x[i]` not equivalent to `x.__getitem__(i)`?

From the documentation:
x[i] is roughly equivalent to type(x).__getitem__(x, i).
What is the benefit of the above rather than having a seemingly simpler x.__getitem__(i)?
EDIT: Why is Python behaving this way?
As a downside of the standard behavior, here is some sample code where I was surprised to find that the last assertion fails while the second-to-last one (calling __getitem__ directly) passes.
def poww_bar(base):
    class Bar():
        def __getitem__(self, x):
            return lambda: base**x
    return Bar()
def poww_foo(base):
    class Foo():
        pass
    f = Foo()
    f.__getitem__ = lambda x: lambda: base ** x
    return f
pow_bar2 = poww_bar(2)
pow_foo2 = poww_foo(2)
assert pow_bar2.__getitem__(3)() == 8 # OK
assert pow_bar2[3]() == 8 # OK
assert pow_foo2.__getitem__(3)() == 8 # OK
assert pow_foo2[3]() == 8 # TypeError: 'Foo' object is not subscriptable
Methods are class attributes, not instance attributes.
There is no instance attribute named __getitem__ associated with pow_bar2. So lookup proceeds to checking the class for an attribute by that name, and it succeeds in finding Bar.__getitem__.
But the process doesn't end there. pow_bar2.__getitem__(i) is not equivalent to Bar.__getitem__(i), because Python first checks whether the attribute lookup produced an object that implements the descriptor protocol. Since Bar.__getitem__ is an instance of function, it does implement the descriptor protocol.
The next step is then to return not the function itself, but the result of Bar.__dict__['__getitem__'].__get__(pow_bar2, Bar). (I'm switching to the use of Bar.__dict__ to emphasize that we do not get into an infinite loop of triggering the descriptor protocol.) This is an instance of method, which is itself a callable that passes its own arguments, along with pow_bar2, as arguments to the original function.
Thus, pow_bar2.__getitem__(i) is equivalent to Bar.__dict__['__getitem__'].__get__(pow_bar2, Bar)(i), which is roughly equivalent to Bar.__dict__['__getitem__'](pow_bar2, i).
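The chain of equivalences above can be checked directly. This is a small sketch with a simplified Bar class (returning x * 2 instead of a lambda, purely for illustration):

```python
class Bar:
    def __getitem__(self, x):
        return x * 2

b = Bar()

# The plain function stored in the class namespace.
func = Bar.__dict__["__getitem__"]

# The descriptor protocol turns it into a bound method.
bound = func.__get__(b, Bar)

results = [
    b[3],                # subscription syntax
    b.__getitem__(3),    # explicit attribute access
    bound(3),            # manually bound method
    func(b, 3),          # plain function with explicit self
]
```

All four calls end up in the same function; the bound method just remembers b in its __self__ attribute and the function in __func__.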
But really, pow_bar2[i] is just shorter and more easily recognizable (due to decades of established support for this syntax in other languages) than pow_bar2.__getitem__(i). __getitem__ is what makes the use of [] extendable to other classes, rather than limiting it to built-in types.
The descriptor protocol is not just a one-shot feature that makes instance-method behavior seem more complicated than necessary. It also determines how class methods, static methods, and properties work, and can further be used to customize attribute behavior in other ways.
It could just be an optimization. A class function will only have one reference in the class definition. An object function will have a reference in every object. So the __getitem__ method was specified to be a class function, so they didn't need to waste time looking in the object definitions for it.
This is all speculation of course.
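If per-instance behavior is what the question's poww_foo was after, one possible workaround (a sketch, not the only option) is to define __getitem__ on the class and have it delegate to an instance attribute. The attribute name _getitem is hypothetical.

```python
def poww_foo(base):
    class Foo:
        def __getitem__(self, x):
            # Found on the class, so foo[x] works; the real work is
            # delegated to a per-instance callable (hypothetical name).
            return self._getitem(x)

    f = Foo()
    f._getitem = lambda x: lambda: base ** x
    return f

pow_foo2 = poww_foo(2)
value = pow_foo2[3]()
```

Because the special method now lives on the type, the implicit lookup done by pow_foo2[3] succeeds, while the behavior itself can still differ per instance.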

How does Python tell “this is called as a function”?

A callable object is supposed to be callable by defining __call__. A class is supposed to be an object too… or at least with some exceptions. This exception is what I'm failing to formally clarify, hence this question.
Let A be a simple class:
class A(object):
    def call(*args):
        return "In `call`"
    def __call__(*args):
        return "In `__call__`"
The first function is purposely named “call”, to make clear the purpose is the comparison with the other.
Let's instantiate it and forget about the expression it implies:
a = A() # Think of it as `a = magic` and forget about `A()`
Now consider these calls:
print(A.call())
print(a.call())
print(A())
print(a())
They result in:
>>> In `call`
>>> In `call`
>>> <__main__.A object at 0xNNNNNNNN>
>>> In `__call__`
The output (the third statement not running __call__) does not come as a surprise, but it is puzzling when I think of how everywhere it is said that “Python classes are objects”…
These more explicit calls, however, do run __call__:
print(A.__call__())
print(a.__call__())
>>> “In `__call__`”
>>> “In `__call__`”
All of this is just to show how A() may end up looking strange.
There are exceptions to Python's rules, but the documentation about “object.__call__” does not say a lot about __call__… no more than this:
3.3.5. Emulating callable objects
object.__call__(self[, args...])
Called when the instance is “called” as a function; […]
But how does Python tell “it's called as a function” and decide whether or not to honour the object.__call__ rule?
This could be a matter of type, but even type has object as its base class.
Where can I learn more (and formally) about it?
By the way, is there any difference here between Python 2 and Python 3?
----- %< ----- edit ----- >% -----
Conclusions and other experiments after one answer and one comment
Update #1
After @Veedrac's answer and @chepner's comment, I came up with this other test, which complements the comments from both:
class M(type):
    def __call__(*args):
        return "In `M.__call__`"
class A(object, metaclass=M):
    def call(*args):
        return "In `call`"
    def __call__(*args):
        return "In `A.__call__`"
print(A())
The result is:
>>> In `M.__call__`
So it seems it's the meta‑class which drives the “call” operation. If I understand correctly, the meta‑class matters not only for classes, but also for class instances.
Update #2
Another relevant test, which shows that what matters is not an attribute of the object, but an attribute of the type of the object:
class A(object):
    def __call__(*args):
        return "In `A.__call__`"
def call2(*args):
    return "In `call2`"
a = A()
print(a())
As expected, it prints:
>>> In `A.__call__`
Now this:
a.__call__ = call2
print(a())
It prints:
>>> In `A.__call__`
The same as before the attribute was assigned. It does not print In `call2`; it still prints In `A.__call__`. That's important to note, and it also explains why it was the __call__ of the meta‑class which was invoked (keep in mind the meta‑class is the type of the class object). The __call__ used when calling an object as a function does not come from the object, but from its type.
x(*args, **kwargs) is the same as type(x).__call__(x, *args, **kwargs).
So you have
>>> type(A).__call__(A)
<__main__.A object at 0x7f4d88245b50>
and it all makes sense.
chepner points out in the comments that type(A) == type. This is kind of weird, because type(A)(A) just gives type again! But remember that we're instead using type(A).__call__(A), which is not the same.
So this resolves to type.__call__(A). This is the constructor function for classes, which builds the data-structures and does all the construction magic.
The same is true of most dunder (double underscore) methods, such as __eq__. This is partially an optimisation in those cases.
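The rule x(*args) == type(x).__call__(x, *args) can be checked for both an instance and the class itself. A minimal sketch, assuming Python 3 and a plain class with no custom metaclass:

```python
class A:
    def __call__(self, *args):
        return "In `A.__call__`"

a = A()
# Shadowing __call__ on the instance does not change what a() does.
a.__call__ = lambda *args: "instance attribute"

via_syntax = a()                      # looks up __call__ on type(a)
via_type = type(a).__call__(a)        # the same call, spelled out
via_attribute = a.__call__()          # explicit access finds the instance attribute

# Calling the class goes through its own type, i.e. type.__call__,
# which constructs and returns a new instance.
new_instance = type(A).__call__(A)
```

The call syntax only ever consults the type of the object being called: A for a, and type for A.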

How to get actual list of names of object if custom __dir__ implemented?

The official docs say:
If the object has a method named __dir__(), this method will be called
and must return the list of attributes. This allows objects that
implement a custom __getattr__() or __getattribute__() function to
customize the way dir() reports their attributes.
If a custom __dir__ is implemented, the results returned by another function, inspect.getmembers(), are affected as well.
For example:
import inspect
class C(object):
    __slots__ = ['atr']
    def __dir__(self):
        return ['nothing']
    def method(self):
        pass
    def __init__(self):
        self.atr = 'string'
c = C()
print(dir(c))  # we get ['nothing'], as returned by the custom __dir__()
print(inspect.getmembers(c))  # here we get []
print(c.__dict__)  # and here an exception is raised because of __slots__
How can the list of the object's names be obtained in this case?
Answer to the original question: does inspect.getmembers() use __dir__() like dir() does?
Here's the source code for inspect.getmembers() so we can see what it's really doing:
def getmembers(object, predicate=None):
    """Return all members of an object as (name, value) pairs sorted by name.
    Optionally, only return members that satisfy a given predicate."""
    results = []
    for key in dir(object):
        try:
            value = getattr(object, key)
        except AttributeError:
            continue
        if not predicate or predicate(value):
            results.append((key, value))
    results.sort()
    return results
From this we see that it is using dir() and just filtering the results a bit.
How to get attributes with an overridden __dir__()?
According to this answer, it isn't possible to always get a complete list of attributes, but we can still definitely get them in some cases/get enough to be useful.
From the docs:
If the object does not provide __dir__(), the function tries its best
to gather information from the object’s __dict__ attribute, if
defined, and from its type object. The resulting list is not
necessarily complete, and may be inaccurate when the object has a
custom __getattr__().
So if you are not using __slots__, you could look at your object's __dict__ (and its type object's) to get basically the same info that dir() would normally give you. So, just like with dir(), you would have to use a more rigorous method to get metaclass methods.
If you are using __slots__, then getting class attributes is, in a way, a bit more simple. Yes, there's no dict, but there is __slots__ itself, which contains the names of all of the attributes. For example, adding print c.__slots__ to your example code yields ['atr']. (Again, a more rigorous approach is needed to get the attributes of superclasses as well.)
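The two cases above can be combined into one helper. This is only a sketch under the assumption of Python 3; raw_names is a hypothetical name, and like dir() it does not look at metaclass attributes.

```python
class C:
    __slots__ = ['atr']

    def __dir__(self):
        return ['nothing']

    def method(self):
        pass

    def __init__(self):
        self.atr = 'string'

def raw_names(obj):
    """Collect attribute names without consulting a custom __dir__ (hypothetical helper)."""
    names = set()
    # Instance attributes, if the object has a __dict__ at all.
    names.update(getattr(obj, '__dict__', {}))
    # Class-level names from every class in the MRO; slot names show up
    # here too, because each slot is a descriptor in the class namespace.
    for klass in type(obj).__mro__:
        names.update(vars(klass))
    return sorted(names)

c = C()
names = raw_names(c)
```

In Python 3, calling object.__dir__(c) directly is another way to get the default listing, since it skips the override defined on C.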
How to get methods
You might need a different solution depending on the use case, but if you just want to find out the methods easily, you can simply use the builtin help().
Modified PyPy dir()
Here's an alternative to some of the above: To get a version of dir() that ignores user-defined __dir__ methods, you could just take PyPy's implementation of dir() and delete the parts that reference __dir__ methods.
As Matthew pointed out in the other answer, getmembers apparently returns the subset of dir results that are actual attributes.
>>> class C:
...     def foo(self):
...         pass
...     def __dir__(self):
...         return ['test']
...
>>> import inspect
>>> c = C()
>>> dir(c)
['test']
>>> inspect.getmembers(c)
[]

Which special methods bypass __getattribute__ in Python?

In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass.
The docs mention special methods such as __hash__, __repr__ and __len__, and I know from experience it also includes __iter__ for Python 2.7.
To quote an answer to a related question:
"Magic __methods__() are treated specially: They are internally assigned to "slots" in the type data structure to speed up their look-up, and they are only looked up in these slots."
In a quest to improve my answer to another question, I need to know: Which methods, specifically, are we talking about?
You can find an answer in the python3 documentation for object.__getattribute__, which states:
Called unconditionally to implement attribute accesses for instances of the class. If the class also defines __getattr__(), the
latter will not be called unless __getattribute__() either calls it
explicitly or raises an AttributeError. This method should return the
(computed) attribute value or raise an AttributeError exception. In
order to avoid infinite recursion in this method, its implementation
should always call the base class method with the same name to access
any attributes it needs, for example, object.__getattribute__(self,
name).
Note
This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in
functions. See Special method lookup.
This page also explains exactly how this "machinery" works. Fundamentally, __getattribute__ is called only when you access an attribute with the . (dot) operator (and also by hasattr, as Zagorulkin pointed out).
Note that the page does not specify which special methods are implicitly looked up, so I assume that this holds for all of them (which you may find here).
Checked in 2.7.9
Couldn't find any way to bypass the call to __getattribute__, with any of the magical methods that are found on object or type:
# Preparation step: did this from the console
# magics = set(dir(object) + dir(type))
# got 38 names, for each of the names, wrote a.<that_name> to a file
# Ended up with this:
a.__module__
a.__base__
#...
Put this at the beginning of that file, which I renamed into a proper Python module (asdf.py):
global_counter = 0
class Counter(object):
    def __getattribute__(self, name):
        # count how many times this method is called
        global global_counter
        global_counter += 1
        return super(Counter, self).__getattribute__(name)
a = Counter()
# after this comes the list of 38 attribute accesses
a.__module__
#...
a.__repr__
#...
print global_counter # you're not gonna like it... it printed 38
Then I also tried to get each of those names with getattr and hasattr -> same result. __getattribute__ was called every time.
So if anyone has other ideas... I was too lazy to look inside the C code for this, but I'm sure the answer lies somewhere there.
So either there's something that I'm not getting right, or the docs are lying.
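A possible explanation for the result above: every access in that test is explicit (a.__repr__, getattr, hasattr), and explicit access always goes through __getattribute__. The bypass the docs describe concerns implicit invocation through syntax or built-ins. A sketch comparing the two, assuming CPython 3 (the hits counter is an illustrative name):

```python
class Counter:
    hits = 0

    def __getattribute__(self, name):
        # Count every attribute access that reaches this hook.
        Counter.hits += 1
        return super().__getattribute__(name)

    def __len__(self):
        return 0

a = Counter()

# Implicit invocation: the special methods are looked up on the type.
repr(a)
hash(a)
len(a)
implicit_hits = Counter.hits

# Explicit access: each dotted lookup goes through __getattribute__.
a.__repr__()
a.__hash__()
a.__len__()
explicit_hits = Counter.hits - implicit_hits
```

With this setup the implicit calls leave the counter untouched, while each of the three explicit lookups increments it once.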
super().method will also bypass __getattribute__. This atrocious code will run just fine (Python 3.11).
class Base:
    def print(self):
        print("whatever")
    def __getattribute__(self, item):
        raise Exception("Don't access this with a dot!")
class Sub(Base):
    def __init__(self):
        super().print()
a = Sub()
# prints 'whatever'
a.print()
# Exception Don't access this with a dot!

Python: dereferencing weakproxy

Is there any way to get the original object from a weakproxy pointing to it? E.g., is there an inverse of weakref.proxy()?
A simplified example (Python 2.7):
import weakref
class C(object):
    def __init__(self, other):
        self.other = weakref.proxy(other)
class Other(object):
    pass
others = [Other() for i in xrange(3)]
my_list = [C(others[i % len(others)]) for i in xrange(10)]
I need to get the list of unique other members from my_list. The way I prefer for such tasks
is to use set:
unique_others = {x.other for x in my_list}
Unfortunately this throws TypeError: unhashable type: 'weakproxy'
I have managed to solve the specific problem in an imperative way (slow and dirty):
unique_others = []
for x in my_list:
    if x.other in unique_others:
        continue
    unique_others.append(x.other)
but the general problem stated in the title remains.
What if I only have my_list under control, the others are buried in some lib where someone may delete them at any time, and I want to prevent the deletion by collecting strong refs in a list?
Or I may want to get the repr() of the object itself, not <weakproxy at xx to Other at xx>
I guess there should be something like weakref.unproxy that I'm not aware of.
I know this is an old question but I was looking for an answer recently and came up with something. Like others said, there is no documented way to do it and looking at the implementation of weakproxy type confirms that there is no standard way to achieve this.
My solution uses the fact that all Python objects have a set of standard methods (like __repr__) and that bound method objects contain a reference to the instance (in __self__ attribute).
Therefore, by dereferencing the proxy to get the method object, we can get a strong reference to the proxied object from the method object.
Example:
>>> def func():
...     pass
...
>>> weakfunc = weakref.proxy(func)
>>> f = weakfunc.__repr__.__self__
>>> f is func
True
Another nice thing is that it will work for strong references as well:
>>> func.__repr__.__self__ is func
True
So there's no need for type checks if either a proxy or a strong reference could be expected.
Edit:
I just noticed that this doesn't work for proxies of classes. This is not universal then.
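Applied to the question's example, the trick could look like the sketch below (written for Python 3; unproxy is a hypothetical helper name, not part of the weakref module). As noted above, it is not universal and fails for proxies of classes.

```python
import weakref

class Other:
    pass

class C:
    def __init__(self, other):
        self.other = weakref.proxy(other)

def unproxy(obj):
    """Return a strong reference to the object behind a proxy (hypothetical helper)."""
    # Attribute access on the proxy is forwarded to the referent, so the
    # bound __repr__ carries the real object in its __self__ attribute.
    return obj.__repr__.__self__

others = [Other() for _ in range(3)]
my_list = [C(others[i % len(others)]) for i in range(10)]

# The real objects are hashable, so a set works again.
unique_others = {unproxy(x.other) for x in my_list}
```

Since unproxy returns strong references, keeping unique_others around also keeps the referenced objects alive.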
Basically there is something like weakref.unproxy, but it's just named weakref.ref(x)().
The proxy object is only there for delegation and the implementation is rather shaky...
The == function doesn't work as you would expect it:
>>> weakref.proxy(object) == object
False
>>> weakref.proxy(object) == weakref.proxy(object)
True
>>> weakref.proxy(object).__eq__(object)
True
However, I see that you don't want to call weakref.ref objects all the time. A good working proxy with dereference support would be nice.
But at the moment, this is just not possible. If you look into python builtin source code you see, that you need something like PyWeakref_GetObject, but there is just no call to this method at all (And: it raises a PyErr_BadInternalCall if the argument is wrong, so it seems to be an internal function). PyWeakref_GET_OBJECT is used much more, but there is no method in weakref.py that could be able to do that.
So, sorry to disappoint you, but weakref.proxy is just not what most people would want for their use cases. You can however make your own proxy implementation. It isn't too hard. Just use weakref.ref internally and override __getattr__, __repr__, etc.
On a little sidenote on how PyCharm is able to produce the normal repr output (Because you mentioned that in a comment):
>>> class A(): pass
>>> a = A()
>>> weakref.proxy(a)
<weakproxy at 0x7fcf7885d470 to A at 0x1410990>
>>> weakref.proxy(a).__repr__()
'<__main__.A object at 0x1410990>'
>>> type( weakref.proxy(a))
<type 'weakproxy'>
As you can see, calling the original __repr__ can really help!
weakref.ref is hashable whereas weakref.proxy is not. The API doesn't say anything about how you can actually get a handle on the object a proxy points to. With weakref.ref, it's easy: you can just call it. As such, you can roll your own proxy-like class. Here's a very basic attempt:
import weakref
class C(object):
    def __init__(self, obj):
        self.object = weakref.ref(obj)
    def __getattr__(self, key):
        # __getattr__ is only called when normal lookup fails, so
        # self.object and __init__ never reach this method.
        # (object has no __getattr__, so it cannot be delegated to.)
        obj = self.object()  # dereference the weakref
        return getattr(obj, key)
class Other(object):
    pass
others = [Other() for i in range(3)]
my_list = [C(others[i % len(others)]) for i in range(10)]
unique_list = {x.object for x in my_list}
Of course, now unique_list contains refs, not proxies, which is fundamentally different...
I know that this is an old question, but I've been bitten by it (so, there's no real 'unproxy' in the standard library) and wanted to share my solution...
The way I solved it to get the real instance was just creating a property which returned it (although I suggest using weakref.ref instead of a weakref.proxy as code should really check if it's still alive before accessing it instead of having to remember to catch an exception whenever any attribute is accessed).
Anyways, if you still must use a proxy, the code to get the real instance is:
import weakref
class MyClass(object):
    @property
    def real_self(self):
        return self
instance = MyClass()
proxied = weakref.proxy(instance)
assert proxied.real_self is instance
