Why don't methods have reference equality? - python

I had a bug where I was relying on methods being equal to each other when using is. It turns out that's not the case:
>>> class What:
...     def meth(self):
...         pass
...
>>> What.meth is What.meth
True
>>> inst = What()
>>> inst.meth is inst.meth
False
Why is that the case? It works for regular functions:
>>> def func(): pass
>>> func is func
True

Method objects are created each time you access them. Functions act as descriptors, returning a method object when their .__get__ method is called:
>>> What.__dict__['meth']
<function What.meth at 0x10a6f9c80>
>>> What.__dict__['meth'].__get__(What(), What)
<bound method What.meth of <__main__.What object at 0x10a6f7b10>>
If you're on Python 3.8 or later, you can use == equality testing instead. On Python 3.8 and later, two methods are equal if their .__self__ and .__func__ attributes are identical objects (so if they wrap the same function, and are bound to the same instance, both tested with is).
Before 3.8, method == behaviour is inconsistent based on how the method was implemented - Python methods and one of the two C method types compare __self__ for equality instead of identity, while the other C method type compares __self__ by identity. See Python issue 1617161.
If you need to test that the methods represent the same underlying function, test their __func__ attributes:
>>> What.meth == What.meth # functions (or unbound methods in Python 2)
True
>>> What().meth == What.meth # bound method and function
False
>>> What().meth == What().meth # bound methods with *different* instances
False
>>> What().meth.__func__ == What().meth.__func__ # functions
True
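To make this concrete, here is a minimal sketch for Python 3. The names first, second, manual and callbacks are only illustrative. It shows that every attribute access builds a fresh bound method, while == and the __self__/__func__ attributes still tell you whether two bound methods wrap the same function on the same instance:

```python
class What:
    def meth(self):
        pass

inst = What()

# Each attribute access calls the function's __get__, producing a new bound method.
first = inst.meth
second = inst.meth
print(first is second)   # False: two distinct method objects
print(first == second)   # True: same __func__, same __self__

# The pieces that == looks at are available for direct inspection.
print(first.__self__ is inst)                     # True
print(first.__func__ is What.__dict__['meth'])    # True

# Calling __get__ by hand gives an equivalent bound method.
manual = What.__dict__['meth'].__get__(inst, What)
print(manual == first)   # True

# Practical consequence: equality-based container operations work with
# bound methods, even though identity does not.
callbacks = [inst.meth]
callbacks.remove(inst.meth)   # list.remove compares with ==
print(callbacks)              # []
```

If you need an identity-stable handle, store the bound method once (as first does here) and reuse that reference.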

Martijn is right that new method objects are generated by .__get__ on each access, so two lookups yield different objects and an is comparison fails. Note that using == will evaluate as intended in Python 2.7.
Python 2.7
class Test(object):
    def meth(self):
        pass
>>> Test.meth is Test.meth
False
>>> Test.meth == Test.meth
True
>>> t = Test()
>>> t.meth is t.meth
False
>>> t.meth == t.meth
True
Note however that methods referenced from an instance do not equate to those referenced from class because of the self reference carried along with the method from an instance.
>>> t = Test()
>>> t.meth is Test.meth
False
>>> t.meth == Test.meth
False
In Python 3.3 the is operator behaves the same as == in this example, so you get the expected result. This follows from a cleaner method representation in Python 3: unbound methods no longer exist, so Test.meth is simply the function stored on the class rather than an object built on the fly. (Bound methods looked up on an instance are still created on each access, so t.meth is t.meth remains False.) Python 3 also dropped __cmp__; methods define __eq__ instead.
Python 3.3
>>> Test.meth is Test.meth
True
>>> Test.meth == Test.meth
True
>>> Test.meth.__eq__(Test.meth)
True

Related

How to find out if a function has been declared by `lambda` or `def`?

If I declare two functions a and b:
def a(x):
    return x**2
b = lambda x: x**2
I cannot use type to differentiate them, since they're both of the same type.
assert type(a) == type(b)
Also, types.LambdaType doesn't help:
>>> import types
>>> isinstance(a, types.LambdaType)
True
>>> isinstance(b, types.LambdaType)
True
One could use __name__ like:
def is_lambda_function(function):
    return function.__name__ == "<lambda>"
>>> is_lambda_function(a)
False
>>> is_lambda_function(b)
True
However, since __name__ could have been modified, is_lambda_function is not guaranteed to return the correct result:
>>> a.__name__ = '<lambda>'
>>> is_lambda_function(a)
True
Is there a way which produces a more reliable result than the __name__ attribute?
AFAIK, you cannot reliably in Python 3.
Python 2 used to define a bunch of function types. For that reason, methods, lambdas and plain functions have each their own type.
Python 3 has only one type, which is function. Declaring a function with def and declaring it with lambda do have different side effects: def sets the name (and qualified name) to the name of the function and can set a docstring, while lambda sets the name (and qualified name) to <lambda> and sets the docstring to None. But since all of this can be changed afterwards...
If the functions are loaded from a regular Python source file (and not typed in an interactive environment), the inspect module lets you access the original Python code:
import inspect
def f(x):
    return x**2
g = lambda x: x**2
def is_lambda_func(f):
    """Tests whether f was declared as a lambda.
    Returns: True for a lambda, False for a function or method declared with def
    Raises:
        TypeError if f is not a function
        OSError('could not get source code') if f was not declared in a Python module
            but (for example) in an interactive session
    """
    if not inspect.isfunction(f):
        raise TypeError('not a function')
    src = inspect.getsource(f)
    return not src.startswith('def') and not src.startswith('@')  # provision for decorated funcs
g.__name__ = 'g'
g.__qualname__ = 'g'
print(f, is_lambda_func(f))
print(g, is_lambda_func(g))
This will print:
<function f at 0x00000253957B7840> False
<function g at 0x00000253957B78C8> True
By the way, if the underlying problem is serializing the function, a function declared as a lambda can be pickled successfully, provided you give it a unique qualified name:
>>> import pickle
>>> g = lambda x: 3*x
>>> g.__qualname__ = "g"
>>> pickle.dumps(g)
b'\x80\x03c__main__\ng\nq\x00.'
You can check __code__.co_name. It contains what the name was at the time the function/lambda was compiled:
def a(x):
    return x**2
b = lambda x: x**2
def is_lambda_function(f):
    return f.__code__.co_name == "<lambda>"
>>> is_lambda_function(a)
False
>>> is_lambda_function(b)
True
And, contrary to __name__, __code__.co_name is read-only...
>>> a.__name__ = "<lambda>"
>>> b.__name__ = "b"
>>> a.__code__.co_name = "<lambda>"
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: readonly attribute
>>> b.__code__.co_name = "b"
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: readonly attribute
... so the results will stay the same:
>>> is_lambda_function(a)
False
>>> is_lambda_function(b)
True
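One caveat is worth sketching. co_name is read-only on the code object, but the function's __code__ attribute itself is writable, and code objects provide replace() (Python 3.8 and later) to build a modified copy. The check is therefore robust against accidental renaming, but not against deliberate tampering. The variable names before and after are only for illustration:

```python
def a(x):
    return x ** 2

def is_lambda_function(f):
    return f.__code__.co_name == "<lambda>"

before = is_lambda_function(a)

# Build a copy of the code object with a different co_name and swap it in.
# CodeType.replace() requires Python 3.8 or later.
a.__code__ = a.__code__.replace(co_name="<lambda>")

after = is_lambda_function(a)

print(before, after)   # False True
print(a(3))            # 9: the function still behaves the same
```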
I took the chance to dive into CPython's source to see if I could find anything, and I am afraid I have to second Serge's answer: you cannot.
Briefly, this is a lambda's journey in the interpreter:
During parsing, lambdas, just like every other expression, are read into an expr_ty, which is a huge union containing data of every expression.
This expr_ty is then converted to the appropriate type (Lambda, in our case)
After some time we land into the function that compiles lambdas
This function calls assemble, which calls makecode, which initializes a PyCodeObject (functions, methods, as well as lambdas, all end up here).
From this, I don't see anything particular that is specific to lambdas. This, combined with the fact that Python lets you modify pretty much every attribute of objects makes me/us believe what you want to do is not possible.

Python 3 == operator

I'm confused as to how the == operator works in Python 3. From the docs, eq(a, b) is equivalent to a == b. Also eq and __eq__ are equivalent.
Take the following example:
class Potato:
    def __eq__(self, other):
        print("In Potato's __eq__")
        return True
>> p = Potato()
>> p == "hello"
In Potato's __eq__ # As expected, p.__eq__("hello") is called
True
>> "hello" == p
In Potato's __eq__ # Hmm, I expected this to be false because
True # this should call "hello".__eq__(p)
>> "hello".__eq__(p)
NotImplemented # Not implemented? How does == work for strings then?
AFAIK, the docs only talk about the == -> __eq__ mapping, but don't say anything about what happens when either one of the arguments is not an object (e.g. 1 == p), or when the first object's __eq__ returns NotImplemented, like we saw with "hello".__eq__(p).
I'm looking for the general algorithm that is employed for equality... Most, if not all other SO answers, refer to Python 2's coercion rules, which don't apply anymore in Python 3.
You're mixing up the functions in the operator module and the methods used to implement those operators. operator.eq(a, b) is equivalent to a == b or operator.__eq__(a, b), but not to a.__eq__(b).
In terms of the __eq__ method, == and operator.eq work as follows:
def eq(a, b):
    if type(a) is not type(b) and issubclass(type(b), type(a)):
        # Give type(b) priority
        a, b = b, a
    result = a.__eq__(b)
    if result is NotImplemented:
        result = b.__eq__(a)
    if result is NotImplemented:
        result = a is b
    return result
with the caveat that the real code performs method lookup for __eq__ in a way that bypasses instance dicts and custom __getattribute__/__getattr__ methods.
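The subclass-priority step can be observed directly. This is a small sketch with hypothetical classes Base and Derived that record the order in which their __eq__ methods run. Both return NotImplemented, so the comparison falls through to the identity check:

```python
order = []

class Base:
    def __eq__(self, other):
        order.append("Base")
        return NotImplemented

class Derived(Base):
    def __eq__(self, other):
        order.append("Derived")
        return NotImplemented

# The right-hand operand's type is a proper subclass of the left-hand
# operand's type, so its __eq__ is tried first.
result = Base() == Derived()

print(order)    # ['Derived', 'Base']
print(result)   # False: both sides declined, and the operands are different objects
```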
When you do this:
"hello" == potato
Python first calls "hello".__eq__(potato). That returns NotImplemented, so Python tries it the other way: potato.__eq__("hello").
Returning NotImplemented doesn't mean there's no implementation of .__eq__ on that object. It means that the implementation didn't know how to compare to the value that was passed in. From https://docs.python.org/3/library/constants.html#NotImplemented:
Note When a binary (or in-place) method returns NotImplemented the
interpreter will try the reflected operation on the other type (or
some other fallback, depending on the operator). If all attempts
return NotImplemented, the interpreter will raise an appropriate
exception. Incorrectly returning NotImplemented will result in a
misleading error message or the NotImplemented value being returned to
Python code. See Implementing the arithmetic operations for examples.
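As an illustration of that fallback, here is a sketch with two made-up classes. Meters declines to compare against foreign types by returning NotImplemented, and Loud accepts anything and logs who asked:

```python
calls = []

class Meters:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented   # "I don't know how to compare with this"
        return self.value == other.value

class Loud:
    def __eq__(self, other):
        calls.append(type(other).__name__)
        return True

same_units = Meters(1) == Meters(1)   # handled by Meters.__eq__
with_int = Meters(1) == 1             # both sides decline, identity fallback
with_loud = Meters(1) == Loud()       # Meters declines, Loud.__eq__ is tried

print(same_units, with_int, with_loud)   # True False True
print(calls)                             # ['Meters']
```

For ==, the final fallback is an identity comparison rather than an exception, which is why Meters(1) == 1 quietly evaluates to False.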
I'm confused as to how the == operator works in Python 3. From the docs, eq(a, b) is equivalent to a == b. Also eq and __eq__ are equivalent.
No, that is only the case in the operator module. The operator module lets you pass == around as a function, for instance. But it has little to do with how vanilla Python evaluates the operator itself.
AFAIK, the docs only talk about the == -> eq mapping, but don't say anything about what happens either one of the arguments is not an object (e.g. 1 == p), or when the first object's.
In Python everything is an object: an int is an object, a class is an object, None is an object, etc. We can for instance get the __eq__ of 0:
>>> (0).__eq__
<method-wrapper '__eq__' of int object at 0x55a81fd3a480>
So equality is implemented in the int class. As specified in the data model documentation, __eq__ can return several values: True, False, or any other object (whose truthiness is evaluated where a boolean is needed). If, on the other hand, NotImplemented is returned, Python falls back and calls the __eq__ method of the object on the other side of the comparison.

What operation is called on A in "while A:"?

Let's say I have:
class Bar:
    pass
A = Bar()
while A:
    print("Foo!")
What operation is then called on A in order to determine the while loop?
I've tried __eq__ but that didn't do much.
User-defined objects are truthy, unless you define a custom __bool__:
>>> class A:
...     pass
...
>>> a = A()
>>> if a: print(1)
...
1
>>> class B:
...     def __bool__(self):
...         return False
...
>>> b = B()
>>> if b: print(1)
...
>>>
The while statement is composed of the while keyword followed by an expression.
When an expression is used in a control flow statement, the truth value of that expression is evaluated by calling the object's __bool__ method:
In the context of Boolean operations, and also when expressions are used by control flow statements, the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true. User-defined objects can customize their truth value by providing a __bool__() method.
In short, the result depends on what the __bool__ of your object returns; since you haven't specified one, a default value of True is used.
Several methods can be called to determine whether an object evaluates to True or False.
If a __bool__ method is defined, it is called; otherwise, if __len__ is defined, its result is compared to 0. If neither is defined, the object is considered true.
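The lookup order can be checked with a few throwaway classes. The class names here are invented for the example:

```python
class NoHooks:
    pass                      # neither __bool__ nor __len__: always truthy

class Sized:
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n         # truthy exactly when the length is non-zero

class Both:
    def __len__(self):
        return 0

    def __bool__(self):
        return True           # __bool__ wins over __len__

print(bool(NoHooks()))   # True
print(bool(Sized(0)))    # False
print(bool(Sized(3)))    # True
print(bool(Both()))      # True

# The same test drives while: this loop runs until the countdown is empty.
countdown = Sized(3)
iterations = 0
while countdown:
    countdown.n -= 1
    iterations += 1
print(iterations)        # 3
```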

Are class objects singletons?

If we have x = type(a) and x == y, does it necessarily imply that x is y?
Here is a counter-example, but it's a cheat:
>>> class BrokenEq(type):
...     def __eq__(cls, other):
...         return True
...
>>> class A(metaclass=BrokenEq):
...     pass
...
>>> a = A()
>>> x = type(a)
>>> x == A, x is A
(True, True)
>>> x == BrokenEq, x is BrokenEq
(True, False)
And I could not create a counterexample like this:
>>> A1 = type('A', (), {})
>>> A2 = type('A', (), {})
>>> a = A1()
>>> x = type(a)
>>> x == A1, x is A1
(True, True)
>>> x == A2, x is A2
(False, False)
To clarify my question - without overriding equality operators to do something insane, is it possible for a class to exist at two different memory locations or does the import system somehow prevent this?
If so, how can we demonstrate this behavior - for example, doing weird things with reload or __import__?
If not, is that guaranteed by the language or documented anywhere?
Epilogue:
# thing.py
class A:
    pass
Finally, this is what clarified the real behaviour for me (and it supports the claims in Blckknght's answer):
>>> import sys
>>> from thing import A
>>> a = A()
>>> isinstance(a, A), type(a) == A, type(a) is A
(True, True, True)
>>> del sys.modules['thing']
>>> from thing import A
>>> isinstance(a, A), type(a) == A, type(a) is A
(False, False, False)
So, although code that uses importlib.reload could break type checking by class identity, it will also break isinstance anyway.
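The importlib.reload case can be reproduced in a self-contained way. The sketch below writes a throwaway module to a temporary directory; the module name thing_demo is made up for this example, chosen to avoid clashing with a real thing.py. It then checks identity before and after the reload:

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a tiny module on disk and make it importable.
    pathlib.Path(tmp, "thing_demo.py").write_text("class A:\n    pass\n")
    sys.path.insert(0, tmp)
    try:
        import thing_demo
        old_A = thing_demo.A
        a = old_A()

        # Re-executing the module body creates a brand new class object.
        importlib.reload(thing_demo)
        new_A = thing_demo.A
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("thing_demo", None)

print(old_A is new_A)        # False: two distinct class objects with the same name
print(old_A == new_A)        # False: default equality is identity
print(type(a) is old_A)      # True: the instance keeps its original class
print(isinstance(a, new_A))  # False: isinstance breaks along with identity
```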
No, there's no way to create two class objects that compare equal without being identical, except by messing around with metaclass __eq__ methods.
This behavior though is not something unique to classes. It's the default behavior for any object without an __eq__ method defined in its class. The behavior is inherited from object, which is the base class for all other (new-style) classes. It's only overridden for builtin types that have some other semantic for equality (e.g. container types which compare their contents) and for custom classes that define an __eq__ operator of their own.
As for getting two different references to the same class at different memory locations, that's not really possible due to Python's object semantics. The memory location of the object is its identity (in CPython at least). Another class with identical contents can exist somewhere else, but like in your A1 and A2 example, it's going to be seen as a different object by all Python logic.
I'm not aware of any documentation about how == works for types, but it definitely works by identity. You can see that the CPython 2.7 implementation is a pointer comparison:
static PyObject*
type_richcompare(PyObject *v, PyObject *w, int op)
{
    ...
    /* Compare addresses */
    vv = (Py_uintptr_t)v;
    ww = (Py_uintptr_t)w;
    switch (op) {
    ...
    case Py_EQ: c = vv == ww; break;
In CPython 3.5, type doesn't implement its own tp_richcompare, so it inherits the default equality comparison from object, which is a pointer comparison:
PyTypeObject PyType_Type = {
    ...
    0,                                          /* tp_richcompare */

Does comparing using `==` compare identities before comparing values?

If I compare two variables using ==, does Python compare the identities, and, if they're not the same, then compare the values?
For example, I have two strings which point to the same string object:
>>> a = 'a sequence of chars'
>>> b = a
Does this compare the values, or just the ids?:
>>> b == a
True
It would make sense to compare identity first, and I guess that is the case, but I haven't yet found anything in the documentation to support this. The closest I've got is this:
x==y calls x.__eq__(y)
which doesn't tell me whether anything is done before calling x.__eq__(y).
For user-defined class instances, is is used as a fallback - where the default __eq__ isn't overridden, a == b is evaluated as a is b. This ensures that the comparison will always have a result (except in the NotImplemented case, where comparison is explicitly forbidden).
This is (somewhat obliquely - good spot Sven Marnach) referred to in the data model documentation (emphasis mine):
User-defined classes have __eq__() and __hash__() methods by
default; with them, all objects compare unequal (except with
themselves) and x.__hash__() returns an appropriate value such
that x == y implies both that x is y and hash(x) == hash(y).
You can demonstrate it as follows:
>>> class Unequal(object):
...     def __eq__(self, other):
...         return False
...
>>> ue = Unequal()
>>> ue is ue
True
>>> ue == ue
False
so __eq__ must be called before id, but:
>>> class NoEqual(object):
...     pass
...
>>> ne = NoEqual()
>>> ne is ne
True
>>> ne == ne
True
so id must be invoked where __eq__ isn't defined.
You can see this in the CPython implementation, which notes:
/* If neither object implements it, provide a sensible default
for == and !=, but raise an exception for ordering. */
The "sensible default" implemented is a C-level equality comparison of the pointers v and w, which will return whether or not they point to the same object.
In addition to the answer by @jonrsharpe: if the objects being compared implement __eq__, it would be wrong for Python to check for identity first.
Look at the following example:
>>> x = float('nan')
>>> x is x
True
>>> x == x
False
NaN is a specific thing that should never compare equal to itself; however, even in this case x is x should return True, because of the semantics of is.
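A short sketch makes the distinction visible. The Counting class is invented for this example and counts how often its __eq__ runs. The last two lines show that built-in containers, unlike the bare == operator, do take an identity shortcut when comparing their elements:

```python
class Counting:
    calls = 0

    def __eq__(self, other):
        Counting.calls += 1
        return isinstance(other, Counting)

c = Counting()
same = (c == c)          # __eq__ still runs, even though both operands are one object
print(same, Counting.calls)   # True 1

nan = float("nan")
nan_equal = (nan == nan)        # float.__eq__ is consulted, no identity shortcut
in_list = nan in [nan]          # containment checks identity before equality
lists_equal = ([nan] == [nan])  # element comparison does the same
print(nan_equal, in_list, lists_equal)   # False True True
```

So x == y on its own never skips __eq__, but code that relies on list or tuple comparison can observe identity-based results for objects such as NaN.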
