Comparison operators vs “rich comparison” methods in Python


Can someone explain the differences between the two? Are they normally equivalent? Maybe I'm completely wrong here, but I thought that each comparison operator was necessarily tied to one “rich comparison” method. This is from the documentation:
The correspondence between operator symbols and method names is as
follows:
x<y calls x.__lt__(y), x<=y calls x.__le__(y), x==y calls x.__eq__(y), x!=y calls x.__ne__(y), x>y calls x.__gt__(y), and x>=y calls x.__ge__(y).
Here is an example that demonstrates my confusion.
Python 3.x:
>>> dict1 = {1:1}
>>> dict2 = {2:2}
>>> dict1 < dict2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'dict' and 'dict'
>>> dict1.__lt__(dict2)
NotImplemented
Python 2.x:
>>> dict1 = {1:1}
>>> dict2 = {2:2}
>>> dict1 < dict2
True
>>> dict1.__lt__(dict2)
NotImplemented
From the Python 3 example, it seems logical that dict1 < dict2 is not supported. But what about the Python 2 example? Why is it accepted?
I know that, unlike in Python 2, not all objects support the comparison operators in Python 3. To my surprise, though, both versions return the NotImplemented singleton when __lt__() is called directly.

This is relying on the __cmp__ magic method, which is what the rich-comparison operators were meant to replace:
>>> dict1 = {1:1}
>>> dict2 = {2:2}
>>> dict1.__cmp__
<method-wrapper '__cmp__' of dict object at 0x10f075398>
>>> dict1.__cmp__(dict2)
-1
As to the ordering logic, here is the Python 2.7 documentation:
Mappings (instances of dict) compare equal if and only if they have
equal (key, value) pairs. Equality comparison of the keys and values
enforces reflexivity.
Outcomes other than equality are resolved consistently, but are not
otherwise defined.
With a footnote:
Earlier versions of Python used lexicographic comparison of the sorted
(key, value) lists, but this was very expensive for the common case of
comparing for equality. An even earlier version of Python compared
dictionaries by identity only, but this caused surprises because
people expected to be able to test a dictionary for emptiness by
comparing it to {}.
And, in Python 3.0, ordering has been simplified. This is from the documentation:
The ordering comparison operators (<, <=, >=, >) raise a TypeError
exception when the operands don’t have a meaningful natural ordering.
builtin.sorted() and list.sort() no longer accept the cmp argument
providing a comparison function. Use the key argument instead.
The cmp() function should be treated as gone, and the __cmp__() special method
is no longer supported. Use __lt__() for sorting, __eq__() with
__hash__(), and other rich comparisons as needed. (If you really need the cmp() functionality, you could use the expression (a > b) - (a < b) as the equivalent for cmp(a, b).)
So, to be explicit: in Python 2, because dict's rich comparison methods return NotImplemented for ordering, the comparison falls back to __cmp__. From the data-model documentation:
object.__cmp__(self, other)
Called by comparison operations if rich
comparison (see above) is not defined. Should return a negative
integer if self < other, zero if self == other, a positive integer
if self > other.
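To make the operator-versus-method distinction concrete in Python 3, here is a minimal sketch. The Metres class is a hypothetical example, not something from the documentation. It shows that calling __lt__ directly hands back NotImplemented, while the < operator turns the same situation into a TypeError once both operands have declined.

```python
class Metres:
    """Hypothetical value type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # Decline politely when we do not know how to compare with `other`.
        if not isinstance(other, Metres):
            return NotImplemented
        return self.value < other.value


short, long_ = Metres(1), Metres(5)

# Same type: operator and method agree.
same_type = (short < long_, short.__lt__(long_))

# Different type: the method returns the NotImplemented singleton...
direct_call = short.__lt__(3)

# ...but the operator also tries the reflected int.__gt__, and raises
# TypeError when that declines as well.
try:
    short < 3
    operator_outcome = "no error"
except TypeError as exc:
    operator_outcome = type(exc).__name__

print(same_type, direct_call, operator_outcome)
# (True, True) NotImplemented TypeError
```

So the two are equivalent only as long as the method gives a definite answer; the operator adds the reflected call and the final error handling on top.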

A note on the < operator versus __lt__:
import types
class A:
    def __lt__(self, other): return True
def new_lt(self, other): return False
a = A()
print(a < a, a.__lt__(a)) # True True
a.__lt__ = types.MethodType(new_lt, a)
print(a < a, a.__lt__(a)) # True False
A.__lt__ = types.MethodType(new_lt, A)
print(a < a, a.__lt__(a)) # False False
The < operator calls the __lt__ defined on the class; a.__lt__(...) calls the __lt__ found on the instance.
It's usually the same :) And it is totally delicious to use: A.__lt__ = new_lt

Related

Python 3 == operator

I'm confused as to how the == operator works in Python 3. From the docs, eq(a, b) is equivalent to a == b. Also eq and __eq__ are equivalent.
Take the following example:
class Potato:
    def __eq__(self, other):
        print("In Potato's __eq__")
        return True
>>> p = Potato()
>>> p == "hello"
In Potato's __eq__ # As expected, p.__eq__("hello") is called
True
>>> "hello" == p
In Potato's __eq__ # Hmm, I expected this to be false because
True # this should call "hello".__eq__(p)
>>> "hello".__eq__(p)
NotImplemented # Not implemented? How does == work for strings then?
AFAIK, the docs only talk about the == -> __eq__ mapping, but don't say anything about what happens if either one of the arguments is not an object (e.g. 1 == p), or when the first object's __eq__ returns NotImplemented, like we saw with "hello".__eq__(p).
I'm looking for the general algorithm that is employed for equality... Most, if not all other SO answers, refer to Python 2's coercion rules, which don't apply anymore in Python 3.

You're mixing up the functions in the operator module and the methods used to implement those operators. operator.eq(a, b) is equivalent to a == b or operator.__eq__(a, b), but not to a.__eq__(b).
In terms of the __eq__ method, == and operator.eq work as follows:
def eq(a, b):
    if type(a) is not type(b) and issubclass(type(b), type(a)):
        # Give type(b) priority
        a, b = b, a
    result = a.__eq__(b)
    if result is NotImplemented:
        result = b.__eq__(a)
    if result is NotImplemented:
        result = a is b
    return result
with the caveat that the real code performs method lookup for __eq__ in a way that bypasses instance dicts and custom __getattribute__/__getattr__ methods.

When you do this:
"hello" == potato
Python first calls "hello".__eq__(potato). That returns NotImplemented, so Python tries it the other way: potato.__eq__("hello").
Returning NotImplemented doesn't mean there's no implementation of .__eq__ on that object. It means that the implementation didn't know how to compare to the value that was passed in. From https://docs.python.org/3/library/constants.html#NotImplemented:
Note When a binary (or in-place) method returns NotImplemented the
interpreter will try the reflected operation on the other type (or
some other fallback, depending on the operator). If all attempts
return NotImplemented, the interpreter will raise an appropriate
exception. Incorrectly returning NotImplemented will result in a
misleading error message or the NotImplemented value being returned to
Python code. See Implementing the arithmetic operations for examples.
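The fallback chain described above can be observed directly. This is a small sketch in which Recorder is a made-up class that logs each call to its __eq__; the names are assumptions for illustration only.

```python
calls = []


class Recorder:
    """Hypothetical class that records every __eq__ call it receives."""

    def __eq__(self, other):
        calls.append(("Recorder.__eq__", other))
        return True


r = Recorder()

# str.__eq__ does not know about Recorder, so it declines...
declined = "hello".__eq__(r)

# ...and the == operator then tries the reflected call r.__eq__("hello").
result = "hello" == r

print(declined)  # NotImplemented
print(result)    # True
print(calls)     # [('Recorder.__eq__', 'hello')]
```

Only one entry ends up in calls, because the direct "hello".__eq__(r) call never reaches Recorder; the reflected attempt is something the operator does, not the method.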

I'm confused as to how the == operator works in Python 3. From the docs, eq(a, b) is equivalent to a == b. Also eq and __eq__ are equivalent.
No, that is only the case in the operator module, which is used, for instance, to pass == around as a function. But the operator module has little to do with how Python itself evaluates ==.
AFAIK, the docs only talk about the == -> eq mapping, but don't say anything about what happens either one of the arguments is not an object (e.g. 1 == p), or when the first object's.
In Python everything is an object: an int is an object, a "class" is an object, None is an object, etc. We can for instance get the __eq__ of 0:
>>> (0).__eq__
<method-wrapper '__eq__' of int object at 0x55a81fd3a480>
So equality is implemented in the int class. As specified in the data model documentation, __eq__ can return True or False, but also any other object (whose truthiness is then evaluated when the result is used in a Boolean context). If, on the other hand, NotImplemented is returned, Python falls back and calls the __eq__ method of the object on the other side of the comparison.
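Two consequences of this are easy to check: when both sides decline, == quietly falls back to identity and yields False, and __eq__ may return an arbitrary object. The Column class below is hypothetical, loosely in the style of a query builder, and is only a sketch.

```python
class Column:
    """Hypothetical query-builder column; __eq__ builds an expression."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Return a description of the comparison instead of a bool.
        return ("eq", self.name, other)


# Both int.__eq__ and str.__eq__ decline, so == falls back to identity.
int_side = (0).__eq__("0")
str_side = "0".__eq__(0)
fallback = 0 == "0"

# The operator passes the non-bool return value through unchanged.
expr = Column("age") == 30

print(int_side, str_side, fallback)  # NotImplemented NotImplemented False
print(expr)                          # ('eq', 'age', 30)
print(bool(expr))                    # True, since a non-empty tuple is truthy
```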

Set contains for user defined classes using __hash__ function

Given:
class T:
    def __hash__(self):
        return 1234
t1 = T()
t2 = T()
my_set = { t1 }
I would expect the following to print True:
print t2 in my_set
Isn't this supposed to print True, because t1 and t2 have the same hash value? How can I make the in operator of the set use the given hash function?

You need to define an __eq__ method, because set and dict treat two objects as the same element only if they have the same hash and are either identical (a is b) or equal (a == b):
class T:
    def __hash__(self):
        return 1234
    def __eq__(self, other):
        return True
t1 = T()
t2 = T()
my_set = { t1 }
print(t2 in my_set) # True
The data model on __hash__ (and the same documentation page for Python 2) explains this:
__hash__
Called by built-in function hash() and for operations on members of hashed collections including set, frozenset, and dict. __hash__() should return an integer. The only required property is that objects which compare equal have the same hash value; it is advised to mix together the hash values of the components of the object that also play a part in comparison of objects by packing them into a tuple and hashing the tuple.
If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections. If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).
(Emphasis mine)
Note: In Python 2 you can also implement a __cmp__ method instead of __eq__.
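An __eq__ that always returns True is fine for a demonstration, but a real class would normally compare and hash the same components, as the quoted documentation advises. A minimal sketch, with a hypothetical Point class:

```python
class Point:
    """Hypothetical value type; treat instances as immutable once created."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same components that __eq__ compares.
        return hash((self.x, self.y))


p1, p2 = Point(1, 2), Point(1, 2)
seen = {p1}

print(p1 is p2)             # False: two distinct objects
print(p2 in seen)           # True: equal hash and equal by __eq__
print(Point(9, 9) in seen)  # False
```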

In pseudocode, the logic for set.__contains__() when called by x in s is roughly:
h = hash(x) # This uses your class's __hash__()
i = h % table_size # This logic is internal to the hash table
if table[i] is empty: return False # Nothing found in the set
if table[i] is x: return True # Identity implies equality
if hash(table[i]) != h: return False # Hash mismatch implies inequality
return table[i] == x # This needs __eq__() in your class
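For readers who want something runnable, the same order of checks can be sketched as a toy hash set. This is an illustration under simplifying assumptions (fixed size, linear probing, no deletion); TinySet and SameHash are invented names, and CPython's real set is laid out differently.

```python
class TinySet:
    """Toy open-addressing hash set; NOT how CPython's set is laid out."""

    def __init__(self, size=8):
        # Each slot is None or a (hash, item) pair. No resizing: keep it small.
        self._slots = [None] * size

    def _probe(self, h):
        # Visit every slot once, starting at h % size (linear probing).
        size = len(self._slots)
        for step in range(size):
            yield (h + step) % size

    def add(self, item):
        h = hash(item)
        for i in self._probe(h):
            slot = self._slots[i]
            if slot is None:
                self._slots[i] = (h, item)
                return
            if slot[1] is item or (slot[0] == h and slot[1] == item):
                return  # already present
        raise RuntimeError("toy table is full")

    def __contains__(self, item):
        h = hash(item)
        for i in self._probe(h):
            slot = self._slots[i]
            if slot is None:
                return False          # empty slot: item was never stored
            stored_hash, stored = slot
            if stored is item:
                return True           # identity implies membership
            if stored_hash == h and stored == item:
                return True           # same hash, then __eq__ decides
        return False


class SameHash:
    """Every instance hashes alike; equality is left at the identity default."""

    def __hash__(self):
        return 1234


a, b = SameHash(), SameHash()
s = TinySet()
s.add(a)
print(a in s)  # True
print(b in s)  # False: same hash, but a == b is False
```

The second lookup fails for the same reason as in the question: an equal hash only gets the candidate as far as the == test, which the default identity-based __eq__ then rejects.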

Does comparing using `==` compare identities before comparing values?

If I compare two variables using ==, does Python compare the identities, and, if they're not the same, then compare the values?
For example, I have two strings which point to the same string object:
>>> a = 'a sequence of chars'
>>> b = a
Does this compare the values, or just the ids?:
>>> b == a
True
It would make sense to compare identity first, and I guess that is the case, but I haven't yet found anything in the documentation to support this. The closest I've got is this:
x==y calls x.__eq__(y)
which doesn't tell me whether anything is done before calling x.__eq__(y).

For user-defined class instances, is is used as a fallback - where the default __eq__ isn't overridden, a == b is evaluated as a is b. This ensures that the comparison will always have a result (except in the NotImplemented case, where comparison is explicitly forbidden).
This is (somewhat obliquely - good spot Sven Marnach) referred to in the data model documentation (emphasis mine):
User-defined classes have __eq__() and __hash__() methods by
default; with them, all objects compare unequal (except with
themselves) and x.__hash__() returns an appropriate value such
that x == y implies both that x is y and hash(x) == hash(y).
You can demonstrate it as follows:
>>> class Unequal(object):
...     def __eq__(self, other):
...         return False
...
>>> ue = Unequal()
>>> ue is ue
True
>>> ue == ue
False
so __eq__ must be called before any identity check, but:
>>> class NoEqual(object):
...     pass
...
>>> ne = NoEqual()
>>> ne is ne
True
>>> ne == ne
True
so the identity check must be used where __eq__ isn't defined.
You can see this in the CPython implementation, which notes:
/* If neither object implements it, provide a sensible default
for == and !=, but raise an exception for ordering. */
The "sensible default" implemented is a C-level equality comparison of the pointers v and w, which will return whether or not they point to the same object.

In addition to the answer by @jonrsharpe: if the objects being compared implement __eq__, it would be wrong for Python to check for identity first.
Look at the following example:
>>> x = float('nan')
>>> x is x
True
>>> x == x
False
NaN is a specific thing that should never compare equal to itself; however, even in this case x is x should return True, because of the semantics of is.
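By contrast, the built-in containers do take an identity shortcut when they compare elements, which is why NaN behaves differently with == and with in. A short sketch:

```python
nan = float("nan")
other_nan = float("nan")

print(nan == nan)          # False: float.__eq__ is called, no identity shortcut
print(nan in [nan])        # True: list membership checks identity before ==
print(other_nan in [nan])  # False: different object, and == says unequal
print([nan] == [nan])      # True: element comparison also checks identity first
```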

Check if an object is order-able in python?

How can I check if an object is orderable/sortable in Python?
I'm trying to implement basic type checking for the __init__ method of my binary tree class, and I want to be able to check if the value of the node is orderable, and throw an error if it isn't. It's similar to checking for hashability in the implementation of a hashtable.
I'm trying to accomplish something similar to Haskell's (Ord a) => etc. qualifiers. Is there a similar check in Python?

If you want to know whether an object is sortable, you must check whether it implements the necessary comparison methods.
In Python 2.X there were two different ways to implement those methods:
the cmp method (the equivalent of compareTo in Java, for example)
    __cmp__(self, other): returns >0, 0 or <0 depending on whether self is greater than, equal to or less than other
the rich comparison methods
    __lt__, __gt__, __eq__, __le__, __ge__, __ne__
The sort() functions call these methods to make the necessary comparisons between instances (actually, sort only needs the __lt__ or __gt__ methods, but it's recommended to implement all of them).
In Python 3.X, __cmp__ was removed in favor of the rich comparison methods, as having more than one way to do the same thing goes against Python's "laws".
So you basically need a function that checks whether these methods are implemented by a class:
# Python 2.X
def is_sortable(obj):
    return hasattr(obj, "__cmp__") or \
           hasattr(obj, "__lt__") or \
           hasattr(obj, "__gt__")
# Python 3.X
def is_sortable(obj):
    cls = obj.__class__
    return cls.__lt__ != object.__lt__ or \
           cls.__gt__ != object.__gt__
Different functions are needed for Python 2 and 3 because a lot of other things changed in Python 3 as well, including unbound methods, method-wrappers and other internals.
Read these links if you want a better understanding of sortable objects in Python:
http://python3porting.com/problems.html#unorderable-types-cmp-and-cmp
http://docs.python.org/2/howto/sorting.html#the-old-way-using-the-cmp-parameter
PS: this was a complete re-edit of my first answer, but it was needed as I investigated the problem better and had a cleaner idea about it :)
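On the "implement all of them" advice: in Python 3 the standard library's functools.total_ordering can derive the remaining ordering methods from __eq__ plus one of __lt__, __le__, __gt__ or __ge__. A minimal sketch, where Version is a hypothetical class made up for this example:

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical example type ordered by a (major, minor) pair."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return "Version(%d, %d)" % self._key()


versions = [Version(2, 0), Version(1, 5), Version(1, 10)]
print(sorted(versions))                # [Version(1, 5), Version(1, 10), Version(2, 0)]
print(Version(1, 5) >= Version(1, 5))  # True, __ge__ was filled in by the decorator
```

With this class the Python 3 is_sortable check above returns True, since Version.__lt__ is no longer object.__lt__.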

While the explanations in answers already here address runtime type inspection, here's how the static types are annotated by typeshed. They start by defining a collection of comparison Protocols, e.g.
class SupportsDunderLT(Protocol):
    def __lt__(self, __other: Any) -> bool: ...
which are then collected into rich comparison sum types, such as
SupportsRichComparison = Union[SupportsDunderLT, SupportsDunderGT]
SupportsRichComparisonT = TypeVar("SupportsRichComparisonT", bound=SupportsRichComparison)
then finally these are used to type e.g. the key functions of list.sort:
@overload
def sort(self: list[SupportsRichComparisonT], *, key: None = ..., reverse: bool = ...) -> None: ...
@overload
def sort(self, *, key: Callable[[_T], SupportsRichComparison], reverse: bool = ...) -> None: ...
and sorted:
@overload
def sorted(
    __iterable: Iterable[SupportsRichComparisonT], *, key: None = ..., reverse: bool = ...
) -> list[SupportsRichComparisonT]: ...
@overload
def sorted(__iterable: Iterable[_T], *, key: Callable[[_T], SupportsRichComparison], reverse: bool = ...) -> list[_T]: ...
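The same pattern can be used in your own code. The sketch below defines its own protocol rather than importing typeshed's private one; SupportsLessThan and smallest are hypothetical names chosen for this example, and the constraint is only checked by a static type checker, not at runtime.

```python
from typing import Any, Iterable, Protocol, TypeVar


class SupportsLessThan(Protocol):
    """Hypothetical protocol, modelled on typeshed's SupportsDunderLT."""

    def __lt__(self, other: Any) -> bool: ...


T = TypeVar("T", bound=SupportsLessThan)


def smallest(items: Iterable[T]) -> T:
    """Return the least item, using only the < operator."""
    iterator = iter(items)
    try:
        best = next(iterator)
    except StopIteration:
        raise ValueError("smallest() arg is an empty iterable") from None
    for item in iterator:
        if item < best:
            best = item
    return best


print(smallest([3, 1, 2]))        # 1
print(smallest(["pear", "fig"]))  # fig
```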

Regrettably, it is not enough to check that your object implements __lt__.
numpy uses the '<' operator to return an array of Booleans, which has no truth value. SQLAlchemy uses it to return a query filter, which again has no truth value.
Ordinary sets use it to check for a subset relationship, so that
>>> set1 = {1,2}
>>> set2 = {2,3}
>>> set1 == set2
False
>>> set1 < set2
False
>>> set1 > set2
False
The best partial solution I could think of (starting from a single object of unknown type) is this, but with rich comparisons it seems to be officially impossible to determine orderability:
if hasattr(x, '__lt__'):
    try:
        isOrderable = ( ((x == x) is True) and ((x > x) is False)
                        and not isinstance(x, (set, frozenset)) )
    except Exception:
        isOrderable = False
else:
    isOrderable = False
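When two concrete values are at hand, a pairwise variant of the same idea is possible: try the comparison and see whether it yields a plain bool. This is only a sketch with a hypothetical helper name, can_order, and it shares the limitation described above, since it cannot tell a total order from a partial one.

```python
def can_order(a, b):
    """Return True if `a < b` evaluates to a plain bool without raising.

    Hypothetical helper: it checks one concrete pair of values, and it cannot
    tell a total order from a partial one (sets pass, for example).
    """
    try:
        result = a < b
    except TypeError:
        return False
    return isinstance(result, bool)


print(can_order(1, 2.5))      # True
print(can_order(1, "one"))    # False: int and str are unorderable in Python 3
print(can_order(None, None))  # False
print(can_order({1}, {2}))    # True, although sets are only partially ordered
```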

Edited
As far as I know, in Python 2 all lists are sortable, so if you want to know whether a list is "sortable", the answer is yes, no matter what elements it has.
class C:
    def __init__(self):
        self.a = 5
        self.b = "asd"
c = C()
d = True
list1 = ["abc", "aad", c, 1, "b", 2, d]
list1.sort()
print list1
>>> [<__main__.C instance at 0x0000000002B7DF08>, 1, True, 2, 'aad', 'abc', 'b']
You could decide which types you consider "sortable" and implement a function that verifies whether all elements in the list are "sortable", something like this:
def isSortable(list1):
    types = [int, float, str]
    res = True
    for e in list1:
        res = res and (type(e) in types)
    return res
print isSortable([1,2,3.0, "asd", [1,2,3]])

How is __eq__ handled in Python and in what order?

Since Python does not provide left/right versions of its comparison operators, how does it decide which function to call?
class A(object):
    def __eq__(self, other):
        print "A __eq__ called"
        return self.value == other
class B(object):
    def __eq__(self, other):
        print "B __eq__ called"
        return self.value == other
>>> a = A()
>>> a.value = 3
>>> b = B()
>>> b.value = 4
>>> a == b
"A __eq__ called"
"B __eq__ called"
False
This seems to call both __eq__ functions.
I am looking for the official decision tree.

The a == b expression invokes A.__eq__, since it exists. Its code includes self.value == other. Since ints don't know how to compare themselves to B instances, Python tries invoking B.__eq__ to see if it knows how to compare itself to an int.
If you amend your code to show what values are being compared:
class A(object):
    def __eq__(self, other):
        print("A __eq__ called: %r == %r ?" % (self, other))
        return self.value == other
class B(object):
    def __eq__(self, other):
        print("B __eq__ called: %r == %r ?" % (self, other))
        return self.value == other
a = A()
a.value = 3
b = B()
b.value = 4
a == b
it will print:
A __eq__ called: <__main__.A object at 0x013BA070> == <__main__.B object at 0x013BA090> ?
B __eq__ called: <__main__.B object at 0x013BA090> == 3 ?

When Python 2.x sees a == b, it tries the following.
If type(b) is a new-style class, and type(b) is a subclass of type(a), and type(b) has overridden __eq__, then the result is b.__eq__(a).
If type(a) has overridden __eq__ (that is, type(a).__eq__ isn't object.__eq__), then the result is a.__eq__(b).
If type(b) has overridden __eq__, then the result is b.__eq__(a).
If none of the above are the case, Python repeats the process looking for __cmp__. If it exists, the objects are equal iff it returns zero.
As a final fallback, Python calls object.__eq__(a, b), which is True iff a and b are the same object.
If any of the special methods return NotImplemented, Python acts as though the method didn't exist.
Note that last step carefully: if neither a nor b overloads ==, then a == b is the same as a is b.
From https://eev.ee/blog/2012/03/24/python-faq-equality/

Python 3 Changes/Updates for this algorithm
How is __eq__ handled in Python and in what order?
a == b
It is generally understood, but not always the case, that a == b invokes a.__eq__(b), or type(a).__eq__(a, b).
Explicitly, the order of evaluation is:
if b's type is a strict subclass (not the same type) of a's type and has an __eq__, call it and return the value if the comparison is implemented,
else, if a has __eq__, call it and return it if the comparison is implemented,
else, see if we didn't call b's __eq__ and it has it, then call and return it if the comparison is implemented,
else, finally, do the comparison for identity, the same comparison as is.
We know if a comparison isn't implemented if the method returns NotImplemented.
(In Python 2, there was a __cmp__ method that was looked for, but it was deprecated and removed in Python 3.)
Let's test the first check's behavior for ourselves by letting B subclass A, which shows that the accepted answer is wrong on this count:
class A:
    value = 3
    def __eq__(self, other):
        print('A __eq__ called')
        return self.value == other.value
class B(A):
    value = 4
    def __eq__(self, other):
        print('B __eq__ called')
        return self.value == other.value
a, b = A(), B()
a == b
which only prints B __eq__ called before returning False.
Note that I also correct a small error in the question where self.value is compared to other instead of other.value - in this comparison, we get two objects (self and other), usually of the same type since we are doing no type-checking here (but they can be of different types), and we need to know if they are equal. Our measure of whether or not they are equal is to check the value attribute, which must be done on both objects.
How do we know this full algorithm?
The other answers here seem incomplete and out of date, so I'm going to update the information and show you how you could look this up for yourself.
This is handled at the C level.
We need to look at two different bits of code here - the default __eq__ for objects of class object, and the code that looks up and calls the __eq__ method regardless of whether it uses the default __eq__ or a custom one.
Default __eq__
Looking __eq__ up in the relevant C api docs shows us that __eq__ is handled by tp_richcompare - which in the "object" type definition in cpython/Objects/typeobject.c is defined in object_richcompare for case Py_EQ:.
case Py_EQ:
    /* Return NotImplemented instead of False, so if two
       objects are compared, both get a chance at the
       comparison.  See issue #1393. */
    res = (self == other) ? Py_True : Py_NotImplemented;
    Py_INCREF(res);
    break;
So here, if self == other we return True, else we return the NotImplemented object. This is the default behavior for any subclass of object that does not implement its own __eq__ method.
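This default can be observed from Python without reading the C source. A short sketch, where Plain is just an empty example class:

```python
class Plain:
    """A class that inherits object's default __eq__."""


a, b = Plain(), Plain()

print(a.__eq__(a))  # True: same object
print(a.__eq__(b))  # NotImplemented: the default declines rather than say False
print(a == b)       # False: the operator's identity fallback supplies the answer
print(a != b)       # True
```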
How __eq__ gets called
Next, the C API docs lead us to the PyObject_RichCompare function, which calls do_richcompare.
The tp_richcompare function created for the "object" C definition is in turn called by do_richcompare, so let's look at that a little more closely.
The first check in this function is whether the objects being compared:
are not the same type, but
the second's type is a subclass of the first's type, and
the second's type has an __eq__ method.
If so, it calls the second's method with the arguments swapped, returning the value if implemented. If that method isn't implemented, we continue...
if (!Py_IS_TYPE(v, Py_TYPE(w)) &&
    PyType_IsSubtype(Py_TYPE(w), Py_TYPE(v)) &&
    (f = Py_TYPE(w)->tp_richcompare) != NULL) {
    checked_reverse_op = 1;
    res = (*f)(w, v, _Py_SwappedOp[op]);
    if (res != Py_NotImplemented)
        return res;
    Py_DECREF(res);
}
Next we see if we can lookup the __eq__ method from the first type and call it.
As long as the result is not NotImplemented, that is, it is implemented, we return it.
if ((f = Py_TYPE(v)->tp_richcompare) != NULL) {
    res = (*f)(v, w, op);
    if (res != Py_NotImplemented)
        return res;
    Py_DECREF(res);
}
Else if we didn't try the other type's method and it's there, we then try it, and if the comparison is implemented, we return it.
if (!checked_reverse_op && (f = Py_TYPE(w)->tp_richcompare) != NULL) {
    res = (*f)(w, v, _Py_SwappedOp[op]);
    if (res != Py_NotImplemented)
        return res;
    Py_DECREF(res);
}
Finally, we get a fallback in case it isn't implemented for either one's type.
The fallback checks for the identity of the object, that is, whether it is the same object at the same place in memory - this is the same check as for self is other:
/* If neither object implements it, provide a sensible default
   for == and !=, but raise an exception for ordering. */
switch (op) {
case Py_EQ:
    res = (v == w) ? Py_True : Py_False;
    break;
Conclusion
In a comparison, we respect the subclass implementation of comparison first.
Then we attempt the comparison with the first object's implementation, then with the second's if it wasn't called.
Finally, for equality, we fall back to a test for identity.
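Pulling the steps together, here is a rough Python rendition of the C logic quoted above. It is a sketch rather than the interpreter's actual code: rich_compare and _SWAPPED are names made up here, and looking the method up with getattr on the type only approximates the tp_richcompare slot.

```python
# Method names and their reflected counterparts, mirroring _Py_SwappedOp.
_SWAPPED = {"__lt__": "__gt__", "__le__": "__ge__", "__eq__": "__eq__",
            "__ne__": "__ne__", "__gt__": "__lt__", "__ge__": "__le__"}


def rich_compare(v, w, name):
    """Approximate do_richcompare for the method called `name` (e.g. "__eq__")."""
    tv, tw = type(v), type(w)
    checked_reverse = False

    # 1. A strict subclass on the right gets the first try.
    if tv is not tw and issubclass(tw, tv):
        checked_reverse = True
        res = getattr(tw, _SWAPPED[name])(w, v)
        if res is not NotImplemented:
            return res

    # 2. The left operand's method.
    res = getattr(tv, name)(v, w)
    if res is not NotImplemented:
        return res

    # 3. The right operand's reflected method, unless already tried.
    if not checked_reverse:
        res = getattr(tw, _SWAPPED[name])(w, v)
        if res is not NotImplemented:
            return res

    # 4. Identity fallback for == and !=; ordering raises TypeError.
    if name == "__eq__":
        return v is w
    if name == "__ne__":
        return v is not w
    raise TypeError("%r not supported between instances of %r and %r"
                    % (name, tv.__name__, tw.__name__))


class A:
    def __eq__(self, other):
        return "A.__eq__"


class B(A):
    def __eq__(self, other):
        return "B.__eq__"


print(rich_compare(A(), B(), "__eq__"))  # B.__eq__
print(rich_compare(1, "a", "__eq__"))    # False
```

Calling rich_compare(1, "a", "__lt__") reaches step 4 and raises TypeError, which matches what 1 < "a" does in Python 3.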

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.
