Python 2.x has two ways to overload comparison operators, __cmp__ or the "rich comparison operators" such as __lt__. The rich comparison overloads are said to be preferred, but why is this so?
Each rich comparison operator is simple to implement on its own, but you must implement several of them with nearly identical logic. However, if you can use the builtin cmp and tuple ordering, then __cmp__ becomes quite simple and covers all the comparisons:
class A(object):
    def __init__(self, name, age, other):
        self.name = name
        self.age = age
        self.other = other
    def __cmp__(self, other):
        assert isinstance(other, A) # assumption for this example
        return cmp((self.name, self.age, self.other),
                   (other.name, other.age, other.other))
This simplicity seems to meet my needs much better than overloading all 6(!) of the rich comparisons. (However, you can get it down to "just" 4 if you rely on the "swapped argument"/reflected behavior, but that results in a net increase of complication, in my humble opinion.)
Are there any unforeseen pitfalls I need to be made aware of if I only overload __cmp__?
I understand the <, <=, ==, etc. operators can be overloaded for other purposes, and can return any object they like. I am not asking about the merits of that approach, but only about differences when using these operators for comparisons in the same sense that they mean for numbers.
Update: As Christopher pointed out, cmp is disappearing in 3.x. Are there any alternatives that make implementing comparisons as easy as the above __cmp__?
Yep, it's easy to implement everything in terms of e.g. __lt__ with a mixin class (or a metaclass, or a class decorator if your taste runs that way).
For example:
class ComparableMixin:
    def __eq__(self, other):
        return not self<other and not other<self
    def __ne__(self, other):
        return self<other or other<self
    def __gt__(self, other):
        return other<self
    def __ge__(self, other):
        return not self<other
    def __le__(self, other):
        return not other<self
Now your class can define just __lt__ and multiply inherit from ComparableMixin (after whatever other bases it needs, if any). A class decorator would be quite similar, just inserting similar functions as attributes of the new class it's decorating (the result might be microscopically faster at runtime, at equally minute cost in terms of memory).
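As a rough sketch of how the mixin might be used, assume a hypothetical Version class (the class and its fields are invented here for illustration) that writes only __lt__ by hand. The mixin is restated so the snippet runs on its own:

```python
class ComparableMixin(object):
    """Derive the other five comparisons from __lt__ alone."""
    def __eq__(self, other):
        return not self < other and not other < self
    def __ne__(self, other):
        return self < other or other < self
    def __gt__(self, other):
        return other < self
    def __ge__(self, other):
        return not self < other
    def __le__(self, other):
        return not other < self


class Version(ComparableMixin):
    # Hypothetical example class: only __lt__ is defined here.
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)


versions = [Version(2, 0), Version(1, 3), Version(1, 2)]
ordered = [(v.major, v.minor) for v in sorted(versions)]
```

Note that in Python 3 a class that defines __eq__ without __hash__ becomes unhashable, so this sketch does not support using Version as a dict key.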
Of course, if your class has some particularly fast way to implement (e.g.) __eq__ and __ne__, it should define them directly so the mixin's versions are not used (for example, that is the case for dict) -- in fact __ne__ might well be defined to facilitate that as:
def __ne__(self, other):
    return not self == other
but in the code above I wanted to keep the pleasing symmetry of only using <;-).
As to why __cmp__ had to go, since we did have __lt__ and friends, why keep another, different way to do exactly the same thing around? It's just so much dead-weight in every Python runtime (Classic, Jython, IronPython, PyPy, ...). The code that definitely won't have bugs is the code that isn't there -- whence Python's principle that there ought to be ideally one obvious way to perform a task (C has the same principle in the "Spirit of C" section of the ISO standard, btw).
This doesn't mean we go out of our way to prohibit things (e.g., near-equivalence between mixins and class decorators for some uses), but it definitely does mean that we don't like to carry around code in the compilers and/or runtimes that redundantly exists just to support multiple equivalent approaches to perform exactly the same task.
Further edit: there's actually an even better way to provide comparison AND hashing for many classes, including the one in the question -- a __key__ method, as I mentioned in my comment on the question. Since I never got around to writing the PEP for it, you must currently implement it with a mixin (&c) if you like it:
class KeyedMixin:
    def __lt__(self, other):
        return self.__key__() < other.__key__()
    # and so on for other comparators, as above, plus:
    def __hash__(self):
        return hash(self.__key__())
It's a very common case for an instance's comparisons with other instances to boil down to comparing a tuple for each with a few fields -- and then, hashing should be implemented on exactly the same basis. The __key__ special method addresses that need directly.
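For what it's worth, here is one way the complete mixin could look when applied to the class from the question. Keep in mind that __key__ is not a real Python special method, only the convention proposed in this answer, and that spelling out all six comparators is just one possible reading of "and so on":

```python
class KeyedMixin(object):
    """Comparisons and hashing all delegate to a user-supplied __key__()."""
    def __lt__(self, other):
        return self.__key__() < other.__key__()
    def __le__(self, other):
        return self.__key__() <= other.__key__()
    def __eq__(self, other):
        return self.__key__() == other.__key__()
    def __ne__(self, other):
        return self.__key__() != other.__key__()
    def __gt__(self, other):
        return self.__key__() > other.__key__()
    def __ge__(self, other):
        return self.__key__() >= other.__key__()
    def __hash__(self):
        # Hash the same tuple that equality compares.
        return hash(self.__key__())


class A(KeyedMixin):
    def __init__(self, name, age, other):
        self.name = name
        self.age = age
        self.other = other

    def __key__(self):
        return (self.name, self.age, self.other)


a1 = A("ann", 30, "x")
a2 = A("ann", 30, "x")
b = A("bob", 25, "y")
unique = {a1, a2, b}  # a1 and a2 collapse into one set entry
```

Because equality and hashing use the same key, equal instances always hash alike, which is what sets and dicts require.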
To simplify this case there's a class decorator in Python 2.7+/3.2+, functools.total_ordering, that can be used to implement what Alex suggests. Example from the docs:
@total_ordering
class Student:
    def __eq__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) ==
                (other.lastname.lower(), other.firstname.lower()))
    def __lt__(self, other):
        return ((self.lastname.lower(), self.firstname.lower()) <
                (other.lastname.lower(), other.firstname.lower()))
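To see the decorator filling in the missing operators, here is a small self-contained variant of that example. The constructor and the _key helper are assumptions added for illustration; they are not part of the docs example:

```python
from functools import total_ordering


@total_ordering
class Student(object):
    def __init__(self, firstname, lastname):  # assumed constructor
        self.firstname = firstname
        self.lastname = lastname

    def _key(self):
        # Hypothetical helper: case-insensitive (last, first) sort key.
        return (self.lastname.lower(), self.firstname.lower())

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()


alice = Student("Alice", "Zhang")
bob = Student("bob", "adams")

# __gt__, __ge__ and __le__ were never written; total_ordering supplies them.
results = (bob < alice, alice > bob, alice >= bob, bob <= alice)
```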
This is covered by PEP 207 - Rich Comparisons
Also, __cmp__ goes away in Python 3.0. (Note that it is not present on http://docs.python.org/3.0/reference/datamodel.html but it IS on http://docs.python.org/2.7/reference/datamodel.html.)
(Edited 6/17/17 to take comments into account.)
I tried out the comparable mixin answer above and ran into trouble with None. Here is a modified version that handles equality comparisons with None. (I saw no reason to bother with ordering comparisons against None, since they have no meaningful semantics.)
class ComparableMixin(object):
    def __eq__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented
        else:
            return not self<other and not other<self
    def __ne__(self, other):
        # go through the == operator; a bare __eq__(self, other) call would raise NameError
        return not self == other
    def __gt__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented
        else:
            return other<self
    def __ge__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented
        else:
            return not self<other
    def __le__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented
        else:
            return not other<self
Inspired by Alex Martelli's ComparableMixin & KeyedMixin answers, I came up with the following mixin. It allows you to implement a single _compare_to() method, which uses key-based comparisons similar to KeyedMixin, but allows your class to pick the most efficient comparison key based on the type of other. (Note that this mixin doesn't help much for objects which can be tested for equality but not ordered.)
class ComparableMixin(object):
    """mixin which implements rich comparison operators in terms of a single _compare_to() helper"""
    def _compare_to(self, other):
        """return keys to compare self to other.
        if self and other are comparable, this function
        should return ``(self key, other key)``.
        if they aren't, it should return ``None`` instead.
        """
        raise NotImplementedError("_compare_to() must be implemented by subclass")
    def __eq__(self, other):
        keys = self._compare_to(other)
        return keys[0] == keys[1] if keys else NotImplemented
    def __ne__(self, other):
        return not self == other
    def __lt__(self, other):
        keys = self._compare_to(other)
        return keys[0] < keys[1] if keys else NotImplemented
    def __le__(self, other):
        keys = self._compare_to(other)
        return keys[0] <= keys[1] if keys else NotImplemented
    def __gt__(self, other):
        keys = self._compare_to(other)
        return keys[0] > keys[1] if keys else NotImplemented
    def __ge__(self, other):
        keys = self._compare_to(other)
        return keys[0] >= keys[1] if keys else NotImplemented
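A quick sketch of how a subclass might use this, with a hypothetical Temperature class (invented for illustration) that compares against other Temperature objects and against plain numbers. The mixin is repeated in compact form so the snippet is runnable by itself:

```python
import numbers


class ComparableMixin(object):
    """Rich comparisons in terms of a single _compare_to() helper."""
    def _compare_to(self, other):
        raise NotImplementedError("_compare_to() must be implemented by subclass")
    def __eq__(self, other):
        keys = self._compare_to(other)
        return keys[0] == keys[1] if keys else NotImplemented
    def __ne__(self, other):
        return not self == other
    def __lt__(self, other):
        keys = self._compare_to(other)
        return keys[0] < keys[1] if keys else NotImplemented
    def __le__(self, other):
        keys = self._compare_to(other)
        return keys[0] <= keys[1] if keys else NotImplemented
    def __gt__(self, other):
        keys = self._compare_to(other)
        return keys[0] > keys[1] if keys else NotImplemented
    def __ge__(self, other):
        keys = self._compare_to(other)
        return keys[0] >= keys[1] if keys else NotImplemented


class Temperature(ComparableMixin):
    # Hypothetical class, used only to exercise the mixin.
    def __init__(self, degrees):
        self.degrees = degrees

    def _compare_to(self, other):
        if isinstance(other, Temperature):
            return self.degrees, other.degrees
        if isinstance(other, numbers.Real):
            return self.degrees, other
        return None  # not comparable: the operators return NotImplemented


cool, warm = Temperature(20), Temperature(25)
```

Since the operators return NotImplemented for unsupported types, Python falls back to the reflected method on the other operand, so an expression like 30 > cool also works.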
Related
I am trying to sort a list of strings in a way that uses a special comparison. I am trying to use functools.total_ordering, but I'm not sure whether it's filling out the undefined comparisons correctly.
The two I define (> and ==) work as expected, but < does not. In particular, I print all three and I get that a > b and a < b. How is this possible? I would think that total_ordering would simply define < as not > and not ==. The result of my < test is what you would get with regular str comparison, leading me to believe that total_ordering isn't doing anything.
Perhaps the problem is that I am inheriting str, which already has __lt__ implemented? If so, is there a fix to this issue?
from functools import total_ordering
@total_ordering
class SortableStr(str):
    def __gt__(self, other):
        return self+other > other+self
    #Is this necessary? Or will default to inherited class?
    def __eq__(self, other):
        return str(self) == str(other)
def main():
    a = SortableStr("99")
    b = SortableStr("994")
    print(a > b)
    print(a == b)
    print(a < b)
if __name__ == "__main__":
    main()
OUTPUT:
True
False
True
You're right that the built-in str comparison operators are interfering with your code. From the docs:
Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest.
So it only supplies the ones not already defined. In your case, total_ordering cannot tell that some of them are merely inherited from a parent class rather than defined by you, so it leaves them alone.
Now, we can dig deeper into the source code and find the exact check:
roots = {op for op in _convert if getattr(cls, op, None) is not getattr(object, op, None)}
So it checks whether each method is the very same one defined on the root class, object. We can make that happen:
@total_ordering
class SortableStr(str):
    __lt__ = object.__lt__
    __le__ = object.__le__
    __ge__ = object.__ge__
    def __gt__(self, other):
        return self+other > other+self
    #Is this necessary? Or will default to inherited class?
    def __eq__(self, other):
        return str(self) == str(other)
Now total_ordering will see that __lt__, __le__, and __ge__ are equal to the "original" object values and overwrite them, as desired.
This all being said, I would argue that this is a poor use of inheritance. You're violating Liskov substitution at the very least, in that mixed comparisons between str and SortableStr are going to, to put it lightly, produce counterintuitive results.
My more general recommendation is to favor composition over inheritance and, rather than defining a thing that "is a" specialized string, consider defining a type that "contains" a string and has specialized behavior.
@total_ordering
class SortableStr:
    def __init__(self, value):
        self.value = value
    def __gt__(self, other):
        return self.value + other.value > other.value + self.value
    def __eq__(self, other):
        return self.value == other.value
There, no magic required. Now SortableStr("99") is a valid object that is not a string but exhibits the behavior you want.
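As a quick check of the composition approach, the following sketch repeats the wrapper class and runs the three comparisons from the question, plus a sort. The sample values "3", "99" and "994" are chosen here for illustration:

```python
from functools import total_ordering


@total_ordering
class SortableStr(object):
    """Wraps a string instead of inheriting from str."""
    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        return self.value + other.value > other.value + self.value

    def __eq__(self, other):
        return self.value == other.value


a = SortableStr("99")
b = SortableStr("994")

# With no inherited str.__lt__ in the way, < is derived from __gt__ and __eq__.
comparisons = (a > b, a == b, a < b)

items = [SortableStr(s) for s in ["994", "3", "99"]]
largest_first = [s.value for s in sorted(items, reverse=True)]
```

One caveat with this comparison rule: two different strings whose concatenations tie (for example "9" and "99") are neither greater than nor equal to each other, so the derived < would not be consistent for such pairs.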
Not sure if this is correct, but glancing at the documentation of functools.total_ordering, this stands out to me:
Given a class defining one or more rich comparison ordering methods,
this class decorator supplies the rest.
Emphasis mine. Your class inherits __lt__ from str, so it does not get re-implemented by total_ordering since it isn't missing. That's my best guess.
The following piece of code
class point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def dispc(self):
        return ('(' + str(self.x) + ',' + str(self.y) + ')')
    def __cmp__(self, other):
        return ((self.x > other.x) and (self.y > other.y))
works fine in Python 2, but in Python 3 I get an error:
>>> p=point(2,3)
>>> q=point(3,4)
>>> p>q
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: point() > point()
It only works for == and !=.
You need to provide the rich comparison methods for ordering in Python 3, which are __lt__, __gt__, __le__, __ge__, __eq__, and __ne__. See also: PEP 207 -- Rich Comparisons.
__cmp__ is no longer used.
More specifically, __lt__ takes self and other as arguments, and needs to return whether self is less than other. For example:
class Point(object):
    ...
    def __lt__(self, other):
        return ((self.x < other.x) and (self.y < other.y))
(This isn't a sensible comparison implementation, but it's hard to tell what you were going for.)
So if you have the following situation:
p1 = Point(1, 2)
p2 = Point(3, 4)
p1 < p2
This will be equivalent to:
p1.__lt__(p2)
which would return True.
__eq__ would return True if the points are equal and False otherwise. The other methods work analogously.
If you use the functools.total_ordering decorator, you only need to implement e.g. the __lt__ and __eq__ methods:
from functools import total_ordering
@total_ordering
class Point(object):
    def __lt__(self, other):
        ...
    def __eq__(self, other):
        ...
This was a major and deliberate change in Python 3. The "What's New In Python 3.0" document gives more details:
The ordering comparison operators (<, <=, >=, >) raise a TypeError exception when the operands don’t have a meaningful natural ordering. Thus, expressions like 1 < '', 0 > None or len <= len are no longer valid, and e.g. None < None raises TypeError instead of returning False. A corollary is that sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other. Note that this does not apply to the == and != operators: objects of different incomparable types always compare unequal to each other.
builtin.sorted() and list.sort() no longer accept the cmp argument providing a comparison function. Use the key argument instead. N.B. the key and reverse arguments are now “keyword-only”.
The cmp() function should be treated as gone, and the __cmp__() special method is no longer supported. Use __lt__() for sorting, __eq__() with __hash__(), and other rich comparisons as needed. (If you really need the cmp() functionality, you could use the expression (a > b) - (a < b) as the equivalent for cmp(a, b).)
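To make that concrete, here is a small sketch showing the cmp() replacement expression and two ways to port an old comparison-function sort: functools.cmp_to_key (available since Python 2.7/3.2) and a plain key function. The comparison function and word list are made up for the example:

```python
from functools import cmp_to_key


def cmp(a, b):
    """Stand-in for the removed builtin: returns -1, 0 or 1."""
    return (a > b) - (a < b)


def by_length_then_alpha(a, b):
    # Hypothetical old-style comparison function: shorter words first,
    # ties broken alphabetically.
    return cmp(len(a), len(b)) or cmp(a, b)


words = ["pear", "fig", "apple", "kiwi"]

# Option 1: wrap the old comparison function.
old_style = sorted(words, key=cmp_to_key(by_length_then_alpha))

# Option 2: express the same ordering directly as a key.
new_style = sorted(words, key=lambda w: (len(w), w))
```

Both calls give the same order; the key version is usually the simpler one when the ordering can be written as a tuple.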
In Python 3, the six rich comparison operators
__lt__(self, other)
__le__(self, other)
__eq__(self, other)
__ne__(self, other)
__gt__(self, other)
__ge__(self, other)
must be provided individually. This can be abbreviated by using functools.total_ordering.
In practice, however, this often turns out rather unreadable and impractical: you still have to put similar pieces of code in two functions, or use a further helper function.
So mostly I prefer to use the mixin class PY3__cmp__ shown below. This reestablishes the single __cmp__ method framework, which was and is quite clear and practical in most cases. One can still override selected rich comparisons.
Your example would just become:
class point(PY3__cmp__):
    ...
    # unchanged code
The PY3__cmp__ mixin class:
import sys

PY3 = sys.version_info[0] >= 3
if PY3:
    def cmp(a, b):
        return (a > b) - (a < b)
    # mixin class for Python3 supporting __cmp__
    class PY3__cmp__:
        def __eq__(self, other):
            return self.__cmp__(other) == 0
        def __ne__(self, other):
            return self.__cmp__(other) != 0
        def __gt__(self, other):
            return self.__cmp__(other) > 0
        def __lt__(self, other):
            return self.__cmp__(other) < 0
        def __ge__(self, other):
            return self.__cmp__(other) >= 0
        def __le__(self, other):
            return self.__cmp__(other) <= 0
else:
    class PY3__cmp__:
        pass
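A minimal usage sketch follows, written for Python 3 only and with the mixin repeated so it runs standalone. One assumption to flag: the __cmp__ below returns a proper negative/zero/positive value using lexicographic (x, y) ordering, which is my choice for the example; the __cmp__ in the question returns a bool, which does not give a consistent ordering.

```python
def cmp(a, b):
    """Stand-in for the removed builtin cmp()."""
    return (a > b) - (a < b)


class PY3__cmp__(object):
    """Maps the six rich comparisons onto a single __cmp__ method."""
    def __eq__(self, other):
        return self.__cmp__(other) == 0
    def __ne__(self, other):
        return self.__cmp__(other) != 0
    def __gt__(self, other):
        return self.__cmp__(other) > 0
    def __lt__(self, other):
        return self.__cmp__(other) < 0
    def __ge__(self, other):
        return self.__cmp__(other) >= 0
    def __le__(self, other):
        return self.__cmp__(other) <= 0


class point(PY3__cmp__):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __cmp__(self, other):
        # Assumed ordering for this sketch: compare x first, then y.
        return cmp((self.x, self.y), (other.x, other.y))


p = point(2, 3)
q = point(3, 4)
ordered = [(pt.x, pt.y) for pt in sorted([q, p, point(2, 1)])]
```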
I have a question regarding shapely and the usage of the == operator. There exists a function to test equality of geometric objects: .equals(). However, == does not work.
Point((0, 2)).equals(Point((0, 2)))
returns True.
However:
Point((0, 2)) == Point((0, 2))
returns False
I would like to be able to use the == operator to check if a Point is already present in a list. One use case could be:
if Point not in list_of_points:
    list_of_points.append(Point)
As far as I understand, this does not work because == returns False. I know there is an alternative to in that uses the any() function, but I would prefer the in keyword:
if not any(Point.equals(point) for point in list_of_points):
    list_of_points.append(Point)
Would it be a large effort to implement __eq__ in the shapely/geometry/base.py?
What do you think of this naive implementation of __eq__?
class BaseGeometry(object):
    def __eq__(self, other):
        return self.equals(other)
or
class BaseGeometry(object):
    def __eq__(self, other):
        return bool(self.impl['equals'](self, other))
One side effect of implementing __eq__ is that a Point can no longer be a key in a dictionary. If you want that feature, you can add this:
def __hash__(self):
    return hash(id(self))
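To illustrate the interplay between __eq__, the in operator and hashing, here is a sketch using a hypothetical stand-in class Pt. It is not shapely's API; the class, its fields and its equals() method are invented for the example. It also hashes the coordinates rather than id(self), which is only safe under the assumption that the coordinates never change, but it keeps equal points in the same dict slot:

```python
class Pt(object):
    """Hypothetical stand-in for a geometry class; not shapely's API."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def equals(self, other):
        return (self.x, self.y) == (other.x, other.y)

    def __eq__(self, other):
        if not isinstance(other, Pt):
            return NotImplemented
        return self.equals(other)

    def __hash__(self):
        # Assumes coordinates are never mutated after construction.
        return hash((self.x, self.y))


points = [Pt(0, 2)]
candidate = Pt(0, 2)

# "in" uses ==, so the duplicate is detected and not appended.
if candidate not in points:
    points.append(candidate)

labels = {Pt(0, 2): "origin-ish"}
found = labels[Pt(0, 2)]  # an equal point finds the same entry
```

With hash(id(self)), two equal points would generally hash differently, so the dict lookup in the last line would typically fail.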
In the manual it says:
in general, __lt__() and __eq__() are sufficient, if you want the
conventional meanings of the comparison operators
But I see the error:
> assert 2 < three
E TypeError: unorderable types: int() < IntVar()
when I run this test:
from unittest import TestCase
class IntVar(object):
    def __init__(self, value=None):
        if value is not None: value = int(value)
        self.value = value
    def __int__(self):
        return self.value
    def __lt__(self, other):
        return self.value < other
    def __eq__(self, other):
        return self.value == other
    def __hash__(self):
        return hash(self.value)
class DynamicTest(TestCase):
    def test_lt(self):
        three = IntVar(3)
        assert three < 4
        assert 2 < three
        assert 3 == three
I am surprised that when IntVar() is on the right, __int__() is not being called. What am I doing wrong?
Adding __gt__() fixes this, but means I don't understand what the minimal requirements are for ordering...
Thanks,
Andrew
Python 3.x will never do any type coercions for operators, so __int__() is not used in this context. The comparison
a < b
will first try to call type(a).__lt__(a, b), and if this returns NotImplemented, it will call type(b).__gt__(b, a).
The quote from the documentation is about making comparisons work for a single type, and the above explanation shows why this would be enough for a single type.
To make your type interact correctly with int, you should either implement all the comparison operators, or use the total_ordering decorator available in Python 2.7 and in 3.2 onward.
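A sketch of the total_ordering route might look like the following. The _as_int helper is a hypothetical addition, not part of the question's code; it unwraps another IntVar and returns None for unsupported types so the operators can return NotImplemented:

```python
from functools import total_ordering


def _as_int(other):
    # Hypothetical helper: extract a plain int, or None if unsupported.
    if isinstance(other, IntVar):
        return other.value
    if isinstance(other, int):
        return other
    return None


@total_ordering
class IntVar(object):
    def __init__(self, value):
        self.value = int(value)

    def __lt__(self, other):
        value = _as_int(other)
        return NotImplemented if value is None else self.value < value

    def __eq__(self, other):
        value = _as_int(other)
        return NotImplemented if value is None else self.value == value

    def __hash__(self):
        return hash(self.value)


three = IntVar(3)

# 2 < three: int.__lt__ returns NotImplemented, so Python tries the
# reflected IntVar.__gt__(three, 2), which total_ordering generated.
checks = (three < 4, 2 < three, 3 == three, 4 >= three)
```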