How to override Integer in Python?

I want to inherit from integers and only redefine some methods.
The goal is to have this behaviour:
>>> i = Iter()
>>> i == 0
True
>>> next(i)
Iter<1>
>>> next(i)
Iter<2>
>>> i + 10
12
The naive approach would be to inherit from int:
class Iter(int):
    def __new__(cls, start=0, increment=1):
        return super().__new__(cls, start)
    def __init__(self, start=0, increment=1):
        self.increment = increment
    def __repr__(self):
        return f"Iter<{int(self)}>"
    def __next__(self):
        self += self.increment  # DOES NOT WORK: int is immutable
        return self.value
Unfortunately, int is immutable. I tried to use the ABC for integers, but I don't really want to redefine all operators:
from numbers import Integral

class Iter(Integral):
    ...

i = Iter()
TypeError: Can't instantiate abstract class Iter with abstract methods
__abs__, __add__, __and__, __ceil__, __eq__, __floor__,
__floordiv__, __int__, __invert__, __le__, __lshift__, __lt__,
__mod__, __mul__, __neg__, __or__, __pos__, __pow__, __radd__,
__rand__, __rfloordiv__, __rlshift__, __rmod__, __rmul__, __ror__,
__round__, __rpow__, __rrshift__, __rshift__, __rtruediv__,
__rxor__, __truediv__, __trunc__, __xor__
Any other ideas?

Are you aware that itertools.count exists? It does most of what you are trying to do, except for being able to use the instance itself as an integer.
Neither int nor itertools.count can usefully be extended for this purpose: int instances are immutable, and itertools.count does not behave like an integer.
Regarding the operator methods: even if extending int or count worked, you would still need to decide what they return, because there is no obvious answer. Should __add__ return a plain integer, or another Iter whose value is incremented by the given amount? You have to make that choice and implement the operators for your use case.
Rather than extending any existing builtin class, it may be easier to define your own. You can define the __int__ method to control what happens when int(...) is called on an instance of your class.
Example implementation:
class Iter:
    def __init__(self, start: int = 0, increment: int = 1):
        self._value = start
        self._increment = increment
    def __int__(self) -> int:
        return self._value
    def __repr__(self) -> str:
        return f'Iter<{self._value}>'
    def __next__(self) -> int:
        self._value += self._increment
        return self._value
Example use:
>>> i = Iter()
>>> next(i)
1
>>> next(i)
2
>>> i
Iter<2>
>>> int(i)
2
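If you also want the i == 0 and i + 10 behaviour from the question, here is a sketch extending this class with just the operators needed (returning plain ints from __add__ is an assumption; you could return a new Iter instead):
class Iter:
    def __init__(self, start: int = 0, increment: int = 1):
        self._value = start
        self._increment = increment
    def __int__(self) -> int:
        return self._value
    def __repr__(self) -> str:
        return f'Iter<{self._value}>'
    def __next__(self) -> int:
        self._value += self._increment
        return self._value
    def __eq__(self, other) -> bool:
        # compare by the current value, so Iter() == 0 is True
        return self._value == other
    def __add__(self, other: int) -> int:
        # returns a plain int, so Iter() + 10 gives an int
        return self._value + other
    __radd__ = __add__

i = Iter()
print(i == 0)   # True
print(next(i))  # 1
print(next(i))  # 2
print(i + 10)   # 12
print(10 + i)   # 12, via __radd__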

Related

How to subclass "float" without implementing its method?

I want to subclass float, but I don't want the value to be computed right away, and I don't want to have to call float() on my object explicitly.
In other words, I don't want to calculate anything before it is required; I only want an object that behaves like a float. Here is how I want to create the class:
class MassiveAverage(float):
    def __init__(self, floats: list[float]):
        self.floats = floats
    def __float__(self) -> float:
        return sum(self.floats) / len(self.floats)
And this is how I want to use it:
massive_average = MassiveAverage([1.1, 2.2])  # no calculation yet
massive_sum = massive_average * 2  # this is where it calculates its float value
For the answer to this question I am going to assume you are already familiar with Python's "magic methods". @gftea's answer has a link to the documentation for some of the magic methods if you are not familiar.
You are going to have to manually define each "magic function" __mul__, __add__, __sub__, etc.
class MassiveAverage:
    def __init__(self, floats):
        self._avg = sum(floats) / len(floats)
    def __mul__(self, other):
        return self._avg * other
    def __sub__(self, other):
        return self._avg - other
    def __add__(self, other):
        return self._avg + other
    ...
But this doesn't handle your lazy evaluation use case. Instead, we could maintain an internal cache, and the first time one of these magic methods is evaluated, we could run the average function.
class MassiveAverage:
    def __init__(self, floats):
        self._floats = floats
        self._avg = None
    @property
    def avg(self):
        if self._avg is None:
            self._avg = sum(self._floats) / len(self._floats)
        return self._avg
Then, we can replace our magic functions and use self.avg:
    def __mul__(self, other):
        return self.avg * other
    def __add__(self, other):
        return self.avg + other
    def __sub__(self, other):
        return self.avg - other
    ...
Unfortunately, you cannot subclass float in the manner you want. Because you are specifying lazy evaluation, you are fundamentally changing how the methods in the float class work (since they don't need lazy evaluation). You would still have to manually change each magic method.
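As a side note (not part of the original answer): on Python 3.8+ the None-sentinel cache above can also be written with functools.cached_property. A minimal sketch:
from functools import cached_property

class MassiveAverage:
    def __init__(self, floats):
        self._floats = floats
    @cached_property
    def avg(self):
        # computed on first access, then cached on the instance
        return sum(self._floats) / len(self._floats)
    def __mul__(self, other):
        return self.avg * other
    __rmul__ = __mul__
    def __add__(self, other):
        return self.avg + other
    __radd__ = __add__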
You should override the operator; for example, to override *, override the __mul__ method:
def __mul__(self, other): ...
See the documentation for the methods that can be defined to emulate numeric types:
https://docs.python.org/3/reference/datamodel.html?highlight=rmul#emulating-numeric-types
__float__ is used for exactly one purpose: to define the behavior of float(x) as x.__float__(). There is no implicit conversion in an expression like massive_average * 2. This could mean any number of things:
massive_average.__int__() * 2
massive_average.__float__() * 2
massive_average.__complex__() * 2
massive_average.__str__() * 2
so Python refuses to guess. It will try massive_average.__mul__(2), and failing that, (2).__rmul__(massive_average), before giving up.
Each of the type-specific "conversion" methods are used only by the corresponding type itself. print, for example, does not call __str__ (directly); it only is defined to call str on each of its arguments, and str takes care of calling __str__.
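To make that concrete, a small sketch using the question's class: float() goes through __float__, while the bare multiplication fails because no __mul__/__rmul__ is defined (the exact float repr may vary):
class MassiveAverage:
    def __init__(self, floats):
        self.floats = floats
    def __float__(self) -> float:
        return sum(self.floats) / len(self.floats)

massive_average = MassiveAverage([1.1, 2.2])
print(float(massive_average))  # roughly 1.65 -- float() calls __float__
try:
    massive_average * 2
except TypeError as exc:
    # no __mul__/__rmul__ defined, so Python gives up rather than guessing
    print(exc)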

functools total_ordering doesn't appear to do anything with inherited class

I am trying to sort a list of strings in a way that uses a special comparison. I am trying to use functools.total_ordering, but I'm not sure whether it's filling out the undefined comparisons correctly.
The two I define ( > and ==) work as expected, but < does not. In particular, I print all three and I get that a > b and a < b. How is this possible? I would think that total_ordering would simply define < as not > and not ==. The result of my < test is what you would get with regular str comparison, leading me to believe that total_ordering isn't doing anything.
Perhaps the problem is that I am inheriting str, which already has __lt__ implemented? If so, is there a fix to this issue?
from functools import total_ordering

@total_ordering
class SortableStr(str):
    def __gt__(self, other):
        return self + other > other + self
    # Is this necessary? Or will it default to the inherited class?
    def __eq__(self, other):
        return str(self) == str(other)

def main():
    a = SortableStr("99")
    b = SortableStr("994")
    print(a > b)
    print(a == b)
    print(a < b)

if __name__ == "__main__":
    main()
OUTPUT:
True
False
True
You're right that the built-in str comparison operators are interfering with your code. From the docs
Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest.
So it only supplies the ones not already defined. In your case, the fact that some of them are defined in a parent class is undetectable to total_ordering.
Now, we can dig deeper into the source code and find the exact check
roots = {op for op in _convert if getattr(cls, op, None) is not getattr(object, op, None)}
So it checks whether the values are equal to the ones defined on the root class, object. We can make that happen:
@total_ordering
class SortableStr(str):
    __lt__ = object.__lt__
    __le__ = object.__le__
    __ge__ = object.__ge__
    def __gt__(self, other):
        return self + other > other + self
    # Is this necessary? Or will it default to the inherited class?
    def __eq__(self, other):
        return str(self) == str(other)
Now total_ordering will see that __lt__, __le__, and __ge__ are equal to the "original" object values and overwrite them, as desired.
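With those three assignments in place, the original main() comparisons should come out consistently (a quick sketch of the expected behaviour):
a = SortableStr("99")
b = SortableStr("994")
print(a > b)   # True  -- "99994" > "99499"
print(a == b)  # False
print(a < b)   # False -- now derived from __gt__ and __eq__ by total_ordering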
This all being said, I would argue that this is a poor use of inheritance. You're violating Liskov substitution at the very least, in that mixed comparisons between str and SortableStr are going to, to put it lightly, produce counterintuitive results.
My more general recommendation is to favor composition over inheritance and, rather than defining a thing that "is a" specialized string, consider defining a type that "contains" a string and has specialized behavior.
@total_ordering
class SortableStr:
    def __init__(self, value):
        self.value = value
    def __gt__(self, other):
        return self.value + other.value > other.value + self.value
    def __eq__(self, other):
        return self.value == other.value
There, no magic required. Now SortableStr("99") is a valid object that is not a string but exhibits the behavior you want.
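A small usage sketch: total_ordering supplies the __lt__ that sorted() uses, and the ordering follows from the a+b > b+a rule:
words = [SortableStr("9"), SortableStr("91"), SortableStr("95")]
print([w.value for w in sorted(words)])  # ['91', '95', '9']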
Not sure if this is correct, but glancing at the documentation of functools.total_ordering, this stands out to me:
Given a class defining one or more rich comparison ordering methods,
this class decorator supplies the rest.
Emphasis mine. Your class inherits __lt__ from str, so it does not get re-implemented by total_ordering since it isn't missing. That's my best guess.

How to make default (resettable) int in python

I'd like to be able to reset an integer/float to a predefined default value without overriding all the arithmetic operations. Something like:
class DefaultInt(int):
    def __init__(self, value):
        super(DefaultInt, self).__init__(value)
        self.default_value = value
    def reset(self):
        self.value = self.default_value

my_int = DefaultInt(19)
my_int += 1
my_int.reset()
But there are two problems:
I cannot access the hidden value itself when subclassing the int class.
After my_int += 1, my_int obviously becomes a plain int.
Your immediate problems:
in-place addition needs you to define __iadd__ so it returns a DefaultInt object (better to carry the default value over, otherwise the new value becomes the default)
The reset as you've written it is not possible, simply because integers are immutable. But you could assign the result of reset back to the same name; that would work.
class DefaultInt(int):
    def __init__(self, value=0):
        super(DefaultInt, self).__init__()
        self.default_value = value
    def __iadd__(self, value):
        old_default = self.default_value
        r = DefaultInt(value + self)
        r.default_value = old_default
        return r
    def reset(self):
        return DefaultInt(self.default_value)

my_int = DefaultInt(19)
my_int += 1
print(my_int)
my_int = my_int.reset()
print(my_int)
output:
20
19
Your long-term problems:
But that's only a first step. If you try my_int + 12 you'll see that it returns an int as well: you'll have to define __add__. Same goes for __sub__... and there's the "hidden" value problem, and the immutability problem which prevents you from performing an in-place reset.
Conclusion:
I think the best approach would be not to inherit from int, and instead create your own mutable class with the special methods crafted for your needs (plus a reset method, which would now work). You won't have the constraints that come with overriding int, and your code will be clearer, even if you have to spell out each method (at least if a method is missing you'll notice, instead of silently calling one that doesn't fit).
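To illustrate that recommendation, a minimal sketch of such a mutable class (the name ResettableInt and the chosen set of operators are assumptions; add whichever methods your code actually needs):
class ResettableInt:
    def __init__(self, value=0):
        self.value = value
        self.default_value = value
    def __repr__(self):
        return f'ResettableInt({self.value})'
    def __int__(self):
        return self.value
    def __eq__(self, other):
        return self.value == int(other)
    def __add__(self, other):
        return self.value + int(other)
    __radd__ = __add__
    def __iadd__(self, other):
        # mutate in place; the stored default is untouched
        self.value += int(other)
        return self
    def reset(self):
        self.value = self.default_value

my_int = ResettableInt(19)
my_int += 1
print(my_int)   # ResettableInt(20)
my_int.reset()
print(my_int)   # ResettableInt(19)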
I think that, if you only need a reset function, you can simply use:
default_value = 5
my_int = int.__new__(int, default_value)
my_int += 1
print(my_int)  # prints 6
my_int = int.__new__(int, default_value)
print(my_int)  # prints 5

Compare object instances for equality by their attributes

I have a class MyClass, which contains two member variables foo and bar:
class MyClass:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar
I have two instances of this class, each of which has identical values for foo and bar:
x = MyClass('foo', 'bar')
y = MyClass('foo', 'bar')
However, when I compare them for equality, Python returns False:
>>> x == y
False
How can I make python consider these two objects equal?
You should implement the method __eq__:
class MyClass:
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar
    def __eq__(self, other):
        if not isinstance(other, MyClass):
            # don't attempt to compare against unrelated types
            return NotImplemented
        return self.foo == other.foo and self.bar == other.bar
Now it outputs:
>>> x == y
True
Note that implementing __eq__ will automatically make instances of your class unhashable, which means they can't be stored in sets and dicts. If you're not modelling an immutable type (i.e. if the attributes foo and bar may change within the lifetime of your object), then it's recommended to just leave your instances unhashable.
If you are modelling an immutable type, you should also implement the data model hook __hash__:
class MyClass:
    ...
    def __hash__(self):
        # necessary for instances to behave sanely in dicts and sets
        return hash((self.foo, self.bar))
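With both __eq__ and __hash__ defined, instances behave sanely as set members and dict keys; a quick sketch reusing x and y from above:
x = MyClass('foo', 'bar')
y = MyClass('foo', 'bar')
print(x == y)           # True
print(len({x, y}))      # 1 -- equal objects hash alike, so the set keeps one
print({x: 'value'}[y])  # 'value' -- usable as dict keys too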
A general solution, like the idea of looping through __dict__ and comparing values, is not advisable - it can never be truly general because the __dict__ may have uncomparable or unhashable types contained within.
N.B.: be aware that before Python 3, you may need to use __cmp__ instead of __eq__. Python 2 users may also want to implement __ne__, since a sensible default behaviour for inequality (i.e. inverting the equality result) will not be automatically created in Python 2.
You override the rich comparison operators in your object.
class MyClass:
    def __lt__(self, other):
        ...  # return comparison
    def __le__(self, other):
        ...  # return comparison
    def __eq__(self, other):
        ...  # return comparison
    def __ne__(self, other):
        ...  # return comparison
    def __gt__(self, other):
        ...  # return comparison
    def __ge__(self, other):
        ...  # return comparison
Like this:
def __eq__(self, other):
    return self._id == other._id
If you're dealing with one or more classes that you can't change from the inside, there are generic and simple ways to do this that also don't depend on a diff-specific library:
Easiest, unsafe-for-very-complex-objects method
import pickle

pickle.dumps(a) == pickle.dumps(b)
pickle is a very common serialization lib for Python objects, and will thus be able to serialize pretty much anything, really. In the above snippet, I'm comparing the serialized bytes from a with those from b. Unlike the next method, this one has the advantage of also type checking custom classes.
The biggest hassle: due to specific ordering and [de/en]coding methods, pickle may not yield the same result for equal objects, especially when dealing with more complex ones (e.g. lists of nested custom-class instances) like you'll frequently find in some third-party libs. For those cases, I'd recommend a different approach:
Thorough, safe-for-any-object method
You could write a recursive reflection that'll give you serializable objects, and then compare results
from collections.abc import Iterable

BASE_TYPES = [str, int, float, bool, type(None)]

def base_typed(obj):
    """Recursive reflection method to convert any object property into a comparable form."""
    T = type(obj)
    from_numpy = T.__module__ == 'numpy'
    if T in BASE_TYPES or callable(obj) or (from_numpy and not isinstance(T, Iterable)):
        return obj
    if isinstance(obj, Iterable):
        base_items = [base_typed(item) for item in obj]
        return base_items if from_numpy else T(base_items)
    d = obj if T is dict else obj.__dict__
    return {k: base_typed(v) for k, v in d.items()}

def deep_equals(*args):
    return all(base_typed(args[0]) == base_typed(other) for other in args[1:])
Now it doesn't matter what your objects are; deep equality works:
>>> from sklearn.ensemble import RandomForestClassifier
>>>
>>> a = RandomForestClassifier(max_depth=2, random_state=42)
>>> b = RandomForestClassifier(max_depth=2, random_state=42)
>>>
>>> deep_equals(a, b)
True
The number of comparables doesn't matter either:
>>> c = RandomForestClassifier(max_depth=2, random_state=1000)
>>> deep_equals(a, b, c)
False
My use case for this was checking deep equality among a diverse set of already trained Machine Learning models inside BDD tests. The models belonged to a diverse set of third-party libs. Certainly implementing __eq__ like other answers here suggest wasn't an option for me.
Covering all the bases
You may be in a scenario where one or more of the custom classes being compared do not have a __dict__ implementation. That's not common by any means, but it is the case for a subtype within sklearn's Random Forest classifier: <type 'sklearn.tree._tree.Tree'>. Treat these situations on a case-by-case basis; specifically, I decided to replace the content of the afflicted type with the content of a method that gives me representative information about the instance (in this case, the __getstate__ method). For that, the second-to-last line in base_typed became:
d = obj if T is dict else obj.__dict__ if '__dict__' in dir(obj) else obj.__getstate__()
Edit: for the sake of organization, I replaced the hideous oneliner above with return dict_from(obj). Here, dict_from is a really generic reflection made to accommodate more obscure libs (I'm looking at you, Doc2Vec)
def isproperty(prop, obj):
    return not callable(getattr(obj, prop)) and not prop.startswith('_')

def dict_from(obj):
    """Converts dict-like objects into dicts"""
    if isinstance(obj, dict):
        # Dict and subtypes are directly converted
        d = dict(obj)
    elif '__dict__' in dir(obj):
        # Use standard dict representation when available
        d = obj.__dict__
    elif str(type(obj)) == 'sklearn.tree._tree.Tree':
        # Replaces sklearn trees with their state metadata
        d = obj.__getstate__()
    else:
        # Extract non-callable, non-private attributes with reflection
        kv = [(p, getattr(obj, p)) for p in dir(obj) if isproperty(p, obj)]
        d = {k: v for k, v in kv}
    return {k: base_typed(v) for k, v in d.items()}
Do mind that none of the above methods yield True for objects with the same key-value pairs in a different order, as in:
>>> a = {'foo':[], 'bar':{}}
>>> b = {'bar':{}, 'foo':[]}
>>> pickle.dumps(a) == pickle.dumps(b)
False
But if you want that, you could use Python's built-in sorted beforehand anyway.
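For example, a sketch for the flat-dict case only (nested dicts would need recursive normalization):
import pickle

def normalized_dumps(d: dict) -> bytes:
    # rebuild the dict with its keys in sorted order before pickling
    return pickle.dumps(dict(sorted(d.items())))

a = {'foo': [], 'bar': {}}
b = {'bar': {}, 'foo': []}
print(pickle.dumps(a) == pickle.dumps(b))          # False
print(normalized_dumps(a) == normalized_dumps(b))  # True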
With dataclasses in Python 3.7 (and above), comparing object instances for equality is a built-in feature.
A backport for Dataclasses is available for Python 3.6.
(Py37) nsc@nsc-vbox:~$ python
Python 3.7.5 (default, Nov 7 2019, 10:50:52)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dataclasses import dataclass
>>> @dataclass
... class MyClass():
...     foo: str
...     bar: str
...
>>> x = MyClass(foo="foo", bar="bar")
>>> y = MyClass(foo="foo", bar="bar")
>>> x == y
True
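If you also need hashable instances (for sets or dict keys), frozen=True makes the dataclass immutable and generates __hash__ as well; a short sketch:
from dataclasses import dataclass

@dataclass(frozen=True)
class MyClass:
    foo: str
    bar: str

x = MyClass(foo="foo", bar="bar")
y = MyClass(foo="foo", bar="bar")
print(x == y)       # True  -- __eq__ generated from the fields
print(len({x, y}))  # 1     -- frozen=True also generates __hash__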
Implement the __eq__ method in your class; something like this:
def __eq__(self, other):
    return self.path == other.path and self.title == other.title
Edit: if you want your objects to compare equal if and only if they have equal instance dictionaries:
def __eq__(self, other):
    return self.__dict__ == other.__dict__
As a summary:
It's advised to implement __eq__ rather than __cmp__, unless you run Python <= 2.0 (__eq__ was added in 2.1).
Don't forget to also implement __ne__ (it should be something like return not self.__eq__(other) or return not self == other, except in very special cases).
Don't forget that the operator must be implemented in each custom class you want to compare (see example below).
If you want to compare with objects that can be None, you must handle it; the interpreter cannot guess it (see example below).
class B(object):
    def __init__(self):
        self.name = "toto"
    def __eq__(self, other):
        if other is None:
            return False
        return self.name == other.name

class A(object):
    def __init__(self):
        self.toto = "titi"
        self.b_inst = B()
    def __eq__(self, other):
        if other is None:
            return False
        return (self.toto, self.b_inst) == (other.toto, other.b_inst)
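For example, with the two classes above:
a1, a2 = A(), A()
print(a1 == a2)    # True  -- B.__eq__ handles the nested b_inst comparison
print(a1 == None)  # False -- handled explicitly in __eq__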
Depending on your specific case, you could do:
>>> vars(x) == vars(y)
True
See Python dictionary from an object's fields
You should implement the method __eq__:
class MyClass:
    def __init__(self, foo, bar, name):
        self.foo = foo
        self.bar = bar
        self.name = name
    def __eq__(self, other):
        if not isinstance(other, MyClass):
            return NotImplemented
        else:
            # lists of the instance-attribute names of each of these objects
            prop_names1 = list(self.__dict__)
            prop_names2 = list(other.__dict__)
            n = len(prop_names1)  # number of properties
            for i in range(n):
                if getattr(self, prop_names1[i]) != getattr(other, prop_names2[i]):
                    return False
            return True
When comparing instances of objects, the __cmp__ function is called.
If the == operator is not working for you by default, you can always redefine the __cmp__ function for the object.
Edit:
As has been pointed out, the __cmp__ function is deprecated since 3.0.
Instead you should use the “rich comparison” methods.
I wrote this and placed it in a test/utils module in my project. For cases when it's not a class, just a plain ol' dict, this will traverse both objects and ensure:
every attribute is equal to its counterpart
no dangling attributes exist (attrs that only exist on one object)
It's big... it's not sexy... but oh boy does it work!
def assertObjectsEqual(obj_a, obj_b):
    def _assert(a, b):
        if a == b:
            return
        raise AssertionError(f'{a} !== {b} inside assertObjectsEqual')

    def _check(a, b):
        if a is None or b is None:
            _assert(a, b)
            return  # nothing left to traverse once a side is None
        for k, v in a.items():
            if isinstance(v, dict):
                assertObjectsEqual(v, b[k])
            else:
                _assert(v, b[k])

    # Asserting both directions is more work
    # but it ensures no dangling values on
    # either object
    _check(obj_a, obj_b)
    _check(obj_b, obj_a)
You can clean it up a little by removing the _assert and just using plain ol' assert but then the message you get when it fails is very unhelpful.
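A quick usage sketch with plain dicts (hypothetical data):
lhs = {'name': 'widget', 'meta': {'size': 3}}
rhs = {'name': 'widget', 'meta': {'size': 3}}

assertObjectsEqual(lhs, rhs)  # passes silently

rhs['meta']['size'] = 4
try:
    assertObjectsEqual(lhs, rhs)
except AssertionError as exc:
    print(exc)  # 3 !== 4 inside assertObjectsEqual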
The code below works (in my limited testing) by doing a deep compare between two object hierarchies. It handles various cases, including when the objects themselves or their attributes are dictionaries.
from typing import Any

def deep_comp(o1: Any, o2: Any) -> bool:
    # NOTE: dicts don't have __dict__
    o1d = getattr(o1, '__dict__', None)
    o2d = getattr(o2, '__dict__', None)
    # if both are objects
    if o1d is not None and o2d is not None:
        # we will compare their dictionaries
        o1, o2 = o1.__dict__, o2.__dict__
    if o1 is not None and o2 is not None:
        # if both are dictionaries, we will compare each key
        if isinstance(o1, dict) and isinstance(o2, dict):
            for k in set().union(o1.keys(), o2.keys()):
                if k in o1 and k in o2:
                    if not deep_comp(o1[k], o2[k]):
                        return False
                else:
                    return False  # some key missing
            return True
    # mismatched object types, or both are scalars, or one or both are None
    return o1 == o2
This is very tricky code, so please add any cases that might not work for you in the comments.
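A small usage sketch with a hypothetical class:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

print(deep_comp(Point(1, 2), Point(1, 2)))                # True
print(deep_comp(Point(1, 2), Point(1, 3)))                # False
print(deep_comp({'a': Point(1, 2)}, {'a': Point(1, 2)}))  # True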
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
    def __repr__(self):
        return str(self.value)
    def __eq__(self, other):
        return self.value == other.value

node1 = Node(1)
node2 = Node(1)
print(f'node1 id:{id(node1)}')
print(f'node2 id:{id(node2)}')
print(node1 == node2)
>>> node1 id:4396696848
>>> node2 id:4396698000
>>> True
Use the setattr function. You might want to use this when you can't add something inside the class itself, say, when you are importing the class.
setattr(MyClass, "__eq__", lambda x, y: x.foo == y.foo and x.bar == y.bar)
If you want to get an attribute-by-attribute comparison, and see if and where it fails, you can use the following list comprehension:
[i for i,j in
zip([getattr(obj_1, attr) for attr in dir(obj_1)],
[getattr(obj_2, attr) for attr in dir(obj_2)])
if not i==j]
The extra advantage here is that you can squeeze it into one line and enter it in the "Evaluate Expression" window when debugging in PyCharm.
I tried the initial example (see 7 above) and it did not work in ipython. Note that cmp(obj1, obj2) returns a "1" when implemented with two identical object instances. Oddly enough, when I modify one of the attribute values and recompare, cmp(obj1, obj2) continues to return a "1". (sigh...)
Ok, so what you need to do is iterate over both objects and compare each attribute using the == sign.
Instances of a class compared with == come out as non-equal. The best way is to add the cmp function to your class, which will do the comparison.
If you want to do comparison by content you can simply use cmp(obj1, obj2).
In your case, cmp(doc1, doc2) will return 0 if the contents are the same.
