Comparing two objects using __dict__ - python

Is there ever a reason not to do this to compare two objects:
def __eq__(self, other):
    return self.__dict__ == other.__dict__
as opposed to checking each individual attribute:
def __eq__(self, other):
    return self.get_a() == other.get_a() and self.get_b() == other.get_b() and ...
Initially I had the latter, but figured the former was the cleaner solution.

You can be explicit and concise:
import operator
def __eq__(self, other):
    # Compare only the attributes that define equality.
    fetcher = operator.attrgetter("a", "b", "c", "d")
    try:
        return self is other or fetcher(self) == fetcher(other)
    except AttributeError:
        return False
Just comparing the __dict__ attribute (which might not exist if __slots__ is used) leaves you open to the risk that an unexpected attribute exists on the object:
class A:
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        return self.__dict__ == other.__dict__
a1 = A(5)
a2 = A(5)
a1.b = 3
assert a1 == a2  # Fails
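The remark about __slots__ can be checked directly. The sketch below uses a hypothetical class Point (the name and its fields are illustrative, not from the question) that declares __slots__. Its instances have no __dict__, so a __dict__-based __eq__ would raise AttributeError, while an explicit attribute comparison keeps working.

```python
import operator

# Getter for exactly the attributes that define equality.
_point_fields = operator.attrgetter("x", "y")


class Point:
    """Hypothetical example class; name and fields are illustrative."""

    __slots__ = ("x", "y")  # instances get no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return _point_fields(self) == _point_fields(other)


p = Point(1, 2)
print(hasattr(p, "__dict__"))  # False
print(p == Point(1, 2))        # True
print(p == Point(1, 3))        # False
```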

Some comments:
You should include a self is other check; otherwise, under certain conditions, the same object in memory can compare unequal to itself. Here is a demonstration. The instance check chrisz mentioned in the comments is a good idea as well.
The dicts of self and other probably contain many more items than the attributes you are manually checking for in the second version. Therefore, the first one will be slower.
(Lastly, but not related to the question, we don't write getters and setters in Python. Access attributes directly with the dot-notation, and if something special needs to happen when getting/setting an attribute, use a property.)
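One condition under which an object can compare unequal to itself is an attribute holding NaN. The sketch below is only an illustration: Reading and SafeReading are hypothetical classes, and value is an assumed attribute name.

```python
class Reading:
    """Hypothetical class; 'value' is an illustrative attribute."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Reading):
            return NotImplemented
        return self.value == other.value


class SafeReading(Reading):
    def __eq__(self, other):
        if self is other:  # identity short-circuit
            return True
        return super().__eq__(other)


r = Reading(float("nan"))
s = SafeReading(float("nan"))
print(r == r)  # False: nan != nan, so the object is unequal to itself
print(s == s)  # True: the identity check answers first
```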

Related

Python: Raise error when comparing objects of different type for equality

As far as I understand, comparing instances of two different classes for equality in Python does the following:
Evaluate the __eq__ method of the first instance and return the result.
If that result is NotImplemented, evaluate the __eq__ method of the second instance and return the result.
If that result is also NotImplemented, compare for object identity and return the result (which is False).
I have repeatedly run into situations where a raised exception would have helped me spot a bug in my code. Martijn Pieters has sketched a way to do that in this post, but also mentioned that it's unpythonic.
Besides being unpythonic, are there actual problems arising from this approach?
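The lookup order described in the question can be observed with two unrelated classes that both decline the comparison. Left and Right are hypothetical names used only for this sketch, and the calls list is just a way to record which method ran.

```python
calls = []


class Left:
    def __eq__(self, other):
        calls.append("Left.__eq__")
        return NotImplemented


class Right:
    def __eq__(self, other):
        calls.append("Right.__eq__")
        return NotImplemented


result = Left() == Right()
print(calls)   # ['Left.__eq__', 'Right.__eq__']
print(result)  # False: both declined, so identity decides
```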
Strictly speaking, it is, of course, "unpythonic" because Python encourages duck typing alongside strict subtyping. But I personally prefer strictly typed interfaces, so I throw TypeErrors. I have never had any issues, but I can't definitively say that you never will.
Theoretically, I can imagine it being a problem if you need to use a mixed-type container like list[Optional[YourType]] and then compare its elements, perhaps indirectly, as when constructing set(your_mixed_list).
It would cause problems for dictionaries with mixed keys that have the same hash. Dictionaries allow several keys to have the same hash, and then do an equality check to distinguish keys. See https://stackoverflow.com/a/9022835/1217284
The following script demonstrates this; it fails when the class MisBehaving1 is used:
#!/usr/bin/env python3
class Behaving1:
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return 7
class Behaving2:
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return 7
    def __eq__(self, other):
        if not (isinstance(other, type(self)) or isinstance(self, type(other))):
            return NotImplemented
        return self.value == other.value
class MisBehaving1:
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return 7
    def __eq__(self, other):
        if not (isinstance(other, type(self)) or isinstance(self, type(other))):
            raise TypeError(f"Cannot compare {self} of type {type(self)} with {other} of type {type(other)}")
        return self.value == other.value
d = {Behaving1(5): 'hello', Behaving2(4): 'world'}
print(d)
# The hash collision forces an equality check between the keys, which raises TypeError here.
dd = {Behaving1(5): 'hello', MisBehaving1(4): 'world'}
print(dd)

Assert Equal an Object which has an Array within it

My goal is to check two objects for equality using unittest in Python. I have this kind of object:
class ImageModel(object):
    def __init__(self, data, name, path):
        self.data = data  # this is an array
        self.name = name
        self.path = path
I have read that if you want to test two arrays for equality with self.assertEqual(arr1, arr2), you have to put .all() after each array. But I need to compare objects that have an array inside them. In my case, it would be:
self.assertEqual(ImageObj1, ImageObj2)
But it always reports that the objects are not equal; I assume the problem is ImageObj.data.
So, is there any way to assert equality for an array within an object?
One possibility is to override the __eq__ method of your class. Basically, this is the method that is called when you're using the == operator on an ImageModel instance.
Here is an example:
def __eq__(self, other):
    return (
        # the two instances must be of the same class
        isinstance(other, self.__class__) and
        # compare name and path, that's straightforward
        self.name == other.name and
        self.path == other.path and
        # and compare data
        len(self.data) == len(other.data) and
        all(a == b for a, b in zip(self.data, other.data))
    )
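As a rough sketch of how this plays out in a test case, the example below plugs such an __eq__ into unittest. Plain lists stand in for the array, and the names and paths are made up for illustration. For multi-dimensional NumPy arrays the element-wise a == b would itself be an array, so a dedicated comparison such as numpy.array_equal would probably be needed instead of the all(...) expression.

```python
import unittest


class ImageModel:
    def __init__(self, data, name, path):
        self.data = data  # a plain list stands in for the array here
        self.name = name
        self.path = path

    def __eq__(self, other):
        if not isinstance(other, ImageModel):
            return NotImplemented
        return (
            self.name == other.name
            and self.path == other.path
            and len(self.data) == len(other.data)
            and all(a == b for a, b in zip(self.data, other.data))
        )


class ImageModelTest(unittest.TestCase):
    def test_equal_models(self):
        first = ImageModel([1, 2, 3], "cat", "/tmp/cat.png")
        second = ImageModel([1, 2, 3], "cat", "/tmp/cat.png")
        self.assertEqual(first, second)  # uses ImageModel.__eq__

    def test_different_data(self):
        first = ImageModel([1, 2, 3], "cat", "/tmp/cat.png")
        second = ImageModel([1, 2, 4], "cat", "/tmp/cat.png")
        self.assertNotEqual(first, second)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ImageModelTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print(outcome.wasSuccessful())
```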

Monkey patching __eq__ in Python

Having some trouble understanding why I'm able to re-define (monkey patch) __eq__ outside of a class, but not change its definition through __init__ or in a method:
class SpecialInteger:
    def __init__(self, x):
        self.x = x
        self.__eq__ = self.equals_normal
    def equals_normal(self, other):
        return self.x == other.x
    def equals_special(self, other):
        return self.x != other.x
    def switch_to_normal(self):
        self.__eq__ = self.equals_normal
    def switch_to_special(self):
        self.__eq__ = self.equals_special
a = SpecialInteger(3)
b = SpecialInteger(3)
print(a == b)  # False
a.switch_to_normal()
print(a == b)  # False
SpecialInteger.__eq__ = SpecialInteger.equals_normal
print(a == b)  # True
SpecialInteger.__eq__ = SpecialInteger.equals_special
print(a == b)  # False
Am I just using self incorrectly or is there some other reason it works like this?
To do it inside the class, you would simply define the __eq__ method inside of your class.
class SpecialInteger:
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        # do stuff, call whatever other methods you want
        ...
EDIT: I see what you are asking: you wish to override the method (which is a "magic" method) at the instance level. I don't believe this is possible in the base construct of the language, per this discussion.
The reason your monkey patch works in that example is that it is applied at the class level rather than the instance level, whereas self refers to the instance.
Just to add to an excellent existing answer: this doesn't work because you are modifying the instance, not the class.
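The difference between the two levels can be seen in a few lines. This is a minimal sketch with a hypothetical class Plain: the == operator looks up __eq__ on the type, so an attribute stored on the instance is never consulted.

```python
class Plain:
    """Hypothetical minimal class used only for this demonstration."""


obj = Plain()

# Instance-level assignment: stored in obj.__dict__, ignored by ==.
obj.__eq__ = lambda other: True
instance_patch = (obj == Plain())

# Class-level assignment: this is where == looks.
Plain.__eq__ = lambda self, other: True
class_patch = (obj == Plain())

print(instance_patch)  # False
print(class_patch)     # True
```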
To get the behavior you desire, you can modify the class during __init__. However, this is woefully inadequate (since it modifies the class, and therefore all instances of the class), and you are better off making those changes visible at the class scope.
For example, the following are equivalent:
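class SpecialInteger1:
    def __init__(self, x):
        self.x = x
        # Assign the plain function, not the bound method self.equals_normal;
        # a bound method stored on the class would stay tied to this one instance.
        self.__class__.__eq__ = self.__class__.equals_normal
    ...
class SpecialInteger2:
    def __init__(self, x):
        self.x = x
    def equals_normal(self, other):
        return self.x == other.x
    def __eq__(self, other):
        return self.equals_normal(other)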
You should prefer SpecialInteger2 in all cases, since it is more explicit about what it does.
However, none of this actually solves the issue you are trying to solve: how can I create a specialized equality comparison at the instance level that I can toggle? The answer is through the use of an enum (in Python 3):
from enum import Enum
class Equality(Enum):
    NORMAL = 1
    SPECIAL = 2
class SpecialInteger:
    def __init__(self, x, eq=Equality.NORMAL):
        self.x = x
        self.eq = eq
    def equals_normal(self, other):
        return self.x == other.x
    def equals_special(self, other):
        return self.x != other.x
    def __eq__(self, other):
        return self.__comp[self.eq](self, other)
    # Define a dictionary for O(1) access
    # to call the right method.
    __comp = {
        Equality.NORMAL: equals_normal,
        Equality.SPECIAL: equals_special
    }
Let's walk through this quickly, since there are 3 parts:
An instance member variable of eq, which can be modified dynamically.
An implementation of __eq__ that selects the correct equality function based on the value of self.eq.
A namespace-mangled dictionary (a class/member variable that starts with __, in this case, self.__comp) that allows efficient lookup of the desired equality method.
The dictionary can easily be done away with and replaced with idiomatic if/else statements, especially when you only need to support one to five comparisons. However, if you ever need to support many more comparison options (say, 300), a dictionary lookup, O(1), will be much more efficient than a chain of if/else comparisons (a linear search, O(n)).
If you wish to do this with setters (like in the original example), and actually hide the member functions from the user, you can also do this by directly storing the function as a variable.
All method definitions are defined at class level (literally, the name is a key in a dict belonging to the class). This is also true of anything else you put at class level, which is why, for instance, a variable assignment outside a method in a class produces a class variable.
The easiest way to keep the same functionality would be to just refer to some other variable from __eq__. It could be some reference variable, or a saved method.
class SpecialInteger:
    def __init__(self, x):
        self.x = x
        self._equal_method = self.equals_normal
    # ...
    def switch_to_normal(self):
        self._equal_method = self.equals_normal
    def switch_to_special(self):
        self._equal_method = self.equals_special
    def __eq__(self, other):
        return self._equal_method(other)

'Reversed' comparison operator in Python

class Inner():
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        if isinstance(other, Inner):
            return self.x == other.x
        else:
            raise TypeError("Incorrect type to compare")
class Outer():
    def __init__(self, y):
        self.y = Inner(y)
    def __eq__(self, other):
        if isinstance(other, Outer):
            return self.y == other.y
        elif isinstance(other, Inner):
            return self.y == other
        else:
            raise TypeError("Incorrect type to compare")
if __name__ == "__main__":
    a = Outer(1)
    b = Inner(1)
    print(a == b)  # ok no problem
    print(b == a)  # This will raise a type error
In the example I have an Inner and an Outer class. I have no control over what Inner implements; I just wanted to simulate the situation. I only control Outer's behavior. I want Outer instances to be comparable to Inner instances (not just for equality). With the given implementation only the first comparison works, because it calls Outer's __eq__ method, which accepts both Outer and Inner instances. The second one calls Inner's __eq__, which does not allow comparison to Outer; it doesn't know Outer exists, so why should it bother to implement it?
Is there a way to get the second kind of comparison to work, with something similar to __radd__ and related functions?
I know that in C++, for instance, you resolve this with inline operator definitions, but we don't have those in Python.
Not to put too fine a point on it: Inner.__eq__ is broken. At the very least, rather than throwing an error it should return NotImplemented, which would allow Python to try the reverse comparison:
When NotImplemented is returned, the interpreter will then try the
reflected operation on the other type, or some other fallback,
depending on the operator. If all attempted operations return
NotImplemented, the interpreter will raise an appropriate exception.
Better yet it would use "duck typing", rather than insisting on a specific class (unless the class, rather than its interface, is an explicitly important part of the comparison):
def __eq__(self, other):
    try:
        return self.x == other.x
    except AttributeError:
        return NotImplemented
However, as you say you cannot control this, you will have to manually implement similar functionality, for example:
def compare(a, b):
    """'Safe' comparison between two objects."""
    try:
        return a == b
    except TypeError:
        return b == a
as there is no such thing as __req__ in Python's data model.
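For comparison, here is a sketch of how the reflected lookup behaves when the inner class does return NotImplemented. FixedInner is a hypothetical stand-in for an Inner you could change; it is not the class from the question, which remains outside your control.

```python
class FixedInner:
    """Hypothetical variant of Inner that returns NotImplemented."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, FixedInner):
            return NotImplemented  # let the other operand try
        return self.x == other.x


class Outer:
    def __init__(self, y):
        self.y = FixedInner(y)

    def __eq__(self, other):
        if isinstance(other, Outer):
            return self.y == other.y
        if isinstance(other, FixedInner):
            return self.y == other
        return NotImplemented


a = Outer(1)
b = FixedInner(1)
print(a == b)  # True
print(b == a)  # True: FixedInner declines, Python falls back to Outer.__eq__
```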

Comparing instances of a dict subclass

I have subclassed dict to add an extra method (so no overriding).
Now I am trying to compare two instances of that subclass, and I get something weird:
>>> d1.items() == d2.items()
True
>>> d1.values() == d2.values()
True
>>> d1.keys() == d2.keys()
True
>>> d1 == d2
False
EDIT
That's damn weird... I don't understand it at all! Does anybody have insight into how dict.__eq__ is implemented?
Following is all the code :
# ------ Below is my dict subclass (with no overriding):
class ClassSetDict(dict):
    def subsetget(self, klass, default=None):
        class_sets = set(filter(lambda cs: klass <= cs, self))
        # Eliminate supersets
        for cs1 in class_sets.copy():
            for cs2 in class_sets.copy():
                if cs1 <= cs2 and cs1 is not cs2:
                    class_sets.discard(cs2)
        try:
            best_match = list(class_sets)[0]
        except IndexError:
            return default
        return self[best_match]
# ------ Then an implementation of class sets
class ClassSet(object):
    # Set of classes, allowing to easily calculate inclusions
    # with comparison operators: `A < B` <=> "A strictly included in B"
    def __init__(self, klass):
        self.klass = klass
    def __ne__(self, other):
        return not self == other
    def __gt__(self, other):
        other = self._default_to_singleton(other)
        return not self == other and other < self
    def __le__(self, other):
        return self < other or self == other
    def __ge__(self, other):
        return self > other or self == other
    def _default_to_singleton(self, klass):
        if not isinstance(klass, ClassSet):
            return Singleton(klass)
        else:
            return klass
class Singleton(ClassSet):
    def __eq__(self, other):
        other = self._default_to_singleton(other)
        return self.klass == other.klass
    def __lt__(self, other):
        if isinstance(other, AllSubSetsOf):
            return issubclass(self.klass, other.klass)
        else:
            return False
class AllSubSetsOf(ClassSet):
    def __eq__(self, other):
        if isinstance(other, AllSubSetsOf):
            return self.klass == other.klass
        else:
            return False
    def __lt__(self, other):
        if isinstance(other, AllSubSetsOf):
            return issubclass(self.klass, other.klass) and not other == self
        else:
            return False
# ------ and finally the 2 dicts that don't want to be equal !!!
d1 = ClassSetDict({AllSubSetsOf(object): (int,)})
d2 = ClassSetDict({AllSubSetsOf(object): (int,)})
The problem you're seeing has nothing at all to do with subclassing dict. In fact, this behavior can be seen using a regular dict. The problem is how you have defined the keys you're using. A simple class like:
>>> class Foo(object):
...     def __init__(self, value):
...         self.value = value
...
...     def __eq__(self, other):
...         return self.value == other.value
...
Is enough to demonstrate the problem:
>>> f1 = Foo(5)
>>> f2 = Foo(5)
>>> f1 == f2
True
>>> d1 = {f1: 6}
>>> d2 = {f2: 6}
>>> d1.items() == d2.items()
True
>>> d1 == d2
False
What's missing is that you forgot to define __hash__. Every time you change the equality semantics of a class, you should make sure that the __hash__ method agrees with it: when two objects are equal, they must have equal hashes. dict behavior depends strongly on the hash value of keys.
When you inherit from object, you automatically get both __eq__ and __hash__: the former compares object identity and the latter returns the address of the object, so they agree. But when you change __eq__, you still get the old __hash__, which no longer agrees, and dict gets lost.
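The transcript above appears to come from Python 2, where a class keeps the inherited __hash__ after overriding __eq__. In Python 3, a class that defines __eq__ without defining __hash__ has its __hash__ set to None, so the mismatch surfaces as a TypeError as soon as an instance is used as a key. A minimal sketch, assuming Python 3:

```python
class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value


print(Foo.__hash__)  # None

error = None
try:
    {Foo(5): 6}  # building the dict needs hash(Foo(5))
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```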
Simply provide a __hash__ method that in a stable way combines the hash values of its attributes.
>>> class Bar(object):
...     def __init__(self, value):
...         self.value = value
...
...     def __eq__(self, other):
...         return self.value == other.value
...
...     def __hash__(self):
...         return hash((Bar, self.value))
...
>>> b1 = Bar(5)
>>> b2 = Bar(5)
>>> {b1: 6} == {b2: 6}
True
>>>
When using __hash__ in this way, it's also a good idea to make sure that the attributes do not (or better, cannot) change after the object is created. If the hash value changes while the object is stored as a key in a dict, the key will be "lost", and all sorts of weird things can happen (even weirder than the issue you initially asked about).
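One way to get an agreeing __eq__ and __hash__ together with attributes that cannot be reassigned is a frozen dataclass. This is a different technique from the hand-written __hash__ above, sketched here under the assumption of Python 3.7 or later. Baz is a hypothetical counterpart of Bar.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Baz:
    """Hypothetical counterpart of Bar; the name is illustrative."""

    value: int


b1 = Baz(5)
b2 = Baz(5)
print(b1 == b2)              # True
print(hash(b1) == hash(b2))  # True
print({b1: 6} == {b2: 6})    # True

frozen = False
try:
    b1.value = 7  # rejected because the dataclass is frozen
except FrozenInstanceError:
    frozen = True
    print("attributes cannot be reassigned")
```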
This most probably depends on some implementation details; in fact, basic subclassing doesn't show this problem:
>>> class D(dict):
...     def my_method(self):
...         pass
...
>>> d1 = D(alpha=123)
>>> d1
{'alpha': 123}
>>> d2 = D(alpha=123)
>>> d1.items() == d2.items()
True
>>> d1.values() == d2.values()
True
>>> d1.keys() == d2.keys()
True
>>> d1 == d2
True
Your instances of AllSubSetsOf are used as dict keys, so they should have a __hash__ method.
Try adding a
def __hash__(self):
    return hash(self.klass)
method to either ClassSet or AllSubSetsOf.
I do so hate it when people say things like "The dicts contain funky stuff, so it wouldn't help much to show" since it is precisely the nature of the funky stuff that matters here.
The first thing to note is that if you had exactly the opposite result it wouldn't be surprising at all: i.e. if d1.items(), d1.values(), d1.keys() were not equal to d2.items(), d2.values(), d2.keys() you could quite happily have d1 == d2. That's because dictionaries don't compare by comparing items or keys, they use a different technique which (I think) is the source of your problem.
Effectively comparing two dictionaries first checks they are the same length, then goes through all the keys in the first dictionary to find the smallest one that doesn't match the key/value from the second dictionary. So what we are actually looking for is a case where d1.keys()==d2.keys() but for some k either k not in d1 or k not in d2 or d1[k] != d2[k].
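The equality check can be modelled roughly as follows. This is a simplified sketch, not CPython's actual implementation. The function dicts_equal and the class Key are hypothetical, and Key defines __hash__ explicitly as id(self) to imitate the identity-based hash discussed here.

```python
def dicts_equal(first, second):
    """Rough model of dict equality; not CPython's actual implementation."""
    if len(first) != len(second):
        return False
    for key, value in first.items():
        if key not in second:  # the lookup goes through hash(key) first
            return False
        if second[key] != value:
            return False
    return True


class Key:
    """Hypothetical key: equal by value, but hashed by identity."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value

    def __hash__(self):
        return id(self)


k1, k2 = Key(5), Key(5)
print(k1 == k2)                       # True
print(dicts_equal({k1: 6}, {k2: 6}))  # False: k1 is not found via its hash
print({k1: 6} == {k2: 6})             # False
```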
I think the clue may be in the objects you are using as dictionary keys. If they are mutable you can store an object in the dictionary but then mutate it and it becomes inaccessible through normal means. The keys() method may still find it though and in that case you could get what you are seeing.
Now you've updated the question with the AllSubSetsOf class: it is the missing __hash__() method that is the problem. Two different instances can compare equal, AllSubSetsOf(object) == AllSubSetsOf(object), but the hash values are based only on the address, so they will be different.
>>> class AllSubSetsOf(object):
        def __init__(self, klass):
            self.klass = klass
        def __eq__(self, other):
            if isinstance(other, AllSubSetsOf):
                return self.klass == other.klass
            else:
                return False
        def __lt__(self, other):
            if isinstance(other, AllSubSetsOf):
                return issubclass(self.klass, other.klass) and not other == self
            else:
                return False
>>> a = AllSubSetsOf(object)
>>> b = AllSubSetsOf(object)
>>> a==b
True
>>> hash(a), hash(b)
(2400161, 2401895)
>>>
