In Python, is object() equal to anything besides itself?

If I have the code my_object = object() in Python, will my_object be equal to anything except for itself?
I suspect the answer lies in the __eq__ method of the default object returned by object(). What is the implementation of __eq__ for this default object?
EDIT: I'm using Python 2.7, but am also interested in Python 3 answers. Please clarify whether your answer applies to Python 2, 3, or both.

object().__eq__ returns the NotImplemented singleton:
print(object().__eq__(3))
NotImplemented
By the rich comparison rules, when NotImplemented is returned, the "reflected" operation is tried on the other operand. So if you have an object on the RHS that returns True for that comparison, you can get a True result even though the LHS did not implement the comparison.
class EqualToEverything(object):
    def __eq__(self, other):
        return True

ete = EqualToEverything()

ete == object()  # we implemented `ete.__eq__`, so this is obviously True
Out[74]: True

object() == ete  # still True, via the reflected operation
Out[75]: True
Python 2-specific bit: if neither object implements __eq__, Python moves on to check whether either implements __cmp__. Equivalent reflected rules apply there.
class ComparableToEverything(object):
    def __cmp__(self, other):
        return 0

cte = ComparableToEverything()

cte == object()
Out[5]: True

object() == cte
Out[6]: True
__cmp__ is gone in Python 3.
In both Python 2 and 3, when all of these comparison methods have been exhausted and each returned NotImplemented, the final fallback for == is an identity check (a is b).
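A minimal sketch of that fallback chain (the class name here is made up for illustration):

class Silent(object):
    def __eq__(self, other):
        return NotImplemented  # decline to handle the comparison

a, b = Silent(), Silent()
print(a == b)  # False: both sides return NotImplemented, so == falls back to identity
print(a == a)  # True: same object, the identity fallback succeeds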

object doesn't implement __eq__, so == falls back on the default comparison id(x) == id(y), i.e. are they the same object instance (x is y)?
As a new instance is created every time you call object(), my_object will never* compare equal to anything except itself.
This applies to both 2.x and 3.x:
# 3.4.0
>>> object().__eq__(object())
NotImplemented
# 2.7.6
>>> object().__eq__(object())
Traceback (most recent call last):
  File "<pyshell#60>", line 1, in <module>
    object().__eq__(object())
AttributeError: 'object' object has no attribute '__eq__'
* or rather, as roippi's answer points out, hardly ever, assuming sensible __eq__ implementations elsewhere.


What is the order of execution of __eq__ if one side inherits from the other? [duplicate]

I recently stumbled upon a seemingly strange behavior regarding the order in which __eq__ methods are executed, if one side of the comparison is an object which inherits from the other.
I've tried this in Python 3.7.2 in an admittedly academic example. Usually, given an equality comparison a == b, I expect a.__eq__ to be called first, followed by b.__eq__, if the first call returned NotImplemented. However, this doesn't seem to be the case, if a and b are part of the same class hierarchy. Consider the following example:
class A(object):
    def __eq__(self, other):
        print("Calling A({}).__eq__".format(self))
        return NotImplemented

class B(A):
    def __eq__(self, other):
        print("Calling B({}).__eq__".format(self))
        return True

class C(object):
    def __eq__(self, other):
        print("Calling C({}).__eq__".format(self))
        return False
a = A()
b = B()
c = C()
print("a: {}".format(a)) # output "a: <__main__.A object at 0x7f8fda95f860>"
print("b: {}".format(b)) # output "b: <__main__.B object at 0x7f8fda8bcfd0>"
print("c: {}".format(c)) # output "c: <__main__.C object at 0x7f8fda8bcef0>"
a == a # case 1
a == b # case 2.1
b == a # case 2.2
a == c # case 3.1
c == a # case 3.2
In case 1, I expect a.__eq__ to be called twice and this is also what I get:
Calling A(<__main__.A object at 0x7f8fda95f860>).__eq__
Calling A(<__main__.A object at 0x7f8fda95f860>).__eq__
However, in cases 2.1 and 2.2, b.__eq__ is always executed first, no matter on which side of the comparison it stands:
Calling B(<__main__.B object at 0x7f8fda8bcfd0>).__eq__ # case 2.1
Calling B(<__main__.B object at 0x7f8fda8bcfd0>).__eq__ # case 2.2
In cases 3.1 and 3.2 then, the left-hand side is again evaluated first, as I expected:
Calling A(<__main__.A object at 0x7f8fda95f860>).__eq__ # case 3.1
Calling C(<__main__.C object at 0x7f8fda8bcef0>).__eq__ # case 3.1
Calling C(<__main__.C object at 0x7f8fda8bcef0>).__eq__ # case 3.2
It seems that, if the compared objects are related to each other, __eq__ of the object of the child class is always evaluated first. Is there a deeper reasoning behind this behavior? If so, is this documented somewhere? PEP 207 doesn't mention this case, as far as I can see. Or am I maybe missing something obvious here?
From the official documentation for __eq__:
If the operands are of different types, and right operand’s type is a direct or indirect subclass of the left operand’s type, the reflected method of the right operand has priority, otherwise the left operand’s method has priority.
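A minimal sketch of that rule in action (the class names here are made up): because B subclasses A and overrides __eq__, B's reflected method runs first even when B is on the right-hand side.

class A(object):
    def __eq__(self, other):
        return NotImplemented

class B(A):
    def __eq__(self, other):
        return True

print(A() == B())  # True: B.__eq__ is tried first because B is a subclass of A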

Removing objects that have changed from a Python set

Given this program:
class Obj:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __hash__(self):
        return hash((self.a, self.b))

class Collection:
    def __init__(self):
        self.objs = set()

    def add(self, obj):
        self.objs.add(obj)

    def find(self, a, b):
        objs = []
        for obj in self.objs:
            if obj.b == b and obj.a == a:
                objs.append(obj)
        return objs

    def remove(self, a, b):
        for obj in self.find(a, b):
            print('removing', obj)
            self.objs.remove(obj)

o1 = Obj('a1', 'b1')
o2 = Obj('a2', 'b2')
o3 = Obj('a3', 'b3')
o4 = Obj('a4', 'b4')
o5 = Obj('a5', 'b5')

objs = Collection()
for o in (o1, o2, o3, o4, o5):
    objs.add(o)

objs.remove('a1', 'b1')

o2.a = 'a1'
o2.b = 'b1'
objs.remove('a1', 'b1')

o3.a = 'a1'
o3.b = 'b1'
objs.remove('a1', 'b1')

o4.a = 'a1'
o4.b = 'b1'
objs.remove('a1', 'b1')

o5.a = 'a1'
o5.b = 'b1'
If I run this a few times with Python 3.4.2, sometimes it will succeed, other times it throws a KeyError after removing 2 or 3 objects:
$ python3 py_set_obj_remove_test.py
removing <__main__.Obj object at 0x7f3648035828>
removing <__main__.Obj object at 0x7f3648035860>
removing <__main__.Obj object at 0x7f3648035898>
removing <__main__.Obj object at 0x7f36480358d0>
$ python3 py_set_obj_remove_test.py
removing <__main__.Obj object at 0x7f156170b828>
removing <__main__.Obj object at 0x7f156170b860>
Traceback (most recent call last):
  File "py_set_obj_remove_test.py", line 42, in <module>
    objs.remove('a1', 'b1')
  File "py_set_obj_remove_test.py", line 27, in remove
    self.objs.remove(obj)
KeyError: <__main__.Obj object at 0x7f156170b860>
Is this a bug in Python? Or something about the implementation of sets I don't know about?
Interestingly, it seems to always fail at the second objs.remove() call in Python 2.7.9.
This is not a bug in Python; your code violates a requirement of sets: the hash value of an object must not change while it is in the set. By mutating the object's attributes you change its hash, and the set can no longer reliably locate the object.
From the __hash__ method documentation:
If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
Custom Python classes define a default __eq__ method that returns True when both operands reference the same object (obj1 is obj2 is true).
That it sometimes works in Python 3 is a property of hash randomisation for strings. Because the hash value for a string changes between Python interpreter runs, and because the modulus of a hash against the size of the hash table is used, you can end up with the right hash slot anyway, purely by accident, and then the == equality test will still be true because you didn't implement a custom __eq__ method.
Python 2 has hash randomisation too, but it is disabled by default; still, you could make your test 'pass' by carefully picking the 'right' values for the a and b attributes.
Instead, you could make your code work by basing your hash on the id() of your instance; that makes the hash value not change and would match the default __eq__ implementation:
def __hash__(self):
    return hash(id(self))
You could also just remove your __hash__ implementation for the same effect, as the default implementation does basically the above (with the id() value rotated by 4 bits to evade memory alignment patterns). Again, from the __hash__ documentation:
User-defined classes have __eq__() and __hash__() methods by default; with them, all objects compare unequal (except with themselves) and x.__hash__() returns an appropriate value such that x == y implies both that x is y and hash(x) == hash(y).
Alternatively, implement an __eq__ method that bases equality on equality of the attributes of the instance, and don't mutate the attributes.
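A minimal sketch of that last alternative, assuming the attributes are treated as immutable once the object is constructed:

class Obj:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        if not isinstance(other, Obj):
            return NotImplemented
        return (self.a, self.b) == (other.a, other.b)

    def __hash__(self):
        # only safe because a and b are never mutated after construction
        return hash((self.a, self.b))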
You are changing the objects (ie changing the objects' hash) after they were added to the set.
When remove is called, it can't find that hash in the set because it was changed after it was calculated (when the objects were originally added to the set).

The same function for iterable objects and non-iterable objects [duplicate]

Is there a method like isiterable? The only solution I have found so far is to call
hasattr(myObj, '__iter__')
But I am not sure how fool-proof this is.
Checking for __iter__ works on sequence types, but it would fail on e.g. strings in Python 2. I would like to know the right answer too; until then, here is one possibility (which would work on strings, too):
try:
    some_object_iterator = iter(some_object)
except TypeError as te:
    print(some_object, 'is not iterable')
The iter built-in checks for the __iter__ method or in the case of strings the __getitem__ method.
Another general pythonic approach is to assume an iterable, then fail gracefully if it does not work on the given object. The Python glossary:
Pythonic programming style that determines an object's type by inspection of its method or attribute signature rather than by explicit relationship to some type object ("If it looks like a duck and quacks like a duck, it must be a duck.") By emphasizing interfaces rather than specific types, well-designed code improves its flexibility by allowing polymorphic substitution. Duck-typing avoids tests using type() or isinstance(). Instead, it typically employs the EAFP (Easier to Ask Forgiveness than Permission) style of programming.
...
try:
    _ = (e for e in my_object)
except TypeError:
    print my_object, 'is not iterable'
The collections module provides some abstract base classes, which allow you to ask classes or instances whether they provide particular functionality, for example:
from collections.abc import Iterable

if isinstance(e, Iterable):
    # e is iterable
However, this does not check for classes that are iterable through __getitem__.
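A quick sketch of that gap (the class name is made up): iter() accepts a __getitem__-only object even though the Iterable check rejects it.

from collections.abc import Iterable

class Seq:
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError
        return index

print(isinstance(Seq(), Iterable))  # False: Seq defines no __iter__
print(list(Seq()))                  # [0, 1, 2]: iter() falls back to __getitem__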
Duck typing
try:
    iterator = iter(the_element)
except TypeError:
    # not iterable
else:
    # iterable
    # for obj in iterator:
    #     pass
Type checking
Use the Abstract Base Classes. They need at least Python 2.6 and work only for new-style classes.
from collections.abc import Iterable  # import directly from collections for Python < 3.3

if isinstance(the_element, Iterable):
    # iterable
else:
    # not iterable
However, iter() is a bit more reliable as described by the documentation:
Checking isinstance(obj, Iterable) detects classes that are registered as Iterable or that have an __iter__() method, but it does not detect classes that iterate with the __getitem__() method. The only reliable way to determine whether an object is iterable is to call iter(obj).
I'd like to shed a little bit more light on the interplay of iter, __iter__ and __getitem__ and what happens behind the curtains. Armed with that knowledge, you will be able to understand why the best you can do is
try:
    iter(maybe_iterable)
    print('iteration will probably work')
except TypeError:
    print('not iterable')
I will list the facts first and then follow up with a quick reminder of what happens when you employ a for loop in python, followed by a discussion to illustrate the facts.
Facts
1. You can get an iterator from any object o by calling iter(o) if at least one of the following conditions holds true: a) o has an __iter__ method which returns an iterator object. An iterator is any object with an __iter__ and a __next__ (Python 2: next) method. b) o has a __getitem__ method.
2. Checking for an instance of Iterable or Sequence, or checking for the attribute __iter__, is not enough.
3. If an object o implements only __getitem__ but not __iter__, iter(o) will construct an iterator that tries to fetch items from o by integer index, starting at index 0. The iterator will catch any IndexError (but no other errors) that is raised and then raises StopIteration itself.
4. In the most general sense, there's no way to check whether the iterator returned by iter is sane other than to try it out.
5. If an object o implements __iter__, the iter function will make sure that the object returned by __iter__ is an iterator. There is no sanity check if an object only implements __getitem__.
6. __iter__ wins. If an object o implements both __iter__ and __getitem__, iter(o) will call __iter__.
7. If you want to make your own objects iterable, always implement the __iter__ method.
for loops
In order to follow along, you need an understanding of what happens when you employ a for loop in Python. Feel free to skip right to the next section if you already know.
When you use for item in o for some iterable object o, Python calls iter(o) and expects an iterator object as the return value. An iterator is any object which implements a __next__ (or next in Python 2) method and an __iter__ method.
By convention, the __iter__ method of an iterator should return the object itself (i.e. return self). Python then calls next on the iterator until StopIteration is raised. All of this happens implicitly, but the following demonstration makes it visible:
import random

class DemoIterable(object):
    def __iter__(self):
        print('__iter__ called')
        return DemoIterator()

class DemoIterator(object):
    def __iter__(self):
        return self

    def __next__(self):
        print('__next__ called')
        r = random.randint(1, 10)
        if r == 5:
            print('raising StopIteration')
            raise StopIteration
        return r
Iteration over a DemoIterable:
>>> di = DemoIterable()
>>> for x in di:
... print(x)
...
__iter__ called
__next__ called
9
__next__ called
8
__next__ called
10
__next__ called
3
__next__ called
10
__next__ called
raising StopIteration
Discussion and illustrations
On points 1 and 2: getting an iterator and unreliable checks
Consider the following class:
class BasicIterable(object):
    def __getitem__(self, item):
        if item == 3:
            raise IndexError
        return item
Calling iter with an instance of BasicIterable will return an iterator without any problems because BasicIterable implements __getitem__.
>>> b = BasicIterable()
>>> iter(b)
<iterator object at 0x7f1ab216e320>
However, it is important to note that b does not have the __iter__ attribute and is not considered an instance of Iterable or Sequence:
>>> from collections import Iterable, Sequence
>>> hasattr(b, '__iter__')
False
>>> isinstance(b, Iterable)
False
>>> isinstance(b, Sequence)
False
This is why Fluent Python by Luciano Ramalho recommends calling iter and handling the potential TypeError as the most accurate way to check whether an object is iterable. Quoting directly from the book:
As of Python 3.4, the most accurate way to check whether an object x is iterable is to call iter(x) and handle a TypeError exception if it isn’t. This is more accurate than using isinstance(x, abc.Iterable) , because iter(x) also considers the legacy __getitem__ method, while the Iterable ABC does not.
On point 3: Iterating over objects which only provide __getitem__, but not __iter__
Iterating over an instance of BasicIterable works as expected: Python
constructs an iterator that tries to fetch items by index, starting at zero, until an IndexError is raised. The demo object's __getitem__ method simply returns the item which was supplied as the argument to __getitem__(self, item) by the iterator returned by iter.
>>> b = BasicIterable()
>>> it = iter(b)
>>> next(it)
0
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Note that the iterator raises StopIteration when it cannot return the next item and that the IndexError which is raised for item == 3 is handled internally. This is why looping over a BasicIterable with a for loop works as expected:
>>> for x in b:
... print(x)
...
0
1
2
Here's another example in order to drive home the concept of how the iterator returned by iter tries to access items by index. WrappedDict does not inherit from dict, which means instances won't have an __iter__ method.
class WrappedDict(object):  # note: no inheritance from dict!
    def __init__(self, dic):
        self._dict = dic

    def __getitem__(self, item):
        try:
            return self._dict[item]  # delegate to dict.__getitem__
        except KeyError:
            raise IndexError
Note that calls to __getitem__ are delegated to dict.__getitem__ for which the square bracket notation is simply a shorthand.
>>> w = WrappedDict({-1: 'not printed',
... 0: 'hi', 1: 'StackOverflow', 2: '!',
... 4: 'not printed',
... 'x': 'not printed'})
>>> for x in w:
... print(x)
...
hi
StackOverflow
!
On points 4 and 5: iter checks for an iterator when it calls __iter__:
When iter(o) is called for an object o, iter will make sure that the return value of __iter__, if the method is present, is an iterator. This means that the returned object must implement __next__ (or next in Python 2) and __iter__. iter cannot perform any sanity checks for objects which only provide __getitem__, because it has no way to check whether the items of the object are accessible by integer index.
class FailIterIterable(object):
    def __iter__(self):
        return object()  # not an iterator

class FailGetitemIterable(object):
    def __getitem__(self, item):
        raise Exception
Note that constructing an iterator from FailIterIterable instances fails immediately, while constructing an iterator from FailGetitemIterable succeeds, but will throw an Exception on the first call to __next__.
>>> fii = FailIterIterable()
>>> iter(fii)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: iter() returned non-iterator of type 'object'
>>>
>>> fgi = FailGetitemIterable()
>>> it = iter(fgi)
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/iterdemo.py", line 42, in __getitem__
    raise Exception
Exception
On point 6: __iter__ wins
This one is straightforward. If an object implements __iter__ and __getitem__, iter will call __iter__. Consider the following class
class IterWinsDemo(object):
    def __iter__(self):
        return iter(['__iter__', 'wins'])

    def __getitem__(self, item):
        return ['__getitem__', 'wins'][item]
and the output when looping over an instance:
>>> iwd = IterWinsDemo()
>>> for x in iwd:
... print(x)
...
__iter__
wins
On point 7: your iterable classes should implement __iter__
You might ask yourself why most builtin sequences like list implement an __iter__ method when __getitem__ would be sufficient.
class WrappedList(object):  # note: no inheritance from list!
    def __init__(self, lst):
        self._list = lst

    def __getitem__(self, item):
        return self._list[item]
After all, iteration over instances of the class above, which delegates calls to __getitem__ to list.__getitem__ (using the square bracket notation), will work fine:
>>> wl = WrappedList(['A', 'B', 'C'])
>>> for x in wl:
... print(x)
...
A
B
C
The reasons your custom iterables should implement __iter__ are as follows:
If you implement __iter__, instances will be considered iterables, and isinstance(o, collections.abc.Iterable) will return True.
If the object returned by __iter__ is not an iterator, iter will fail immediately and raise a TypeError.
The special handling of __getitem__ exists for backwards compatibility reasons. Quoting again from Fluent Python:
That is why any Python sequence is iterable: they all implement __getitem__. In fact, the standard sequences also implement __iter__, and yours should too, because the special handling of __getitem__ exists for backward compatibility reasons and may be gone in the future (although it is not deprecated as I write this).
I've been studying this problem quite a bit lately. Based on that, my conclusion is that nowadays this is the best approach:
from collections.abc import Iterable  # drop `.abc` with Python 2.7 or lower

def iterable(obj):
    return isinstance(obj, Iterable)
The above has been recommended already earlier, but the general consensus has been that using iter() would be better:
def iterable(obj):
    try:
        iter(obj)
    except Exception:
        return False
    else:
        return True
We've used iter() in our code as well for this purpose, but I've lately started to get more and more annoyed by objects which only have __getitem__ being considered iterable. There are valid reasons to have __getitem__ in a non-iterable object and with them the above code doesn't work well. As a real life example we can use Faker. The above code reports it being iterable but actually trying to iterate it causes an AttributeError (tested with Faker 4.0.2):
>>> from faker import Faker
>>> fake = Faker()
>>> iter(fake) # No exception, must be iterable
<iterator object at 0x7f1c71db58d0>
>>> list(fake) # Ooops
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/.../site-packages/faker/proxy.py", line 59, in __getitem__
    return self._factory_map[locale.replace('-', '_')]
AttributeError: 'int' object has no attribute 'replace'
If we used isinstance(), we wouldn't accidentally consider Faker instances (or any other objects having only __getitem__) to be iterable:
>>> from collections.abc import Iterable
>>> from faker import Faker
>>> isinstance(Faker(), Iterable)
False
Earlier answers commented that using iter() is safer as the old way to implement iteration in Python was based on __getitem__ and the isinstance() approach wouldn't detect that. This may have been true with old Python versions, but based on my pretty exhaustive testing isinstance() works great nowadays. The only case where isinstance() didn't work but iter() did was with UserDict when using Python 2. If that's relevant, it's possible to use isinstance(item, (Iterable, UserDict)) to get that covered.
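If that Python 2 edge case matters, the combined check mentioned above could look like this sketch (UserDict lives in its own module in Python 2; in Python 3 it is collections.UserDict and is already detected as Iterable):

from collections import Iterable  # collections.abc.Iterable on Python 3
from UserDict import UserDict     # Python 2 location; not needed on Python 3

def iterable(obj):
    return isinstance(obj, (Iterable, UserDict))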
Since Python 3.5 you can use the typing module from the standard library for type related things:
from typing import Iterable
...
if isinstance(my_item, Iterable):
    print(True)
This isn't sufficient: the object returned by __iter__ must implement the iteration protocol (i.e. a __next__ method, or next in Python 2). See the relevant section in the documentation.
In Python, a good practice is to "try and see" instead of "checking".
In Python <= 2.5, you can't and shouldn't - iterable was an "informal" interface.
But since Python 2.6 and 3.0 you can leverage the new ABC (abstract base class) infrastructure along with some builtin ABCs which are available in the collections module:
from collections import Iterable

class MyObject(object):
    pass

mo = MyObject()
print isinstance(mo, Iterable)

Iterable.register(MyObject)
print isinstance(mo, Iterable)
print isinstance("abc", Iterable)
Now, whether this is desirable or actually works, is just a matter of conventions. As you can see, you can register a non-iterable object as Iterable - and it will raise an exception at runtime. Hence, isinstance acquires a "new" meaning - it just checks for "declared" type compatibility, which is a good way to go in Python.
On the other hand, if your object does not satisfy the interface you need, what are you going to do? Take the following example:
from collections import Iterable
from traceback import print_exc

def check_and_raise(x):
    if not isinstance(x, Iterable):
        raise TypeError, "%s is not iterable" % x
    else:
        for i in x:
            print i

def just_iter(x):
    for i in x:
        print i

class NotIterable(object):
    pass

if __name__ == "__main__":
    try:
        check_and_raise(5)
    except:
        print_exc()
        print

    try:
        just_iter(5)
    except:
        print_exc()
        print

    try:
        Iterable.register(NotIterable)
        ni = NotIterable()
        check_and_raise(ni)
    except:
        print_exc()
        print
If the object doesn't satisfy what you expect, you just throw a TypeError, but if the proper ABC has been registered, your check is useless. On the contrary, if the __iter__ method is available, Python will automatically recognize objects of that class as being Iterable.
So, if you just expect an iterable, iterate over it and forget it. On the other hand, if you need to do different things depending on input type, you might find the ABC infrastructure pretty useful.
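For instance, a small sketch of such type-dependent behaviour (the doubling logic is just a placeholder):

from collections import Iterable  # collections.abc.Iterable on Python 3.3+

def double(x):
    if isinstance(x, Iterable):
        return [item * 2 for item in x]  # element-wise for iterables
    return x * 2                         # scalar fallback

double([1, 2, 3])  # [2, 4, 6]
double(4)          # 8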
try:
    pass  # treat object as iterable
except TypeError, e:
    pass  # object is not actually iterable
Don't run checks to see if your duck really is a duck to see if it is iterable or not, treat it as if it was and complain if it wasn't.
You could try this:
def iterable(a):
    try:
        (x for x in a)
        return True
    except TypeError:
        return False
If we can make a generator that iterates over it (but never use the generator so it doesn't take up space), it's iterable. Seems like a "duh" kind of thing. Why do you need to determine if a variable is iterable in the first place?
The best solution I've found so far:
hasattr(obj, '__contains__')
which basically checks if the object implements the in operator.
Advantages (none of the other solutions has all three):
it is an expression (works as a lambda, as opposed to the try...except variant)
it is (should be) implemented by all iterables, including strings (as opposed to __iter__)
works on any Python >= 2.5
Notes:
the Python philosophy of "ask for forgiveness, not permission" doesn't work well when e.g. in a list you have both iterables and non-iterables and you need to treat each element differently according to its type (treating iterables on try and non-iterables on except would work, but it would look butt-ugly and misleading)
solutions to this problem which attempt to actually iterate over the object (e.g. [x for x in obj]) to check if it's iterable may induce significant performance penalties for large iterables (especially if you just need the first few elements of the iterable, for example) and should be avoided
I found a nice solution here:
isiterable = lambda obj: isinstance(obj, basestring) \
                      or getattr(obj, '__iter__', False)
According to the Python 2 Glossary, iterables are
all sequence types (such as list, str, and tuple) and some non-sequence types like dict and file and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object.
Of course, given the general coding style for Python based on the fact that it's “Easier to ask for forgiveness than permission.”, the general expectation is to use
try:
    for i in object_in_question:
        do_something
except TypeError:
    do_something_for_non_iterable
But if you need to check it explicitly, you can test for an iterable by hasattr(object_in_question, "__iter__") or hasattr(object_in_question, "__getitem__"). You need to check for both, because strs don't have an __iter__ method (at least not in Python 2, in Python 3 they do) and because generator objects don't have a __getitem__ method.
I often find it convenient, inside my scripts, to define an iterable function.
(Now incorporates Alfe's suggested simplification):
import collections

def iterable(obj):
    return isinstance(obj, collections.Iterable)
so you can test if any object is iterable in the very readable form
if iterable(obj):
    # act on iterable
else:
    # not iterable
as you would do with the callable function
EDIT: if you have numpy installed, you can simply do from numpy import iterable,
which is simply something like
def iterable(obj):
    try:
        iter(obj)
    except:
        return False
    return True
If you do not have numpy, you can simply implement this code, or the one above.
pandas has a built-in function like that:
from pandas.util.testing import isiterable
It's always eluded me why Python has callable(obj) -> bool but not iterable(obj) -> bool...
surely it's easier to do hasattr(obj, '__call__'), even if it is slower.
Since just about every other answer recommends using try/except TypeError, and testing for exceptions is generally considered bad practice in most languages, here's an implementation of iterable(obj) -> bool I've grown more fond of and use often:
For Python 2's sake, I'll use a lambda just for that extra performance boost...
(in Python 3 it doesn't matter what you use for defining the function, def has roughly the same speed as lambda)
iterable = lambda obj: hasattr(obj,'__iter__') or hasattr(obj,'__getitem__')
Note that this function executes faster for objects with __iter__ since it doesn't test for __getitem__.
Most iterable objects should rely on __iter__, while special-case objects fall back to __getitem__; at least one of the two is required for an object to be iterable.
(and since this is standard, it affects C objects as well)
def is_iterable(x):
    try:
        0 in x
    except TypeError:
        return False
    else:
        return True
This will say yes to all manner of iterable objects, but it will say no to strings in Python 2. (That's what I want, for example, when a recursive function could take a string or a container of strings. In that situation, asking forgiveness may lead to obfuscode, and it's better to ask permission first.)
import numpy

class Yes:
    def __iter__(self):
        yield 1
        yield 2
        yield 3

class No:
    pass

class Nope:
    def __iter__(self):
        return 'nonsense'
assert is_iterable(Yes())
assert is_iterable(range(3))
assert is_iterable((1,2,3)) # tuple
assert is_iterable([1,2,3]) # list
assert is_iterable({1,2,3}) # set
assert is_iterable({1:'one', 2:'two', 3:'three'}) # dictionary
assert is_iterable(numpy.array([1,2,3]))
assert is_iterable(bytearray("not really a string", 'utf-8'))
assert not is_iterable(No())
assert not is_iterable(Nope())
assert not is_iterable("string")
assert not is_iterable(42)
assert not is_iterable(True)
assert not is_iterable(None)
Many other strategies here will say yes to strings. Use them if that's what you want.
import collections
import numpy
assert isinstance("string", collections.Iterable)
assert isinstance("string", collections.Sequence)
assert numpy.iterable("string")
assert iter("string")
assert hasattr("string", '__getitem__')
Note: is_iterable() will say yes to strings of type bytes and bytearray.
bytes objects in Python 3 are iterable: True == is_iterable(b"string") == is_iterable("string".encode('utf-8')). There is no such type in Python 2.
bytearray objects in Python 2 and 3 are iterable: True == is_iterable(bytearray(b"abc")).
The OP's hasattr(x, '__iter__') approach will say yes to strings in Python 3 and no in Python 2 (no matter whether '' or b'' or u''). Thanks to @LuisMasuelli for noticing it will also let you down on a buggy __iter__.
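As an illustration of the recursive use case mentioned at the top of this answer, here is a sketch (flatten is a made-up name; it relies on is_iterable() saying no to strings):

def flatten(x):
    if is_iterable(x):
        result = []
        for item in x:
            result.extend(flatten(item))
        return result
    return [x]  # strings and other non-iterables end the recursion

flatten([1, [2, "ab"], 3])  # [1, 2, 'ab', 3]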
There are a lot of ways to check if an object is iterable:
from collections.abc import Iterable

myobject = 'Roster'
if isinstance(myobject, Iterable):
    print(f"{myobject} is iterable")
else:
    print(f"{myobject} is not iterable")
The easiest way, respecting Python's duck typing, is to catch the error (Python knows perfectly well what it expects from an object in order to build an iterator):
class A(object):
    def __getitem__(self, item):
        return something

class B(object):
    def __iter__(self):
        # Return a compliant iterator. Just an example
        return iter([])

class C(object):
    def __iter__(self):
        # Return crap
        return 1

class D(object):
    pass

def iterable(obj):
    try:
        iter(obj)
        return True
    except:
        return False

assert iterable(A())
assert iterable(B())
assert not iterable(C())  # iter() rejects the non-iterator returned by C.__iter__
assert not iterable(D())
Notes:
The distinction between an object that is not iterable and a buggy __iter__ implementation is irrelevant when the exception type is the same: either way, you will not be able to iterate the object.
I think I understand your concern: how does callable exist as a check if I could also rely on duck typing to raise an AttributeError if __call__ is not defined for my object, but that's not the case for iterable checking?
I don't know the answer, but you can either implement the function I (and other users) gave, or just catch the exception in your code (your implementation in that part will be like the function I wrote - just ensure you isolate the iterator creation from the rest of the code so you can capture the exception and distinguish it from another TypeError).
The isiterable function in the following code returns True if the object is iterable; if it's not iterable, it returns False.
def isiterable(object_):
    return hasattr(type(object_), "__iter__")
example
fruits = ("apple", "banana", "peach")
isiterable(fruits)  # returns True

num = 345
isiterable(num)  # returns False

isiterable(str)  # returns False, because str itself is a class (its type is type) and not iterable

hello = "hello dude !"
isiterable(hello)  # returns True because string objects are iterable
Instead of checking for the __iter__ attribute, you could check for the __len__ attribute, which is implemented by every Python builtin container, including strings.
>>> hasattr(1, "__len__")
False
>>> hasattr(1.3, "__len__")
False
>>> hasattr("a", "__len__")
True
>>> hasattr([1,2,3], "__len__")
True
>>> hasattr({1,2}, "__len__")
True
>>> hasattr({"a":1}, "__len__")
True
>>> hasattr(("a", 1), "__len__")
True
Non-iterable objects would not implement this for obvious reasons. However, it does not catch user-defined iterables that do not implement it, nor generator expressions, which iter can deal with. However, this can be done in a line, and adding a simple or expression checking for generators would fix this problem. (Note that writing type(my_generator_expression) == generator would throw a NameError. Refer to the answer quoted below instead.)
You can use GeneratorType from types:
>>> import types
>>> types.GeneratorType
<class 'generator'>
>>> gen = (i for i in range(10))
>>> isinstance(gen, types.GeneratorType)
True
--- accepted answer by utdemir
(This makes it useful for checking if you can call len on the object though.)
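Combining the two checks as suggested gives something like this sketch (the function name is made up; note it still misses user-defined iterators that have neither __len__ nor generator type):

import types

def is_sized_or_generator(obj):
    # __len__ covers the builtin containers and strings;
    # GeneratorType catches generator expressions, which have no __len__
    return hasattr(obj, "__len__") or isinstance(obj, types.GeneratorType)

is_sized_or_generator([1, 2, 3])            # True
is_sized_or_generator(i for i in range(3))  # True
is_sized_or_generator(42)                   # False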
Not really "correct", but it can serve as a quick check of the most common types like strings, tuples, floats, etc...
>>> '__iter__' in dir('sds')
True
>>> '__iter__' in dir(56)
False
>>> '__iter__' in dir([5,6,9,8])
True
>>> '__iter__' in dir({'jh':'ff'})
True
>>> '__iter__' in dir({'jh'})
True
>>> '__iter__' in dir(56.9865)
False
In my code I used to check for non-iterable objects:
hasattr(myobject, '__trunc__')
This is quite quick and can be used to check for iterables too (use not).
I'm not 100% sure whether this solution works for all objects; maybe others can give some more background on it. The __trunc__ method seems to be related to numeric types (all objects that can be rounded to integers need it). But I haven't found any object that has __trunc__ together with __iter__ or __getitem__.
Kinda late to the party, but I asked myself this question and saw this, then thought of an answer. I don't know if someone already posted this. But essentially, I've noticed that all iterable types have __getitem__() in their dict. This is how you would check if an object was an iterable without even trying. (Pun intended)
def is_attr(arg):
    return '__getitem__' in dir(arg)

Why isn't "is not" working with ConfigParser.sections() in a list comprehension? [duplicate]

In a comment on this question, I saw a statement that recommended using
result is not None
vs
result != None
What is the difference? And why might one be recommended over the other?
== is an equality test. It checks whether the right hand side and the left hand side are equal objects (according to their __eq__ or __cmp__ methods.)
is is an identity test. It checks whether the right hand side and the left hand side are the very same object. No method calls are made; objects can't influence the is operation.
You use is (and is not) for singletons, like None, where you don't care about objects that might want to pretend to be None or where you want to protect against objects breaking when being compared against None.
First, let me go over a few terms. If you just want your question answered, scroll down to "Answering your question".
Definitions
Object identity: When you create an object, you can assign it to a variable. You can then also assign it to another variable. And another.
>>> button = Button()
>>> cancel = button
>>> close = button
>>> dismiss = button
>>> print(cancel is close)
True
In this case, cancel, close, and dismiss all refer to the same object in memory. You only created one Button object, and all three variables refer to this one object. We say that cancel, close, and dismiss all refer to identical objects; that is, they refer to one single object.
Object equality: When you compare two objects, you usually don't care that it refers to the exact same object in memory. With object equality, you can define your own rules for how two objects compare. When you write if a == b:, you are essentially saying if a.__eq__(b):. This lets you define a __eq__ method on a so that you can use your own comparison logic.
Rationale for equality comparisons
Rationale: Two objects have the exact same data, but are not identical. (They are not the same object in memory.)
Example: Strings
>>> greeting = "It's a beautiful day in the neighbourhood."
>>> a = unicode(greeting)
>>> b = unicode(greeting)
>>> a is b
False
>>> a == b
True
Note: I use unicode strings here because Python is smart enough to reuse regular strings without creating new ones in memory.
Here, I have two unicode strings, a and b. They have the exact same content, but they are not the same object in memory. However, when we compare them, we want them to compare equal. What's happening here is that the unicode object has implemented the __eq__ method.
class unicode(object):
    # ...

    def __eq__(self, other):
        if len(self) != len(other):
            return False
        for i, j in zip(self, other):
            if i != j:
                return False
        return True
Note: __eq__ on unicode is definitely implemented more efficiently than this.
Rationale: Two objects have different data, but are considered the same object if some key data is the same.
Example: Most types of model data
>>> import datetime
>>> a = Monitor()
>>> a.make = "Dell"
>>> a.model = "E770s"
>>> a.owner = "Bob Jones"
>>> a.warranty_expiration = datetime.date(2030, 12, 31)
>>> b = Monitor()
>>> b.make = "Dell"
>>> b.model = "E770s"
>>> b.owner = "Sam Johnson"
>>> b.warranty_expiration = datetime.date(2005, 8, 22)
>>> a is b
False
>>> a == b
True
Here, I have two Dell monitors, a and b. They have the same make and model. However, they neither have the same data nor are the same object in memory. However, when we compare them, we want them to compare equal. What's happening here is that the Monitor object implemented the __eq__ method.
class Monitor(object):
    # ...

    def __eq__(self, other):
        return self.make == other.make and self.model == other.model
Answering your question
When comparing to None, always use is not. None is a singleton in Python - there is only ever one instance of it in memory.
By comparing identity, this can be performed very quickly. Python checks whether the object you're referring to has the same memory address as the global None object - a very, very fast comparison of two numbers.
By comparing equality, Python has to look up whether your object has an __eq__ method. If it does not, it examines each superclass looking for an __eq__ method. If it finds one, Python calls it. This is especially bad if the __eq__ method is slow and doesn't immediately return when it notices that the other object is None.
Did you not implement __eq__? Then Python will probably find the __eq__ method on object and use that instead - which just checks for object identity anyway.
When comparing most other things in Python, you will be using !=.
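A small sketch of the None-comparison difference described above (Chatty is a made-up class):

class Chatty(object):
    def __eq__(self, other):
        print("__eq__ called")
        return NotImplemented

c = Chatty()
print(c is None)  # False: a pure identity check; no method is looked up or called
print(c == None)  # prints "__eq__ called" first, then False via the identity fallback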
Consider the following:
class Bad(object):
    def __eq__(self, other):
        return True

c = Bad()
c is None  # False, equivalent to id(c) == id(None)
c == None  # True, equivalent to c.__eq__(None)
None is a singleton, and therefore identity comparison will always work, whereas an object can fake the equality comparison via .__eq__().
>>> () is ()
True
>>> 1 is 1
True
>>> (1,) == (1,)
True
>>> (1,) is (1,)
False
>>> a = (1,)
>>> b = a
>>> a is b
True
Some objects are singletons, and thus is with them is equivalent to ==. Most are not.

Does python coerce types when doing operator overloading?

I have the following code:
a = str('5')
b = int(5)
a == b
# False
But if I make a subclass of int, and reimplement __cmp__:
class A(int):
    def __cmp__(self, other):
        return super(A, self).__cmp__(other)
a = str('5')
b = A(5)
a == b
# TypeError: A.__cmp__(x,y) requires y to be a 'A', not a 'str'
Why are these two different? Is the python runtime catching the TypeError thrown by int.__cmp__(), and interpreting that as a False value? Can someone point me to the bit in the 2.x cpython source that shows how this is working?
The documentation isn't completely explicit on this point, but see here:
If both are numbers, they are converted to a common type. Otherwise, objects of different types always compare unequal, and are ordered consistently but arbitrarily. You can control comparison behavior of objects of non-built-in types by defining a __cmp__ method or rich comparison methods like __gt__, described in section Special method names.
This (particularly the implicit contrast between "objects of different types" and "objects of non-built-in types") suggests that the normal process of actually calling comparison methods is skipped for built-in types: if you try to compare objects of two dfferent (and non-numeric) built-in types, it just short-circuits to an automatic False.
A comparison decision tree for a == b looks something like:

Python calls a.__cmp__(b)
    a checks that b is an appropriate type
    if b is an appropriate type, return -1, 0, or +1
    if b is not, return NotImplemented
if -1, 0, or +1 was returned, Python is done; otherwise
if NotImplemented was returned, Python tries b.__cmp__(a)
    b checks that a is an appropriate type
    if a is an appropriate type, return -1, 0, or +1
    if a is not, return NotImplemented
if -1, 0, or +1 was returned, Python is done; otherwise
if NotImplemented was returned again, the answer is False
Not an exact answer, but hopefully it helps.
If I understood your problem right, you need something like:
>>> class A(int):
...     def __cmp__(self, other):
...         return super(A, self).__cmp__(A(other))  # <--- A(other) instead of other
...
>>> a = str('5')
>>> b = A(5)
>>> a == b
True
Updated
Regarding the 2.x CPython source, you can find the reason for this result in typeobject.c, in the function wrap_cmpfunc, which checks two things: that other uses the same compare function, and that other is a subtype of self.
if (Py_TYPE(other)->tp_compare != func &&
    !PyType_IsSubtype(Py_TYPE(other), Py_TYPE(self))) {
    // ....
}
