how to check if an iterable allows more than one pass? - python

In Python 3, how can I check whether an object is a container (rather than an iterator that may allow only one pass)?
Here's an example:
def renormalize(cont):
    '''
    each value from the original container is scaled by the same factor
    such that their total becomes 1.0
    '''
    total = sum(cont)
    for v in cont:
        yield v/total

list(renormalize(range(5))) # [0.0, 0.1, 0.2, 0.3, 0.4]
list(renormalize(k for k in range(5))) # [] - a bug!
Obviously, when the renormalize function receives a generator expression, it does not work as intended. It assumes it can iterate through the container multiple times, while the generator allows only one pass through it.
Ideally, I'd like to do this:
def renormalize(cont):
    if not is_container(cont):
        raise ContainerExpectedException
    # ...
How can I implement is_container?
I suppose I could check if the argument is empty right as we're starting to do the second pass through it. But this approach doesn't work for more complicated functions where it's not obvious when exactly the second pass starts. Furthermore, I'd rather put the validation at the function entrance, rather than deep inside the function (and shift it around whenever the function is modified).
I can of course rewrite the renormalize function to work correctly with a one-pass iterator, but that requires copying the input data to a container. The performance impact of copying millions of large lists "just in case they are not lists" is ridiculous.
EDIT: My original example used a weighted_average function:
def weighted_average(c):
    '''
    returns weighted average of a container c
    c contains values and weights in tuples
    weights don't need to sum up to 1 (automatically renormalized)
    '''
    return sum((v * w for v, w in c)) / sum((w for v, w in c))

weighted_average([(0,1), (1,1)]) # 0.5
weighted_average([(k, 1) for k in range(2)]) # 0.5
weighted_average((k, 1) for k in range(2)) # mistake
But it was not the best example since the version of weighted_average rewritten to use a single pass is arguably better anyway:
def weighted_average(it):
    '''
    returns weighted average of an iterator it
    it yields values and weights in tuples
    weights don't need to sum up to 1 (automatically renormalized)
    '''
    total_value = 0
    total_weight = 0
    for v, w in it:
        total_value += v * w  # weight each value before accumulating
        total_weight += w
    return total_value / total_weight

Although all iterables should subclass collections.abc.Iterable, unfortunately not all of them do. Here is an answer based on what interface the objects implement, instead of what they "declare".
Short answer:
A "container" as you call it, i.e. a list or tuple that can be iterated over more than once, as opposed to a generator that will be exhausted, will typically implement both __iter__ and __getitem__. Hence you can do this:
>>> def is_container_iterable(o):
...     return hasattr(o, '__iter__') and hasattr(o, '__getitem__')
...
>>> is_container_iterable([])
True
>>> is_container_iterable(())
True
>>> is_container_iterable({})
True
>>> is_container_iterable(range(5))
True
>>> is_container_iterable(iter([]))
False
Long answer:
However, you can make an iterable that will not be exhausted and does not support __getitem__. For example, an object that generates prime numbers: you could repeat the generation many times if you want, but retrieving the 1065th prime by index would take a lot of calculation, so you may not want to support that. :-)
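As a rough sketch of that prime-number case (the class name Primes and the trial-division approach are illustrative assumptions, not part of the original answer), here is an iterable that can be passed over repeatedly yet has no __getitem__, so the check above would wrongly reject it:

```python
from itertools import islice

class Primes:
    """Hypothetical re-iterable: each __iter__ call starts a fresh generator."""

    def __iter__(self):
        found = []          # primes discovered so far in this pass
        candidate = 2
        while True:
            # candidate is prime if no smaller prime divides it
            if all(candidate % p for p in found):
                found.append(candidate)
                yield candidate
            candidate += 1

primes = Primes()
first_pass = list(islice(primes, 5))
second_pass = list(islice(primes, 5))   # a second pass starts over
has_getitem = hasattr(primes, '__getitem__')
```

Both passes yield the same five primes, while has_getitem is False.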
So is there any more "reliable" way?
Well, all iterables will implement an __iter__ function that will return an iterator. The iterators will have a __next__ function. This is what is used when iterating over it. Calling __next__ repeatedly will in the end exhaust the iterator.
So if it has a __next__ function it is an iterator, and will be exhausted.
>>> def foo():
...     for x in range(5):
...         yield x
...
>>> f = foo()
>>> f.__next__
<method-wrapper '__next__' of generator object at 0xb73c02d4>
Iterables that are not themselves iterators will not have a __next__ function, but will implement an __iter__ function that returns an iterator:
>>> r = range(5)
>>> r.__next__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'range' object has no attribute '__next__'
>>> ri = iter(r)
>>> ri.__next__
<method-wrapper '__next__' of range_iterator object at 0xb73bef80>
So you can check that the object has __iter__ but that it does not have __next__.
>>> def is_container_iterable(o):
...     return hasattr(o, '__iter__') and not hasattr(o, '__next__')
...
>>> is_container_iterable(())
True
>>> is_container_iterable([])
True
>>> is_container_iterable({})
True
>>> is_container_iterable(range(5))
True
>>> is_container_iterable(iter(range(5)))
False
Iterators also have an __iter__ function, which returns self.
>>> iter(f) is f
True
>>> iter(r) is r
False
>>> iter(ri) is ri
True
Hence, you can do this variation of the check:
>>> def is_container_iterable(o):
...     return iter(o) is not o
...
>>> is_container_iterable([])
True
>>> is_container_iterable(())
True
>>> is_container_iterable({})
True
>>> is_container_iterable(range(5))
True
>>> is_container_iterable(iter([]))
False
That would fail if you implement an object that returns a broken iterator, one that does not return self when you call iter() on it again. But then your (or a third-party module's) code is actually doing things wrong.
It does depend on making an iterator, though, and hence on calling the object's __iter__, which in theory may have side effects, while the above hasattr calls should not have side effects. OK, so hasattr calls __getattribute__, which could have side effects too. But you can fix that thusly:
>>> def is_container_iterable(o):
...     try:
...         object.__getattribute__(o, '__iter__')
...     except AttributeError:
...         return False
...     try:
...         object.__getattribute__(o, '__next__')
...     except AttributeError:
...         return True
...     return False
...
>>> is_container_iterable([])
True
>>> is_container_iterable(())
True
>>> is_container_iterable({})
True
>>> is_container_iterable(range(5))
True
>>> is_container_iterable(iter(range(5)))
False
This one is reasonably safe, and should work in all cases except if the object generates __next__ or __iter__ dynamically on __getattribute__ calls, but if you do that you are insane. :-)
Instinctively, my preferred version would be the iter(o) is not o check, but I haven't ever needed to do this, so that's not based on experience.
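To tie this back to the question, here is a minimal sketch of validation at the function entrance. It assumes the iter(o) is not o check, uses the built-in TypeError in place of the question's ContainerExpectedException, and introduces a hypothetical helper _renormalize. The split into two functions matters: renormalize in the question is a generator function, so a check placed inside it would not run until the first item is requested.

```python
def is_container_iterable(obj):
    """True if iter(obj) hands back a new object rather than obj itself."""
    return iter(obj) is not obj

def renormalize(cont):
    # Validate eagerly, then delegate to a generator, so the error is raised
    # at call time rather than on the first next().
    if not is_container_iterable(cont):
        raise TypeError("expected a re-iterable container, got a one-pass iterator")
    return _renormalize(cont)

def _renormalize(cont):
    # Hypothetical helper holding the two-pass logic from the question.
    total = sum(cont)
    for v in cont:
        yield v / total

ok = list(renormalize(range(5)))
try:
    renormalize(k for k in range(5))
    rejected = False
except TypeError:
    rejected = True
```

The range is accepted and the generator expression is rejected immediately, before any iteration happens.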

You could use the abstract base classes defined in the collections.abc module to check whether it is an instance of collections.abc.Iterator.
import collections.abc

if isinstance(it, collections.abc.Iterator):
    ...  # handle the iterator case
Personally though I find your iterator friendly version of weighted average far easier to read than the multiple list comprehension / sum version. :-)
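A small sketch of how that check behaves on a few common objects (the helper name is_one_pass is an assumption made for illustration):

```python
from collections.abc import Iterator

def is_one_pass(obj):
    """Hypothetical helper: True for iterators (including generators)."""
    return isinstance(obj, Iterator)

results = {
    'list': is_one_pass([1, 2, 3]),
    'range': is_one_pass(range(3)),
    'generator expression': is_one_pass(x for x in range(3)),
    'iter(list)': is_one_pass(iter([1, 2, 3])),
    'map': is_one_pass(map(str, [1, 2, 3])),
}
```

Lists and ranges come out as False; generator expressions, list iterators and map objects come out as True.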

The best way would be to use the abstract base class infrastructure:
import collections.abc

def weighted_average(c):
    if not isinstance(c, collections.abc.Sequence):
        raise ContainerExpectedException
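One caveat worth checking before relying on Sequence: it is stricter than "can be iterated more than once". The sketch below compares it with collections.abc.Collection (available from Python 3.6), which is offered here only as a possible looser alternative, not as something the answer above proposes.

```python
from collections.abc import Sequence, Collection

candidates = [[1, 2], (1, 2), range(2), 'ab', {1, 2}, {1: 2}, iter([1, 2])]

# Sequence accepts list, tuple, range and str, but rejects set and dict.
is_sequence = [isinstance(c, Sequence) for c in candidates]

# Collection (sized, iterable, supports "in") also accepts set and dict.
is_collection = [isinstance(c, Collection) for c in candidates]
```

Neither check accepts the list iterator at the end, which is the case the question wants to reject.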

Related

What are "iterable", "iterator", and "iteration" in Python? How are they defined?
Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.
In Python, iterable and iterator have specific meanings.
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.
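As an illustration of the description above, here is roughly what a for loop does, written out by hand with iter() and next(). The names manual_for and Letters are made up for this sketch; Letters shows the __getitem__-only kind of iterable.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)        # step 1: get an iterator
    while True:
        try:
            item = next(iterator)    # step 2: ask for the next item
        except StopIteration:        # step 3: the iterator signals it is done
            break
        collected.append(item)
    return collected

class Letters:
    """Iterable through __getitem__ only (no __iter__)."""
    def __getitem__(self, index):
        return 'abc'[index]          # raises IndexError past the end

from_list = manual_for([10, 20, 30])
from_getitem = manual_for(Letters())
```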
Here's the explanation I use in teaching Python classes:
An ITERABLE is:
anything that can be looped over (i.e. you can loop over a string or file) or
anything that can appear on the right-side of a for-loop: for x in iterable: ... or
anything you can call with iter() that will return an ITERATOR: iter(obj) or
an object that defines __iter__ that returns a fresh ITERATOR,
or it may have a __getitem__ method suitable for indexed lookup.
An ITERATOR is an object:
with state that remembers where it is during iteration,
with a __next__ method that:
returns the next value in the iteration
updates the state to point at the next value
signals when it is done by raising StopIteration
and that is self-iterable (meaning that it has an __iter__ method that returns self).
Notes:
The __next__ method in Python 3 is spelt next in Python 2, and
The builtin function next() calls that method on the object passed to it.
For example:
>>> s = 'cat'      # s is an ITERABLE
                   # s is a str object that is immutable
                   # s has no state
                   # s has a __getitem__() method
>>> t = iter(s)    # t is an ITERATOR
                   # t has state (it starts by pointing at the "c")
                   # t has a __next__() method and an __iter__() method
>>> next(t)        # the next() function returns the next value and advances the state
'c'
>>> next(t)        # the next() function returns the next value and advances
'a'
>>> next(t)        # the next() function returns the next value and advances
't'
>>> next(t)        # next() raises StopIteration to signal that iteration is complete
Traceback (most recent call last):
...
StopIteration
>>> iter(t) is t   # the iterator is self-iterable
True
The above answers are great, but like most of what I've seen, they don't stress the distinction enough for people like me.
Also, people tend to get "too Pythonic" by putting definitions like "X is an object that has a __foo__() method" first. Such definitions are correct (they are based on the duck-typing philosophy), but the focus on methods tends to get in the way when trying to understand the concept in its simplicity.
So I add my version.
In natural language,
iteration is the process of taking one element at a time in a row of elements.
In Python,
iterable is an object that is, well, iterable, which simply put, means that
it can be used in iteration, e.g. with a for loop. How? By using iterator.
I'll explain below.
... while iterator is an object that defines how to actually do the
iteration--specifically what is the next element. That's why it must have
next() method.
Iterators are themselves also iterable, with the distinction that their __iter__() method returns the same object (self), regardless of whether or not its items have been consumed by previous calls to next().
So what does Python interpreter think when it sees for x in obj: statement?
Look, a for loop. Looks like a job for an iterator... Let's get one. ...
There's this obj guy, so let's ask him.
"Mr. obj, do you have your iterator?" (... calls iter(obj), which calls
obj.__iter__(), which happily hands out a shiny new iterator _i.)
OK, that was easy... Let's start iterating then. (x = _i.next() ... x = _i.next()...)
Since Mr. obj succeeded in this test (by having certain method returning a valid iterator), we reward him with adjective: you can now call him "iterable Mr. obj".
However, in simple cases, you don't normally benefit from having iterator and iterable separately. So you define only one object, which is also its own iterator. (Python does not really care that _i handed out by obj wasn't all that shiny, but just the obj itself.)
This is why in most examples I've seen (and what had been confusing me over and over),
you can see:
class IterableExample(object):
    def __iter__(self):
        return self
    def next(self):
        pass
instead of
class Iterator(object):
    def next(self):
        pass
class Iterable(object):
    def __iter__(self):
        return Iterator()
There are cases, though, when you can benefit from having the iterator separated from the iterable, such as when you want to have one row of items but several "cursors". For example, when you want to work with "current" and "forthcoming" elements, you can have separate iterators for both. Or multiple threads pulling from a huge list: each can have its own iterator to traverse all items. See @Raymond's and @glglgl's answers.
Imagine what you could do:
class SmartIterableExample(object):
    def create_iterator(self):
        # An amazingly powerful yet simple way to create arbitrary
        # iterator, utilizing object state (or not, if you are fan
        # of functional), magic and nuclear waste--no kittens hurt.
        pass # don't forget to add the next() method
    def __iter__(self):
        return self.create_iterator()
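A minimal concrete take on the "one row of items, several cursors" idea, written for Python 3. The class Playlist and its contents are invented for this sketch, which simply delegates to the underlying list's iterator rather than doing anything clever.

```python
class Playlist:
    """Hypothetical iterable; every __iter__ call returns a new cursor."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __iter__(self):
        # iter() on the underlying list returns a fresh, independent iterator
        return iter(self._tracks)

playlist = Playlist(['intro', 'verse', 'chorus', 'outro'])

current = iter(playlist)
upcoming = iter(playlist)
next(upcoming)                      # keep "upcoming" one step ahead

# Walk both cursors together to pair each track with the one after it.
pairs = list(zip(current, upcoming))
```

Because the two cursors are separate objects, advancing upcoming does not disturb current.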
Notes:
I'll repeat again: an iterable is not necessarily an iterator, and an object that only has next() cannot be used as
a "source" in a for loop. What the for loop primarily needs is __iter__()
(that returns something with next()).
Of course, for is not the only construct that iterates, so the above applies to some other
constructs as well (comprehensions, generator expressions, ...).
Iterator's next() can throw StopIteration to stop iteration. Does not have to,
though, it can iterate forever or use other means.
In the above "thought process", _i does not really exist. I've made up that name.
There's a small change in Python 3.x: next() method (not the built-in) now
must be called __next__(). Yes, it should have been like that all along.
You can also think of it like this: an iterable has the data, an iterator pulls the next item.
Disclaimer: I'm not a developer of any Python interpreter, so I don't really know what the interpreter "thinks". The musings above are solely demonstration of how I understand the topic from other explanations, experiments and real-life experience of a Python newbie.
An iterable is an object which has an __iter__() method. It can possibly be iterated over several times, as is the case for lists and tuples.
An iterator is the object which iterates. It is returned by an __iter__() method, returns itself via its own __iter__() method and has a next() method (__next__() in 3.x).
Iteration is the process of calling this next() (or __next__()) until it raises StopIteration.
Example:
>>> a = [1, 2, 3] # iterable
>>> b1 = iter(a) # iterator 1
>>> b2 = iter(a) # iterator 2, independent of b1
>>> next(b1)
1
>>> next(b1)
2
>>> next(b2) # start over, as it is the first call to b2
1
>>> next(b1)
3
>>> next(b1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> b1 = iter(a) # new one, start over
>>> next(b1)
1
Here's my cheat sheet:
sequence
  +
  |
  v
def __getitem__(self, index: int):
  +    ...
  |    raise IndexError
  |
  |
  |           def __iter__(self):
  |             +    ...
  |             |    return <iterator>
  |             |
  |             |
  +--> or <-----+     def __next__(self):
       +        |       +    ...
       |        |       |    raise StopIteration
       v        |       |
    iterable    |       |
       +        |       |
       |        |       v
       |        +----> and    +-------> iterator
       |                                   ^
       v                                   |
iter(<iterable>)    +----------------------+
                                           |
def generator():                           |
  +    yield 1                             |
  |                 generator_expression +-+
  |                                        |
  +-> generator() +-> generator_iterator +-+
Quiz: Do you see how...
every iterator is an iterable?
a container object's __iter__() method can be implemented as a generator?
an iterable that has a __next__ method is not necessarily an iterator?
Answers:
Every iterator must have an __iter__ method. Having __iter__ is enough to be an iterable. Therefore every iterator is an iterable.
When __iter__ is called it should return an iterator (return <iterator> in the diagram above). Calling a generator returns a generator iterator which is a type of iterator.
class Iterable1:
    def __iter__(self):
        # a method (which is a function defined inside a class body)
        # calling iter() converts iterable (tuple) to iterator
        return iter((1,2,3))

class Iterable2:
    def __iter__(self):
        # a generator
        for i in (1, 2, 3):
            yield i

class Iterable3:
    def __iter__(self):
        # with PEP 380 syntax
        yield from (1, 2, 3)

# passes
assert list(Iterable1()) == list(Iterable2()) == list(Iterable3()) == [1, 2, 3]
Here is an example:
class MyIterable:
    def __init__(self):
        self.n = 0

    def __getitem__(self, index: int):
        return (1, 2, 3)[index]

    def __next__(self):
        n = self.n = self.n + 1
        if n > 3:
            raise StopIteration
        return n

# if you can iter it without raising a TypeError, then it's an iterable.
iter(MyIterable())

# but obviously `MyIterable()` is not an iterator since it does not have
# an `__iter__` method.
from collections.abc import Iterator
assert isinstance(MyIterable(), Iterator) # AssertionError
I don’t know if it helps anybody but I always like to visualize concepts in my head to better understand them. So as I have a little son I visualize iterable/iterator concept with bricks and white paper.
Suppose we are in the dark room and on the floor we have bricks for my son. Bricks of different size, color, does not matter now. Suppose we have 5 bricks like those. Those 5 bricks can be described as an object – let’s say bricks kit. We can do many things with this bricks kit – can take one and then take second and then third, can change places of bricks, put first brick above the second. We can do many sorts of things with those. Therefore this bricks kit is an iterable object or sequence as we can go through each brick and do something with it. We can only do it like my little son – we can play with one brick at a time. So again I imagine myself this bricks kit to be an iterable.
Now remember that we are in the dark room. Or almost dark. The thing is that we don’t clearly see those bricks, what color they are, what shape etc. So even if we want to do something with them – aka iterate through them – we don’t really know what and how because it is too dark.
What we can do is put a piece of white fluorescent paper next to the first brick (an element of the bricks kit) so that we can see where the first brick-element is. Each time we take a brick from the kit, we move the white piece of paper to the next brick so that we can still see it in the dark room. This white piece of paper is nothing more than an iterator. It is an object as well, but an object with which we can work and play with the elements of our iterable object, the bricks kit.
That by the way explains my early mistake when I tried the following in an IDLE and got a TypeError:
>>> X = [1,2,3,4,5]
>>> next(X)
Traceback (most recent call last):
  File "<pyshell#19>", line 1, in <module>
    next(X)
TypeError: 'list' object is not an iterator
List X here was our bricks kit but NOT a white piece of paper. I needed to find an iterator first:
>>> X = [1,2,3,4,5]
>>> bricks_kit = [1,2,3,4,5]
>>> white_piece_of_paper = iter(bricks_kit)
>>> next(white_piece_of_paper)
1
>>> next(white_piece_of_paper)
2
>>>
Don’t know if it helps, but it helped me. If someone could confirm/correct visualization of the concept, I would be grateful. It would help me to learn more.
I don't think that you can get it much simpler than the documentation, however I'll try:
An iterable is something that can be iterated over. In practice it usually means a sequence, i.e. something that has a beginning and an end and some way to go through all the items in it.
You can think of an iterator as a helper pseudo-method (or pseudo-attribute) that gives (or holds) the next (or first) item in the iterable. (In practice it is just an object that defines the method next(), or __next__() in Python 3.)
Iteration is probably best explained by the Merriam-Webster definition of the word :
b : the repetition of a sequence of computer instructions a specified
number of times or until a condition is met — compare recursion
Iterables have a __iter__ method that instantiates a new iterator every time.
Iterators implement a __next__ method that returns individual items, and a __iter__ method that returns self .
Therefore, iterators are also iterable, but iterables are not iterators.
Luciano Ramalho, Fluent Python.
Iterable: anything you can iterate over, such as sequences like lists, strings, etc.
It has either a __getitem__ method or an __iter__ method. If we call the iter() function on such an object, we get an iterator.
Iterator: once we have the iterator object from the iter() function, we call its __next__() method (in Python 3) or simply next() (in Python 2) to get the elements one by one. An instance of a class that provides this method is called an iterator.
From docs:-
The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method __next__() which accesses elements in the container one at a time. When there are no more elements, __next__() raises a StopIteration exception which tells the for loop to terminate. You can call the __next__() method using the next() built-in function; this example shows how it all works:
>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    next(it)
StopIteration
Ex of a class:-
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
...     print(char)
...
m
a
p
s
Iterators are objects that implement the __iter__ and __next__ methods. If those methods are defined, we can use the object in a for loop or a comprehension.
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0

    def __iter__(self):
        print('calling __iter__') # this will be called first and only once
        return self

    def __next__(self):
        print('calling __next__') # this will be called for each iteration
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result
Iterators get exhausted. This means that after you iterate over the items, you cannot iterate again; you have to create a new object. Let's say you have a class that holds a list of cities and you want to iterate over it.
class Cities:
    def __init__(self):
        self._cities = ['Brooklyn', 'Manhattan', 'Prag', 'Madrid', 'London']
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        else:
            item = self._cities[self._index]
            self._index += 1
            return item
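The exhaustion is easy to observe with a smaller stand-in that follows the same pattern (__iter__ returns self, __next__ tracks position). Countdown is a made-up class used only for this sketch.

```python
class Countdown:
    """Hypothetical iterator: it is its own iterator, so it is one-pass."""

    def __init__(self, start):
        self._current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value

countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)   # already exhausted, nothing left
```

The second pass is empty because the position stored on the object is never reset.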
An instance of class Cities is an iterator. However, if you want to iterate over the cities again, you have to create a new object, which is an expensive operation. You can separate the class into two classes: one holds the cities and the second is an iterator that receives the cities object as an init parameter.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'London']

    def __len__(self):
        return len(self._cities)

class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item
Now if we need to create a new iterator, we do not have to create the data (the cities) again. We create a Cities object and pass it to the iterator. But we are still doing extra work. We could implement this with only one class.
An iterable is a Python object that implements the iterable protocol. It requires only an __iter__() method that returns a new instance of an iterator object.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'Paris']

    def __len__(self):
        return len(self._cities)

    def __iter__(self):
        return self.CityIterator(self)

    class CityIterator:
        def __init__(self, city_obj):
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item
Iterators have __iter__ and __next__, and iterables have __iter__, so we can say iterators are also iterables, but they are iterables that get exhausted. Iterables, on the other hand, never become exhausted, because they always return a new iterator that is then used to iterate.
Notice that the main part of the iterable code is in the iterator, and the iterable itself is nothing more than an extra layer that allows us to create and access the iterator.
Iterating over an iterable
Python has a built-in function iter() which calls __iter__(). When we iterate over an iterable, Python calls iter(), which returns an iterator, and then uses the iterator's __next__() to iterate over the data.
Note that in the above example, Cities creates an iterable but it is not a sequence type, which means we cannot get a city by index. To fix this we just add __getitem__ to the Cities class.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Budapest', 'Newcastle']

    def __len__(self):
        return len(self._cities)

    def __getitem__(self, s): # now a sequence type
        return self._cities[s]

    def __iter__(self):
        return self.CityIterator(self)

    class CityIterator:
        def __init__(self, city_obj):
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item
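For comparison, a more compact sketch of the same idea, assuming a generator method is acceptable in place of the nested iterator class. CityList is a hypothetical name chosen to avoid confusion with the Cities class above.

```python
class CityList:
    """Hypothetical variant of the Cities class above."""

    def __init__(self, names):
        self._names = list(names)

    def __len__(self):
        return len(self._names)

    def __getitem__(self, index):
        return self._names[index]

    def __iter__(self):
        # Calling a generator method creates a new iterator each time.
        yield from self._names

cities = CityList(['New York', 'Newark', 'Budapest', 'Newcastle'])
first_pass = list(cities)
second_pass = list(cities)      # a new iterator, so nothing is exhausted
third_city = cities[2]
```

The trade-off is that the iteration state lives inside the generator object instead of in explicit attributes, which is less to write but harder to inspect.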
iterable = [1, 2]
iterator = iter(iterable)
print(iterator.__next__())
print(iterator.__next__())
So:
An iterable is an object that can be looped over, e.g. a list, string, tuple, etc.
Using the iter function on our iterable object returns an iterator object.
This iterator object has a method named __next__ (in Python 3, or just next in Python 2) by which you can access each element of the iterable.
The output of the above code will be:
1
2
An iterable is an object that has an __iter__() method which returns an iterator. It is something that can be looped over.
Example: a list is iterable because we can loop over it, BUT it is not an iterator.
An iterator is an object that you get from an iterable by calling iter(). It is an object with state, so that it remembers where it is during iteration.
To see whether an object has the __iter__() method, we can use dir():
ls = ['hello','bye']
print(dir(ls))
Output
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
As you can see, the list has the __iter__() method, which means it is an iterable object, but it doesn't have the __next__() method, which is a feature of the iterator object.
Whenever you use a for loop, map, or a list comprehension in Python, the __next__ method is called automatically to get each item from the iterator.
Before dealing with iterables and iterators, the key concept behind both is the sequence.
Sequence: a sequence is an ordered collection of data.
Iterable: an iterable is an object, such as a sequence, that supports the __iter__ method.
iter(): the iter() function takes an iterable (such as a sequence) as input and creates an object known as an iterator.
Iterator: an iterator is the object on which next is called to traverse the sequence. Each call to next returns the element it has currently reached.
example:
x=[1,2,3,4]
x is a sequence which consists of collection of data
y=iter(x)
Calling iter(x) returns an iterator only when the object x has an __iter__ method; otherwise it raises an exception. If it returns an iterator, then y is an iterator over the elements 1, 2, 3, 4 (y is not itself a list).
Since y is an iterator, it can be passed to the built-in next() function.
Each call to next returns one element of the list, in order.
After the last element of the sequence has been returned, calling next again raises a StopIteration exception.
example:
>>> next(y)
1
>>> next(y)
2
>>> next(y)
3
>>> next(y)
4
>>> next(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Other people already explained comprehensively, what is iterable and iterator, so I will try to do the same thing with generators.
IMHO the main problem for understanding generators is a confusing use of the word “generator”, because this word is used in 2 different meanings:
as a tool for creating (generating) iterators,
    in the form of a function returning an iterator (i.e. with the yield statement(s) in its body),
    in the form of a generator expression
as a result of the use of that tool, i.e. the resulting iterator.
    (In this meaning a generator is a special form of an iterator; the word “generator” points out how this iterator was created.)
Generator as a tool of the 1st type:
In[2]: def my_generator():
  ...:     yield 100
  ...:     yield 200
In[3]: my_generator
Out[3]: <function __main__.my_generator()>
In[4]: type(my_generator)
Out[4]: function
Generator as a result (i.e. an iterator) of the use of this tool:
In[5]: my_iterator = my_generator()
In[6]: my_iterator
Out[6]: <generator object my_generator at 0x00000000053EAE48>
In[7]: type(my_iterator)
Out[7]: generator
Generator as a tool of the 2nd type — indistinguishable from the resulting iterator of this tool:
In[8]: my_gen_expression = (2 * i for i in (10, 20))
In[9]: my_gen_expression
Out[9]: <generator object <genexpr> at 0x000000000542C048>
In[10]: type(my_gen_expression)
Out[10]: generator
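The two meanings can also be told apart programmatically. This sketch uses inspect.isgeneratorfunction and types.GeneratorType from the standard library; the dictionary keys are just descriptive labels.

```python
import inspect
import types

def my_generator():
    yield 100
    yield 200

gen_object = my_generator()
gen_expression = (2 * i for i in (10, 20))

checks = {
    # the "tool": a function whose body contains yield
    'function is a generator function': inspect.isgeneratorfunction(my_generator),
    'function is a generator object': isinstance(my_generator, types.GeneratorType),
    # the "result": the iterator produced by the tool
    'call result is a generator object': isinstance(gen_object, types.GeneratorType),
    'expression is a generator object': isinstance(gen_expression, types.GeneratorType),
}
values = list(gen_object) + list(gen_expression)
```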
Here's another view using collections.abc. This view may be useful the second time around or later.
From collections.abc we can see the following hierarchy:
builtins.object
    Iterable
        Iterator
            Generator
i.e. Generator is derived from Iterator, which is derived from Iterable, which is derived from the base object.
Hence,
Every iterator is an iterable, but not every iterable is an iterator. For example, [1, 2, 3] and range(10) are iterables, but not iterators. x = iter([1, 2, 3]) is an iterator and an iterable.
A similar relationship exists between Iterator and Generator.
Calling iter() on an iterator or a generator returns itself. Thus, if it is an iterator, then iter(it) is it is True.
Under the hood, a list comprehension like [2 * x for x in nums] or a for loop like for x in nums:, acts as though iter() is called on the iterable (nums) and then iterates over nums using that iterator. Hence, all of the following are functionally equivalent (with, say, nums=[1, 2, 3]):
for x in nums:
for x in iter(nums):
for x in iter(iter(nums)):
for x in iter(iter(iter(iter(iter(nums))))):
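A quick way to convince yourself of these claims, as a small sketch using only built-ins:

```python
nums = [1, 2, 3]

it = iter(nums)
same_object = iter(it) is it                     # iterators return themselves
fresh_each_time = iter(nums) is not iter(nums)   # a list hands out new iterators

# Wrapping in extra iter() calls does not change what the loop sees.
doubled_plain = [2 * x for x in nums]
doubled_nested = [2 * x for x in iter(iter(iter(nums)))]
```

The extra iter() calls are harmless because each one after the first just returns the iterator it was given.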
For me, Python's glossary was most helpful for these questions, e.g. for iterable it says:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements Sequence semantics.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.

Iterable class in python3

I am trying to implement an iterable proxy for a web resource (lazily fetched images).
Firstly, I did (returning ids, in production those will be image buffers)
def iter(ids=[1,2,3]):
    for id in ids:
        yield id
and that worked nicely, but now I need to keep state.
I read the four ways to define iterators. I judged that the iterator protocol is the way to go. Follow my attempt and failure to implement that.
class Test:
    def __init__(self, ids):
        self.ids = ids
    def __iter__(self):
        return self
    def __next__(self):
        for id in self.ids:
            yield id
        raise StopIteration
test = Test([1,2,3])
for t in test:
    print('new value', t)
Output:
new value <generator object Test.__next__ at 0x7f9c46ed1750>
new value <generator object Test.__next__ at 0x7f9c46ed1660>
new value <generator object Test.__next__ at 0x7f9c46ed1750>
new value <generator object Test.__next__ at 0x7f9c46ed1660>
new value <generator object Test.__next__ at 0x7f9c46ed1750>
forever.
What's wrong?
Thanks to absolutely everyone! It's all new to me, but I'm learning new cool stuff.
Your __next__ method uses yield, which makes it a generator function. Generator functions return a new iterator when called.
But the __next__ method is part of the iterator interface. It should not itself be an iterator. __next__ should return the next value, not something that returns all values(*).
Because you wanted to create an iterable, you can just make __iter__ the generator here:
class Test:
    def __init__(self, ids):
        self.ids = ids
    def __iter__(self):
        for id in self.ids:
            yield id
Note that a generator function should not use raise StopIteration; just returning from the function does that for you.
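To illustrate the point about StopIteration, here is a small sketch (the function names first_n and bad_first_n are invented for the example). It assumes Python 3.7 or later, where PEP 479 is always in effect and a StopIteration raised inside a generator body is turned into a RuntimeError instead of silently ending the iteration.

```python
def first_n(ids, n):
    """Generator that stops early with a plain return."""
    for count, value in enumerate(ids):
        if count >= n:
            return          # ends the generator; the caller sees StopIteration
        yield value


def bad_first_n(ids, n):
    """Same idea, but wrongly raising StopIteration inside the generator."""
    for count, value in enumerate(ids):
        if count >= n:
            raise StopIteration
        yield value


print(list(first_n([1, 2, 3], 2)))

error_type = None
try:
    list(bad_first_n([1, 2, 3], 2))
except RuntimeError as exc:
    # On Python 3.7+ the StopIteration is converted to RuntimeError.
    error_type = type(exc).__name__
    print(error_type, exc)
```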
The above class is an iterable. Iterables only have an __iter__ method, and no __next__ method. Iterables produce an iterator when __iter__ is called:
Iterable -> (call __iter__) -> Iterator
In the above example, because Test.__iter__ is a generator function, it creates a new object each time we call it:
>>> test = Test([1,2,3])
>>> test.__iter__() # create an iterator
<generator object Test.__iter__ at 0x111e85660>
>>> test.__iter__()
<generator object Test.__iter__ at 0x111e85740>
A generator object is a specific kind of iterator, one created by calling a generator function, or by using a generator expression. Note that the hex values in the representations differ: two different objects were created for the two calls. This is by design! Iterables produce iterators, and can create more at will. This lets you loop over them independently:
>>> test_it1 = test.__iter__()
>>> test_it1.__next__()
1
>>> test_it2 = test.__iter__()
>>> test_it2.__next__()
1
>>> test_it1.__next__()
2
Note that I called __next__() on the object returned by test.__iter__(), the iterator, not on test itself, which doesn't have that method because it is only an iterable, not an iterator.
Iterators also have an __iter__ method, which always must return self, because they are their own iterators. It is the __next__ method that makes them an iterator, and the job of __next__ is to be called repeatedly, until it raises StopIteration. Until StopIteration is raised, each call should return the next value. Once an iterator is done (has raised StopIteration), it is meant to then always raise StopIteration. Iterators can only be used once, unless they are infinite (never raise StopIteration and just keep producing values each time __next__ is called).
So this is an iterator:
class IteratorTest:
    def __init__(self, ids):
        self.ids = ids
        self.nextpos = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.ids is None or self.nextpos >= len(self.ids):
            # we are done
            self.ids = None
            raise StopIteration
        value = self.ids[self.nextpos]
        self.nextpos += 1
        return value
This has to do a bit more work; it has to keep track of what the next value to produce would be, and whether we have raised StopIteration yet. Other answerers here have used what appear to be simpler ways, but those actually involve letting something else do all the hard work. When you use iter(self.ids) or (i for i in ids) you are creating a different iterator to delegate __next__ calls to. That's cheating a bit, hiding the state of the iterator inside ready-made standard library objects.
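The practical difference between the two designs shows up when you loop twice. The sketch below re-declares trimmed-down versions of both classes under different names (IterableIds and IteratorIds, chosen only for this example) so that it runs on its own.

```python
class IterableIds:
    """Iterable: __iter__ is a generator function, so each call gives a fresh iterator."""
    def __init__(self, ids):
        self.ids = ids

    def __iter__(self):
        for id_ in self.ids:
            yield id_


class IteratorIds:
    """Iterator: keeps its own position and returns itself from __iter__."""
    def __init__(self, ids):
        self.ids = ids
        self.nextpos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.nextpos >= len(self.ids):
            raise StopIteration
        value = self.ids[self.nextpos]
        self.nextpos += 1
        return value


iterable = IterableIds([1, 2, 3])
iterator = IteratorIds([1, 2, 3])

first_pass = (list(iterable), list(iterator))
second_pass = (list(iterable), list(iterator))

print(first_pass)    # both produce all three values
print(second_pass)   # only the iterable can be looped over again
```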
You don't usually see anything calling __iter__ or __next__ in Python code, because those two methods are just the hooks that you can implement in your Python classes; if you were to implement an iterator in the C API then the hook names are slightly different. Instead, you either use the iter() and next() functions, or just use the object in syntax or a function call that accepts an iterable.
The for loop is such syntax. When you use a for loop, Python uses the (moral equivalent) of calling __iter__() on the object, then __next__() on the resulting iterator object to get each value. You can see this if you disassemble the Python bytecode:
>>> from dis import dis
>>> dis("for t in test: pass")
  1           0 LOAD_NAME                0 (test)
              2 GET_ITER
        >>    4 FOR_ITER                 4 (to 10)
              6 STORE_NAME               1 (t)
              8 JUMP_ABSOLUTE            4
        >>   10 LOAD_CONST               0 (None)
             12 RETURN_VALUE
The GET_ITER opcode at position 2 calls test.__iter__(), and FOR_ITER uses __next__ on the resulting iterator to keep looping (executing STORE_NAME to set t to the next value, then jumping back to position 4), until StopIteration is raised. Once that happens, it'll jump to position 10 to end the loop.
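Written out by hand, the loop behaves roughly like the sketch below. The function name manual_for is made up; it only mimics what GET_ITER and FOR_ITER do, using the iter() and next() built-ins rather than calling the special methods directly.

```python
def manual_for(iterable, body):
    """Rough equivalent of: for item in iterable: body(item)"""
    iterator = iter(iterable)          # what GET_ITER does
    while True:
        try:
            item = next(iterator)      # what FOR_ITER does on each round
        except StopIteration:
            break                      # the loop ends when the iterator is exhausted
        body(item)


seen = []
manual_for((42, 81, 17), seen.append)
print(seen)
```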
If you want to play more with the difference between iterators and iterables, take a look at the Python standard types and see what happens when you use iter() and next() on them. Like lists or tuples:
>>> foo = (42, 81, 17, 111)
>>> next(foo) # foo is a tuple, not an iterator
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not an iterator
>>> t_it = iter(foo) # so use iter() to create one from the tuple
>>> t_it # here is an iterator object for our foo tuple
<tuple_iterator object at 0x111e9af70>
>>> iter(t_it) # it returns itself
<tuple_iterator object at 0x111e9af70>
>>> iter(t_it) is t_it # really, it returns itself, not a new object
True
>>> next(t_it) # we can get values from it, one by one
42
>>> next(t_it) # another one
81
>>> next(t_it) # yet another one
17
>>> next(t_it) # this is getting boring..
111
>>> next(t_it) # and now we are done
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> next(t_it) # and *stay* done
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> foo # but foo itself is still there
(42, 81, 17, 111)
You could make Test, the iterable, return a custom iterator class instance too (and not cop out by having a generator function create the iterator for us):
class Test:
    def __init__(self, ids):
        self.ids = ids
    def __iter__(self):
        return TestIterator(self)
class TestIterator:
    def __init__(self, test):
        self.test = test
        self.nextpos = 0  # index of the next value to produce
    def __iter__(self):
        return self
    def __next__(self):
        if self.test is None or self.nextpos >= len(self.test.ids):
            # we are done
            self.test = None
            raise StopIteration
        value = self.test.ids[self.nextpos]
        self.nextpos += 1
        return value
That's a lot like the original IteratorTest class above, but TestIterator keeps a reference to the Test instance. That's really how tuple_iterator works too.
A brief, final note on naming conventions here: I am sticking with using self for the first argument to methods, so the bound instance. Using different names for that argument only serves to make it harder to talk about your code with other, experienced Python developers. Don't use me, however cute or short it may seem.
(*) Unless your goal was to create an iterator of iterators, of course (which is basically what the itertools.groupby() iterator does, it is an iterator producing (object, group_iterator) tuples, but I digress).
It is unclear to me exactly what you are trying to achieve, but if you really want to use your instance attributes like this, you can convert the input to an iterator and then delegate to it. But, as I said, this feels odd and I don't think you'd actually want a setup like this.
class Test:
    def __init__(self, ids):
        self.ids = iter(ids)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.ids)
test = Test([1,2,3])
for t in test:
    print('new value', t)
The simplest solution is to use __iter__ and return an iterator to the main list:
class Test:
    def __init__(self, ids):
        self.ids = ids
    def __iter__(self):
        return iter(self.ids)
test = Test([1,2,3])
for t in test:
    print('new value', t)
As for the update: for lazy loading, __iter__ can return a generator expression, which is itself an iterator:
def __iter__(self):
    return iter(load_file(id) for id in self.ids)
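To see that nothing is fetched until the loop asks for it, here is a self-contained sketch. The class name LazyImages and the function load_image are hypothetical stand-ins (the real fetch from the web resource is not shown in the question); the loaded list only exists to record which ids were actually requested.

```python
loaded = []  # records which ids have actually been fetched


def load_image(image_id):
    """Hypothetical stand-in for an expensive fetch of one image."""
    loaded.append(image_id)
    return f"image-{image_id}"


class LazyImages:
    def __init__(self, ids):
        self.ids = ids

    def __iter__(self):
        # A generator method: each image is loaded only when it is requested.
        for image_id in self.ids:
            yield load_image(image_id)


images = LazyImages([1, 2, 3])
it = iter(images)      # nothing has been loaded yet
first = next(it)       # loads only the first image
print(first, loaded)
```

Because __iter__ creates a new generator on every call, the same LazyImages object can be looped over again later, at the cost of fetching again.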
The __next__ function is supposed to return the next value provided by an iterator. Since you have used yield in your implementation, the function returns a generator, which is what you get.
You need to make clear whether you want Test to be an iterable or an iterator. If it is an iterable, it will have the ability to provide an iterator with __iter__. If it is an iterator, it will have the ability to provide new elements with __next__. Iterators can typically work as iterables by returning themselves in __iter__. Martijn's answer shows what you probably want. However, if you want an example of how you could specifically implement __next__ (by making Test explicitly an iterator), it could be something like this:
class Test:
    def __init__(self, ids):
        self.ids = ids
        self.idx = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.idx >= len(self.ids):
            raise StopIteration
        else:
            self.idx += 1
            return self.ids[self.idx - 1]
test = Test([1,2,3])
for t in test:
    print('new value', t)

Python Iterators: Use of __iter__ method in a class [duplicate]

What are "iterable", "iterator", and "iteration" in Python? How are they defined?
Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.
In Python, iterable and iterator have specific meanings.
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.
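The __getitem__ route mentioned above is easy to overlook, so here is a minimal sketch of a class that is iterable without defining __iter__ at all. The class name SquaresByIndex is invented for the example. One caveat worth knowing: isinstance(obj, collections.abc.Iterable) does not detect this kind of iterable; calling iter(obj) is the reliable test.

```python
from collections.abc import Iterable


class SquaresByIndex:
    """Iterable only through __getitem__ with indexes 0, 1, 2, ..."""
    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if index < 0 or index >= self.count:
            raise IndexError(index)   # tells iteration where to stop
        return index * index


squares = SquaresByIndex(4)
values = list(squares)                # iter() falls back to __getitem__
print(values)
print(isinstance(squares, Iterable))  # the ABC check misses this case
```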
Here's the explanation I use in teaching Python classes:
An ITERABLE is:
anything that can be looped over (i.e. you can loop over a string or file) or
anything that can appear on the right-side of a for-loop: for x in iterable: ... or
anything you can call with iter() that will return an ITERATOR: iter(obj) or
an object that defines __iter__ that returns a fresh ITERATOR,
or it may have a __getitem__ method suitable for indexed lookup.
An ITERATOR is an object:
with state that remembers where it is during iteration,
with a __next__ method that:
returns the next value in the iteration
updates the state to point at the next value
signals when it is done by raising StopIteration
and that is self-iterable (meaning that it has an __iter__ method that returns self).
Notes:
The __next__ method in Python 3 is spelt next in Python 2, and
The builtin function next() calls that method on the object passed to it.
For example:
>>> s = 'cat' # s is an ITERABLE
# s is a str object that is immutable
# s has no state
# s has a __getitem__() method
>>> t = iter(s) # t is an ITERATOR
# t has state (it starts by pointing at the "c")
# t has a __next__() method and an __iter__() method
>>> next(t) # the next() function returns the next value and advances the state
'c'
>>> next(t) # the next() function returns the next value and advances
'a'
>>> next(t) # the next() function returns the next value and advances
't'
>>> next(t) # next() raises StopIteration to signal that iteration is complete
Traceback (most recent call last):
...
StopIteration
>>> iter(t) is t   # the iterator is self-iterable
True
The above answers are great, but, like most of what I've seen, they don't stress the distinction enough for people like me.
Also, people tend to get "too Pythonic" by putting definitions like "X is an object that has __foo__() method" first. Such definitions are correct (they are based on the duck-typing philosophy), but the focus on methods tends to get in the way when trying to understand the concept in its simplicity.
So I add my version.
In natural language,
iteration is the process of taking one element at a time in a row of elements.
In Python,
iterable is an object that is, well, iterable, which simply put, means that
it can be used in iteration, e.g. with a for loop. How? By using iterator.
I'll explain below.
... while iterator is an object that defines how to actually do the
iteration--specifically what is the next element. That's why it must have
next() method.
Iterators are themselves also iterable, with the distinction that their __iter__() method returns the same object (self), regardless of whether or not its items have been consumed by previous calls to next().
So what does Python interpreter think when it sees for x in obj: statement?
Look, a for loop. Looks like a job for an iterator... Let's get one. ...
There's this obj guy, so let's ask him.
"Mr. obj, do you have your iterator?" (... calls iter(obj), which calls
obj.__iter__(), which happily hands out a shiny new iterator _i.)
OK, that was easy... Let's start iterating then. (x = _i.next() ... x = _i.next()...)
Since Mr. obj succeeded in this test (by having certain method returning a valid iterator), we reward him with adjective: you can now call him "iterable Mr. obj".
However, in simple cases, you don't normally benefit from having iterator and iterable separately. So you define only one object, which is also its own iterator. (Python does not really care that _i handed out by obj wasn't all that shiny, but just the obj itself.)
This is why in most examples I've seen (and what had been confusing me over and over),
you can see:
class IterableExample(object):
    def __iter__(self):
        return self
    def next(self):
        pass
instead of
class Iterator(object):
    def next(self):
        pass
class Iterable(object):
    def __iter__(self):
        return Iterator()
There are cases, though, when you can benefit from having the iterator separated from the iterable, such as when you want to have one row of items, but more "cursors". For example, when you want to work with "current" and "forthcoming" elements, you can have separate iterators for both. Or multiple threads pulling from a huge list: each can have its own iterator to traverse over all items. See @Raymond's and @glglgl's answers above.
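The "current" and "forthcoming" idea can be sketched with two independent iterators over one list. The variable names are made up for the illustration.

```python
items = ['a', 'b', 'c', 'd']

current = iter(items)      # first cursor, starts at 'a'
upcoming = iter(items)     # second cursor over the same list
next(upcoming, None)       # move the second cursor one step ahead

# zip() stops as soon as the cursor that is ahead runs out.
pairs = list(zip(current, upcoming))
print(pairs)
```

This only works because the list hands out a new iterator on each iter() call; with an object that is its own iterator, both names would refer to the same cursor.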
Imagine what you could do:
class SmartIterableExample(object):
    def create_iterator(self):
        # An amazingly powerful yet simple way to create arbitrary
        # iterator, utilizing object state (or not, if you are fan
        # of functional), magic and nuclear waste--no kittens hurt.
        pass # don't forget to add the next() method
    def __iter__(self):
        return self.create_iterator()
Notes:
I'll repeat again: an object that only has next() and no __iter__() is not
iterable, and cannot be used as a "source" in a for loop. What the for loop
primarily needs is __iter__() (that returns something with next()).
Of course, for is not the only iteration loop, so above applies to some other
constructs as well (while...).
Iterator's next() can throw StopIteration to stop iteration. Does not have to,
though, it can iterate forever or use other means.
In the above "thought process", _i does not really exist. I've made up that name.
There's a small change in Python 3.x: next() method (not the built-in) now
must be called __next__(). Yes, it should have been like that all along.
You can also think of it like this: iterable has the data, iterator pulls the next
item
Disclaimer: I'm not a developer of any Python interpreter, so I don't really know what the interpreter "thinks". The musings above are solely demonstration of how I understand the topic from other explanations, experiments and real-life experience of a Python newbie.
An iterable is an object which has an __iter__() method. It can possibly be iterated over several times, as with list()s and tuple()s.
An iterator is the object which iterates. It is returned by an __iter__() method, returns itself via its own __iter__() method and has a next() method (__next__() in 3.x).
Iteration is the process of calling this next() resp. __next__() until it raises StopIteration.
Example:
>>> a = [1, 2, 3] # iterable
>>> b1 = iter(a) # iterator 1
>>> b2 = iter(a) # iterator 2, independent of b1
>>> next(b1)
1
>>> next(b1)
2
>>> next(b2) # start over, as it is the first call to b2
1
>>> next(b1)
3
>>> next(b1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> b1 = iter(a) # new one, start over
>>> next(b1)
1
Here's my cheat sheet:
 sequence
  +
  |
  v
   def __getitem__(self, index: int):
  +    ...
  |    raise IndexError
  |
  |
  |              def __iter__(self):
  |             +     ...
  |             |     return <iterator>
  |             |
  |             |
  +--> or <-----+        def __next__(self):
       +        |       +    ...
       |        |       |    raise StopIteration
       v        |       |
    iterable    |       |
           +    |       |
           |    |       v
           |    +----> and +-------> iterator
           |                               ^
           v                               |
   iter(<iterable>) +----------------------+
                                           |
   def generator():                        |
  +    yield 1                             |
  |                 generator_expression +-+
  |                                        |
  +-> generator() +-> generator_iterator +-+
Quiz: Do you see how...
every iterator is an iterable?
a container object's __iter__() method can be implemented as a generator?
an iterable that has a __next__ method is not necessarily an iterator?
Answers:
Every iterator must have an __iter__ method. Having __iter__ is enough to be an iterable. Therefore every iterator is an iterable.
When __iter__ is called it should return an iterator (return <iterator> in the diagram above). Calling a generator returns a generator iterator which is a type of iterator.
class Iterable1:
    def __iter__(self):
        # a method (which is a function defined inside a class body)
        # calling iter() converts iterable (tuple) to iterator
        return iter((1,2,3))
class Iterable2:
    def __iter__(self):
        # a generator
        for i in (1, 2, 3):
            yield i
class Iterable3:
    def __iter__(self):
        # with PEP 380 syntax
        yield from (1, 2, 3)
# passes
assert list(Iterable1()) == list(Iterable2()) == list(Iterable3()) == [1, 2, 3]
Here is an example:
class MyIterable:
    def __init__(self):
        self.n = 0
    def __getitem__(self, index: int):
        return (1, 2, 3)[index]
    def __next__(self):
        n = self.n = self.n + 1
        if n > 3:
            raise StopIteration
        return n
# if you can iter it without raising a TypeError, then it's an iterable.
iter(MyIterable())
# but obviously `MyIterable()` is not an iterator since it does not have
# an `__iter__` method.
from collections.abc import Iterator
assert isinstance(MyIterable(), Iterator) # AssertionError
I don’t know if it helps anybody, but I always like to visualize concepts in my head to understand them better. Since I have a little son, I visualize the iterable/iterator concept with bricks and white paper.
Suppose we are in a dark room, and on the floor are my son’s bricks. Their size and color do not matter for now. Suppose we have 5 such bricks. Those 5 bricks can be described as an object, let’s say a bricks kit. We can do many things with this bricks kit: take one brick, then a second, then a third; swap bricks around; put the first brick on top of the second. Therefore this bricks kit is an iterable object or sequence, as we can go through each brick and do something with it. We can only do it like my little son does: play with one brick at a time. So again, I picture this bricks kit as an iterable.
Now remember that we are in the dark room. Or almost dark. The thing is that we don’t clearly see those bricks: what color they are, what shape, etc. So even if we want to do something with them (that is, iterate through them), we don’t really know what and how, because it is too dark.
What we can do is put a piece of white fluorescent paper next to the first brick (an element of the bricks kit) so that we can see where the first brick-element is. Each time we take a brick from the kit, we move the white piece of paper to the next brick so that we can still see it in the dark room. This white piece of paper is nothing more than an iterator. It is an object as well, but an object with which we can work and play with the elements of our iterable object, the bricks kit.
That by the way explains my early mistake when I tried the following in an IDLE and got a TypeError:
>>> X = [1,2,3,4,5]
>>> next(X)
Traceback (most recent call last):
File "<pyshell#19>", line 1, in <module>
next(X)
TypeError: 'list' object is not an iterator
List X here was our bricks kit but NOT a white piece of paper. I needed to find an iterator first:
>>> X = [1,2,3,4,5]
>>> bricks_kit = [1,2,3,4,5]
>>> white_piece_of_paper = iter(bricks_kit)
>>> next(white_piece_of_paper)
1
>>> next(white_piece_of_paper)
2
>>>
Don’t know if it helps, but it helped me. If someone could confirm/correct visualization of the concept, I would be grateful. It would help me to learn more.
I don't think that you can get it much simpler than the documentation, however I'll try:
An iterable is something that can be iterated over. In practice it usually means a sequence, i.e. something that has a beginning and an end and some way to go through all the items in it.
You can think of an iterator as a helper pseudo-method (or pseudo-attribute) that gives (or holds) the next (or first) item in the iterable. (In practice it is just an object that defines the method next().)
Iteration is probably best explained by the Merriam-Webster definition of the word :
b : the repetition of a sequence of computer instructions a specified
number of times or until a condition is met — compare recursion
Iterables have a __iter__ method that instantiates a new iterator every time.
Iterators implement a __next__ method that returns individual items, and a __iter__ method that returns self .
Therefore, iterators are also iterable, but iterables are not iterators.
Luciano Ramalho, Fluent Python.
Iterable:- an object that can be iterated over, such as a sequence (list, string, etc.).
It has either the __getitem__ method or an __iter__ method. If we use the iter() function on that object, we get an iterator.
Iterator:- Once we have the iterator object from the iter() function, we call its __next__() method (in Python 3) or simply next() (in Python 2) to get elements one by one. An object that provides this method is called an iterator.
From docs:-
The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method __next__() which accesses elements in the container one at a time. When there are no more elements, __next__() raises a StopIteration exception which tells the for loop to terminate. You can call the __next__() method using the next() built-in function; this example shows how it all works:
>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
next(it)
StopIteration
Ex of a class:-
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
... print(char)
...
m
a
p
s
Iterators are objects that implement the __iter__ and __next__ methods. If those methods are defined, we can use the object in a for loop or in comprehensions.
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0
    def __iter__(self):
        print('calling __iter__') # this will be called first and only once
        return self
    def __next__(self):
        print('calling __next__') # this will be called for each iteration
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result
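To make the call order of the Squares class above visible without reading printed output, the sketch below uses the same class but appends to a list (calls, a name invented here) instead of printing. It also runs a second pass over the same object to show what happens once the iterator is used up.

```python
calls = []  # records the order in which the special methods are invoked


class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0

    def __iter__(self):
        calls.append('__iter__')
        return self

    def __next__(self):
        calls.append('__next__')
        if self.i >= self.length:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result


sq = Squares(3)
values = [x for x in sq]          # one __iter__ call, then __next__ until StopIteration
calls_first_pass = list(calls)

second_pass = [x for x in sq]     # same object again: it is already exhausted
print(values, second_pass)
print(calls_first_pass)
```

Note that __next__ is called once more than there are values: the last call is the one that raises StopIteration.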
Iterators get exhausted. This means that after you have iterated over the items, you cannot iterate again; you have to create a new object. Let's say you have a class that holds a list of cities and you want to iterate over it.
class Cities:
    def __init__(self):
        self._cities = ['Brooklyn', 'Manhattan', 'Prag', 'Madrid', 'London']
        self._index = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        else:
            item = self._cities[self._index]
            self._index += 1
            return item
Instance of class Cities is an iterator. However if you want to reiterate over cities, you have to create a new object which is an expensive operation. You can separate the class into 2 classes: one returns cities and second returns an iterator which gets the cities as init param.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'London']
    def __len__(self):
        return len(self._cities)
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item
Now if we need to create a new iterator, we do not have to create the data (the cities) again. We create the cities object and pass it to the iterator. But we are still doing extra work. We could implement this by creating only one class.
Iterable is a Python object that implements the iterable protocol. It requires only __iter__() that returns a new instance of iterator object.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'Paris']
    def __len__(self):
        return len(self._cities)
    def __iter__(self):
        return self.CityIterator(self)
    class CityIterator:
        def __init__(self, city_obj):
            self._city_obj = city_obj
            self._index = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item
Iterators have __iter__ and __next__, while iterables have __iter__, so we can say iterators are also iterables, but they are iterables that get exhausted. Iterables, on the other hand, never become exhausted, because they always return a new iterator that is then used to iterate.
You notice that the main part of the iterable code is in the iterator, and the iterable itself is nothing more than an extra layer that allows us to create and access the iterator.
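As an alternative to writing the nested iterator class by hand, the same re-iterable behaviour can be sketched with a generator method. This is a different technique from the one in this answer (it lets Python build the iterator object), shown only for comparison.

```python
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'Paris']

    def __len__(self):
        return len(self._cities)

    def __iter__(self):
        # Each call creates a new generator, i.e. a fresh iterator.
        yield from self._cities


cities = Cities()
first_pass = list(cities)
second_pass = list(cities)   # works again: the iterable is not exhausted
print(first_pass == second_pass)
```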
Iterating over an iterable
Python has a built-in function iter() which calls __iter__(). When we iterate over an iterable, Python calls iter(), which returns an iterator, and then uses the iterator's __next__() to iterate over the data.
Note that in the above example, Cities creates an iterable but it is not a sequence type, which means we cannot get a city by index. To fix this we just add __getitem__ to the Cities class.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Budapest', 'Newcastle']
    def __len__(self):
        return len(self._cities)
    def __getitem__(self, s): # now a sequence type
        return self._cities[s]
    def __iter__(self):
        return self.CityIterator(self)
    class CityIterator:
        def __init__(self, city_obj):
            self._city_obj = city_obj
            self._index = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item
iterable = [1, 2]
iterator = iter(iterable)
print(iterator.__next__())
print(iterator.__next__())
so,
iterable is an object that can be looped over. e.g. list , string , tuple etc.
using the iter function on our iterable object will return an iterator object.
now this iterator object has method named __next__ (in Python 3, or just next in Python 2) by which you can access each element of iterable.
so,
OUTPUT OF ABOVE CODE WILL BE:
1
2
An iterable is an object that has an __iter__() method which returns an iterator. It is something that can be looped over.
Example: a list is iterable because we can loop over it, BUT it is not an iterator.
An iterator is an object that returns the next item each time __next__() is called on it. It is an object with state, so that it remembers where it is during iteration.
To see whether an object has the __iter__() method, we can use the dir() function as below.
ls = ['hello','bye']
print(dir(ls))
Output
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
As you can see, the list has the __iter__() method, which means it is an iterable object, but it does not have the __next__() method, which is a feature of iterator objects.
Whenever you use a for loop, map, or a list comprehension in Python, the __next__ method is called automatically to get each item from the iterator.
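Rather than reading through the whole dir() listing, the same check can be sketched with hasattr(). The variable names are just for illustration.

```python
ls = ['hello', 'bye']
it = iter(ls)   # the iterator that a for loop would obtain from the list

list_has_iter = hasattr(ls, '__iter__')
list_has_next = hasattr(ls, '__next__')
iterator_has_iter = hasattr(it, '__iter__')
iterator_has_next = hasattr(it, '__next__')

print('list:    ', list_has_iter, list_has_next)
print('iterator:', iterator_has_iter, iterator_has_next)
```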
Before dealing with iterables and iterators, it helps to start from the notion of a sequence.
Sequence: a sequence is an ordered collection of data.
Iterable: an iterable is an object, such as a sequence, that supports the __iter__ method.
iter(): the iter() function takes an iterable as input and creates an object known as an iterator.
Iterator: an iterator is the object on which next is called to traverse the sequence. Each call to next returns the element currently being traversed.
example:
x=[1,2,3,4]
x is a sequence which consists of collection of data
y=iter(x)
Calling iter(x) returns an iterator only when x has an __iter__ method (or supports indexing through __getitem__); otherwise it raises a TypeError. If it returns an iterator, then y is an iterator over the elements of x:
y -> 1, 2, 3, 4
As y is an iterator, it supports the next() function.
Each call to next() returns the next element of the list, one by one.
After the last element of the sequence has been returned, calling next() again raises a StopIteration exception.
example:
>>> next(y)
1
>>> next(y)
2
>>> next(y)
3
>>> next(y)
4
>>> next(y)
StopIteration
Other people already explained comprehensively, what is iterable and iterator, so I will try to do the same thing with generators.
IMHO the main problem for understanding generators is a confusing use of the word “generator”, because this word is used in 2 different meanings:
as a tool for creating (generating) iterators,
in the form of a function returning an iterator (i.e. with the yield statement(s) in its body),
in the form of a generator expression
as a result of the use of that tool, i.e. the resulting iterator.
(In this meaning a generator is a special form of an iterator — the word “generator” points out how this iterator was created.)
Generator as a tool of the 1st type:
In[2]: def my_generator():
  ...:     yield 100
  ...:     yield 200
In[3]: my_generator
Out[3]: <function __main__.my_generator()>
In[4]: type(my_generator)
Out[4]: function
Generator as a result (i.e. an iterator) of the use of this tool:
In[5]: my_iterator = my_generator()
In[6]: my_iterator
Out[6]: <generator object my_generator at 0x00000000053EAE48>
In[7]: type(my_iterator)
Out[7]: generator
Generator as a tool of the 2nd type — indistinguishable from the resulting iterator of this tool:
In[8]: my_gen_expression = (2 * i for i in (10, 20))
In[9]: my_gen_expression
Out[9]: <generator object <genexpr> at 0x000000000542C048>
In[10]: type(my_gen_expression)
Out[10]: generator
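The two meanings can also be told apart programmatically. The sketch below uses the standard inspect module: isgeneratorfunction() recognises the tool (a function containing yield), while isgenerator() recognises the result (the generator iterator).

```python
import inspect


def my_generator():
    yield 100
    yield 200


my_iterator = my_generator()
my_gen_expression = (2 * i for i in (10, 20))

# The function is the "tool"; calling it gives the "result".
print(inspect.isgeneratorfunction(my_generator))   # tool
print(inspect.isgenerator(my_generator))           # the function itself is not an iterator
print(inspect.isgenerator(my_iterator))            # result of calling the tool
print(inspect.isgenerator(my_gen_expression))      # a generator expression is already a result

values_from_function = list(my_iterator)
values_from_expression = list(my_gen_expression)
```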
Here's another view using collections.abc. This view may be useful the second time around or later.
From collections.abc we can see the following hierarchy:
builtins.object
    Iterable
        Iterator
            Generator
That is, Generator derives from Iterator, which derives from Iterable, which derives from the base object.
Hence,
Every iterator is an iterable, but not every iterable is an iterator. For example, [1, 2, 3] and range(10) are iterables, but not iterators. x = iter([1, 2, 3]) is an iterator and an iterable.
A similar relationship exists between Iterator and Generator.
Calling iter() on an iterator or a generator returns itself. Thus, if it is an iterator, then iter(it) is it is True.
Under the hood, a list comprehension like [2 * x for x in nums] or a for loop like for x in nums:, acts as though iter() is called on the iterable (nums) and then iterates over nums using that iterator. Hence, all of the following are functionally equivalent (with, say, nums=[1, 2, 3]):
for x in nums:
for x in iter(nums):
for x in iter(iter(nums)):
for x in iter(iter(iter(iter(iter(nums))))):
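These relationships can be checked directly. The following is only a minimal sketch; the variable names are made up for illustration.

```python
from collections.abc import Generator, Iterable, Iterator

nums = [1, 2, 3]
list_iter = iter(nums)            # a list iterator
gen = (2 * x for x in nums)       # a generator

# The ABC hierarchy: Generator -> Iterator -> Iterable
hierarchy_ok = issubclass(Generator, Iterator) and issubclass(Iterator, Iterable)

# A list is iterable, but it is not an iterator
list_is_iterable = isinstance(nums, Iterable)
list_is_iterator = isinstance(nums, Iterator)

# Iterators and generators return themselves from iter()
iter_returns_self = iter(list_iter) is list_iter
gen_returns_self = iter(gen) is gen

# Nested iter() calls do not change what the loop sees
nested = [x for x in iter(iter(iter(nums)))]
```

A list hands out a new iterator on every iter() call, which is why it can be looped over repeatedly, while an iterator only ever hands out itself.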
For me, Python's glossary was most helpful for these questions, e.g. for iterable it says:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements Sequence semantics.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.

Python for loop and iterator behavior

I wanted to understand a bit more about iterators, so please correct me if I'm wrong.
An iterator is an object which has a pointer to the next object and is read as a buffer or stream (i.e. a linked list). They're particularly efficient because all they do is tell you what is next by reference instead of using indexing.
However I still don't understand why is the following behavior happening:
In [1]: iter = (i for i in range(5))
In [2]: for _ in iter:
....: print _
....:
0
1
2
3
4
In [3]: for _ in iter:
....: print _
....:
In [4]:
After a first loop through the iterator (In [2]) it's as if it was consumed and left empty, so the second loop (In [3]) prints nothing.
However I never assigned a new value to the iter variable.
What is really happening under the hood of the for loop?
Your suspicion is correct: the iterator has been consumed.
In actuality, your iterator is a generator, which is an object which has the ability to be iterated through only once.
type((i for i in range(5))) # says it's type generator
def another_generator():
    yield 1 # the yield expression makes this a generator function, not a regular function
type(another_generator()) # also a generator
The reason they are efficient has nothing to do with telling you what is next "by reference." They are efficient because they only generate the next item upon request; all of the items are not generated at once. In fact, you can have an infinite generator:
def my_gen():
    while True:
        yield 1 # again: yield means it is a generator, not a function
for _ in my_gen(): print(_) # hit Ctrl+C to stop this infinite loop!
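If you only want a few items from an infinite generator, itertools.islice can cut it off without looping forever. A small sketch (naturals is a hypothetical second generator, added only for illustration):

```python
from itertools import islice

def my_gen():
    while True:
        yield 1

def naturals():
    # hypothetical example: counts 0, 1, 2, ... without end
    n = 0
    while True:
        yield n
        n += 1

# islice pulls items lazily, so only five values are ever produced
first_ones = list(islice(my_gen(), 5))
first_naturals = list(islice(naturals(), 5))
```

Nothing past the fifth item is computed, which is the "generate on request" behaviour described above.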
Some other corrections to help improve your understanding:
The generator is not a pointer, and does not behave like a pointer as you might be familiar with in other languages.
One of the differences from other languages: as said above, each result of the generator is generated on the fly. The next result is not produced until it is requested.
The keyword combination for in accepts an iterable object as its second argument.
The iterable object can be a generator, as in your example case, but it can also be any other iterable object, such as a list, or dict, or a str object (string), or a user-defined type that provides the required functionality.
The iter function is applied to the object to get an iterator (by the way: don't use iter as a variable name in Python, as you have done; it shadows the built-in function of the same name). Actually, to be more precise, the object's __iter__ method is called (which is, for the most part, all the iter function does anyway; __iter__ is one of Python's so-called "magic methods").
If the call to __iter__ is successful, the function next() is applied to the resulting iterator over and over again, in a loop, and the first variable supplied to for in is assigned the result of each next() call. (Remember: the iterator could be a generator, or a container object's iterator, or any other iterator object.) Actually, to be more precise: it calls the iterator object's __next__ method, which is another "magic method".
The for loop ends when next() raises the StopIteration exception (which usually happens when the iterable does not have another object to yield when next() is called).
You can "manually" implement a for loop in python this way (probably not perfect, but close enough):
try:
    temp = iterable.__iter__()
except AttributeError:
    raise TypeError("'{}' object is not iterable".format(type(iterable).__name__))
else:
    while True:
        try:
            _ = temp.__next__()
        except StopIteration:
            break
        except AttributeError:
            raise TypeError("iter() returned non-iterator of type '{}'".format(type(temp).__name__))
        # this is the "body" of the for loop
        continue
There is pretty much no difference between the above and your example code.
Actually, the more interesting part of a for loop is not the for, but the in. Using in by itself produces a different effect than for in, but it is very useful to understand what in does with its arguments, since for in implements very similar behavior.
When used by itself, the in keyword first calls the object's __contains__ method, which is yet another "magic method" (note that this step is skipped when using for in). Using in by itself on a container, you can do things like this:
1 in [1, 2, 3] # True
'He' in 'Hello' # True
3 in range(10) # True
'eH' in 'Hello'[::-1] # True
If the iterable object is NOT a container (i.e. it doesn't have a __contains__ method), in next tries to call the object's __iter__ method. As was said previously: the __iter__ method returns what is known in Python as an iterator. Basically, an iterator is an object that you can use the built-in generic function next() on [1]. A generator is just one type of iterator.
If the call to __iter__ is successful, the in keyword applies the function next() to the resulting iterator over and over again. (Remember: the iterator could be a generator, or a container object's iterator, or any other iterator object.) Actually, to be more precise: it calls the iterator object's __next__ method.
If the object doesn't have an __iter__ method to return an iterator, in then falls back on the old-style iteration protocol using the object's __getitem__ method [2].
If all of the above attempts fail, you'll get a TypeError exception.
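The fallback order described above can be observed with a few toy classes. This is just a sketch; all four class names are invented for the demonstration.

```python
class HasContains:
    def __contains__(self, item):
        return item == "x"

class HasIterOnly:
    def __iter__(self):
        return iter([1, 2, 3])

class HasGetitemOnly:
    def __getitem__(self, index):
        return (10, 20, 30)[index]   # raises IndexError past the end

class HasNothing:
    pass

results = [
    "x" in HasContains(),      # answered by __contains__
    2 in HasIterOnly(),        # no __contains__, so __iter__ is used
    20 in HasGetitemOnly(),    # neither, so __getitem__ is tried with 0, 1, 2, ...
]

try:
    1 in HasNothing()          # none of the three methods exist
    error = None
except TypeError as exc:
    error = type(exc).__name__
```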
If you wish to create your own object type to iterate over (i.e, you can use for in, or just in, on it), it's useful to know about the yield keyword, which is used in generators (as mentioned above).
class MyIterable():
    def __iter__(self):
        yield 1
m = MyIterable()
for _ in m: print(_) # 1
1 in m # True
The presence of yield turns a function or method into a generator instead of a regular function/method. You don't need the __next__ method if you use a generator (it brings __next__ along with it automatically).
If you wish to create your own container object type (i.e, you can use in on it by itself, but NOT for in), you just need the __contains__ method.
class MyUselessContainer():
    def __contains__(self, obj):
        return True
m = MyUselessContainer()
1 in m # True
'Foo' in m # True
TypeError in m # True
None in m # True
[1] Note that, to be an iterator, an object must implement the iterator protocol. This only means that both the __next__ and __iter__ methods must be correctly implemented (generators come with this functionality "for free", so you don't need to worry about it when using them). Also note that the __next__ method is actually next (no underscores) in Python 2.
[2] See this answer for the different ways to create iterable classes.
A for loop basically calls the next method (__next__ in Python 3) of the iterator obtained from the object it is applied to.
You can simulate this simply by doing:
iter = (i for i in range(5))
print(next(iter))
print(next(iter))
print(next(iter))
print(next(iter))
print(next(iter))
# this prints 0 1 2 3 4 (one number per line)
At this point there is no next element in the input object. So doing this:
print(next(iter))
will raise a StopIteration exception, and at this point a for loop stops. An iterator can be any object that responds to the next() function and raises that exception when there are no more elements. It does not have to be a pointer or reference (there are no such things in Python in the C/C++ sense anyway), a linked list, etc.
There is an iterator protocol in python that defines how the for statement will behave with lists and dicts, and other things that can be looped over.
It's in the python docs here and here.
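The iterator protocol is typically implemented in the form of a Python generator. We yield a value for as long as we have one; when the generator function ends, Python raises StopIteration for us. (Do not raise StopIteration yourself inside a generator: since Python 3.7, PEP 479 turns that into a RuntimeError.)
So let's write our own iterator:
def my_iter():
    yield 1
    yield 2
    yield 3
    # falling off the end signals StopIteration to the caller
for i in my_iter():
    print(i)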
The result is:
1
2
3
A couple of things to note about that. The my_iter is a function. my_iter() returns an iterator.
If I had written using iterator like this instead:
j = my_iter() #j is the iterator that my_iter() returns
for i in j:
    print(i) #this loop runs until the iterator is exhausted
for i in j:
    print(i) #the iterator is exhausted so we never reach this line
And the result is the same as above. The iterator is exhausted by the time we enter the second for loop.
But that's rather simplistic. What about something more complicated, say a generator with a loop inside it?
def capital_iter(name):
    for x in name:
        yield x.upper()
    # no explicit raise needed: returning ends the iteration
for y in capital_iter('bobert'):
    print(y)
When it runs, the for loop inside the generator uses the string's own iterator (strings support iter() out of the box). This in turn allows us to run a for loop on the generator and yield the results until we are done.
B
O
B
E
R
T
This raises the question: what happens between yields in the iterator?
j = capital_iter("bobert")
print(next(j))
print(next(j))
print(next(j))
print("Hey there!")
print(next(j))
print(next(j))
print(next(j))
print(next(j)) #Raises StopIteration
The answer is that the function is paused at the yield, waiting for the next call to next().
B
O
B
Hey there!
E
R
T
Traceback (most recent call last):
File "", line 13, in
StopIteration
Some additional details about the behaviour of iter() with __getitem__ classes that lack their own __iter__ method.
Before __iter__ there was __getitem__. If __getitem__ works with ints from 0 to len(obj)-1, then iter() supports these objects. It will construct a new iterator that repeatedly calls __getitem__ with 0, 1, 2, ... until it gets an IndexError, which it converts to a StopIteration.
See this answer for more details of the different ways to create an iterator.
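A small sketch of that fallback, using a made-up class (Squares here is hypothetical and has no __iter__ method at all):

```python
class Squares:
    """Hypothetical class that defines only __getitem__."""
    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)   # tells iter()'s helper iterator to stop
        return index * index

sq = Squares(4)
it = iter(sq)                 # works even though there is no __iter__
values = list(it)             # __getitem__ called with 0, 1, 2, 3, then 4 fails
second_pass = list(sq)        # a fresh iterator is built each time
```

Because iter() builds a new helper iterator on each call, such an object can be looped over more than once.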
Excerpt from the Python Practice book:
5. Iterators & Generators
5.1. Iterators
We use for statement for looping over a list.
>>> for i in [1, 2, 3, 4]:
...     print i,
...
1 2 3 4
If we use it with a string, it loops over its characters.
>>> for c in "python":
...     print c
...
p
y
t
h
o
n
If we use it with a dictionary, it loops over its keys.
>>> for k in {"x": 1, "y": 2}:
...     print k
...
y
x
If we use it with a file, it loops over lines of the file.
>>> for line in open("a.txt"):
...     print line,
...
first line
second line
So there are many types of objects which can be used with a for loop. These are called iterable objects.
There are many functions which consume these iterables.
>>> ",".join(["a", "b", "c"])
'a,b,c'
>>> ",".join({"x": 1, "y": 2})
'y,x'
>>> list("python")
['p', 'y', 't', 'h', 'o', 'n']
>>> list({"x": 1, "y": 2})
['y', 'x']
5.1.1. The Iteration Protocol
The built-in function iter takes an iterable object and returns an iterator.
>>> x = iter([1, 2, 3])
>>> x
<listiterator object at 0x1004ca850>
>>> x.next()
1
>>> x.next()
2
>>> x.next()
3
>>> x.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Each time we call the next method on the iterator, it gives us the next element. If there are no more elements, it raises a StopIteration.
Iterators are implemented as classes. Here is an iterator that works like the built-in xrange function.
class yrange:
    def __init__(self, n):
        self.i = 0
        self.n = n
    def __iter__(self):
        return self
    def next(self):
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()
The __iter__ method is what makes an object iterable. Behind the scenes, the iter function calls the __iter__ method on the given object.
The return value of __iter__ is an iterator. It should have a next method and raise StopIteration when there are no more elements.
Let's try it out:
>>> y = yrange(3)
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 14, in next
StopIteration
Many built-in functions accept iterators as arguments.
>>> list(yrange(5))
[0, 1, 2, 3, 4]
>>> sum(yrange(5))
10
In the above case, both the iterable and the iterator are the same object. Notice that the __iter__ method returned self. That need not always be the case.
class zrange:
    def __init__(self, n):
        self.n = n
    def __iter__(self):
        return zrange_iter(self.n)
class zrange_iter:
    def __init__(self, n):
        self.i = 0
        self.n = n
    def __iter__(self):
        # Iterators are iterables too.
        # Adding this function to make them so.
        return self
    def next(self):
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()
If both the iterable and the iterator are the same object, it is consumed in a single iteration.
>>> y = yrange(5)
>>> list(y)
[0, 1, 2, 3, 4]
>>> list(y)
[]
>>> z = zrange(5)
>>> list(z)
[0, 1, 2, 3, 4]
>>> list(z)
[0, 1, 2, 3, 4]
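This difference suggests a rough way to tell a reusable container from a one-pass iterator, which is what the question at the top asks for. The helper below is only a sketch for Python 3: is_reiterable is a made-up name, not a standard function, and the check is a heuristic. It assumes that an object whose iter() returns a different object can be traversed again, which a class is free to violate.

```python
def is_reiterable(obj):
    """Heuristic: True if iter(obj) hands out a separate iterator object.

    One-pass iterators (generators, file objects, map objects, ...)
    return themselves from iter(), so they are reported as False.
    """
    try:
        return iter(obj) is not obj
    except TypeError:        # obj is not iterable at all
        return False

checks = {
    "list": is_reiterable([1, 2, 3]),
    "range": is_reiterable(range(5)),
    "str": is_reiterable("abc"),
    "generator": is_reiterable(k for k in range(5)),
    "list_iterator": is_reiterable(iter([1, 2, 3])),
    "int": is_reiterable(42),
}
```

Calling iter() on a generator does not consume any items, so the check can sit at the entrance of a function without disturbing the argument.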
5.2. Generators
Generators simplify the creation of iterators. A generator is a function that produces a sequence of results instead of a single value.
def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1
Each time the yield statement is executed the function generates a new value.
>>> y = yrange(3)
>>> y
<generator object yrange at 0x401f30>
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
So a generator is also an iterator. You don’t have to worry about the iterator protocol.
The word “generator” is confusingly used to mean both the function that generates and what it generates. In this chapter, I’ll use the word “generator” to mean the generated object and “generator function” to mean the function that generates it.
Can you think about how it is working internally?
When a generator function is called, it returns a generator object without even beginning execution of the function. When next method is called for the first time, the function starts executing until it reaches yield statement. The yielded value is returned by the next call.
The following example demonstrates the interplay between yield and call to next method on generator object.
>>> def foo():
...     print "begin"
...     for i in range(3):
...         print "before yield", i
...         yield i
...         print "after yield", i
...     print "end"
...
>>> f = foo()
>>> f.next()
begin
before yield 0
0
>>> f.next()
after yield 0
before yield 1
1
>>> f.next()
after yield 1
before yield 2
2
>>> f.next()
after yield 2
end
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Let's see an example:
def integers():
    """Infinite sequence of integers."""
    i = 1
    while True:
        yield i
        i = i + 1
def squares():
    for i in integers():
        yield i * i
def take(n, seq):
    """Returns first n values from the given sequence."""
    seq = iter(seq)
    result = []
    try:
        for i in range(n):
            result.append(seq.next())
    except StopIteration:
        pass
    return result
print take(5, squares()) # prints [1, 4, 9, 16, 25]
Concept 1
All generators are iterators, but not all iterators are generators.
Concept 2
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Concept 3
Quoting from the wiki (Generators):
Generator functions allow you to declare a function that behaves like an iterator, i.e. it can be used in a for loop.
In your case
>>> it = (i for i in range(5))
>>> type(it)
<type 'generator'>
>>> callable(getattr(it, 'iter', None))
False
>>> callable(getattr(it, 'next', None))
True

What are iterator, iterable, and iteration?

What are "iterable", "iterator", and "iteration" in Python? How are they defined?
Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.
In Python, iterable and iterator have specific meanings.
An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.
An iterator is an object with a next (Python 2) or __next__ (Python 3) method.
Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.
A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.
Here's the explanation I use in teaching Python classes:
An ITERABLE is:
anything that can be looped over (i.e. you can loop over a string or file) or
anything that can appear on the right-side of a for-loop: for x in iterable: ... or
anything you can call with iter() that will return an ITERATOR: iter(obj) or
an object that defines __iter__ that returns a fresh ITERATOR,
or it may have a __getitem__ method suitable for indexed lookup.
An ITERATOR is an object:
with state that remembers where it is during iteration,
with a __next__ method that:
returns the next value in the iteration
updates the state to point at the next value
signals when it is done by raising StopIteration
and that is self-iterable (meaning that it has an __iter__ method that returns self).
Notes:
The __next__ method in Python 3 is spelt next in Python 2, and
The builtin function next() calls that method on the object passed to it.
For example:
>>> s = 'cat' # s is an ITERABLE
# s is a str object that is immutable
# s has no state
# s has a __getitem__() method
>>> t = iter(s) # t is an ITERATOR
# t has state (it starts by pointing at the "c")
# t has a __next__() method and an __iter__() method
>>> next(t) # the next() function returns the next value and advances the state
'c'
>>> next(t) # the next() function returns the next value and advances
'a'
>>> next(t) # the next() function returns the next value and advances
't'
>>> next(t) # next() raises StopIteration to signal that iteration is complete
Traceback (most recent call last):
...
StopIteration
>>> iter(t) is t   # the iterator is self-iterable
True
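The ITERATOR checklist above can also be written out as a class. This is only an illustrative sketch for Python 3; Countdown is a made-up name.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""
    def __init__(self, start):
        self.current = start          # state: where we are in the iteration

    def __iter__(self):
        return self                   # self-iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration       # signals that iteration is done
        value = self.current
        self.current -= 1             # advance the state
        return value

c = Countdown(3)
first = next(c)      # takes one value and advances the state
rest = list(c)       # consumes what is left
again = list(c)      # nothing left: the iterator is exhausted
```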
The above answers are great, but as most of what I've seen, don't stress the distinction enough for people like me.
Also, people tend to get "too Pythonic" by putting definitions like "X is an object that has a __foo__() method" first. Such definitions are correct (they are based on the duck-typing philosophy), but the focus on methods tends to get in the way when you are trying to understand the concept in its simplicity.
So I add my version.
In natural language,
iteration is the process of taking one element at a time in a row of elements.
In Python,
iterable is an object that is, well, iterable, which simply put, means that
it can be used in iteration, e.g. with a for loop. How? By using iterator.
I'll explain below.
... while iterator is an object that defines how to actually do the
iteration--specifically what is the next element. That's why it must have
next() method.
Iterators are themselves also iterable, with the distinction that their __iter__() method returns the same object (self), regardless of whether or not its items have been consumed by previous calls to next().
So what does Python interpreter think when it sees for x in obj: statement?
Look, a for loop. Looks like a job for an iterator... Let's get one. ...
There's this obj guy, so let's ask him.
"Mr. obj, do you have your iterator?" (... calls iter(obj), which calls
obj.__iter__(), which happily hands out a shiny new iterator _i.)
OK, that was easy... Let's start iterating then. (x = _i.next() ... x = _i.next()...)
Since Mr. obj succeeded in this test (by having certain method returning a valid iterator), we reward him with adjective: you can now call him "iterable Mr. obj".
However, in simple cases, you don't normally benefit from having iterator and iterable separately. So you define only one object, which is also its own iterator. (Python does not really care that _i handed out by obj wasn't all that shiny, but just the obj itself.)
This is why in most examples I've seen (and what had been confusing me over and over),
you can see:
class IterableExample(object):
    def __iter__(self):
        return self
    def next(self):
        pass
instead of
class Iterator(object):
    def next(self):
        pass
class Iterable(object):
    def __iter__(self):
        return Iterator()
There are cases, though, when you can benefit from having the iterator separated from the iterable, such as when you want to have one row of items but several "cursors". For example, when you want to work with "current" and "forthcoming" elements, you can have separate iterators for both. Or multiple threads pulling from a huge list: each can have its own iterator to traverse all items. See @Raymond's and @glglgl's answers above.
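The "current" and "forthcoming" idea can be sketched with two independent iterators over the same list. The names below are chosen only for this example.

```python
items = ["a", "b", "c", "d"]

current = iter(items)        # first cursor
upcoming = iter(items)       # second, independent cursor
next(upcoming, None)         # move the second cursor one step ahead

# Each pair holds an element and the one that follows it
pairs = list(zip(current, upcoming))

# For contrast: a single shared iterator is advanced by both sides of zip
shared = iter(items)
shared_pairs = list(zip(shared, shared))
```

With two cursors every element appears together with its successor; with one shared iterator the elements are simply consumed two at a time.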
Imagine what you could do:
class SmartIterableExample(object):
    def create_iterator(self):
        # An amazingly powerful yet simple way to create arbitrary
        # iterator, utilizing object state (or not, if you are fan
        # of functional), magic and nuclear waste--no kittens hurt.
        pass # don't forget to add the next() method
    def __iter__(self):
        return self.create_iterator()
Notes:
I'll repeat again: a bare iterator (an object that has only next() and no
__iter__()) is not iterable, so it cannot be used as a "source" in a for loop.
What the for loop primarily needs is __iter__() (that returns something with next()).
Of course, for is not the only construct that iterates, so the above applies to
other constructs as well (comprehensions, list(), sum(), ...).
Iterator's next() can throw StopIteration to stop iteration. Does not have to,
though, it can iterate forever or use other means.
In the above "thought process", _i does not really exist. I've made up that name.
There's a small change in Python 3.x: next() method (not the built-in) now
must be called __next__(). Yes, it should have been like that all along.
You can also think of it like this: iterable has the data, iterator pulls the next
item
Disclaimer: I'm not a developer of any Python interpreter, so I don't really know what the interpreter "thinks". The musings above are solely demonstration of how I understand the topic from other explanations, experiments and real-life experience of a Python newbie.
An iterable is an object which has an __iter__() method. It can possibly be iterated over several times, as is the case for lists and tuples.
An iterator is the object which iterates. It is returned by an __iter__() method, returns itself via its own __iter__() method and has a next() method (__next__() in 3.x).
Iteration is the process of calling this next() (or __next__()) method until it raises StopIteration.
Example:
>>> a = [1, 2, 3] # iterable
>>> b1 = iter(a) # iterator 1
>>> b2 = iter(a) # iterator 2, independent of b1
>>> next(b1)
1
>>> next(b1)
2
>>> next(b2) # start over, as it is the first call to b2
1
>>> next(b1)
3
>>> next(b1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> b1 = iter(a) # new one, start over
>>> next(b1)
1
Here's my cheat sheet:
 sequence
  +
  |
  v
   def __getitem__(self, index: int):
  +    ...
  |    raise IndexError
  |
  |
  |              def __iter__(self):
  |             +  ...
  |             |  return <iterator>
  |             |
  |             |
  +--> or <-----+        def __next__(self):
       +        |       +    ...
       |        |       |    raise StopIteration
       v        |       |
    iterable    |       |
           +    |       |
           |    |       v
           |    +----> and +-------> iterator
           |                               ^
           v                               |
   iter(<iterable>) +----------------------+
                                           |
   def generator():                        |
  +    yield 1                             |
  |                 generator_expression +-+
  |                                        |
  +-> generator() +-> generator_iterator +-+
Quiz: Do you see how...
every iterator is an iterable?
a container object's __iter__() method can be implemented as a generator?
an iterable that has a __next__ method is not necessarily an iterator?
Answers:
Every iterator must have an __iter__ method. Having __iter__ is enough to be an iterable. Therefore every iterator is an iterable.
When __iter__ is called it should return an iterator (return <iterator> in the diagram above). Calling a generator returns a generator iterator which is a type of iterator.
class Iterable1:
    def __iter__(self):
        # a method (which is a function defined inside a class body)
        # calling iter() converts iterable (tuple) to iterator
        return iter((1,2,3))
class Iterable2:
    def __iter__(self):
        # a generator
        for i in (1, 2, 3):
            yield i
class Iterable3:
    def __iter__(self):
        # with PEP 380 syntax
        yield from (1, 2, 3)
# passes
assert list(Iterable1()) == list(Iterable2()) == list(Iterable3()) == [1, 2, 3]
Here is an example:
class MyIterable:
    def __init__(self):
        self.n = 0
    def __getitem__(self, index: int):
        return (1, 2, 3)[index]
    def __next__(self):
        n = self.n = self.n + 1
        if n > 3:
            raise StopIteration
        return n
# if you can iter it without raising a TypeError, then it's an iterable.
iter(MyIterable())
# but obviously `MyIterable()` is not an iterator since it does not have
# an `__iter__` method.
from collections.abc import Iterator
assert isinstance(MyIterable(), Iterator) # AssertionError
I don’t know if it helps anybody but I always like to visualize concepts in my head to better understand them. So as I have a little son I visualize iterable/iterator concept with bricks and white paper.
Suppose we are in the dark room and on the floor we have bricks for my son. Bricks of different size, color, does not matter now. Suppose we have 5 bricks like those. Those 5 bricks can be described as an object – let’s say bricks kit. We can do many things with this bricks kit – can take one and then take second and then third, can change places of bricks, put first brick above the second. We can do many sorts of things with those. Therefore this bricks kit is an iterable object or sequence as we can go through each brick and do something with it. We can only do it like my little son – we can play with one brick at a time. So again I imagine myself this bricks kit to be an iterable.
Now remember that we are in the dark room. Or almost dark. The thing is that we don’t clearly see those bricks, what color they are, what shape etc. So even if we want to do something with them – aka iterate through them – we don’t really know what and how because it is too dark.
What we can do is put a piece of white fluorescent paper next to the first brick, as an element of the bricks kit, so that we can see where the first brick-element is. Each time we take a brick from the kit, we move the white piece of paper to the next brick so that we can still see it in the dark room. This white piece of paper is nothing more than an iterator. It is an object as well, but an object that lets us work and play with the elements of our iterable object, the bricks kit.
That by the way explains my early mistake when I tried the following in an IDLE and got a TypeError:
>>> X = [1,2,3,4,5]
>>> next(X)
Traceback (most recent call last):
File "<pyshell#19>", line 1, in <module>
next(X)
TypeError: 'list' object is not an iterator
List X here was our bricks kit but NOT a white piece of paper. I needed to find an iterator first:
>>> X = [1,2,3,4,5]
>>> bricks_kit = [1,2,3,4,5]
>>> white_piece_of_paper = iter(bricks_kit)
>>> next(white_piece_of_paper)
1
>>> next(white_piece_of_paper)
2
>>>
Don’t know if it helps, but it helped me. If someone could confirm/correct visualization of the concept, I would be grateful. It would help me to learn more.
I don't think that you can get it much simpler than the documentation, however I'll try:
Iterable is something that can be iterated over. In practice it usually means a sequence, i.e. something that has a beginning and an end and some way to go through all the items in it.
You can think of an iterator as a helper pseudo-method (or pseudo-attribute) that gives (or holds) the next (or first) item in the iterable. (In practice it is just an object that defines the method next().)
Iteration is probably best explained by the Merriam-Webster definition of the word :
b : the repetition of a sequence of computer instructions a specified
number of times or until a condition is met — compare recursion
Iterables have a __iter__ method that instantiates a new iterator every time.
Iterators implement a __next__ method that returns individual items, and a __iter__ method that returns self .
Therefore, iterators are also iterable, but iterables are not iterators.
Luciano Ramalho, Fluent Python.
Iterable: an object you can iterate over, such as a sequence (list, string, etc.).
It has either a __getitem__ method or an __iter__ method. If we call the iter() function on such an object, we get an iterator.
Iterator: once we have the iterator object from iter(), we call its __next__() method (in Python 3) or simply next() (in Python 2) to get the elements one by one. An instance of a class that provides this method is called an iterator.
From docs:-
The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method __next__() which accesses elements in the container one at a time. When there are no more elements, __next__() raises a StopIteration exception which tells the for loop to terminate. You can call the __next__() method using the next() built-in function; this example shows how it all works:
>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
next(it)
StopIteration
Ex of a class:-
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]
>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
...     print(char)
...
m
a
p
s
Iterators are objects that implement the __iter__ and __next__ methods. If those methods are defined, we can use a for loop or comprehensions.
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0
    def __iter__(self):
        print('calling __iter__') # this will be called first and only once
        return self
    def __next__(self):
        print('calling __next__') # this will be called for each iteration
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result
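To see the order of the calls without reading printed output, here is a variant of the same class that records the calls in a list instead of printing them. The calls attribute is an addition made only for this sketch.

```python
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0
        self.calls = []                    # added for illustration: records call order

    def __iter__(self):
        self.calls.append('__iter__')      # happens once per loop
        return self

    def __next__(self):
        self.calls.append('__next__')      # happens once per item, plus once at the end
        if self.i >= self.length:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result

sq = Squares(3)
values = [x for x in sq]
calls_after_first_pass = list(sq.calls)
second_pass = [x for x in sq]              # sq is its own iterator and is now exhausted
```

The last __next__ call is the one that raises StopIteration and ends the loop.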
Iterators get exhausted. It means after you iterate over items, you cannot reiterate, you have to create a new object. Let's say you have a class, which holds the cities properties and you want to iterate over.
class Cities:
    def __init__(self):
        self._cities = ['Brooklyn', 'Manhattan', 'Prag', 'Madrid', 'London']
        self._index = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        else:
            item = self._cities[self._index]
            self._index += 1
            return item
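A short sketch of what exhaustion looks like with this class. The class body is repeated from above only so that the snippet runs on its own:

```python
class Cities:
    def __init__(self):
        self._cities = ['Brooklyn', 'Manhattan', 'Prag', 'Madrid', 'London']
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        item = self._cities[self._index]
        self._index += 1
        return item


cities = Cities()
first_pass = list(cities)
second_pass = list(cities)  # same object, its index is already at the end

print(len(first_pass))  # 5
print(second_pass)      # []
```

The second pass is silently empty, which is the same failure mode as the renormalize example in the question.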
An instance of class Cities is an iterator. However, if you want to iterate over the cities again, you have to create a new object, which can be expensive because the data is rebuilt each time. You can split the class into two classes: one holds the cities, and the second is an iterator that takes the cities object as an init parameter.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'London']
    def __len__(self):
        return len(self._cities)
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item
Now if we need a new iterator, we do not have to create the data (the cities) again. We create the cities object once and pass it to the iterator. But we are still doing extra work: we could implement this with only one class, with the iterator nested inside it.
An iterable is a Python object that implements the iterable protocol. It requires only __iter__(), which returns a new iterator object each time it is called.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'Paris']
    def __len__(self):
        return len(self._cities)
    def __iter__(self):
        return self.CityIterator(self)
    class CityIterator:
        def __init__(self, city_obj):
            self._city_obj = city_obj
            self._index = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item
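As a sketch of the same idea with less code: this swaps the nested CityIterator class for a generator-based __iter__, which is a different technique from the one in the answer above. Every call to __iter__ creates a new generator, so the object can be iterated more than once:

```python
class Cities:
    """Iterable only: each call to __iter__ returns a fresh iterator."""

    def __init__(self):
        self._cities = ['New York', 'Newark', 'Istanbul', 'Paris']

    def __len__(self):
        return len(self._cities)

    def __iter__(self):
        # Generator function: calling it builds a new generator
        # (which is an iterator) every time.
        for city in self._cities:
            yield city


cities = Cities()
print(list(cities))  # ['New York', 'Newark', 'Istanbul', 'Paris']
print(list(cities))  # same result, the second pass is not empty

# Two iterators over the same data keep independent positions.
it1 = iter(cities)
it2 = iter(cities)
next(it1)
print(next(it1))  # Newark
print(next(it2))  # New York
```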
Iterators have __iter__ and __next__; iterables have __iter__. So iterators are also iterables, but they are iterables that get exhausted. Iterables that are not iterators, on the other hand, do not become exhausted, because they return a new iterator every time they are iterated over.
You notice that the main part of the iterable code is in the iterator, and the iterable itself is nothing more than an extra layer that allows us to create and access the iterator.
Iterating over an iterable
Python has a built-in function iter() that calls __iter__(). When we iterate over an iterable, Python calls iter(), which returns an iterator, and then uses the iterator's __next__() to step through the data.
Note that in the above example, Cities is an iterable but not a sequence type, which means we cannot get a city by index. To fix this we just add __getitem__ to the Cities class.
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'Budapest', 'Newcastle']
    def __len__(self):
        return len(self._cities)
    def __getitem__(self, s):  # now a sequence type
        return self._cities[s]
    def __iter__(self):
        return self.CityIterator(self)
    class CityIterator:
        def __init__(self, city_obj):
            self._city_obj = city_obj
            self._index = 0
        def __iter__(self):
            return self
        def __next__(self):
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item
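A small sketch of what __getitem__ buys you. CitySequence is a hypothetical cut-down class (not from the answer above) that defines only __len__ and __getitem__, to show that indexing and slicing work and that iter() can fall back to __getitem__ when no __iter__ is defined:

```python
class CitySequence:
    """Hypothetical class with __getitem__ and __len__ but no __iter__."""

    def __init__(self):
        self._cities = ['New York', 'Newark', 'Budapest', 'Newcastle']

    def __len__(self):
        return len(self._cities)

    def __getitem__(self, s):
        # Delegating to the list handles ints, negative ints and slices.
        return self._cities[s]


cities = CitySequence()
print(cities[0])    # New York
print(cities[1:3])  # ['Newark', 'Budapest']
print(cities[-1])   # Newcastle

# With no __iter__, iter() calls __getitem__ with 0, 1, 2, ... until
# IndexError is raised, so the object can still be looped over repeatedly.
first = list(cities)
second = list(cities)
print(first == second)  # True
```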
iterable = [1, 2]
iterator = iter(iterable)
print(iterator.__next__())
print(iterator.__next__())
So:
An iterable is an object that can be looped over, e.g. a list, string, or tuple.
Calling the iter() function on an iterable returns an iterator object.
The iterator has a method named __next__ (in Python 3; just next in Python 2) that returns the next element of the iterable each time it is called.
The output of the code above is:
1
2
An iterable is an object that has an __iter__() method which returns an iterator. It is something that can be looped over.
Example: a list is iterable because we can loop over it, but it is not an iterator.
An iterator is an object that returns its next value each time next() is called on it. It is an object with state, so it remembers where it is during iteration.
To see whether an object has the __iter__() method, we can use the dir() function:
ls = ['hello','bye']
print(dir(ls))
Output
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
As you can see, the list has the __iter__() method, which means it is an iterable object, but it has no __next__() method, which is a feature of iterator objects.
Whenever you use a for loop, map, or a list comprehension in Python, the __next__ method of the iterator is called automatically to get each item from the iteration.
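Reading through dir() output by eye is error-prone. A less manual sketch of the same check uses hasattr() or the abstract base classes in collections.abc:

```python
from collections.abc import Iterable, Iterator

ls = ['hello', 'bye']
it = iter(ls)

# Direct attribute checks, equivalent to scanning dir() by eye.
print(hasattr(ls, '__iter__'), hasattr(ls, '__next__'))  # True False
print(hasattr(it, '__iter__'), hasattr(it, '__next__'))  # True True

# The same distinction expressed with abstract base classes.
print(isinstance(ls, Iterable), isinstance(ls, Iterator))  # True False
print(isinstance(it, Iterable), isinstance(it, Iterator))  # True True
```

One caveat: isinstance(obj, Iterable) does not detect classes that are iterable only through __getitem__, so calling iter(obj) and catching TypeError remains the most general test for iterability.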
Before dealing with iterables and iterators, it helps to start from the idea of a sequence.
Sequence: an ordered collection of data.
Iterable: an object that supports the __iter__ method. Sequences such as lists are iterables, but so are non-sequence types such as dicts and sets.
iter(): takes an iterable as input and returns an object known as an iterator.
Iterator: an object on which next() is called to traverse the data. Each call to next() returns the element currently being traversed.
example:
x=[1,2,3,4]
x is a sequence that holds a collection of data.
y=iter(x)
Calling iter(x) returns an iterator only if x has an __iter__ method (or supports the sequence protocol through __getitem__); otherwise it raises a TypeError. Here y becomes a list iterator over the elements 1, 2, 3, 4; it is not itself a list.
Because y is an iterator, it supports next().
Each call to next() returns the next element of the list.
After the last element has been returned, calling next() again raises StopIteration.
example:
>>> next(y)  # y.next() in Python 2
1
>>> next(y)
2
>>> next(y)
3
>>> next(y)
4
>>> next(y)
StopIteration
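As a runnable Python 3 sketch of the session above: it also shows the optional second argument of next(), which is returned instead of raising StopIteration once the iterator is exhausted.

```python
x = [1, 2, 3, 4]
y = iter(x)

print(next(y))  # 1
print(next(y))  # 2
print(list(y))  # [3, 4]  (whatever is left in the iterator)

# A default value suppresses StopIteration on an exhausted iterator.
print(next(y, 'done'))  # done

try:
    next(y)
except StopIteration:
    print('exhausted')

# The list itself is untouched and can hand out a new iterator.
print(list(iter(x)))  # [1, 2, 3, 4]
```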
Other people already explained comprehensively, what is iterable and iterator, so I will try to do the same thing with generators.
IMHO the main problem for understanding generators is a confusing use of the word “generator”, because this word is used in 2 different meanings:
1. as a tool for creating (generating) iterators,
   - in the form of a function returning an iterator (i.e. with the yield statement(s) in its body),
   - in the form of a generator expression;
2. as the result of using that tool, i.e. the resulting iterator.
   (In this meaning a generator is a special form of an iterator; the word “generator” points out how this iterator was created.)
Generator as a tool of the 1st type:
In[2]: def my_generator():
   ...:     yield 100
   ...:     yield 200
In[3]: my_generator
Out[3]: <function __main__.my_generator()>
In[4]: type(my_generator)
Out[4]: function
Generator as a result (i.e. an iterator) of the use of this tool:
In[5]: my_iterator = my_generator()
In[6]: my_iterator
Out[6]: <generator object my_generator at 0x00000000053EAE48>
In[7]: type(my_iterator)
Out[7]: generator
Generator as a tool of the 2nd type — indistinguishable from the resulting iterator of this tool:
In[8]: my_gen_expression = (2 * i for i in (10, 20))
In[9]: my_gen_expression
Out[9]: <generator object <genexpr> at 0x000000000542C048>
In[10]: type(my_gen_expression)
Out[10]: generator
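The two meanings can also be told apart programmatically. A minimal sketch using the standard inspect module, which also shows that the resulting generator object allows only one pass while the generator function can be called again:

```python
import inspect


def my_generator():
    yield 100
    yield 200


my_iterator = my_generator()
my_gen_expression = (2 * i for i in (10, 20))

# The tool (a generator function) versus the result (a generator object).
print(inspect.isgeneratorfunction(my_generator))  # True
print(inspect.isgenerator(my_generator))          # False
print(inspect.isgenerator(my_iterator))           # True
print(inspect.isgenerator(my_gen_expression))     # True

# The result is a one-pass iterator; the tool can produce a new one.
print(iter(my_iterator) is my_iterator)  # True
print(list(my_iterator))     # [100, 200]
print(list(my_iterator))     # []
print(list(my_generator()))  # [100, 200]
```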
Here's another view using collections.abc. This view may be useful the second time around or later.
From collections.abc we can see the following hierarchy:
builtins.object
    Iterable
        Iterator
            Generator
i.e. Generator is derived from Iterator is derived from Iterable is derived from the base object.
Hence,
Every iterator is an iterable, but not every iterable is an iterator. For example, [1, 2, 3] and range(10) are iterables, but not iterators. x = iter([1, 2, 3]) is an iterator and an iterable.
A similar relationship exists between Iterator and Generator.
Calling iter() on an iterator or a generator returns the object itself. Thus, if it is an iterator, the expression iter(it) is it evaluates to True.
Under the hood, a list comprehension like [2 * x for x in nums] or a for loop like for x in nums:, acts as though iter() is called on the iterable (nums) and then iterates over nums using that iterator. Hence, all of the following are functionally equivalent (with, say, nums=[1, 2, 3]):
for x in nums:
for x in iter(nums):
for x in iter(iter(nums)):
for x in iter(iter(iter(iter(iter(nums))))):
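The fact that iter(it) is it holds for iterators suggests one possible answer to the original question. The sketch below is a heuristic, not a guarantee: is_reiterable is a hypothetical helper name, and a custom class could return a fresh iterator object on every call while still reading from a one-shot source. renormalize here returns a list rather than using yield, so that the check runs when the function is called and not at the first next().

```python
def is_reiterable(obj):
    """Heuristic: True if iter(obj) returns an object other than obj itself.

    Hypothetical helper, not part of the standard library.
    """
    try:
        return iter(obj) is not obj
    except TypeError:
        return False  # not iterable at all


def renormalize(cont):
    """Scale the values so that they sum to 1.0 (needs two passes)."""
    if not is_reiterable(cont):
        raise TypeError('expected a re-iterable container, got a one-pass iterator')
    total = sum(cont)
    return [v / total for v in cont]


print(is_reiterable([1, 2, 3]))          # True
print(is_reiterable(range(5)))           # True
print(is_reiterable(iter([1, 2, 3])))    # False
print(is_reiterable(k for k in range(5)))  # False
print(renormalize(range(5)))             # [0.0, 0.1, 0.2, 0.3, 0.4]
```

Passing a generator expression to renormalize now raises TypeError at the call site instead of silently producing an empty result.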
For me, Python's glossary was most helpful for these questions, e.g. for iterable it says:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() method or with a __getitem__() method that implements Sequence semantics.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop. See also iterator, sequence, and generator.
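What the for statement does with that temporary variable can be written out by hand. This is a rough sketch of the equivalent while loop; the name _it is only a stand-in for the unnamed variable the glossary mentions:

```python
nums = [1, 2, 3]

collected = []
_it = iter(nums)       # done once, before the first pass of the loop
while True:
    try:
        x = next(_it)  # done at the start of every pass
    except StopIteration:
        break          # the loop ends when the iterator is exhausted
    collected.append(2 * x)

print(collected)  # [2, 4, 6]

# The for statement produces the same result.
print([2 * x for x in nums] == collected)  # True
```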
