I've written a generator that does nothing more or less than store a range from 0 to 10:
result = (num for num in range(11))
When I want to print values, I can use next():
print(next(result))
[Out]: 0
print(next(result))
[Out]: 1
print(next(result))
[Out]: 2
print(next(result))
[Out]: 3
print(next(result))
[Out]: 4
If I then run a for loop on the generator, it runs on the values that I have not called next() on:
for value in result:
    print(value)
[Out]: 5
6
7
8
9
10
Has the generator eliminated the other values by acting on them with a next() function? I've tried to find some documentation on the functionality of next() and generators but haven't been successful.
Actually, this can be deduced from the docs for next() and from an understanding of the iterator protocol/contract:
next(iterator[, default])
Retrieve the next item from the iterator by
calling its __next__() method. If default is given, it is returned if
the iterator is exhausted, otherwise StopIteration is raised.
Yes. Calling a generator's __next__ method (which is what next() does) retrieves the next value and advances the generator past it, so that value will not be produced again.
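A minimal sketch of that behavior, using a smaller range than the question for brevity (the names gen, first_two, remaining, and leftover are only for illustration): values taken with next() are gone, and a later for loop sees only what is left.

```python
# A generator is its own iterator, so next() and for share one position.
gen = (num for num in range(5))

first_two = [next(gen), next(gen)]    # consumes 0 and 1
remaining = [value for value in gen]  # the loop resumes where next() stopped
leftover = list(gen)                  # the generator is now exhausted

print(first_two)   # [0, 1]
print(remaining)   # [2, 3, 4]
print(leftover)    # []
```

If you need the values more than once, store them in a list first or recreate the generator.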
tldr; yes
An iterator is essentially a value producer that yields successive values from its associated iterable object. The built-in function next() is used to obtain the next value from an iterator.
Here is an example using a short list:
>>> l = ['Sarah', 'Roark']
>>> itr = iter(l)
>>> itr
<list_iterator object at 0x100ba8950>
>>> next(itr)
'Sarah'
>>> next(itr)
'Roark'
>>> next(itr)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
In this example, l is an iterable list and itr is the associated iterator, obtained with iter(). Each next(itr) call obtains the next value from itr.
Notice how an iterator retains its state internally. It knows which values have been obtained already, so when you call next(), it knows what value to return next.
If all the values from an iterator have been returned already, a subsequent next() call raises a StopIteration exception. Any further attempts to obtain values from the iterator will fail.
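To illustrate the exhausted state, here is a small sketch (the variable names and the 'done' fallback value are my own): once the iterator runs out, a further next() raises StopIteration, unless you pass the optional default argument.

```python
names = ['Sarah', 'Roark']
itr = iter(names)

collected = [next(itr), next(itr)]  # both values are now consumed

try:
    next(itr)
    raised = False
except StopIteration:
    raised = True                   # an exhausted iterator raises StopIteration

fallback = next(itr, 'done')        # a default suppresses the exception

print(collected, raised, fallback)
```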
We can only obtain values from an iterator in one direction. We can’t go backward. There is no prev() function. But we can define two independent iterators on the same iterable object:
>>> l
['Sarah', 'Roark', 30]
>>> itr1 = iter(l)
>>> itr2 = iter(l)
>>> next(itr1)
'Sarah'
>>> next(itr1)
'Roark'
>>> next(itr1)
30
>>> next(itr2)
'Sarah'
Yes, a for loop in Python just fetches the next item from the iterator on each pass, the same way that next() does.
https://docs.python.org/3/reference/compound_stmts.html#the-for-statement
The suite is then executed once for each item provided by the iterator, in the order returned by the iterator.
So you can think of a for loop like this:
for x in container:
    statement()
As (almost) equivalent to a while loop:
iterator = iter(container)
while True:
    # "almost": a None item in container would also end this loop early
    x = next(iterator, None)
    if x is None:
        break
    statement()
If container is already an iterator, then iter(container) is container.
Note: Technically, a for loop is more like this:
iterator = iter(container)
while True:
    try:
        x = iterator.__next__()
    except StopIteration:
        break
    statement()
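As a rough sketch of the same idea wrapped in a function (for_each and its parameters are hypothetical names, not anything from the standard library), the loop below behaves like a plain for statement without break or else support:

```python
def for_each(container, body):
    """Call body(x) for each item, mimicking a plain for loop."""
    iterator = iter(container)      # returns the object itself for iterators
    while True:
        try:
            x = next(iterator)
        except StopIteration:
            break                   # normal end of iteration
        body(x)

seen = []
for_each([1, None, 3], seen.append)  # None is an ordinary item here
print(seen)  # [1, None, 3]

gen = (n * n for n in range(3))
same_object = iter(gen) is gen
print(same_object)  # True
```

Relying on StopIteration rather than a sentinel value is what lets None pass through as a regular item.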
Related
How come I can call next() on a reversed range but can't call it on a regular range?
r1 = range(6)
next(r1) # Error
r2 = reversed(range(6))
next(r2) # -> 5
There is a subtle distinction here. First, range is a type. An instance of range is not an iterator, because range.__next__ is not defined. An instance is iterable, though, because range.__iter__ is defined, so you can get an iterator with, for example, iter(range(3)).
>>> type(range(1))
<class 'range'>
>>> type(iter(range(1)))
<class 'range_iterator'>
range.__next__ is not defined, but range_iterator.__next__ is.
An instance of range represents a bounded sequence of integers without actually storing each of them. As such, you can have multiple independent iterators over the same range.
>>> r = range(10)
>>> i1 = iter(r)
>>> next(i1)
0
>>> next(i1)
1
>>> next(i1)
2
>>> i2 = iter(r)
>>> next(i2)
0
>>> next(i1)
3
reversed, however, by definition returns an iterator. If need be, it can call iter on its iterable argument in order to get an iterator to reverse. It can also use its argument's __reversed__ method to get a reverse iterator. range.__reversed__ yields an iterator like range.__iter__, but going in the opposite direction.
Because, per the reversed() docs:
Return a reverse iterator.
range(), however, returns an immutable sequence.
next() can only be used on objects with a __next__() method.
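A quick way to see the difference is to check for the method directly. This is only a sketch: hasattr is a coarse check, and the variable names are my own.

```python
r1 = range(6)
r2 = reversed(range(6))

range_has_next = hasattr(r1, '__next__')     # range is iterable, not an iterator
reversed_has_next = hasattr(r2, '__next__')  # reversed() hands back an iterator

first_forward = next(iter(r1))  # get an iterator explicitly, then call next()
first_backward = next(r2)

print(range_has_next, reversed_has_next)  # False True
print(first_forward, first_backward)      # 0 5
```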
Look at below sample:
a = [1, 2, 3, 4]
for i in a:
    print(i)
a is the list (iterable) not the iterator.
I'm not asking whether __iter__ or iter() converts a list to an iterator!
I'm asking whether the for loop itself implicitly converts the list and then calls __iter__ for iteration, keeping the list intact instead of consuming it the way an iterator is consumed?
Since Stack Overflow identified my question as a possible duplicate:
The unique part is that I'm not asking about the for loop as a concept, nor about __iter__. I'm asking about the core mechanism of the for loop and its relationship with iter.
I'm asking whether the for loop itself implicitly converts the list and then calls iter for iteration, keeping the list intact instead of consuming it the way an iterator is consumed?
The for loop does not convert the list implicitly in the sense that it mutates the list, but it implicitly creates an iterator from the list. The list itself will not change state during iteration, but the created iterator will.
a = [1, 2, 3]
for x in a:
    print(x)
is equivalent to
a = [1, 2, 3]
it = iter(a) # calls a.__iter__
while True:
    try:
        x = next(it)
    except StopIteration:
        break
    print(x)
Here's proof that __iter__ actually gets called:
import random

class DemoIterable(object):
    def __iter__(self):
        print('__iter__ called')
        return DemoIterator()

class DemoIterator(object):
    def __iter__(self):
        return self

    def __next__(self):
        print('__next__ called')
        r = random.randint(1, 10)
        if r == 5:
            print('raising StopIteration')
            raise StopIteration
        return r
Iteration over a DemoIterable:
>>> di = DemoIterable()
>>> for x in di:
...     print(x)
...
__iter__ called
__next__ called
9
__next__ called
8
__next__ called
10
__next__ called
3
__next__ called
10
__next__ called
raising StopIteration
I am currently reading in the official documentation of Python 3.5.
It states that range() is iterable, and that list() and for are iterators. [section 4.3]
However, here it states that zip() makes an iterator.
My question is that when we use this instruction:
list(zip(list1, list2))
are we using an iterator (list()) to iterate through another iterator?
The documentation is creating some confusion here, by re-using the term 'iterator'.
There are three components to the iterator protocol:
1. Iterables; things you can potentially iterate over and get their elements, one by one.
2. Iterators; things that do the iteration. Every time you want to step through all items of an iterable, you need one of these to keep track of where you are in the process. These are not re-usable; once you reach the end, that's it. For most iterables, you can create multiple independent iterators, each tracking position independently.
3. Consumers of iterators; those things that want to do something with the items.
A for loop is an example of the latter, so #3. A for loop uses the iter() function to produce an iterator (#2 above) for whatever you want to loop over, so that "whatever" must be an iterable (#1 above).
range() is an example of #1; it is an iterable object. You can iterate over it multiple times, independently:
>>> r = range(5)
>>> r_iter_1 = iter(r)
>>> next(r_iter_1)
0
>>> next(r_iter_1)
1
>>> r_iter_2 = iter(r)
>>> next(r_iter_2)
0
>>> next(r_iter_1)
2
Here r_iter_1 and r_iter_2 are two separate iterators, and each time you ask for a next item they do so based on their own internal bookkeeping.
list() is an example of both an iterable (#1) and an iteration consumer (#3). If you pass another iterable (#1) to the list() call, a list object is produced containing all elements from that iterable. But list objects themselves are also iterables.
zip(), in Python 3, takes in multiple iterables (#1), and is itself an iterator (#2). zip() stores a new iterator (#2) for each of the iterables you gave it. Each time you ask zip() for the next element, zip() builds a new tuple with the next elements from each of the contained iterables:
>>> lst1, lst2 = ['foo', 'bar'], [42, 81]
>>> zipit = zip(lst1, lst2)
>>> next(zipit)
('foo', 42)
>>> next(zipit)
('bar', 81)
So in the end, list(zip(list1, list2)) uses both list1 and list2 as iterables (#1); zip() consumes those (#3) as it is itself being consumed by the outer list() call.
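To make the description of zip() more concrete, here is a simplified pure-Python sketch of the idea. simple_zip is a hypothetical name, and this is not how the built-in is actually implemented; it only shows one iterator being stored per input and advanced in lockstep.

```python
def simple_zip(*iterables):
    """Yield tuples of items, stopping when the shortest input runs out."""
    iterators = [iter(it) for it in iterables]  # one iterator per iterable
    if not iterators:
        return
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return                          # any exhausted input ends the zip
        yield tuple(row)

lst1, lst2 = ['foo', 'bar'], [42, 81]
zipit = simple_zip(lst1, lst2)
first = next(zipit)
rest = list(zipit)      # list() consumes whatever is left

print(first)  # ('foo', 42)
print(rest)   # [('bar', 81)]
```

The generator plays role #2 for its caller and role #3 for the iterators it holds, which mirrors the description above.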
The documentation is badly worded. Here's the section you're referring to:
We say such an object is iterable, that is, suitable as a target for functions and constructs that expect something from which they can obtain successive items until the supply is exhausted. We have seen that the for statement is such an iterator. The function list() is another; it creates lists from iterables:
In this paragraph, iterator does not refer to a Python iterator object, but the general idea of "something which iterates over something". In particular, the for statement cannot be an iterator object because it isn't an object at all; it's a language construct.
To answer your specific question:
... when we use this instruction:
list(zip(list1, list2))
are we using an iterator (list()) to iterate through another iterator?
No, list() is not an iterator. It's the constructor for the list type. It can accept any iterable (including an iterator) as an argument, and uses that iterable to construct a list.
zip() is an iterator function, that is, a function which returns an iterator. In your example, the iterator it returns is passed to list(), which constructs a list object from it.
A simple way to tell whether an object is an iterator is to call next() with it, and see what happens:
>>> list1 = [1, 2, 3]
>>> list2 = [4, 5, 6]
>>> zipped = zip(list1, list2)
>>> zipped
<zip object at 0x7f27d9899688>
>>> next(zipped)
(1, 4)
In this case, the next element of zipped is returned.
>>> list3 = list(zipped)
>>> list3
[(2, 5), (3, 6)]
Notice that only the last two elements of the iterator are found in list3, because we already consumed the first one with next().
>>> next(list3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
This doesn't work, because lists are not iterators.
>>> next(zipped)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
This time, although zipped is an iterator, calling next() with it raises StopIteration because it's already been exhausted to construct list3.
I'm not sure what the n in the following for loop is called. Is there a term for it?
for n in [1,2,3,4,5]:
    print(n)
And, am I correct that the list itself is the iterator of the for loop ?
While n is called a loop variable, the list is absolutely not an iterator. It is an iterable object, i.e. an iterable, but it is not an iterator. An iterable may be an iterator itself, but not always. That is to say, iterators are iterable, but not all iterables are iterators. In the case of a list, it is simply an iterable.
It is an iterable because it implements an __iter__ method, which returns an iterator:
From the Python Glossary an iterable is:
An object capable of returning its members one at a time. Examples of
iterables include all sequence types (such as list, str, and tuple)
and some non-sequence types like dict, file objects, and objects of
any classes you define with an __iter__() or __getitem__() method.
Iterables can be used in a for loop and in many other places where a
sequence is needed (zip(), map(), ...). When an iterable object is
passed as an argument to the built-in function iter(), it returns an
iterator for the object. This iterator is good for one pass over the
set of values. When using iterables, it is usually not necessary to
call iter() or deal with iterator objects yourself. The for statement
does that automatically for you, creating a temporary unnamed variable
to hold the iterator for the duration of the loop.
So, observe:
>>> x = [1,2,3]
>>> iterator = iter(x)
>>> type(iterator)
<class 'list_iterator'>
>>> next(iterator)
1
>>> next(iterator)
2
>>> next(iterator)
3
>>> next(iterator)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
It is illuminating to understand that a for-loop in Python such as the following:
for n in some_iterable:
    # do something
is equivalent to:
iterator = iter(some_iterable)
while True:
    try:
        n = next(iterator)
        # do something
    except StopIteration as e:
        break
Iterators, which are returned by a call to an object's __iter__ method, also implement the __iter__ method (usually returning themselves), but they also implement a __next__ method. Thus, an easy way to check whether something is an iterator is to see whether it implements a __next__ method, for example by calling next() on it:
>>> next(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
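If you prefer not to trigger an exception, the abstract base classes in collections.abc offer another check. A short sketch, with variable names of my own choosing:

```python
from collections.abc import Iterable, Iterator

x = [1, 2, 3]
it = iter(x)

list_is_iterable = isinstance(x, Iterable)   # True: list defines __iter__
list_is_iterator = isinstance(x, Iterator)   # False: list has no __next__
iter_is_iterable = isinstance(it, Iterable)  # True: iterators define __iter__
iter_is_iterator = isinstance(it, Iterator)  # True: it defines __next__ as well

print(list_is_iterable, list_is_iterator, iter_is_iterable, iter_is_iterator)
```

One caveat: isinstance(obj, Iterable) looks for __iter__, so it does not detect objects that are iterable only through __getitem__.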
Again, from the Python Glossary, an iterator is:
An object representing a stream of data. Repeated calls to the
iterator’s __next__() method (or passing it to the built-in function
next()) return successive items in the stream. When no more data are
available a StopIteration exception is raised instead. At this point,
the iterator object is exhausted and any further calls to its
__next__() method just raise StopIteration again. Iterators are
required to have an __iter__() method that returns the iterator object
itself so every iterator is also iterable and may be used in most
places where other iterables are accepted. One notable exception is
code which attempts multiple iteration passes. A container object
(such as a list) produces a fresh new iterator each time you pass it
to the iter() function or use it in a for loop. Attempting this with
an iterator will just return the same exhausted iterator object used
in the previous iteration pass, making it appear like an empty
container.
I've illustrated the behavior of an iterator with the next function above, so now I want to concentrate on the last part of that definition, the notable exception about multiple iteration passes.
Basically, an iterator can be used in the place of an iterable because iterators are always iterable. However, an iterator is good for only a single pass. So, if I use a non-iterator iterable, like a list, I can do stuff like this:
>>> my_list = ['a','b','c']
>>> for c in my_list:
...     print(c)
...
a
b
c
And this:
>>> for c1 in my_list:
...     for c2 in my_list:
...         print(c1,c2)
...
a a
a b
a c
b a
b b
b c
c a
c b
c c
>>>
An iterator behaves almost in the same way, so I can still do this:
>>> it = iter(my_list)
>>> for c in it:
...     print(c)
...
a
b
c
>>>
However, iterators do not support multiple iteration (well, you can make your own iterator that does, but generally they do not):
>>> it = iter(my_list)
>>> for c1 in it:
...     for c2 in it:
...         print(c1,c2)
...
a b
a c
Why is that? Well, recall what is happening with the iterator protocol which is used by a for loop under the hood, and consider the following:
>>> my_list = ['a','b','c','d','e','f','g']
>>> iterator = iter(my_list)
>>> iterator_of_iterator = iter(iterator)
>>> next(iterator)
'a'
>>> next(iterator)
'b'
>>> next(iterator_of_iterator)
'c'
>>> next(iterator_of_iterator)
'd'
>>> next(iterator)
'e'
>>> next(iterator_of_iterator)
'f'
>>> next(iterator)
'g'
>>> next(iterator)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> next(iterator_of_iterator)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
When I used iter() on an iterator, it returned itself!
>>> id(iterator)
139788446566216
>>> id(iterator_of_iterator)
139788446566216
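Given that, one way to get the full nested product is to let each loop ask the list for a fresh iterator instead of sharing one. A sketch (shared and independent are illustrative names):

```python
my_list = ['a', 'b', 'c']

# One shared iterator: the inner loop drains it, so the outer loop ends early.
it = iter(my_list)
shared = [(c1, c2) for c1 in it for c2 in it]

# A fresh iterator per loop: each for clause calls iter(my_list) itself.
independent = [(c1, c2) for c1 in my_list for c2 in my_list]

print(shared)            # [('a', 'b'), ('a', 'c')]
print(len(independent))  # 9
```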
The example you gave is an "iterator-based for-loop"
n is called the loop variable.
The role that list plays is more troublesome to name.
Indeed, after an interesting conversation with @juanpa.arrivillaga I've concluded that there simply isn't a "clearly correct formal name", nor a commonly used name, for that syntactic element.
That being said, I do think that if you referred to it in context in a sentence as "the loop iterator" everyone would know what you meant.
In doing so, you take the risk of confusing yourself or someone else with the fact that the syntactic element in that position is not in fact an iterator; it's a collection or (loosely, but from the definition in the referenced article) an "iterable of some sort".
I suspect that one reason why there isn't a name for this is that we hardly ever have to refer to it in a sentence. Another is that the types of element that can appear in that position vary widely, so it is hard to safely cover them all with a label.
I don't understand exactly why the __iter__ special method just returns the object it's called on (if it's called on an iterator). Is it essentially just a flag indicating that the object is an iterator?
EDIT: Actually, I discovered that "This is required to allow both containers and iterators to be used with the for and in statements." https://docs.python.org/3/library/stdtypes.html#iterator.iter
Alright, here's how I understand it: When writing a for loop, you're allowed to specify either an iterable or an iterator to loop over. But Python ultimately needs an iterator for the loop, so it calls the __iter__ method on whatever it's given. If it's been given an iterable, the __iter__ method will produce an iterator, and if it's been given an iterator, the __iter__ method will likewise produce an iterator (the original object given).
When you loop over something using for x in something, then the loop actually calls iter(something) first, so it has something to work with. In general, the for loop is approximately equivalent to something like this:
something_iterator = iter(something)
while True:
    try:
        x = next(something_iterator)
        # loop body
    except StopIteration:
        break
So as you already figured out yourself, in order to be able to loop over an iterator, i.e. when something is already an iterator, iterators should always return themselves when calling iter() on them. So this basically makes sure that iterators are also iterable.
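A minimal sketch of a hand-written iterator shows why this matters. Countdown is a made-up class for illustration; because its __iter__ returns self, the same object works with both next() and for.

```python
class Countdown:
    """Iterator that counts from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # an iterator is its own iterator

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

c = Countdown(3)
first = next(c)                  # 3
rest = [n for n in c]            # the loop calls iter(c) and gets c back
print(first, rest)               # 3 [2, 1]
```

Without the __iter__ method, next(c) would still work, but using c in a for loop would raise a TypeError because the object would not be iterable.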
This depends on what object you call iter on. If an object is already an iterator, then there is no operation required to convert it to an iterator, because it already is one. But if the object is not an iterator, but is iterable, then an iterator is constructed from the object.
A good example of this is the list object:
>>> x = [1, 2, 3]
>>> iter(x) == x
False
>>> iter(x)
<list_iterator object at 0x7fccadc5feb8>
>>> x
[1, 2, 3]
Lists are iterable, but they are not themselves iterators. The result of list.__iter__ is not the original list.
In Python, whenever you use a loop to iterate over an object, something like the following happens.
Let's try to understand it for a list object.
>>> l = [1, 2, 3] # Defined list l
If we iterate over the above list:
>>> for i in l:
...     print(i)
...
1
2
3
When you iterate over list l like this, the Python for loop calls l.__iter__(), which in turn returns an iterator object.
>>> for i in l.__iter__():
...     print(i)
...
1
2
3
To understand this better, let's customize list by creating a new list class.
>>> class ListOverride(list):
...     def __iter__(self):
...         raise TypeError('Not iterable')
...
Here I've created a ListOverride class that inherits from list and overrides the list.__iter__ method to raise TypeError.
>>> ll = ListOverride([1, 2, 3])
>>> ll
[1, 2, 3]
Then I've created a new list using the ListOverride class. Since it's a list object, you might expect it to iterate the same way a list does.
>>> for i in ll:
...     print(i)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in __iter__
TypeError: Not iterable
If we try to iterate over the ListOverride object ll, we end up with the TypeError('Not iterable') raised by our __iter__ method, which shows that the for loop really does call __iter__.