yield from vs yield in for-loop

yield from vs yield in for-loop - python

My understanding of yield from is that it is similar to yielding every item from an iterable. Yet, I observe the different behavior in the following example.
I have Class1
class Class1:
def __init__(self, gen):
self.gen = gen
def __iter__(self):
for el in self.gen:
yield el
and Class2 that different only in replacing yield in for loop with yield from
class Class2:
def __init__(self, gen):
self.gen = gen
def __iter__(self):
yield from self.gen
The code below reads the first element from an instance of a given class and then reads the rest in a for loop:
a = Class1((i for i in range(3)))
print(next(iter(a)))
for el in iter(a):
print(el)
This produces different outputs for Class1 and Class2. For Class1 the output is
0
1
2
and for Class2 the output is
0
Live demo
What is the mechanism behind yield from that produces different behavior?

What Happened?
When you use next(iter(instance_of_Class2)), iter() calls .close() on the inner generator when it (the iterator, not the generator!) goes out of scope (and is deleted), while with Class1, iter() only closes its instance
>>> g = (i for i in range(3))
>>> b = Class2(g)
>>> i = iter(b) # hold iterator open
>>> next(i)
0
>>> next(i)
1
>>> del(i) # closes g
>>> next(iter(b))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
This behavior is described in PEP 342 in two parts
the new .close() method (well, new to Python 2.5)
from the Specification Summary
Add support to ensure that close() is called when a generator iterator is garbage-collected.
What happens is a little clearer (if perhaps surprising) when multiple generator delegations occur; only the generator being delegated is closed when its wrapping iter is deleted
>>> g1 = (a for a in range(10))
>>> g2 = (a for a in range(10, 20))
>>> def test3():
... yield from g1
... yield from g2
...
>>> next(test3())
0
>>> next(test3())
10
>>> next(test3())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Fixing Class2
What options are there to make Class2 behave more the way you expect?
Notably, other strategies, though they don't have the visually pleasing sugar of yield from or some of its potential benefits gives you a way to interact with the values, which seems like a primary benefit
avoid creating a structure like this at all ("just don't do that!")
if you don't interact with the generator and don't intend to keep a reference to the iterator, why bother wrapping it at all? (see above comment about interacting)
create the iterator yourself internally (this may be what you expected)
>>> class Class3:
... def __init__(self, gen):
... self.iterator = iter(gen)
...
... def __iter__(self):
... return self.iterator
...
>>> c = Class3((i for i in range(3)))
>>> next(iter(c))
0
>>> next(iter(c))
1
make the whole class a "proper" Generator
while testing this, it plausibly highlights some iter() inconsistency - see comments below (ie. why isn't e closed?)
also an opportunity to pass multiple generators with itertools.chain.from_iterable
>>> class Class5(collections.abc.Generator):
... def __init__(self, gen):
... self.gen = gen
... def send(self, value):
... return next(self.gen)
... def throw(self, value):
... raise StopIteration
... def close(self): # optional, but more complete
... self.gen.close()
...
>>> e = Class5((i for i in range(10)))
>>> next(e) # NOTE iter is not necessary!
0
>>> next(e)
1
>>> next(iter(e)) # but still works
2
>>> next(iter(e)) # doesn't close e?? (should it?)
3
>>> e.close()
>>> next(e)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.9/_collections_abc.py", line 330, in __next__
return self.send(None)
File "<stdin>", line 5, in send
StopIteration
Hunting the Mystery
A better clue is that if you directly try again, next(iter(instance)) raises StopIteration, indicating the generator is permanently closed (either through exhaustion or .close()), and why iterating over it with a for loop yields no more values
>>> a = Class1((i for i in range(3)))
>>> next(iter(a))
0
>>> next(iter(a))
1
>>> b = Class2((i for i in range(3)))
>>> next(iter(b))
0
>>> next(iter(b))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
However, if we name the iterator, it works as expected
>>> b = Class2((i for i in range(3)))
>>> i = iter(b)
>>> next(i)
0
>>> next(i)
1
>>> j = iter(b)
>>> next(j)
2
>>> next(i)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
To me, this suggests that when the iterator doesn't have a name, it calls .close() when it goes out of scope
>>> def gen_test(iterable):
... yield from iterable
...
>>> g = gen_test((i for i in range(3)))
>>> next(iter(g))
0
>>> g.close()
>>> next(iter(g))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Disassembling the result, we find the internals are a little different
>>> a = Class1((i for i in range(3)))
>>> dis.dis(a.__iter__)
6 0 LOAD_FAST 0 (self)
2 LOAD_ATTR 0 (gen)
4 GET_ITER
>> 6 FOR_ITER 10 (to 18)
8 STORE_FAST 1 (el)
7 10 LOAD_FAST 1 (el)
12 YIELD_VALUE
14 POP_TOP
16 JUMP_ABSOLUTE 6
>> 18 LOAD_CONST 0 (None)
20 RETURN_VALUE
>>> b = Class2((i for i in range(3)))
>>> dis.dis(b.__iter__)
6 0 LOAD_FAST 0 (self)
2 LOAD_ATTR 0 (gen)
4 GET_YIELD_FROM_ITER
6 LOAD_CONST 0 (None)
8
10 POP_TOP
12 LOAD_CONST 0 (None)
14 RETURN_VALUE
Notably, the yield from version has GET_YIELD_FROM_ITER
If TOS is a generator iterator or coroutine object it is left as is. Otherwise, implements TOS = iter(TOS).
(subtly, YIELD_FROM keyword appears to be removed in 3.11)
So if the given iterable (to the class) is a generator iterator, it'll be handed off directly, giving the result we (might) expect
Extras
Passing an iterator which isn't a generator (iter() creates a new iterator each time in both cases)
>>> a = Class1([i for i in range(3)])
>>> next(iter(a))
0
>>> next(iter(a))
0
>>> b = Class2([i for i in range(3)])
>>> next(iter(b))
0
>>> next(iter(b))
0
Expressly closing Class1's internal generator
>>> g = (i for i in range(3))
>>> a = Class1(g)
>>> next(iter(a))
0
>>> next(iter(a))
1
>>> a.gen.close()
>>> next(iter(a))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
generator is only closed by iter when deleted if instance is popped
>>> g = (i for i in range(10))
>>> b = Class2(g)
>>> i = iter(b)
>>> next(i)
0
>>> j = iter(b)
>>> del(j) # next() not called on j
>>> next(i)
1
>>> j = iter(b)
>>> next(j)
2
>>> del(j) # generator closed
>>> next(i) # now fails, despite range(10) above
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration

updated
I don't see it as that complicated, and the resulting behavior can be seen as actually unsurprising.
When the iterator goes out of scope, Python will throw a "GeneratorExit" exception in the (innermost) generator.
On the "classic" for form, the exception happens in the user-written __iter__ method, is not catch, and is suppressed when bubbling up by the generator mechanisms.
On the yield from form, the same exception is thrown in the inner self.gen, thus "killing" it, and bubbles up to the user-written __iter__ .
Writing another intermediate generator can make this easily visible:
def inner_gen(gen):
try:
for item in gen:
yield item
except GeneratorExit:
print("Generator exit thrown in inner generator")
class Class1:
def __init__(self, gen):
self.gen = inner_gen(gen)
def __iter__(self):
try:
for el in self.gen:
yield el
except GeneratorExit:
print("Generator exit thrown in outer generator for 'classic' form")
class Class2(Class1):
def __iter__(self):
try:
yield from self.gen
except GeneratorExit as exit:
print("Generator exit thrown in outer generator for 'yield from' form" )
first = lambda g:next(iter(g))
And now:
In [324]: c1 = Class1((i for i in range(2)))
In [325]: first(c1)
Generator exit thrown in outer generator for 'classic' form
Out[325]: 0
In [326]: first(c1)
Generator exit thrown in outer generator for 'classic' form
Out[326]: 1
In [327]: c2 = Class2((i for i in range(2)))
In [328]: first(c2)
Generator exit thrown in inner generator
Generator exit thrown in outter generator for 'yield from' form
Out[328]: 0
In [329]: first(c2)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[329], line 1
(...)
StopIteration:
update
I had a previous answer text speculating how the call to close would take place, skipping the intermediate generator - it is not that simple regarding close though: Python will always call __del__ - not close, which is only called by the user, or in certain circunstances that were hard to pin down. But it will always throw the GeneratorExit exception in a generator-function body (not in a class with explict __next__ and throw , though - let's skip this for another question :-D )

Related

What does a yield inside a yield do?

Consider the following code:
def mygen():
yield (yield 1)
a = mygen()
print(next(a))
print(next(a))
The output yields:
1
None
What does the interpreter do at the "outside" yield exactly?

a is a generator object. The first time you call next on it, the body is evaluated up to the first yield expression (that is, the first to be evaluated: the inner one). That yield produces the value 1 for next to return, then blocks until the next entry into the generator. That is produced by the second call to next, which does not send any value into the generator. As a result, the first (inner) yield evaluates to None. That value is used as the argument for the outer yield, which becomes the return value of the second call to next. If you were to call next a third time, you would get a StopIteration exception.
Compare the use of the send method (instead of next) to change the return value of the first yield expression.
>>> a = mygen()
>>> next(a)
1
>>> a.send(3) # instead of next(a)
3
>>> next(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
A more explicit way of writing the generator would have been
def mygen():
x = yield 1
yield x
a = mygen()
print(a.send(None)) # outputs 1, from yield 1
print(a.send(5)) # makes yield 1 == 5, then gets 5 back from yield x
print(a.send(3)) # Raises StopIteration, as there's nothing after yield x
Prior to Python 2.5, the yield statement provided one-way communication between a caller and a generator; a call to next would execute the generator up to the next yield statement, and the value provided by the yield keyword would serve as the return value of next. The generator
would also suspend at the point of the yield statement, waiting for the next call to next to resume.
In Python 2.5, the yield statement was replaced* with the yield expression, and generators acquired a send method. send works very much like next, except it can take an argument. (For the rest of this, assume that next(a) is equivalent to a.send(None).) A generator starts execution after a call to send(None), at which point it executes up to the first yield, which returns a value as before. Now, however, the expression blocks until the next call to send, at which point the yield expression evaluates to the argument passed to send. A generator can now receive a value when it resumes.
* Not quite replaced; kojiro's answer goes into more detail about the subtle difference between a yield statement and yield expression.

yield has two forms, expressions and statements. They're mostly the same, but I most often see them in the statement form, where the result would not be used.
def f():
yield a thing
But in the expression form, yield has a value:
def f():
y = yield a thing
In your question, you're using both forms:
def f():
yield ( # statement
yield 1 # expression
)
When you iterate over the resulting generator, you get first the result of the inner yield expression
>>> x=f()
>>> next(x)
1
At this point, the inner expression has also produced a value that the outer statement can use
>>> next(x)
>>> # None
and now you've exhausted the generator
>>> next(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
To understand more about statements vs expressions, there are good answers in other stackoverflow questions: What is the difference between an expression and a statement in Python?

>>> def mygen():
... yield (yield 1)
...
>>> a = mygen()
>>>
>>> a.send(None)
1
>>> a.send(5)
5
>>> a.send(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
>>>
>>>
>>> def mygen():
... yield 1
...
>>> def mygen2():
... yield (yield 1)
...
>>> def mygen3():
... yield (yield (yield 1))
...
>>> a = mygen()
>>> a2 = mygen2()
>>> a3 = mygen3()
>>>
>>> a.send(None)
1
>>> a.send(0)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> a2.send(None)
1
>>> a2.send(0)
0
>>> a2.send(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> a3.send(None)
1
>>> a3.send(0)
0
>>> a3.send(1)
1
>>> a3.send(2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
Every other yield simply waits for a value to be passed into, generator don't only give data but they also receive it.
>>> def mygen():
... print('Wait for first input')
... x = yield # this is what we get from send
... print(x, 'is received')
...
>>> a = mygen()
>>> a.send(None)
Wait for first input
>>> a.send('bla')
bla is received
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
yield gives the next value when you continue if you get it, and if it is not used for giving the next value, it is being used for receiving the next
>>> def mygen():
... print('Wait for first input')
... x = yield # this is what we get from send
... yield x*2 # this is what we give
...
>>> a = mygen()
>>> a.send(None)
Wait for first input
>>> a.send(5)
10
>>>

Any generator exhausts elements till it runs out of them.
In the 2-level nested example like below, the first next gives us the element from inner most yield, which is 1, the next yields just returns None, since it has no elements to return, if you call next again, it will return StopIteration
def mygen():
yield (yield 1)
a = mygen()
print(next(a))
print(next(a))
print(next(a))
You can expand this case to include more nested yields, and you will see that after n next are called, StopIteration expection is thrown, below is an example with 5 nested yields
def mygen():
yield ( yield ( yield ( yield (yield 1))))
a = mygen()
print(next(a))
print(next(a))
print(next(a))
print(next(a))
print(next(a))
print(next(a))
Note that this answer is just based on my observation, and might not be technically correct in the nitty-gritties, all updates and suggestions are welcome

Decorating a generator in Python: call some method in between yields

I found some very useful information about decorating generator functions in Python here using yield from. For example:
def mydec(func):
def wrapper(*args, **kwargs):
print(f'Getting values from "{func.__name__}"...')
x = yield from func(*args, **kwargs)
print(f'Got value {x}')
return x
return wrapper
#mydec
def mygen(n):
for i in range(n):
yield i
However, this seems to only allow for adding decorated behaviors at the beginning and end of the generator's lifetime:
>>> foo = mygen(3)
>>> x = next(foo)
Getting values from "mygen"...
>>> x
0
>>> x = next(foo)
>>> x
1
>>> x = next(foo)
>>> x
2
>>> x = next(foo)
Got value None
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>> x
2
However I am interested in using the decorator to implement some behavior every time the generator yields. However, the decorator should not modify the values that are gotten from the generator. That is, for example, I'd like to have the output:
>>> foo = mygen(3)
>>> x = next(foo)
Getting values from "mygen"...
Got value 0
>>> x
0
>>> x = next(foo)
Got value 1
>>> x
1
>>> x = next(foo)
Got value 2
>>> x
2
So, a call to print occurs with each yield, however the yielded values remain unchanged.
Is this possible?

yield from is for coroutine stuff. You're not doing coroutine stuff. Just iterating the generator:
def mydec(func):
def wrapper(*args, **kwargs):
print(f'Getting values from "{func.__name__}"...')
gen = func(*args, **kwargs)
for value in gen:
print(f'got value {value}')
yield value
return wrapper

Stop Iteration error when using next()

I am not able to clarify my self over the use of next() in python(3).
I have a data :
chr pos ms01e_PI ms01e_PG_al ms02g_PI ms02g_PG_al ms03g_PI ms03g_PG_al ms04h_PI ms04h_PG_al
2 15881989 4 C|C 6 A|C 7 C|C 7 C|C
2 15882091 4 A|T 6 A|T 7 T|A 7 A|A
2 15882148 4 T|T 6 T|T 7 T|T 7 T|G
and I read it like:
Works fine
c = csv.DictReader(io.StringIO(data), dialect=csv.excel_tab)
print(c)
print(list(c))
Works fine
c = csv.DictReader(io.StringIO(data), dialect=csv.excel_tab)
print(c)
keys = next(c)
print('keys:', keys)
But, now there is a problem.
c = csv.DictReader(io.StringIO(data), dialect=csv.excel_tab)
print(c)
print(list(c))
keys = next(c)
print('keys:', keys)
Error message:
Traceback (most recent call last):
2 15882601 4 C|C 9 C|C 6 C|C 5 T|C
File "/home/everestial007/Test03.py", line 24, in <module>
keys = next(c)
File "/home/everestial007/anaconda3/lib/python3.5/csv.py", line 110, in __next__
row = next(self.reader)
StopIteration
Why does print(keys) after print(list(c)) gives StopIteration? I read the documentation but I am not clear on this particular example.

The error isn't with the print statement. It's with the keys = next(c) line. Consider a simpler example which reproduces your issue.
a = (i ** 2 for i in range(5))
a # `a` is a generator object
<generator object <genexpr> at 0x150e286c0>
list(a) # calling `list` on `a` will exhaust the generator
[0, 1, 4, 9, 16]
next(a) # calling `next` on an exhausted generator object raises `StopIteration`
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-2076-3f6e2eea332d> in <module>()
----> 1 next(a)
StopIteration:
What happens is that c is an iterator object (very similar to the generator a above), and is meant to be iterated over once until it is exhausted. Calling list on this object will exhaust it, so that the elements can be collected into a list.
Once the object has been exhausted, it will not produce any more elements. At this point, the generator mechanism is designed to raise a StopIteration if you attempt to iterate over it even after it has been exhausted. Constructs such as for loops listen for this error, silently swallowing it, however, next returns the raw exception as soon as it has been raised.

Python for loop and iterator behavior

I wanted to understand a bit more about iterators, so please correct me if I'm wrong.
An iterator is an object which has a pointer to the next object and is read as a buffer or stream (i.e. a linked list). They're particularly efficient cause all they do is tell you what is next by references instead of using indexing.
However I still don't understand why is the following behavior happening:
In [1]: iter = (i for i in range(5))
In [2]: for _ in iter:
....: print _
....:
0
1
2
3
4
In [3]: for _ in iter:
....: print _
....:
In [4]:
After a first loop through the iterator (In [2]) it's as if it was consumed and left empty, so the second loop (In [3]) prints nothing.
However I never assigned a new value to the iter variable.
What is really happening under the hood of the for loop?

Your suspicion is correct: the iterator has been consumed.
In actuality, your iterator is a generator, which is an object which has the ability to be iterated through only once.
type((i for i in range(5))) # says it's type generator
def another_generator():
yield 1 # the yield expression makes it a generator, not a function
type(another_generator()) # also a generator
The reason they are efficient has nothing to do with telling you what is next "by reference." They are efficient because they only generate the next item upon request; all of the items are not generated at once. In fact, you can have an infinite generator:
def my_gen():
while True:
yield 1 # again: yield means it is a generator, not a function
for _ in my_gen(): print(_) # hit ctl+c to stop this infinite loop!
Some other corrections to help improve your understanding:
The generator is not a pointer, and does not behave like a pointer as you might be familiar with in other languages.
One of the differences from other languages: as said above, each result of the generator is generated on the fly. The next result is not produced until it is requested.
The keyword combination for in accepts an iterable object as its second argument.
The iterable object can be a generator, as in your example case, but it can also be any other iterable object, such as a list, or dict, or a str object (string), or a user-defined type that provides the required functionality.
The iter function is applied to the object to get an iterator (by the way: don't use iter as a variable name in Python, as you have done - it is one of the keywords). Actually, to be more precise, the object's __iter__ method is called (which is, for the most part, all the iter function does anyway; __iter__ is one of Python's so-called "magic methods").
If the call to __iter__ is successful, the function next() is applied to the iterable object over and over again, in a loop, and the first variable supplied to for in is assigned to the result of the next() function. (Remember: the iterable object could be a generator, or a container object's iterator, or any other iterable object.) Actually, to be more precise: it calls the iterator object's __next__ method, which is another "magic method".
The for loop ends when next() raises the StopIteration exception (which usually happens when the iterable does not have another object to yield when next() is called).
You can "manually" implement a for loop in python this way (probably not perfect, but close enough):
try:
temp = iterable.__iter__()
except AttributeError():
raise TypeError("'{}' object is not iterable".format(type(iterable).__name__))
else:
while True:
try:
_ = temp.__next__()
except StopIteration:
break
except AttributeError:
raise TypeError("iter() returned non-iterator of type '{}'".format(type(temp).__name__))
# this is the "body" of the for loop
continue
There is pretty much no difference between the above and your example code.
Actually, the more interesting part of a for loop is not the for, but the in. Using in by itself produces a different effect than for in, but it is very useful to understand what in does with its arguments, since for in implements very similar behavior.
When used by itself, the in keyword first calls the object's __contains__ method, which is yet another "magic method" (note that this step is skipped when using for in). Using in by itself on a container, you can do things like this:
1 in [1, 2, 3] # True
'He' in 'Hello' # True
3 in range(10) # True
'eH' in 'Hello'[::-1] # True
If the iterable object is NOT a container (i.e. it doesn't have a __contains__ method), in next tries to call the object's __iter__ method. As was said previously: the __iter__ method returns what is known in Python as an iterator. Basically, an iterator is an object that you can use the built-in generic function next() on1. A generator is just one type of iterator.
If the call to __iter__ is successful, the in keyword applies the function next() to the iterable object over and over again. (Remember: the iterable object could be a generator, or a container object's iterator, or any other iterable object.) Actually, to be more precise: it calls the iterator object's __next__ method).
If the object doesn't have a __iter__ method to return an iterator, in then falls back on the old-style iteration protocol using the object's __getitem__ method2.
If all of the above attempts fail, you'll get a TypeError exception.
If you wish to create your own object type to iterate over (i.e, you can use for in, or just in, on it), it's useful to know about the yield keyword, which is used in generators (as mentioned above).
class MyIterable():
def __iter__(self):
yield 1
m = MyIterable()
for _ in m: print(_) # 1
1 in m # True
The presence of yield turns a function or method into a generator instead of a regular function/method. You don't need the __next__ method if you use a generator (it brings __next__ along with it automatically).
If you wish to create your own container object type (i.e, you can use in on it by itself, but NOT for in), you just need the __contains__ method.
class MyUselessContainer():
def __contains__(self, obj):
return True
m = MyUselessContainer()
1 in m # True
'Foo' in m # True
TypeError in m # True
None in m # True
1 Note that, to be an iterator, an object must implement the iterator protocol. This only means that both the __next__ and __iter__ methods must be correctly implemented (generators come with this functionality "for free", so you don't need to worry about it when using them). Also note that the ___next__ method is actually next (no underscores) in Python 2.
2 See this answer for the different ways to create iterable classes.

For loop basically calls the next method of an object that is applied to (__next__ in Python 3).
You can simulate this simply by doing:
iter = (i for i in range(5))
print(next(iter))
print(next(iter))
print(next(iter))
print(next(iter))
print(next(iter))
# this prints 1 2 3 4
At this point there is no next element in the input object. So doing this:
print(next(iter))
Will result in StopIteration exception thrown. At this point for will stop. And iterator can be any object which will respond to the next() function and throws the exception when there are no more elements. It does not have to be any pointer or reference (there are no such things in python anyway in C/C++ sense), linked list, etc.

There is an iterator protocol in python that defines how the for statement will behave with lists and dicts, and other things that can be looped over.
It's in the python docs here and here.
The way the iterator protocol works typically is in the form of a python generator. We yield a value as long as we have a value until we reach the end and then we raise StopIteration
So let's write our own iterator:
def my_iter():
yield 1
yield 2
yield 3
raise StopIteration()
for i in my_iter():
print i
The result is:
1
2
3
A couple of things to note about that. The my_iter is a function. my_iter() returns an iterator.
If I had written using iterator like this instead:
j = my_iter() #j is the iterator that my_iter() returns
for i in j:
print i #this loop runs until the iterator is exhausted
for i in j:
print i #the iterator is exhausted so we never reach this line
And the result is the same as above. The iter is exhausted by the time we enter the second for loop.
But that's rather simplistic what about something more complicated? Perhaps maybe in a loop why not?
def capital_iter(name):
for x in name:
yield x.upper()
raise StopIteration()
for y in capital_iter('bobert'):
print y
And when it runs, we use the iterator on the string type (which is built into iter). This in turn, allows us run a for loop on it, and yield the results until we are done.
B
O
B
E
R
T
So now this begs the question, so what happens between yields in the iterator?
j = capital_iter("bobert")
print i.next()
print i.next()
print i.next()
print("Hey there!")
print i.next()
print i.next()
print i.next()
print i.next() #Raises StopIteration
The answer is the function is paused at the yield waiting for the next call to next().
B
O
B
Hey There!
E
R
T
Traceback (most recent call last):
File "", line 13, in
StopIteration

Some additional details about the behaviour of iter() with __getitem__ classes that lack their own __iter__ method.
Before __iter__ there was __getitem__. If the __getitem__ works with ints from 0 - len(obj)-1, then iter() supports these objects. It will construct a new iterator that repeatedly calls __getitem__ with 0, 1, 2, ... until it gets an IndexError, which it converts to a StopIteration.
See this answer for more details of the different ways to create an iterator.

Excerpt from the Python Practice book:
5. Iterators & Generators
5.1. Iterators
We use for statement for looping over a list.
>>> for i in [1, 2, 3, 4]:
... print i,
...
1
2
3
4
If we use it with a string, it loops over its characters.
>>> for c in "python":
... print c
...
p
y
t
h
o
n
If we use it with a dictionary, it loops over its keys.
>>> for k in {"x": 1, "y": 2}:
... print k
...
y
x
If we use it with a file, it loops over lines of the file.
>>> for line in open("a.txt"):
... print line,
...
first line
second line
So there are many types of objects which can be used with a for loop. These are called iterable objects.
There are many functions which consume these iterables.
>>> ",".join(["a", "b", "c"])
'a,b,c'
>>> ",".join({"x": 1, "y": 2})
'y,x'
>>> list("python")
['p', 'y', 't', 'h', 'o', 'n']
>>> list({"x": 1, "y": 2})
['y', 'x']
5.1.1. The Iteration Protocol
The built-in function iter takes an iterable object and returns an iterator.
>>> x = iter([1, 2, 3])
>>> x
<listiterator object at 0x1004ca850>
>>> x.next()
1
>>> x.next()
2
>>> x.next()
3
>>> x.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Each time we call the next method on the iterator gives us the next element. If there are no more elements, it raises a StopIteration.
Iterators are implemented as classes. Here is an iterator that works like built-in xrange function.
class yrange:
def __init__(self, n):
self.i = 0
self.n = n
def __iter__(self):
return self
def next(self):
if self.i < self.n:
i = self.i
self.i += 1
return i
else:
raise StopIteration()
The iter method is what makes an object iterable. Behind the scenes, the iter function calls iter method on the given object.
The return value of iter is an iterator. It should have a next method and raise StopIteration when there are no more elements.
Lets try it out:
>>> y = yrange(3)
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 14, in next
StopIteration
Many built-in functions accept iterators as arguments.
>>> list(yrange(5))
[0, 1, 2, 3, 4]
>>> sum(yrange(5))
10
In the above case, both the iterable and iterator are the same object. Notice that the iter method returned self. It need not be the case always.
class zrange:
def __init__(self, n):
self.n = n
def __iter__(self):
return zrange_iter(self.n)
class zrange_iter:
def __init__(self, n):
self.i = 0
self.n = n
def __iter__(self):
# Iterators are iterables too.
# Adding this functions to make them so.
return self
def next(self):
if self.i < self.n:
i = self.i
self.i += 1
return i
else:
raise StopIteration()
If both iteratable and iterator are the same object, it is consumed in a single iteration.
>>> y = yrange(5)
>>> list(y)
[0, 1, 2, 3, 4]
>>> list(y)
[]
>>> z = zrange(5)
>>> list(z)
[0, 1, 2, 3, 4]
>>> list(z)
[0, 1, 2, 3, 4]
5.2. Generators
Generators simplifies creation of iterators. A generator is a function that produces a sequence of results instead of a single value.
def yrange(n):
i = 0
while i < n:
yield i
i += 1
Each time the yield statement is executed the function generates a new value.
>>> y = yrange(3)
>>> y
<generator object yrange at 0x401f30>
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
So a generator is also an iterator. You don’t have to worry about the iterator protocol.
The word “generator” is confusingly used to mean both the function that generates and what it generates. In this chapter, I’ll use the word “generator” to mean the generated object and “generator function” to mean the function that generates it.
Can you think about how it is working internally?
When a generator function is called, it returns a generator object without even beginning execution of the function. When next method is called for the first time, the function starts executing until it reaches yield statement. The yielded value is returned by the next call.
The following example demonstrates the interplay between yield and call to next method on generator object.
>>> def foo():
... print "begin"
... for i in range(3):
... print "before yield", i
... yield i
... print "after yield", i
... print "end"
...
>>> f = foo()
>>> f.next()
begin
before yield 0
0
>>> f.next()
after yield 0
before yield 1
1
>>> f.next()
after yield 1
before yield 2
2
>>> f.next()
after yield 2
end
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Lets see an example:
def integers():
"""Infinite sequence of integers."""
i = 1
while True:
yield i
i = i + 1
def squares():
for i in integers():
yield i * i
def take(n, seq):
"""Returns first n values from the given sequence."""
seq = iter(seq)
result = []
try:
for i in range(n):
result.append(seq.next())
except StopIteration:
pass
return result
print take(5, squares()) # prints [1, 4, 9, 16, 25]

Concept 1
All generators are iterators but all iterators are not generator
Concept 2
An iterator is an object with a next (Python 2) or next (Python 3)
method.
Concept 3
Quoting from wiki
Generators Generators
functions allow you to declare a function that behaves like an
iterator, i.e. it can be used in a for loop.
In your case
>>> it = (i for i in range(5))
>>> type(it)
<type 'generator'>
>>> callable(getattr(it, 'iter', None))
False
>>> callable(getattr(it, 'next', None))
True

Difference between generator expression and generator function

Is there any difference — performance or otherwise — between generator expressions and generator functions?
In [1]: def f():
...: yield from range(4)
...:
In [2]: def g():
...: return (i for i in range(4))
...:
In [3]: f()
Out[3]: <generator object f at 0x109902550>
In [4]: list(f())
Out[4]: [0, 1, 2, 3]
In [5]: list(g())
Out[5]: [0, 1, 2, 3]
In [6]: g()
Out[6]: <generator object <genexpr> at 0x1099056e0>
I'm asking because I want to decide how I should decide between using the two. Sometimes generator functions are clearer and then the choice is clear. I am asking about those times when code clarity does not make one choice obvious.

The functions you provided have completely different semantics in the general case.
The first one, with yield from, passes the control to the iterable. This means that calls to send() and throw() during the iteration will be handled by the iterable and not by the function you are defining.
The second function only iterates over the elements of the iterable, and it will handle all the calls to send() and throw(). To see the difference check this code:
In [8]: def action():
...: try:
...: for el in range(4):
...: yield el
...: except ValueError:
...: yield -1
...:
In [9]: def f():
...: yield from action()
...:
In [10]: def g():
...: return (el for el in action())
...:
In [11]: x = f()
In [12]: next(x)
Out[12]: 0
In [13]: x.throw(ValueError())
Out[13]: -1
In [14]: next(x)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-14-5e4e57af3a97> in <module>()
----> 1 next(x)
StopIteration:
In [15]: x = g()
In [16]: next(x)
Out[16]: 0
In [17]: x.throw(ValueError())
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-1006c792356f> in <module>()
----> 1 x.throw(ValueError())
<ipython-input-10-f156e9011f2f> in <genexpr>(.0)
1 def g():
----> 2 return (el for el in action())
3
ValueError:
In fact, due to this reason, yield from probably has a higher overhead than the genexp, even though it is probably irrelevant.
Use yield from only when the above behaviour is what you want or if you are iterating over a simple iterable that is not a generator (so that yield from is equivalent to a loop + simple yields).
Stylistically speaking I'd prefer:
def h():
for el in range(4):
yield el
Instead of returning a genexp or using yield from when dealing with generators.
In fact the code used by the generator to perform the iteration is almost identical to the above function:
In [22]: dis.dis((i for i in range(4)).gi_code)
1 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 11 (to 17)
6 STORE_FAST 1 (i)
9 LOAD_FAST 1 (i)
12 YIELD_VALUE
13 POP_TOP
14 JUMP_ABSOLUTE 3
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE
As you can see it does a FOR_ITER + YIELD_VALUE. note that the argument (.0), is iter(range(4)). The bytecode of the function also contains the calls to LOAD_GLOBAL and GET_ITER that are required to lookup range and obtain its iterable. However this actions must be performed by the genexp too, just not inside its code but before calling it.

In addition to #Bakuriu's good point — that generator functions implement send(), throw(), and close() — there is another difference I've run into. Sometimes, you have some setup code that happens before the yield statement is reached. If that setup code can raise exceptions, then the generator-returning version might be preferable to the generator function because it will raise the exception sooner. E.g.,
def f(x):
if x < 0:
raise ValueError
for i in range(4):
yield i * i
def g(x):
if x < 0:
raise ValueError
return (i * i for i in range(x))
print(list(f(4)))
print(list(g(4)))
f(-1) # no exception until the iterator is consumed!
g(-1)
If one wants both behaviors, I think the following is best:
def f(count):
x = 0
for i in range(count):
x = yield i + (x or 0)
def protected_f(count):
if count < 0:
raise ValueError
return f(count)
it = protected_f(10)
try:
print(next(it))
x = 0
while True:
x = it.send(x)
print(x)
except StopIteration:
pass
it = protected_f(-1)

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

yield from vs yield in for-loop - python

Related

What does a yield inside a yield do?

Decorating a generator in Python: call some method in between yields

Stop Iteration error when using next()

Python for loop and iterator behavior

Difference between generator expression and generator function

Categories

Resources