Calling gen.send() with a new generator in Python 3.3+?

From PEP342:
Because generator-iterators begin execution at the top of the generator's function body, there is no yield expression to receive a value when the generator has just been created. Therefore, calling send() with a non-None argument is prohibited when the generator iterator has just started, ...
For example,
>>> def a():
...     for i in range(5):
...         print((yield i))
...
>>> g = a()
>>> g.send("Illegal")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't send non-None value to a just-started generator
Why is this illegal? The way I understood the use of yield here, it pauses execution of the function, and returns to that spot the next time that next() (or send()) is called. But it seems like it should be legal to print the first result of (yield i)?
Asked a different way, in what state is the generator 'g' directly after g = a(). I assumed that it had run a() up until the first yield, and since there was a yield it returned a generator, instead of a standard synchronous object return.
So why exactly is calling send with non-None argument on a new generator illegal?
Note: I've read the answer to this question, but it doesn't really get to the heart of why it's illegal to call send (with non-None) on a new generator.

Asked a different way, in what state is the generator 'g' directly after g = a(). I assumed that it had run a() up until the first yield, and since there was a yield it returned a generator, instead of a standard synchronous object return.
No. Right after g = a() it is right at the beginning of the function. It does not run up to the first yield until after you advance the generator once (by calling next(g)).
This is what it says in the quote you included in your question: "Because generator-iterators begin execution at the top of the generator's function body..." It also says it in PEP 255, which introduced generators:
When a generator function is called, the actual arguments are bound to function-local formal argument names in the usual way, but no code in the body of the function is executed.
Note that it does not matter whether the yield statement is actually executed. The mere occurrence of yield inside the function body makes the function a generator, as documented:
Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.
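To make this concrete, here is a minimal REPL sketch (reusing the a() generator from the question) showing that the body only starts running on the first next(), after which send() is legal:
>>> def a():
...     for i in range(5):
...         print((yield i))
...
>>> g = a()                # no body code has run yet
>>> next(g)                # runs up to the first yield, which yields 0
0
>>> g.send("Legal now")    # resumes the paused yield; print shows the sent value
Legal now
1
Note that g.send(None) would also have been accepted as the first call; it is equivalent to next(g).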

Related

Where is the source code of python generator send?

Question
Please help me pin-point the Python source code that implements the generator send() part. I suppose it is somewhere on GitHub. This is Python version 3.8.7rc1, but I am not familiar with how the repository is organized.
Background
I am having difficulty with what PEP 342 and the documentation say about generator send(value). Hence I am trying to find out how it is implemented, in order to understand it.
there is no yield expression to receive a value when the generator has just been created
The value argument becomes the result of the current yield expression. The send() method returns the next value yielded by the generator.
Specification: Sending Values into Generators
Because generator-iterators begin execution at the top of the generator's function body, there is no yield expression to receive a value when the generator has just been created. Therefore, calling send() with a non-None argument is prohibited when the generator iterator has just started, and a TypeError is raised if this occurs (presumably due to a logic error of some kind). Thus, before you can communicate with a coroutine you must first call next() or send(None) to advance its execution to the first yield expression.
generator.send(value)
Resumes the execution and “sends” a value into the generator function. The value argument becomes the result of the current yield expression. The send() method returns the next value yielded by the generator, or raises StopIteration if the generator exits without yielding another value. When send() is called to start the generator, it must be called with None as the argument, because there is no yield expression that could receive the value.
I suppose yield would be like a UNIX system call: control moves into a routine, the stack frame and execution pointer are saved, and the generator coroutine is suspended. I think when send(value) is called, some tricks happen there, and those are the cryptic parts in the documents.
Although sent_value = (yield value) is a one-line statement, both the blocking and the resuming happen on that same line, I think. Execution does not resume after the yield but within it, hence I would like to know how the block/resume is implemented. Also, I believe next(generator) is the same as generator.send(None) and would like to verify that.
Look for the Generator class in Lib/_collections_abc.py; also look at Objects/genobject.c, which is the complete implementation of generators in C inside CPython.
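As for verifying the last point: a toy generator (my own example, not from the question) shows that g.send(None) is accepted as the very first call and behaves exactly like next(g) on later calls too:
>>> def show():
...     while True:
...         value = yield
...         print("received:", value)
...
>>> g = show()
>>> g.send(None)   # legal as the first call; same effect as next(g)
>>> g.send(1)
received: 1
>>> next(g)        # resumes the paused yield with None, same as send(None)
received: None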

What happens when you invoke a function that contains yield?

I read here the following example:
>>> def double_inputs():
...     while True:        # Line 1
...         x = yield      # Line 2
...         yield x * 2    # Line 3
...
>>> gen = double_inputs()
>>> next(gen)       # Run up to the first yield
>>> gen.send(10)    # goes into 'x' variable; the generator then yields x * 2
20
If I understand the above correctly, it seems to imply that Python actually waits until next(gen) to "run up to" to Line 2 in the body of the function. Put another way, the interpreter would not start executing the body of the function until we call next.
Is that actually correct?
To my knowledge, Python does not do AOT compilation, and it doesn't "look ahead" much except for parsing the code and making sure it's valid Python. Is this correct?
If the above are true, how would Python know when I invoke double_inputs() that it needs to wait until I call next(gen) before it even enters the loop while True?
Correct. Calling double_inputs never executes any of the code in the body; it simply returns a generator object. The presence of the yield expression in the body, discovered when the def statement is compiled, changes the semantics of the def statement so that calling the resulting function creates a generator object rather than executing the body.
A function that contains yield is a generator function.
When you call gen = double_inputs(), you get a generator instance as the result. You need to consume this generator by calling next.
So for your first question: yes, that is correct. The body only starts running on the first call to next, which executes Line 1 and pauses at the yield on Line 2; the subsequent send(10) resumes there and runs on to Line 3.
For your second question, I don't exactly get your point. When you define the function, Python knows what you are defining; it doesn't need to look ahead when running it.
For your third question, the key is the yield keyword.
A generator function is de jure a function, but de facto it is an iterator, i.e. a class (with __next__(), __iter__(), and some other methods implemented).
In other words, it is a class disguised as a function.
It means that “calling” this function is in reality making an instance of this class, which explains why the “called function” initially does nothing. This is the answer to your 3rd question.
The answer to your 1st question is, surprisingly, no.
Instances always wait for their methods to be called, and the __next__() method (indirectly launched by calling the next() built-in function) is not the only method of generators. Another method is .send(), and you may use gen.send(None) instead of your next(gen).
The answer to your 2nd question is no. The Python interpreter by no means "looks ahead", and there are no exceptions, including your
... except for parsing the code and making sure it's valid Python.
Or the answer to this question is yes, if you mean “parsing only up to the next command”. ;-)
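A minimal sketch (a throwaway example, not from the question) makes the "nothing runs at call time" point visible by putting a print at the very top of the body:
>>> def g():
...     print("body started")
...     yield 1
...
>>> it = g()       # nothing is printed: the body has not run
>>> next(it)       # now the body runs up to the first yield
body started
1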

Generators and files

When I write:
lines = (line.strip() for line in open('a_file'))
Is the file opened immediately or is the file system only accessed when I start to consume the generator expression?
open() is called immediately upon the construction of the generator, irrespective of when or whether you consume from it.
The relevant spec is PEP-289:
Early Binding versus Late Binding
After much discussion, it was decided that the first (outermost) for-expression should be evaluated immediately and that the remaining expressions be evaluated when the generator is executed.
Asked to summarize the reasoning for binding the first expression, Guido offered [5]:
Consider sum(x for x in foo()). Now suppose there's a bug in foo() that raises an exception, and a bug in sum() that raises an exception before it starts iterating over its argument. Which exception would you expect to see? I'd be surprised if the one in sum() was raised rather than the one in foo(), since the call to foo() is part of the argument to sum(), and I expect arguments to be processed before the function is called.
OTOH, in sum(bar(x) for x in foo()), where sum() and foo() are bug-free, but bar() raises an exception, we have no choice but to delay the call to bar() until sum() starts iterating -- that's part of the contract of generators. (They do nothing until their next() method is first called.)
See the rest of that section for further discussion.
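A short sketch of Guido's point, using a hypothetical broken() function: because the outermost iterable is bound immediately, its exception surfaces where the generator expression is built, not where iteration begins:
def broken():
    raise RuntimeError("bug in the outermost iterable")

# The call to broken() happens right here, while the generator
# expression is being built, so the RuntimeError is raised on
# this line, before anything iterates the result:
gen = (x for x in broken())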
It is opened immediately. You can verify this if you use a filename that's not present: an exception will be thrown, which indicates that Python actually tried to open the file immediately.
You can also use a function that gives more feedback to see that the command is executed even before the generator is iterated over:
def somefunction(filename):
    print(filename)
    return open(filename)

lines = (line.strip() for line in somefunction('a_file'))  # prints
However, if you use a generator function instead of a generator expression, the file is only opened when you iterate over it:
def somefunction(filename):
    print(filename)
    for line in open(filename):
        yield line.strip()

lines = somefunction('a_file')  # no print!
list(lines)  # prints, because list iterates over the generator
It is opened immediately.
Example:
def func():
    print('x')
    return [1, 2, 3]

g = (x for x in func())
Output:
x
The function needs to return an iterable object.
open() returns an open file object that is iterable.
Therefore, the file will be opened when you define the generator expression.

Unexecuted yield statement blocks function from running?

In the below simplified code, I would like to reuse a loop to do a preparation first and yield the result.
However, the preparation (bar()) function is never executed.
Is the yield statement changing the flow of the function?
def bar(*args, **kwargs):
    print("ENTER bar")

def foo(prepare=False):
    print("ENTER foo")
    for x in range(1, 10):
        if prepare:
            bar(x)
        else:
            yield x

foo(prepare=True)
r = foo(prepare=False)
for x in r:
    pass
Because the foo definition contains a yield, it won't run like a normal function even if you call it like one (e.g. foo(prepare=True)).
Calling foo() with whatever arguments will return a generator object, suitable to be iterated through. The body of the definition won't run until you try to iterate that generator object.
The new coroutine syntax (async def) puts a keyword at the start of the definition, so that the change in nature isn't hidden inside the body of the function.
The problem is that having a yield statement changes the function into one that returns a generator and alters its behavior.
Basically this means that on each call of the generator's __next__ method, the function executes up to the next yield or to the termination of the function (in which case it raises a StopIteration exception).
Consequently, what you should have done is ensure that you iterate over it even if the yield statement won't be reached, like:
r = foo(prepare=True)
for x in r:
    pass
In this case the loop body never executes, since no yield statement is reached; iterating simply runs the prepare branch to completion and the generator exits.
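If the intent is for the preparation pass to run eagerly, one option (a sketch of one possible fix, not the only one) is to keep the generator pure and move the preparation into a plain function:
def bar(x):
    print("ENTER bar")

def prepare():
    # plain function: runs immediately when called
    for x in range(1, 10):
        bar(x)

def foo():
    # generator: body runs only when iterated
    for x in range(1, 10):
        yield x

prepare()        # prints "ENTER bar" nine times right away
for x in foo():
    pass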
In my opinion, the actual explanation here is that:
Python evaluates the generator's body, including the if condition, lazily!
And I'll explain:
When you call
foo(prepare=True)
just like that, nothing happens, although you might have expected bar(x) to be executed nine times. What really happens is that nothing demands values from the foo(prepare=True) call, so the if is never evaluated; it would be if you consumed the return value of foo.
In the second call to foo, iterating over the return value r forces Python to evaluate the body, and it does, as I'll show:
Case 1
r = foo(prepare=True)
for x in r:
    pass
The output here is 'ENTER bar' 9 times. This means that bar is executed 9 times.
Case 2
r = foo(prepare=False)
for x in r:
    pass
In this case no 'ENTER bar' is printed, as expected.
To sum everything up, I'll say that:
There are some cases where Python performs lazy evaluation; generator bodies are one of them.
Not everything is evaluated lazily in Python, for example:
# builds a big list and immediately discards it
sum([x*x for x in range(2000000)])
vs.
# only keeps one value at a time in memory
sum(x*x for x in range(2000000))
For more about lazy and eager evaluation in Python, continue reading here.

yield extended syntax and send method

I read about the extended yield syntax, so that if I have:
def numgen(N):
    for i in range(N):
        n = yield i
        if n:
            yield n
I can refactor it:
def numgen(N):
    n = yield from range(N)
    if n:
        yield n
but I have noticed that if I do, after I have coded the second generator:
g = numgen(10)
next(g)
g.send(54)
I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in numgen
AttributeError: 'range_iterator' object has no attribute 'send'
So, how is that? How can I send a value to my numgen generator object?
range() is not a generator; it doesn't have a generator.send() method.
This is clearly documented in the yield expression documentation:
When yield from <expr> is used, it treats the supplied expression as a subiterator. All values produced by that subiterator are passed directly to the caller of the current generator’s methods. Any values passed in with send() and any exceptions passed in with throw() are passed to the underlying iterator if it has the appropriate methods. If this is not the case, then send() will raise AttributeError or TypeError, while throw() will just raise the passed in exception immediately.
Emphasis mine.
You are trying to send a value to the range() iterator, but it has no .send() method.
range() is just a sequence, not a generator object; you can create multiple iterators for it, you can test if a number is a member of the sequence, ask it for its length, etc.
Note that your 'refactoring' is not the same thing at all; in your original n is assigned anything you send in through generator.send(); in your second version yield from returns the value attribute of the StopIteration exception raised when the sub-iterator ends. If the sub-iterator is a generator itself, you can set that value either by manually raising StopIteration(value) or by using a return statement. yield from cannot return the value sent in with generator.send() because such values would be passed on to the sub-generator instead.
Again, from the documentation:
When the underlying iterator is complete, the value attribute of the raised StopIteration instance becomes the value of the yield expression. It can be either set explicitly when raising StopIteration, or automatically when the sub-iterator is a generator (by returning a value from the sub-generator).
So your first version is set up to receive N messages, yielding both the i values from the loop and any sent value that is truthy, while the other passes any sent messages on to the delegated-to generator, and would then yield the StopIteration value just once (if it is truthy) after the delegated-to iterator is done.
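If the goal is for numgen to keep receiving sent values while still delegating, the subiterator itself has to be a generator with a send-aware loop. A sketch (my own variant of the original code, not a drop-in equivalent):
def subgen(N):
    for i in range(N):
        n = yield i          # receives values sent into the outer generator
        if n:
            yield n

def numgen(N):
    # send() on numgen is forwarded to subgen, which has a send() method
    yield from subgen(N)

g = numgen(10)
next(g)             # advance to the first yield; yields 0
print(g.send(54))   # forwarded to subgen; prints 54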
