Question
Please help me pinpoint the Python source code that implements the generator's send. I suppose it is somewhere on GitHub, but I am not familiar with how the repository is organized. This is Python version 3.8.7rc1.
Background
I am having difficulty with what PEP 342 and the documentation say about the generator's send(value), so I am trying to find out how it is implemented in order to understand it. The confusing parts are:
there is no yield expression to receive a value when the generator has just been created
The value argument becomes the result of the **current yield expression**. The send() method returns the **next value yielded by the generator**.
Specification: Sending Values into Generators
Because generator-iterators begin execution at the top of the
generator's function body, there is no yield expression to receive a
value when the generator has just been created. Therefore, calling
send() with a non-None argument is prohibited when the generator
iterator has just started, and a TypeError is raised if this occurs
(presumably due to a logic error of some kind). Thus, before you can
communicate with a coroutine you must first call next() or send(None)
to advance its execution to the first yield expression.
generator.send(value)
Resumes the execution and “sends” a value into the generator function.
The value argument becomes the result of the current yield expression.
The send() method returns the next value yielded by the generator, or
raises StopIteration if the generator exits without yielding another
value. When send() is called to start the generator, it must be called
with None as the argument, because there is no yield expression that
could receive the value.
I suppose yield works like a UNIX system call: control moves into a routine in which the stack frame and execution pointer are saved, and the generator coroutine is suspended. I think that when send(value) is called some tricks happen there, and those are what the cryptic parts of the documents are about.
Although sent_value = (yield value) is a one-line statement, I think both the blocking and the resuming happen on that same line. Execution does not resume after the yield but within it, so I would like to know how block and resume are implemented. I also believe next(generator) is the same as generator.send(None) and would like to verify that.
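The two claims above (resumption happens inside the yield expression, and next(generator) behaves like generator.send(None)) can be checked with a small experiment. This is only a sketch; the generator echo and all variable names are made up for illustration.

```python
def echo():
    """Yield 'ready' first, then report whatever value was sent in."""
    received = yield "ready"      # suspends here; resumes with the sent value
    while True:
        received = yield f"got {received!r}"

gen_a = echo()
gen_b = echo()

# Priming: both calls run the body up to the first yield and return its value.
first_a = next(gen_a)
first_b = gen_b.send(None)

# The argument of send() becomes the value of the yield expression that was
# suspended; send() returns the value of the next yield that is reached.
reply = gen_a.send(10)

# A plain next() resumes the generator with None as the yield's value.
reply_none = next(gen_a)
```

Here first_a and first_b are both 'ready', reply is 'got 10', and reply_none is 'got None'.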
Look here for class Generator; also look at this file, which is a complete implementation of generators in C inside Python.
Related
I read here the following example:
>>> def double_inputs():
...     while True:      # Line 1
...         x = yield    # Line 2
...         yield x * 2  # Line 3
...
>>> gen = double_inputs()
>>> next(gen)       # Run up to the first yield
>>> gen.send(10)    # goes into 'x' variable
If I understand the above correctly, it seems to imply that Python actually waits until next(gen) to "run up to" Line 2 in the body of the function. Put another way, the interpreter would not start executing the body of the function until we call next.
Is that actually correct?
To my knowledge, Python does not do AOT compilation, and it doesn't "look ahead" much except for parsing the code and making sure it's valid Python. Is this correct?
If the above are true, how would Python know when I invoke double_inputs() that it needs to wait until I call next(gen) before it even enters the loop while True?
Correct. Calling double_inputs never executes any of the code; it simply returns a generator object. The presence of the yield expression in the body, discovered when the def statement is parsed, changes the semantics of the def statement: it creates a generator function, and calling that function returns a generator object instead of running the body.
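A quick way to observe this, as a sketch: the log list and the variable names are additions for illustration, while inspect.getgeneratorstate is the standard-library helper that reports a generator's state.

```python
import inspect

log = []  # records which parts of the body have actually run

def double_inputs():
    log.append("body started")
    while True:
        x = yield
        yield x * 2

gen = double_inputs()
state_after_call = inspect.getgeneratorstate(gen)   # 'GEN_CREATED'
log_after_call = list(log)                          # still empty: no body code ran

next(gen)                                           # runs up to the first yield
state_after_next = inspect.getgeneratorstate(gen)   # 'GEN_SUSPENDED'
doubled = gen.send(10)                              # x = 10, then yields 20
```

Only after next(gen) does "body started" appear in log.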
A function that contains yield is a generator function.
When you call gen = double_inputs(), you get a generator instance as the result. You need to consume this generator by calling next.
So for your first question, it is true. The first call to next runs Line 1 and Line 2 up to the yield; Line 3 runs only when the generator is resumed again.
For your second question, I don't exactly get your point. When you define the function, Python knows what you are defining; it doesn't need to look ahead when running it.
For your third question, the key is the yield keyword.
A generator function is de jure a function, but de facto it is an iterator, i.e. a class (with implemented __next__(), __iter__(), and some other methods).
In other words, it is a class disguised as a function.
This means that “calling” this function is in reality making an instance of this class, which explains why the “called function” initially does nothing. This is the answer to your 3rd question.
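Put a little more precisely, the def still produces a function, and calling it returns an instance of the built-in generator type. A sketch that inspects both (the generator numbers is made up for illustration):

```python
import inspect
import types

def numbers():
    yield 1
    yield 2

# The def produced a function that is flagged as a generator function.
is_gen_func = inspect.isgeneratorfunction(numbers)

# "Calling" it builds a generator object; the body has not run yet.
gen = numbers()
is_gen_obj = isinstance(gen, types.GeneratorType)

# The object carries the iterator protocol plus the generator-only methods.
methods = [name for name in ("__iter__", "__next__", "send", "throw", "close")
           if hasattr(gen, name)]
same_iter = iter(gen) is gen    # a generator is its own iterator
```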
The answer to your 1st question is surprisingly no.
Instances always wait for their methods to be called, and the __next__() method (indirectly invoked by calling the next() built-in function) is not the only method of generators. Another method is .send(), and you may use gen.send(None) instead of your next(gen).
The answer to your 2nd question is no. The Python interpreter by no means "looks ahead", and there are no exceptions, including your
... except for parsing the code and making sure it's valid Python.
Or the answer to this question is yes, if you mean “parsing only up to the next command”. ;-)
I am trying to write a function that yields a value. Before anything else is done, I want to make sure the function's input is valid. The code below only creates a generator when it is called; it raises the exception only after next. Is there an elegant way to structure the function so that it throws the exception before next?
def foo(value):
    if validate(value):
        raise ValueError
    yield 1
You can't check the value before calling next; that is the whole point of using generators. From the docs:
Each yield temporarily suspends processing, remembering the location
execution state (including local variables and pending
try-statements). When the generator iterator resumes, it picks up
where it left off (in contrast to functions which start fresh on every
invocation).
What you can do is check the value before using the generator.
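One common way to do that is to make foo a plain function that validates eagerly and then returns an inner generator. The sketch below assumes a hypothetical validate that returns True for valid input (the question does not show its body or its polarity), and the helper name _generate is made up.

```python
def validate(value):
    """Hypothetical check, assumed here to return True for valid input."""
    return isinstance(value, int) and value >= 0

def foo(value):
    # foo contains no yield itself, so this check runs at call time,
    # before anyone calls next().
    if not validate(value):
        raise ValueError(f"invalid value: {value!r}")

    def _generate():
        yield 1

    return _generate()
```

With this shape, foo(-1) raises ValueError immediately, while foo(3) returns a generator that can be consumed as usual.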
I am facing a strange behavior with nested generators.
def empty_generator():
    for i in []:
        yield

def gen():
    next(empty_generator())
    print("This is not printed, why?")
    yield

list(gen())  # No Error
next(empty_generator())  # Error
I would expect the gen() function to raise an error, as I am calling next() on an empty generator. But this is not the case: the function exits out of nowhere, without raising or printing anything.
That seems to violate the principle of least astonishment, doesn't it?
Technically, you don't have an error; you have an uncaught StopIteration exception, which is used for flow control. The call to list, which takes an arbitrary iterable as its argument, catches the exception raised by gen for you.
for loops work similarly; every iterator raises StopIteration at the end, but the for loop catches it and ends in response.
Put another way, the consumer of an iterable is responsible for catching StopIteration. When gen calls next, it lets the exception bubble up. The call to list catches it, but you don't when you call next explicitly.
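The consumer's side of that contract can be written out by hand. This is only a rough sketch of what a for loop or list() does; the function name consume is made up for illustration.

```python
def consume(iterable):
    """Roughly what a for loop or list() does with an iterable."""
    items = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break           # treated as "done", not as an error
        items.append(item)
    return items

collected = consume(range(3))   # [0, 1, 2]
empty = consume(iter([]))       # []
```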
Note that PEP 479 changes this behavior. Python 3.5 provides the new semantics via __future__, Python 3.6 provides a deprecation warning, and Python 3.7 (due out Summer 2018) completes the transition. I refer the reader to the PEP itself for further details.
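Under the completed PEP 479 semantics (Python 3.7 or later is assumed here), a StopIteration that escapes from inside a generator body is converted to RuntimeError, so the snippet from the question no longer ends silently. A minimal sketch, with the outcome variable added for illustration:

```python
def empty_generator():
    for _ in []:
        yield

def gen():
    next(empty_generator())   # raises StopIteration inside the generator body
    yield

try:
    list(gen())
    outcome = "no error"
except RuntimeError as exc:
    # PEP 479: the interpreter replaced the leaked StopIteration.
    outcome = f"RuntimeError: {exc}"
```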
Once an iterator reaches its end, it raises StopIteration which... stops the iteration, so list(gen()) constructs an empty list.
From PEP 342:
Because generator-iterators begin execution at the top of the generator's function body, there is no yield expression to receive a value when the generator has just been created. Therefore, calling send() with a non-None argument is prohibited when the generator iterator has just started, ...
For example,
>>> def a():
...     for i in range(5):
...         print((yield i))
...
>>> g = a()
>>> g.send("Illegal")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't send non-None value to a just-started generator
Why is this illegal? The way I understood the use of yield here, it pauses execution of the function, and returns to that spot the next time that next() (or send()) is called. But it seems like it should be legal to print the first result of (yield i)?
Asked a different way, in what state is the generator 'g' directly after g = a(). I assumed that it had run a() up until the first yield, and since there was a yield it returned a generator, instead of a standard synchronous object return.
So why exactly is calling send with non-None argument on a new generator illegal?
Note: I've read the answer to this question, but it doesn't really get to the heart of why it's illegal to call send (with non-None) on a new generator.
Asked a different way, in what state is the generator 'g' directly after g = a(). I assumed that it had run a() up until the first yield, and since there was a yield it returned a generator, instead of a standard synchronous object return.
No. Right after g = a() it is right at the beginning of the function. It does not run up to the first yield until after you advance the generator once (by calling next(g)).
This is what it says in the quote you included in your question: "Because generator-iterators begin execution at the top of the generator's function body..." It also says it in PEP 255, which introduced generators:
When a generator function is called, the actual arguments are bound to function-local formal argument names in the usual way, but no code in the body of the function is executed.
Note that it does not matter whether the yield statement is actually executed. The mere occurrence of yield inside the function body makes the function a generator, as documented:
Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.
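The consequence for send() can be seen by continuing the example from the question. This is a sketch; the variable names error, first and second are added for illustration.

```python
def a():
    for i in range(5):
        print((yield i))

g = a()
try:
    g.send("Illegal")          # no yield is suspended yet, so nothing can receive this
except TypeError as exc:
    error = str(exc)

first = g.send(None)           # same as next(g): runs to the first yield, returns 0
second = g.send("now legal")   # prints 'now legal', then runs to the next yield, returns 1
```

The failed send() does not consume the generator: it is still at the top of the body, so priming it afterwards works normally.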
I'm new to the concept of non-blocking IO, and there is something I'm having trouble understanding about coroutines. Consider this code:
class UserPostHandler(RequestHandler):
    @gen.coroutine
    def get(self):
        var = 'some variable'
        data = json.loads(self.request.body)
        yield motor_db.users.insert({self.request.remote_ip: data})  # async, non-blocking db insert call
        # success
        self.set_status(201)
        print(var)
When the get function is called, it creates the string var. What happens to this variable when the function waits for the motor.insert to complete? To my understanding, "non blocking" implies that no thread is waiting for the IO call to complete, and no memory is being used while waiting. So where is the value of var stored? How is it accessible when execution resumes?
Any help would be appreciated!
The memory for var is still being used while insert executes, but the get function itself is "frozen", which allows other functions to execute. Tornado's coroutines are implemented using Python generators, which allow function execution to be temporarily suspended when a yield occurs, and then be restarted again (with the function's state preserved) after the yield point. Here's how the behavior is described in the PEP that introduced generators:
If a yield statement is encountered, the state of the function is
frozen, and the value [yielded] is returned to .next()'s caller. By
"frozen" we mean that all local state is retained, including the
current bindings of local variables, the instruction pointer, and the
internal evaluation stack: enough information is saved so that the
next time .next() is invoked, the function can proceed exactly as if
the yield statement were just another external call.
The @gen.coroutine decorator has magic in it that ties into Tornado's event loop, so that the Future returned by the insert call is registered with the event loop, allowing the get generator to be restarted when the insert call completes.
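Where var lives while the function is frozen can be seen with a plain generator and a hand-rolled driver. This is a toy stand-in, not Tornado's implementation: the names handler, request and final, and the strings passed around, are all made up for illustration.

```python
def handler():
    var = "some variable"
    result = yield "insert-request"   # stands in for the async insert call
    return f"{var} / {result}"

coro = handler()
request = coro.send(None)             # run until the yield; handler is now suspended

# While suspended, the local variable is kept in the generator's frame object.
saved_locals = dict(coro.gi_frame.f_locals)

# Later, a stand-in for the event loop resumes the generator with the I/O result.
try:
    coro.send("insert ok")
except StopIteration as stop:
    final = stop.value                # the generator's return value
```

Between the two send() calls no thread is blocked, but the frame (and so var) stays in memory until the generator finishes or is garbage-collected.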