Purpose of Python yield when not used in an iterator

I've inherited some fairly buggy code from another project. One of the functions is a callback (the draw_ui method) from a library, and it has a yield statement in it. I'm wondering what the purpose of having a yield in Python is if you're not using it in an iterator context to return a value. What possible benefit could it have?
def draw_ui(self, graphics):
    self._reset_components()
    imgui.set_next_window_size(200, 200, imgui.ONCE)
    if imgui.begin("Entity"):
        if not self._selected:
            imgui.text("No entity selected")
        else:
            imgui.text(self._selected.name)
            yield
        imgui.end()  # end entity window

When a function contains a bare yield statement, it becomes a generator; the generator yields None on the first iteration, so you can say that the function acts as a generator that can be iterated only once and yields a single None value:
def foo():
    yield
>>> f = foo()
>>> print(next(f))
None
>>> print(next(f))
Traceback (most recent call last):
File "<input>", line 1, in <module>
StopIteration
That's what a bare yield does. When a function has a bare yield between two blocks of code, the code before the yield runs on the first iteration, and the code after the yield runs on the second iteration:
def foo():
    print('--statement before yield--')
    yield
    print('--statement after yield--')
>>> f = foo()
>>> next(f)
--statement before yield--
>>> next(f)
--statement after yield--
Traceback (most recent call last):
File "<input>", line 1, in <module>
StopIteration
So a bare yield lets you pause the execution of a function in the middle. However, the second iteration raises a StopIteration exception, because the function doesn't actually yield anything after the first yield. To avoid this, you can pass a default value to the next() function:
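For example, reusing foo from above (any default value works; 'done' is used here just for illustration):
>>> f = foo()
>>> next(f)
--statement before yield--
>>> next(f, 'done')
--statement after yield--
'done'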
Looking at your code, your function is doing the same thing:
def draw_ui(self, graphics):
    self._reset_components()
    imgui.set_next_window_size(200, 200, imgui.ONCE)
    if imgui.begin("Entity"):
        if not self._selected:
            imgui.text("No entity selected")
        else:
            imgui.text(self._selected.name)
            yield  # <--------------
        imgui.end()
So when draw_ui is called and control reaches the else block, the line outside the else block, i.e. imgui.end(), is not executed until the second iteration.
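As a minimal sketch (window, graphics and the driving loop are hypothetical, not part of the original code), the caller could step through the generator like this:
gen = window.draw_ui(graphics)  # hypothetical instance and argument
next(gen)        # runs everything up to the yield (when an entity is selected)
next(gen, None)  # resumes, calls imgui.end(); the default avoids StopIteration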
This pattern is commonly used with context managers; compare it to the following snippet from the contextlib.contextmanager documentation:
from contextlib import contextmanager

@contextmanager
def managed_resource(*args, **kwds):
    # Code to acquire resource, e.g.:
    resource = acquire_resource(*args, **kwds)
    try:
        yield resource
    finally:
        # Code to release resource, e.g.:
        release_resource(resource)
>>> with managed_resource(timeout=3600) as resource:
...     # Resource is released at the end of this block,
...     # even if code in the block raises an exception
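In the same spirit, here is a hedged sketch of how such a generator could be consumed as a context manager (window and graphics are hypothetical names, and this is an illustration, not necessarily how the original library drives the callback):
from contextlib import contextmanager

draw = contextmanager(window.draw_ui)  # wrap the bound generator method
with draw(graphics):                   # runs the body up to the yield
    pass                               # the caller would render the frame here
# on exit, the code after the yield runs, i.e. imgui.end()
# note: @contextmanager requires the generator to yield exactly once,
# so this only works on the code path that actually reaches the yield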

Related

Use contextmanager to trap instructions for later execution

I want to achieve a pseudo-db-like transaction using context manager.
Take for example:
class Transactor:
    def a(self): pass
    def b(self, d, b): pass
    def c(self, i): pass

    @contextmanager
    def get_session(self):
        txs = []
        yield self  # accumulate method calls
        for tx in txs:
            tx()    # somehow pass the arguments
def main():
    t = Transactor()
    with t.get_session() as session:
        session.a()                # inserts `a` into `txs`
        ... more code ...
        session.c(value)           # inserts `c` and `(value)` into `txs`
        session.b(value1, value2)  # inserts `b` and `(value1, value2)` into `txs`
        ... more code ...
        # non-transactor related code
        f = open('file.txt')       # If this throws an exception,
                                   # break out of the context manager,
                                   # and discard previous transactor calls.
        ... more code ...
        session.a()                # inserts `a` into `txs`
        session.b(x, y)            # inserts `b` and `(x, y)` into `txs`

    # Now we are outside of the context manager.
    # The following calls should execute immediately
    t.a()
    t.b(x, y)
    t.c(k)
If something goes wrong such as an exception, discard txs (rollback). If it makes it to the end of the context, execute each instruction in order of insertion and pass in the appropriate arguments.
How can I trap the method calls for later execution?
And one extra caveat:
If get_session is not called, I want to execute the instructions immediately.
It's not pretty, but to follow the structure you're looking for you'd have to build a temporary transaction class that holds your function queue and executes it after the context manager exits. You'll need to use functools.partial, but there are some restrictions:
All the queued up calls must be methods based on your "session" instance. Anything else gets executed right away.
I don't know how you want to handle non-callable session attributes, so for now I assume it'll just retrieve the value.
Having said that, here's my take on it:
from functools import partial

class TempTrans:
    # pass in the object instance to mimic
    def __init__(self, obj):
        self._queue = []

        # iterate through the attributes and methods within the object and its class
        for attr, val in type(obj).__dict__.items() ^ obj.__dict__.items():
            if not attr.startswith('_'):
                if callable(val):
                    setattr(self, attr, partial(self._add, getattr(obj, attr)))
                else:
                    # placeholder to handle non-callable attributes
                    setattr(self, attr, val)

    # function to add to queue
    def _add(self, func, *args, **kwargs):
        self._queue.append(partial(func, *args, **kwargs))

    # function to execute the queue
    def _execute(self):
        _remove = []

        # iterate through the queue to call the functions.
        # I suggest catching errors here in case your functions fall through
        for func in self._queue:
            try:
                func()
                _remove.append(func)
            except Exception as e:
                print('some error occurred')
                break

        # remove the functions that were successfully run
        for func in _remove:
            self._queue.remove(func)
Now onto the context manager (it lives outside your class; you can place it inside the class as a method if you wish):
@contextmanager
def temp_session(obj):
    t = TempTrans(obj)
    try:
        yield t
        t._execute()
        print('Transactions successfully ran')
    except:
        print('Encountered errors, queue was not executed')
    finally:
        print(t._queue)  # debug to see what's left of the queue
Usage:
f = Foo()

with temp_session(f) as session:
    session.a('hello')
    session.b(1, 2, 3)

# a hello
# b 1 2 3
# Transactions successfully ran
# []

with temp_session(f) as session:
    session.a('hello')
    session.b(1, 2, 3)
    session.attrdoesnotexist  # expect an error

# Encountered errors, queue was not executed
# [
#     functools.partial(<bound method Foo.a of <__main__.Foo object at 0x0417D3B0>>, 'hello'),
#     functools.partial(<bound method Foo.b of <__main__.Foo object at 0x0417D3B0>>, 1, 2, 3)
# ]
This solution was a bit contrived because of the way you wanted it structured, but if you don't need a context manager and don't need the session to look like a direct function call, it's trivial to just use partial:
my_queue = []

# some session
my_queue.append(partial(f, a))
my_queue.append(partial(f, b))

for func in my_queue:
    func()

Why can't we catch a StopIteration exception inside a generator function?

def simple_generator():
    print("-> start ..")
    try:
        x = yield
        print("-> receive {} ..".format(x))
    except StopIteration:
        print("simple_generator exit..")
I know that each call of next on the generator object runs the code until the next yield statement, and returns the yielded value. If there is no more to get, StopIteration is raised.
So I wanted to catch the StopIteration inside simple_generator, as in the code above. Then I tried:
>>>
>>> sg3 = simple_generator()
>>> sg3.send(None)
-> start ..
>>> sg3.send("hello generator!")
-> receive hello generator! ..
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
It did throw StopIteration, but the try...except did not catch it at all. I could not understand the underlying reason; could anyone explain that? Thanks in advance.
Of course, I also knew that if I dealt with the StopIteration exception outside simple_generator, it worked as I expected, for example:
>>> try:
...     sg4 = simple_generator()
...     while True:
...         next(sg4)
... except StopIteration:
...     print("sg4 exit ..")
...
-> start ..
-> receive None ..
sg4 exit ..
>>>
So my question is: why can't we catch a StopIteration exception inside a generator function?
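For reference, a brief sketch of the underlying reason (a general fact about generators, not code from the question): when the generator's body finishes, StopIteration is raised by the next()/send() call in the caller's frame, not inside the generator, so a try/except inside the generator body can never see it:
def gen():
    try:
        x = yield        # resumed by the second send(); then the body simply ends
    except StopIteration:
        print("never reached")  # nothing raises StopIteration inside the body

g = gen()
g.send(None)   # advance to the yield
g.send("hi")   # the body returns; StopIteration is raised HERE, in the caller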

dictionary changed size during iteration in multithreading app

I do not know how to solve this problem:
Traceback (most recent call last):
  File "/usr/local/cabinet_dev/cabinet/lib/python3.4/site-packages/eventlet/hubs/hub.py", line 458, in fire_timers
    timer()
  File "/usr/local/cabinet_dev/cabinet/lib/python3.4/site-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/cabinet_dev/cabinet/lib/python3.4/site-packages/eventlet/greenthread.py", line 218, in main
    result = function(*args, **kwargs)
  File "./monitor.py", line 148, in caughtBridge
    for call in self.active.keys():
RuntimeError: dictionary changed size during iteration
In the code below:
def caughtBridge(self):
    while True:
        event = self.bridgeQueue.get()
        uniqueid1 = str(event.headers.get('Uniqueid1'))
        uniqueid2 = str(event.headers.get('Uniqueid2'))
        for call in self.active.keys():
            if self.active[call]['uniqueid'] == uniqueid1:
                self.active[call]['uniqueid2'] = uniqueid2
            if self.active[call]['uniqueid'] == uniqueid1:
                for listener in self.listeners:
                    for number in listener.getNumbers():
                        if number == self.active[call]['exten']:
                            if not self.active[call]['answered']:
                                self.sendEvent({"status": "bridge", "id": self.active[call]['uniqueid'],
                                                "number": self.active[call]['exten']},
                                               listener.getRoom())
                                self.__callInfo(self.active[call], listener.getRoom())
                                self.active[call]['answered'] = True
        self.bridgeQueue.task_done()
Use a copy of self.active.keys(), for example:
for call in list(self.active.keys()):
I can't tell whether you add or remove dict entries.
In case of adding, the other threads will not see the added entries.
In case of removing, the current thread may fail with a KeyError, which you have to handle.
For example:
for call in list(self.active.keys()):
    <lock that call to prevent removal>
    if call in self.active:
        ...
        self.active[call]['answered'] = True
    else:
        pass  # call removed, do nothing
    <unlock that call so other threads can do whatever they need>
self.bridgeQueue.task_done()
Read about lock objects in the Python threading documentation (threading.html#lock-objects).
Basically, implement a pair of methods self.lock(call) and self.unlock(call), for instance:
Untested code: to prevent deadlocks you have to guarantee that self.unlock(call) is always reached!
class XXX:
    def __init__(self, ...):
        self._lock = threading.Lock()
        # init: all self.active[call]['lock'] = False

    def lock(self, call):
        # self._lock is a threading.Lock instance
        # self._lock has to be the same for all threads
        with self._lock:
            if call in self.active and not self.active[call]['lock']:
                self.active[call]['lock'] = True
                return True
            else:
                return False

    def unlock(self, call):
        with self._lock:
            self.active[call]['lock'] = False

# Usage:
for call in list(self.active.keys()):
    if self.lock(call):
        ...
        self.active[call]['answered'] = True
        self.unlock(call)
    else:
        pass  # call removed, do nothing
self.bridgeQueue.task_done()

Turn functions with a callback into Python generators?

The Scipy minimization function (just to use as an example) has the option of adding a callback function at each step. So I can do something like:
def my_callback(x):
    print x

scipy.optimize.fmin(func, x0, callback=my_callback)
Is there a way to use the callback function to create a generator version of fmin, so that I could do:
for x in my_fmin(func, x0):
    print x
It seems like it might be possible with some combination of yields and sends, but I can't quite think of anything.
As pointed out in the comments, you could do it in a new thread, using a Queue. The drawback is that you'd still need some way to access the final result (what fmin returns at the end). My example below uses an optional callback to do something with it (another option would be to just yield it as well, though your calling code would have to differentiate between iteration results and final results):
from thread import start_new_thread
from Queue import Queue

def my_fmin(func, x0, end_callback=(lambda x: x), timeout=None):
    q = Queue()  # fmin produces, the generator consumes
    job_done = object()  # signals the processing is done

    # Producer
    def my_callback(x):
        q.put(x)
    def task():
        ret = scipy.optimize.fmin(func, x0, callback=my_callback)
        q.put(job_done)
        end_callback(ret)  # "Returns" the result of the main call

    # Starts fmin in a new thread
    start_new_thread(task, ())

    # Consumer
    while True:
        next_item = q.get(True, timeout)  # Blocks until an input is available
        if next_item is job_done:
            break
        yield next_item
Update: to block the execution of the next iteration until the consumer has finished processing the last one, it's also necessary to use task_done and join.
# Producer
def my_callback(x):
    q.put(x)
    q.join()  # Blocks until task_done is called

# Consumer
while True:
    next_item = q.get(True, timeout)  # Blocks until an input is available
    if next_item is job_done:
        break
    yield next_item
    q.task_done()  # Unblocks the producer, so a new iteration can start
Note that maxsize=1 is not necessary, since no new item will be added to the queue until the last one is consumed.
Update 2: Also note that, unless all items are eventually retrieved by this generator, the created thread will deadlock (it will block forever and its resources will never be released). The producer is waiting on the queue, and since it stores a reference to that queue, it will never be reclaimed by the gc even if the consumer is. The queue will then become unreachable, so nobody will be able to release the lock.
A clean solution for that is unknown, if possible at all (since it would depend on the particular function used in place of fmin). A workaround could be made using timeout, having the producer raise an exception if put blocks for too long:
q = Queue(maxsize=1)

# Producer
def my_callback(x):
    q.put(x)
    q.put("dummy", True, timeout)  # Blocks until the first result is retrieved
    q.join()  # Blocks again until task_done is called

# Consumer
while True:
    next_item = q.get(True, timeout)  # Blocks until an input is available
    q.task_done()  # (one "task_done" per "get")
    if next_item is job_done:
        break
    yield next_item
    q.get()  # Retrieves the "dummy" object (must be after yield)
    q.task_done()  # Unblocks the producer, so a new iteration can start
Generator as coroutine (no threading)
Let's have a FakeFtp class whose retrbinary method calls a callback with each successfully read chunk of data:
class FakeFtp(object):
    def __init__(self):
        self.data = iter(["aaa", "bbb", "ccc", "ddd"])

    def login(self, user, password):
        self.user = user
        self.password = password

    def retrbinary(self, cmd, cb):
        for chunk in self.data:
            cb(chunk)
Using a simple callback function has the disadvantage that it is called repeatedly and cannot easily keep context between calls.
The following code defines a process_chunks generator, which can receive chunks of data one by one and process them. In contrast to a simple callback, here we are able to keep all the processing within one function without losing context.
from contextlib import closing
from itertools import count

def main():
    processed = []

    def process_chunks():
        for i in count():
            try:
                # (repeatedly) get the chunk to process
                chunk = yield
            except GeneratorExit:
                # finish_up
                print("Finishing up.")
                return
            else:
                # Here process the chunk as you like
                print("inside coroutine, processing chunk:", i, chunk)
                product = "processed({i}): {chunk}".format(i=i, chunk=chunk)
                processed.append(product)

    with closing(process_chunks()) as coroutine:
        # Get the coroutine to the first yield
        coroutine.next()
        ftp = FakeFtp()
        # next line repeatedly calls `coroutine.send(data)`
        ftp.retrbinary("RETR binary", cb=coroutine.send)
        # each callback "jumps" to `yield` line in `process_chunks`

    print("processed result", processed)
    print("DONE")
To see the code in action, put the FakeFtp class, the code shown above and the following line:
main()
into one file and call it:
$ python headsandtails.py
('inside coroutine, processing chunk:', 0, 'aaa')
('inside coroutine, processing chunk:', 1, 'bbb')
('inside coroutine, processing chunk:', 2, 'ccc')
('inside coroutine, processing chunk:', 3, 'ddd')
Finishing up.
('processed result', ['processed(0): aaa', 'processed(1): bbb', 'processed(2): ccc', 'processed(3): ddd'])
DONE
How it works
processed = [] is here just to show that the generator process_chunks has no problem cooperating with its external context. Everything is wrapped in def main(): to prove there is no need to use global variables.
def process_chunks() is the core of the solution. It might have one-shot input parameters (not used here), but the main point where it receives input is each yield line, which returns whatever anyone sends via .send(data) into an instance of this generator. One could call coroutine.send(chunk) directly, but in this example it is done via a callback referring to coroutine.send.
Note that in a real solution there is no problem having multiple yields in the code; they are processed one by one. This might be used e.g. to read (and ignore) the header of a CSV file and then continue processing the data records, as the sketch below shows.
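A minimal sketch of that idea (the CSV handling here is an assumption for illustration, not part of the original answer):
def process_csv_lines():
    header = yield                     # the first chunk sent in is the header; ignore it
    print("skipping header: %s" % header)
    while True:
        line = yield                   # every further chunk is a data record
        print("record: %s" % line.split(","))

coro = process_csv_lines()
next(coro)                             # advance to the first yield
for line in ["name,age", "alice,30", "bob,25"]:
    coro.send(line)
coro.close()                           # raises GeneratorExit inside the coroutine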
We could instantiate and use the generator as follows:
coroutine = process_chunks()
# Get the coroutine to the first yield
coroutine.next()
ftp = FakeFtp()
# next line repeatedly calls `coroutine.send(data)`
ftp.retrbinary("RETR binary", cb=coroutine.send)
# each callback "jumps" to `yield` line in `process_chunks`
# close the coroutine (will throw the `GeneratorExit` exception into the
# `process_chunks` coroutine).
coroutine.close()
The real code uses the contextlib closing context manager to ensure that coroutine.close() is always called.
Conclusions
This solution does not provide the sort of iterator you consume data from in the traditional style "from outside". On the other hand, we are able to:
use the generator "from inside"
keep all iterative processing within one function without being interrupted between callbacks
optionally use external context
provide usable results to outside
all this can be done without using threading
Credits: The solution is heavily inspired by SO answer Python FTP “chunk” iterator (without loading entire file into memory)
written by user2357112
Concept: Use a blocking queue with maxsize=1 and a producer/consumer model.
The callback produces; the next call to the callback will then block on the full queue.
The consumer then yields the value from the queue, tries to get another value, and blocks on the read.
The producer is then allowed to push to the queue; rinse and repeat.
Usage:
def dummy(func, arg, callback=None):
    for i in range(100):
        callback(func(arg + i))

# Dummy example:
for i in Iteratorize(dummy, lambda x: x + 1, 0):
    print(i)

# example with scipy:
for i in Iteratorize(scipy.optimize.fmin, func, x0):
    print(i)
Can be used as expected for an iterator:
for i in take(5, Iteratorize(dummy, lambda x: x + 1, 0)):
    print(i)
Iteratorize class:
from thread import start_new_thread
from Queue import Queue

class Iteratorize:
    """
    Transforms a function that takes a callback
    into a lazy iterator (generator).
    """
    def __init__(self, func, ifunc, arg, callback=None):
        self.mfunc = func
        self.ifunc = ifunc
        self.c_callback = callback
        self.q = Queue(maxsize=1)
        self.stored_arg = arg
        self.sentinel = object()

        def _callback(val):
            self.q.put(val)

        def gentask():
            ret = self.mfunc(self.ifunc, self.stored_arg, callback=_callback)
            self.q.put(self.sentinel)
            if self.c_callback:
                self.c_callback(ret)

        start_new_thread(gentask, ())

    def __iter__(self):
        return self

    def next(self):
        obj = self.q.get(True, None)
        if obj is self.sentinel:
            raise StopIteration
        else:
            return obj
Can probably do with some cleaning up to accept *args and **kwargs for the function being wrapped and/or the final result callback.
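A hedged sketch of what that cleanup could look like, ported to Python 3's threading/queue modules (Iteratorize2 and final_callback are illustrative names, not part of the original answer; the wrapped function is assumed to accept a callback keyword argument):
import threading
import queue

class Iteratorize2:
    """Wraps a callback-taking function; extra *args/**kwargs are passed through."""
    def __init__(self, func, *args, final_callback=None, **kwargs):
        self.q = queue.Queue(maxsize=1)
        self.sentinel = object()
        self.final_callback = final_callback

        def _callback(val):
            self.q.put(val)

        def task():
            ret = func(*args, callback=_callback, **kwargs)
            self.q.put(self.sentinel)
            if self.final_callback:
                self.final_callback(ret)

        threading.Thread(target=task, daemon=True).start()

    def __iter__(self):
        return self

    def __next__(self):
        obj = self.q.get()
        if obj is self.sentinel:
            raise StopIteration
        return obj
Usage would then look like `for x in Iteratorize2(scipy.optimize.fmin, func, x0): print(x)`.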
How about
data = []
scipy.optimize.fmin(func, x0, callback=data.append)
for line in data:
    print line
If not, what exactly do you want to do with the generator's data?
A variant of Frits' answer, that:
Supports send to choose a return value for the callback
Supports throw to choose an exception for the callback
Supports close to gracefully shut down
Does not compute a queue item until it is requested
The complete code with tests can be found on github
import queue
import threading
import collections.abc

class generator_from_callback(collections.abc.Generator):
    def __init__(self, expr):
        """
        expr: a function that takes a callback
        """
        self._expr = expr
        self._done = False
        self._ready_queue = queue.Queue(1)
        self._done_queue = queue.Queue(1)
        self._done_holder = [False]

        # local to avoid reference cycles
        ready_queue = self._ready_queue
        done_queue = self._done_queue
        done_holder = self._done_holder

        def callback(value):
            done_queue.put((False, value))
            cmd, *args = ready_queue.get()
            if cmd == 'close':
                raise GeneratorExit
            elif cmd == 'send':
                return args[0]
            elif cmd == 'throw':
                raise args[0]

        def thread_func():
            try:
                cmd, *args = ready_queue.get()
                if cmd == 'close':
                    raise GeneratorExit
                elif cmd == 'send':
                    if args[0] is not None:
                        raise TypeError("can't send non-None value to a just-started generator")
                elif cmd == 'throw':
                    raise args[0]
                ret = expr(callback)
                raise StopIteration(ret)
            except BaseException as e:
                done_holder[0] = True
                done_queue.put((True, e))

        self._thread = threading.Thread(target=thread_func)
        self._thread.start()

    def __next__(self):
        return self.send(None)

    def send(self, value):
        if self._done_holder[0]:
            raise StopIteration
        self._ready_queue.put(('send', value))
        is_exception, val = self._done_queue.get()
        if is_exception:
            raise val
        else:
            return val

    def throw(self, exc):
        if self._done_holder[0]:
            raise StopIteration
        self._ready_queue.put(('throw', exc))
        is_exception, val = self._done_queue.get()
        if is_exception:
            raise val
        else:
            return val

    def close(self):
        if not self._done_holder[0]:
            self._ready_queue.put(('close',))
        self._thread.join()

    def __del__(self):
        self.close()
Which works as:
In [3]: def callback(f):
   ...:     ret = f(1)
   ...:     print("gave 1, got {}".format(ret))
   ...:     f(2)
   ...:     print("gave 2")
   ...:     f(3)
   ...:

In [4]: i = generator_from_callback(callback)

In [5]: next(i)
Out[5]: 1

In [6]: i.send(4)
gave 1, got 4
Out[6]: 2

In [7]: next(i)
gave 2, got None
Out[7]: 3

In [8]: next(i)
StopIteration
For scipy.optimize.fmin, you would use generator_from_callback(lambda c: scipy.optimize.fmin(func, x0, callback=c))
Solution to handle non-blocking callbacks
The solution using threading and queue is pretty good: high-performance and cross-platform, and probably the best one.
Here I provide a not-too-bad solution, which is mainly for handling non-blocking callbacks, e.g. ones called from the parent function through threading.Thread(target=callback).start(), or in other non-blocking ways.
import pickle
import select
import subprocess

def my_fmin(func, x0):
    # open a process to use as a pipeline
    proc = subprocess.Popen(['cat'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def my_callback(x):
        # x might be any object, not only str, so we use pickle to dump it
        proc.stdin.write(pickle.dumps(x).replace(b'\n', b'\\n') + b'\n')
        proc.stdin.flush()

    from scipy import optimize
    optimize.fmin(func, x0, callback=my_callback)

    # this is meant to handle non-blocking callbacks, e.g. called somewhere
    # through `threading.Thread(target=callback).start()`
    while select.select([proc.stdout], [], [], 0)[0]:
        yield pickle.loads(proc.stdout.readline()[:-1].replace(b'\\n', b'\n'))

    # close the process
    proc.communicate()
Then you can use the function like this:
# unfortunately, `scipy.optimize.fmin`'s callback is blocking,
# so this example is just for showing how-to.
for x in my_fmin(lambda x: x**2, 3):
    print(x)
Although this solution seems quite simple and readable, it's not as high-performance as the threading-and-queue solution, because:
Processes are much heavier than threads.
Passing data through a pipe instead of memory is much slower.
Besides, it doesn't work on Windows, because the select module on Windows can only handle sockets, not pipes and other file descriptors.
For a super simple approach...
def callback_to_generator():
    data = []
    method_with_callback(blah, foo, callback=data.append)
    for item in data:
        yield item
Yes, this isn't good for large data
Yes, this blocks on all items being processed first
But it still might be useful for some use cases :)
Also thanks to @winston-ewert, as this is just a small variant on his answer :)

Python Generators and yield: How to know which line the program is at

Suppose you have a simple generator in Python like this:
Update:
def f(self):
    customFunction_1(argList_1)
    yield
    customFunction_2(argList_2)
    yield
    customFunction_3(argList_3)
    yield
    ...
I call f() in another script like:
h = f()
while True:
    try:
        h.next()
        sleep(2)
    except KeyboardInterrupt:
        ## [TODO] tell the last line in f() that was executed
        pass
Is there a way I can do the [TODO] section above? That is, knowing the last line in f() that was executed before the KeyboardInterrupt occurred?
You can use enumerate() to count:
def f():
    ...
    yield
    ...
    yield
    ...

for step, value in enumerate(f()):
    try:
        time.sleep(2)
    except KeyboardInterrupt:
        print(step)  # step holds the number of the last executed function
(because in your example yield does not yield a value, value will of course be None)
Or very explicit using a verbose indication:
def f():
    ...
    yield 'first function finished'
    ...
    yield 'almost done'
    ...

for message in f():
    try:
        time.sleep(2)
    except KeyboardInterrupt:
        print(message)
If you want to know the line number for debugging purposes then in CPython you can use h.gi_frame.f_lineno. This is the line which will be executed next and is 1-indexed. I'm not sure if this works on Python implementations other than CPython.
h = f()
while True:
    try:
        h.next()
        sleep(2)
    except KeyboardInterrupt:
        print h.gi_frame.f_lineno - 1  # f_lineno is the line to be executed next.
If you don't want to know this for debugging purposes then Remi's enumerate solution is far cleaner.
Why don't you yield i from f() and use that?
val = h.next()
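A minimal sketch of that suggestion (using sleep as a stand-in for the question's customFunction_* calls, so the snippet is runnable; the step messages are illustrative):
from time import sleep

def f():
    sleep(2)
    yield 'step 1 finished'
    sleep(2)
    yield 'step 2 finished'
    sleep(2)
    yield 'step 3 finished'

h = f()
last = None
while True:
    try:
        last = next(h)   # remember the most recently completed step
    except (KeyboardInterrupt, StopIteration):
        print(last)      # e.g. 'step 2 finished', or None if interrupted early
        break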
def f(self):
    sleep(10)
    yield
    sleep(10)
    yield
    sleep(10)
    yield

h = f()
while True:
    try:
        h.next()
    except KeyboardInterrupt:
        stack_trace = sys.exc_info()[2]    # first traceback points into this function where the exception was caught
        stack_trace = stack_trace.tb_next  # now we are pointing to where it happened (in this case)
        line_no = stack_trace.tb_lineno    # and here's the line number
        del stack_trace                    # get rid of circular references
I moved the call to sleep() into f as this only works if the exception happens inside f().
