Implementing "SystemCalls" with Coroutines in Python - python

I am currently reading through the tutorial at http://www.dabeaz.com/coroutines/Coroutines.pdf and got stuck at the (pure coroutine) multitasking part, in particular the system call section.
The part that confuses me is:
class Task(object):
    taskid = 0
    def __init__(self,target):
        Task.taskid += 1
        self.tid = Task.taskid # Task ID
        self.target = target # Target coroutine
        self.sendval = None # Value to send
    def run(self):
        return self.target.send(self.sendval)
def foo():
    mytid = yield GetTid()
    for i in xrange(5):
        print "I'm foo", mytid
        yield
class SystemCall(object):
    def handle(self):
        pass
class Scheduler(object):
    def __init__(self):
        self.ready = Queue()
        self.taskmap = {}
    def new(self, target):
        newtask = Task(target)
        self.taskmap[newtask.tid] = newtask
        self.schedule(newtask)
        return newtask.tid
    def schedule(self, task):
        self.ready.put(task)
    def mainloop(self):
        while self.taskmap:
            task = self.ready.get()
            try:
                result = task.run()
                if isinstance(result,SystemCall):
                    result.task = task
                    result.sched = self
                    result.handle()
                    continue
            except StopIteration:
                self.exit(task)
                continue
            self.schedule(task)
And the actual calling
sched = Scheduler()
sched.new(foo())
sched.mainloop()
What I don't understand is how the tid gets assigned to mytid in foo(). The order of events seems to be as follows (starting from sched.mainloop()). Please correct me if I have the flow wrong.
Assumptions: let's name some of the things
the_coroutine = foo()
scheduler = Scheduler()
the_task = scheduler.new(the_coroutine) # assume .new() returns the task instead of tid
1. scheduler: .mainloop() is called
2. scheduler: the_task.run() is called
3. the_task: the_coroutine.send(None) is called
4. the_corou: yield GetTid(), and it returns an instance of GetTid to the scheduler before the None in 3 is sent to the yield statement in the loop. (Am I right?)
5. the_corou: (also at the same time?) mytid is assigned the instance of GetTid()?
6. scheduler: result = the_task.run(), so result is <an instance of GetTid>
7. scheduler: result is indeed an instance of SystemCall
8. scheduler: result.handle() is called
9. GetTid instance: scheduler.schedule(the_task)
10. scheduler: current iteration is done, start a new one
11. scheduler: (assume no other task in queue) the_task.run() is called
12. scheduler: the_coroutine.send() is called
13. I am lost?
When it reaches step 12, apparently the loop is already started and is able to print the tid whenever the scheduler runs the respective task. However, when exactly was the value of tid assigned to mytid in foo()? I am sure I missed something in the flow, but where (or even completely wrong)?
Then I notice the part where the Task object calls .send(), it returns a value, so .send() returns a value?

You seem to have omitted the Scheduler's new method, which is almost certainly where the assignment occurs. I imagine the result of GetTid() that is yielded is immediately sent back into the coroutine using .send().
Regarding .send(), yes you're correct, .send() returns a value.
Try the example below to see what's going on. .send assigns a value to the variable on the left side of the = yield, and execution of code within the coroutine resumes until it hits the next yield. At that point, the coroutine yields, and whatever it yields is the return value of the .send.
>>> def C():
...     x = yield
...     yield x**2
...     print 'no more yields left'
...
>>> cor = C()
>>> cor.next()
>>> yielded = cor.send(10)
>>> print yielded
100
>>> cor.next()
no more yields left
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Moving some of the in-comment points into the answer
So, how does x = yield y work?
When the coroutine hits that line of code, it yields the value y, then halts execution, waiting for someone to invoke its .send() method with an argument.
When someone does invoke .send(), the argument passed to .send() is assigned to the variable x, and the coroutine resumes executing from there up to the point of its next yield statement.
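To make the two directions of `x = yield y` concrete, here is a minimal sketch in Python 3 syntax (the tutorial itself uses Python 2). The coroutine name `squarer` and the value "ready" are made up for illustration.

```python
def squarer():
    # `received = yield produced` hands `produced` to the caller, pauses,
    # and later binds the argument of the caller's send() to `received`.
    received = yield "ready"
    while True:
        received = yield received ** 2

cor = squarer()
first = next(cor)      # runs up to the first yield; returns what was yielded
second = cor.send(3)   # resumes with received = 3; returns the next yielded value
third = cor.send(4)    # resumes with received = 4
print(first, second, third)
```

Each call to send() therefore does two things: it delivers a value into the paused `yield` expression, and it returns whatever the coroutine yields next.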
Edit: Whoa boy it gets even more complex... I skimmed through this David Beazley presentation before, but to be honest I'm better acquainted with his other two talks on generators...
Going through the linked material, it looks like this definition of GetTid is what we're after.
class GetTid(SystemCall):
    def handle(self):
        self.task.sendval = self.task.tid
        self.sched.schedule(self.task)
I quote from his presentation: "The operation of this is little subtle". haha.
Now look at the mainloop:
if isinstance(result,SystemCall):
    result.task = task
    result.sched = self
    result.handle() # <- This bit!
    continue
result here is the GetTid object, which runs its handle method. That method sets its task's sendval attribute to the task's tid, and then schedules the task by putting it back in the queue.
Once the task is retrieved from the queue, the task.run() method is run again. Let's look at the Task object definition:
class Task(object):
    ...
    def run(self):
        return self.target.send(self.sendval)
When task.run() is invoked for this second time, it will send its sendval value (which was previously set by result.handle() to its tid) to its .target - the foo coroutine. This is where the foo coroutine object finally receives the value of its mytid.
The foo coroutine object runs until its next yield, printing its message along the way, and returns None (because there's nothing to the right of yield). That None is the return value of the task.run() method.
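That is NOT an instance of a SystemCall, so the system-call branch is skipped on the second pass; the task is simply put back on the ready queue by the self.schedule(task) call at the end of the loop.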
Other evil things probably happen too but that's the flow that I'm seeing for now.
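The whole flow can be replayed with a condensed sketch in Python 3 syntax. It follows the classes quoted in the question, with a few assumptions: a deque stands in for Queue, the Scheduler.exit method (not shown in the question) is assumed to just drop the task from taskmap, and the `seen` list replaces the print statement so the result can be inspected.

```python
from collections import deque

class SystemCall:
    def handle(self):
        pass

class GetTid(SystemCall):
    def handle(self):
        self.task.sendval = self.task.tid   # delivered by the next task.run()
        self.sched.schedule(self.task)

class Task:
    taskid = 0
    def __init__(self, target):
        Task.taskid += 1
        self.tid = Task.taskid
        self.target = target
        self.sendval = None
    def run(self):
        return self.target.send(self.sendval)

class Scheduler:
    def __init__(self):
        self.ready = deque()        # assumption: deque instead of Queue
        self.taskmap = {}
    def new(self, target):
        task = Task(target)
        self.taskmap[task.tid] = task
        self.schedule(task)
        return task.tid
    def schedule(self, task):
        self.ready.append(task)
    def exit(self, task):           # assumption: not shown in the question
        del self.taskmap[task.tid]
    def mainloop(self):
        while self.taskmap:
            task = self.ready.popleft()
            try:
                result = task.run()
                if isinstance(result, SystemCall):
                    result.task = task
                    result.sched = self
                    result.handle()
                    continue
            except StopIteration:
                self.exit(task)
                continue
            self.schedule(task)

seen = []                           # records what foo() saw as mytid

def foo():
    mytid = yield GetTid()          # first run(): yields the GetTid request
    for _ in range(2):              # second run(): mytid is bound here
        seen.append(mytid)
        yield

sched = Scheduler()
tid = sched.new(foo())
sched.mainloop()
print(tid, seen)
```

The first run() only gets the GetTid object out of the coroutine; the tid travels back in on the second run(), via the sendval that handle() stored.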

Related

Python multiprocessing.Pool.apply_async() not executing class function

In a custom class I have the following code:
class CustomClass():
    triggerQueue: multiprocessing.Queue
    def __init__(self):
        self.triggerQueue = multiprocessing.Queue()
    def poolFunc(queueString):
        print(queueString)
    def listenerFunc(self):
        pool = multiprocessing.Pool(5)
        while True:
            try:
                queueString = self.triggerQueue.get_nowait()
                pool.apply_async(func=self.poolFunc, args=(queueString,))
            except queue.Empty:
                break
What I intend to do is:
add a trigger to the queue (not implemented in this snippet) -> works as intended
run an endless loop within the listenerFunc that reads all triggers from the queue (if any are found) -> works as intended
pass the trigger to poolFunc, which is to be executed asynchronously -> not working
It works as soon as I move poolFunc() outside of the class, like this:
def poolFunc(queueString):
    print(queueString)
class CustomClass():
    [...]
But why is that so? Do I have to pass the self argument somehow? Is it impossible to perform it this way in general?
Thank you for any hint!
There are several problems going on here.
Your instance method, poolFunc, is missing a self parameter.
You are never properly terminating the Pool. You should take advantage of the fact that a multiprocessing.Pool object is a context manager.
You're calling apply_async, but you're never waiting for the results. Read the documentation: you need to call the get method on the AsyncResult object to receive the result; if you don't do this before your program exits your poolFunc function may never run.
By making the Queue object part of your class, you won't be able to pass instance methods to workers.
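The last point above can be checked without starting any processes. The sketch below (Python 3; `Recorder`, `label` and `work` are hypothetical names, not from the question) suggests why: pickling a bound method serializes the instance it is bound to, so every attribute of that instance has to be picklable as well. A multiprocessing.Queue stored on the instance would be pulled in the same way, and it refuses to be pickled in that situation.

```python
import pickle

class Recorder:
    """Hypothetical class that notes when pickle serializes an instance."""
    pickled = False

    def __init__(self):
        self.label = "demo"

    def __getstate__(self):
        Recorder.pickled = True          # pickle asked for the instance state
        return self.__dict__.copy()

    def work(self, item):
        return item

obj = Recorder()
payload = pickle.dumps(obj.work)         # pickling only the bound method...
restored = pickle.loads(payload)         # ...rebuilds a method on a copied instance
print(Recorder.pickled, restored("x"), restored.__self__.label)
```

This is the same serialization step that apply_async has to perform on self.poolFunc before handing it to a worker.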
We can fix all of the above like this:
import multiprocessing
import queue
triggerQueue = multiprocessing.Queue()
class CustomClass:
    def poolFunc(self, queueString):
        print(queueString)
    def listenerFunc(self):
        results = []
        with multiprocessing.Pool(5) as pool:
            while True:
                try:
                    queueString = triggerQueue.get_nowait()
                    results.append(pool.apply_async(self.poolFunc, (queueString,)))
                except queue.Empty:
                    break
            for res in results:
                print(res.get())
c = CustomClass()
for i in range(10):
    triggerQueue.put(f"testval{i}")
c.listenerFunc()
You can, as you mention, also replace your instance method with a static method, in which case we can keep triggerQueue as part of the class:
import multiprocessing
import queue
class CustomClass:
    def __init__(self):
        self.triggerQueue = multiprocessing.Queue()
    @staticmethod
    def poolFunc(queueString):
        print(queueString)
    def listenerFunc(self):
        results = []
        with multiprocessing.Pool(5) as pool:
            while True:
                try:
                    queueString = self.triggerQueue.get_nowait()
                    results.append(pool.apply_async(self.poolFunc, (queueString,)))
                except queue.Empty:
                    break
            for r in results:
                print(r.get())
c = CustomClass()
for i in range(10):
    c.triggerQueue.put(f"testval{i}")
c.listenerFunc()
But we still need to collect the apply_async results.
Okay, I found an answer and a workaround:
The answer is based on noxdafox's answer to this question.
Instance methods cannot be serialized that easily. What the pickle protocol does when serialising a function is simply to turn it into a string (the name under which it can be looked up again).
For a child process, it would be quite hard to find the right object your instance method refers to, because the processes have separate address spaces.
A functioning workaround is to declare poolFunc() as a static method:
@staticmethod
def poolFunc(queueString):
    print(queueString)

What is the meaning of bind = True keyword in celery?

What is the meaning of bind=True in the Celery code below? When should it be used, and when not?
@app.task(bind=True)
def send_twitter_status(self, oauth, tweet):
    try:
        twitter = Twitter(oauth)
        twitter.update_status(tweet)
    except (Twitter.FailWhaleError, Twitter.LoginError) as exc:
        raise self.retry(exc=exc)
Just a small addition to the other answers. As already stated, bound tasks have access to the task instance. One use case where this is needed is retries:
@celery.task(bind=True, max_retries=5)
def retrying(self):
    try:
        return 1/0
    except Exception:
        self.retry(countdown=5)
Another use case is when you want to define custom states for your tasks and be able to set them during task execution:
@celery.task(bind=True)
def show_progress(self, n):
    for i in range(n):
        self.update_state(state='PROGRESS', meta={'current': i, 'total': n})
Bound tasks
A task being bound means the first argument to the task will always be the task instance (self), just like Python bound methods:
logger = get_task_logger(__name__)
@task(bind=True)
def add(self, x, y):
    logger.info(self.request.id)
The bind argument means that the function will be a “bound method” so that you can access attributes and methods on the task type instance.
See the docs
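The comparison with Python bound methods can be tried in plain Python, without Celery installed. This is only an analogy: the `Task` class below is a hypothetical stand-in and not Celery's Task class.

```python
class Task:
    """Hypothetical stand-in used only to illustrate bound methods."""
    def __init__(self, name):
        self.name = name

    def run(self, x, y):
        # `self` is supplied automatically because the method is bound.
        return (self.name, x + y)

task = Task("add")
bound = task.run          # a bound method remembers its instance
result = bound(2, 3)      # the caller passes only x and y
print(bound.__self__ is task, result)
```

With bind=True a task function is called in a similar spirit: the caller supplies the ordinary arguments, and the task instance arrives as the first parameter.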

Python Multiprocessing-- Variable not being updated [duplicate]

I am trying to return values from subprocesses but these values are unfortunately unpicklable. So I used global variables in threads module with success but have not been able to retrieve updates done in subprocesses when using multiprocessing module. I hope I'm missing something.
The results printed at the end are always the same as initial values given the vars dataDV03 and dataDV04. The subprocesses are updating these global variables but these global variables remain unchanged in the parent.
import multiprocessing
# NOT ABLE to get python to return values in passed variables.
ants = ['DV03', 'DV04']
dataDV03 = ['', '']
dataDV04 = {'driver': '', 'status': ''}
def getDV03CclDrivers(lib): # call global variable
    global dataDV03
    dataDV03[1] = 1
    dataDV03[0] = 0
    # eval( 'CCL.' + lib + '.' + lib + '( "DV03" )' ) these are unpicklable instantiations
def getDV04CclDrivers(lib, dataDV04): # pass global variable
    dataDV04['driver'] = 0 # eval( 'CCL.' + lib + '.' + lib + '( "DV04" )' )
if __name__ == "__main__":
    jobs = []
    if 'DV03' in ants:
        j = multiprocessing.Process(target=getDV03CclDrivers, args=('LORR',))
        jobs.append(j)
    if 'DV04' in ants:
        j = multiprocessing.Process(target=getDV04CclDrivers, args=('LORR', dataDV04))
        jobs.append(j)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print 'Results:\n'
    print 'DV03', dataDV03
    print 'DV04', dataDV04
I cannot post to my question so will try to edit the original.
Here is the object that is not picklable:
In [1]: from CCL import LORR
In [2]: lorr=LORR.LORR('DV20', None)
In [3]: lorr
Out[3]: <CCL.LORR.LORR instance at 0x94b188c>
This is the error returned when I use a multiprocessing.Pool to return the instance back to the parent:
Thread getCcl (('DV20', 'LORR'),)
Process PoolWorker-1:
Traceback (most recent call last):
File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/pool.py", line 71, in worker
put((job, i, result))
File "/alma/ACS-10.1/casa/lib/python2.6/multiprocessing/queues.py", line 366, in put
return send(obj)
UnpickleableError: Cannot pickle <type 'thread.lock'> objects
In [5]: dir(lorr)
Out[5]:
['GET_AMBIENT_TEMPERATURE',
'GET_CAN_ERROR',
'GET_CAN_ERROR_COUNT',
'GET_CHANNEL_NUMBER',
'GET_COUNT_PER_C_OP',
'GET_COUNT_REMAINING_OP',
'GET_DCM_LOCKED',
'GET_EFC_125_MHZ',
'GET_EFC_COMB_LINE_PLL',
'GET_ERROR_CODE_LAST_CAN_ERROR',
'GET_INTERNAL_SLAVE_ERROR_CODE',
'GET_MAGNITUDE_CELSIUS_OP',
'GET_MAJOR_REV_LEVEL',
'GET_MINOR_REV_LEVEL',
'GET_MODULE_CODES_CDAY',
'GET_MODULE_CODES_CMONTH',
'GET_MODULE_CODES_DIG1',
'GET_MODULE_CODES_DIG2',
'GET_MODULE_CODES_DIG4',
'GET_MODULE_CODES_DIG6',
'GET_MODULE_CODES_SERIAL',
'GET_MODULE_CODES_VERSION_MAJOR',
'GET_MODULE_CODES_VERSION_MINOR',
'GET_MODULE_CODES_YEAR',
'GET_NODE_ADDRESS',
'GET_OPTICAL_POWER_OFF',
'GET_OUTPUT_125MHZ_LOCKED',
'GET_OUTPUT_2GHZ_LOCKED',
'GET_PATCH_LEVEL',
'GET_POWER_SUPPLY_12V_NOT_OK',
'GET_POWER_SUPPLY_15V_NOT_OK',
'GET_PROTOCOL_MAJOR_REV_LEVEL',
'GET_PROTOCOL_MINOR_REV_LEVEL',
'GET_PROTOCOL_PATCH_LEVEL',
'GET_PROTOCOL_REV_LEVEL',
'GET_PWR_125_MHZ',
'GET_PWR_25_MHZ',
'GET_PWR_2_GHZ',
'GET_READ_MODULE_CODES',
'GET_RX_OPT_PWR',
'GET_SERIAL_NUMBER',
'GET_SIGN_OP',
'GET_STATUS',
'GET_SW_REV_LEVEL',
'GET_TE_LENGTH',
'GET_TE_LONG_FLAG_SET',
'GET_TE_OFFSET_COUNTER',
'GET_TE_SHORT_FLAG_SET',
'GET_TRANS_NUM',
'GET_VDC_12',
'GET_VDC_15',
'GET_VDC_7',
'GET_VDC_MINUS_7',
'SET_CLEAR_FLAGS',
'SET_FPGA_LOGIC_RESET',
'SET_RESET_AMBSI',
'SET_RESET_DEVICE',
'SET_RESYNC_TE',
'STATUS',
'_HardwareDevice__componentName',
'_HardwareDevice__hw',
'_HardwareDevice__stickyFlag',
'_LORRBase__logger',
'__del__',
'__doc__',
'__init__',
'__module__',
'_devices',
'clearDeviceCommunicationErrorAlarm',
'getControlList',
'getDeviceCommunicationErrorCounter',
'getErrorMessage',
'getHwState',
'getInternalSlaveCanErrorMsg',
'getLastCanErrorMsg',
'getMonitorList',
'hwConfigure',
'hwDiagnostic',
'hwInitialize',
'hwOperational',
'hwSimulation',
'hwStart',
'hwStop',
'inErrorState',
'isMonitoring',
'isSimulated']
In [6]:
When you use multiprocessing to open a second process, an entirely new instance of Python, with its own global state, is created. That global state is not shared, so changes made by child processes to global variables will be invisible to the parent process.
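A minimal sketch of that isolation, in Python 3 syntax. The names `counter`, `bump` and `run_child` are made up, and the "fork" start method is requested explicitly, which assumes a POSIX system; with other start methods the process-creating code would need to sit under an `if __name__ == "__main__":` guard.

```python
import multiprocessing

counter = {"value": 0}               # module-level (global) state

def bump():
    counter["value"] += 1            # changes only the child's own copy

def run_child():
    # Assumption: "fork" is available (POSIX only).
    ctx = multiprocessing.get_context("fork")
    child = ctx.Process(target=bump)
    child.start()
    child.join()
    return child.exitcode

exitcode = run_child()
print(exitcode, counter["value"])    # the parent's copy was never touched
```

The child runs bump() successfully, yet the dictionary in the parent keeps its original value, which is the behaviour described in the question.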
Additionally, most of the abstractions that multiprocessing provides use pickle to transfer data. All data transferred using proxies must be pickleable; that includes all the objects that a Manager provides. Relevant quotations (my emphasis):
Ensure that the arguments to the methods of proxies are picklable.
And (in the Manager section):
Other processes can access the shared objects by using proxies.
Queues also require pickleable data; the docs don't say so, but a quick test confirms it:
import multiprocessing
import pickle
class Thing(object):
    def __getstate__(self):
        print 'got pickled'
        return self.__dict__
    def __setstate__(self, state):
        print 'got unpickled'
        self.__dict__.update(state)
q = multiprocessing.Queue()
p = multiprocessing.Process(target=q.put, args=(Thing(),))
p.start()
print q.get()
p.join()
Output:
$ python mp.py
got pickled
got unpickled
<__main__.Thing object at 0x10056b350>
The one approach that might work for you, if you really can't pickle the data, is to find a way to store it as a ctype object; a reference to the memory can then be passed to a child process. This seems pretty dodgy to me; I've never done it. But it might be a possible solution for you.
Given your update, it seems like you need to know a lot more about the internals of a LORR. Is LORR a class? Can you subclass from it? Is it a subclass of something else? What's its MRO? (Try LORR.__mro__ and post the output if it works.) If it's a pure python object, it might be possible to subclass it, creating a __setstate__ and a __getstate__ to enable pickling.
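As a rough illustration of the subclassing idea (Python 3 syntax), the sketch below uses a hypothetical `Device` class that holds a thread lock, since the traceback complains about a thread.lock object. Nothing here is known about the real LORR class; whether its lock can be dropped and recreated like this is an assumption.

```python
import pickle
import threading

class Device:
    """Hypothetical stand-in for an object holding an unpicklable lock."""
    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()

class PicklableDevice(Device):
    def __getstate__(self):
        state = self.__dict__.copy()
        del state["_lock"]                 # leave the lock out of the pickle
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.Lock()      # recreate a fresh lock on load

try:
    pickle.dumps(Device("DV20"))
    plain_error = None
except TypeError as exc:                   # locks cannot be pickled
    plain_error = exc

clone = pickle.loads(pickle.dumps(PicklableDevice("DV20")))
print(type(plain_error).__name__, clone.name)
```

The copy that arrives in the other process has its own new lock, so this only helps when the lock protects per-process state rather than a shared hardware connection.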
Another approach might be to figure out how to get the relevant data out of a LORR instance and pass it via a simple string. Since you say that you really just want to call the methods of the object, why not just do so using Queues to send messages back and forth? In other words, something like this (schematically):
Main Process              Child 1                     Child 2
                          LORR 1                      LORR 2
child1_in_queue     ->    get message 'foo'
                          call 'foo' method
child1_out_queue    <-    return foo data string
child2_in_queue     ->                                get message 'bar'
                                                      call 'bar' method
child2_out_queue    <-                                return bar data string
@DBlas gives you a quick URL and reference to the Manager class in an answer, but I think it's still a bit vague, so I thought it might be helpful for you to just see it applied...
import multiprocessing
from multiprocessing import Manager
ants = ['DV03', 'DV04']
def getDV03CclDrivers(lib, data_list):
    data_list[1] = 1
    data_list[0] = 0
def getDV04CclDrivers(lib, data_dict):
    data_dict['driver'] = 0
if __name__ == "__main__":
    manager = Manager()
    dataDV03 = manager.list(['', ''])
    dataDV04 = manager.dict({'driver': '', 'status': ''})
    jobs = []
    if 'DV03' in ants:
        j = multiprocessing.Process(
            target=getDV03CclDrivers,
            args=('LORR', dataDV03))
        jobs.append(j)
    if 'DV04' in ants:
        j = multiprocessing.Process(
            target=getDV04CclDrivers,
            args=('LORR', dataDV04))
        jobs.append(j)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print 'Results:\n'
    print 'DV03', dataDV03
    print 'DV04', dataDV04
Because multiprocessing actually uses separate processes, you cannot simply share global variables, because they will be in completely different "spaces" in memory. What you do to a global in one process will not be reflected in another. I admit that it seems confusing, since the way you see it, it's all living right there in the same piece of code, so "why shouldn't those methods have access to the global"? It's harder to wrap your head around the idea that they will be running in different processes.
The Manager class is given to act as a proxy for data structures that can shuttle info back and forth for you between processes. What you will do is create a special dict and list from a manager, pass them into your methods, and operate on them locally.
Un-pickle-able data
For your specialized LORR object, you might need to create something like a proxy that can represent the picklable state of the instance.
Not super robust or tested much, but it gives you the idea.
class LORRProxy(object):
    def __init__(self, lorrObject=None):
        self.instance = lorrObject
    def __getstate__(self):
        # how to get the state data out of a lorr instance
        inst = self.instance
        state = dict(
            foo = inst.a,
            bar = inst.b,
        )
        return state
    def __setstate__(self, state):
        # rebuild a lorr instance from state
        lorr = LORR.LORR()
        lorr.a = state['foo']
        lorr.b = state['bar']
        self.instance = lorr
When using multiprocessing, the only way to pass objects between processes is to use a Queue or Pipe; globals are not shared. Objects must be picklable, so multiprocessing won't help you here.
You could also use a multiprocessing Array. This allows you to have a shared state between processes and is probably the closest thing to a global variable.
At the top of main, declare an Array. The first argument 'i' says it will be integers. The second argument gives the initial values:
shared_dataDV03 = multiprocessing.Array ('i', (0, 0)) #a shared array
Then pass this array to the process as an argument:
j = multiprocessing.Process(target=getDV03CclDrivers, args=('LORR',shared_dataDV03))
You have to receive the array argument in the function being called, and then you can modify it within the function:
def getDV03CclDrivers(lib,arr): # call global variable
    arr[1]=1
    arr[0]=0
The array is shared with the parent, so you can print out the values at the end in the parent:
print 'DV03', shared_dataDV03[:]
And it will show the changes:
DV03 [0, 1]
I use p.map() to spin off a number of processes to remote servers and print the results when they come back at unpredictable times:
Servers=[...]
from multiprocessing import Pool
p=Pool(len(Servers))
p.map(DoIndividualSummary, Servers)
This worked fine if DoIndividualSummary used print for the results, but the overall result was in unpredictable order, which made interpretation difficult. I tried a number of approaches to use global variables but ran into problems. Finally, I succeeded with sqlite3.
Before p.map(), open a sqlite connection and create a table:
import sqlite3
conn=sqlite3.connect('servers.db') # need conn for commit and close
db=conn.cursor()
try: db.execute('''drop table servers''')
except: pass
db.execute('''CREATE TABLE servers (server text, serverdetail text, readings text)''')
conn.commit()
Then, when returning from DoIndividualSummary(), save the results into the table:
db.execute('''INSERT INTO servers VALUES (?,?,?)''', (server,serverdetail,readings))
conn.commit()
return
After the map() statement, print the results:
db.execute('''select * from servers order by server''')
rows=db.fetchall()
for server,serverdetail,readings in rows: print serverdetail,readings
May seem like overkill but it was simpler for me than the recommended solutions.

Twisted deferred chaining

I have an application that manages modules calls asynchronously:
it requests a deferred that triggers itself
appends custom callback
checks the returned code to see if = CONTINUE, otherwise handle errors
This is the code that returns a deferred to the main application:
def xxfi_connect(self, hostname):
    d = defer.Deferred()
    d.callback(Milter.ReturnCodes.CONTINUE)
    return d
To asynchronously append some code, I need to hook up my function call in the deferred function like this:
d = defer.Deferred()
d.addCallback(self.run_mods, application.L_CONNECT)
d.callback(Milter.ReturnCodes.CONTINUE)
The trouble is that every hooked-up function receives an argument containing the callback parameter (application.L_CONNECT).
Is it possible to achieve this without transporting the returncode in every function call ?
Ideally, I'd like my callback function to be like this:
def run_mods(self, level):
    pass
instead of
def run_mods(self, code, level):
    pass
because the code (Milter.ReturnCodes.CONTINUE) is only needed at the end of the chain
Distinguishing successful from errored Deferreds is already a feature built into them.
>>> from twisted.internet import defer
>>> d = defer.Deferred()
>>> def errors(*args): raise Exception("i'm a bad function")
>>> def sad(*arg): print "this is not so good", arg
>>> def happy(*arg): print "this is good", arg
>>> d.addCallback(errors)
<Deferred at 0x106821f38>
>>> d.addCallbacks(happy, sad)
<Deferred at 0x106821f38>
>>> d.callback("hope")
this is not so good (<twisted.python.failure.Failure <type 'exceptions.Exception'>>,)
Any given "stage" in a chain of callbacks can easily know if it's following an error state, either by how it was added (as the argument to addErrback(), or as the second argument to addCallbacks()), or by virtue of its argument being an instance of Failure.
For more about deferred chaining see: https://twistedmatrix.com/documents/14.0.1/core/howto/defer.html

How to execute Tornado coroutine inside of synchronous environment?

I have a problem related to Tornado coroutines.
There is a Python model A which has the ability to execute a function. The function can be set from outside the model. I can't change the model itself, but I can pass in any function I want. I'm trying to teach it to work with Tornado's ioloop through my function, but I couldn't.
Here is the snippet:
import functools
import pprint
from tornado import gen
from tornado import ioloop
class A:
    f = None
    def execute(self):
        return self.f()
    pass
@gen.coroutine
def genlist():
    raise gen.Return(range(1, 10))
@gen.coroutine
def some_work():
    a = A()
    a.f = functools.partial(
        ioloop.IOLoop.instance().run_sync,
        lambda: genlist())
    print "a.f set"
    raise gen.Return(a)
@gen.coroutine
def main():
    a = yield some_work()
    retval = a.execute()
    raise gen.Return(retval)
if __name__ == "__main__":
    pprint.pprint(ioloop.IOLoop.current().run_sync(main))
So the thing is that I set the function in one part of the code, but execute it in another part through the model's method.
Now, Tornado 4.2.1 gives me "IOLoop is already running", but in Tornado 3.1.1 it works (though I don't know exactly how).
I know the following:
I can create a new ioloop, but I would like to use the existing ioloop.
I can wrap genlist with a function which knows that genlist's result is a Future, but I don't know how to block execution inside a synchronous function until the future's result is set.
Also, I can't use the result of a.execute() as a future object, because a.execute() could be called from other parts of the code, i.e. it should return a list instance.
So, my question is: is there any way to execute the asynchronous genlist from the synchronous model's method using the current IOLoop?
You cannot restart the outer IOLoop here. You have three options:
Use asynchronous interfaces everywhere: change a.execute() and everything up to the top of the stack into coroutines. This is the usual pattern for Tornado-based applications; trying to straddle the synchronous and asynchronous worlds is difficult and it's better to stay on one side or the other.
Use run_sync() on a temporary IOLoop. This is what Tornado's synchronous tornado.httpclient.HTTPClient does, which makes it safe to call from within another IOLoop. However, if you do it this way the outer IOLoop remains blocked, so you have gained nothing by making genlist asynchronous.
Run a.execute on a separate thread and call back to the main IOLoop's thread for the inner function. If a.execute cannot be made asynchronous, this is the only way to avoid blocking the IOLoop while it is running.
import concurrent.futures
import tornado.concurrent
from tornado import gen, ioloop
executor = concurrent.futures.ThreadPoolExecutor(8)
@gen.coroutine
def some_work():
    a = A()
    def adapter():
        # Convert the thread-unsafe tornado.concurrent.Future
        # to a thread-safe concurrent.futures.Future.
        # Note that everything including chain_future must happen
        # on the IOLoop thread.
        future = concurrent.futures.Future()
        ioloop.IOLoop.instance().add_callback(
            lambda: tornado.concurrent.chain_future(
                genlist(), future))
        return future.result()
    a.f = adapter
    print "a.f set"
    raise gen.Return(a)
@gen.coroutine
def main():
    a = yield some_work()
    retval = yield executor.submit(a.execute)
    raise gen.Return(retval)
Say, your function looks something like this:
@gen.coroutine
def foo():
    # does slow things
or
@concurrent.run_on_executor
def bar(i=1):
    # does slow things
You can run foo() like so:
from tornado.ioloop import IOLoop
loop = IOLoop.current()
loop.run_sync(foo)
You can run bar(..), or any coroutine that takes args, like so:
from functools import partial
from tornado.ioloop import IOLoop
loop = IOLoop.current()
f = partial(bar, i=100)
loop.run_sync(f)
