Twisted: how to wrap a function that takes a call back? - python

I am trying to use twisted framework for some async programming task recently. One thing I do not quite understand is how to wrap a function which takes a call back function and make it an function that return a deferred object?
For example if I have a function like below:
def registerCallbackForData(callback):
pass # this is a function that I do not control, some library code
And now the way I use it is to just register a callback. But I want to be able to incorporate this into the twisted framework, returning a deferred object probably and use reactor.run() later. Is this possible?

def convert_callback_to_deferred(f):
def g():
d = Deferred()
d.addCallback(callback)
f(d.callback)
return d
return g
from somelib import registerCallbackForData
getSomeDeferredForData = convert_callback_to_deferred(registerCallbackForData)
d = getSomeDeferredForData()
d.addCallback(...)
...
However, bear in mind that a Deferred can produce at most one result. If registerCallbackForData(cb) will result in cb being called more than once, there is nowhere for the 2nd and following calls' data to go. Only if you have an at-most-once event source does it make sense to convert it to the Deferred interface.

defer.maybeDeferred will wrap a blocking function call to return a deferred. If you want the call to be non-blocking you can instead use threads.deferToThread.
You can see the difference by switching which call is commented out:
import time
from twisted.internet import reactor, defer, threads
def foo():
print "going to sleep"
time.sleep(1)
print "woke up"
return "a result"
def show_result_and_stop_reactor(result):
print "got result: %s, stopping the reactor" % result
reactor.stop()
print "making the deferred"
d = defer.maybeDeferred(foo)
# d = threads.deferToThread(foo)
print "adding the callback"
d.addCallback(show_result_and_stop_reactor)
print "running the reactor"
reactor.run()

Related

Python multiprocessing.Pool.apply_async() not executing class function

In a custom class I have the following code:
class CustomClass():
triggerQueue: multiprocessing.Queue
def __init__(self):
self.triggerQueue = multiprocessing.Queue()
def poolFunc(queueString):
print(queueString)
def listenerFunc(self):
pool = multiprocessing.Pool(5)
while True:
try:
queueString = self.triggerQueue.get_nowait()
pool.apply_async(func=self.poolFunc, args=(queueString,))
except queue.Empty:
break
What I intend to do is:
add a trigger to the queue (not implemented in this snippet) -> works as intended
run an endless loop within the listenerFunc that reads all triggers from the queue (if any are found) -> works as intended
pass trigger to poolFunc which is to be executed asynchronosly -> not working
It works as soon as I source my poolFun() outside of the class like
def poolFunc(queueString):
print(queueString)
class CustomClass():
[...]
But why is that so? Do I have to pass the self argument somehow? Is it impossible to perform it this way in general?
Thank you for any hint!
There are several problems going on here.
Your instance method, poolFunc, is missing a self parameter.
You are never properly terminating the Pool. You should take advantage of the fact that a multiprocessing.Pool object is a context manager.
You're calling apply_async, but you're never waiting for the results. Read the documentation: you need to call the get method on the AsyncResult object to receive the result; if you don't do this before your program exits your poolFunc function may never run.
By making the Queue object part of your class, you won't be able to pass instance methods to workers.
We can fix all of the above like this:
import multiprocessing
import queue
triggerQueue = multiprocessing.Queue()
class CustomClass:
def poolFunc(self, queueString):
print(queueString)
def listenerFunc(self):
results = []
with multiprocessing.Pool(5) as pool:
while True:
try:
queueString = triggerQueue.get_nowait()
results.append(pool.apply_async(self.poolFunc, (queueString,)))
except queue.Empty:
break
for res in results:
print(res.get())
c = CustomClass()
for i in range(10):
triggerQueue.put(f"testval{i}")
c.listenerFunc()
You can, as you mention, also replace your instance method with a static method, in which case we can keep triggerQueue as part of the class:
import multiprocessing
import queue
class CustomClass:
def __init__(self):
self.triggerQueue = multiprocessing.Queue()
#staticmethod
def poolFunc(queueString):
print(queueString)
def listenerFunc(self):
results = []
with multiprocessing.Pool(5) as pool:
while True:
try:
queueString = self.triggerQueue.get_nowait()
results.append(pool.apply_async(self.poolFunc, (queueString,)))
except queue.Empty:
break
for r in results:
print(r.get())
c = CustomClass()
for i in range(10):
c.triggerQueue.put(f"testval{i}")
c.listenerFunc()
But we still need to reap the pool_async results.
Okay, I found an answer and a workaround:
the answer is based the anser of noxdafox to this question.
Instance methods cannot be serialized that easily. What the Pickle protocol does when serialising a function is simply turning it into a string.
For a child process would be quite hard to find the right object your instance method is referring to due to separate process address spaces.
A functioning workaround is to declare the poolFunc() as static function like
#staticmethod
def poolFunc(queueString):
print(queueString)

Get Deferred from dbpool.runQuery instead of data with Twisted and Oracle

I am trying to get some data from Oracle, using Twisted and runQuery and keep getting Deferred instead of actual data.
How can this be solved?
Some code (I excluded some unnecessary parts, but the idea should be clear):
from twisted.enterprise import adbapi
from twisted.internet import defer
import service_config
ORACLE_DSN = service_config.oracle_dsn
ORACLE_USER = service_config.oracle_user
ORACLE_PASSWORD = service_config.oracle_password
dbpool = adbapi.ConnectionPool('cx_Oracle',
user=ORACLE_USER,
password=ORACLE_PASSWORD,
dsn=ORACLE_DSN, port='49161')
#defer.inlineCallbacks
def ask_db():
data = yield dbpool.runQuery("SELECT * FROM customer")
a = ask_db()
print(a)
I got reactor running in other module, if that is important.
Thank you in advance.
UPDATE:
With help of #notorious.no got working code, returning data instead of Deferred with Python 3.5:
#defer.inlineCallbacks
def ask_db(request):
data = yield dbpool.runQuery(request)
return defer.returnValue(data)
You get a Deferred because you're calling an inlineCallback which always returns a Deferred. You're also misinterpreting what yield does. It doesn't actually return a value from an inlinceCallback it just wait's for a result. Use defer.returnValue() to return a value (you can use a simple return if you're using Python 3.4+). This is what your code should look like:
from __future__ import print_function
#...
#defer.inlineCallbacks
def ask_db():
data = yield dbpool.runQuery("SELECT * FROM customer")
defer.returnValue(data) # actually return a value
a = ask_db() # this returns a Deferred so add callbacks!
a.addCallback(print) # add a useful callback to processes query list
reactor.run()
The difference between what you had previously and this answer is that a callback is added so when the runQuery() returns with a value, the callback is executed and ask_db() actually returns a value you care about.
References
Database Usage with Klein/Twisted

Twisted deferred chaining

I have an application that manages modules calls asynchronously:
it requests a deferred that triggers itself
appends custom callback
checks the returned code to see if = CONTINUE, otherwise handle errors
This is the code that returns a deferred to the main application:
def xxfi_connect(self, hostname):
d = defer.Deferred()
d.callback(Milter.ReturnCodes.CONTINUE)
return d
To asynchronously append some code, I need to hook up my function call in the deferred function like this:
d = defer.Deferred()
d.addCallback(self.run_mods, application.L_CONNECT)
d.callback(Milter.ReturnCodes.CONTINUE)
The trouble is that every function hooked up receive an argument containing the callback parameter (application.L_CONNECT).
Is it possible to achieve this without transporting the returncode in every function call ?
Ideally, I'd like my callback function to be like this:
def run_mods(self, level):
pass
instead of
def run_mods(self, code, level):
pass
because the code (Milter.ReturnCodes.CONTINUE) is only needed at the end of the chain
distinguishing Successful or Erred Deferred()s is already a feature built into them.
>>> from twisted.internet import defer
>>> d = defer.Deferred()
>>> def errors(*args): raise Exception("i'm a bad function")
>>> def sad(*arg): print "this is not so good", arg
>>> def happy(*arg): print "this is good", arg
>>> d.addCallback(errors)
<Deferred at 0x106821f38>
>>> d.addCallbacks(happy, sad)
<Deferred at 0x106821f38>
>>> d.callback("hope")
this is not so good (<twisted.python.failure.Failure <type 'exceptions.Exception'>>,)
Any given "stage" in a chain of callbacks can easily know if it's following an error state, either by how it was added, as the argument to addErrback(), or the second argument to addCallbacks(), or by virtue of it's argument being an instance of Failure()
For more about deferred chaining see: https://twistedmatrix.com/documents/14.0.1/core/howto/defer.html

How to execute Tornado coroutine inside of synchronous environment?

I have some Tornado's coroutine related problem.
There is some python-model A, which have the abbility to execute some function. The function could be set from outside of the model. I can't change the model itself, but I can pass any function I want. I'm trying to teach it to work with Tornado's ioloop through my function, but I couldn't.
Here is the snippet:
import functools
import pprint
from tornado import gen
from tornado import ioloop
class A:
f = None
def execute(self):
return self.f()
pass
#gen.coroutine
def genlist():
raise gen.Return(range(1, 10))
#gen.coroutine
def some_work():
a = A()
a.f = functools.partial(
ioloop.IOLoop.instance().run_sync,
lambda: genlist())
print "a.f set"
raise gen.Return(a)
#gen.coroutine
def main():
a = yield some_work()
retval = a.execute()
raise gen.Return(retval)
if __name__ == "__main__":
pprint.pprint(ioloop.IOLoop.current().run_sync(main))
So the thing is that I set the function in one part of code, but execute it in the other part with the method of the model.
Now, Tornado 4.2.1 gave me "IOLoop is already running" but in Tornado 3.1.1 it works (but I don't know how exactly).
I know next things:
I can create new ioloop but I would like to use existent ioloop.
I can wrap genlist with some function which knows that genlist's result is Future, but I don't know, how to block execution until future's result will be set inside of synchronous function.
Also, I can't use result of a.execute() as an future object because a.execute() could be called from other parts of the code, i.e. it should return list instance.
So, my question is: is there any opportunity to execute asynchronous genlist from the synchronous model's method using current IOLoop?
You cannot restart the outer IOLoop here. You have three options:
Use asynchronous interfaces everywhere: change a.execute() and everything up to the top of the stack into coroutines. This is the usual pattern for Tornado-based applications; trying to straddle the synchronous and asynchronous worlds is difficult and it's better to stay on one side or the other.
Use run_sync() on a temporary IOLoop. This is what Tornado's synchronous tornado.httpclient.HTTPClient does, which makes it safe to call from within another IOLoop. However, if you do it this way the outer IOLoop remains blocked, so you have gained nothing by making genlist asynchronous.
Run a.execute on a separate thread and call back to the main IOLoop's thread for the inner function. If a.execute cannot be made asynchronous, this is the only way to avoid blocking the IOLoop while it is running.
executor = concurrent.futures.ThreadPoolExecutor(8)
#gen.coroutine
def some_work():
a = A()
def adapter():
# Convert the thread-unsafe tornado.concurrent.Future
# to a thread-safe concurrent.futures.Future.
# Note that everything including chain_future must happen
# on the IOLoop thread.
future = concurrent.futures.Future()
ioloop.IOLoop.instance().add_callback(
lambda: tornado.concurrent.chain_future(
genlist(), future)
return future.result()
a.f = adapter
print "a.f set"
raise gen.Return(a)
#gen.coroutine
def main():
a = yield some_work()
retval = yield executor.submit(a.execute)
raise gen.Return(retval)
Say, your function looks something like this:
#gen.coroutine
def foo():
# does slow things
or
#concurrent.run_on_executor
def bar(i=1):
# does slow things
You can run foo() like so:
from tornado.ioloop import IOLoop
loop = IOLoop.current()
loop.run_sync(foo)
You can run bar(..), or any coroutine that takes args, like so:
from functools import partial
from tornado.ioloop import IOLoop
loop = IOLoop.current()
f = partial(bar, i=100)
loop.run_sync(f)

Python observable implementation that supports multi-channel subscribers

In a twisted application I have a series of resource controller/manager classes that interact via the Observable pattern. Generally most observers will subscribe to a specific channel (ex. "foo.bar.entity2") but there are a few cases where I'd like to know about all event in a specific channel (ex. "foo.*" ) so I wrote something like the following:
from collections import defaultdict
class SimplePubSub(object):
def __init__(self):
self.subjects = defaultdict(list)
def subscribe(self, subject, callbackstr):
"""
for brevity, callbackstr would be a valid Python function or bound method but here is just a string
"""
self.subjects[subject].append(callbackstr)
def fire(self, subject):
"""
Again for brevity, fire would have *args, **kwargs or some other additional message arguments but not here
"""
if subject in self.subjects:
print "Firing callback %s" % subject
for callback in self.subjects[subject]:
print "callback %s" % callback
pubSub = SimplePubSub()
pubSub.subscribe('foo.bar', "foo.bar1")
pubSub.subscribe('foo.foo', "foo.foo1")
pubSub.subscribe('foo.ich.tier1', "foo.ich.tier3_1")
pubSub.subscribe('foo.ich.tier2', "foo.ich.tier2_1")
pubSub.subscribe('foo.ich.tier3', "foo.ich.tier2_1")
#Find everything that starts with foo
#say foo.bar is fired
firedSubject = "foo.bar"
pubSub.fire(firedSubject)
#outputs
#>>Firing callback foo.bar
#>>callback foo.bar1
#but let's say I want to add a callback for everything undr foo.ich
class GlobalPubSub(SimplePubSub):
def __init__(self):
self.globals = defaultdict(list)
super(GlobalPubSub, self).__init__()
def subscribe(self, subject, callback):
if subject.find("*") > -1:
#assumes global suscriptions would be like subject.foo.* and we want to catch all subject.foo's
self.globals[subject[:-2]].append(callback)
else:
super(GlobalPubSub, self).subscribe(subject, callback)
def fire(self, subject):
super(GlobalPubSub, self).fire(subject)
if self.globals:
for key in self.globals.iterkeys():
if subject.startswith(key):
for callback in self.globals[key]:
print "global callback says", callback
print "Now with global subscriptions"
print
pubSub = GlobalPubSub()
pubSub.subscribe('foo.bar', "foo.bar1")
pubSub.subscribe('foo.foo', "foo.foo1")
pubSub.subscribe('foo.ich.tier1', "foo.ich.tier3_1")
pubSub.subscribe('foo.ich.tier2', "foo.ich.tier2_1")
pubSub.subscribe('foo.ich.tier3', "foo.ich.tier2_1")
pubSub.subscribe("foo.ich.*", "All the ichs, all the time!")
#Find everything that starts with foo.ich
firedSubject = "foo.ich.tier2"
pubSub.fire(firedSubject)
#outputs
#>>Firing callback foo.bar
#>>callback foo.bar1
#>>Now with global subscriptions
#
#>>Firing callback foo.ich.tier2
#>>callback foo.ich.tier2_1
#>>global callback says All the ichs, all the time!
Is this as good as it gets without resorting to some sort of exotic construct ( tries for example )? I'm looking for an affirmation that I'm on the right track or a better alternative suggestion on a global subscription handler that's pure python ( no external libraries or services ).
This looks like you are on the right track to me. I was using PyPubSub with a wxPython app for a bit, and then ended up implementing my own "more simple" version that, at its root, looks very similar to what you've done here except with a few more bells and whistles that you'd probably end up implementing as you fill out your requirements.
The answer given here is also lot like what you have done as well.
This answer goes into examples that are a bit different approach.
There are a number existing libraries besides PyPubSub out there, such as pydispatch and blinker, that might be worth looking at for reference or ideas.

Categories