I am trying to simulate a network of applications that run using Twisted. As part of my simulation I would like to synchronize certain events and be able to feed each process large amounts of data, so I decided to use multiprocessing Events and Queues. However, my processes hang.
I wrote the example code below to illustrate the problem. Specifically (about 95% of the time on my Sandy Bridge machine), the 'run_in_thread' function finishes, but the 'print_done' callback is not called until after I press Ctrl-C.
Additionally, I can change several things in the example code to make this work more reliably, such as: reducing the number of spawned processes, calling self.ready.set from reactor_ready, or changing the delay of deferLater.
I am guessing there is a race condition somewhere between the twisted reactor and blocking multiprocessing calls such as Queue.get() or Event.wait()?
What exactly is the problem I am running into? Is there a bug in my code that I am missing? Can I fix this or is twisted incompatible with multiprocessing events/queues?
Secondly, would something like spawnProcess or Ampoule be the recommended alternative? (as suggested in Mix Python Twisted with multiprocessing?)
Edits (as requested):
I've run into problems with all the reactors I've tried: glib2reactor, selectreactor, pollreactor, and epollreactor. The epollreactor gives the best results and works fine for the example given below, but still gives me the same (or a similar) problem in my application. I will continue investigating.
I'm running Gentoo Linux (kernels 3.3 and 3.4), Python 2.7, and I've tried Twisted 10.2.0, 11.0.0, 11.1.0, 12.0.0, and 12.1.0.
In addition to my Sandy Bridge machine, I see the same issue on my dual-core AMD machine.
#!/usr/bin/python
# -*- coding: utf-8 -*-
from twisted.internet import reactor
from twisted.internet import threads
from twisted.internet import task
from multiprocessing import Process
from multiprocessing import Event


class TestA(Process):
    def __init__(self):
        super(TestA, self).__init__()
        self.ready = Event()
        self.ready.clear()
        self.start()

    def run(self):
        # Each child process runs its own reactor.
        reactor.callWhenRunning(self.reactor_ready)
        reactor.run()

    def reactor_ready(self, *args):
        task.deferLater(reactor, 1, self.node_ready)
        return args

    def node_ready(self, *args):
        print 'node_ready'
        self.ready.set()
        return args


def reactor_running():
    print 'reactor_running'
    df = threads.deferToThread(run_in_thread)
    df.addCallback(print_done)


def run_in_thread():
    # Block in a thread until every child has signalled readiness.
    print 'run_in_thread'
    for n in processes:
        n.ready.wait()


def print_done(dfResult=None):
    print 'print_done'
    reactor.stop()


if __name__ == '__main__':
    processes = [TestA() for i in range(8)]
    reactor.callWhenRunning(reactor_running)
    reactor.run()
The short answer is yes, Twisted and multiprocessing are not compatible with each other, and you cannot reliably use them as you are attempting to.
On all POSIX platforms, child process management is closely tied to SIGCHLD handling. POSIX signal handlers are process-global, and there can be only one per signal type.
Twisted and stdlib multiprocessing cannot both have a SIGCHLD handler installed; only one of them can reliably manage child processes at a time. Your example application doesn't control which of them wins that ability, so I would expect some non-determinism in its behavior arising from that fact.
However, the more immediate problem with your example is that you load Twisted in the parent process and then use multiprocessing to fork and not exec all of the child processes. Twisted does not support being used like this. If you fork and then exec, there's no problem. However, the lack of an exec of a new process (perhaps a Python process using Twisted) leads to all kinds of extra shared state which Twisted does not account for. In your particular case, the shared state that causes this problem is the internal "waker fd" which is used to implement deferToThread. With the fd shared between the parent and all the children, when the parent tries to wake up the main thread to deliver the result of the deferToThread call, it most likely wakes up one of the child processes instead. The child process has nothing useful to do, so that's just a waste of time. Meanwhile the main thread in the parent never wakes up and never notices your threaded task is done.
It's possible you can avoid this issue by not loading any of Twisted until you've already created the child processes. This would turn your usage into a single-process use case as far as Twisted is concerned (in each process, it would be initially loaded, and then that process would not go on to fork at all, so there's no question of how fork and Twisted interact anymore). This means not even importing Twisted until after you've created the child processes.
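A minimal, untested sketch of that approach, adapted from the example above (the one-second deferLater and the eight children are kept as-is); the key point is that the first Twisted import in any process happens only after all the forks are done:

#!/usr/bin/python
from multiprocessing import Process, Event


class TestA(Process):
    def __init__(self):
        super(TestA, self).__init__()
        self.ready = Event()
        self.start()

    def run(self):
        # The first Twisted import in this child happens after the fork,
        # so the child gets its own reactor with no shared waker fd.
        from twisted.internet import reactor, task

        def announce():
            self.ready.set()
            reactor.stop()

        task.deferLater(reactor, 1, announce)
        reactor.run()


if __name__ == '__main__':
    processes = [TestA() for i in range(8)]

    # Only now, with every child already forked, load Twisted in the parent.
    from twisted.internet import reactor, threads

    def run_in_thread():
        for n in processes:
            n.ready.wait()

    def print_done(result):
        print 'print_done'
        reactor.stop()

    reactor.callWhenRunning(
        lambda: threads.deferToThread(run_in_thread).addCallback(print_done))
    reactor.run()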
Of course, this only helps you out as far as Twisted goes. Any other libraries you use could run into similar trouble (you mentioned glib2, which is a great example of another library that will totally choke if you try to use it like this).
I highly recommend not using the multiprocessing module at all. Instead, use any multi-process approach that involves fork and exec, not fork alone. Ampoule falls into that category.
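For comparison, here is a minimal reactor.spawnProcess sketch; it forks and then execs a real executable (/bin/cat, chosen purely for illustration), so none of the fork-without-exec problems above apply:

from twisted.internet import protocol, reactor


class EchoProtocol(protocol.ProcessProtocol):
    def connectionMade(self):
        # Called once the child has been spawned; write to its stdin.
        self.transport.write('hello child\n')
        self.transport.closeStdin()

    def outReceived(self, data):
        print 'child stdout:', data

    def processEnded(self, reason):
        print 'child exited:', reason.value
        reactor.stop()


# args[0] is the conventional program name; env={} gives a clean environment.
reactor.spawnProcess(EchoProtocol(), '/bin/cat', args=['cat'], env={})
reactor.run()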
I have a little program that does some calculations in the background when I call it through the zerorpc module in Python 2.7.
Here is my code:
import zerorpc

is_busy = False

class Server(object):
    def calculateSomeStuff(self):
        global is_busy
        if is_busy:
            return 'I am busy!'
        is_busy = True
        # calculate some stuff
        is_busy = False
        print 'Done!'
        return

    def getIsBusy(self):
        return is_busy

s = zerorpc.Server(Server())
s.bind("tcp://0.0.0.0:4242")  # note: TCP ports must be <= 65535, so the original 66666 was invalid
s.run()
What should I change to make this program return is_busy when I call the .getIsBusy() method after .calculateSomeStuff() has started doing its job?
As far as I know, there is no way to make it asynchronous in Python 2.
You need multi-threading for real concurrency and to exploit more than one CPU core, if that is what you are after. See the Python threading module, the details of the GIL lock and its possible workarounds, and the literature.
If you want a cooperative solution, read on.
zerorpc uses gevent for asynchronous input/output. With gevent you write coroutines (also called greenlets or userland threads), which all run cooperatively on a single thread: the thread in which the gevent input/output loop (the ioloop) runs. The ioloop takes care of resuming coroutines that are waiting for some I/O event.
The key word here is cooperative. Compare that to threads running on a single-core machine: effectively nothing is truly concurrent, but the operating system preempts a running thread to execute the next one, and so on, so that every thread gets a fair chance of moving forward. This happens fast enough that it feels like all threads are running at the same time.
If you write your code cooperatively with the gevent input/output loop, you can achieve the same effect by being careful to call gevent.sleep(0) often enough to give the gevent ioloop a chance to run other coroutines.
It's literally cooperative multithreading; I've heard it was like that in Windows 2 or thereabouts.
So, in your example, in the heavy computation part you likely have some loop going on. Make sure to call gevent.sleep(0) a couple of times per second and you will have the illusion of multi-threading, as in the sketch below.
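A minimal sketch of that, applied to the calculateSomeStuff method from the question (work_chunks and do_one_chunk are placeholders for the real computation, and the 0.1-second threshold is arbitrary):

import time
import gevent
import zerorpc

is_busy = False

class Server(object):
    def calculateSomeStuff(self):
        global is_busy
        if is_busy:
            return 'I am busy!'
        is_busy = True
        last_yield = time.time()
        for chunk in work_chunks:          # placeholder for your real loop
            do_one_chunk(chunk)            # placeholder for one slice of the work
            if time.time() - last_yield > 0.1:
                gevent.sleep(0)            # yield so the ioloop can serve getIsBusy()
                last_yield = time.time()
        is_busy = False
        print 'Done!'

    def getIsBusy(self):
        return is_busy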
I hope my answer wasn't too confusing.
I have read here (on this post) that using ThreadingMixin (from the SocketServer module), you are able to create a threaded server with BaseHTTPServer. I have tried it, and it does work. However, how can I stop active threads spawned by the server (for example, during a server shutdown)? Is this possible?
The simplest solution is to just use daemon_threads. The short version is: just set this to True, and don't worry about it; when you quit, any threads still working will stop automatically.
As the ThreadingMixIn docs say:
When inheriting from ThreadingMixIn for threaded connection behavior, you should explicitly declare how you want your threads to behave on an abrupt shutdown. The ThreadingMixIn class defines an attribute daemon_threads, which indicates whether or not the server should wait for thread termination. You should set the flag explicitly if you would like threads to behave autonomously; the default is False, meaning that Python will not exit until all threads created by ThreadingMixIn have exited.
Further details are available in the threading docs:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left. The initial value is inherited from the creating thread. The flag can be set through the daemon property.
Sometimes this isn't appropriate, because you want to shut down without quitting, or because your handlers may have cleanup that needs to be done. But when it is appropriate, you can't get any simpler.
If all you need is a way to shutdown without quitting, and don't need guaranteed cleanup, you may be able to use platform-specific thread-cancellation APIs via ctypes or win32api. This is generally a bad idea, but occasionally it's what you want.
If you need clean shutdown, you need to build your own machinery for that, where the threads cooperate. For example, you could create a global "quit flag" variable protected by a threading.Condition, and have your handle function check this periodically.
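A minimal sketch of that pattern, using a threading.Event as the quit flag instead of a Condition for brevity (SlowHandler, work_pieces, and handle_piece are illustrative placeholders):

import threading
import BaseHTTPServer

quit_flag = threading.Event()

class SlowHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        for piece in work_pieces:       # placeholder: the slow work, in small steps
            if quit_flag.is_set():
                return                  # cooperate with the shutdown request
            handle_piece(piece)         # placeholder: each step finishes within ~5s
        self.send_response(200)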
This is great if the threads are just doing slow, non-blocking work that you can break up into smaller pieces. For example, if the handle function always checks the quit flag at least once every 5 seconds, you can guarantee being able to shutdown the threads within 5 seconds. But what if the threads are doing blocking work—as they probably are, because the whole reason you used ThreadingMixIn was to let you make blocking calls instead of writing select loops or using asyncore or the like?
Well, there is no good answer. Obviously if you just need the shutdown to happen "eventually" rather than "within 5 seconds" (or if you're willing to abandon clean shutdown after 5 seconds, and revert to either using platform-specific APIs or daemonizing the threads), you can just put the checks before and after each blocking call, and it will "often" work. But if that's not good enough, there's really nothing you can do.
If you need this, the best answer is to change your architecture to use a framework that has ways to do this. The most popular choices are Twisted, Tornado, and gevent. In the future, PEP 3156 will bring similar functionality into the standard library, and there's a partly-complete reference implementation tulip that's worth playing with if you're not trying to build something for the real world that has to be ready soon.
Here's example code showing how to use threading.Event to shut down the server on any POST request:
import SocketServer
import BaseHTTPServer
import threading

quit_event = threading.Event()

class MyRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    """This handler fires the quit event on POST."""
    def do_GET(self):
        self.send_response(200)

    def do_POST(self):
        quit_event.set()
        self.send_response(200)

class MyThreadingHTTPServer(
        SocketServer.ThreadingMixIn, BaseHTTPServer.HTTPServer):
    pass

server = MyThreadingHTTPServer(('', 8080), MyRequestHandler)
threading.Thread(target=server.serve_forever).start()

quit_event.wait()
server.shutdown()
The server is shut down cleanly, so you can immediately restart it and the port is available, rather than getting "Address already in use".
What are best practices or work-arounds for using both multiprocessing and user threads in the same Python application on Linux, with respect to Issue 6721, "Locks in python standard library should be sanitized on fork"?
Why do I need both? I use child processes to do heavy computation that produces data-structure results much too large to return through a queue; rather, they must be immediately stored to disk. It seemed efficient to have each of these child processes monitored by a separate thread, so that when it finished, the thread could handle the IO of reading the large (e.g. multi-GB) data back into the process where the result was needed for further computation in combination with the results of other child processes.
The child processes would intermittently hang, which I just (after much head pounding) found was 'caused' by using the logging module. Others have documented the problem here:
https://twiki.cern.ch/twiki/bin/view/Main/PythonLoggingThreadingMultiprocessingIntermixedStudy
which points to this apparently unsolved python issue: Locks in python standard library should be sanitized on fork; http://bugs.python.org/issue6721
Alarmed at the difficulty I had tracking this down, I answered:
Are there any reasons not to mix Multiprocessing and Threading module in Python
with the rather unhelpful suggestion to 'Be careful' and links to the above.
But the lengthy discussion re: Issue 6721 suggests that it is a 'bug' to use both multiprocessing (or os.fork) and user threads in the same application. With my limited understanding of the problem, I find too much disagreement in the discussion to conclude what are the work-arounds or strategies for using both multiprocessing and threading in the same application. My immediate problem was solved by disabling logging, but I create a small handful of other (explicit) locks in both parent and child processes, and suspect I am setting myself up for further intermittent deadlocks.
Can you give practical recommendations for avoiding deadlocks when using locks and/or the logging module together with threading and multiprocessing in a Python (2.7, 3.2, 3.3) application?
You will be safe if you fork off additional processes while you still have only one thread in your program (that is, fork from the main thread before spawning any worker threads).
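A minimal sketch of that ordering (heavy_computation, the file names, and the worker count are illustrative):

import multiprocessing
import threading

def heavy_computation(path):
    # ... compute, then write the multi-GB result to `path` ...
    pass

if __name__ == '__main__':
    # Fork all workers first, while the main thread is still the only thread,
    # so no lock can be copied in a locked state.
    workers = [multiprocessing.Process(target=heavy_computation,
                                       args=('result%d.dat' % i,))
               for i in range(4)]
    for w in workers:
        w.start()

    # Only now spawn the monitoring threads; no fork happens after this point.
    def monitor(worker):
        worker.join()
        # ... read the result file back in and continue the computation ...

    monitors = [threading.Thread(target=monitor, args=(w,)) for w in workers]
    for t in monitors:
        t.start()
    for t in monitors:
        t.join()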
Your use case looks like you don't even need the multiprocessing module; you can use subprocess (or even simpler, os.system-like calls).
See also Is it safe to fork from within a thread?
I had some annoyances with spawning subprocesses, like getting correct output and so on. A wrapper library, envoy, solved most of them with an easy-to-use interface.
Using threads, I sometimes struggle with hanging processes that do not end, external programs launched within threads that I can't reach anymore, and so on.
Is there any "threading for dummies" python library out there? Thanks
Is there any "threading for dummies" python library out there?
No, there is not. threading is pretty simple to use in simple cases. You want to use it to introduce concurrency in your program, i.e. whenever you want two or more actions to happen at the same time.
This is how you can let Peter build a house and let Igor drive to Moscow at the same time:
from threading import Thread
import time

def drive_bus():
    time.sleep(1)
    print "Igor: I'm Igor and I'm driving to... Moscow!"
    time.sleep(9)
    print "Igor: Yei, Moscow!"

def build_house():
    print "Peter: Let's start building a large house..."
    time.sleep(10.1)
    print "Peter: Urks, we have no tools :-("

threads = [Thread(target=drive_bus), Thread(target=build_house)]
for t in threads:
    t.start()
for t in threads:
    t.join()
Isn't that simple? Define your function to be run in another thread. Create a threading.Thread instance with that function as target. Nothing has happened so far, until you invoke start. It fires off the thread and immediately returns.
Before letting your main thread exit, you should wait for all the threads you have spawned to finish. This is what t.join() does: it blocks and waits for the thread t to finish. Only then does it return.
I would recommend reading more about the actual Python library - it is simple enough. Your problem with hanging threads, if it is what prevents your application from exiting, may be solved by using daemon threads.
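For example, a thread marked as a daemon no longer blocks interpreter exit (reusing drive_bus from the example above):

from threading import Thread

t = Thread(target=drive_bus)
t.daemon = True  # Python may now exit even if this thread is still running
t.start()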
What kind of task are you trying to achieve? If you are trying to run a task in parallel without actual use of the custom threading, you may find the package multiprocessing useful. Furthermore, there is an interesting piece of information on the python wiki about parallel processing.
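For example, a minimal multiprocessing sketch that runs a function over several worker processes:

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)           # four worker processes
    print pool.map(square, range(10))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]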
Could you elaborate a bit more on the task please?
I am working on an implementation of a very small library in Python that has to be non-blocking.
On some production code, at some point, a call to this library will be done and it needs to do its own work, in its most simple form it would be a callable that needs to pass some information to a service.
This "passing information to a service" is a non-intensive task, probably sending some data to an HTTP service or something similar. It also doesn't need to be concurrent or to share information, however it does need to terminate at some point, possibly with a timeout.
I have used the threading module before and it seems the most appropriate thing to use, but the application where this library will be used is so big that I am worried of hitting the threading limit.
On local testing I was able to hit that limit at around ~2500 threads spawned.
There is a good possibility (given the size of the application) that I could hit that limit easily. It also makes me wary of using a Queue, given the memory implications of placing tasks in it at a high rate.
I have also looked at gevent, but I couldn't see an example of being able to spawn something that would do some work and terminate without joining. The examples I went through were calling .join() on a spawned Greenlet or on an array of greenlets.
I don't need to know the result of the work being done! It just needs to fire off and try to talk to the HTTP service and die with a sensible timeout if it didn't.
Have I misinterpreted the guides/tutorials for gevent ? Is there any other possibility to spawn a callable in fully non-blocking fashion that can't hit a ~2500 limit?
This is a simple example in Threading that does work as I would expect:
from threading import Thread

class Synchronizer(Thread):
    def __init__(self, number):
        self.number = number
        Thread.__init__(self)

    def run(self):
        # Simulating some work
        import time
        time.sleep(5)
        print self.number

for i in range(4000):  # totally doesn't get past 2,500
    sync = Synchronizer(i)
    sync.setDaemon(True)
    sync.start()
    print "spawned a thread, number %s" % i
And this is what I've tried with gevent, where it obviously blocks at the end to see what the workers did:
import gevent

def task(pid):
    """
    Some non-deterministic task
    """
    gevent.sleep(1)
    print 'Task %s done' % pid

for i in range(100):
    gevent.spawn(task, i)
EDIT:
My problem stemmed from my lack of familiarity with gevent. While the Thread code was indeed spawning threads, it also prevented the script from terminating while it did some work.
gevent doesn't really do that in the code above, unless you add a .join(). All I had to do to see the gevent code do some work with the spawned greenlets was to make it a long running process. This definitely fixes my problem as the code that needs to spawn the greenlets is done within a framework that is a long running process in itself.
Nothing requires you to call join in gevent, if you're expecting your main thread to last longer than any of your workers.
The only reason for the join call is to make sure the main thread lasts at least as long as all of the workers (so that the program doesn't terminate early).
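In other words, something like this works without any join(), as long as the main greenlet stays alive long enough (here it simply sleeps; in a long-running server it would be serving requests instead):

import gevent

def task(pid):
    gevent.sleep(1)
    print 'Task %s done' % pid

for i in range(100):
    gevent.spawn(task, i)

# Keep the main greenlet alive; the spawned greenlets run meanwhile.
gevent.sleep(2)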
Why not spawn a subprocess with a connected pipe (or similar) and then, instead of a callable, just drop your data on the pipe and let the subprocess handle it completely out of band?
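A minimal sketch of that idea (worker.py is a hypothetical script that reads one JSON record per line from stdin and talks to the HTTP service out of band):

import json
import subprocess

worker = subprocess.Popen(['python', 'worker.py'], stdin=subprocess.PIPE)

def fire_and_forget(data):
    # Drop the data on the pipe; the subprocess does the HTTP call.
    worker.stdin.write(json.dumps(data) + '\n')
    worker.stdin.flush()

fire_and_forget({'event': 'example'})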
As explained in Understanding Asynchronous/Multiprocessing in Python, the asyncoro framework supports asynchronous, concurrent processes. You can run tens or hundreds of thousands of concurrent processes; for reference, running 100,000 simple processes takes about 200 MB. If you want to, you can mix threads in the rest of the system with coroutines from asyncoro (provided the threads and coroutines don't share variables, but use coroutine interface functions to send messages, etc.).