Strange blocking behavior with python multiprocessing queue put() and get()

Strange blocking behavior with python multiprocessing queue put() and get() - python

I have written a class in python 2.7 (under linux) that uses multiple processes to manipulate a database asynchronously. I encountered a very strange blocking behaviour when using multiprocessing.Queue.put() and multiprocessing.Queue.get() which I can't explain.
Here is a simplified version of what I do:
from multiprocessing import Process, Queue
class MyDB(object):
def __init__(self):
self.inqueue = Queue()
p1 = Process(target = self._worker_process, kwargs={"inqueue": self.inqueue})
p1.daemon = True
started = False
while not started:
try:
p1.start()
started = True
except:
time.sleep(1)
#Sometimes I start a same second process but it makes no difference to my problem
p2 = Process(target = self._worker_process, kwargs={"inqueue": self.inqueue})
#blahblah... (same as above)
#staticmethod
def _worker_process(inqueue):
while True:
#--------------this blocks depite data having arrived------------
op = inqueue.get(block = True)
#do something with specified operation
#---------------problem area end--------------------
print "if this text gets printed, the problem was solved"
def delete_parallel(self, key, rawkey = False):
someid = ...blahblah
#--------------this section blocked when I was posting the question but for unknown reasons it's fine now
self.inqueue.put({"optype": "delete", "kwargs": {"key":key, "rawkey":rawkey}, "callid": someid}, block = True)
#--------------problem area end----------------
print "if you see this text, there was no blocking or block was released"
If I run the code above inside a test (in which I call delete_parallel on the MyDB object) then everything works, but if I run it in context of my entire application (importing other stuff, inclusive pygtk) strange things happen:
For some reason self.inqueue.get blocks and never releases despite self.inqueue having the data in its buffer. When I instead call self.inqueue.get(block = False, timeout = 1) then the call finishes by raising Queue.Empty, despite the queue containing data. qsize() returns 1 (suggests that data is there) while empty() returns True (suggests that there is no data).
Now clearly there must be something somewhere else in my application that renders self.inqueue unusable by causing acquisition of some internal semaphore. However I don't know what to look for. Eclipse dubugging becomes useless once a blocking semaphore is reached.
Edit 8 (cleaning up and summarizing my previous edits) Last time I had a similar problem, it turned out that pygtk was hijacking the global interpreter lock, but I solved it by calling gobject.threads_init() before I called anything else. Could this issue be related?
When I introduce a print "successful reception" after the get() method and execute my application in terminal, the same behaviour happens at first. When I then terminate by pressing CTRL+D I suddenly get the string "successful reception" inbetween messages. This looks to me like some other process/thread is terminated and releases the lock that blocks the process that is stuck at get().
Since the process that was stuck terminates later, I still see the message. What kind of process could externally mess with a Queue like that? self.inqueue is only accessed inside my class.
Right now it seems to come down to this queue which won't return anything despite the data being there:
the get() method seems to get stuck when it attempts to receive the actual data from some internal pipe. The last line before my debugger hangs is:
res = self._recv()
which is inside of multiprocessing.queues.get()
Tracking this internal python stuff further I find the assignments
self._recv = self._reader.recv and self._reader, self._writer = Pipe(duplex=False).
Edit 9
I'm currently trying to hunt down the import that causes it. My application is quite complex with hundreds of classes and each class importing a lot of other classes, so it's a pretty painful process. I have found a first candidate class which Uses 3 different MyDB instances when I track all its imports (but doesn't access MyDB.inqueue at any time as far as I can tell). The strange thing is, it's basically just a wrapper and the wrapped class works just fine when imported on its own. This also means that it uses MyDB without freezing. As soon as I import the wrapper (which imports that class), I have the blocking issue.
I started rewriting the wrapper by gradually reusing the old code. I'm testing each time I introduce a couple of new lines until I will hopefully see which line will cause the problem to return.

queue.Queue uses internal threads to maintain its state. If you are using GTK then it will break these threads. So you will need to call gobject.init_threads().
It should be noted that qsize() only returns an approximate size of the queue. The real size may be anywhere between 0 and the value returned by qsize().

Related

how to safely pass a variables value between threads in python

I read on the python documentation that Queue.Queue() is a safe way of passing variables between different threads. I didn't really know that there was a safety issue with multithreading. For my application, I need to develop multiple objects with variables that can be accessed from multiple different threads. Right now I just have the threads accessing the object variables directly. I wont show my code here because there's way too much of it, but here is an example to demonstrate what I'm doing.
from threading import Thread
import time
import random
class switch:
def __init__(self,id):
self.id=id
self.is_on = False
def self.toggle():
self.is_on = not self.is_on
switches = []
for i in range(5):
switches[i] = switch(i)
def record_switch():
switch_record = {}
while True:
time.sleep(10)
current = {}
current['time'] = time.srftime(time.time())
for i in switches:
current[i.id] = i.is_on
switch_record.update(current)
def toggle_switch():
while True:
time.sleep(random.random()*100)
for i in switches:
i.toggle()
toggle = Thread(target=toggle_switch(), args = ())
record = Thread(target=record_switch(), args = ())
toggle.start()
record.start()
So as I understand, the queue object can be used only to put and get values, which clearly won't work for me. Is what I have here "safe"? If not, how can I program this so that I can safely access a variable from multiple different threads?

Whenever you have threads modifying a value other threads can see, then you are going to have safety issues. The worry is that a thread will try to modify a value when another thread is in the middle of modifying it, which has risky and undefined behavior. So no, your switch-toggling code is not safe.
The important thing to know is that changing the value of a variable is not guaranteed to be atomic. If an action is atomic, it means that action will always happen in one uninterrupted step. (This differs very slightly from the database definition.) Changing a variable value, especially a list value, can often times take multiple steps on the processor level. When you are working with threads, all of those steps are not guaranteed to happen all at once, before another thread starts working. It's entirely possible that thread A will be halfway through changing variable x when thread B suddenly takes over. Then if thread B tries to read variable x, it's not going to find a correct value. Even worse, if thread B tries to modify variable x while thread A is halfway through doing the same thing, bad things can happen. Whenever you have a variable whose value can change somehow, all accesses to it need to be made thread-safe.
If you're modifying variables instead of passing messages, you should be using aLockobject.
In your case, you'd have a global Lock object at the top:
from threading import Lock
switch_lock = Lock()
Then you would surround the critical piece of code with the acquire and release functions.
for i in switches:
switch_lock.acquire()
current[i.id] = i.is_on
switch_lock.release()
for i in switches:
switch_lock.acquire()
i.toggle()
switch_lock.release()
Only one thread may ever acquire a lock at a time (this kind of lock, anyway). When any of the other threads try, they'll be blocked and wait for the lock to become free again. So by putting locks around critical sections of code, you make it impossible for more than one thread to look at, or modify, a given switch at any time. You can put this around any bit of code you want to be kept exclusive to one thread at a time.
EDIT: as martineau pointed out, locks are integrated well with the with statement, if you're using a version of Python that has it. This has the added benefit of automatically unlocking if an exception happens. So instead of the above acquire and release system, you can just do this:
for i in switches:
with switch_lock:
i.toggle()

Python threading.thread.start() doesn't return control to main thread

I'm trying to a program that executes a piece of code in such a way that the user can stop its execution at any time without stopping the main program. I thought I could do this using threading.Thread, but then I ran the following code in IDLE (Python 3.3):
from threading import *
import math
def f():
eval("math.factorial(1000000000)")
t = Thread(target = f)
t.start()
The last line doesn't return: I eventually restarted the shell. Is this a consequence of the Global Interpreter Lock, or am I doing something wrong? I didn't see anything specific to this problem in the threading documentation (http://docs.python.org/3/library/threading.html)
I tried to do the same thing using a process:
from multiprocessing import *
import math
def f():
eval("math.factorial(1000000000)")
p = Process(target = f)
p.start()
p.is_alive()
The last line returns False, even though I ran it only a few seconds after I started the process! Based on my processor usage, I am forced to conclude that the process never started in the first place. Can somebody please explain what I am doing wrong here?

Thread.start() never returns! Could this have something to do with the C implementation of the math library?
As #eryksun pointed out in the comment: math.factorial() is implemented as a C function that doesn't release GIL so no other Python code may run until it returns.
Note: multiprocessing version should work as is: each Python process has its own GIL.
factorial(1000000000) has hundreds millions of digits. Try import time; time.sleep(10) as dummy calculation instead.
If you have issues with multithreaded code in IDLE then try the same code from the command line, to make sure that the error persists.
If p.is_alive() returns False after p.start() is already called then it might mean that there is an error in f() function e.g., MemoryError.
On my machine, p.is_alive() returns True and one of cpus is at 100% if I paste your code from the question into Python shell.
Unrelated: remove wildcard imports such as from multiprocessing import *. They may shadow other names in your code so that you can't be sure what a given name means e.g., threading could define eval function (it doesn't but it could) with a similar but different semantics that might break your code silently.
I want my program to be able to handle ridiculous inputs from the user gracefully
If you pass user input directly to eval() then the user can do anything.
Is there any way to get a process to print, say, an error message without constructing a pipe or other similar structure?
It is an ordinary Python code:
print(message) # works
The difference is that if several processes run print() then the output might be garbled. You could use a lock to synchronize print() calls.

How to use multiprocessing with class instances in Python?

I am trying to create a class than can run a separate process to go do some work that takes a long time, launch a bunch of these from a main module and then wait for them all to finish. I want to launch the processes once and then keep feeding them things to do rather than creating and destroying processes. For example, maybe I have 10 servers running the dd command, then I want them all to scp a file, etc.
My ultimate goal is to create a class for each system that keeps track of the information for the system in which it is tied to like IP address, logs, runtime, etc. But that class must be able to launch a system command and then return execution back to the caller while that system command runs, to followup with the result of the system command later.
My attempt is failing because I cannot send an instance method of a class over the pipe to the subprocess via pickle. Those are not pickleable. I therefore tried to fix it various ways but I can't figure it out. How can my code be patched to do this? What good is multiprocessing if you can't send over anything useful?
Is there any good documentation of multiprocessing being used with class instances? The only way I can get the multiprocessing module to work is on simple functions. Every attempt to use it within a class instance has failed. Maybe I should pass events instead? I don't understand how to do that yet.
import multiprocessing
import sys
import re
class ProcessWorker(multiprocessing.Process):
"""
This class runs as a separate process to execute worker's commands in parallel
Once launched, it remains running, monitoring the task queue, until "None" is sent
"""
def __init__(self, task_q, result_q):
multiprocessing.Process.__init__(self)
self.task_q = task_q
self.result_q = result_q
return
def run(self):
"""
Overloaded function provided by multiprocessing.Process. Called upon start() signal
"""
proc_name = self.name
print '%s: Launched' % (proc_name)
while True:
next_task_list = self.task_q.get()
if next_task is None:
# Poison pill means shutdown
print '%s: Exiting' % (proc_name)
self.task_q.task_done()
break
next_task = next_task_list[0]
print '%s: %s' % (proc_name, next_task)
args = next_task_list[1]
kwargs = next_task_list[2]
answer = next_task(*args, **kwargs)
self.task_q.task_done()
self.result_q.put(answer)
return
# End of ProcessWorker class
class Worker(object):
"""
Launches a child process to run commands from derived classes in separate processes,
which sit and listen for something to do
This base class is called by each derived worker
"""
def __init__(self, config, index=None):
self.config = config
self.index = index
# Launce the ProcessWorker for anything that has an index value
if self.index is not None:
self.task_q = multiprocessing.JoinableQueue()
self.result_q = multiprocessing.Queue()
self.process_worker = ProcessWorker(self.task_q, self.result_q)
self.process_worker.start()
print "Got here"
# Process should be running and listening for functions to execute
return
def enqueue_process(target): # No self, since it is a decorator
"""
Used to place an command target from this class object into the task_q
NOTE: Any function decorated with this must use fetch_results() to get the
target task's result value
"""
def wrapper(self, *args, **kwargs):
self.task_q.put([target, args, kwargs]) # FAIL: target is a class instance method and can't be pickled!
return wrapper
def fetch_results(self):
"""
After all processes have been spawned by multiple modules, this command
is called on each one to retreive the results of the call.
This blocks until the execution of the item in the queue is complete
"""
self.task_q.join() # Wait for it to to finish
return self.result_q.get() # Return the result
#enqueue_process
def run_long_command(self, command):
print "I am running number % as process "%number, self.name
# In here, I will launch a subprocess to run a long-running system command
# p = Popen(command), etc
# p.wait(), etc
return
def close(self):
self.task_q.put(None)
self.task_q.join()
if __name__ == '__main__':
config = ["some value", "something else"]
index = 7
workers = []
for i in range(5):
worker = Worker(config, index)
worker.run_long_command("ls /")
workers.append(worker)
for worker in workers:
worker.fetch_results()
# Do more work... (this would actually be done in a distributor in another class)
for worker in workers:
worker.close()
Edit: I tried to move the ProcessWorker class and the creation of the multiprocessing queues outside of the Worker class and then tried to manually pickle the worker instance. Even that doesn't work and I get an error
RuntimeError: Queue objects should only be shared between processes
through inheritance
. But I am only passing references of those queues into the worker instance?? I am missing something fundamental. Here is the modified code from the main section:
if __name__ == '__main__':
config = ["some value", "something else"]
index = 7
workers = []
for i in range(1):
task_q = multiprocessing.JoinableQueue()
result_q = multiprocessing.Queue()
process_worker = ProcessWorker(task_q, result_q)
worker = Worker(config, index, process_worker, task_q, result_q)
something_to_look_at = pickle.dumps(worker) # FAIL: Doesn't like queues??
process_worker.start()
worker.run_long_command("ls /")

So, the problem was that I was assuming that Python was doing some sort of magic that is somehow different from the way that C++/fork() works. I somehow thought that Python only copied the class, not the whole program into a separate process. I seriously wasted days trying to get this to work because all of the talk about pickle serialization made me think that it actually sent everything over the pipe. I knew that certain things could not be sent over the pipe, but I thought my problem was that I was not packaging things up properly.
This all could have been avoided if the Python docs gave me a 10,000 ft view of what happens when this module is used. Sure, it tells me what the methods of multiprocess module does and gives me some basic examples, but what I want to know is what is the "Theory of Operation" behind the scenes! Here is the kind of information I could have used. Please chime in if my answer is off. It will help me learn.
When you run start a process using this module, the whole program is copied into another process. But since it is not the "__main__" process and my code was checking for that, it doesn't fire off yet another process infinitely. It just stops and sits out there waiting for something to do, like a zombie. Everything that was initialized in the parent at the time of calling multiprocess.Process() is all set up and ready to go. Once you put something in the multiprocess.Queue or shared memory, or pipe, etc. (however you are communicating), then the separate process receives it and gets to work. It can draw upon all imported modules and setup just as if it was the parent. However, once some internal state variables change in the parent or separate process, those changes are isolated. Once the process is spawned, it now becomes your job to keep them in sync if necessary, either through a queue, pipe, shared memory, etc.
I threw out the code and started over, but now I am only putting one extra function out in the ProcessWorker, an "execute" method that runs a command line. Pretty simple. I don't have to worry about launching and then closing a bunch of processes this way, which has caused me all kinds of instability and performance issues in the past in C++. When I switched to launching processes at the beginning and then passing messages to those waiting processes, my performance improved and it was very stable.
BTW, I looked at this link to get help, which threw me off because the example made me think that methods were being transported across the queues: http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html
The second example of the first section used "next_task()" that appeared (to me) to be executing a task received via the queue.

Instead of attempting to send a method itself (which is impractical), try sending a name of a method to execute.
Provided that each worker runs the same code, it's a matter of a simple getattr(self, task_name).
I'd pass tuples (task_name, task_args), where task_args were a dict to be directly fed to the task method:
next_task_name, next_task_args = self.task_q.get()
if next_task_name:
task = getattr(self, next_task_name)
answer = task(**next_task_args)
...
else:
# poison pill, shut down
break

REF: https://stackoverflow.com/a/14179779
Answer on Jan 6 at 6:03 by David Lynch is not factually correct when he says that he was misled by
http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html.
The code and examples provided are correct and work as advertised. next_task() is executing a task received via the queue -- try and understand what the Task.__call__() method is doing.
In my case what, tripped me up was syntax errors in my implementation of run(). It seems that the sub-process will not report this and just fails silently -- leaving things stuck in weird loops! Make sure you have some kind of syntax checker running e.g. Flymake/Pyflakes in Emacs.
Debugging via multiprocessing.log_to_stderr()F helped me narrow down the problem.

What is the "correct" way to make a stoppable thread in Python, given stoppable pseudo-atomic units of work?

I'm writing a threaded program in Python. This program is interrupted very frequently, by user (CRTL+C) interaction, and by other programs sending various signals, all of which should stop thread operation in various ways. The thread does a bunch of units of work (I call them "atoms") in sequence.
Each atom can be stopped quickly and safely, so making the thread itself stop is fairly trivial, but my question is: what is the "right", or canonical way to implement a stoppable thread, given stoppable, pseudo-atomic pieces of work to be done?
Should I poll a stop_at_next_check flag before each atom (example below)? Should I decorate each atom with something that does the flag-checking (basically the same as the example, but hidden in a decorator)? Or should I use some other technique I haven't thought of?
Example (simple stopped-flag checking):
class stoppable(Thread):
stop_at_next_check = False
current_atom = None
def __init__(self):
Thread.__init__(self)
def do_atom(self, atom):
if self.stop_at_next_check:
return False
self.current_atom = atom
self.current_atom.do_work()
return True
def run(self):
#get "work to be done" objects atom1, atom2, etc. from somewhere
if not do_atom(atom1):
return
if not do_atom(atom2):
return
#...etc
def die(self):
self.stop_at_next_check = True
self.current_atom.stop()

Flag checking seems right, but you missed an occasion to simplify it by using a list for atoms. If you put atoms in a list, you can use a single for loop without needing a do_atom() method, and the problem of where to do the check solves itself.
def run(self):
atoms = # get atoms
for atom in atoms:
if self.stop_at_next_check:
break
self.current_atom = atom
atom.do_work()

Create a "thread x should continue processing" flag, and when you're done with the thread, set the flag to false.
Killing a thread directly is considered bad form, because you might get a fractional chunk of work completed.

A tad late but I have created a small library, ants, solving this problem. In your example an atomic unit is represented by an worker
Example
from ants import worker
#worker
def hello():
print(“hello world”)
t = hello.start()
...
t.stop()
In above example hello() will run in a separate thread being called in a while True: loop thus spitting out “hello world” as fast as possible
You can also have triggering events , e.g. in above replace hello.start() with hello.start(lambda: time.sleep(5)) and you will have it trigger every 5:th second
The library is very new and work is ongoing on GitHub https://github.com/fa1k3n/ants.git
Future work includes adding a colony for having several workers working on different parts of same data, also planning on a queen for worker communication and control, like synch

Using the Queue class in Python 2.6

Let's assume I'm stuck using Python 2.6, and can't upgrade (even if that would help). I've written a program that uses the Queue class. My producer is a simple directory listing. My consumer threads pull a file from the queue, and do stuff with it. If the file has already been processed, I skip it. The processed list is generated before all of the threads are started, so it isn't empty.
Here's some pseudo-code.
import Queue, sys, threading
processed = []
def consumer():
while True:
file = dirlist.get(block=True)
if file in processed:
print "Ignoring %s" % file
else:
# do stuff here
dirlist.task_done()
dirlist = Queue.Queue()
for f in os.listdir("/some/dir"):
dirlist.put(f)
max_threads = 8
for i in range(max_threads):
thr = Thread(target=consumer)
thr.start()
dirlist.join()
The strange behavior I'm getting is that if a thread encounters a file that's already been processed, the thread stalls out and waits until the entire program ends. I've done a little bit of testing, and the first 7 threads (assuming 8 is the max) stop, while the 8th thread keeps processing, one file at a time. But, by doing that, I'm losing the entire reason for threading the application.
Am I doing something wrong, or is this the expected behavior of the Queue/threading classes in Python 2.6?

I tried running your code, and did not see the behavior you describe. However, the program never exits. I recommend changing the .get() call as follows:
try:
file = dirlist.get(True, 1)
except Queue.Empty:
return
If you want to know which thread is currently executing, you can import the thread module and print thread.get_ident().
I added the following line after the .get():
print file, thread.get_ident()
and got the following output:
bin 7116328
cygdrive 7116328
cygwin.bat 7149424
cygwin.ico 7116328
dev etc7598568
7149424
fix 7331000
home 7116328lib
7598568sbin
7149424Thumbs.db
7331000
tmp 7107008
usr 7116328
var 7598568proc
7441800
The output is messy because the threads are writing to stdout at the same time. The variety of thread identifiers further confirms that all of the threads are running.
Perhaps something is wrong in the real code or your test methodology, but not in the code you posted?

Since this problem only manifests itself when finding a file that's already been processed, it seems like this is something to do with the processed list itself. Have you tried implementing a simple lock? For example:
processed = []
processed_lock = threading.Lock()
def consumer():
while True:
with processed_lock.acquire():
fileInList = file in processed
if fileInList:
# ... et cetera
Threading tends to cause the strangest bugs, even if they seem like they "shouldn't" happen. Using locks on shared variables is the first step to make sure you don't end up with some kind of race condition that could cause threads to deadlock.
Of course, if what you're doing under # do stuff here is CPU-intensive, then Python will only run code from one thread at a time anyway, due to the Global Interpreter Lock. In that case, you may want to switch to the multiprocessing module - it's very similar to threading, though you will need to replace shared variables with another solution (see here for details).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.