Python multithreading program is giving unexpected output

I know that there is no guarantee regarding the order of execution of threads. But my doubt is: when I ran the code below,
import threading
def doSomething():
    print("Hello ")
d = threading.Thread(target=doSomething, args=())
d.start()
print("done")
The output that comes out is either
Hello done
or this
Hello
done
Maybe if I try enough times it might also give me this:
done
Hello
But I am not convinced by the first output. The order can be different, but how can both outputs end up on the same line? Does that mean one thread is messing up the other thread's work?

This is a classic race condition. I can't personally reproduce it, and it would likely vary by interpreter implementation and the precise configuration applied to stdout. On Python interpreters without a GIL, there is basically no protection against races, and this behavior is expected to a certain extent. Unlike C/C++, Python interpreters do tend to try to protect you from egregious data corruption due to threading, but even if they ensure every byte written ends up actually printed, they usually don't make explicit guarantees against interleaving; Hdelolnoe (an interleaving of Hello and done) would be a possible, if fairly unlikely, output when you're making no effort whatsoever to synchronize access to stdout.
On CPython, the GIL protects you more, and writing a single string to stdout is more likely to be atomic, but you're not writing a single string. Essentially, print writes its arguments one by one to the output file object as it goes; it doesn't batch them up into a single string and call write just once. What this means is that:
print("Hello ") # Implicitly outputs default end argument of '\n' after printing provided args
is roughly equivalent to:
sys.stdout.write("Hello ")
sys.stdout.write("\n")
If the underlying stack of file objects that implements sys.stdout decides to engage in real I/O in response to the first write, it will release the GIL before performing the actual write, allowing the main thread to catch up and potentially grab the GIL before the worker thread gets a chance to write the newline. The main thread then outputs done, and the newlines from each print come out in some unspecified (and irrelevant) order based on further potential races.
Assuming you're on CPython, you could probably fix this by changing the code to this equivalent code using single write calls:
import threading
import sys
def doSomething():
    sys.stdout.write("Hello \n")
d = threading.Thread(target=doSomething) # If it takes no arguments, no need to pass args
d.start()
sys.stdout.write("done\n")
and you'd be back to a race condition that only swaps the order, without interleaving (the language spec wouldn't guarantee a thing, but most reasonable implementations would be atomic for this case). If you want it to work with any guarantees without relying on the quirks of the implementation, you have to synchronize:
import threading
lck = threading.Lock()
def doSomething():
    with lck:
        print("Hello ")
d = threading.Thread(target=doSomething)
d.start()
with lck:
    print("done")

Related

Strange blocking behavior with python multiprocessing queue put() and get()

I have written a class in python 2.7 (under linux) that uses multiple processes to manipulate a database asynchronously. I encountered a very strange blocking behaviour when using multiprocessing.Queue.put() and multiprocessing.Queue.get() which I can't explain.
Here is a simplified version of what I do:
from multiprocessing import Process, Queue
import time

class MyDB(object):
    def __init__(self):
        self.inqueue = Queue()
        p1 = Process(target=self._worker_process, kwargs={"inqueue": self.inqueue})
        p1.daemon = True
        started = False
        while not started:
            try:
                p1.start()
                started = True
            except:
                time.sleep(1)
        # Sometimes I start a second, identical process but it makes no difference to my problem
        p2 = Process(target=self._worker_process, kwargs={"inqueue": self.inqueue})
        # blahblah... (same as above)
    @staticmethod
    def _worker_process(inqueue):
        while True:
            # --------------this blocks despite data having arrived------------
            op = inqueue.get(block=True)
            # do something with the specified operation
            # ---------------problem area end--------------------
            print "if this text gets printed, the problem was solved"
    def delete_parallel(self, key, rawkey=False):
        someid = ...blahblah
        # --------------this section blocked when I was posting the question but for unknown reasons it's fine now
        self.inqueue.put({"optype": "delete", "kwargs": {"key": key, "rawkey": rawkey}, "callid": someid}, block=True)
        # --------------problem area end----------------
        print "if you see this text, there was no blocking or block was released"
If I run the code above inside a test (in which I call delete_parallel on the MyDB object) then everything works, but if I run it in the context of my entire application (importing other stuff, including pygtk) strange things happen:
For some reason self.inqueue.get blocks and never releases despite self.inqueue having the data in its buffer. When I instead call self.inqueue.get(block = False, timeout = 1), the call finishes by raising Queue.Empty, despite the queue containing data. qsize() returns 1 (suggesting that data is there) while empty() returns True (suggesting that there is no data).
Now clearly there must be something somewhere else in my application that renders self.inqueue unusable by causing the acquisition of some internal semaphore. However, I don't know what to look for. Eclipse debugging becomes useless once a blocking semaphore is reached.
Edit 8 (cleaning up and summarizing my previous edits): Last time I had a similar problem, it turned out that pygtk was hijacking the global interpreter lock, but I solved it by calling gobject.threads_init() before I called anything else. Could this issue be related?
When I introduce a print "successful reception" after the get() method and execute my application in a terminal, the same behaviour happens at first. When I then terminate by pressing CTRL+D, I suddenly get the string "successful reception" in between messages. This looks to me like some other process/thread is terminated and releases the lock that blocks the process that is stuck at get().
Since the process that was stuck terminates later, I still see the message. What kind of process could externally mess with a Queue like that? self.inqueue is only accessed inside my class.
Right now it seems to come down to this queue which won't return anything despite the data being there:
the get() method seems to get stuck when it attempts to receive the actual data from some internal pipe. The last line before my debugger hangs is:
res = self._recv()
which is inside multiprocessing.queues.get()
Tracking this internal Python stuff further, I find the assignments
self._recv = self._reader.recv and self._reader, self._writer = Pipe(duplex=False).
Edit 9
I'm currently trying to hunt down the import that causes it. My application is quite complex, with hundreds of classes and each class importing a lot of other classes, so it's a pretty painful process. I have found a first candidate class which uses 3 different MyDB instances when I track all its imports (but doesn't access MyDB.inqueue at any time as far as I can tell). The strange thing is, it's basically just a wrapper and the wrapped class works just fine when imported on its own. This also means that it uses MyDB without freezing. As soon as I import the wrapper (which imports that class), I have the blocking issue.
I started rewriting the wrapper by gradually reusing the old code, testing each time I introduce a couple of new lines, until I hopefully see which line causes the problem to return.
multiprocessing.Queue uses an internal feeder thread to maintain its state. If you are using GTK without initializing threading first, it can break that thread, so you will need to call gobject.threads_init().
It should be noted that qsize() only returns an approximate size of the queue. The real size may be anywhere between 0 and the value returned by qsize().
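A minimal sketch of that ordering, assuming Python 2 with PyGTK installed (the worker body here is a stand-in for illustration, not the asker's _worker_process):
import gobject
gobject.threads_init()  # must run before any other GTK/GObject work and before the Queue is used

from multiprocessing import Process, Queue
import time

def worker(inqueue):
    # stand-in worker: just echo whatever arrives on the queue
    while True:
        op = inqueue.get(block=True)
        print "worker got:", op

if __name__ == "__main__":
    inqueue = Queue()
    p = Process(target=worker, kwargs={"inqueue": inqueue})
    p.daemon = True
    p.start()
    inqueue.put({"optype": "delete"})
    time.sleep(1)  # give the daemonized worker a moment to print before the parent exits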

how to safely pass a variables value between threads in python

I read in the Python documentation that Queue.Queue() is a safe way of passing variables between different threads. I didn't really know that there was a safety issue with multithreading. For my application, I need to develop multiple objects with variables that can be accessed from multiple different threads. Right now I just have the threads accessing the object variables directly. I won't show my code here because there's way too much of it, but here is an example to demonstrate what I'm doing.
from threading import Thread
import time
import random

class switch:
    def __init__(self, id):
        self.id = id
        self.is_on = False
    def toggle(self):
        self.is_on = not self.is_on

switches = []
for i in range(5):
    switches.append(switch(i))

def record_switch():
    switch_record = {}
    while True:
        time.sleep(10)
        current = {}
        current['time'] = time.strftime("%H:%M:%S", time.localtime(time.time()))
        for i in switches:
            current[i.id] = i.is_on
        switch_record.update(current)

def toggle_switch():
    while True:
        time.sleep(random.random() * 100)
        for i in switches:
            i.toggle()

toggle = Thread(target=toggle_switch, args=())
record = Thread(target=record_switch, args=())
toggle.start()
record.start()
So as I understand, the queue object can be used only to put and get values, which clearly won't work for me. Is what I have here "safe"? If not, how can I program this so that I can safely access a variable from multiple different threads?
Whenever you have threads modifying a value other threads can see, you are going to have safety issues. The worry is that a thread will try to modify a value when another thread is in the middle of modifying it, which has risky and undefined behavior. So no, your switch-toggling code is not safe.
The important thing to know is that changing the value of a variable is not guaranteed to be atomic. If an action is atomic, it means that action will always happen in one uninterrupted step. (This differs very slightly from the database definition.) Changing a variable's value, especially a list value, can often take multiple steps at the processor level. When you are working with threads, all of those steps are not guaranteed to happen all at once, before another thread starts working. It's entirely possible that thread A will be halfway through changing variable x when thread B suddenly takes over. Then if thread B tries to read variable x, it's not going to find a correct value. Even worse, if thread B tries to modify variable x while thread A is halfway through doing the same thing, bad things can happen. Whenever you have a variable whose value can change somehow, all accesses to it need to be made thread-safe.
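To see why that matters, here is a minimal sketch (not from the original answer) of a lost-update race on a plain counter; depending on interpreter version and timing it may or may not lose updates on any particular run, which is exactly the problem:
import threading

counter = 0

def bump(n):
    global counter
    for _ in range(n):
        counter += 1  # read-modify-write: several separate steps, not atomic

threads = [threading.Thread(target=bump, args=(100000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # often less than 400000 when updates are lost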
If you're modifying variables instead of passing messages, you should be using a Lock object.
In your case, you'd have a global Lock object at the top:
from threading import Lock
switch_lock = Lock()
Then you would surround the critical piece of code with the acquire and release functions.
for i in switches:
    switch_lock.acquire()
    current[i.id] = i.is_on
    switch_lock.release()

for i in switches:
    switch_lock.acquire()
    i.toggle()
    switch_lock.release()
Only one thread may ever acquire a lock at a time (this kind of lock, anyway). When any of the other threads try, they'll be blocked and wait for the lock to become free again. So by putting locks around critical sections of code, you make it impossible for more than one thread to look at, or modify, a given switch at any time. You can put this around any bit of code you want to be kept exclusive to one thread at a time.
EDIT: as martineau pointed out, locks are integrated well with the with statement, if you're using a version of Python that has it. This has the added benefit of automatically unlocking if an exception happens. So instead of the above acquire and release system, you can just do this:
for i in switches:
    with switch_lock:
        i.toggle()

Python threading.thread.start() doesn't return control to main thread

I'm trying to write a program that executes a piece of code in such a way that the user can stop its execution at any time without stopping the main program. I thought I could do this using threading.Thread, but then I ran the following code in IDLE (Python 3.3):
from threading import *
import math
def f():
    eval("math.factorial(1000000000)")
t = Thread(target = f)
t.start()
The last line doesn't return: I eventually restarted the shell. Is this a consequence of the Global Interpreter Lock, or am I doing something wrong? I didn't see anything specific to this problem in the threading documentation (http://docs.python.org/3/library/threading.html)
I tried to do the same thing using a process:
from multiprocessing import *
import math
def f():
    eval("math.factorial(1000000000)")
p = Process(target = f)
p.start()
p.is_alive()
The last line returns False, even though I ran it only a few seconds after I started the process! Based on my processor usage, I am forced to conclude that the process never started in the first place. Can somebody please explain what I am doing wrong here?
Thread.start() never returns! Could this have something to do with the C implementation of the math library?
As @eryksun pointed out in the comments: math.factorial() is implemented as a C function that doesn't release the GIL, so no other Python code may run until it returns.
Note: the multiprocessing version should work as is: each Python process has its own GIL.
factorial(1000000000) has billions of digits. Try import time; time.sleep(10) as a dummy calculation instead.
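As an illustration (a minimal sketch, not from the original answer), a worker that spends its time in a GIL-releasing call such as time.sleep() lets start() return right away:
import threading
import time

def f():
    time.sleep(10)  # stands in for a long task that releases the GIL

t = threading.Thread(target=f)
t.start()
print("start() returned; worker alive:", t.is_alive())
t.join()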
If you have issues with multithreaded code in IDLE then try the same code from the command line, to make sure that the error persists.
If p.is_alive() returns False after p.start() is already called then it might mean that there is an error in f() function e.g., MemoryError.
On my machine, p.is_alive() returns True and one of the CPUs is at 100% if I paste your code from the question into a Python shell.
Unrelated: remove wildcard imports such as from multiprocessing import *. They may shadow other names in your code so that you can't be sure what a given name means; e.g., threading could define an eval function (it doesn't, but it could) with similar but different semantics that might break your code silently.
I want my program to be able to handle ridiculous inputs from the user gracefully
If you pass user input directly to eval() then the user can do anything.
Is there any way to get a process to print, say, an error message without constructing a pipe or other similar structure?
It is ordinary Python code:
print(message) # works
The difference is that if several processes run print() then the output might be garbled. You could use a lock to synchronize print() calls.
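A minimal sketch of that, using only the standard library (the worker function and message text are made up for illustration):
from multiprocessing import Process, Lock

def report(lock, name):
    with lock:  # only one process writes at a time, so lines don't interleave
        print("message from", name)

if __name__ == "__main__":
    lock = Lock()
    procs = [Process(target=report, args=(lock, "worker-%d" % i)) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()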

Asynchronous subprocess on Windows

First of all, the overall problem I am solving is a bit more complicated than I am showing here, so please do not tell me 'use threads with blocking' as it would not solve my actual situation without a fair, FAIR bit of rewriting and refactoring.
I have several applications which are not mine to modify, which take data from stdin and poop it out on stdout after doing their magic. My task is to chain several of these programs. Problem is, sometimes they choke, and as such I need to track their progress which is outputted on STDERR.
pA = subprocess.Popen(CommandA, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# ... some more processes make up the chain, but that is irrelevant to the problem
pB = subprocess.Popen(CommandB, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=pA.stdout )
Now, reading directly through pA.stdout.readline() and pB.stdout.readline(), or the plain read() functions, is a blocking matter. Since different applications output at different paces and in different formats, blocking is not an option. (And as I wrote above, threading is not an option unless as a last, last resort.) pA.communicate() is deadlock safe, but since I need the information live, that is not an option either.
Thus google brought me to this asynchronous subprocess snippet on ActiveState.
All good at first, until I implement it. Comparing the cmd.exe output of pA.exe | pB.exe, ignoring the fact that both output to the same window and make a mess, I see practically instantaneous updates. However, when I implement the same thing using the above snippet and the read_some() function declared there, it takes over 10 seconds to notify updates of a single pipe. But when it does, it has updates leading all the way up to 40% progress, for example.
Thus I do some more research, and see numerous subjects concerning PeekNamedPipe, anonymous handles, and returning 0 bytes available even though there is information available in the pipe. As the subject has proven quite a bit beyond my expertise to fix or code around, I come to Stack Overflow to look for guidance. :)
My platform is W7 64-bit with Python 2.6, the applications are 32-bit in case it matters, and compatibility with Unix is not a concern. I can even deal with a full ctypes or pywin32 solution that subverts subprocess entirely if it is the only solution, as long as I can read from every stderr pipe asynchronously with immediate performance and no deadlocks. :)
How bad is it to have to use threads? I encountered much the same problem and eventually decided to use threads to gather up all the data on a sub-process's stdout and stderr and put it onto a thread-safe queue which the main thread can read in a blocking fashion, without having to worry about the threading going on behind the scenes.
It's not clear what trouble you anticipate with a solution based on threads and blocking. Are you worried about having to make the rest of your code thread-safe? That shouldn't be an issue since the IO thread wouldn't need to interact with any of the rest of your code or data. If you have very restrictive memory requirements or your pipeline is particularly long then perhaps you may feel unhappy about spawning so many threads. I don't know enough about your situation so I couldn't say if this is likely to be a problem, but it seems to me that since you're already spawning off extra processes a few threads to interact with them should not be a terrible burden. In my situation I have not found these IO threads to be particularly problematic.
My thread function looked something like this:
import Queue  # Python 2 stdlib module, needed for the Queue.Full exception below

def simple_io_thread(pipe, queue, tag, stop_event):
    """
    Read line-by-line from pipe, writing (tag, line) to the
    queue. Also checks for a stop_event to give up before
    the end of the stream.
    """
    while True:
        line = pipe.readline()
        while True:
            try:
                # Post to the queue with a large timeout in case the
                # queue is full.
                queue.put((tag, line), block=True, timeout=60)
                break
            except Queue.Full:
                if stop_event.isSet():
                    break
                continue
        if stop_event.isSet() or line == "":
            break
    pipe.close()
When I start up the subprocess I do this:
outputqueue = Queue.Queue(50)
stop_event = threading.Event()
process = subprocess.Popen(
    command,
    cwd=workingdir,
    env=env,
    shell=useshell,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE)
stderr_thread = threading.Thread(
    target=simple_io_thread,
    args=(process.stderr, outputqueue, "STDERR", stop_event)
)
stdout_thread = threading.Thread(
    target=simple_io_thread,
    args=(process.stdout, outputqueue, "STDOUT", stop_event)
)
stderr_thread.daemon = True
stdout_thread.daemon = True
stderr_thread.start()
stdout_thread.start()
Then when I want to read I can just block on outputqueue - each item read from it is a (tag, line) pair: a string identifying which pipe it came from and a line of text from that pipe. Very little code runs in a separate thread, and it only communicates with the main thread via a thread-safe queue (plus an event in case I need to give up early). Perhaps this approach would be useful and allow you to solve the problem with threads and blocking but without having to rewrite lots of code?
(My solution is made more complicated because I sometimes wish to terminate the subprocesses early, and want to be sure that the threads will all finish. If that's not an issue you can get rid of all the stop_event stuff and it becomes pretty succinct.)
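For illustration, here is a minimal sketch of that consuming side (not part of the original answer; it assumes the outputqueue and the two IO threads set up above, with stop_event never set):
eof_count = 0
while eof_count < 2:               # expect one EOF marker per pipe (stdout and stderr)
    tag, line = outputqueue.get()  # blocks until one of the IO threads posts an item
    if line == "":                 # readline() returns "" at end of stream
        eof_count += 1
        continue
    print "%s: %s" % (tag, line.rstrip())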
I assume that the process pipeline will not deadlock if it only uses stdin and stdout; and the problem you're trying to solve is how to make it not deadlock if they write to stderr (and have to deal with stderr possibly getting backed up).
If you're letting multiple processes write to stderr, you have to watch out for their output being intermingled. I'm guessing you have that sorted somehow; just putting it out there to be sure.
Be aware of the -u flag to python; it is helpful when testing to see if OS buffering is screwing you up.
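For example (a hedged sketch; child_script.py is a made-up name), when a child process happens to be a Python program you can launch it with -u so its output is unbuffered:
import subprocess
import sys

# -u disables the child interpreter's stdout/stderr buffering, so progress
# lines reach the pipe immediately instead of sitting in a buffer
child = subprocess.Popen(
    [sys.executable, "-u", "child_script.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE)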
If you want to emulate select() on file handles in win32, your only choice is to use PeekNamedPipe() and friends. I have a snippet of code that reads line-oriented output from multiple processes at once, which you may even be able to use directly -- try passing the list of proc.stderr handles to it and go.
import msvcrt
import os
import time

import pywintypes
import win32pipe

class NoLineError(Exception): pass
class NoMoreLineError(Exception): pass

class LineReader(object):
    """Helper class for multi_readlines."""
    def __init__(self, f):
        self.fd = f.fileno()
        self.osf = msvcrt.get_osfhandle(self.fd)
        self.buf = ''

    def getline(self):
        """Returns a line of text, or raises NoLineError, or NoMoreLineError."""
        try:
            _, avail, _ = win32pipe.PeekNamedPipe(self.osf, 0)
            bClosed = False
        except pywintypes.error:
            avail = 0
            bClosed = True
        if avail:
            self.buf += os.read(self.fd, avail)
        idx = self.buf.find('\n')
        if idx >= 0:
            ret, self.buf = self.buf[:idx+1], self.buf[idx+1:]
            return ret
        elif bClosed:
            if self.buf:
                ret, self.buf = self.buf, None
                return ret
            else:
                raise NoMoreLineError
        else:
            raise NoLineError

def multi_readlines(fs, timeout=0):
    """Read lines from |fs|, a list of file objects.
    The lines come out in arbitrary order, depending on which files
    have output available first."""
    if type(fs) not in (list, tuple):
        raise Exception("argument must be a list.")
    objs = [LineReader(f) for f in fs]
    for i, obj in enumerate(objs): obj._index = i
    while objs:
        yielded = 0
        for i, obj in enumerate(objs):
            try:
                yield (obj._index, obj.getline())
                yielded += 1
            except NoLineError:
                #time.sleep(timeout)
                pass
            except NoMoreLineError:
                del objs[i]
                break  # Because we mutated the array
        if not yielded:
            time.sleep(timeout)
            pass
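A minimal usage sketch (not from the original answer), assuming pA and pB were created with stderr=subprocess.PIPE as in the question:
for which, line in multi_readlines([pA.stderr, pB.stderr], timeout=0.1):
    name = ("CommandA", "CommandB")[which]  # the index matches the position in the list passed in
    print "%s stderr: %s" % (name, line.rstrip())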
I have never seen the "Peek returns 0 bytes even though data is available" issue myself. If this happens to others, I bet their libc is buffering their stdout/stderr before sending the data to the OS; there is nothing you can do about that from outside. You have to make the app use unbuffered output somehow (-u to python; win32/libc calls to modify the stderr file handle, ...)
The fact that you are seeing nothing, then a ton of updates, makes me think that your problem is buffering on the source end. win32 libc may buffer differently if it writes to a pipe rather than a console. Again, the best you can do from outside those programs is to aggressively drain their output.
What about using Twisted's FD's? http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.fdesc.html
It's not asynchronous but it is non-blocking. For asynchronous stuff, can you port to using Twisted?

Are Python built-in containers thread-safe?

I would like to know if the Python built-in containers (list, vector, set...) are thread-safe? Or do I need to implement a locking/unlocking environment for my shared variable?
You need to implement your own locking for all shared variables that will be modified in Python. You don't have to worry about reading from variables that won't be modified (i.e., concurrent reads are ok), so immutable types (frozenset, tuple, str) are probably safe, but it wouldn't hurt to lock them anyway. For things you're going to be changing - list, set, dict, and most other objects - you should have your own locking mechanism (while in-place operations are ok on most of these, threads can lead to super-nasty bugs - you might as well implement locking, it's pretty easy).
By the way, I don't know if you know this, but locking is very easy in Python - create a threading.Lock object, and then you can acquire/release it like this:
import threading

list1Lock = threading.Lock()

with list1Lock:
    # change or read from the list here
    pass
# continue doing other stuff (the lock is released when you leave the with block)
In Python 2.5, do from __future__ import with_statement; Python 2.4 and before don't have this, so you'll want to put the acquire()/release() calls in try:...finally: blocks:
import threading

list1Lock = threading.Lock()

list1Lock.acquire()  # acquire before the try so a failed acquire can't trigger release()
try:
    # change or read from the list here
    pass
finally:
    list1Lock.release()
# continue doing other stuff (the lock has been released by the finally block)
Some very good information about thread synchronization in Python.
Yes, but you still need to be careful of course
For example:
If two threads are racing to pop() from a list with only one item, one thread will get the item successfully and the other will get an IndexError
Code like this is not thread-safe
if L:
    item = L.pop()  # L might be empty by the time this line gets executed
You should write it like this
try:
    item = L.pop()
except IndexError:
    # No items left
    pass
They are thread-safe as long as you don't disable the GIL in C code for the thread.
The queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be exchanged safely between multiple threads. The Queue class in this module implements all the required locking semantics.
https://docs.python.org/3/library/queue.html
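For completeness, a minimal producer/consumer sketch using queue.Queue (Python 3; the None sentinel is just a made-up way to tell the consumer to stop):
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(5):
        q.put(i)        # put() does its own locking
    q.put(None)         # sentinel: tells the consumer to stop

def consumer():
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        print("got", item)

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()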
