What is the "critical section" of a thread (in Python)?
A thread enters the critical section by calling the acquire() method, which can either be blocking or non-blocking. A thread exits the critical section, by calling the release() method.
- Understanding Threading in Python, Linux Gazette
Also, what is the purpose of a lock?
Other people have given very nice definitions. Here's the classic example:
import threading
account_balance = 0 # The "resource" that zenazn mentions.
account_balance_lock = threading.Lock()
def change_account_balance(delta):
    global account_balance
    with account_balance_lock:
        # Critical section is within this block.
        account_balance += delta
Let's say that the += operator consists of three subcomponents:
Read the current value
Add the RHS to that value
Write the accumulated value back to the LHS (technically bind it in Python terms)
If you don't have the with account_balance_lock statement and you execute two change_account_balance calls in parallel you can end up interleaving the three subcomponent operations in a hazardous manner. Let's say you simultaneously call change_account_balance(100) (AKA pos) and change_account_balance(-100) (AKA neg). This could happen:
pos = threading.Thread(target=change_account_balance, args=[100])
neg = threading.Thread(target=change_account_balance, args=[-100])
pos.start(), neg.start()
pos: read current value -> 0
neg: read current value -> 0
pos: add delta to read value -> 100
neg: add delta to read value -> -100
pos: write current value -> account_balance = 100
neg: write current value -> account_balance = -100
Because you didn't force the operations to happen in discrete chunks you can have three possible outcomes (-100, 0, 100).
The with [lock] statement is a single, indivisible operation that says, "Let me be the only thread executing this block of code. If something else is executing, it's cool -- I'll wait." This ensures that the updates to the account_balance are "thread-safe" (parallelism-safe).
Note: There is a caveat to this schema: you have to remember to acquire the account_balance_lock (via with) every time you want to manipulate the account_balance for the code to remain thread-safe. There are ways to make this less fragile, but that's the answer to a whole other question.
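To make the locked version concrete, here is a minimal sketch that hammers the balance from many threads at once. The run_demo helper and its pairs/repeats parameters are made up for illustration. Because every update goes through the lock, the deposits and withdrawals cancel out exactly.

```python
import threading

account_balance = 0
account_balance_lock = threading.Lock()

def change_account_balance(delta):
    global account_balance
    with account_balance_lock:  # blocking acquire, released when the block exits
        account_balance += delta

def run_demo(pairs=4, repeats=500):
    """Hypothetical helper: run equal numbers of +100 and -100 workers."""
    global account_balance
    account_balance = 0

    def worker(delta):
        for _ in range(repeats):
            change_account_balance(delta)

    threads = [threading.Thread(target=worker, args=(delta,))
               for delta in (100, -100) for _ in range(pairs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account_balance  # 0 when every update holds the lock
```

If you drop the with statement from change_account_balance, the result is no longer guaranteed, although whether you actually observe a wrong balance depends on the interpreter and on timing.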
Edit: In retrospect, it's probably important to mention that the with statement implicitly calls a blocking acquire on the lock -- this is the "I'll wait" part of the above thread dialog. In contrast, a non-blocking acquire says, "If I can't acquire the lock right away, let me know," and then relies on you to check whether you got the lock or not.
import logging # This module is thread safe.
import threading
LOCK = threading.Lock()
def run():
    if LOCK.acquire(False): # Non-blocking -- return whether we got it
        logging.info('Got the lock!')
        LOCK.release()
    else:
        logging.info("Couldn't get the lock. Maybe next time")
logging.basicConfig(level=logging.INFO)
threads = [threading.Thread(target=run) for i in range(100)]
for thread in threads:
    thread.start()
I also want to add that the lock's primary purpose is to guarantee the atomicity of acquisition (the indivisibility of the acquire across threads), which a simple boolean flag will not guarantee. The semantics of atomic operations are probably also the content of another question.
A critical section of code is one that can only be executed by one thread at a time. Take a chat server for instance. If you have a thread for each connection (i.e., each end user), one "critical section" is the spooling code (sending an incoming message to all the clients). If more than one thread tries to spool a message at once, you'll get BfrIToS mANtwD PIoEmesCEsaSges intertwined, which is obviously no good at all.
A lock is something that can be used to synchronize access to a critical section (or resources in general). In our chat server example, the lock is like a locked room with a typewriter in it. If one thread is in there (to type a message out), no other thread can get into the room. Once the first thread is done, it unlocks the room and leaves. Then another thread can go in the room (locking it). "Acquiring" the lock just means "I get the room."
A "critical section" is a chunk of code in which, for correctness, it is necessary to ensure that only one thread of control can be in that section at a time. In general, you need a critical section to contain references that write values into memory that can be shared among more than one concurrent process.
Related
I have a concurrent.futures.ThreadPoolExecutor and a list. And with the following code I add futures to the ThreadPoolExecutor:
for id in id_list:
    future = self._thread_pool.submit(self.myfunc, id)
    self._futures.append(future)
And then I wait upon the list:
concurrent.futures.wait(self._futures)
However, self.myfunc does some network I/O and thus there will be some network exceptions. When errors occur, self.myfunc submits a new self.myfunc with the same id to the same thread pool and adds a new future to the same list, just as above:
try:
    do_stuff(id)
except:
    future = self._thread_pool.submit(self.myfunc, id)
    self._futures.append(future)
    return None
Here comes the problem: I got an error on the line of concurrent.futures.wait(self._futures):
File "/usr/lib/python3.4/concurrent/futures/_base.py", line 277, in wait
f._waiters.remove(waiter)
ValueError: list.remove(x): x not in list
How should I properly add new Futures to a list while already waiting upon it?
Looking at the implementation of wait(), it certainly doesn't expect that anything outside concurrent.futures will ever mutate the list passed to it. So I don't think you'll ever get that "to work". It's not just that it doesn't expect the list to mutate, it's also that significant processing is done on list entries, and the implementation has no way to know that you've added more entries.
Untested, I'd suggest trying this instead: skip all that, and just keep a running count of threads still active. A straightforward way is to use a Condition guarding a count.
Initialization:
self._count_cond = threading.Condition()
self._thread_count = 0
When my_func is entered (i.e., when a new thread starts):
with self._count_cond:
    self._thread_count += 1
When my_func is done (i.e., when a thread ends), for whatever reason (exceptional or not):
with self._count_cond:
    self._thread_count -= 1
    self._count_cond.notify() # wake up the waiting logic
And finally the main waiting logic:
with self._count_cond:
    while self._thread_count:
        self._count_cond.wait()
POSSIBLE RACE
It seems possible that the thread count could reach 0 while work for a new thread has been submitted, but before its my_func invocation starts running (and so before _thread_count is incremented to account for the new thread).
So the:
with self._count_cond:
    self._thread_count += 1
part should really be done instead right before each occurrence of
self._thread_pool.submit(self.myfunc, id)
Or write a new method to encapsulate that pattern; e.g., like so:
def start_new_thread(self, id):
    with self._count_cond:
        self._thread_count += 1
    self._thread_pool.submit(self.myfunc, id)
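Putting the pieces together, a self-contained sketch of the counting approach might look like this. The RetryingRunner class, its do_stuff stand-in (which fails once per id to simulate a network error), and the done list are all invented for the example; only the Condition-guarded counter comes from the description above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class RetryingRunner:
    """Hypothetical wrapper: counts outstanding tasks with a Condition."""

    def __init__(self, max_workers=4):
        self._thread_pool = ThreadPoolExecutor(max_workers=max_workers)
        self._count_cond = threading.Condition()
        self._thread_count = 0
        self._state_lock = threading.Lock()
        self._seen = set()   # ids that have already failed once (demo only)
        self.done = []

    def start_new_thread(self, id):
        with self._count_cond:
            self._thread_count += 1   # count *before* submitting
        self._thread_pool.submit(self.myfunc, id)

    def do_stuff(self, id):
        # Stand-in for the real network I/O: the first attempt always fails.
        with self._state_lock:
            first_attempt = id not in self._seen
            self._seen.add(id)
        if first_attempt:
            raise ConnectionError("simulated network error")

    def myfunc(self, id):
        try:
            self.do_stuff(id)
            with self._state_lock:
                self.done.append(id)
        except ConnectionError:
            self.start_new_thread(id)   # retry is counted before we decrement
        finally:
            with self._count_cond:
                self._thread_count -= 1
                self._count_cond.notify()   # wake up the waiting logic

    def run(self, id_list):
        for id in id_list:
            self.start_new_thread(id)
        with self._count_cond:
            while self._thread_count:
                self._count_cond.wait()
        self._thread_pool.shutdown()
        return sorted(self.done)
```

Because a retry increments the count before the failing task decrements it in the finally block, the count cannot touch zero while work is still pending.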
A DIFFERENT APPROACH
Offhand, I expect this could work too (but, again, haven't tested it): keep all your code the same except change how you're waiting:
while self._futures:
    self._futures.pop().result()
So this simply waits for one thread at a time, until none remain.
Note that .pop() and .append() on lists are atomic in CPython, so no need for your own lock. And because your my_func() code appends before the thread it's running in ends, the list won't become empty before all threads really are done.
AND YET ANOTHER APPROACH
Keep the original waiting code, but rework the rest not to create new threads in case of exception. Like rewrite my_func to return True if it quits due to an exception, return False otherwise, and start threads running a wrapper instead:
def my_func_wrapper(self, id):
    keep_going = True
    while keep_going:
        keep_going = self.my_func(id)
This may be especially attractive if you someday decide to use multiple processes instead of multiple threads (creating new processes can be a lot more expensive on some platforms).
AND A WAY USING cf.wait()
Another way is to change just the waiting code:
while self._futures:
    fs = self._futures[:]
    for f in fs:
        self._futures.remove(f)
    concurrent.futures.wait(fs)
Clear? This makes a copy of the list to pass to .wait(), and the copy is never mutated. New threads show up in the original list, and the whole process is repeated until no new threads show up.
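For what it's worth, here is that snapshot-and-wait loop dropped into a small runnable sketch. The Downloader class, its _submit helper, and the fail-once do_stuff are assumptions made up for the demo; the waiting loop itself is the one above.

```python
import concurrent.futures
import threading

class Downloader:
    """Hypothetical class wrapping the pool and the shared futures list."""

    def __init__(self):
        self._thread_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
        self._futures = []
        self._lock = threading.Lock()
        self._failed_once = set()
        self.finished = []

    def _submit(self, id):
        future = self._thread_pool.submit(self.myfunc, id)
        self._futures.append(future)

    def do_stuff(self, id):
        # Stand-in for network I/O: every id fails on its first attempt.
        with self._lock:
            first_try = id not in self._failed_once
            self._failed_once.add(id)
        if first_try:
            raise ConnectionError("simulated network error")
        with self._lock:
            self.finished.append(id)

    def myfunc(self, id):
        try:
            self.do_stuff(id)
        except ConnectionError:
            self._submit(id)   # appended before this task's future completes

    def run(self, id_list):
        for id in id_list:
            self._submit(id)
        while self._futures:
            fs = self._futures[:]          # snapshot that wait() can rely on
            for f in fs:
                self._futures.remove(f)
            concurrent.futures.wait(fs)    # retries land in self._futures
        self._thread_pool.shutdown()
        return sorted(self.finished)
```

The loop terminates because a retry is always appended before the failing task's own future finishes, so by the time wait(fs) returns, every retry it spawned is already in the list.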
Which of these ways makes most sense seems to me to depend mostly on pragmatics, but there's not enough info about all you're doing for me to make a guess about that.
I read in the Python documentation that Queue.Queue() is a safe way of passing variables between different threads. I didn't really know that there was a safety issue with multithreading. For my application, I need to develop multiple objects with variables that can be accessed from multiple different threads. Right now I just have the threads accessing the object variables directly. I won't show my code here because there's way too much of it, but here is an example to demonstrate what I'm doing.
from threading import Thread
import time
import random

class switch:
    def __init__(self, id):
        self.id = id
        self.is_on = False

    def toggle(self):
        self.is_on = not self.is_on

switches = []
for i in range(5):
    switches.append(switch(i))

def record_switch():
    switch_record = {}
    while True:
        time.sleep(10)
        current = {}
        current['time'] = time.strftime('%Y-%m-%d %H:%M:%S')
        for i in switches:
            current[i.id] = i.is_on
        switch_record.update(current)

def toggle_switch():
    while True:
        time.sleep(random.random()*100)
        for i in switches:
            i.toggle()

# Pass the function objects themselves; calling them here would block forever.
toggle = Thread(target=toggle_switch, args=())
record = Thread(target=record_switch, args=())
toggle.start()
record.start()
So as I understand, the queue object can be used only to put and get values, which clearly won't work for me. Is what I have here "safe"? If not, how can I program this so that I can safely access a variable from multiple different threads?
Whenever you have threads modifying a value other threads can see, then you are going to have safety issues. The worry is that a thread will try to modify a value when another thread is in the middle of modifying it, which has risky and undefined behavior. So no, your switch-toggling code is not safe.
The important thing to know is that changing the value of a variable is not guaranteed to be atomic. If an action is atomic, it means that action will always happen in one uninterrupted step. (This differs very slightly from the database definition.) Changing a variable value, especially a list value, can often times take multiple steps on the processor level. When you are working with threads, all of those steps are not guaranteed to happen all at once, before another thread starts working. It's entirely possible that thread A will be halfway through changing variable x when thread B suddenly takes over. Then if thread B tries to read variable x, it's not going to find a correct value. Even worse, if thread B tries to modify variable x while thread A is halfway through doing the same thing, bad things can happen. Whenever you have a variable whose value can change somehow, all accesses to it need to be made thread-safe.
If you're modifying variables instead of passing messages, you should be using a Lock object.
In your case, you'd have a global Lock object at the top:
from threading import Lock
switch_lock = Lock()
Then you would surround the critical piece of code with the acquire and release functions.
for i in switches:
    switch_lock.acquire()
    current[i.id] = i.is_on
    switch_lock.release()
for i in switches:
    switch_lock.acquire()
    i.toggle()
    switch_lock.release()
Only one thread may ever acquire a lock at a time (this kind of lock, anyway). When any of the other threads try, they'll be blocked and wait for the lock to become free again. So by putting locks around critical sections of code, you make it impossible for more than one thread to look at, or modify, a given switch at any time. You can put this around any bit of code you want to be kept exclusive to one thread at a time.
EDIT: as martineau pointed out, locks are integrated well with the with statement, if you're using a version of Python that has it. This has the added benefit of automatically unlocking if an exception happens. So instead of the above acquire and release system, you can just do this:
for i in switches:
    with switch_lock:
        i.toggle()
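One way to avoid having to remember the lock at every call site is to move it inside the object. This is only a sketch of that idea, not something from the question: the Switch class with its private _lock, the read() method, and the hammer helper are all invented here.

```python
import threading

class Switch:
    """Hypothetical switch that guards its own state with a private lock."""

    def __init__(self, id):
        self.id = id
        self._is_on = False
        self._lock = threading.Lock()

    def toggle(self):
        with self._lock:   # released even if an exception is raised
            self._is_on = not self._is_on

    def read(self):
        with self._lock:
            return self._is_on

def hammer(switch, threads=4, toggles=500):
    """Toggle one switch from several threads and return its final state."""
    def worker():
        for _ in range(toggles):
            switch.toggle()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return switch.read()
```

With an even total number of toggles the switch ends up back where it started, which is an easy property to check in a test.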
This has been discussed many, many times, but I still don't have a good grasp on how to best accomplish this.
Suppose I have two threads: a main app thread and a worker thread. The main app thread (say it's a WXWidgets GUI thread, or a thread that is looping and accepting user input at the console) could have a reason to stop the worker thread - the user's closing the application, a stop button was clicked, some error occurred in the main thread, whatever.
Commonly suggested is to set up a flag that the thread checks frequently to determine whether to exit. I have two problems with the suggested ways to approach this, however:
First, writing constant checks of a flag into my code makes my code really ugly, and it's very, very prone to problems due to the huge amount of code duplication. Take this example:
def WorkerThread():
    while (True):
        doOp1() # assume this takes say 100ms.
        if (exitThread == True):
            safelyEnd()
            return
        doOp2() # this one also takes some time, say 200ms
        if (exitThread == True):
            safelyEnd()
            return
        if (somethingIsTrue == True):
            doSomethingImportant()
            if (exitThread == True): return
            doSomethingElse()
            if (exitThread == True): return
        doOp3() # this blocks for an indeterminate amount of time - say, it's waiting on a network response
        if (exitThread == True):
            safelyEnd()
            return
        doOp4() # this is doing some math
        if (exitThread == True):
            safelyEnd()
            return
        doOp5() # This calls a buggy library that might block forever. We need a way to detect this and kill this thread if it's stuck for long enough...
        saveSomethingToDisk() # might block while the disk spins up, or while a network share is accessed...whatever
        if (exitThread == True):
            safelyEnd()
            return
def safelyEnd():
    cleanupAnyUnfinishedBusiness() # do whatever is needed to get things to a workable state even if something was interrupted
    writeWhatWeHaveToDisk() # it's OK to wait for this since it's so important
If I add more code or change code, I have to make sure I'm adding those check blocks all over the place. If my worker thread is a very lengthy thread, I could easily have tens or even hundreds of those checks. Very cumbersome.
Think of the other problems. If doOp4() does accidentally deadlock, my app will spin forever and never exit. Not a good user experience!
Using daemon threads isn't really a good option either because it denies me the opportunity to execute the safelyEnd() code. This code might be important - flushing disk buffers, writing log data for debugging purposes, etc.
Second, my code might call functions that block where I don't have the opportunity to check frequently. Let's say this function exists but it's in code that I don't have access to - say part of a library:
def doOp4():
    time.sleep(60) # imagine that this is a network thread, that waits for 60 seconds for a reply before returning.
If that timeout is 60 seconds, even if my main thread gives the signal for the thread to end, it still might sit there for 60 seconds, when it would be perfectly reasonable for it to just stop waiting for a network response and exit. If that code is part of a library I didn't write, however, I have no control over how that works.
Even if I did write the code for a network check, I'd basically have to refactor it so that rather than waiting 60 seconds, it loops 60 times and waits 1 second before checking the exit thread! Again, very messy!
The upshot of all of this, is it feels like a good way to be able to implement this easily would be to somehow cause an exception on a specific thread. If I could do that, I could wrap the entire worker thread's code in a try block, and put the safelyEnd() code in the exception handler, or even a finally block.
Is there a way to either accomplish this, or refactor this code with a different technique that will make things work? The thing is, ideally, when the user requests a quit, we want to make them wait the minimum possible amount. It seems that there has to be a simple way to accomplish this, as this is a very common thing in apps!
Most of the thread communication objects don't allow for this type of setup. They might allow for a cleaner way to have an exit flag, but it still doesn't eliminate the need to constantly check that exit flag, and it still won't deal with the thread blocking because of an external call or because it's simply in a busy loop.
The biggest thing for me is really that if I have a long worker thread procedure I have to litter it with hundreds of checks of the flag. This just seems way too messy and doesn't feel like it's very good coding practice. There has to be a better way...
Any advice would be greatly appreciated.
First, you can make this a lot less verbose and repetitive by using an exception, without needing the ability to raise exceptions into the thread from outside, or any other new tricks or language features:
def WorkerThread():
    class ExitThreadError(Exception):
        pass
    def CheckEnd():
        if exitThread:
            raise ExitThreadError()
    try:
        while True:
            doOp1() # assume this takes say 100ms.
            CheckEnd()
            doOp2() # this one also takes some time, say 200ms
            CheckEnd()
            # etc.
    except ExitThreadError:
        safelyEnd()
Note that you really ought to be guarding exitThread with a Lock or Condition—which is another good reason to wrap up the check, so you only need to fix that in one place.
Anyway, I've taken out some excessive parentheses, == True checks, etc. that added nothing to the code; hopefully you can still see how it's equivalent to the original.
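As one possible way to get a properly synchronized flag, here is a sketch that swaps the bare exitThread variable for a threading.Event (a different primitive from the Lock or Condition mentioned above, but it does its own locking). The sleeps stand in for doOp1/doOp2, and exit_requested, log, safely_end and run_briefly are names invented for the demo.

```python
import threading
import time

class ExitThreadError(Exception):
    pass

exit_requested = threading.Event()   # replaces the bare exitThread flag
log = []

def check_end():
    if exit_requested.is_set():
        raise ExitThreadError()

def safely_end():
    log.append("safelyEnd called")   # stand-in for the real cleanup

def worker_thread():
    try:
        while True:
            time.sleep(0.01)   # stand-in for doOp1()
            check_end()
            time.sleep(0.01)   # stand-in for doOp2()
            check_end()
    except ExitThreadError:
        safely_end()

def run_briefly(seconds=0.05):
    """Hypothetical driver: start the worker, then ask it to stop."""
    log.clear()
    exit_requested.clear()
    worker = threading.Thread(target=worker_thread)
    worker.start()
    time.sleep(seconds)
    exit_requested.set()       # safe to call from any thread
    worker.join(timeout=5)
    return worker.is_alive(), list(log)
```

Since the check lives in one function, switching to a different primitive later means touching only check_end and the code that sets the flag.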
You can take this even farther by restructuring your function into a simple state machine; then you don't even need an exception. I'll show a ridiculously trivial example, where every state always implicitly transitions to the next state no matter what. For this case, the refactor is obviously reasonable; whether it's reasonable for your real code, only you can really tell.
def WorkerThread():
    states = (doOp1, doOp2, doOp3, doOp4, doOp5)
    current = 0
    while not exitThread:
        states[current]()
        current = (current + 1) % len(states) # wrap around, like the original while True loop
    safelyEnd()
Neither of these does anything to help you interrupt in the middle of one of your steps.
If you have some function that takes 60 seconds and there's not a damn thing you can do about it, then there's no way to cancel your thread during those 60 seconds and there's not a damn thing you can do about it. That's just the way it is.
But usually, things that take 60 seconds are really doing something like blocking on a select, and there is something you can do about that—create a pipe, stick its read end in the select, and write on the other end to wake up the thread.
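Here is a rough sketch of that wake-up trick. It uses socket.socketpair() in place of a pipe, because select on Windows only accepts sockets; on Unix an os.pipe() would work the same way. The wait_for_reply and demo functions are invented for illustration.

```python
import select
import socket
import threading
import time

def wait_for_reply(data_sock, wake_sock, timeout=60.0):
    """Block until data arrives, a wake-up is sent, or the timeout expires."""
    readable, _, _ = select.select([data_sock, wake_sock], [], [], timeout)
    if wake_sock in readable:
        return "cancelled"
    if data_sock in readable:
        return data_sock.recv(4096)
    return "timed out"

def demo():
    data_a, data_b = socket.socketpair()   # data_b never sends anything
    wake_r, wake_w = socket.socketpair()   # the wake-up channel
    result = []
    worker = threading.Thread(
        target=lambda: result.append(wait_for_reply(data_a, wake_r)))
    started = time.monotonic()
    worker.start()
    wake_w.send(b"x")                      # main thread asks the worker to stop
    worker.join()
    elapsed = time.monotonic() - started
    for sock in (data_a, data_b, wake_r, wake_w):
        sock.close()
    return result[0], elapsed
```

The worker nominally waits up to 60 seconds, but a single byte on the wake-up channel makes select return right away. This only helps, of course, when you control the code that does the waiting.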
Or, if you're feeling hacky, just closing/deleting/etc. a file or other object that the function is waiting on/processing/otherwise using will often guarantee that it fails quickly with an exception. Of course sometimes it guarantees a segfault, or corrupted data, or a 50% chance of exiting and a 50% chance of hanging forever, or… So, even if you can't control that doOp4 function, you'd better be able to analyze its source and/or whitebox test it.
If worst comes to worst, then yes, you do have to change that one 60-second timeout into 60 1-second timeouts. But usually it won't come to that.
Finally, if you really do need to be able to kill a thread, don't use a thread, use a child process. Those are killable.
Just make sure that your process is always in a state where it's safe to kill it—or, if you only care about Unix, use a USR signal and mask it out when the process isn't in a safe-to-kill state.
But if it's not safe to kill your process in the middle of that 60-second doOp4 call, this isn't really going to help you, because you still won't be able to kill it during those 60 seconds.
In some cases, you can have the child process arrange for the parent to clean up for it if it gets killed unexpectedly, or even arrange for it to be cleaned up on the next run (e.g., think of a typical database journal).
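To illustrate the "child processes are killable" point, here is a small sketch using subprocess. The child is just a Python one-liner that sleeps for 60 seconds, standing in for a stuck doOp4; run_and_kill and its grace parameter are made up for the example.

```python
import subprocess
import sys
import time

def run_and_kill(grace=0.2):
    """Start a child that would block for 60 s, then terminate it early."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    time.sleep(grace)            # pretend the user clicked "stop" here
    started = time.monotonic()
    child.terminate()            # SIGTERM on POSIX, TerminateProcess on Windows
    child.wait(timeout=10)
    return child.returncode, time.monotonic() - started
```

The child gets no chance to run cleanup code here, which is exactly the trade-off discussed above: this is only appropriate when the process is in a state where it's safe to kill.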
But ultimately, what you're asking for is a contradiction: You want to hard-kill a thread without giving it a chance to finish what it's doing, but you want to guarantee that it finishes what it's doing, and you don't want to rewrite the code to make that possible. So, you need to rethink your design so that it requires something that isn't impossible.
If you do not mind your code running about ten times slower, you can use the Thread2 class implemented below. An example follows that shows how calling the new stop method should kill the thread on the next bytecode instruction. Implementing a cleanup system is left as an exercise for the reader to accomplish.
import threading
import sys

class StopThread(StopIteration): pass

threading.SystemExit = SystemExit, StopThread

class Thread2(threading.Thread):
    def stop(self):
        self.__stop = True

    def _bootstrap(self):
        if threading._trace_hook is not None:
            raise ValueError('Cannot run thread with tracing!')
        self.__stop = False
        sys.settrace(self.__trace)
        super()._bootstrap()

    def __trace(self, frame, event, arg):
        if self.__stop:
            raise StopThread()
        return self.__trace

class Thread3(threading.Thread):
    def _bootstrap(self, stop_thread=False):
        def stop():
            nonlocal stop_thread
            stop_thread = True
        self.stop = stop

        def tracer(*_):
            if stop_thread:
                raise StopThread()
            return tracer
        sys.settrace(tracer)
        super()._bootstrap()

################################################################################

import time

def main():
    test = Thread2(target=printer)
    test.start()
    time.sleep(1)
    test.stop()
    test.join()

def printer():
    while True:
        print(time.time() % 1)
        time.sleep(0.1)

if __name__ == '__main__':
    main()
The Thread3 class appears to run code approximately 33% faster than the Thread2 class.
I am using Python with Raspbian on the Raspberry Pi. I have a peripheral attached that causes my interrupt handler function to run. Sometimes the interrupt gets fired when the response to the first interrupt has not yet completed. So I added a variable that is set when the interrupt function is entered and reset when exited; if, upon entering the function, it finds that the lock is set, it exits immediately.
Is there a more standard way of dealing with this kind of thing?
def IrqHandler(self, channel):
    if self.lockout: return
    self.lockout = True
    # do stuff
    self.lockout = False
You have a race condition: if the IrqHandler is called twice sufficiently close together, both calls can see self.lockout as False, and both then proceed to set it to True and run the body.
The threading module has a Lock() object. Usually (the default) this is used to block a thread until the lock is released. This means that all the interrupts would be queued up and have a turn running the Handler.
You can also make a non-blocking attempt with acquire(False), which just returns False if the Lock has already been acquired. This is close to your use here:
from threading import Lock
def __init__(self):
    self.irq_lock = Lock()
def IrqHandler(self, channel):
    if not self.irq_lock.acquire(False): # non-blocking; False means a handler is already running
        return
    try:
        pass # do stuff
    finally:
        self.irq_lock.release()
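For a version you can actually run, here is a minimal sketch of the non-blocking idea. The Peripheral class and the demo function are invented for illustration, and the handler returns True/False only so the behaviour is easy to observe. Holding the lock by hand in demo simulates a handler that is still in progress.

```python
import threading

class Peripheral:
    """Hypothetical owner of the interrupt handler."""

    def __init__(self):
        self.irq_lock = threading.Lock()

    def IrqHandler(self, channel):
        if not self.irq_lock.acquire(False):   # non-blocking attempt
            return False                       # a handler is already running
        try:
            # do stuff
            return True
        finally:
            self.irq_lock.release()            # released even on exceptions

def demo():
    p = Peripheral()
    first = p.IrqHandler(17)
    with p.irq_lock:                  # simulate a handler still in progress
        second = p.IrqHandler(17)     # skipped, because the lock is held
    third = p.IrqHandler(17)
    return first, second, third
```

Unlike the boolean flag, the check and the set happen in a single atomic acquire, so two near-simultaneous interrupts cannot both get through.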
You can tie that in with a borg pattern. This way you can have several interrupt instances paying attention to one state.
There is another one called singleton but here is a discussion on the two.
Why is the Borg pattern better than the Singleton pattern in Python
Let's say we have a main thread which launches two threads for test modules, "test_a" and "test_b".
Both test module threads maintain their state: whether they are done performing the test, whether they encountered any error or warning, or whether they want to update some other information.
How can the main thread get access to this information and act accordingly?
For example, if "test_a" raised an error flag, how will "main" know, and stop the rest of the tests before exiting with an error?
One way to do this is using global variables, but that gets very ugly very soon.
The obvious solution is to share some kind of mutable variable, by passing it in to the thread objects/functions at constructor/start.
The clean way to do this is to build a class with appropriate instance attributes. If you're using a threading.Thread subclass, instead of just a thread function, you can usually use the subclass itself as the place to stick those attributes. But I'll show it with a list just because it's shorter:
def test_a_func(thread_state):
    # ...
    thread_state[0] = my_error_state
    # ...
def main_thread():
    test_states = [None]
    test_a = threading.Thread(target=test_a_func, args=(test_states,))
    test_a.start()
You can (and usually want to) also pack a Lock or Condition into the mutable state object, so you can properly synchronize between main_thread and test_a.
(Another option is to use a queue.Queue, an os.pipe, etc. to pass information around, but you still need to get that queue or pipe to the child thread—which you do in the exact same way as above.)
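A sketch of what such a state object could look like, with the synchronization packed inside it. SharedState, report_error and run_test_a are names made up for this example (run_test_a plays the role of test_a_func), and the Event is an extra I've added so the main thread can wait for a failure instead of polling.

```python
import threading

class SharedState:
    """Hypothetical mutable state object handed to each test thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._error = None
        self.failed = threading.Event()   # set as soon as any error is reported

    def report_error(self, error):
        with self._lock:
            self._error = error
        self.failed.set()

    def error(self):
        with self._lock:
            return self._error

def run_test_a(thread_state):
    # Stand-in for the real test body; it just reports a failure.
    thread_state.report_error("test_a: simulated failure")

def main_thread():
    state = SharedState()
    test_a = threading.Thread(target=run_test_a, args=(state,))
    test_a.start()
    state.failed.wait(timeout=5)   # wakes up as soon as an error is reported
    test_a.join()
    return state.error()
```

The main thread never touches the error field directly, so all the locking rules live in one small class.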
However, it's worth considering whether you really need to do this. If you think of test_a and test_b as "jobs", rather than "thread functions", you can just execute those jobs on a pool, and let the pool handle passing results or errors back.
For example:
import concurrent.futures
try:
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        tests = [executor.submit(job) for job in (test_a, test_b)]
        for test in concurrent.futures.as_completed(tests):
            result = test.result()
except Exception as e:
    pass # do stuff
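Now, if the test_a function raises an exception, the main thread will get that exception from test.result(). Because that means exiting the with block, the executor is shut down; note that by default the shutdown waits for jobs that were already submitted rather than cancelling them, so cancel the remaining futures yourself if the other tests should stop early.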
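Here is a runnable sketch of the pool-based version with explicit cancellation. The check_a and check_b functions are stand-ins for test_a and test_b, and run_checks and its return strings are invented for the example. Note that Future.cancel() only stops jobs that have not started running yet.

```python
import concurrent.futures

def check_a():
    raise RuntimeError("check_a failed")   # stand-in for a failing test_a

def check_b():
    return "check_b ok"                    # stand-in for a passing test_b

def run_checks(jobs):
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(job) for job in jobs]
        try:
            for future in concurrent.futures.as_completed(futures):
                future.result()   # re-raises the job's exception here
        except Exception as exc:
            for future in futures:
                future.cancel()   # no effect on jobs already running or done
            return "stopped: {}".format(exc)
    return "all passed"
```

A job that is already running cannot be interrupted this way, so long-running tests would still need a cooperative stop signal of their own.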
If you're using 2.5-3.1, you don't have concurrent.futures built in, but you can install the backport off PyPI, or you can rewrite things around multiprocessing.dummy.Pool. (It's slightly more complicated that way, because you have to create a sequence of jobs and call map_async to get back an iterator over AsyncResult objects… but really that's still pretty simple.)