Is there a way to make Python locks queued? I have been assuming thus far in my code that threading.Lock hands the lock to waiters in FIFO order. It looks like it just gives the lock to a random waiter instead. This is bad for me, because the program (a game) I'm working on is highly dependent on getting messages in the right order. Are there queued locks in Python? If so, how much will I lose in processing time?
I wholly agree with the comments claiming that you're probably thinking about this in an unfruitful way. Locks provide serialization, and aren't at all intended to provide ordering. The bog-standard, easy, and reliable way to enforce an order is to use a Queue.Queue.
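For example, if a single consumer thread pulls messages off a queue, they are processed strictly in the order they were put, no matter how many producer threads there are. A minimal sketch of the idea (Python 2 naming, to match the rest of this answer; the message strings are just placeholders):

import threading, Queue

q = Queue.Queue()

def consumer():
    while True:
        msg = q.get()        # blocks until a message is available
        if msg is None:      # sentinel: time to shut down
            break
        print msg            # messages come out in FIFO order

t = threading.Thread(target=consumer)
t.start()
q.put("first")               # producers just enqueue
q.put("second")
q.put(None)
t.join()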
CPython leaves it up to the operating system to decide in which order locks are acquired. On most systems, that will appear to be more-or-less "random". That cannot be changed.
That said, I'll show a way to implement a "FIFO lock". It's neither hard nor easy - somewhere in between - and you shouldn't use it ;-) I'm afraid only you can answer your "how much will I lose on processing time?" question - we have no idea how heavily you use locks, or how much lock contention your application provokes. You can get a rough feel by studying this code, though.
import threading, collections

class QLock:
    def __init__(self):
        self.lock = threading.Lock()        # guards count and waiters
        self.waiters = collections.deque()  # private locks of blocked threads, FIFO
        self.count = 0

    def acquire(self):
        self.lock.acquire()
        if self.count:
            # The lock is held: block on a fresh private lock that
            # release() will open when our turn comes.
            new_lock = threading.Lock()
            new_lock.acquire()
            self.waiters.append(new_lock)
            self.lock.release()
            new_lock.acquire()   # blocks until a release() pops us
            self.lock.acquire()
        self.count += 1
        self.lock.release()

    def release(self):
        with self.lock:
            if not self.count:
                raise ValueError("lock not acquired")
            self.count -= 1
            if self.waiters:
                # Wake the longest-waiting thread.
                self.waiters.popleft().release()

    def locked(self):
        return self.count > 0
Here's a little test driver, which can be changed in the obvious way to use either this QLock or a threading.Lock:
def work(name):
qlock.acquire()
acqorder.append(name)
from time import sleep
if 0:
qlock = threading.Lock()
else:
qlock = QLock()
qlock.acquire()
acqorder = []
ts = []
for name in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
t = threading.Thread(target=work, args=(name,))
t.start()
ts.append(t)
sleep(0.1) # probably enough time for .acquire() to run
for t in ts:
while not qlock.locked():
sleep(0) # yield time slice
qlock.release()
for t in ts:
t.join()
assert qlock.locked()
qlock.release()
assert not qlock.locked()
print "".join(acqorder)
On my box just now, 3 runs using threading.Lock produced this output:
BACDEFGHIJKLMNOPQRSTUVWXYZ
ABCDEFGHIJKLMNOPQRSUVWXYZT
ABCEDFGHIJKLMNOPQRSTUVWXYZ
So it's certainly not random, but neither is it wholly predictable. Running it with the QLock instead, the output should always be:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
I stumbled upon this post because I had a similar requirement. Or at least I thought so.
My fear was that if the locks didn't release in FIFO order, thread starvation would be likely to happen, and that would be terrible for my software.
After reading a bit, I dismissed my fears and realized what everyone was saying: if you want this, you're doing it wrong. Also, I was convinced that you can rely on the OS to do its job and not let your thread starve.
To get to that point, I did a bit of digging to understand better how locks work under Linux. I started by taking a look at the pthreads (POSIX threads) glibc source code and specifications, because I was working in C++ on Linux. I don't know whether Python uses pthreads under the hood, but I'll assume it probably does.
I didn't find anything in the various pthreads references around specifying the order in which blocked threads acquire an unlocked mutex.
What I found is: locks in pthreads on Linux are implemented using a kernel feature called futex.
http://man7.org/linux/man-pages/man2/futex.2.html
http://man7.org/linux/man-pages/man7/futex.7.html
A link in the references of the first of those pages leads to this PDF:
https://www.kernel.org/doc/ols/2002/ols2002-pages-479-495.pdf
It explains a bit about unlocking strategies, about how futexes work and are implemented in the Linux kernel, and a lot more.
And there I found what I wanted. It explains that futexes are implemented in the kernel in a way such that unlocks are mostly done in FIFO order (to increase fairness). However, that is not guaranteed, and it is possible that one thread might jump the line once in a while. They allow this so as not to overcomplicate the code, and to keep the performance they achieved rather than lose it to extreme measures enforcing strict FIFO order.
So basically, what you have is:
The POSIX standard doesn't impose any requirement as to the order of locking and unlocking of mutexes. Any implementation is free to do as it wants, so if you rely on this order, your code won't be portable (not even between different versions of the same platform).
The Linux implementation of the pthreads library relies on a feature/technique called futex to implement mutexes, and it mostly does FIFO-style unlocking of mutexes, but that order is not guaranteed.
Yes, you could create a FIFO queue using a list of thread IDs:
FIFO = [5,79,3,2,78,1,9...]
You would try to acquire the lock, and if you can't, push the attempting thread's ID onto the front of the queue (FIFO.insert(0, threadID)). Each time you release the lock, check that a thread wanting to acquire it has its ID at the end of the queue (threadID == FIFO[-1]). If it does, let it acquire the lock and then pop its ID off (FIFO.pop()). Repeat as necessary; a busy-waiting sketch of the idea follows.
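Here is a rough sketch of that idea (my own illustration, not tested production code; it guards the list with a plain lock and uses threading.current_thread().ident as the thread ID):

import threading, time

guard = threading.Lock()   # protects the FIFO list itself
FIFO = []                  # thread IDs; newest at the front, oldest at the end

def fifo_acquire(lock):
    me = threading.current_thread().ident
    with guard:
        FIFO.insert(0, me)             # push our ID onto the front
    while True:
        with guard:
            # only the longest-waiting thread (at the end) may take the lock
            if FIFO[-1] == me and lock.acquire(False):
                FIFO.pop()
                return
        time.sleep(0)                  # yield the CPU while waiting our turn

Release with lock.release() as usual. Note that this spins while waiting, so the QLock shown earlier, which blocks each waiter on its own private lock instead, is generally the better approach.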
Related
I have tried the following two ways to do something in my project. I used threading first and it kind of worked, but when I tried to do it using multiprocessing, it just didn't.
The portions of code shown below correspond to a function defined inside the __init__ block of the X class.
This is the code done with threading:
def Exec_Manual():
while True:
for i in range(0,5):
if self.rbtnMan.isChecked():
if self.rbtnAuto.isChecked():#This is another radio button.
break
self._tx_freq1_line_edit.setEnabled(1)
self._tx_freq2_line_edit.setEnabled(1)
self._tx_freq3_line_edit.setEnabled(1)
self._tx_freq4_line_edit.setEnabled(1)
self._tx_freq5_line_edit.setEnabled(1)
frec = 'self._tx_freq'+str(i+1)+'_line_edit.text()'
efrec = float(eval(frec))
self.lblTx1.setText(str(efrec-0.4))
self.lblTx2.setText(str(efrec))
self.lblTx3.setText(str(efrec+0.4))
#print frec
print efrec
time.sleep(1)
manual_thread = threading.Thread(target=Exec_Manual)
manual_thread.daemon = True
manual_thread.start()
This is the code done with multiprocessing:
def Exec_Manual():
while True:
for i in range(0,5):
if self.rbtnMan.isChecked():
if self.rbtnAuto.isChecked():
break
self._tx_freq1_line_edit.setEnabled(1)
self._tx_freq2_line_edit.setEnabled(1)
self._tx_freq3_line_edit.setEnabled(1)
self._tx_freq4_line_edit.setEnabled(1)
self._tx_freq5_line_edit.setEnabled(1)
frec = 'self._tx_freq'+str(i+1)+'_line_edit.text()'
efrec = float(eval(frec))
self.lblTx1.setText(str(efrec-0.4))
self.lblTx2.setText(str(efrec))
self.lblTx3.setText(str(efrec+0.4))
#print frec
print efrec
time.sleep(1)
proceso_manual = multiprocessing.Process(name='txmanual', target=Exec_Manual)
proceso_manual.daemon = True
proceso_manual.start()
Basically, when multiprocessing is used, it doesn't set the text of the labels or change the enabled state of the line edits. How can I achieve this?
Sorry to bother you with my ignorance; all help will be useful. TIA.
This is the expected behavior.
Threads operate in the same memory space; processes each have their own. If you start a new process, it cannot make changes in the memory of its parent process. The only way to communicate with another process is IPC (inter-process communication), basically pipes or network/Unix sockets.
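In Python, the multiprocessing module wraps that plumbing for you. A minimal sketch (my own illustration, in Python 2 style to match the question; the child sends a value back over a multiprocessing.Queue instead of mutating the parent's objects):

import multiprocessing

def worker(q):
    # runs in a separate process with its own memory space
    q.put(42)   # data reaches the parent only through this IPC channel

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    print q.get()   # prints 42, received from the child process
    p.join()

Applied to your code, the child process would send the computed frequency values back over such a queue, and the parent (GUI) process would read them and update the widgets itself.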
UPD:
Also, you can pause and restart threads, e.g. by using synchronization primitives (locks and semaphores) and checking them from the thread function. There is also a less nice way, which I really don't recommend, so I would rather stick to synchronization primitives.
Speaking of the IPC, it is much more troublesome and expensive than synchronizing threads. It is built around sockets, so communicating with a process on the same machine is almost as troublesome as talking to another machine on the other side of the world. Fortunately there are quite a few protocols and libraries providing abstraction over sockets and making it less tedious (dbus is a good example).
Finally, if you really like the idea of decentralized processing, it might make sense to look into message queues and workers. This is basically the same as IPC, but abstracted to a higher level. E.g. you can queue tasks from a process on one machine, do the processing on another, and then get the results back in the original program (or on yet another machine/process). Popular examples here are AMQP, RabbitMQ, and Celery.
I believe it is a stupid question, but I still can't find the answer. Actually, it's better to separate it into two questions:
1) Am I right that we could have a lot of threads, but because of the GIL, only one thread is executing at any moment?
2) If so, why do we still need locks? We use locks to avoid the case when two threads are trying to read/write some shared object, but because of the GIL two threads can't be executing at the same moment, can they?
The GIL protects the Python internals. That means:
you don't have to worry about something in the interpreter going wrong because of multithreading
most things do not really run in parallel, because Python code is executed sequentially due to the GIL
But the GIL does not protect your own code. For example, if you have this code:
self.some_number += 1
That is going to read the value of self.some_number, calculate some_number + 1, and then write the result back to self.some_number.
If you do that in two threads, the operations (read, add, write) of one thread and the other may be mixed, so that the result is wrong.
This could be the order of execution:
thread1 reads self.some_number (0)
thread2 reads self.some_number (0)
thread1 calculates some_number+1 (1)
thread2 calculates some_number+1 (1)
thread1 writes 1 to self.some_number
thread2 writes 1 to self.some_number
You use locks to enforce this order of execution:
thread1 reads self.some_number (0)
thread1 calculates some_number+1 (1)
thread1 writes 1 to self.some_number
thread2 reads self.some_number (1)
thread2 calculates some_number+1 (2)
thread2 writes 2 to self.some_number
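You can see why this interleaving is possible at the bytecode level with the dis module (a quick sketch; exact opcode names vary by Python version):

import dis

class C:
    def incr(self):
        self.some_number += 1

dis.dis(C.incr)
# the output shows separate load, add, and store instructions;
# a thread switch can happen between any two of them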
EDIT: Let's complete this answer with some code which shows the explained behaviour:
import threading
import time
total = 0
lock = threading.Lock()
def increment_n_times(n):
global total
for i in range(n):
total += 1
def safe_increment_n_times(n):
global total
for i in range(n):
lock.acquire()
total += 1
lock.release()
def increment_in_x_threads(x, func, n):
threads = [threading.Thread(target=func, args=(n,)) for i in range(x)]
global total
total = 0
begin = time.time()
for thread in threads:
thread.start()
for thread in threads:
thread.join()
print('finished in {}s.\ntotal: {}\nexpected: {}\ndifference: {} ({} %)'
.format(time.time()-begin, total, n*x, n*x-total, 100-total/n/x*100))
There are two functions which implement increment. One uses locks and the other does not.
Function increment_in_x_threads implements parallel execution of the incrementing function in many threads.
Now running this with a big enough number of threads makes it almost certain that an error will occur:
print('unsafe:')
increment_in_x_threads(70, increment_n_times, 100000)
print('\nwith locks:')
increment_in_x_threads(70, safe_increment_n_times, 100000)
In my case, it printed:
unsafe:
finished in 0.9840562343597412s.
total: 4654584
expected: 7000000
difference: 2345416 (33.505942857142855 %)
with locks:
finished in 20.564176082611084s.
total: 7000000
expected: 7000000
difference: 0 (0.0 %)
So without locks, there were many errors (33% of increments failed). On the other hand, with locks it was 20 times slower.
Of course, both numbers are blown up because I used 70 threads, but this shows the general idea.
At any moment, yes, only one thread is executing Python code (other threads may be executing some IO, NumPy, whatever). That is mostly true. However, this is trivially true on any single-processor system, and yet people still need locks on single-processor systems.
Take a look at the following code:
queue = []
def do_work():
while queue:
item = queue.pop(0)
process(item)
With one thread, everything is fine. With two threads, you might get an exception from queue.pop() because the other thread called queue.pop() on the last item first. So you would need to handle that somehow. Using a lock is a simple solution. You can also use a proper concurrent queue (like in the queue module)--but if you look inside the queue module, you'll find that the Queue object has a threading.Lock() inside it. So either way you are using locks.
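For instance, here is the same loop made safe with a lock (a sketch; process stands in for whatever work you do per item):

import threading

queue = list(range(100))
lock = threading.Lock()

def process(item):
    print(item)          # placeholder for the real work

def do_work():
    while True:
        with lock:       # make the emptiness check and the pop atomic
            if not queue:
                return
            item = queue.pop(0)
        process(item)    # do the real work outside the lock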
It is a common newbie mistake to write multithreaded code without the necessary locks. You look at code and think, "this will work just fine" and then find out many hours later that something truly bizarre has happened because threads weren't synchronized properly.
Or in short, there are many places in a multithreaded program where you need to prevent another thread from modifying a structure until you're done applying some changes. This allows you to maintain the invariants on your data, and if you can't maintain invariants, then it's basically impossible to write code that is correct.
Or put in the shortest way possible, "You don't need locks if you don't care if your code is correct."
The GIL prevents simultaneous execution of multiple threads, but not in all situations.
The GIL is temporarily released during I/O operations executed by threads. That means multiple threads really can run at the same time. That's one reason you still need locks.
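You can see that overlap with a tiny sketch (time.sleep releases the GIL the same way blocking I/O does):

import threading, time

def io_bound():
    time.sleep(1)        # the GIL is released while blocked here

begin = time.time()
threads = [threading.Thread(target=io_bound) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(time.time() - begin)   # about 1 second, not 10: the threads overlapped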
For a written reference, see the Python wiki page on the GIL, which makes the same point:
https://wiki.python.org/moin/GlobalInterpreterLock
It points out that the GIL does not protect you from modification of the internal states of the objects that you are accessing concurrently from different threads, meaning that you can still mess things up if you don't take measures.
So, despite the fact that two threads may not be running at the same exact time, they can still be trying to manipulate the internal state of an object (one at a time, intermittently), and if you don't prevent that from happening (with some locking mechanism) your code could/will eventually fail.
Regards.
I have multiple threads:
dispQ = Queue.Queue()
stop_thr_event = threading.Event()
def worker (stop_event):
while not stop_event.wait(0):
try:
job = dispQ.get(timeout=1)
job.waitcount -= 1
dispQ.task_done()
except Queue.Empty:
continue
# create job objects and put into dispQ here
for j in range(NUM_OF_JOBS):
j = Job()
dispQ.put(j)
# NUM_OF_THREADS could be 10-20 ish
running_threads = []
for t in range(NUM_OF_THREADS):
t1 = threading.Thread( target=worker, args=(stop_thr_event,) )
t1.daemon = True
t1.start()
running_threads.append(t1)
stop_thr_event.set()
for t in running_threads:
t.join()
The code above was giving me some very strange behavior.
I ended up finding out that it was due to decrementing waitcount without a lock.
I added an attribute to the Job class, self.thr_lock = threading.Lock().
Then I changed the decrement to
with job.thr_lock:
    job.waitcount -= 1
This seems to fix the strange behavior, but it looks like performance has degraded.
Is this expected? Is there a way to optimize locking?
Would it be better to have one global lock rather than one lock per job object?
About the only way to "optimize" threading is to break the processing down into blocks or chunks of work that can be performed at the same time. This mostly means doing input or output (I/O), because that is the only time the interpreter will release the Global Interpreter Lock, aka the GIL.
In actuality, adding threading often brings no gain, or even a net slowdown, due to its overhead, unless the above condition is met.
It would probably be worse to use a single global lock for all the shared resources, because it would make parts of the program wait when they really didn't need to: it wouldn't distinguish which resource was needed, so unnecessary waiting would occur.
You might find the PyCon 2015 talk David Beazley gave, titled Python Concurrency From the Ground Up, of interest. It covers threads, event loops, and coroutines.
It's hard to answer your question based on your code. Locks do have some inherent cost, nothing is free, but normally it is quite small. If your jobs are very small, you might want to consider "chunking" them, that way you have many fewer acquire/release calls relative to the amount of work being done by each thread.
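As a sketch of what chunking can look like (my own illustration, not the poster's code; each worker batches its updates so it takes the lock once per chunk instead of once per item):

import threading

lock = threading.Lock()
total = 0

def worker(items):
    global total
    local = 0
    for item in items:   # the real work happens without holding the lock
        local += item
    with lock:           # one acquire/release per chunk, not per item
        total += local

threads = [threading.Thread(target=worker, args=(range(1000),)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)             # 4 * sum(range(1000)) = 1998000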
A related but separate issue is that of threads blocking each other. You might notice large performance issues if many threads are waiting on the same lock(s); your threads sit idle, waiting on each other. In some cases this cannot be avoided, because there is a shared resource that is a performance bottleneck. In other cases you can reorganize your code to avoid this penalty.
There are some things in your example code that make me think it might be very different from your actual application. First, your example code doesn't share job objects between threads. If you're not sharing job objects, you shouldn't need locks on them. Second, as written, your example code might not empty the queue before finishing. It will exit as soon as you hit stop_thr_event.set(), leaving any remaining jobs in the queue; is this by design?
If a deadlock between Python threads is suspected at run-time, is there any way to resolve it without killing the entire process?
For example, if a few threads take far longer than they should, a resource manager might suspect that some of them are deadlocked. While of course it should be debugged and fixed in the code in the future, is there a clean solution that can be used immediately (at run-time), perhaps killing specific threads so that the others can resume?
Edit: I was thinking of adding a "deadlock detection" loop (in its own thread) that sleeps for a bit, then checks all running threads, and if a few of them look suspiciously slow, kills the least important one among them. At what point a thread is suspected of deadlocking, and which of them is the least important, is of course defined by the programmer of the deadlock-detection loop.
Clearly, it won't catch all problems (most obviously if the deadlock detection thread itself is deadlocked). The idea is not to find a mathematically perfect solution (which is, of course, not to write code that can deadlock). Rather, I wanted to partially solve the problem in some realistic cases.
You may try to detect deadlocks ahead of time, but once the program is stuck during execution there isn't much you can do about it.
Debuggers like WinDbg, or tracing tools like strace, might help, but as Python is an interpreted language I doubt they'll be practical to use.
There are two approaches to deadlock detection. The first is static analysis of the code. This is the preferable way, but it is limited by the halting problem: it is only possible to find some potential deadlocks in certain specific cases, not in general. The second approach is tracking locks at runtime using a wait-for graph. It is the most reliable, but also the more expensive way, because parallelism is reduced while the graph is being checked.
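To make the second approach concrete, here is a toy cycle check on a wait-for graph (my own sketch; a real implementation, like the library below, builds the graph from actual lock owners and waiters):

def has_cycle(wait_for):
    # wait_for maps each thread ID to the thread ID it waits on (or None)
    for start in wait_for:
        seen = set()
        node = start
        while node is not None:
            if node in seen:
                return True    # we came back to a thread already on the path
            seen.add(node)
            node = wait_for.get(node)
    return False

assert has_cycle({1: 2, 2: 1})         # T1 waits on T2, T2 on T1: deadlock
assert not has_cycle({1: 2, 2: None})  # T2 isn't waiting: no deadlock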
For the second approach, I wrote a library that checks the wait-for graph before a lock is taken. If taking the lock would lead to a deadlock, an exception is raised.
You can install it with pip:
$ pip install locklib
And use it like a regular lock from the standard library:
from threading import Thread
from locklib import SmartLock
lock_1 = SmartLock()
lock_2 = SmartLock()
def function_1():
while True:
with lock_1:
with lock_2:
pass
def function_2():
while True:
with lock_2:
with lock_1:
pass
thread_1 = Thread(target=function_1)
thread_2 = Thread(target=function_2)
thread_1.start()
thread_2.start()
In this example you can see a potential deadlock situation, but the thread that locks second raises an exception instead of deadlocking. With locklib, deadlock is impossible in this case.
def start(self):
self.running = True
while self.running:
pass
def shut_down(self):
self.running = False
Hi, I want to know a good way to synchronize the variable running. I want a fast solution, but I don't know what would be better: semaphores, mutexes, or locks. I assume that shut_down is not called often.
This is my best solution so far, but I think we can do better:
def start(self):
self.__lock__.acquire()
self.running = True
while self.running:
self.__lock__.release()
self.__lock__.acquire()
def shut_down(self):
self.__lock__.acquire()
self.running = False
self.__lock__.release()
For your simple example with a primitive value like a Boolean flag, no synchronization is necessary in Python. At least, not in CPython (the standard interpreter you can download from python.org).
That's because the whole interpreter is covered by the "Global Interpreter Lock" so only one thread can be running at a time at the Python level (multiple threads could potentially be doing stuff at the same time in extension modules, if those modules are set up to release the GIL at appropriate times). So, when your worker thread looping in the start function checks the running attribute, it is guaranteed that that object is in a sane state. If no code other than shut_down modifies it, you can even be sure that it will be True or False.
So, if you are going to be using CPython and you're sticking to very simple logic (like a Boolean flag that is only ever written to by one thread), your first example code will work just fine.
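If you want to make the intent explicit anyway, threading.Event is a convenient standard-library primitive for exactly this kind of flag (a sketch, not the asker's original code):

import threading

class Worker(object):
    def __init__(self):
        self._stop = threading.Event()

    def start(self):
        while not self._stop.is_set():
            pass                 # do one unit of work per iteration

    def shut_down(self):
        self._stop.set()         # safe to call from any thread

An added benefit is that a waiting thread can block in self._stop.wait(timeout) instead of spinning.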
If you need more complicated logic, like a counter that can be incremented by any of several threads, then you will need a bit of synchronization to avoid race conditions like TOCTTOU (time-of-check to time-of-use).
One of the easiest synchronization tools that Python offers is the queue module, which allows synchronized communication between threads in a FIFO manner. I can't speak to its performance compared to lower-level primitives, but it's really easy to get code working correctly with queues (whereas it's pretty easy to mess up manual locking, ending up with deadlocks or race conditions that are a nightmare to debug).