I'm talking to a measurement device: I send commands and receive answers. I'm providing a method ask that sends a command and reads back the answer. If I lock this method, I get a deadlock because the called methods read and write lock as well. If I don't lock, another thread could steal the answer or write before I'm reading. How would you implement this?
import threading

class Device(object):
    lock = threading.Lock()

    def ask(self, value):
        # can't use self.lock here; it would deadlock with read()/write()
        self.write(value)  # another thread could start reading the answer
        return self.read()

    def read(self):
        with self.lock:
            pass  # read values from device

    def write(self, value):
        with self.lock:
            pass  # send command to device
Use threading.RLock() so that the same thread can acquire the lock more than once:
A reentrant lock is a synchronization primitive that may be acquired
multiple times by the same thread. Internally, it uses the concepts of
“owning thread” and “recursion level” in addition to the
locked/unlocked state used by primitive locks. In the locked state,
some thread owns the lock; in the unlocked state, no thread owns it.
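For illustration, a minimal sketch of the Device class from the question using a reentrant lock (the device I/O bodies are placeholders):

import threading

class Device(object):
    def __init__(self):
        self.lock = threading.RLock()

    def ask(self, value):
        with self.lock:  # first acquisition; held for the whole exchange
            self.write(value)  # no other thread can sneak in between
            return self.read()

    def read(self):
        with self.lock:  # re-acquired by the same thread; no deadlock
            pass  # read values from device

    def write(self, value):
        with self.lock:  # re-acquired by the same thread; no deadlock
            pass  # send command to device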
You can use a threading.RLock() object for reentrant locking. But it is better to rewrite the code so that it does not need an RLock at all. For example, you could remove the locks from write() and read() and rewrite ask() like this:
def ask(self, value):
    with self.lock:
        self.write(value)
        r = self.read()
    return r
In old versions of Python, RLock() is slower than Lock(), because its implementation is more complicated.
Also note that in the code you wrote, you get one lock for all instances, because the lock is created in the class body. In some cases that is appropriate (for example, if you have only one device and many instances), but in general it is not. If you want different locks for different instances, put the lock's initialization in the __init__() method.
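A minimal sketch combining both suggestions, with a per-instance lock created in __init__() and only ask() taking it (the I/O bodies are placeholders):

import threading

class Device(object):
    def __init__(self):
        self.lock = threading.Lock()  # one lock per Device instance

    def ask(self, value):
        with self.lock:  # write and read happen as one atomic exchange
            self.write(value)
            return self.read()

    def read(self):
        pass  # read values from device (no lock needed here any more)

    def write(self, value):
        pass  # send command to device (no lock needed here any more)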
I'm very familiar with Python queue.Queue. This is definitely the thing you want when you want to have a reliable stream between consumer and producer threads.
However, sometimes you have producers that are faster than consumers and are forced to drop data (as with live video frame capture, for example, where we may typically want to buffer just the last one or two frames).
Does Python provide an asynchronous buffer class, similar to queue.Queue?
It's not exactly obvious how to correctly implement one using queue.Queue.
I could, for example:
import queue

buf = queue.Queue(maxsize=3)

def produce(msg):
    if buf.full():
        buf.get(block=False)  # make space
    buf.put(msg, block=False)

def consume():
    msg = buf.get(block=True)
    work(msg)  # work() is whatever processing the consumer does
although I don't particularly like that produce is not a locked, queue-atomic operation. A consume may start between full() and get(), for example, and it would probably be broken in a multi-producer scenario.
Is there an out-of-the-box solution?
There's nothing built in for this, but it's straightforward enough to build your own buffer class that wraps a Queue, provides mutual exclusion between .put() and .get() with its own lock, and uses a Condition variable to wake up would-be consumers whenever an item is added. Like so:
import queue
import threading

class SBuf:
    def __init__(self, maxsize):
        self.q = queue.Queue()
        self.maxsize = maxsize
        self.nonempty = threading.Condition()

    def get(self):
        with self.nonempty:
            while not self.q.qsize():
                self.nonempty.wait()
            assert self.q.qsize()
            return self.q.get()

    def put(self, v):
        with self.nonempty:
            while self.q.qsize() >= self.maxsize:
                self.q.get()  # discard the oldest entry to make room
            self.q.put(v)
            assert 0 < self.q.qsize() <= self.maxsize
            self.nonempty.notify_all()
BTW, I advise against trying to build this kind of logic out of raw locks. Of course it can be done, but Condition variables are very carefully designed to save you from universes of unintended race conditions. There's a learning curve for Condition variables, but one well worth climbing: they often make things easy instead of brain-busting. Indeed, Python's threading module uses them internally to implement all sorts of things.
An Alternative
In the above, we only invoke queue.Queue methods under the protection of our own lock, so there's really no need to use a thread-safe container - we're supplying all the thread safety already.
So it would be a bit leaner to use a simpler container. Happily, a collections.deque can be configured to discard all but the most recent N entries itself, but "at C speed". Like so:
class SBuf:
def __init__(self, maxsize):
import collections
self.q = collections.deque(maxlen=maxsize)
self.maxsize = maxsize
self.nonempty = threading.Condition()
def get(self):
with self.nonempty:
while not self.q:
self.nonempty.wait()
assert self.q
return self.q.popleft()
def put(self, v):
with self.nonempty:
self.q.append(v) # discards oldest, if needed
assert 0 < len(self.q) <= self.maxsize
self.nonempty.notify()
This also changed .notify_all() to .notify(). In this use case, either works correctly, but we're only adding one item so there's no need to notify more than one consumer. If there are multiple consumers waiting, .notify_all() will wake all of them up but only the first will find a non-empty queue. The others will see that it's empty, and just .wait() again.
Queue is already multiprocessing- and multithreading-safe, in that you can't write to and read from the queue at the same time. However, you are correct that there's nothing stopping the queue from getting modified between the full() and get() calls.
As such, you can use a lock, which is how you can control thread access across multiple lines. The lock can only be acquired once, so if it's currently locked, all other threads will wait until it has been released before they continue.
import queue
import threading
import time

lock = threading.Lock()

def produce(msg):
    # buf is the queue.Queue(maxsize=3) from the question
    with lock:
        if buf.full():
            buf.get(block=False)  # make space
        buf.put(msg, block=False)

def consume():
    msg = None
    while msg is None:
        with lock:
            try:
                msg = buf.get(block=False)
            except queue.Empty:
                pass  # buffer is empty
        if msg is None:
            time.sleep(0.01)  # wait outside the lock and try again
    work(msg)
I've written a class that inherits from object and has instances of sub-objects that use some threads for tasks. There are two socket listeners that create other threads for each accepted connection. They do what they have to do. To finish them, they watch a threading.Event object to know that they have to finish.
I've noticed that when I exit the Python console, they are not notified (or don't catch the notification) and the exit doesn't return control to the bash console, unless Close() is called first.
My first idea to fix it was to implement the __del__ method so that the garbage collector cleans things up on exit.
class ServiceProvider(object):
    def __init__(self):
        super(ServiceProvider, self).__init__()
        # ...
        self.Open()

    def Open(self):
        pass  # ... some threads are created

    def Close(self):
        pass  # ... set the threading.Event to tell the threads to finish

    def __del__(self):
        self.Close()
But the behaviour is the same. If I place a print in those methods, neither the one in __del__ nor the one in Close is written, unless the object is closed beforehand; then the print in __del__ is written.
Then I implemented the __enter__ and __exit__ methods to support the with statement. The exit behaves as expected, and when the with block ends, things are released. But what I really want is something like file descriptors: even if file.close() is not called, it is executed when the program exits.
class ServiceProvider(object):
    # ...
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.Close()
Searching for more solutions, I tried atexit, but with similar results; it doesn't fix the issue. Even though I collect all the objects created from this class, doOnExit only writes its print if the objects in the list have already been closed.
import atexit

objects2Close = []

@atexit.register
def doOnExit():
    for obj in objects2Close:
        obj.Close()

class ServiceProvider(object):
    def __init__(self):
        super(ServiceProvider, self).__init__()
        objects2Close.append(self)
It's usually a good idea to use with when you have resources that you don't want to leak (files, connections, whatever else you care about).
Somewhere, just outside your main loop you should have something like:
with ServiceProvider(some_params) as service_provider:
    rest_of_the_code()
What this does is that, regardless of how you exit rest_of_the_code() (except for kill -9), it will call service_provider.Close() at the end. This works for exceptions and interrupts as well. kill -9 doesn't work because the process is killed at the OS level and doesn't get a chance to clean up.
I've found a solution for this issue. The information posted in this question was not related to the real problem.
This is as simple as daemon threading.
As the implementation uses some threads to listen for remote connections, they have to finish their execution when the program exits. But the program only ends when all non-daemon threads have finished.
Mistakenly, those listeners and talkers were not set to be daemons, and that's why the exit waits for them.
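A minimal sketch of the fix, assuming the listeners are plain threading.Thread instances (the listening body is a placeholder):

import threading

def listen_forever():
    pass  # accept connections, spawn handler threads, etc.

listener = threading.Thread(target=listen_forever)
listener.daemon = True  # the interpreter may exit even while this thread is running
listener.start()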
I have a number of files, mapped to memory (as mmap objects). In the course of their processing, each file must be opened several times. It works fine if there is only one thread. However, when I try to run the task in parallel, a problem arises: different threads cannot access the same file simultaneously. The problem is illustrated by this sample:
import mmap, threading

class MmapReading(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        for i in range(10000):
            content = mmap_object.read().decode('utf-8')
            mmap_object.seek(0)
            if not content:
                print('Error while reading mmap object')

with open('my_dummy_file.txt', 'w') as f:
    f.write('Hello world')

with open('my_dummy_file.txt', 'r') as f:
    mmap_object = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)

threads = []
for i in range(64):
    threads.append(MmapReading())
    threads[i].daemon = True
    threads[i].start()

for thread in threading.enumerate():
    if thread != threading.current_thread():
        thread.join()

print('Mmap reading testing done!')
Whenever I run this script, I get around 20 error messages.
Is there a way to circumvent this problem, other than making 64 copies of each file (which would consume too much memory in my case)?
The seek(0) is not always performed before another thread jumps in and performs a read().
Say thread 1 performs a read, reading to end of file; seek(0) has
not yet been executed.
Then thread 2 executes a read. The file pointer in the mmap is still
at the end of the file. read() therefore returns ''.
The error detection code is triggered because content is ''.
Instead of using read(), you can use slicing to achieve the same result. Replace:
content = mmap_object.read().decode('utf-8')
mmap_object.seek(0)
with
content = mmap_object[:].decode('utf8')
content = mmap_object[:mmap_object.size()] also works.
Locking is another way, but it's unnecessary in this case. If you want to try it, you can use a global threading.Lock object and pass that to MmapReading when instantiating. Store the lock object in an instance variable self.lock. Then call self.lock.acquire() before reading/seeking, and self.lock.release() afterwards. You'll experience a very noticeable performance penalty doing this.
from threading import Lock

class MmapReading(threading.Thread):
    def __init__(self, lock):
        self.lock = lock
        threading.Thread.__init__(self)

    def run(self):
        for i in range(10000):
            self.lock.acquire()
            mmap_object.seek(0)
            content = mmap_object.read().decode('utf-8')
            self.lock.release()
            if not content:
                print('Error while reading mmap object')

lock = Lock()
for i in range(64):
    threads.append(MmapReading(lock))
# ... (start and join the threads as before)
Note that I've changed the order of the read and the seek; it makes more sense to do the seek first, positioning the file pointer at the start of the file.
I fail to see where you need mmap to begin with. mmap is a technique to share data between processes. Why don't you just read the contents into memory (once!), e.g. as a string or a list? Each thread will then access the data with its own set of iterators. Also, be aware of the GIL in Python, which prevents any speedup from happening with multithreading. If you want that, use multiprocessing (and then an mmapped file makes sense, since it is actually shared among the various processes).
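A minimal sketch of that idea, reusing the file name from the question (the per-thread work is a placeholder):

import threading

# Read the file once; the resulting string is immutable and can be
# shared by all threads without any locking.
with open('my_dummy_file.txt', 'r') as f:
    content = f.read()

def worker():
    for i in range(10000):
        if not content:
            print('Error while reading content')

threads = [threading.Thread(target=worker, daemon=True) for _ in range(64)]
for t in threads:
    t.start()
for t in threads:
    t.join()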
The issue is that the single mmap_object is being shared among the threads so that thread A calls read and before it gets to the seek, thread B also calls read, and so gets no data.
What you really need is an ability to duplicate the python mmap object without duplicating the underlying mmap, but I see no way of doing that.
I think the only feasible solution short of rewriting the object implementation is to employ a lock (mutex, etc) per mmap object to prevent two threads from accessing the same object at the same time.
I can best explain this with example code first:
class reciever(threading.Thread, simple_server):
    def __init__(self, callback):
        threading.Thread.__init__(self)
        self.callback = callback

    def run(self):
        self.serve_forever(self.callback)

class sender(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.parameter = 50

    def run(self):
        while True:
            # do some processing in general
            # ...
            # send some udp messages derived from self.parameter
            send_message(self.parameter)

if __name__ == '__main__':
    osc_send = sender()
    osc_send.start()

    def update_parameter(val):
        osc_send.parameter = val

    osc_recv = reciever(update_parameter)
    osc_recv.start()
The pieces I have left out are hopefully self-explanatory from the code that's there.
My question is, is this a safe way to use a server running in a thread to update the attributes on a separate thread that could be reading the value at any time?
The way you're updating that parameter is actually thread-safe already, because of the Global Interpreter Lock (GIL). The GIL means that Python only allows one thread to execute byte-code at a time, so it is impossible for one thread to be reading from parameter at the same time another thread is writing to it. Reading from and setting an attribute are both single, atomic byte-code operations; one will always start and complete before the other can happen. You would only need to introduce synchronization primitives if you needed to do operations that span more than one byte-code operation from more than one thread (e.g. incrementing parameter from multiple threads).
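For contrast, a minimal sketch of the case that does need a lock: a read-modify-write such as an increment spans several byte-code operations, so concurrent increments can be lost without one (the counter here is illustrative):

import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    with lock:  # guard the read-modify-write sequence
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # reliably 100 with the lock; may be less without it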
I have a Python application with 2 threads. Each thread operates within a separate GUI. The GUIs need to operate independently without blocking. I am trying to figure out how to make thread_1 trigger an event in thread_2.
I want function foo to trigger function bar in the simplest, most elegant way, as quickly as possible and without consuming unnecessary resources. Below is what I've come up with.
import threading
import time

bar_trigger = False  # global trigger for function bar
lock = threading.Lock()

class Thread_2(threading.Thread):
    def run(self):
        global bar_trigger
        while True:
            with lock:
                if bar_trigger:
                    self.bar()  # function I want to happen
                    bar_trigger = False
            time.sleep(100)  # sleep to preserve resources
            # would like to preserve as much resources as possible
            # and sleep as little as possible

    def bar(self):
        print("Bar!")

class Thread_1(threading.Thread):
    def foo(self):
        global bar_trigger
        with lock:
            bar_trigger = True  # trigger for bar in Thread_2
Is there a better way to accomplish this? I'm not a threading expert, so any advice on how to best trigger a method in thread_2 from within thread_1 is appreciated.
Without knowing what you're doing and what GUI framework you're using, I can't get into much more detail, but from your problem's code snippet, it sounds like you're looking for something called condition variables.
Python includes them by default in the threading module, as threading.Condition. You might be interested in threading.Event as well.
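A minimal sketch of the trigger from the question rewritten around threading.Condition, so the worker sleeps until notified instead of polling (names follow the question's code; bar's body is a placeholder):

import threading

bar_trigger = False
cond = threading.Condition()

class Thread_2(threading.Thread):
    def run(self):
        global bar_trigger
        while True:
            with cond:
                while not bar_trigger:
                    cond.wait()  # blocks until notified; no polling loop
                bar_trigger = False
            self.bar()

    def bar(self):
        print("Bar!")

def foo():  # called from thread 1
    global bar_trigger
    with cond:
        bar_trigger = True
        cond.notify()  # wakes the waiting thread immediately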
How are these threads instantiated? There should really be a main thread that oversees the workers. For example,
import time
import threading

class Worker(threading.Thread):
    def __init__(self, stopper):
        threading.Thread.__init__(self)
        self.stopper = stopper

    def run(self):
        while not self.stopper.is_set():
            print('Hello from Worker!')
            time.sleep(1)

stop = threading.Event()
worker = Worker(stop)
worker.start()
# ...
stop.set()
Using a shared Event object is just one way of synchronizing and sending messages between threads. There are others, and their usages depend on the specifics.
One option would be to share a queue between the threads. Thread 1 would push an instruction into the queue and thread 2 would poll that queue. When thread 2 sees the queue is non-empty, it reads off the first instruction in the queue and calls the appropriate function. This has the additional benefit of being fairly loosely coupled, which can make testing each thread in isolation easier.
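A minimal sketch of the queue approach, assuming instructions are simple strings (a real design might enqueue callables instead):

import queue
import threading

commands = queue.Queue()

def thread_2_loop():
    while True:
        instruction = commands.get()  # blocks until an instruction arrives
        if instruction == 'bar':
            print("Bar!")
        elif instruction == 'stop':
            break

worker = threading.Thread(target=thread_2_loop)
worker.start()

# thread 1 triggers bar by pushing an instruction:
commands.put('bar')
commands.put('stop')
worker.join()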