In the Twisted API for DeferredFilesystemLock, it is stated that deferUntilLocked is not safe for concurrent use.
I would like to understand in what way it is unsafe and what makes it unsafe, in order to ensure that I don't misuse the file locks.
Arguably the method is actually quite safe for concurrent use. If you read the first four lines of the implementation then it's clear that an attempt at concurrent use will immediately raise AlreadyTryingToLockError.
Perhaps the warning is meant to tell you that you'll get an exception rather than useful locking behavior, though.
The implementation of that exception should provide a hint about why concurrent use isn't allowed. DeferredFilesystemLock uses some instance attributes, starting with _tryLockCall, to keep track of progress in the attempt to acquire the lock. If concurrent attempts were allowed, they would each trample over the use of this attribute (and others) by each other.
This could be enhanced with relative ease. All that would be necessary is to keep the state associated with the lock attempt on a new object allocated per-attempt (instead of on the DeferredFilesystemLock instance). Or, DeferredLock could help.
The first and most obvious thing that comes to mind is that in concurrent situations you're never guaranteed to acquire the lock (if another thread never releases it), so you may defer forever. You could avoid this by simply passing the optional timeout to deferUntilLocked.
Other things to consider that may make this unsuitable for concurrent use:
Starvation: What if multiple threads are continually waiting to acquire the same lock - are they treated fairly, or will one thread spend longer waiting than others? Are threads guaranteed to eventually acquire the lock?
Deadlocks: If you're acquiring multiple locks at a time, and multiple threads are doing this, you may get into a situation where you have two threads both waiting on a resource that the other one holds.
Are you sure that acquired locks are always released? What if one thread acquires a lock and crashes without releasing it?
It looks to me like Twisted's implementation is fairly simple and probably doesn't take into account many of these things. Their "not safe" comment is a "there be dragons here" - you may/will get difficult to troubleshoot concurrent bugs or issues if you try to use this in a concurrent application.
I have different threads and after processing they put data in a common list. Is there anything built in python for a list or a numpy array to be accessed by only a single thread. Secondly, if it is not what is an elegant way of doing it?
According to Thread synchronisation mechanisms in Python, reading a single item from a list and modifying a list in place are guaranteed to be atomic. If this is right (although it seems to be partially contradicted by the very existence of the Queue module), then if your code is all of the form:
try:
val = mylist.pop()
except IndexError:
# wait for a while or exit
else:
# process val
And everything put into mylist is done by .append(), then your code is already threadsafe. If you don't trust that one document on that score, use a queue.queue, which does all synchronisation for you, and has a better API than list for concurrent programs - particularly, it gives you the option of blocking indefinitely, or for a timeout, waiting for .pop() to work if you don't have anything else the thread could be getting on with in the mean time.
For numpy arrays, and in general any case where you need more than a producer/consumer queue, use a Lock or RLock from threading - these implement the context manager protocol, so using them is quite simple:
with mylock:
# Process as necessarry
And python will guarantee that the lock gets released once you fall off the end of the with block - including in tricky cases like if something you do raises an exception.
Finally, consider whether multiprocessing is a better fit for your application than threading - threads in Python aren't guaranteed to actually run concurrently, and in CPython only can if the drop to C-level code. multiprocessing gets around that issue, but may have some extra overhead - if you haven't already, you should read the docs to determine which one suits your needs better.
threading provides Lock objects if you need to protect an entire critical section, or the Queue module provides a queue that is threadsafe.
How about the standard library Queue?
I'm trying to write a Python 2.6 (OSX) program using multiprocessing, and I want to populate a Queue with more than the default of 32767 items.
from multiprocessing import Queue
Queue(2**15) # raises OSError
Queue(32767) works fine, but any higher number (e.g. Queue(32768)) fails with OSError: [Errno 22] Invalid argument
Is there a workaround for this issue?
One approach would be to wrap your multiprocessing.Queue with a custom class (just on the producer side, or transparently from the consumer perspective). Using that you would queue up items to be dispatched to the Queue object that you're wrapping, and only feed things from the local queue (Python list() object) into the multiprocess.Queue as space becomes available, with exception handling to throttle when the Queue is full.
That's probably the easiest approach since it should have the minimum impact on the rest of your code. The custom class should behave just like a Queue while hiding the underlying multiprocessing.Queue behind your abstraction.
(One approach might be to have your producer use threads, one thread to manage the dispatch from a threading Queue to your multiprocessing.Queue and any other threads actually just feeding the threading Queue).
I've already answered the original question but I do feel like adding that Redis lists are quite reliable and the Python module's support for them are extremely easy to use for implementing a Queue like object. These have the advantage of allowing one to scale out over multiple nodes (across a network) as well as just over multiple processes.
Basically to use those you'd just pick a key (string) for your queue name have your producers push into it and have your workers (task consumers) loop on blocking pops from that key.
The Redis BLPOP, and BRPOP commands all take a list of keys (lists/queues) and an optional timeout value. They return a tuple (key,value) or None (on timeout). So you can easily write up an event driven system that's very similar to the familiar structure of select() (but at a much higher level). The only thing you have to watch for are missing keys and invalid key types (just wrap your queue operations with exception handlers, of course). (If some other application stops on your shared Redis server removing keys or replacing keys that you were using as queues with string/integer or other types of values ... well, you have a different problem at that point). :)
Another advantage of this model is that Redis does persist its data to the disk. So your work queue could survive system restarts if you chose to allow it.
(Of course you could implement a simple Queue as a table in SQLlite or any other SQL system if you really wanted to do so; just using some sort of auto-incrementing index for the sequencing and a column to mark each item has having been "done" (consumed); but that does involve somewhat more complexity than using what Redis gives you "out of the box").
Working for me on MacOSX
>>> import Queue
>>> Queue.Queue(30000000)
<Queue.Queue instance at 0x1006035f0>
I'm learning to use the Queue module, and am a bit confused about how a queue consumer thread can be made to know that the queue is complete. Ideally I'd like to use get() from within the consumer thread and have it throw an exception if the queue has been marked "done". Is there a better way to communicate this than by appending a sentinel value to mark the last item in the queue?
original (most of this has changed; see updates below)
Based on some of the suggestions (thanks!) of Glenn Maynard and others, I decided to roll up a descendant of Queue.Queue that implements a close method. It's available in the form of a primitive (unpackaged) module. I'll clean this up a bit and package it properly when I have a bit more time. For now the module only contains the CloseableQueue class and the Closed exception class. I'm planning to expand it to also include subclasses of Queue.LifoQueue and Queue.PriorityQueue.
It's in a pretty preliminary state currently, which is to say that although it passes its test suite, I haven't actually used it for anything yet. Your mileage may vary. I'll keep this answer updated with exciting news.
The CloseableQueue class differs a bit from Glenn's suggestion in that closing the queue will prevent future puts, but not prevent future gets until the queue is emptied. This made the most sense to me; it seemed like functionality to clear the queue could be added as a separate mixin* that would be orthogonal to the closeability functionality. So basically with CloseableQueue, by closing the queue you indicate that the last element has been put. There's also an option to do this atomically by passing last=True to the final put call. Subsequent calls to put, and subsequent calls to get once the queue is emptied, as well as outstanding blocked calls matching those descriptions, will raise the Closed exception.
This is mostly useful for situations where a single producer is generating data for one or more consumers, but it could also be useful for a multi-multi arrangement where consumers are waiting for a particular item or set of items. In particular it doesn't provide a way to determine that all of a number of producers have finished production. Getting that working would entail the provision of some way to register producers (.open()?), as well as a way to indicate that producer registration is itself closed.
Suggestions and/or code reviews are quite welcome. I haven't written a whole lot of concurrency code, but hopefully the test suite is thorough enough that the fact that the code passes it is an indication of the code's quality, rather than the suite's lack thereof. I was able to reuse a bunch of the code from the Queue module's test suite: the file itself is included in this module and used as a basis for various subclasses and routines, including regression testing. This probably (hopefully) helped to avoid complete ineptitude in the testing department. The code itself just overrides Queue.get and Queue.put with fairly minimal changes, and adds the close and closed methods.
I've sort of intentionally avoided using any new-fangled fanciness like context managers in both the code itself and in the test suite in an effort to keep the code as backwards-compatible as is the Queue module itself, which is considerably backwards indeed. I'll probably add __enter__ and __exit__ methods at some point; otherwise, the contextlib's closing function should be applicable to a CloseableQueue instance.
*: Here I use the term "mixin" loosely. As the Queue module's classes are old-style, mixins would need to be mixed using class factory functions; some restrictions apply; offer void where prohibited by Guido.
update
The CloseableQueue module now provides CloseableLifoQueue and CloseablePriorityQueue classes. I've also added some convenience functions to support iteration. Still need to rework it as a proper package. There's a class factory function to allow for convenient subclassing of other Queue.Queue-derived classes.
update 2
CloseableQueue is now available via PyPI, e.g. with
$ easy_install CloseableQueue
Comments and criticism are welcome, especially from this answer's anonymous downvoter.
Queue's don't inherently have the idea of being complete or done. They can be used indefinitely. To close it up when you are done, you will indeed need to put None or some other magic value at the end and write the logic to check for it, as you described. The ideal way would probably be subclassing the Queue object.
See http://en.wikipedia.org/wiki/Queue_(data_structure) to learn more about queue in general.
A sentinel is a natural way to shut down a queue, but there are a couple things to watch out for.
First, remember that you may have more than one consumer, so you need to send a sentinel once for each running consumer, and guarantee that each consumer will only consume one sentinel, to ensure that each consumer receives its shutdown sentinel.
Second, remember that Queue defines an interface, and that when possible, code should behave regardless of the underlying Queue. You might have a PriorityQueue, or you might have some other class that exposes the same interface and returns values in some other order.
Unfortunately, it's hard to deal with both of these. To deal with the general case of different queues, a consumer that's shutting down must continue to consume values after receiving its shutdown sentinel until the queue is empty. That means that it may consume another thread's sentinel. This is a weakness of the Queue interface: it should have a Queue.shutdown call to cause an exception to be thrown by all consumers, but that's missing.
So, in practice:
if you're sure you're only ever using a regular Queue, simply send one sentinel per thread.
if you may be using a PriorityQueue, ensure that the sentinel has the lowest priority.
Queue is a FIFO (first in first out) register so remember that the consumer can be faster than producer. When consumers thread detect that the queue is empty normally realise one of following actions:
Send to API: switch to next thread.
Send to API: sleep some ms and than check again the queue.
Send to API: wait on event (like new message in queue).
If you wont that consumers thread terminate after job is complete than put in queue a sentinel value to terminate task.
The best practice way of doing this would be to have the queue itself notify a client that it has reached the 'done' state. The client can then take any action that is appropriate.
What you have suggested; checking the queue to see if it is done periodically, would be highly undesirable. Polling is an antipattern in multithreaded programming, you should always be using notifications.
EDIT:
So your saying that the queue itself knows that it's 'done' based on some criteria and needs to notify the clients of that fact. I think you are correct and the best way to do this is by throwing when a client calls get() and the queue is in the done state. If your throwing this would negate the need for a sentinel value on the client side. Internally the queue can detect that it is 'done' in any way it pleases e.g. queue is empty, it's state was set to done etc I don't see any need for a sentinel value.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
What are the modules used to write multi-threaded applications in Python? I'm aware of the basic concurrency mechanisms provided by the language and also of Stackless Python, but what are their respective strengths and weaknesses?
In order of increasing complexity:
Use the threading module
Pros:
It's really easy to run any function (any callable in fact) in its
own thread.
Sharing data is if not easy (locks are never easy :), at
least simple.
Cons:
As mentioned by Juergen Python threads cannot actually concurrently access state in the interpreter (there's one big lock, the infamous Global Interpreter Lock.) What that means in practice is that threads are useful for I/O bound tasks (networking, writing to disk, and so on), but not at all useful for doing concurrent computation.
Use the multiprocessing module
In the simple use case this looks exactly like using threading except each task is run in its own process not its own thread. (Almost literally: If you take Eli's example, and replace threading with multiprocessing, Thread, with Process, and Queue (the module) with multiprocessing.Queue, it should run just fine.)
Pros:
Actual concurrency for all tasks (no Global Interpreter Lock).
Scales to multiple processors, can even scale to multiple machines.
Cons:
Processes are slower than threads.
Data sharing between processes is trickier than with threads.
Memory is not implicitly shared. You either have to explicitly share it or you have to pickle variables and send them back and forth. This is safer, but harder. (If it matters increasingly the Python developers seem to be pushing people in this direction.)
Use an event model, such as Twisted
Pros:
You get extremely fine control over priority, over what executes when.
Cons:
Even with a good library, asynchronous programming is usually harder than threaded programming, hard both in terms of understanding what's supposed to happen and in terms of debugging what actually is happening.
In all cases I'm assuming you already understand many of the issues involved with multitasking, specifically the tricky issue of how to share data between tasks. If for some reason you don't know when and how to use locks and conditions you have to start with those. Multitasking code is full of subtleties and gotchas, and it's really best to have a good understanding of concepts before you start.
You've already gotten a fair variety of answers, from "fake threads" all the way to external frameworks, but I've seen nobody mention Queue.Queue -- the "secret sauce" of CPython threading.
To expand: as long as you don't need to overlap pure-Python CPU-heavy processing (in which case you need multiprocessing -- but it comes with its own Queue implementation, too, so you can with some needed cautions apply the general advice I'm giving;-), Python's built-in threading will do... but it will do it much better if you use it advisedly, e.g., as follows.
"Forget" shared memory, supposedly the main plus of threading vs multiprocessing -- it doesn't work well, it doesn't scale well, never has, never will. Use shared memory only for data structures that are set up once before you spawn sub-threads and never changed afterwards -- for everything else, make a single thread responsible for that resource, and communicate with that thread via Queue.
Devote a specialized thread to every resource you'd normally think to protect by locks: a mutable data structure or cohesive group thereof, a connection to an external process (a DB, an XMLRPC server, etc), an external file, etc, etc. Get a small thread pool going for general purpose tasks that don't have or need a dedicated resource of that kind -- don't spawn threads as and when needed, or the thread-switching overhead will overwhelm you.
Communication between two threads is always via Queue.Queue -- a form of message passing, the only sane foundation for multiprocessing (besides transactional-memory, which is promising but for which I know of no production-worthy implementations except In Haskell).
Each dedicated thread managing a single resource (or small cohesive set of resources) listens for requests on a specific Queue.Queue instance. Threads in a pool wait on a single shared Queue.Queue (Queue is solidly threadsafe and won't fail you in this).
Threads that just need to queue up a request on some queue (shared or dedicated) do so without waiting for results, and move on. Threads that eventually DO need a result or confirmation for a request queue a pair (request, receivingqueue) with an instance of Queue.Queue they just made, and eventually, when the response or confirmation is indispensable in order to proceed, they get (waiting) from their receivingqueue. Be sure you're ready to get error-responses as well as real responses or confirmations (Twisted's deferreds are great at organizing this kind of structured response, BTW!).
You can also use Queue to "park" instances of resources which can be used by any one thread but never be shared among multiple threads at one time (DB connections with some DBAPI compoents, cursors with others, etc) -- this lets you relax the dedicated-thread requirement in favor of more pooling (a pool thread that gets from the shared queue a request needing a queueable resource will get that resource from the apppropriate queue, waiting if necessary, etc etc).
Twisted is actually a good way to organize this minuet (or square dance as the case may be), not just thanks to deferreds but because of its sound, solid, highly scalable base architecture: you may arrange things to use threads or subprocesses only when truly warranted, while doing most things normally considered thread-worthy in a single event-driven thread.
But, I realize Twisted is not for everybody -- the "dedicate or pool resources, use Queue up the wazoo, never do anything needing a Lock or, Guido forbid, any synchronization procedure even more advanced, such as semaphore or condition" approach can still be used even if you just can't wrap your head around async event-driven methodologies, and will still deliver more reliability and performance than any other widely-applicable threading approach I've ever stumbled upon.
It depends on what you're trying to do, but I'm partial to just using the threading module in the standard library because it makes it really easy to take any function and just run it in a separate thread.
from threading import Thread
def f():
...
def g(arg1, arg2, arg3=None):
....
Thread(target=f).start()
Thread(target=g, args=[5, 6], kwargs={"arg3": 12}).start()
And so on. I often have a producer/consumer setup using a synchronized queue provided by the Queue module
from Queue import Queue
from threading import Thread
q = Queue()
def consumer():
while True:
print sum(q.get())
def producer(data_source):
for line in data_source:
q.put( map(int, line.split()) )
Thread(target=producer, args=[SOME_INPUT_FILE_OR_SOMETHING]).start()
for i in range(10):
Thread(target=consumer).start()
Kamaelia is a python framework for building applications with lots of communicating processes.
(source: kamaelia.org) Kamaelia - Concurrency made useful, fun
In Kamaelia you build systems from simple components that talk to each other. This speeds development, massively aids maintenance and also means you build naturally concurrent software. It's intended to be accessible by any developer, including novices. It also makes it fun :)
What sort of systems? Network servers, clients, desktop applications, pygame based games, transcode systems and pipelines, digital TV systems, spam eradicators, teaching tools, and a fair amount more :)
Here's a video from Pycon 2009. It starts by comparing Kamaelia to Twisted and Parallel Python and then gives a hands on demonstration of Kamaelia.
Easy Concurrency with Kamaelia - Part 1 (59:08)
Easy Concurrency with Kamaelia - Part 2 (18:15)
Regarding Kamaelia, the answer above doesn't really cover the benefit here. Kamaelia's approach provides a unified interface, which is pragmatic not perfect, for dealing with threads, generators & processes in a single system for concurrency.
Fundamentally it provides a metaphor of a running thing which has inboxes, and outboxes. You send messages to outboxes, and when wired together, messages flow from outboxes to inboxes. This metaphor/API remains the same whether you're using generators, threads or processes, or speaking to other systems.
The "not perfect" part is due to syntactic sugar not being added as yet for inboxes and outboxes (though this is under discussion) - there is a focus on safety/usability in the system.
Taking the producer consumer example using bare threading above, this becomes this in Kamaelia:
Pipeline(Producer(), Consumer() )
In this example it doesn't matter if these are threaded components or otherwise, the only difference is between them from a usage perspective is the baseclass for the component. Generator components communicate using lists, threaded components using Queue.Queues and process based using os.pipes.
The reason behind this approach though is to make it harder to make hard to debug bugs. In threading - or any shared memory concurrency you have, the number one problem you face is accidentally broken shared data updates. By using message passing you eliminate one class of bugs.
If you use bare threading and locks everywhere you're generally working on the assumption that when you write code that you won't make any mistakes. Whilst we all aspire to that, it's very rare that will happen. By wrapping up the locking behaviour in one place you simplify where things can go wrong. (Context handlers help, but don't help with accidental updates outside the context handler)
Obviously not every piece of code can be written as message passing and shared style which is why Kamaelia also has a simple software transactional memory (STM), which is a really neat idea with a nasty name - it's more like version control for variables - ie check out some variables, update them and commit back. If you get a clash you rinse and repeat.
Relevant links:
Europython 09 tutorial
Monthly releases
Mailing list
Examples
Example Apps
Reusable components (generator & thread)
Anyway, I hope that's a useful answer. FWIW, the core reason behind Kamaelia's setup is to make concurrency safer & easier to use in python systems, without the tail wagging the dog. (ie the big bucket of components
I can understand why the other Kamaelia answer was modded down, since even to me it looks more like an ad than an answer. As the author of Kamaelia it's nice to see enthusiasm though I hope this contains a bit more relevant content :-)
And that's my way of saying, please take the caveat that this answer is by definition biased, but for me, Kamaelia's aim is to try and wrap what is IMO best practice. I'd suggest trying a few systems out, and seeing which works for you. (also if this is inappropriate for stack overflow, sorry - I'm new to this forum :-)
I would use the Microthreads (Tasklets) of Stackless Python, if I had to use threads at all.
A whole online game (massivly multiplayer) is build around Stackless and its multithreading principle -- since the original is just to slow for the massivly multiplayer property of the game.
Threads in CPython are widely discouraged. One reason is the GIL -- a global interpreter lock -- that serializes threading for many parts of the execution. My experiance is, that it is really difficult to create fast applications this way. My example codings where all slower with threading -- with one core (but many waits for input should have made some performance boosts possible).
With CPython, rather use seperate processes if possible.
If you really want to get your hands dirty, you can try using generators to fake coroutines. It probably isn't the most efficient in terms of work involved, but coroutines do offer you very fine control of co-operative multitasking rather than pre-emptive multitasking you'll find elsewhere.
One advantage you'll find is that by and large, you will not need locks or mutexes when using co-operative multitasking, but the more important advantage for me was the nearly-zero switching speed between "threads". Of course, Stackless Python is said to be very good for that as well; and then there's Erlang, if it doesn't have to be Python.
Probably the biggest disadvantage in co-operative multitasking is the general lack of workaround for blocking I/O. And in the faked coroutines, you'll also encounter the issue that you can't switch "threads" from anything but the top level of the stack within a thread.
After you've made an even slightly complex application with fake coroutines, you'll really begin to appreciate the work that goes into process scheduling at the OS level.