Eventlet and locking - python

Since Eventlet uses green threading and asynchronous I/O, do I still need to set locks before accessing objects? My understanding is that greenlets are all part of one thread and locking isn't necessary. Can anyone confirm or deny this?

Your understanding is correct: "green" threads are not actual OS threads and don't get pre-empted at unpredictable points (especially not "in the middle" of an operation). You have full control over when execution moves away from one greenlet (and can thus get dispatched to another), so you can save yourself the trouble/overhead of lock acquire/release operations.
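A minimal sketch (assuming the eventlet package; the shared counter is just a stand-in for any shared object) of where this does and doesn't hold:

    import eventlet

    counter = 0

    def increment(n):
        global counter
        for _ in range(n):
            # Safe without a lock: greenlets only switch at explicit
            # yield points (blocking I/O, eventlet.sleep(), etc.), so
            # this read-modify-write cannot be interrupted.
            counter += 1

    pool = eventlet.GreenPool()
    for _ in range(10):
        pool.spawn(increment, 1000)
    pool.waitall()
    print(counter)  # always 10000

    def unsafe_increment():
        # NOT safe: the critical section yields in the middle, so
        # another greenlet can run between the read and the write.
        global counter
        tmp = counter
        eventlet.sleep(0)  # cooperative yield
        counter = tmp + 1

The rule of thumb: you only need a lock (e.g. eventlet.semaphore.Semaphore) when a critical section performs I/O or otherwise yields partway through.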

When should I be using asyncio over regular threads, and why? Does it provide performance increases?

I have a pretty basic understanding of multithreading in Python and an even basic-er understanding of asyncio.
I'm currently writing a small Curses-based program (eventually going to be using a full GUI, but that's another story) that handles the UI and user IO in the main thread, and then has two other daemon threads (each with their own queue/worker-method-that-gets-things-from-a-queue):
a watcher thread that watches for time-based and conditional (e.g. posts to a message board, received messages, etc.) events to occur and then puts required tasks into...
the other (worker) daemon thread's queue which then completes them.
All three threads are continuously running concurrently, which leads me to some questions:
When the worker thread's queue (or, more generally, any thread's queue) is empty, should it be stopped until it has something to do again, or is it okay to leave it running continuously? Do concurrent threads take up a lot of processing power when they aren't doing anything other than watching their queues?
Should the two threads' queues be combined? Since the watcher thread is continuously running a single method, I guess the worker thread would be able to just pull tasks from the single queue that the watcher thread puts them into.
I don't think it'll matter since I'm not multiprocessing, but is this setup affected by Python's GIL (which I believe still exists in 3.4) in any way?
Should the watcher thread be running continuously like that? From what I understand, and please correct me if I'm wrong, asyncio is supposed to be used for event-based multithreading, which seems relevant to what I'm trying to do.
The main thread is basically always just waiting for the user to press a key to access a different part of the menu. This seems like a situation asyncio would be perfect for, but, again, I'm not sure.
Thanks!
When the worker thread's queue (or, more generally, any thread's queue) is empty, should it be stopped until it has something to do again, or is it okay to leave it running continuously? Do concurrent threads take up a lot of processing power when they aren't doing anything other than watching their queues?
You should just use a blocking call to queue.get(). That will leave the thread blocked on I/O, which means the GIL will be released, and no processing power (or at least a very minimal amount) will be used. Don't use non-blocking gets in a while loop, since that's going to require a lot more CPU wakeups.
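For example, a minimal worker-loop sketch (the sentinel-based shutdown and the callable tasks are just one common convention, not something from your code):

    import queue
    import threading

    task_queue = queue.Queue()

    def worker():
        while True:
            task = task_queue.get()  # blocks until an item arrives; the GIL is released while waiting
            if task is None:         # sentinel value used to shut the worker down
                break
            task()                   # assumes tasks are callables; adapt to your task format
            task_queue.task_done()

    threading.Thread(target=worker, daemon=True).start()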
Should the two threads' queues be combined? Since the watcher thread is continuously running a single method, I guess the worker thread would be able to just pull tasks from the single queue that the watcher thread puts them into.
If all the watcher is doing is pulling things off one queue and immediately putting them into another, where they get consumed by a single worker, it sounds like unnecessary overhead - you may as well just consume them directly in the worker. It's not exactly clear to me whether that's the case, though - is the watcher consuming from a queue, or just putting items into one? If it is consuming from a queue, who is putting stuff into it?
I don't think it'll matter since I'm not multiprocessing, but is this setup affected by Python's GIL (which I believe still exists in 3.4) in any way?
Yes, this is affected by the GIL. Only one of your threads can run Python bytecode at a time, so you won't get true parallelism, except when threads are running I/O (which releases the GIL). If your worker thread is doing CPU-bound activities, you should seriously consider running it in a separate process via multiprocessing, if possible.
Should the watcher thread be running continuously like that? From what I understand, and please correct me if I'm wrong, asyncio is supposed to be used for event-based multithreading, which seems relevant to what I'm trying to do.
It's hard to say, because I don't know exactly what "running continuously" means. What is it doing continuously? If it spends most of its time sleeping or blocking on a queue, it's fine - both of those things release the GIL. If it's constantly doing actual work, that will require the GIL, and therefore degrade the performance of the other threads in your app (assuming they're trying to do work at the same time). asyncio is designed for programs that are I/O-bound, and can therefore be run in a single thread, using asynchronous I/O. It sounds like your program may be a good fit for that depending on what your worker is doing.
The main thread is basically always just waiting for the user to press a key to access a different part of the menu. This seems like a situation asyncio would be perfect for, but, again, I'm not sure.
Any program where you're mostly waiting for I/O is potentially a good fit for asyncio - but only if you can find a library that makes curses (or whatever other GUI library you eventually choose) play nicely with it. Most GUI frameworks come with their own event loop, which will conflict with asyncio's, so you'd need a library that bridges the two. You'd also need to make sure you can find asyncio-compatible versions of any other synchronous-I/O-based libraries your application uses (e.g. a database driver).
That said, you're not likely to see any kind of performance improvement by switching from your thread-based program to something asyncio-based. It'll likely perform about the same. Since you're only dealing with 3 threads, the overhead of context switching between them isn't very significant, so switching from that to a single-threaded, asynchronous I/O approach isn't going to make a very big difference. asyncio will help you avoid thread synchronization complexity (if that's an issue with your app - it's not clear that it is) and, at least theoretically, would scale better if your app needed lots of threads, but it doesn't seem like that's the case. I think for you, it's basically down to which style you prefer to code in (assuming you can find all the asyncio-compatible libraries you need).
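For a feel of the asyncio style, here's a rough sketch of your watcher/worker pair as coroutines (using the modern asyncio.run() entry point; on older Pythons you'd drive the loop manually, and the event checks and task handling here are hypothetical placeholders):

    import asyncio

    async def watcher(q):
        while True:
            await asyncio.sleep(1)    # placeholder for real event checks
            await q.put("some task")  # hand work to the worker

    async def worker(q):
        while True:
            task = await q.get()      # suspends until work arrives, like a blocking queue.get()
            print("handling", task)   # placeholder for real task handling
            q.task_done()

    async def main():
        q = asyncio.Queue()
        await asyncio.gather(watcher(q), worker(q))  # runs forever, daemon-style

    asyncio.run(main())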

Why are Twisted's DeferredFilesystemLocks not safe for concurrent use?

In the Twisted API for DeferredFilesystemLock, it is stated that deferUntilLocked is not safe for concurrent use.
I would like to understand in what way it is unsafe and what makes it unsafe, in order to ensure that I don't misuse the file locks.
Arguably the method is actually quite safe for concurrent use. If you read the first four lines of the implementation then it's clear that an attempt at concurrent use will immediately raise AlreadyTryingToLockError.
Perhaps the warning is meant to tell you that you'll get an exception rather than useful locking behavior, though.
The implementation of that exception should provide a hint about why concurrent use isn't allowed. DeferredFilesystemLock uses some instance attributes, starting with _tryLockCall, to keep track of progress in the attempt to acquire the lock. If concurrent attempts were allowed, they would each trample over the use of this attribute (and others) by each other.
This could be enhanced with relative ease. All that would be necessary is to keep the state associated with the lock attempt on a new object allocated per-attempt (instead of on the DeferredFilesystemLock instance). Or, DeferredLock could help.
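For instance, a sketch of the DeferredLock idea - serializing attempts so only one deferUntilLocked call is ever in flight (the lock file name is hypothetical):

    from twisted.internet.defer import DeferredFilesystemLock, DeferredLock

    fslock = DeferredFilesystemLock('myapp.lock')  # hypothetical lock file name
    serializer = DeferredLock()

    def acquire():
        # DeferredLock.run() acquires the in-process lock, calls the
        # function, and releases the lock when the Deferred fires, so
        # concurrent callers queue up instead of trampling _tryLockCall.
        return serializer.run(fslock.deferUntilLocked)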
The first and most obvious thing that comes to mind is that in concurrent situations you're never guaranteed to acquire the lock (if another thread never releases it), so you may defer forever. You could avoid this by simply passing the optional timeout to deferUntilLocked.
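A sketch of that timeout usage (the lock file name and the work function are hypothetical, and this assumes a running reactor):

    from twisted.internet.defer import DeferredFilesystemLock

    lock = DeferredFilesystemLock('myapp.lock')  # hypothetical lock file name

    def locked(_):
        try:
            do_work()      # hypothetical critical section
        finally:
            lock.unlock()  # always release, even if do_work() raises

    d = lock.deferUntilLocked(timeout=10)  # errback after 10s instead of deferring forever
    d.addCallback(locked)
    d.addErrback(lambda failure: print("could not acquire lock:", failure.value))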
Other things to consider that may make this unsuitable for concurrent use:
Starvation: What if multiple threads are continually waiting to acquire the same lock - are they treated fairly, or will one thread spend longer waiting than others? Are threads guaranteed to eventually acquire the lock?
Deadlocks: If you're acquiring multiple locks at a time, and multiple threads are doing this, you may get into a situation where you have two threads both waiting on a resource that the other one holds.
Are you sure that acquired locks are always released? What if one thread acquires a lock and crashes without releasing it?
It looks to me like Twisted's implementation is fairly simple and probably doesn't take many of these things into account. Their "not safe" comment is a "there be dragons here" warning - you may (or will) run into difficult-to-troubleshoot concurrency bugs if you try to use this in a concurrent application.

Twisted - should this code be run in separate threads

I am running some code that has X workers, each worker pulling tasks from a queue every second. For this I use twisted's task.LoopingCall() function. Each worker fulfills its request (scrape some data) and then pushes the response back to another queue. All this is done in the reactor thread since I am not deferring this to any other thread.
I am wondering whether I should run all these jobs in separate threads or leave them as they are. And if so, is there a problem if I call task.LoopingCall every second from each thread?
No, you shouldn't use threads. You can't call LoopingCall from a thread (unless you use reactor.callFromThread), but it wouldn't help you make your code faster.
If you notice a performance problem, you may want to profile your workload, figure out where the CPU-intensive work is, and then put that work into multiple processes, spawned with spawnProcess. You really can't skip the step where you figure out where the expensive work is, though: there's no magic pixie dust you can sprinkle on your Twisted application that will make it faster. If you choose a part of your code which isn't very intensive and doesn't require blocking resources like CPU or disk, then you will discover that the overhead of moving work to a different process may outweigh any benefit of having it there.
You shouldn't use threads for that. Doing it all in the reactor thread is OK. If your scraping uses twisted.web.client for network access, it won't block, so you'll go as fast as the network allows.
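A minimal sketch of that single-threaded arrangement (the polling function body is a hypothetical placeholder):

    from twisted.internet import reactor, task

    def poll_queue():
        # Pull a task and kick off a non-blocking fetch, e.g. with
        # twisted.web.client; this runs in the reactor thread.
        pass

    loop = task.LoopingCall(poll_queue)
    loop.start(1.0)  # call poll_queue once per second
    reactor.run()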
First, beware that parts of Twisted use a background thread pool without telling you - the default DNS resolver, for example, runs hostname lookups in threads. Of course, I haven't seen your program in particular.
Second, in Python (that is, in CPython) spawning threads for CPU-bound computation has little benefit, since only one thread at a time can execute Python bytecode. Read up on the GIL (Global Interpreter Lock).

How to maximize performance in Python when doing many I/O bound operations?

I have a situation where I'm downloading a lot of files. Right now everything runs on one main Python thread, and downloads as many as 3000 files every few minutes. The problem is that the time it takes to do this is too long. I realize Python has no true multi-threading, but is there a better way of doing this? I was thinking of launching multiple threads since the I/O bound operations should not require access to the global interpreter lock, but perhaps I misunderstand that concept.
Multithreading is just fine for the specific purpose of speeding up I/O on the net (although asynchronous programming would give even greater performance). CPython's multithreading is quite "true" (native OS threads) -- what you're probably thinking of is the GIL, the global interpreter lock that stops different threads from simultaneously running Python code. But all the I/O primitives give up the GIL while they're waiting for system calls to complete, so the GIL is not relevant to I/O performance!
For asynchronous programming, the most powerful framework around is Twisted, but it can take a while to get the hang of it if you've never done such programming. It would probably be simpler for you to get extra I/O performance via the use of a pool of threads.
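As a sketch of the thread-pool approach using the standard library's concurrent.futures module (the URL list and worker count are placeholders to tune for your workload):

    import concurrent.futures
    import urllib.request

    urls = ["https://example.com/file%d" % i for i in range(3000)]  # hypothetical URL list

    def fetch(url):
        # urlopen releases the GIL while blocked on the network,
        # so many of these calls can overlap despite the GIL.
        with urllib.request.urlopen(url, timeout=30) as resp:
            return url, resp.read()

    with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
        for url, body in pool.map(fetch, urls):
            pass  # save body to disk, etc.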
Could always take a look at multiprocessing.
is there a better way of doing this?
Yes
I was thinking of launching multiple threads since the I/O bound operations
Don't.
At the OS level, all the threads in a process are sharing a limited set of I/O resources.
If you want real speed, spawn as many heavyweight OS processes as your platform will tolerate. The OS is really, really good about balancing I/O workloads among processes. Make the OS sort this out.
Folks will say that spawning 3000 processes is bad, and they're right. You probably only want to spawn a few hundred at a time.
What you really want is the following.
A shared message queue in which the 3000 URIs are queued up.
A few hundred workers which are all reading from the queue.
Each worker gets a URI from the queue and gets the file.
The workers can stay running. When the queue's empty, they'll just sit there, waiting for work.
"every few minutes" you dump the 3000 URI's into the queue to make the workers start working.
This will make full use of every resource on your machine, and it's quite trivial to implement. Each worker is only a few lines of code; loading the queue is a special "manager" that's just a few lines of code, too.
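A rough sketch of that design with the standard multiprocessing module (download() and load_uris() are hypothetical stand-ins for your fetching and URI-gathering code):

    import time
    import multiprocessing

    def worker(q):
        while True:
            uri = q.get()      # blocks until a URI is available
            download(uri)      # hypothetical: fetch and save the file

    if __name__ == '__main__':
        q = multiprocessing.Queue()
        for _ in range(200):   # "a few hundred" workers
            multiprocessing.Process(target=worker, args=(q,), daemon=True).start()
        while True:
            for uri in load_uris():  # hypothetical: the 3000 URIs
                q.put(uri)
            time.sleep(180)    # "every few minutes"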
Gevent is perfect for this.
Gevent's use of greenlets (lightweight coroutines in the same Python process) offers you asynchronous operations without compromising code readability or introducing abstract 'reactor' concepts into your mix.
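A minimal sketch (assuming the gevent package; the URL list is a placeholder):

    from gevent import monkey
    monkey.patch_all()  # make stdlib sockets cooperative; must run before other imports

    import gevent
    import urllib.request

    def fetch(url):
        with urllib.request.urlopen(url) as resp:
            return resp.read()

    urls = ["https://example.com/a", "https://example.com/b"]  # hypothetical
    jobs = [gevent.spawn(fetch, url) for url in urls]  # all fetches run concurrently
    gevent.joinall(jobs, timeout=60)
    results = [job.value for job in jobs]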

A multi-part/threaded downloader via python?

I've seen a few threaded downloaders online, and even a few multi-part downloaders (HTTP).
I haven't seen them together as a class/function.
If any of you have a class/function lying around, that I can just drop into any of my applications where I need to grab multiple files, I'd be much obliged.
If there is a library/framework (or a program's back-end) that does this, please direct me towards it.
Threadpool by Christopher Arndt may be what you're looking for. I've used this "easy to use object-oriented thread pool framework" for the exact purpose you describe and it works great. See the usage examples at the bottom on the linked page. And it really is easy to use: just define three functions (one of which is an optional exception handler in place of the default handler) and you are on your way.
from http://www.chrisarndt.de/projects/threadpool/:
Object-oriented, reusable design
Provides callback mechanism to process results as they are returned from the worker threads.
WorkRequest objects wrap the tasks assigned to the worker threads and allow for easy passing of arbitrary data to the callbacks.
The use of the Queue class solves most locking issues.
All worker threads are daemonic, so they exit when the main program exits, no need for joining.
Threads start running as soon as you create them. No need to start or stop them. You can increase or decrease the pool size at any time, superfluous threads will just exit when they finish their current task.
You don't need to keep a reference to a thread after you have assigned the last task to it. You just tell it: "don't come back looking for work, when you're done!"
Threads don't eat up cycles while waiting to be assigned a task, they just block when the task queue is empty (though they wake up every few seconds to check whether they are dismissed).
Also available from http://pypi.python.org/pypi/threadpool, via easy_install, or as a Subversion checkout (see the project homepage).
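From memory of the library's documented API, usage looks roughly like this (the worker and callback bodies are hypothetical; check the linked docs for exact signatures and callback behavior):

    import threadpool

    urls = ["https://example.com/a", "https://example.com/b"]  # hypothetical

    def do_work(url):
        return fetch_file(url)  # hypothetical worker body

    def handle_result(request, result):
        print(result)           # process a finished request's result

    pool = threadpool.ThreadPool(10)  # ten worker threads
    for req in threadpool.makeRequests(do_work, urls, handle_result):
        pool.putRequest(req)
    pool.wait()                 # block until all requests are processed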
