Using concurrent.futures.Future with greenlets/gevent - python

I have a Python library that performs asynchronous network I/O via multicast, which may garner replies from other services. It hides the dirty work by returning a Future which will capture a reply. I am integrating this library into an existing gevent application. The call pattern is as simple as:
future = service.broadcast()
# next call blocks the current thread
reply = future.result(some_timeout)
Under the hood, concurrent.futures.Future.result() uses threading.Condition.wait().
With a monkey-patched threading module, this seems fine and safe, and non-blocking with greenlets.
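For illustration, a minimal sketch of that setup (assuming the patch runs before the library, called service here as in the snippet above, is imported):
from gevent import monkey
monkey.patch_all()  # patches threading, so Condition.wait() yields to the hub

import service  # the hypothetical library from the question

future = service.broadcast()
# With threading patched, this waits cooperatively rather than blocking an OS thread.
reply = future.result(5.0)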
Is there any reason to be worried here or when mixing gevent and concurrent.futures?

Well, as far as I can tell, futures isn't documented to work on top of threading.Condition, and gevent isn't documented to be able to patch futures safely. So, in theory, someone could write a Python implementation that would break gevent.
But in practice? It's hard to imagine what such an implementation would look like. You obviously need some kind of sync objects to make a Future work. Sure, you could use an Event, Lock, and RLock instead of a Condition, but that won't cause a problem for gevent. The only way an implementation could plausibly break things would be to go directly to the pthreads/Win32/Java/.NET/whatever sync objects instead of using the wrappers in threading.
How would you deal with that if it happened? Well, futures is implemented in pure Python, and it's pretty simple Python, and there's a fully functional backport that works with 2.5+/3.2+. So, you'd just have to grab that backport and swap out concurrent.futures for futures.
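If it ever came to that, the swap is a one-line import change; a sketch, assuming the backport's old top-level futures name (pip install futures):
try:
    from futures import Future  # the pure-Python backport from PyPI
except ImportError:
    from concurrent.futures import Future  # fall back to the stdlib module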
So, if you're doing something wacky like deploying a server that's going to run for 5 years unattended and may have its Python repeatedly upgraded underneath it, maybe I'd install the backport now and use that instead.
Otherwise, I'd just document the assumption (and the workaround in case it's ever broken) in the appropriate place, and then just use the stdlib module.

Related

How can I implement async requests without 3rd party packages in Python 2.7

What I want is in the title. The background is that I have thousands of requests to send to a very slow RESTful interface, in a program where no third-party packages may be imported, except requests.
The speed of multithreading and multiprocessing is limited by the GIL and by the 4-core computer on which the program will run.
I know you can implement an incomplete coroutine in Python 2.7 with generators and the yield keyword, but how can I do thousands of requests with that incomplete coroutine ability?
Example
url_list = ["https://www.example.com/rest?id={}".format(num) for num in range(10000)]
results = request_all(url_list) # do asynchronously
First, you're starting from an incorrect premise.
The speed of multiprocessing is not limited by the GIL at all.
The speed of multiprocessing is only limited by the number of cores for CPU-bound work, which yours is not. And async doesn't work at all for CPU-bound work, so multiprocessing would be 4x better than async, not worse.
The speed of multithreading is only limited by the GIL for CPU-bound code, which, again, yours is not.
The speed of multithreading is barely affected by the number of cores. If your code is CPU-bound, the threads mostly end up serialized on a single core. But again, async is even worse here, not better.
The reason people use async is not that it solves any of these problems; in fact, it only makes them worse. The main advantage is that if you have a ton of workers that are doing almost no work, you can schedule a ton of waiting-around coroutines more cheaply than a ton of waiting-around threads or processes. The secondary advantage is that you can tie the selector loop to the scheduler loop and eliminate a bit of overhead coordinating them.
Second, you can't use requests with asyncio in the first place. It expects to be able to block the whole thread on socket reads. There was a project to rewrite it around an asyncio-based transport adapter, but it was abandoned unfinished.
The usual way around that is to use it in threads, e.g., with run_in_executor. But if the only thing you're doing is requests, building an event loop just to dispatch things to a thread pool executor is silly; just use the executor directly.
Third, I doubt you actually need to have thousands of requests running in parallel. Although of course the details depend on your service or your network or whatever the bottleneck is, it's almost always more efficient to have a thread pool that can run, say, 12 or 64 requests in parallel, with the other thousands queued up behind them.
Handling thousands of concurrent connections (and therefore workers) is usually something you only have to do on a server. Occasionally you have to do it on a client that's aggregating data from a huge number of different services. But if you're just hitting a single service, there's almost never any benefit to that much concurrency.
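A minimal sketch of that executor approach, reusing url_list from the question (on Python 2.7 this assumes the futures backport, pip install futures):
import requests
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    return requests.get(url).text

def request_all(url_list, max_workers=12):
    # At most max_workers requests run in parallel; the other
    # thousands of URLs queue up behind them inside the executor.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, url_list))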
Fourth, if you really do want a coroutine-based event loop in Python 2, by far the easiest way is to use gevent or greenlets or another such library.
Yes, they give you an event loop hidden under the covers where you can't see it, and "magic" coroutines where the yielding happens inside methods like socket.send and Thread.join instead of being explicitly visible with await or yield from, but the plus side is that they already work—and, in fact, the magic means they work with requests, which anything you build will not.
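For a sense of scale, here's a hedged sketch of the gevent version (the monkey-patching must happen before requests is imported; gevent.pool.Pool caps the concurrency, per the point above):
from gevent import monkey
monkey.patch_all()  # make socket, ssl, etc. yield to the hidden event loop

import requests
from gevent.pool import Pool

def fetch(url):
    return requests.get(url).text

def request_all(url_list, concurrency=64):
    # Each fetch runs in a greenlet; the pool caps how many run at once.
    pool = Pool(concurrency)
    return pool.map(fetch, url_list)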
Of course you don't want to use any third-party libraries. Building something just like greenlets yourself on top of Stackless or PyPy is pretty easy; building it for CPython is a lot more work. And then you still have to do all the monkeypatching that gevent does to make libraries like socket work like magic, or rewrite requests around explicit greenlets.
Anyway, if you really want to build an event loop on top of just plain yield, you can.
In Greg Ewing's original papers on why Python needed to add yield from, he included examples of a coroutine event loop built on just yield, and a better one that uses an explicit trampoline to yield to, with a simple networking-driven example. He even wrote an automatic translator from code written for the (then-unimplemented) yield from to code that runs on Python 3.1.
Notice that having to bounce every yield off a trampoline makes things a lot less efficient. There's really no way around that. That's a good part of the reason we have yield from in the language.
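To make that cost concrete, here's a toy sketch (not Ewing's actual code; Call, Return, and run are made-up names). Every call into a sub-coroutine and every value handed back has to bounce through the scheduler's explicit stack; yield from later let the interpreter short-circuit that hop:
class Call(object):
    # "please step into this sub-coroutine"
    def __init__(self, gen):
        self.gen = gen

class Return(object):
    # "hand this value back to my caller"
    def __init__(self, value):
        self.value = value

def run(task):
    stack, value = [], None
    while True:
        try:
            result = task.send(value)
        except StopIteration:
            if not stack:
                return
            task, value = stack.pop(), None
            continue
        value = None
        if isinstance(result, Call):
            stack.append(task)    # suspend the caller...
            task = result.gen     # ...and step into the callee
        elif isinstance(result, Return):
            task = stack.pop()    # resume the caller...
            value = result.value  # ...with the callee's result

def add_one(n):
    yield Return(n + 1)

def main():
    x = yield Call(add_one(41))
    print(x)  # 42

run(main())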
But that's just the scheduler part with a bit of toy networking. You still need to integrate a selectors loop and then write coroutines to replace all of the socket functions you need. Consider how long asyncio took Guido to build when he knew Python inside and out and had yield from to work with… but then you can steal most of his design, so it won't be quite that bad. Still, it's going to be a lot of work.
(Oh, and you don't have selectors in Python 2. If you don't care about Windows, it's pretty easy to build the part you need out of the select module, but if you do care about Windows, it's a lot more work.)
And remember, because requests won't work with your code, you're also going to need to reimplement most of it as well. Or, maybe better, port aiohttp from asyncio to your framework.
And, in the end, I'd be willing to give you odds that the result is not going to be anywhere near as efficient as aiohttp in Python 3, or requests on top of gevent in Python 2, or just requests on top of a thread pool in either.
And, of course, you'll be the only person in the world using it. asyncio had hundreds of bugs to fix between tulip and going into the stdlib, which were only detected because dozens of early adopters (including people who are serious experts on this kind of thing) were hammering on it. And requests, aiohttp, gevent, etc. are all used by thousands of servers handling zillions of dollars worth of business, so you benefit from all of those people finding bugs and needing fixes. Whatever you build almost certainly won't be nearly as reliable as any of those solutions.
All this for something you're probably going to need to port to Python 3 anyway, since Python 2 hits end-of-life in less than a year and a half, and distros and third-party libraries are already disengaging from it. For a relevant example, requests 3.0 is going to require at least Python 3.5; if you want to stick with Python 2.7, you'll be stuck with the last 2.x release of requests forever.

How can I do asynchronous programming but hide it in Python?

I'm just getting my head around Twisted, threading, Stackless, etc., and would appreciate some high-level advice.
Suppose I have remote clients 1 and 2, connected via a websocket running in a page on their browsers. Here is the ideal goal:
for cl in (1, 2):
    guess[cl] = show(cl, choice("Pick a number:", range(1, 11)))
checkpoint()
if guess[1] == guess[2]:
    show((1, 2), display("You picked the same number!"))
Ignoring the mechanics of show, choice and display, the point is that I want the show call to be asynchronous. Each client gets shown the choice. The code waits at checkpoint() for all the threads (or whatever) to rejoin.
I would be interested in hearing answers even if they involve hairy things like rewriting the source code. I'd also be interested in less hairy answers which involve compromising a bit on the syntax.
The simplest solution, code-wise, is to use a framework like Autobahn, which supports remote procedure calls (RPC). That means you can call some JavaScript in the browser and wait for the result.
If you want to call two clients, you will have to use threads.
You can also do it manually. The approach works along these lines (see the sketch after the list):
You need to pass a callback to show().
show() needs to register the callback with some kind of string ID in a global dict
show() must send this ID to the client
When the client sends the answer, it must include the ID.
The Python handler can then remove the callback from the global dict and invoke it with the answer
The callback needs to collect the results.
When it has enough results (two in your case), it must send status updates to the clients.
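A hedged sketch of that manual flow (pending_callbacks, send_to_client, and on_client_answer are made-up names, and the WebSocket plumbing is elided):
import uuid

pending_callbacks = {}  # callback ID -> callback

def show(client, question, callback):
    callback_id = uuid.uuid4().hex
    pending_callbacks[callback_id] = callback
    send_to_client(client, question, callback_id)  # hypothetical transport call

def on_client_answer(callback_id, answer):
    # Called by the WebSocket handler; the answer must include the ID.
    pending_callbacks.pop(callback_id)(answer)

def make_collector(expected, on_done):
    # Returns a callback that collects results until it has enough.
    results = []
    def collect(answer):
        results.append(answer)
        if len(results) == expected:
            on_done(results)  # e.g., compare guesses and notify the clients
    return collect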
You can simplify the code using yield, but the theory behind it is a bit complex to understand: see What does the "yield" keyword do in Python? and coroutines.
In Python, the most widely-used approach to async/event-based network programming that hides that model from the programmer is probably gevent.
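Under gevent, the questioner's pseudocode maps over almost directly; a sketch, assuming show() ends up blocking on monkey-patched socket I/O (show, choice, and display are the hypothetical APIs from the question):
import gevent

def ask(cl):
    return show(cl, choice("Pick a number:", range(1, 11)))

jobs = {cl: gevent.spawn(ask, cl) for cl in (1, 2)}
gevent.joinall(jobs.values())  # the checkpoint(): wait for both replies
guess = {cl: job.value for cl, job in jobs.items()}
if guess[1] == guess[2]:
    show((1, 2), display("You picked the same number!"))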
Beware: this kind of trickery works by making tasks yield control implicitly, which encourages the same sorts of surprising bugs that tend to appear when OS threads are involved. Local reasoning about such problems is significantly harder than with explicit yielding, and the convenience of avoiding callbacks might not be worth the trouble introduced by the inherent pitfalls. Perhaps just as important to a library author like yourself: this approach is not pure Python, and would force dependencies and interpreter restrictions on the users of your library.
A lot of discussion about this topic sprouted up (especially between the gevent and twisted camps) while Guido was working on the asyncio library, which was called tulip at the time. He summarized the main issues here.

Using `concurrent.futures.Future` as promise

In the Python docs I see:
concurrent.futures.Future... ...should not be created directly
except for testing.
And I want to use it as a promise in my code and I'm very surprised that it is not recommended to use it like this.
My use case:
I have a single thread that reads data packets coming from a socket, and I have many callbacks that are called depending on some information contained in the packets. Packets are responses to consumers' requests, and all consumers share a single connection. Each consumer receives a promise and adds some handlers to it, which are invoked when the response arrives.
So I can't use an Executor subclass here, because I have only one thread, but I need to create many Futures (promises).
Promises are a pretty widespread programming technique, and I thought that Future was Python's promise implementation. But if it's not recommended to use it like a promise, what do Pythonistas commonly use for this purpose?
Note
I use the Python 2.7 backport of concurrent.futures.
It's perfectly fine to use Future in order to wrap non-promise APIs into promises.
The reason it generally should not be created directly is that, most of the time, people creating futures directly are doing the deferred antipattern: wrapping an executor-created future in another future.
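For the use case in the question, that legitimate wrapping might look like this sketch (pending, send_request, and on_packet are made-up names; the socket write is elided):
from concurrent.futures import Future

pending = {}  # request ID -> Future

def send_request(request_id, payload):
    future = Future()
    pending[request_id] = future
    # ... write payload to the shared connection here ...
    return future

def on_packet(request_id, body):
    # Called by the single reader thread for each incoming packet.
    pending.pop(request_id).set_result(body)

# A consumer attaches handlers instead of blocking:
# send_request(42, data).add_done_callback(lambda f: handle(f.result()))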
It's worth mentioning that this future implementation is very weak; it's akin to Java's old futures, and the cool stuff promises give you, like chaining, is simply missing. It's also worth noting that languages like JavaScript got their promises from Python's Twisted, which has a better implementation, even if it's intertwined with other things.

Gevent/Eventlet monkey patching for DB drivers

After doing Gevent/Eventlet monkey patching, can I assume that whenever a DB driver (e.g. redis-py, pymongo) uses I/O through the standard library (e.g. socket), it will be asynchronous?
So is using eventlet's monkey patching enough to make, e.g., redis-py non-blocking in an eventlet application?
From what I know, it should be enough as long as I am careful about connection usage (e.g. using a different connection for each greenlet). But I want to be sure.
If you know what else is required, or how to use DB drivers correctly with Gevent/Eventlet, please describe that as well.
You can assume it will be magically patched if all of the following are true.
You're sure all of the I/O is built on top of standard Python sockets or other things that eventlet/gevent monkeypatches: no files, no native (C) socket objects, etc.
You pass aggressive=True to patch_all (or patch_select), or you're sure the library doesn't use select or anything similar.
The driver doesn't use any (implicit) internal threads. (If the driver does use threads internally, patch_thread may work, but it may not.)
If you're not sure, it's pretty easy to test—probably easier than reading through the code and trying to work it out. Have one greenlet that just does something like this:
while True:
    print("running")
    gevent.sleep(0.1)
Then have another that runs a slow query against the database. If it's monkeypatched, the looping greenlet will keep printing "running" 10 times/second; if not, the looping greenlet will not get to run while the program is blocked on the query.
So, what do you do if your driver blocks?
The easiest solution is to use a truly concurrent threadpool for DB queries. The idea is that you fire off each query (or batch) as a threadpool job, and block only the calling greenlet on the completion of that job. (For really simple cases, where you don't need many concurrent queries, you can just spawn a threading.Thread for each one instead, but usually you can't get away with that.)
If the driver does significant CPU work (e.g., you're using something that runs an in-process cache, or even an entire in-process DBMS like sqlite), you want this threadpool to actually be implemented on top of processes, because otherwise the GIL may prevent your greenlets from running. Otherwise (especially if you care about Windows), you probably want to use OS threads. (However, this means you can't patch_thread(); if you need to do that, use processes.)
If you're using eventlet, and you want to use threads, there's a built-in simple solution called tpool that may be sufficient. If you're using gevent, or you need to use processes, this won't work. Unfortunately, blocking a greenlet (without blocking the whole event loop) on a real threading object is a bit different between eventlet and gevent, and not documented very well, but the tpool source should give you the idea. Beyond that part, the rest is just using concurrent.futures (see futures on pypi if you need this in 2.x or 3.1) to execute the tasks on a ThreadPoolExecutor or ProcessPoolExecutor. (Or, if you prefer, you can go right to threading or multiprocessing instead of using futures.)
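As one concrete illustration, current gevent ships a thread pool that handles the greenlet-blocking part for you (a sketch, assuming a driver object with a blocking query method):
from gevent.threadpool import ThreadPool

pool = ThreadPool(8)

def query_async(db, sql):
    # Runs db.query in a real OS thread; only the calling greenlet blocks
    # on the result, while the hub keeps scheduling other greenlets.
    return pool.apply(db.query, (sql,))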
Can you explain why I should use OS threads on Windows?
The quick summary is: If you stick to threads, you can pretty much just write cross-platform code, but if you go to processes, you're effectively writing code for two different platforms.
First, read the Programming guidelines for the multiprocessing module (both the "All platforms" section and the "Windows" section). Fortunately, a DB wrapper shouldn't run into most of this. You only need to deal with processes via the ProcessPoolExecutor. And, whether you wrap things up at the cursor-op level or the query level, all your arguments and return values are going to be simple types that can be pickled. Still, it's something you have to be careful about, which otherwise wouldn't be an issue.
Meanwhile, Windows has very low overhead for its intra-process synchronization objects, but very high overhead for its inter-process ones. (It also has very fast thread creation and very slow process creation, but that's not important if you're using a pool.) So, how do you deal with that? I had a lot of fun creating OS threads to wait on the cross-process sync objects and signal the greenlets, but your definition of fun may vary.
Finally, tpool can be adapted trivially to a ppool for Unix, but it takes more work on Windows (and you'll have to understand Windows to do that work).
abarnert's answer is correct and very comprehensive. I just want to add that there is no "aggressive" patching in eventlet; that is probably a gevent feature. Also, if a library uses select, that is not a problem, because eventlet can monkey patch that too.
Indeed, in most cases eventlet.monkey_patch() is all you need. Of course, it must be done before creating any sockets.
If you still have any issues, feel free to open an issue, or write to the eventlet mailing list or G+ community. All relevant links can be found at http://eventlet.net/

Python Twisted and database connections

Our projects at work include synchronous applications (short lived) and asynchronous Twisted applications (long lived). We're re-factoring our database and are going to build an API module to decouple all of the SQL in that module. I'd like to create that API so both synchronous and asynchronous applications can use it. For the synchronous applications I'd like calls to the database API to just return data (blocking) just like using MySQLdb, but for the asynchronous applications I'd like calls to the same API functions/methods to be non-blocking, probably returning a deferred. Anyone have any hints, suggestions or help they might offer me to do this?
Thanks in advance,
Doug
twisted.enterprise.adbapi seems the way to go -- do you think it fails to match your requirements, and if so, can you please explain why?
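A minimal sketch of that route, with made-up connection parameters and assuming the MySQLdb driver; runQuery hands the blocking DB-API work to a thread pool and returns a Deferred that fires with the rows:
from twisted.enterprise import adbapi

dbpool = adbapi.ConnectionPool("MySQLdb", db="mydb", user="me", passwd="secret")

def got_rows(rows):
    for row in rows:
        print(row)

d = dbpool.runQuery("SELECT * FROM widgets WHERE size = %s", (42,))
d.addCallback(got_rows)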
Within Twisted, you basically want a wrapper around a function which returns a Deferred (such as the Twisted DB layer), waits for its results, and returns them. However, you can't busy-wait, since that would use up your reactor cycles, and checking for a task to complete with a Twisted non-blocking wait is probably inefficient.
Will inlineCallbacks or deferredGenerator solve your problem? They require a modern Twisted. See the twistedmatrix docs.
def thingummy():
    thing = yield makeSomeRequestResultingInDeferred()
    print thing  # the result! hoorj!
thingummy = inlineCallbacks(thingummy)
Another option would be to have two methods which execute the same SQL template, one which uses runInteraction, which blocks, and one which uses runQuery, which returns a Deferred, but that would involve more code paths which do the same thing.
Have you considered borrowing a page from continuation-passing style? Stackless Python supports continuations directly, if you're using it, and the approach appears to have gained some interest already.
All the database libraries I've seen seem to be stubbornly synchronous.
It appears that twisted.enterprise.adbapi solves this problem by using threads to manage a connection pool and wrapping the underlying database libraries. This is obviously not ideal, but I suppose it would work; I haven't actually tried it myself.
Ideally there would be some way to have sqlalchemy and twisted integrated. I found this project, nadbapi, which claims to do it, but it looks like it hasn't been updated since 2007.