I'm trying to get more familiar with asyncio in Python 3, but I don't see when I should use async/await rather than threading. Is there a difference, or is one just easier to use than the other?
So, for example, between these two snippets, is one better than the other?
Just generic code:
import threading

def func1(): ...  # placeholder work
def func2(): ...  # placeholder work

threads = [threading.Thread(target=func1), threading.Thread(target=func2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
versus
import asyncio

async def func1(): ...  # placeholder work
async def func2(): ...  # placeholder work

await asyncio.gather(func1(), func2())
My advice is to use threads in all cases, unless you're writing a high-performance concurrent application and benchmarking shows that threading alone is too slow.
In principle there are a lot of things to like about coroutines. They're easier to reason about since every sequence of nonblocking operations can be treated as atomic (at least if you don't combine them with threads); and because they're cheap, you can spin them up willy-nilly for better separation of concerns, without worrying about killing your performance.
However, the actual implementation of coroutines in Python is a mess. The biggest problem is that every function that might block, or that might call any code that might block, or that might call any code that ... might call any code that might block, has to be rewritten with the async and await keywords. This means that if you want to use a library that blocks, or that calls back into your code and you want to block in the callbacks, then you just can't use that library at all. There are duplicate copies of a bunch of libraries in the CPython distribution now for this reason, and even duplicate copies of built-in syntax (async for, etc.), but you can't expect most of the modules available through pip to be maintained in two versions.
Threading doesn't have this problem. If a library wasn't designed to be thread safe, you can still use it in a multithreaded program, either by confining its use to one thread or by protecting all uses with a lock.
So threading is overall far easier to use in Python despite its problems.
There are some third party coroutine solutions that avoid the "async infection" problem, such as greenlet, but you still won't be able to use them with libraries that block internally unless they're specially designed to work with coroutines.
With asyncio, a coroutine hands control back to the event loop explicitly, at each await; with threads, the Python scheduler decides when to switch. Because a thread can be interrupted at almost any point, multithreaded code needs locks to prevent issues with shared memory. Threads can also run on several cores, although in CPython the GIL means only one thread executes Python bytecode at a time.
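To make the handoff concrete, here is a minimal runnable version of the question's second snippet, assuming Python 3.7+ for asyncio.run; the asyncio.sleep calls stand in for real I/O and mark the points where each coroutine hands control back to the event loop:

import asyncio

async def func1():
    print("func1 start")
    await asyncio.sleep(1)  # control returns to the event loop while "waiting"
    print("func1 done")

async def func2():
    print("func2 start")
    await asyncio.sleep(1)
    print("func2 done")

async def main():
    # Both coroutines run concurrently: total runtime is about 1 second, not 2.
    await asyncio.gather(func1(), func2())

asyncio.run(main())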
Here is a great article if you want to read more:
https://medium.com/@nhumrich/asynchronous-python-45df84b82434
Related
I am trying to learn the Twisted framework, but I am not able to get a handle on it.
Say, I have this function.
import time

def long_blocking_call(arg1, arg2):
    # do something
    time.sleep(5)  # simulate blocking call
    return result

results = []
for k, v in args.iteritems():
    r = long_blocking_call(k, v)
    results.append(r)
But I was wondering how I can leverage deferToThread (or something else in the Twisted world) to run long_blocking_call in "parallel".
I found this example: Periodically call deferToThread
But I am not exactly sure whether that is running things in parallel.
deferToThread uses Python's built-in threading support to run the function passed to it in a separate thread (from a thread pool).
So deferToThread has all of the same properties as the built-in threading module when it comes to parallelism. On CPython, threads can run in parallel as long as at most one of them holds the Global Interpreter Lock at any given moment.
Since there is no universal cause of "blocking" there is also no universal solution to "blocking" - so there's no way to say whether deferToThread will result in parallel execution or not in general. However, a general rule of thumb is that if the blocking comes from I/O it probably will and if it comes from computation it probably won't.
Of course, if it comes from I/O, you might be better off using some other feature from Twisted instead of multithreading.
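That said, if you do go the deferToThread route, a minimal sketch with the question's function might look like this (the args dict is a made-up stand-in for the question's data):

import time

from twisted.internet import reactor
from twisted.internet.defer import gatherResults
from twisted.internet.threads import deferToThread

def long_blocking_call(arg1, arg2):
    time.sleep(5)  # simulate blocking call
    return (arg1, arg2)

args = {"a": 1, "b": 2, "c": 3}  # hypothetical input

# Each call runs on a thread from the reactor's thread pool and
# immediately returns a Deferred for its eventual result.
deferreds = [deferToThread(long_blocking_call, k, v) for k, v in args.items()]

def done(results):
    print(results)  # one result per call, in the order the deferreds were made
    reactor.stop()

gatherResults(deferreds).addCallback(done)
reactor.run()

Because time.sleep releases the GIL, the three calls here finish in roughly five seconds total rather than fifteen (the reactor's default thread pool has ten threads).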
What I want is in the title. The background: I have thousands of requests to send to a very slow RESTful interface, in a program where no third-party packages may be imported except requests.
The speed of multithreading and multiprocessing is limited by the GIL and by the four-core computer the program will run on.
I know you can implement an incomplete coroutine in Python 2.7 with generators and the yield keyword, but how can I do thousands of requests with that incomplete coroutine ability?
Example
url_list = ["https://www.example.com/rest?id={}".format(num) for num in range(10000)]
results = request_all(url_list) # do asynchronously
First, you're starting from an incorrect premise.
The speed of multiprocessing is not limited by the GIL at all.
The speed of multiprocessing is only limited by the number of cores for CPU-bound work, which yours is not. And async doesn't work at all for CPU-bound work, so multiprocessing would be 4x better than async, not worse.
The speed of multithreading is only limited by the GIL for CPU-bound code, which, again, yours is not.
The speed of multithreading is barely affected by the number of cores. If your code is CPU-bound, the threads mostly end up serialized on a single core. But again, async is even worse here, not better.
The reason people use async is not that it solves any of these problems; in fact, it only makes them worse. The main advantage is that if you have a ton of workers that are doing almost no work, you can schedule a ton of waiting-around coroutines more cheaply than a ton of waiting-around threads or processes. The secondary advantage is that you can tie the selector loop to the scheduler loop and eliminate a bit of overhead coordinating them.
Second, you can't use requests with asyncio in the first place. It expects to be able to block the whole thread on socket reads. There was a project to rewrite it around an asyncio-based transport adapter, but it was abandoned unfinished.
The usual way around that is to use it in threads, e.g., with run_in_executor. But if the only thing you're doing is requests, building an event loop just to dispatch things to a thread pool executor is silly; just use the executor directly.
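For instance, a sketch of using a thread pool directly, written against Python 2.7's standard library plus requests (multiprocessing.dummy provides a thread-backed Pool there; concurrent.futures is only a third-party backport on 2.7, which the question's constraints rule out):

from multiprocessing.dummy import Pool  # thread pool from the stdlib

import requests

url_list = ["https://www.example.com/rest?id={}".format(num) for num in range(10000)]

def fetch(url):
    return requests.get(url).text

pool = Pool(32)  # 32 requests in flight; the other thousands queue up behind them
try:
    results = pool.map(fetch, url_list)
finally:
    pool.close()
    pool.join()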
Third, I doubt you actually need to have thousands of requests running in parallel. Although of course the details depend on your service or your network or whatever the bottleneck is, it's almost always more efficient to have a thread pool that runs, say, 12 or 64 requests in parallel, with the other thousands queued up behind them.
Handling thousands of concurrent connections (and therefore workers) is usually something you only have to do on a server. Occasionally you have to do it on a client that's aggregating data from a huge number of different services. But if you're just hitting a single service, there's almost never any benefit to that much concurrency.
Fourth, if you really do want a coroutine-based event loop in Python 2, by far the easiest way is to use gevent or greenlets or another such library.
Yes, they give you an event loop hidden under the covers where you can't see it, and "magic" coroutines where the yielding happens inside methods like socket.send and Thread.join instead of being explicitly visible with await or yield from, but the plus side is that they already work—and, in fact, the magic means they work with requests, which anything you build will not.
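For reference, the gevent version is short precisely because of that magic (again, it's a third-party package, so it's outside the question's constraints):

import gevent.monkey
gevent.monkey.patch_all()  # patch sockets so requests yields instead of blocking

import gevent
import requests

url_list = ["https://www.example.com/rest?id={}".format(num) for num in range(10000)]

# Greenlets are cheap enough to spawn one per URL; real code would
# probably bound concurrency with gevent.pool.Pool instead.
jobs = [gevent.spawn(requests.get, url) for url in url_list]
gevent.joinall(jobs)
results = [job.value.text for job in jobs]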
Of course you don't want to use any third-party libraries. Building something just like greenlets yourself on top of Stackless or PyPy is pretty easy; building it for CPython is a lot more work. And then you still have to do all the monkeypatching that gevent does to make libraries like sockets work like magic, or rewrite requests around explicit greenlets.
Anyway, if you really want to build an event loop on top of just plain yield, you can.
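To make that concrete, here is a toy round-robin scheduler over plain generators. It has none of the I/O integration discussed below, but it shows the core trick: each yield is a cooperative switch point where control bounces back to the loop.

from collections import deque

def scheduler(tasks):
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # the task finished; drop it
        queue.append(task)      # otherwise, put it back in line

def countdown(name, n):
    while n:
        print("%s %d" % (name, n))
        n -= 1
        yield  # cooperative switch point

scheduler([countdown("a", 3), countdown("b", 2)])
# prints: a 3, b 2, a 2, b 1, a 1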
In Greg Ewing's original papers on why Python needed to add yield from, he included examples of a coroutine event loop built on plain yield, and a better one that uses an explicit trampoline to yield to, with a simple networking-driven example. He even wrote an automatic translator that converted code written for the (then-unimplemented) yield from into Python 3.1 code.
Notice that having to bounce every yield off a trampoline makes things a lot less efficient. There's really no way around that. That's a good part of the reason we have yield from in the language.
But that's just the scheduler part with a bit of toy networking. You still need to integrate a selectors loop and then write coroutines to replace all of the socket functions you need. Consider how long asyncio took Guido to build when he knew Python inside and out and had yield from to work with… but then you can steal most of his design, so it won't be quite that bad. Still, it's going to be a lot of work.
(Oh, and you don't have selectors in Python 2. If you don't care about Windows, it's pretty easy to build the part you need out of the select module, but if you do care about Windows, it's a lot more work.)
And remember, because requests won't work with your code, you're also going to need to reimplement most of it as well. Or, maybe better, port aiohttp from asyncio to your framework.
And, in the end, I'd be willing to give you odds that the result is not going to be anywhere near as efficient as aiohttp in Python 3, or requests on top of gevent in Python 2, or just requests on top of a thread pool in either.
And, of course, you'll be the only person in the world using it. asyncio had hundreds of bugs to fix between tulip and going into the stdlib, which were only detected because dozens of early adopters (including people who are serious experts on this kind of thing) were hammering on it. And requests, aiohttp, gevent, etc. are all used by thousands of servers handling zillions of dollars worth of business, so you benefit from all of those people finding bugs and needing fixes. Whatever you build almost certainly won't be nearly as reliable as any of those solutions.
All this for something you're probably going to need to port to Python 3 anyway, since Python 2 hits end-of-life in less than a year and a half, and distros and third-party libraries are already disengaging from it. For a relevant example, requests 3.0 is going to require at least Python 3.5; if you want to stick with Python 2.7, you'll be stuck with requests 2.1 forever.
I saw this page suggesting the use of the defer module to execute a series of tasks asynchronously.
I want to use it for my project:
1. Calculate the median of each list of numbers I have (I have a list containing lists of numbers).
2. Get the minimum and maximum of all the medians.
But as a matter of fact, I did not quite understand how to use it.
I would love some explanation about defer in Python, and whether you think it is the appropriate way to achieve my goal (considering the Global Interpreter Lock).
Thanks in advance!
No, using asynchronous programming (cooperative routines, a.k.a. coroutines) will not help your use case. Async is great for I/O-intensive workloads, or anything else that has to wait for slower, external events to fire.
Coroutines work because they give up control (yield) to other coroutines whenever they have to wait for something (usually for some I/O to take place). If they do this frequently, the event loop can alternate between loads of coroutines, often far more than what threading could achieve, with a simpler programming model (no need to lock data structures all the time).
Your use case is not waiting for I/O, however; you have a computationally heavy workload. Such workloads do not have obvious places to yield, and because they don't need to wait for external events, there is no reason to do so anyway. For such a workload, use a multiprocessing model to do work in parallel on different CPU cores.
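For example, a sketch of the multiprocessing approach applied to the question's task (statistics.median exists in Python 3.4+; on older versions, substitute your own median function):

from multiprocessing import Pool
from statistics import median

def min_max_of_medians(lists_of_numbers):
    pool = Pool()  # one worker process per CPU core by default
    try:
        medians = pool.map(median, lists_of_numbers)  # computed in parallel
    finally:
        pool.close()
        pool.join()
    return min(medians), max(medians)

if __name__ == "__main__":
    data = [[1, 5, 3], [10, 2, 8], [7, 7, 7]]  # example input
    print(min_max_of_medians(data))  # (3, 8)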
Asynchronous programming does not defeat the GIL either, but it does give the event loop the opportunity to move the waiting-for-I/O parts into C code that can unlock the GIL and handle all that I/O processing in parallel, while other Python code (in a different coroutine) executes.
See this talk by my colleague Łukasz Langa at PyCon 2016 for a good introduction to async programming.
CPython has a Global Interpreter Lock (GIL).
So, multiple threads cannot concurrently run Python bytecodes.
What, then, is the use and relevance of the threading module in CPython?
During I/O, the GIL is released so other threads can run.
Also, some extensions (like numpy) can release the GIL when doing calculations.
So an important purpose is to improve the performance of programs that are not CPU-bound. From the Python documentation for the threading module:
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
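As a minimal illustration of the I/O-bound case (the URLs are placeholders): each thread below spends most of its time blocked in a socket read, during which the GIL is released and the other threads make progress:

import threading
import urllib.request

def download(url):
    # urlopen blocks on network I/O; the GIL is released while waiting.
    with urllib.request.urlopen(url) as response:
        print(url, len(response.read()))

urls = ["https://www.example.com/"] * 5  # placeholder URLs
threads = [threading.Thread(target=download, args=(url,)) for url in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()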
Another benefit of threading is the ability to do long-running calculations in a GUI program without having to chop the calculations into pieces small enough to fit in timeout functions.
Also keep in mind that while CPython has a GIL now, that might not always be the case in the future.
When Python runs some code, the code is compiled into small "atomic" bytecode instructions. Every few hundred instructions (or, in CPython 3.2+, every few milliseconds), Python switches to the next thread and executes that thread's instructions. This allows code to run pseudo-parallel.
Let's assume you have this code:
def f1():
    while True:
        pass  # wait for incoming connections and serve a website to them

def f2():
    while True:
        pass  # get new tweets and process them
And you want to execute f1() and f2() at the same time. In this case you can simply use threading, and you don't need to worry about breaking out of the loops every now and then to let the other function run. This is also far easier than asynchronous programming.
Simply put: it makes it easier to write scripts that need to do multiple things at once.
Also, as @Roland Smith said, Python releases the GIL during I/O and in some other low-level C code.
I have already worked with Python async frameworks like Twisted and Tornado. I also know that Python has a native implementation of async calls via the asyncio module. I thought that (threads, multiprocessing) and async calls were different concepts, but not long ago I watched a couple of videos on threading and multiprocessing, and it seems that all this async stuff is built on top of them. Is that true?
No, async calls are a way to structure a program; threading and multiprocessing may be used to implement some of these calls (but they are neither necessary nor common in Python asynchronous frameworks).
Concurrency is not parallelism:
In programming, concurrency is the composition of independently executing processes, while parallelism is the simultaneous execution of (possibly related) computations.
Do not confuse how the program text is organized with how it is implemented (or executed). The exact same asynchronous code may be executed in a single thread, in multiple threads, or in multiple processes. It is easy to switch between simple pool-based code that uses multiprocessing.Pool (processes), multiprocessing.dummy.Pool (threads), or their gevent-patched versions (single-threaded). Also, if there is only a single CPU, then processes won't necessarily run in parallel, but the OS can still make them run concurrently.
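To illustrate how little the program text needs to care about the execution model, here is a small sketch (with a placeholder URL) where changing a single import switches the same code between processes and threads:

from multiprocessing import Pool          # process-backed Pool
# from multiprocessing.dummy import Pool  # thread-backed Pool, identical API

import urllib.request

def fetch(url):
    return len(urllib.request.urlopen(url).read())

if __name__ == "__main__":
    urls = ["https://www.example.com/"] * 4  # placeholder URLs
    with Pool(4) as pool:
        print(pool.map(fetch, urls))  # same code either way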
If by async you mean the async keyword in Python, then it marks a coroutine function -- just one of the ways to create awaitable objects. asyncio is not the only way to consume such objects; e.g., there is curio, which uses async functions but whose backend is independent of asyncio. Recommended video: Python Concurrency From the Ground Up: LIVE!.
No, generally async is single-threaded, and implementing async absolutely does not require multiple threads or processes (that's the whole point of async). But there are use cases where people may want to mix them together for whatever reason.
In this model [the async model], the tasks are interleaved with one another, but in a single thread of control. This is simpler than the threaded case because the programmer always knows that when one task is executing, another task is not. Although in a single-processor system a threaded program will also execute in an interleaved pattern, a programmer using threads should still think in terms of Figure 2, not Figure 3, lest the program work incorrectly when moved to a multi-processor system. But a single-threaded asynchronous system will always execute with interleaving, even on a multi-processor system.
Source: http://krondo.com/?p=1209