In the Python documentation about threads and the GIL, it says:
In order to emulate concurrency of execution, the interpreter regularly tries to switch threads (see sys.setswitchinterval())
Why would it do this? These context switches appear to do nothing other than waste time. Wouldn't it be quicker to run each thread until it releases the GIL, and then run the next?
A thread doesn't necessarily do any I/O. You could have one thread doing number crunching and another handling I/O. Under your proposal, the number-crunching thread would never drop the GIL, so the other thread could never handle the I/O.
To ensure every thread gets to run, a thread will by default drop the GIL after 5 ms (in Python 3) if it hasn't already released it while waiting for I/O.
You can change this interval with sys.setswitchinterval().
Threading is a simple concurrency technique. For a more efficient concurrency technique look into asyncio which offers single-threaded concurrency using coroutines.
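As a quick check of the 5 ms default and the setter mentioned above, here is a minimal sketch (the exact default could in principle differ between builds):

```python
import sys

# The default switch interval is 5 ms: a thread holding the GIL is asked
# to drop it after this long so other runnable threads get a chance.
print(sys.getswitchinterval())  # 0.005 by default

# A smaller interval makes switching more responsive (at the cost of more
# context-switch overhead); a larger one favors raw throughput.
sys.setswitchinterval(0.001)
print(sys.getswitchinterval())
```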
It seems that several asyncio synchronization primitives, like those shown here, are not thread-safe...
By "not thread safe", considering for example asyncio.Lock, I assume that this lock won't protect a shared variable when we're running multiple threads, so race conditions are a problem.
So, what's the point of having a Lock that doesn't lock? (not a criticism, but an honest doubt)
What are the use cases for these non-thread-safe primitives?
Asyncio is not made for multithreading or multiprocessing work; it was originally designed for asynchronous I/O (network) operations with little overhead. Hence a lock that only ever runs in a single thread (as is the case for tasks running in an asyncio event loop) doesn't need to be thread-safe or process-safe, and therefore doesn't suffer the extra overhead of a thread-safe or process-safe lock.
The thread and process executors were only added to allow mingling threaded futures and multiprocessing futures seamlessly with an application running an event loop, such as passing them to the as_completed function or awaiting their completion as you would with a normal single-threaded asyncio task.
If you want a thread-safe lock you can use a threading.Lock, and if you want a process-safe lock you should use a multiprocessing.Lock and accept the extra overhead.
Keep in mind that those locks still work in an asyncio event loop and perform almost the same function as an asyncio.Lock; they just have higher overhead and will make your code slower, so don't use them unless you need your code to be thread-safe or process-safe.
To briefly explain the difference: when a thread is halted by a thread-safe lock, the thread is suspended and rescheduled by the operating system, which has a big overhead compared to an asyncio lock, which returns control to the event loop and continues execution instead of halting the thread.
Edit: a threading.Lock is not a direct replacement for an asyncio.Lock; instead you should use a threading.RLock followed by an asyncio.Lock to make a function both thread-safe and asyncio-safe, as this avoids a thread deadlocking itself.
Edit 2: as commented by @dano, you can wait for a threading.Lock indirectly using the answer in this question if you want a function to work both threaded and in an asyncio event loop at the same time, but running a function in both at once is not recommended anyway: How to use threading.Lock in async function while object can be accessed from multiple thread
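To illustrate the single-threaded case where asyncio.Lock is the right tool, here is a minimal sketch: 100 tasks increment a shared counter, and the lock keeps the read-modify-write sequence atomic across the context switch at the await:

```python
import asyncio

counter = 0

async def bump(lock: asyncio.Lock) -> None:
    global counter
    async with lock:            # suspends only this task, never the thread
        current = counter
        await asyncio.sleep(0)  # a deliberate switch point inside the critical section
        counter = current + 1

async def main() -> None:
    lock = asyncio.Lock()
    # 100 tasks race on the counter; without the lock, the switch point
    # between read and write would lose updates.
    await asyncio.gather(*(bump(lock) for _ in range(100)))

asyncio.run(main())
print(counter)  # 100
```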
Newbie on Python and multiple threading.
I read some articles on what blocking and non-blocking I/O are, and the main difference seems to be that blocking I/O only allows tasks to be executed sequentially, while non-blocking I/O allows multiple tasks to be executed concurrently.
If that's the case, how can blocking I/O operations (some Python standard built-in functions) work with multithreading?
Blocking I/O blocks the thread it's running in, not the whole process. (at least in this context, and on a standard PC)
Multithreading is not affected by definition: only the current thread gets blocked.
The global interpreter lock (in CPython) is a measure put in place so that only one active Python thread executes at a time. As frustrating as it can be, this is a good thing, because it is there to avoid interpreter corruption.
When a blocking operation is encountered, the current thread yields the lock and thus allows other threads to execute while it is blocked. However, when threads are CPU-bound (making purely Python calls), only one thread executes no matter how many are running.
It is interesting to note that in Python 3.2, code was added to mitigate the effects of the global interpreter lock. It is also interesting to note that other implementations of Python do not have a global interpreter lock.
Please note that this is a limitation of the Python code and that the underlying libraries may still be processing data.
Also, in many cases, when it comes to I/O, to avoid blocking, a useful way to handle IO is using polling and eventing:
Polling involves checking whether the operation would block and testing whether there is data. For example, if you are trying to get data from a socket, you would use select() or poll().
Eventing involves using callbacks so that your thread is triggered when a relevant I/O operation has just occurred.
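A minimal sketch of the polling approach, using select() on a local socket pair that stands in for real network I/O:

```python
import select
import socket

# A connected pair of sockets stands in for a real network connection.
a, b = socket.socketpair()
b.sendall(b"ping")

# Poll: ask whether `a` is readable right now (timeout 0) instead of
# blocking on recv().
readable, _, _ = select.select([a], [], [], 0)
if a in readable:
    data = a.recv(4)
    print(data)  # b'ping'

a.close()
b.close()
```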
I have an "I just want to understand it" question.
First, I'm using Python 2.6.5 on Ubuntu.
So... threads in Python (via the thread module) are only "threads": the GIL just runs code blocks from each "thread" for a certain period of time, and so on, and there aren't actually real threads here?
So the question is: if I have a blocking socket in one thread, and I'm sending data and blocking the thread for about 5 seconds, I expected the whole program to block, because it is one C call (sock.send) that is blocking the thread. But I was surprised to see that the main thread continues to run.
So the question is: how is the GIL able to continue and run the rest of the code after it reaches a blocking command like send? Doesn't it have to use a real thread here?
Thanks.
Python uses "real" threads, i.e. threads of the underlying platform. On Linux, it will use the pthread library (if you are interested, here is the implementation).
What is special about Python's threads is the GIL: A thread can only modify Python data structures if it holds this global lock. Thus, many Python operations cannot make use of multiple processor cores. A thread with a blocking socket won't hold the GIL though, so it does not affect other threads.
The GIL is often misunderstood, making people believe threads are almost useless in Python. The only thing the GIL prevents is concurrent execution of "pure" Python code on multiple processor cores. If you use threads to make a GUI responsive or to run other code during blocking I/O, the GIL won't affect you. If you use threads to run code in some C extension like NumPy/SciPy concurrently on multiple processor cores, the GIL won't affect you either.
The Python wiki page on the GIL mentions:
Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL.
The GIL (Global Interpreter Lock) is just a lock; it does not run anything by itself. Rather, the Python interpreter acquires and releases that lock as necessary. As a rule, the lock is held while running Python code, but released for calls to lower-level functions (such as sock.send). As Python threads are real OS-level threads, threads will not run Python code in parallel, but if one thread invokes a long-running C function, the GIL is released, and another thread can run Python code until the first one finishes.
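A small sketch of that behaviour, using time.sleep() as a stand-in for a blocking C call such as sock.send(): the GIL is released for the duration of the sleep, so the main thread keeps running Python code in the meantime:

```python
import threading
import time

events = []

def worker():
    # time.sleep() is a blocking C call: it releases the GIL while waiting,
    # so other threads can run Python code during the sleep.
    time.sleep(0.2)
    events.append("worker woke up")

t = threading.Thread(target=worker)
t.start()
events.append("main kept running")  # executes while the worker is blocked
t.join()
print(events)  # ['main kept running', 'worker woke up']
```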
I am running some code that has X workers, each worker pulling tasks from a queue every second. For this I use twisted's task.LoopingCall() function. Each worker fulfills its request (scrape some data) and then pushes the response back to another queue. All this is done in the reactor thread since I am not deferring this to any other thread.
I am wondering whether I should run all these jobs in separate threads or leave them as they are. And if so, is there a problem if I call task.LoopingCall every second from each thread ?
No, you shouldn't use threads. You can't call LoopingCall from a thread (unless you use reactor.callFromThread), but it wouldn't help you make your code faster.
If you notice a performance problem, you may want to profile your workload, figure out where the CPU-intensive work is, and then put that work into multiple processes, spawned with spawnProcess. You really can't skip the step where you figure out where the expensive work is, though: there's no magic pixie dust you can sprinkle on your Twisted application that will make it faster. If you choose a part of your code which isn't very intensive and doesn't require blocking resources like CPU or disk, then you will discover that the overhead of moving work to a different process may outweigh any benefit of having it there.
You shouldn't use threads for that. Doing it all in the reactor thread is ok. If your scraping uses twisted.web.client to do the network access, it shouldn't block, so you will go as fast as it gets.
First, beware that Twisted's reactor sometimes multithreads and assigns tasks without telling you anything. Of course, I haven't seen your program in particular.
Second, in Python (that is, in CPython) spawning threads to do non-blocking computation has little benefit. Read up on the GIL (Global Interpreter Lock).
I have been reading up on the threaded model of programming versus the asynchronous model from this really good article. http://krondo.com/blog/?p=1209
However, the article mentions the following points.
An async program will simply outperform a sync program by switching between tasks whenever there is I/O.
Threads are managed by the operating system.
I remember reading that threads are managed by the operating system by moving TCBs between the ready queue and the waiting queue (among other queues). In this case, threads don't waste time waiting either, do they?
In light of the above mentioned, what are the advantages of async programs over threaded programs?
It is very difficult to write code that is thread-safe. With asynchronous code, you know exactly where the code will shift from one task to the next, so race conditions are much harder to come by.
Threads consume a fair amount of memory, since each thread needs to have its own stack. With async code, all the code shares the same stack, and the stack is kept small by continuously unwinding it between tasks.
Threads are OS structures and therefore take more memory for the platform to support. There is no such problem with asynchronous tasks.
Update 2022:
Many languages now support stackless coroutines (async/await). This allows us to write a task almost synchronously while yielding to other tasks (awaiting) at set places (sleeping or waiting for networking or other threads).
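A minimal asyncio sketch of this: each task yields control only at its await, so the switch points are explicit and predictable:

```python
import asyncio

order = []

async def task(name: str, delay: float) -> None:
    order.append(f"{name} started")
    await asyncio.sleep(delay)   # the ONLY place this task yields control
    order.append(f"{name} finished")

async def main() -> None:
    # Both tasks run "almost synchronously"; switches happen only at await.
    await asyncio.gather(task("a", 0.02), task("b", 0.01))

asyncio.run(main())
print(order)  # ['a started', 'b started', 'b finished', 'a finished']
```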
There are two ways to create threads:
synchronous threading - the parent creates one (or more) child threads and then must wait for each child to terminate. Synchronous threading is often referred to as the fork-join model.
asynchronous threading - the parent and child run concurrently/independently of one another. Multithreaded servers typically follow this model.
resource - http://www.amazon.com/Operating-System-Concepts-Abraham-Silberschatz/dp/0470128720
Assume you have 2 tasks which do not involve any IO (on a multiprocessor machine).
In this case, threads outperform async, because an async program, like a single-threaded program, executes your tasks in order, while threads can execute both tasks simultaneously.
Assume you have 2 tasks, which involve IO (on multiprocessor machine).
In this case, both async and threads perform more or less the same (performance might vary based on the number of cores, scheduling, how CPU-intensive the tasks are, etc.). Async also takes fewer resources, has lower overhead, and is less complex to program than a multithreaded program.
How does it work?
Thread 1 executes Task 1; since it is waiting for IO, it is moved to the IO waiting queue. Similarly, Thread 2 executes Task 2; since it also involves IO, it is moved to the IO waiting queue. As soon as its IO request is resolved, each thread is moved to the ready queue so the scheduler can schedule it for execution.
Async executes Task 1 and, without waiting for its IO to complete, continues with Task 2; it then waits for the IO of both tasks to complete. It completes the tasks in the order of IO completion.
Async is best suited for tasks that involve web service calls, database queries, etc.; threads for CPU-intensive tasks.
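The async behaviour described above can be sketched with asyncio, where asyncio.sleep() stands in for the IO wait: Task 2 starts without waiting for Task 1's IO, so the waits overlap:

```python
import asyncio
import time

async def io_task(delay: float) -> float:
    await asyncio.sleep(delay)  # stands in for a web service or database call
    return delay

async def main() -> list:
    # The event loop starts the second task without waiting for the first
    # task's "IO" to complete, so the two waits run concurrently.
    return await asyncio.gather(io_task(0.1), io_task(0.1))

start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
print(results)          # [0.1, 0.1]
print(elapsed < 0.2)    # True: the waits overlap instead of adding up
```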
The video below explains the async vs. threaded model and when to use each:
https://www.youtube.com/watch?v=kdzL3r-yJZY
Hope this is helpful.
First of all, note that a lot of the details of how threads are implemented and scheduled are very OS-specific. In general, you shouldn't need to worry about threads waiting on each other, since the OS and the hardware will attempt to arrange for them to run efficiently, whether asynchronously on a single-processor system or in parallel on multiprocessors.
Once a thread has finished waiting for something, say I/O, it can be thought of as runnable. Threads that are runnable will be scheduled for execution at some point soon. Whether this is implemented as a simple queue or something more sophisticated is, again, OS- and hardware-specific. You can think of the set of blocked threads as a set rather than as a strictly ordered queue.
Note that on a single-processor system, asynchronous programs as defined here are equivalent to threaded programs.
see http://en.wikipedia.org/wiki/Thread_(computing)#I.2FO_and_scheduling
However, the use of blocking system calls in user threads (as opposed to kernel threads) or fibers can be problematic. If a user thread or a fiber performs a system call that blocks, the other user threads and fibers in the process are unable to run until the system call returns. A typical example of this problem is when performing I/O: most programs are written to perform I/O synchronously. When an I/O operation is initiated, a system call is made, and does not return until the I/O operation has been completed. In the intervening period, the entire process is "blocked" by the kernel and cannot run, which starves other user threads and fibers in the same process from executing.
According to this, your whole process might be blocked, and no thread will be scheduled, when one thread is blocked in IO. I think this is OS-specific and will not always hold.
To be fair, let's point out the benefits of threads under the CPython GIL compared to the async approach:
it's easier to first write typical code that has one flow of events (no parallel execution) and then run multiple copies of it in separate threads: each copy stays responsive, and the benefit of executing all I/O operations in parallel is achieved automatically;
many time-proven libraries are sync and are therefore easy to include in the threaded version, but not in the async one;
some sync libraries actually release the GIL at the C level, which allows parallel execution even for tasks beyond I/O-bound ones: e.g. NumPy;
it's harder to write async code in general: a heavy CPU-bound section will make concurrent tasks unresponsive, or one may forget to await a result and finish execution early.
So if there are no immediate plans to scale your services beyond ~100 concurrent connections it may be easier to start with a threaded version and then rewrite it... using some other more performant language like Go.
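A minimal sketch of the first point above: running multiple copies of a plain sync flow in a thread pool overlaps their I/O waits automatically (time.sleep() stands in for a blocking request that releases the GIL):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sync_flow(n: int) -> int:
    # A plain synchronous "one flow of events": the blocking wait stands in
    # for a web request and releases the GIL while it waits.
    time.sleep(0.1)
    return n * 2

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:
    # Four copies of the same sync flow run in separate threads.
    results = list(pool.map(sync_flow, range(4)))
elapsed = time.monotonic() - start
print(results)          # [0, 2, 4, 6]
print(elapsed < 0.4)    # True: the four 0.1 s waits overlap
```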
Async I/O means there is already a thread in the driver that does the job, so you are duplicating functionality and incurring some overhead. On the other hand, often it is not documented how exactly the driver thread behaves, and in complex scenarios, when you want to control timeout/cancellation/start/stop behaviour, synchronization with other threads, it makes sense to implement your own thread. It is also sometimes easier to reason in sync terms.