Python multithreading, How is it using multiple Cores?

Python multithreading, How is it using multiple Cores? - python

I am running a multithreaded application(Python2.7.3) in a Intel(R) Core(TM)2 Duo CPU E7500 # 2.93GHz. I thought it would be using only one core but using the "top" command I see that the python processes are constantly changing the core no. Enabling "SHOW THREADS" in the top command shows diffrent thread processes working on different cores.
Can anyone please explain this? It is bothering me as I know from theory that multithreading is executed on a single core.

First off, multithreading means the inverse, namely that multiple cores are being utilized (via threading) at the same time. CPython is indeed crippled when it comes to this, though whenever you call into C code (this includes parts of the standard library, but also extension modules like Numpy) the lock which prevents concurrent execution of Python code may be unlocked. You can still have multiple threads, they just won't be interpreting Python at the same time (instead, they'll take turns quite frequently). You also speak of "Python processes" -- are you confusing terminology, or is this "multithreaded" Python application in fact multiprocessing? Of course multiple Python processes can run concurrently.
However, from your wording I suspect another source of confusion. Even a single thread can run on multiple cores... just not at the same time. It is up to the operating system which thread is running on which CPU, and the OS scheduler does not necessarily re-assign a thread to the same CPU where it used to run before it was suspended (it's beneficial, as David Schwartz notes in the comments, but not vital). That is, it's perfectly normal for a single thread/process to jump from CPU to CPU.

Threads are designed to take advantage of multiple cores when they are available. If you only have one core, they'll run on one core too. :-)
There's nothing to be concerned about, what you observe is "working as intended".

Related

Python threads for networking - threads don't run in parallel

I am trying to use two threads in a server program, one for listening for any communications from clients using the Twisted library, and the other for doing some other computations on the server. In my attempt to implement the threads, it seems that the python threading library doesn't support parallel threads as answered in this question. I was wondering if there is any other python library that addresses this problem? Or any other way to circumvent this limitation?
Thank you in advance.

Python's GIL (global interpreter lock) prevents two threads to simultaneously execute Python code. Fortunately that doesn't include I/O, so if your threads do significant amounts of networking, database, or filesystem, then usual threads do work correctly. They won't let you take advantage of multiple cores for computation, but will let other threads advance while any number of them are waiting for something to happen.
If your needs are more for computation than for I/O, then threads (as implemented on Python) won't help. Better use the multiprocessing module (standard since Python 2.6), which uses a 'thread-like' API to spawn multiple processes, each one with an independent Python interpreter, and therefore it's own GIL.

Is anyone using zeromq to coordinate multiple Python interpreters in the same process?

I love Python's global interpreter lock because it makes the underlying C code simple.
But it means that each Python interpreter main loop is restricted to one thread at a time.
This is bad because recently the number of cores per processor chip has been doubling frequently.
One of the supposed advantages to zeromq is that it makes multi-threaded programming "easy" or easier.
Is it possible to launch multiple Python interpreters in the same process and have them communicate only using in-process zeromq with no other shared state? Has anyone tried it? Does it work well? Please comment and/or provide links.

I don't know of any way to create multiple instances of the Python interpreter within a single process, but I do have experience with splitting multiple instances across multiple processes and communicating with zmq.
I've been using multiprocessing to implement an island-model architecture for global optimization, with zmq for managing communication between the islands. Each island is its own process with its own Python interpreter, created and managed by the master archipelago process.
Using multiprocessing allows you to launch as many independent Python interpreters as you wish, but they all reside in their own processes with a separate memory space. I believe the OS scheduler takes care of assigning processes to cores and sharing CPU time. The separate memory space is the hardest part, because it means you have to explicitly communicate. To communicate between processes, the objects/data you wish to send must be serializable, because zmq sends byte-strings.
The nice thing about zmq is that it's a piece of cake to scale across systems distributed over a network, and it's pretty lightweight. You can create just about any communication pattern you wish, using REP/REQ, PUB/SUB, or whatever.
But no, it's not as easy as just spinning up a few threads from the threading module.
Edit: Also, here's a Stack Overflow question similar to yours. Inside are some more relevant links which indicate that it may be possible to run multiple Python interpreters within a single process, but it doesn't look simple. Multiple independent embedded Python Interpreters on multiple operating system threads invoked from C/C++ program

Why thread is slower than subprocess ? when should I use subprocess in place of thread and vise versa

In my application, I have tried python threading and subprocess module to open firefox, and I have noticed that subprocess is faster than threading. what could be the reason behind this?
when to use them in place of each other?

Python (or rather CPython, the c-based implementation that is commonly used) has a Global Intepreter Lock (a.k.a. the GIL).
Some kind of locking is necessary to synchronize memory access when several threads are accessing the same memory, which is what happens inside a process. Memory is not shared by between processes (unless you specifically allocate such memory), so no lock is needed there.
The globalness of the lock prevents several threads from running python code in the same process. When running mulitiple processes, the GIL does not interfere.
So, Python code does not scale on threads, you need processes for that.
Now, had your Python code mostly been calling C-APIs (NumPy/OpenGL/etc), there would be scaling since the GIL is usually released when native code is executing, so it's alright (and actually a good idea) to use Python to manage several threads that mostly execute native code.
(There are other Python interpreter implementations out there that do scale across threads (like Jython, IronPython, etc) but these aren't really mainstream.. yet, and usually a bit slower than CPython in single-thread scenarios.)

Will Python use all processors in thread mode?

While developing a Django app deployed on Apache mod_wsgi I found that in case of multithreading (Python threads; mod_wsgi processes=1 threads=8) Python won't use all available processors. With the multiprocessing approach (mod_wsgi processes=8 threads=1) all is fine and I can load my machine at full.
So the question: is this Python behavior normal? I doubt it because using 1 process with few threads is the default mod_wsgi approach.
The system is:
2xIntel Xeon 5XXX series (8 cores (16 with hyperthreading)) on FreeBSD 7.2 AMD64 and Python 2.6.4
Thanks all for answers.
We all found that this behavior is normal because of GIL. Here is a good explanation:
http://jessenoller.com/2009/02/01/python-threads-and-the-global-interpreter-lock/
or stackoverflow GIL discussion: What is a global interpreter lock (GIL)?.

Will Python use all processors in thread mode? No.
Python won't use all available processors; is this Python behavior normal? Yes, it's normal because of the GIL.
For a discussion see http://mail.python.org/pipermail/python-3000/2007-May/007414.html.
You may find that having a couple (or 4) of threads per core/process can still improve performance if there is some blocking, for example waiting for a response from the database would cause that process to block other connections otherwise.

Will python use all processors in thread mode? No.
It this normal? Yes, this is normal. Python makes no effort to locate all your cores.
"1 process with few threads is default mod_wsgi approach". But that's not optimal or even desirable. That's just a default. Don't read anything into it.
If you want to use all your computer's resources, make the OS handle it. Use processes.
The distinction between multi-processing and multi-threading is hard to measure for the most part. Using processes or threads barely matters. It's usually simpler to use processes, since there's trivial OS support for this.
Bottom Line
Use multiple processes, that allows the OS (and Apache) to make as much use as possible of the system.
Threads share a limited set of I/O resources for the Process they're part of, and web page serving is I/O bound. Processes have independent I/O resources and will more easily max out your processor.

There is still hope. The GIL is only an implementation artifact of the C Python implementation that you download from python.org. Jython and IronPython are two other implementations of Python, and they have no GIL, so you may have better threading results with one of them.

Yes. Python is not really multi-threaded. Instead, there is a global lock and each thread gets to execute a few operations in turn. This makes it much more simple to write MT applications in Python since there can't be any problems with stale caches, etc.
So one Python process can only ever occupy a single CPU. To fully utilize a multi-core system, you must run several Python processes.

I don't know if it is still the case, but there is a global lock in the Python interpreter, which prevents the use of all processor resources from a single interpreter, even when using multi threading. IIRC, the global lock has to do with I/O.
It seems you are watching the result of this lock, so, personally, I would use multiple processes with a single thread.

Does running separate python processes avoid the GIL?

I'm curious in how the Global Interpreter Lock in python actually works. If I have a c++ application launch four separate instances of a python script will they run in parallel on separate cores, or does the GIL go even deeper then just the single process that was launched and control all python process's regardless of the process that spawned it?

The GIL only affects threads within a single process. The multiprocessing module is in fact an alternative to threading that lets Python programs use multiple cores &c. Your scenario will easily allow use of multiple cores, too.

As Alex Martelli points out you can indeed avoid the GIL by running multiple processes, I just want to add and point out that the GIL is a limitation of the implementation (CPython) and not of Python in general, it's possible to implement Python without this limitation. Stackless Python comes to mind.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.