Will python multiprocessing suffice for multi-CPU machines?

Will python multiprocessing suffice for multi-CPU machines? - python

I am aware that you have to use something like MPI or python libraries like celery, jug, pp, etc, when you want to distribute processes over multiple machines, but is that necessary if you have a single machine with multiple CPU's?
So if I had a machine whose motherboard has two or more CPU's (each with multiple cores) would python's multiprocessing library suffice to fully utilize that machine?

multiprocessing documentation
"...Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine..."

Related

What really happens if LSF starts a single Python job on multiple nodes?

Using LSF, I have submitted a Python job using -n N where N>1. This means it will use multiple cores, which may or may not be on the same node. I have not written any explicit code for inter-process communication, but I do use libraries that can make use of multiple cores, such as numpy, scipy, and numexpr.
I'm confused because LSF tells me my script is distributed over multiple nodes, i.e. different physical machines, but my code does not take this into account. What is LSF actually doing in this case, and what is the practical implication if it uses multiple cores on different machines, rather than all on the same node?

How to do parallel computing in python 2.7 using MPI?

I have written a python code to carry out genetic algorithm optimization, but it is too slow. I would like to know how to run the same in parallel mode making use of multiple CPUs ?
For more clarity, another python code will be called by my code for, say 100 times one after the other, I wanted to divide this between 4 CPUs. So that 25 times the outside python code is solved by each CPU. Thereby increasing the speed.
Its highly appreciated if someone can help me with is ?
Thanks in advance!

There are several packages that provide parallel computing for python2. I am the author of a package called pathos, which provides parallel computing with several parallel backends and gives them a common API. pathos provides parallel pipes and maps for multi-process, multi-threaded, parallel over sockets, MPI-parallel, and also interactions with schedulers and over ssh. pathos relies on several packages, which you can pick from if you don't want all the different options.
pathos uses: pyina which in turn uses mpi4py. mpi4py provides bindings to MPI, but you can't run the code from python 'normally'… you need to run with whatever you use to run MPI. pyina enables you to run mpi4py from normal python, and to interact with schedulers. Plus, pyina uses dill, which can serialize most python objects, and thus you are much more able to send what you want across processes.
pathos provides a fork of multiprocessing that also plays well with dill and pyina. Using both can enable you to do hierarchical parallel competing -- like launching MPI parallel that then spawns multiprocess or multithreaded parallel.
pathos also uses ppft, which is a fork of pp (Parallel Python), which provides parallel computing across sockets -- so that means you can connect a parallel map across several machines.
There are alternatives to pathos, such as IPython-parallel. However, the ability to use MPI is very new, and I don't know how capable it is yet. It may or may not leverage IPython-cluster-helper, which has been in development for a little while. Note that IPython doesn't use pp, it uses zmq instead, and IPython also provides connectivity to EC2 if you like cloud stuff.
Here are some relevant links:
pathos, 'pyina, dill, ppft: https://github.com/uqfoundation
Ipython-cluster-helper: https://github.com/roryk/ipython-cluster-helper
IPython: http://ipython.org/ipython-doc/dev/parallel/parallel_mpi.html

Does multiprocessing module fix CPython multi-core usage?

In CPython, threading module doesn't utilise multiple cores because it uses global interpreter lock. However I recently found multiprocessing module from standard library which is said to sidestep the GIL. So I think with that module it is possible to utilise multiple core properly in CPython, but I wonder if I'm right.
I need to write an app which requires good utilisation of multiple cores, but it's not that performance critical so I could write it in Python, but I need to know whether this module will allow me to use multiple cores?

The multiprocessing library uses child processes; these each run in their own Python interpreter.
The OS can and will schedule these process across multiple processes and cores, yes. Because each child process is a separate Python interpreter process, the GIL does not interfere.

Is anyone using zeromq to coordinate multiple Python interpreters in the same process?

I love Python's global interpreter lock because it makes the underlying C code simple.
But it means that each Python interpreter main loop is restricted to one thread at a time.
This is bad because recently the number of cores per processor chip has been doubling frequently.
One of the supposed advantages to zeromq is that it makes multi-threaded programming "easy" or easier.
Is it possible to launch multiple Python interpreters in the same process and have them communicate only using in-process zeromq with no other shared state? Has anyone tried it? Does it work well? Please comment and/or provide links.

I don't know of any way to create multiple instances of the Python interpreter within a single process, but I do have experience with splitting multiple instances across multiple processes and communicating with zmq.
I've been using multiprocessing to implement an island-model architecture for global optimization, with zmq for managing communication between the islands. Each island is its own process with its own Python interpreter, created and managed by the master archipelago process.
Using multiprocessing allows you to launch as many independent Python interpreters as you wish, but they all reside in their own processes with a separate memory space. I believe the OS scheduler takes care of assigning processes to cores and sharing CPU time. The separate memory space is the hardest part, because it means you have to explicitly communicate. To communicate between processes, the objects/data you wish to send must be serializable, because zmq sends byte-strings.
The nice thing about zmq is that it's a piece of cake to scale across systems distributed over a network, and it's pretty lightweight. You can create just about any communication pattern you wish, using REP/REQ, PUB/SUB, or whatever.
But no, it's not as easy as just spinning up a few threads from the threading module.
Edit: Also, here's a Stack Overflow question similar to yours. Inside are some more relevant links which indicate that it may be possible to run multiple Python interpreters within a single process, but it doesn't look simple. Multiple independent embedded Python Interpreters on multiple operating system threads invoked from C/C++ program

How to make Ruby or Python web sites to use multiple cores?

Even though Python and Ruby have one kernel thread per interpreter thread, they have a global interpreter lock (GIL) that is used to protect potentially shared data structures, so this inhibits multi-processor execution. Even though the portions in those languajes that are written in C or C++ can be free-threaded, that's not possible with pure interpreted code unless you use multiple processes. What's the best way to achieve this? Using FastCGI? Creating a cluster or a farm of virtualized servers? Using their Java equivalents, JRuby and Jython?

I'm not totally sure which problem you want so solve, but if you deploy your python/django application via an apache prefork MPM using mod_python apache will start several worker processes for handling different requests.
If one request needs so much resources, that you want to use multiple cores have a look at pyprocessing. But I don't think that would be wise.

The 'standard' way to do this with rails is to run a "pack" of Mongrel instances (ie: 4 copies of the rails application) and then use apache or nginx or some other piece of software to sit in front of them and act as a load balancer.
This is probably how it's done with other ruby frameworks such as merb etc, but I haven't used those personally.
The OS will take care of running each mongrel on it's own CPU.
If you install mod_rails aka phusion passenger it will start and stop multiple copies of the rails process for you as well, so it will end up spreading the load across multiple CPUs/cores in a similar way.

Use an interface that runs each response in a separate interpreter, such as mod_wsgi for Python. This lets multi-threading be used without encountering the GIL.
EDIT: Apparently, mod_wsgi no longer supports multiple interpreters per process because idiots couldn't figure out how to properly implement extension modules. It still supports running requests in separate processes FastCGI-style, though, so that's apparently the current accepted solution.

In Python and Ruby it is only possible to use multiple cores, is to spawn new (heavyweight) processes.
The Java counterparts inherit the possibilities of the Java platform. You could imply use Java threads. That is for example a reason why sometimes (often) Java Application Server like Glassfish are used for Ruby on Rails applications.

For Python, the PyProcessing project allows you to program with processes much like you would use threads. It is included in the standard library of the recently released 2.6 version as multiprocessing. The module has many features for establishing and controlling access to shared data structures (queues, pipes, etc.) and support for common idioms (i.e. managers and worker pools).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.