python: multiprocessing Event - python

What is difference between multiprocessing.Event and multiprocessing.managers.SyncManager.Event. When do I use each? Why two different objects exist?
Same question for other similar objects existing in multiprocessing directly and also in Manager (Lock, etc.)

Unfortunately, the only given answer is not very correct and others wasn't given.
I looked it up my own, and found that multiprocessing.Event can be used to synch between processes, it's completely alright.
Event and other objects from multiprocessing.Manager exist to be able to synchronize things between processes that runs on different machines via sockets under the hood. They also can be used synchronize on single machine, but less efficient for this than just using synch objects from multiprocessing.synchronize (like Event and Lock and others)

multiprocessing.Manager is essentially a specialised process that will create instances of multiprocessing's synchronisation primitives on demand in its own address space, and let you access them through RPC proxies. The primitives behave the same, and they have the added flexibility of being accessible from remote hosts (using TCP in the remote case).

Related

python inter-process mutex for arbitrary processes

I need to mutex several processes running python on a linux host.
They processes are not spawned in a way I control (to be clear, they are my code), so i cannot use multithreading.Lock, at least as I understand it. The resource being synchronized is a series of reads/writes to two separate internal services, which are old, stateful, not designed for concurrent/transactional access, and out of scope to modify.
a couple approaches I'm familiar with but rejected so far:
In native code using shmget / pthread_mutex_lock (eg create a pthread mutex by well-known string name, in shared memory provided by the OS). Im hoping to not have to use/add a ctypes wrapper for this (or ideally have any low-level constructs visible at all here for this high-level app).
Using one of the lock file libraries such as fasteners would work - but requiring any particular file system access is awkward (the library/approach could use it robustly under the hood, but ideally my client code is abstracted from that).
Is there a preferred way to accomplish this in python (under linux; bonus points for cross-platform)?
Options for synchronizing non-child processes:
Use a remote manager. I'm not super familiar with this process, but the docs has at least a simple example.
create a simple server with your own protocol (rather than a manager): something like a socket server on the loopback address for bouncing simple messages around.
use the filesystem: https://pypi.org/project/filelock/
On posix compliant systems, there's a rather straightforward wrapper for IPC constructs posix-ipc. I also found a wrapper for windows semaphores, but it's not quite as simple (though also not difficult per-say). In both cases your program would use a well known string "name" to access / create the mutex. In both cases, care / error checking is needed to handle creation of the mutex properly (see things like O_CREX flag...)

Python multiprocessing guidelines seems to conflict: share memory or pickle?

I'm playing with Python multiprocessing module to have a (read-only) array shared among multiple processes. My goal is to use multiprocessing.Array to allocate the data and then have my code forked (forkserver) so that each worker can read straight from the array to do their job.
While reading the Programming guidelines I got a bit confused.
It is first said:
Avoid shared state
As far as possible one should try to avoid shifting large amounts of
data between processes.
It is probably best to stick to using queues or pipes for
communication between processes rather than using the lower level
synchronization primitives.
And then, a couple of lines below:
Better to inherit than pickle/unpickle
When using the spawn or forkserver start methods many types from
multiprocessing need to be picklable so that child processes can use
them. However, one should generally avoid sending shared objects to
other processes using pipes or queues. Instead you should arrange the
program so that a process which needs access to a shared resource
created elsewhere can inherit it from an ancestor process.
As far as I understand, queues and pipes pickle objects. If so, aren't those two guidelines conflicting?
Thanks.
The second guideline is the one relevant to your use case.
The first is reminding you that this isn't threading where you manipulate shared data structures with locks (or atomic operations). If you use Manager.dict() (which is actually SyncManager.dict) for everything, every read and write has to access the manager's process, and you also need the synchronization typical of threaded programs (which itself may come at a higher cost from being cross-process).
The second guideline suggests inheriting shared, read-only objects via fork; in the forkserver case, this means you have to create such objects before the call to set_start_method, since all workers are children of a process created at that time.
The reports on the usability of such sharing are mixed at best, but if you can use a small number of any of the C-like array types (like numpy or the standard array module), you should see good performance (because the majority of pages will never be written to deal with reference counts). Note that you do not need multiprocessing.Array here (though it may work fine), since you do not need writes in one concurrent process to be visible in another.

Is "Python only run one thread in parallel" true? [duplicate]

I've been trying to wrap my head around how threads work in Python, and it's hard to find good information on how they operate. I may just be missing a link or something, but it seems like the official documentation isn't very thorough on the subject, and I haven't been able to find a good write-up.
From what I can tell, only one thread can be running at once, and the active thread switches every 10 instructions or so?
Where is there a good explanation, or can you provide one? It would also be very nice to be aware of common problems that you run into while using threads with Python.
Yes, because of the Global Interpreter Lock (GIL) there can only run one thread at a time. Here are some links with some insights about this:
http://www.artima.com/weblogs/viewpost.jsp?thread=214235
http://smoothspan.wordpress.com/2007/09/14/guido-is-right-to-leave-the-gil-in-python-not-for-multicore-but-for-utility-computing/
From the last link an interesting quote:
Let me explain what all that means.
Threads run inside the same virtual
machine, and hence run on the same
physical machine. Processes can run
on the same physical machine or in
another physical machine. If you
architect your application around
threads, you’ve done nothing to access
multiple machines. So, you can scale
to as many cores are on the single
machine (which will be quite a few
over time), but to really reach web
scales, you’ll need to solve the
multiple machine problem anyway.
If you want to use multi core, pyprocessing defines an process based API to do real parallelization. The PEP also includes some interesting benchmarks.
Python's a fairly easy language to thread in, but there are caveats. The biggest thing you need to know about is the Global Interpreter Lock. This allows only one thread to access the interpreter. This means two things: 1) you rarely ever find yourself using a lock statement in python and 2) if you want to take advantage of multi-processor systems, you have to use separate processes. EDIT: I should also point out that you can put some of the code in C/C++ if you want to get around the GIL as well.
Thus, you need to re-consider why you want to use threads. If you want to parallelize your app to take advantage of dual-core architecture, you need to consider breaking your app up into multiple processes.
If you want to improve responsiveness, you should CONSIDER using threads. There are other alternatives though, namely microthreading. There are also some frameworks that you should look into:
stackless python
greenlets
gevent
monocle
Below is a basic threading sample. It will spawn 20 threads; each thread will output its thread number. Run it and observe the order in which they print.
import threading
class Foo (threading.Thread):
def __init__(self,x):
self.__x = x
threading.Thread.__init__(self)
def run (self):
print str(self.__x)
for x in xrange(20):
Foo(x).start()
As you have hinted at Python threads are implemented through time-slicing. This is how they get the "parallel" effect.
In my example my Foo class extends thread, I then implement the run method, which is where the code that you would like to run in a thread goes. To start the thread you call start() on the thread object, which will automatically invoke the run method...
Of course, this is just the very basics. You will eventually want to learn about semaphores, mutexes, and locks for thread synchronization and message passing.
Note: wherever I mention thread i mean specifically threads in python until explicitly stated.
Threads work a little differently in python if you are coming from C/C++ background. In python, Only one thread can be in running state at a given time.This means Threads in python cannot truly leverage the power of multiple processing cores since by design it's not possible for threads to run parallelly on multiple cores.
As the memory management in python is not thread-safe each thread require an exclusive access to data structures in python interpreter.This exclusive access is acquired by a mechanism called GIL ( global interpretr lock ).
Why does python use GIL?
In order to prevent multiple threads from accessing interpreter state simultaneously and corrupting the interpreter state.
The idea is whenever a thread is being executed (even if it's the main thread), a GIL is acquired and after some predefined interval of time the
GIL is released by the current thread and reacquired by some other thread( if any).
Why not simply remove GIL?
It is not that its impossible to remove GIL, its just that in prcoess of doing so we end up putting mutiple locks inside interpreter in order to serialize access, which makes even a single threaded application less performant.
so the cost of removing GIL is paid off by reduced performance of a single threaded application, which is never desired.
So when does thread switching occurs in python?
Thread switch occurs when GIL is released.So when is GIL Released?
There are two scenarios to take into consideration.
If a Thread is doing CPU Bound operations(Ex image processing).
In Older versions of python , Thread switching used to occur after a fixed no of python instructions.It was by default set to 100.It turned out that its not a very good policy to decide when switching should occur since the time spent executing a single instruction can
very wildly from millisecond to even a second.Therefore releasing GIL after every 100 instructions regardless of the time they take to execute is a poor policy.
In new versions instead of using instruction count as a metric to switch thread , a configurable time interval is used.
The default switch interval is 5 milliseconds.you can get the current switch interval using sys.getswitchinterval().
This can be altered using sys.setswitchinterval()
If a Thread is doing some IO Bound Operations(Ex filesystem access or
network IO)
GIL is release whenever the thread is waiting for some for IO operation to get completed.
Which thread to switch to next?
The interpreter doesn’t have its own scheduler.which thread becomes scheduled at the end of the interval is the operating system’s decision. .
Use threads in python if the individual workers are doing I/O bound operations. If you are trying to scale across multiple cores on a machine either find a good IPC framework for python or pick a different language.
One easy solution to the GIL is the multiprocessing module. It can be used as a drop in replacement to the threading module but uses multiple Interpreter processes instead of threads. Because of this there is a little more overhead than plain threading for simple things but it gives you the advantage of real parallelization if you need it.
It also easily scales to multiple physical machines.
If you need truly large scale parallelization than I would look further but if you just want to scale to all the cores of one computer or a few different ones without all the work that would go into implementing a more comprehensive framework, than this is for you.
Try to remember that the GIL is set to poll around every so often in order to do show the appearance of multiple tasks. This setting can be fine tuned, but I offer the suggestion that there should be work that the threads are doing or lots of context switches are going to cause problems.
I would go so far as to suggest multiple parents on processors and try to keep like jobs on the same core(s).

Python 2.7.5 - Run multiple threads simultaneously without slowing down

I'm creating a simple multiplayer game in python. I have split the processes up using the default thread module in python. However I noticed that the program still slows down with the speed of other threads. I tried using the multiprocessing module but not all of my objects can be pickled.
Is there an alternative to using the multiprocessing module for running simultaneous processes?
Here are your options:
MPI4PY:
http://code.google.com/p/mpi4py/
Celery:
http://www.celeryproject.org/
Pprocess:
http://www.boddie.org.uk/python/pprocess.html
Parallel Python(PP):
http://www.parallelpython.com/
You need to analyze why your program is slowing down when other threads do their work. Assuming that the threads are doing CPU-intensive work, the slowdown is consistent with threads being serialized by the global interpreter lock.
It is impossible to answer in detaile without knowing more about the nature of the work your threads are performing and of objects that must be shared in parallel. In general, you have two viable options:
Use processes, typically through the multiprocessing module. The typical reasons why objects are not picklable is because they contain unpicklable state such as closures, open file handles, or other system resources. But pickle allows objects to implement methods like __getstate__ or __reduce__ which identify object's state, using the state to rebuild the objects. If your objects are unpicklable because they are huge, then you might need to write a C extension that stores them in shared memory or a memory-mapped file, and pickle only a key that identifies them in the shared memory.
Use threads, finding ways to work around the GIL. If your computation is concentrated in several hot spots, you can move those hot spots to C, and release the GIL for the duration of the computation. For this to work, the computation must not refer to any Python objects, i.e. all data must be extracted from the objects while the GIL is held, and stored back into the Python world after the GIL has been reacquired.

How do threads work in Python, and what are common Python-threading specific pitfalls?

I've been trying to wrap my head around how threads work in Python, and it's hard to find good information on how they operate. I may just be missing a link or something, but it seems like the official documentation isn't very thorough on the subject, and I haven't been able to find a good write-up.
From what I can tell, only one thread can be running at once, and the active thread switches every 10 instructions or so?
Where is there a good explanation, or can you provide one? It would also be very nice to be aware of common problems that you run into while using threads with Python.
Yes, because of the Global Interpreter Lock (GIL) there can only run one thread at a time. Here are some links with some insights about this:
http://www.artima.com/weblogs/viewpost.jsp?thread=214235
http://smoothspan.wordpress.com/2007/09/14/guido-is-right-to-leave-the-gil-in-python-not-for-multicore-but-for-utility-computing/
From the last link an interesting quote:
Let me explain what all that means.
Threads run inside the same virtual
machine, and hence run on the same
physical machine. Processes can run
on the same physical machine or in
another physical machine. If you
architect your application around
threads, you’ve done nothing to access
multiple machines. So, you can scale
to as many cores are on the single
machine (which will be quite a few
over time), but to really reach web
scales, you’ll need to solve the
multiple machine problem anyway.
If you want to use multi core, pyprocessing defines an process based API to do real parallelization. The PEP also includes some interesting benchmarks.
Python's a fairly easy language to thread in, but there are caveats. The biggest thing you need to know about is the Global Interpreter Lock. This allows only one thread to access the interpreter. This means two things: 1) you rarely ever find yourself using a lock statement in python and 2) if you want to take advantage of multi-processor systems, you have to use separate processes. EDIT: I should also point out that you can put some of the code in C/C++ if you want to get around the GIL as well.
Thus, you need to re-consider why you want to use threads. If you want to parallelize your app to take advantage of dual-core architecture, you need to consider breaking your app up into multiple processes.
If you want to improve responsiveness, you should CONSIDER using threads. There are other alternatives though, namely microthreading. There are also some frameworks that you should look into:
stackless python
greenlets
gevent
monocle
Below is a basic threading sample. It will spawn 20 threads; each thread will output its thread number. Run it and observe the order in which they print.
import threading
class Foo (threading.Thread):
def __init__(self,x):
self.__x = x
threading.Thread.__init__(self)
def run (self):
print str(self.__x)
for x in xrange(20):
Foo(x).start()
As you have hinted at Python threads are implemented through time-slicing. This is how they get the "parallel" effect.
In my example my Foo class extends thread, I then implement the run method, which is where the code that you would like to run in a thread goes. To start the thread you call start() on the thread object, which will automatically invoke the run method...
Of course, this is just the very basics. You will eventually want to learn about semaphores, mutexes, and locks for thread synchronization and message passing.
Note: wherever I mention thread i mean specifically threads in python until explicitly stated.
Threads work a little differently in python if you are coming from C/C++ background. In python, Only one thread can be in running state at a given time.This means Threads in python cannot truly leverage the power of multiple processing cores since by design it's not possible for threads to run parallelly on multiple cores.
As the memory management in python is not thread-safe each thread require an exclusive access to data structures in python interpreter.This exclusive access is acquired by a mechanism called GIL ( global interpretr lock ).
Why does python use GIL?
In order to prevent multiple threads from accessing interpreter state simultaneously and corrupting the interpreter state.
The idea is whenever a thread is being executed (even if it's the main thread), a GIL is acquired and after some predefined interval of time the
GIL is released by the current thread and reacquired by some other thread( if any).
Why not simply remove GIL?
It is not that its impossible to remove GIL, its just that in prcoess of doing so we end up putting mutiple locks inside interpreter in order to serialize access, which makes even a single threaded application less performant.
so the cost of removing GIL is paid off by reduced performance of a single threaded application, which is never desired.
So when does thread switching occurs in python?
Thread switch occurs when GIL is released.So when is GIL Released?
There are two scenarios to take into consideration.
If a Thread is doing CPU Bound operations(Ex image processing).
In Older versions of python , Thread switching used to occur after a fixed no of python instructions.It was by default set to 100.It turned out that its not a very good policy to decide when switching should occur since the time spent executing a single instruction can
very wildly from millisecond to even a second.Therefore releasing GIL after every 100 instructions regardless of the time they take to execute is a poor policy.
In new versions instead of using instruction count as a metric to switch thread , a configurable time interval is used.
The default switch interval is 5 milliseconds.you can get the current switch interval using sys.getswitchinterval().
This can be altered using sys.setswitchinterval()
If a Thread is doing some IO Bound Operations(Ex filesystem access or
network IO)
GIL is release whenever the thread is waiting for some for IO operation to get completed.
Which thread to switch to next?
The interpreter doesn’t have its own scheduler.which thread becomes scheduled at the end of the interval is the operating system’s decision. .
Use threads in python if the individual workers are doing I/O bound operations. If you are trying to scale across multiple cores on a machine either find a good IPC framework for python or pick a different language.
One easy solution to the GIL is the multiprocessing module. It can be used as a drop in replacement to the threading module but uses multiple Interpreter processes instead of threads. Because of this there is a little more overhead than plain threading for simple things but it gives you the advantage of real parallelization if you need it.
It also easily scales to multiple physical machines.
If you need truly large scale parallelization than I would look further but if you just want to scale to all the cores of one computer or a few different ones without all the work that would go into implementing a more comprehensive framework, than this is for you.
Try to remember that the GIL is set to poll around every so often in order to do show the appearance of multiple tasks. This setting can be fine tuned, but I offer the suggestion that there should be work that the threads are doing or lots of context switches are going to cause problems.
I would go so far as to suggest multiple parents on processors and try to keep like jobs on the same core(s).

Categories