I have a set of object states that is larger than I think would be reasonable to thread or process on a 1:1 basis. Let's say it looks like this:
class SubState(object):
    def __init__(self):
        self.stat_1 = None
        self.stat_2 = None
        self.list_1 = []

class State(object):
    def __init__(self):
        self.my_sub_states = {'a': SubState(), 'b': SubState(), 'c': SubState()}
What I'd like to do is make each of the sub-states under the self.my_sub_states keys shared, and access each one by grabbing a single lock for the entire sub-state - i.e. self.locks = {'a': multiprocessing.Lock(), 'b': multiprocessing.Lock(), ...} - and then release it when I'm done. Is there a class I can inherit from to share an entire SubState object under a single Lock?
The actual worker processes would be pulling tasks from a queue (I can't pass the sub-states as args into the processes, because they don't know which sub-state they need until they get their next task).
Edit: also, I'd prefer not to use a manager - managers are atrociously slow (I haven't done the benchmarks, but I'm inclined to think an in-memory database would be faster than a manager if it came down to it).
As the multiprocessing docs state, you've really only got two options for actually sharing state between multiprocessing.Process instances (at least without going to third-party options - e.g. redis):
Use a Manager
Use multiprocessing.sharedctypes
A Manager will allow you to share pure Python objects, but as you pointed out, both read and write access to objects being shared this way is quite slow.
multiprocessing.sharedctypes will use actual shared memory, but you're limited to sharing ctypes objects. So you'd need to convert your SubState object to a ctypes.Structure. Also of note is that each multiprocessing.sharedctypes object has its own lock built in, so you can synchronize access to each object by taking that lock explicitly before operating on it.
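For illustration, here is a minimal sketch of what that conversion might look like, assuming stat_1 and stat_2 are numeric and list_1 can be bounded to a fixed size (ctypes arrays cannot grow); the field types are assumptions, not part of the original question. The synchronized wrapper returned by multiprocessing.sharedctypes.Value exposes the struct's fields and a get_lock() method, and the default lock is recursive, so holding it while touching fields is safe:

import ctypes
from multiprocessing import Process
from multiprocessing.sharedctypes import Value

class SubStateStruct(ctypes.Structure):
    # Assumed field types; list_1 becomes a fixed-size array.
    _fields_ = [('stat_1', ctypes.c_double),
                ('stat_2', ctypes.c_double),
                ('list_1', ctypes.c_int * 16)]

def worker(sub_state):
    # One built-in (recursive) lock guards the whole struct, so a
    # compound update stays consistent.
    with sub_state.get_lock():
        sub_state.stat_1 = 1.5
        sub_state.list_1[0] = 42

if __name__ == '__main__':
    shared = Value(SubStateStruct, lock=True)  # zero-initialized
    p = Process(target=worker, args=(shared,))
    p.start()
    p.join()
    print(shared.stat_1, shared.list_1[0])  # 1.5 42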
Related
I've assumed that multiprocessing.Pool uses pickle to pass initargs to child processes.
However I find the following strange:
value = multiprocessing.Value('i', 1)
multiprocessing.Pool(initializer=worker, initargs=(value,))  # works
But this does not work:
pickle.dumps(value)
throwing:
RuntimeError: Synchronized objects should only be shared between processes through inheritance
Why is that, and how multiprocessing initargs can bypass that, as it's using pickle as well?
As I understand it, multiprocessing.Value uses shared memory behind the scenes. What is the difference between inheritance and passing it via initargs? I'm speaking specifically about Windows, where the code does not fork, so a new instance of multiprocessing.Value is created.
And if you had instead passed an instance of multiprocessing.Lock(), the error message would have been RuntimeError: Lock objects should only be shared between processes through inheritance. These things can be passed as arguments when you are creating an instance of multiprocessing.Process, which is in fact what is happening when you say initializer=worker, initargs=(value,). The test being made is whether a process is currently being spawned, which is not the case when you already have an existing process pool and are just submitting work to it. But why this restriction?
Would it make sense for you to be able to pickle this shared memory to a file and then, a week later, try to unpickle it and use it? Of course not! Python cannot know that you would not be doing anything so foolish as that, so it places great restrictions on how shared memory and locks can be pickled/unpickled - which is only for passing to other processes.
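To make the distinction concrete, here is a minimal sketch of the sanctioned pattern: the Value is handed over at process-creation time via the pool initializer and stashed in a module global, after which tasks submitted later can use it freely (the init_worker and increment names are just for illustration):

import multiprocessing

def init_worker(shared):
    # Runs once in each child as it is created; the Value arrives
    # through the special inheritance channel, not ordinary pickling.
    global value
    value = shared

def increment(_):
    with value.get_lock():
        value.value += 1

if __name__ == '__main__':
    v = multiprocessing.Value('i', 0)
    with multiprocessing.Pool(4, initializer=init_worker, initargs=(v,)) as pool:
        pool.map(increment, range(100))
    print(v.value)  # 100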
I'm playing with Python multiprocessing module to have a (read-only) array shared among multiple processes. My goal is to use multiprocessing.Array to allocate the data and then have my code forked (forkserver) so that each worker can read straight from the array to do their job.
While reading the Programming guidelines I got a bit confused.
It is first said:
Avoid shared state
As far as possible one should try to avoid shifting large amounts of
data between processes.
It is probably best to stick to using queues or pipes for
communication between processes rather than using the lower level
synchronization primitives.
And then, a couple of lines below:
Better to inherit than pickle/unpickle
When using the spawn or forkserver start methods many types from
multiprocessing need to be picklable so that child processes can use
them. However, one should generally avoid sending shared objects to
other processes using pipes or queues. Instead you should arrange the
program so that a process which needs access to a shared resource
created elsewhere can inherit it from an ancestor process.
As far as I understand, queues and pipes pickle objects. If so, aren't those two guidelines conflicting?
Thanks.
The second guideline is the one relevant to your use case.
The first is reminding you that this isn't threading where you manipulate shared data structures with locks (or atomic operations). If you use Manager.dict() (which is actually SyncManager.dict) for everything, every read and write has to access the manager's process, and you also need the synchronization typical of threaded programs (which itself may come at a higher cost from being cross-process).
The second guideline suggests inheriting shared, read-only objects via fork; in the forkserver case, this means you have to create such objects before the call to set_start_method, since all workers are children of a process created at that time.
The reports on the usability of such sharing are mixed at best, but if you can use a small number of any of the C-like array types (like numpy or the standard array module), you should see good performance (because the majority of pages will never be written to in order to update reference counts). Note that you do not need multiprocessing.Array here (though it may work fine), since you do not need writes in one concurrent process to be visible in another.
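As a sketch of the inheritance approach (using the fork start method for brevity; under forkserver the array must already exist in the process the server is forked from), a module-level numpy array is visible to every worker without any pickling:

import multiprocessing

import numpy as np

# Created at module level, before any workers start, so forked
# children inherit the pages without copying or pickling.
big_array = np.arange(10_000_000, dtype=np.float64)

def chunk_sum(bounds):
    lo, hi = bounds
    return big_array[lo:hi].sum()  # reads the inherited memory directly

if __name__ == '__main__':
    multiprocessing.set_start_method('fork')  # assumes a Unix platform
    step = 2_500_000
    bounds = [(i, i + step) for i in range(0, 10_000_000, step)]
    with multiprocessing.Pool(4) as pool:
        print(sum(pool.map(chunk_sum, bounds)))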
I am trying to share a lock among processes. I understand that the way to share a lock is to pass it as an argument to the target function. However I found that even the approach below is working. I could not understand the way the processes are sharing this lock. Could anyone please explain?
import multiprocessing as mp
import time

class SampleClass:
    def __init__(self):
        self.lock = mp.Lock()
        self.jobs = []
        self.total_jobs = 10

    def test_run(self):
        for i in range(self.total_jobs):
            p = mp.Process(target=self.run_job, args=(i,))
            p.start()
            self.jobs.append(p)
        for p in self.jobs:
            p.join()

    def run_job(self, i):
        with self.lock:
            print('Sleeping in process {}'.format(i))
            time.sleep(5)

if __name__ == '__main__':
    t = SampleClass()
    t.test_run()
On Windows (which you said you're using), these kinds of things always reduce to details about how multiprocessing plays with pickle, because all Python data crossing process boundaries on Windows is implemented by pickling on the sending end (and unpickling on the receiving end).
My best advice is to avoid doing things that raise such questions to begin with ;-) For example, the code you showed blows up on Windows under Python 2, and also blows up under Python 3 if you use a multiprocessing.Pool method instead of multiprocessing.Process.
It's not just the lock: simply trying to pickle a bound method (like self.run_job) blows up in Python 2. Think about it. You're crossing a process boundary, and there isn't an object corresponding to self on the receiving end. To what object is self.run_job supposed to be bound on the receiving end?
In Python 3, pickling self.run_job also pickles a copy of the self object. So that's the answer: a SampleClass object corresponding to self is created by magic on the receiving end. Clear as mud. The entire state of t is pickled, including t.lock. That's why it "works".
See this for more implementation details:
Why can I pass an instance method to multiprocessing.Process, but not a multiprocessing.Pool?
In the long run, you'll suffer the fewest mysteries if you stick to things that were obviously intended to work: pass module-global callable objects (neither, e.g., instance methods nor local functions), and explicitly pass multiprocessing data objects (whether an instance of Lock, Queue, manager.list, etc etc).
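Applied to the code in the question, that advice looks something like this sketch: a module-global function as the target, with the lock passed explicitly as an argument, which works under both fork and spawn (the one-second sleep is just to keep the demo short):

import multiprocessing as mp
import time

def run_job(lock, i):
    # Module-global function: picklable by name on Windows.
    with lock:
        print('Sleeping in process {}'.format(i))
        time.sleep(1)

if __name__ == '__main__':
    lock = mp.Lock()  # passed explicitly to each child
    jobs = [mp.Process(target=run_job, args=(lock, i)) for i in range(10)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()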
On Unix operating systems, new processes are created via the fork primitive.
The fork primitive works by cloning the parent process's memory address space and assigning it to the child. The child will have a copy of the parent's memory, as well as of its file descriptors and shared objects.
This means that when you call fork, if the parent has a file open, the child will have it too. The same applies to shared objects such as pipes, sockets, etc.
In Unix + CPython, Locks are realized via the sem_open primitive, which is designed to be shared when forking a process.
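A bare-bones illustration of that inheritance (a sketch assuming a Unix system, since it calls os.fork directly): a lock created before the fork is backed by the same semaphore in both processes:

import multiprocessing
import os

lock = multiprocessing.Lock()  # created before the fork, so inherited

pid = os.fork()  # Unix only
if pid == 0:
    # Child: this is the same underlying semaphore the parent sees.
    with lock:
        print('child {} holds the inherited lock'.format(os.getpid()))
    os._exit(0)
os.waitpid(pid, 0)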
I usually recommend against mixing concurrency (multiprocessing in particular) and OOP, because it frequently leads to this kind of misunderstanding.
EDIT:
I just saw that you are using Windows. Tim Peters gave the right answer. For the sake of abstraction, Python tries to provide OS-independent behaviour in its API. When calling an instance method, it will pickle the object and send it over a pipe, thus providing behaviour similar to Unix.
I'd recommend you to read the programming guidelines for multiprocessing. Your issue is addressed in particular in the first point:
Avoid shared state
As far as possible one should try to avoid shifting large amounts of data between processes.
It is probably best to stick to using queues or pipes for communication between processes rather than using the lower level synchronization primitives.
I'm new here and I'm Italian (forgive me if my English is not so good).
I am a computer science student and I am working on a concurrent programming project in Python.
We should use monitors: a class with its own methods and data (such as condition variables). An instance (object) of this monitor class should be shared across all the processes we have (created by os.fork or by the multiprocessing module), but we don't know how to do it. It is simpler with threads because they already share memory, but we MUST use processes. Is there any way to make this object (the monitor) shareable across all processes?
Hoping I'm not talking nonsense... thanks a lot to everyone for your attention.
Waiting for answers.
Lorenzo
As far as "sharing" the instance, I believe the instructor wants you to make your monitor's interface to its local process such that it's as if it were shared (a la CORBA).
Look into the absolutely fantastic documentation on multiprocessing's Queue:
from multiprocessing import Process, Queue

def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())  # prints "[42, None, 'hello']"
    p.join()
You should be able to imagine how your monitor's attributes might be propagated among the peer processes when changes are made.
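For instance, here is one hedged sketch of that idea (the Monitor class and its value attribute are hypothetical): peers send state changes through the queue, and the owning process applies them to its local copy:

from multiprocessing import Process, Queue

class Monitor(object):
    def __init__(self):
        self.value = 0

def worker(updates):
    # Peers send state changes instead of mutating shared memory.
    updates.put(('value', 42))

if __name__ == '__main__':
    monitor = Monitor()
    updates = Queue()
    p = Process(target=worker, args=(updates,))
    p.start()
    attr, val = updates.get()  # the owner applies the change locally
    setattr(monitor, attr, val)
    p.join()
    print(monitor.value)  # 42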
Shared memory between processes is usually a poor idea; when calling os.fork(), the operating system marks all of the memory used by the parent and inherited by the child as copy-on-write; if either process attempts to modify a page, it is instead copied to a new location that is not shared between the two processes.
This means that your usual threading primitives (locks, condition variables, et cetera) are not usable for communicating across process boundaries.
There are two ways to resolve this. The preferred way is to use a pipe and serialize communication on both ends. Brian Cain's answer, using multiprocessing.Queue, works in exactly this way. Because pipes do not have any shared state and use a robust IPC mechanism provided by the kernel, it's unlikely that you will end up with processes in an inconsistent state.
The other option is to allocate some memory in a special way so that the OS will allow you to use shared memory. The most natural way to do that is with mmap. CPython won't use shared memory for native Python objects, though, so you would still need to sort out how you will use this shared region. A reasonable library for this is numpy, which can map the untyped binary memory region into useful arrays of some sort. Shared memory is much harder to work with in terms of managing concurrency, though, since there's no simple way for one process to know how another process is accessing the shared region. The only time this approach makes much sense is when a small number of processes need to share a large volume of data, since shared memory can avoid copying the data through pipes.
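A minimal sketch of that mmap approach, assuming a Unix system: an anonymous mapping is shared across fork, and numpy gives the raw bytes structure:

import mmap
import os

import numpy as np

N = 1000
buf = mmap.mmap(-1, N * 8)  # anonymous map; MAP_SHARED by default on Unix
arr = np.frombuffer(buf, dtype=np.float64)  # view the bytes as N floats

pid = os.fork()
if pid == 0:
    arr[:] = 42.0  # child writes into the shared region
    os._exit(0)
os.waitpid(pid, 0)
print(arr[0])  # parent sees 42.0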
I have implemented a python webserver. Each http request spawns a new thread.
I have a requirement of caching objects in memory, and since it's a webserver, I want the cache to be thread-safe. Is there a standard implementation of a thread-safe object cache in Python? I found the following:
http://freshmeat.net/projects/lrucache/
This does not look to be thread-safe. Can anybody point me to a good implementation of a thread-safe cache in Python?
Thanks!
Well a lot of operations in Python are thread-safe by default, so a standard dictionary should be ok (at least in certain respects). This is mostly due to the GIL, which will help avoid some of the more serious threading issues.
There's a list here: http://coreygoldberg.blogspot.com/2008/09/python-thread-synchronization-and.html that might be useful.
Though the atomic nature of those operations just means that you won't have an entirely inconsistent state if two threads access a dictionary at the same time: you won't get a corrupted value. However, you would (as with most multi-threaded programming) not be able to rely on the specific order of those atomic operations.
So, to cut a long story short...
If you have fairly simple requirements and aren't too bothered about the ordering of what gets written into the cache, then you can use a dictionary and know that you'll always get a consistent/non-corrupted value (it just might be out of date).
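For that simple case, a single atomic dict operation such as setdefault can stand in for a lock; this is a sketch, with compute() as a hypothetical stand-in for whatever produces the cached value:

cache = {}

def compute(key):
    # Hypothetical expensive computation.
    return key.upper()

def get_or_compute(key):
    # In CPython, dict.setdefault on built-in key types runs as a single
    # atomic operation under the GIL, so two threads cannot interleave
    # the lookup and the store. compute(key) may still run more than
    # once under contention, but every caller gets the same stored value.
    return cache.setdefault(key, compute(key))

print(get_or_compute('hello'))  # HELLO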
If you want to ensure that things are a bit more consistent with regard to reading and writing, then you might want to look at Django's local-memory cache:
http://code.djangoproject.com/browser/django/trunk/django/core/cache/backends/locmem.py
which uses a read/write lock.
Thread-per-request is often a bad idea. If your server experiences huge spikes in load, it will bring the box to its knees. Consider using a thread pool that can grow to a limited size during peak usage and shrink to a smaller size when load is light.
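A bounded pool is one line with the standard library; this sketch caps concurrency at 32 handlers, with handle_request standing in for real work (note that ThreadPoolExecutor caps the pool size but does not shrink it when idle):

from concurrent.futures import ThreadPoolExecutor

def handle_request(request_id):
    # Placeholder for real request handling.
    return 'handled {}'.format(request_id)

# At most 32 requests run concurrently; the rest wait in the queue.
executor = ThreadPoolExecutor(max_workers=32)
futures = [executor.submit(handle_request, i) for i in range(100)]
print([f.result() for f in futures[:3]])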
Point 1. The GIL does not help you here. An example of a (non-thread-safe) cache for something called "stubs" would be:
stubs = {}

def maybe_new_stub(host):
    """ returns stub from cache and populates the stubs cache if new is created """
    if host not in stubs:
        stub = create_new_stub_for_host(host)
        stubs[host] = stub
    return stubs[host]
What can happen is that Thread 1 calls maybe_new_stub('localhost'), and it discovers we do not have that key in the cache yet. Now we switch to Thread 2, which calls the same maybe_new_stub('localhost'), and it also learns the key is not present. Consequently, both threads call create_new_stub_for_host and put it into the cache.
The map itself is protected by the GIL, so we cannot break it by concurrent access. The logic of the cache, however, is not protected, and so we may end up creating two or more stubs, and dropping all except one on the floor.
Point 2. Depending on the nature of the program, you may not want a global cache. Such a shared cache forces synchronization between all your threads; for performance reasons, it is good to make the threads as independent as possible. I believe I do need one; you may actually not.
Point 3. You may use a simple lock. I took inspiration from https://codereview.stackexchange.com/questions/160277/implementing-a-thread-safe-lrucache and came up with the following, which I believe is safe to use for my purposes:
import threading

import grpc  # the cached stubs here are the author's gRPC channel stubs
import cli_pb2_grpc

stubs = {}
lock = threading.Lock()

def maybe_new_stub(host):
    """ returns stub from cache and populates the stubs cache if new is created """
    with lock:
        if host not in stubs:
            channel = grpc.insecure_channel('%s:6666' % host)
            stub = cli_pb2_grpc.BrkStub(channel)
            stubs[host] = stub
        return stubs[host]
Point 4. It would be best to use an existing library. I haven't found any I am prepared to vouch for yet.
You probably want to use memcached instead. It's very fast, very stable, very popular, has good python libraries, and will allow you to grow to a distributed cache should you need to:
http://www.danga.com/memcached/
I'm not sure any of these answers are doing what you want.
I have a similar problem and I'm using a drop-in replacement for lrucache called cachetools which allows you to pass in a lock to make it a bit safer.
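For reference, a minimal sketch of that usage (API as of recent cachetools versions): the decorator takes the cache plus an optional lock, and the lock is held around every cache lookup and store (the decorated function itself runs without it, so it can still be computed twice under contention):

import threading

from cachetools import LRUCache, cached

cache = LRUCache(maxsize=256)
lock = threading.Lock()

@cached(cache, lock=lock)  # lock guards each cache read/write
def expensive(x):
    return x * x

print(expensive(4), expensive(4))  # the second call is served from the cache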
For a thread safe object you want threading.local:
from threading import local
safe = local()
safe.cache = {}
You can then put and retrieve objects in safe.cache with thread safety - though note that threading.local gives each thread its own independent copy, so nothing is actually shared between threads; it avoids races by avoiding sharing.