I'm writing a script that processes several instances of a class, each of which contains a number of attributes and methods. The objects are all placed in a single list (myobjects = [myClass(IDnumber=1), myClass(IDnumber=2), myClass(IDnumber=3)]) and then modified by fairly simple for loops that call specific methods on the objects, of the form
for x in myobjects:
    x.myfunction()
This script utilizes logging to forward all output to a logfile that I can check later. I'm attempting to parallelize this script, because it's fairly straightforward to do so (example below), and I need to utilize a queue in order to organize all the logging output from each Process. This aspect works flawlessly: I can define a new logfile for each process, and then pass the object-specific logfile back to my main script, which organizes the main logfile by appending each minor logfile in turn.
from multiprocessing import Process, Queue

queue = Queue()
threads = []
mainlog = 'mylogs.log'  # this is set up in my __init__.py but included here as demonstration

for x in myobjects:
    logfile = str(x.IDnumber) + '.log'
    thread = Process(target=x.myfunction, args=(logfile, queue))
    threads.append(thread)
    thread.start()

for thread in threads:
    if thread.is_alive():
        thread.join()

while not queue.empty():
    minilog = queue.get()
    with open(minilog, 'r') as minilog_open, open(mainlog, 'a+') as mainlog_open:
        mainlog_open.write(minilog_open.read())
My problem, now, is that I also need these objects to update a specific attribute, x.success, as True or False. Normally, in serial, x.success is updated at the end of x.myfunction() and is sent down the script where it needs to go, and everything works great. However, in this parallel implementation, x.myfunction() populates x.success in the Process, but that information never makes it back to the main script: if I add print(success) inside myfunction(), I see True or False, but if I add for x in myobjects: print(x.success) after the queue.get() block, I just see None. I realize that I can just use queue.put(success) in myfunction() the same way I use queue.put(logfile), but what happens when two or more processes finish simultaneously? There's no guarantee (that I know of) that my queue will be organized like
logfile (for myobjects[0])
success = True (for myobjects[0])
logfile (for myobjects[1])
success = False (for myobjects[1]) (etc etc)
How can I organize object-specific outputs from a queue, if this queue contains both logfiles and variables? I need to know the content of x.success for each x.myfunction(), so that information has to come back to the main process somehow.
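To make that concern concrete, the naive version I have in mind is just a second put() inside myfunction() and a matching get() in the main loop, roughly like this (untested, reusing the names from the snippet above), and it's exactly this pairing that I don't trust once two processes finish at the same time:
# at the end of myfunction(), alongside the existing put of the logfile:
queue.put(logfile)
queue.put(success)

# back in the main script, hoping the items arrive in matched pairs:
while not queue.empty():
    minilog = queue.get()
    success = queue.get()   # may belong to a *different* object's logfile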
The OP has requested an example to demonstrate the concepts mentioned in my comment. An explanation follows the code:
import concurrent.futures

class MyObject:
    def __init__(self):
        self._ID = str(id(self))
        self._status = None

    @property
    def ID(self):
        return self._ID

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, status):
        self._status = status

    def MyFunction(self):
        # do the real work here
        self.status = True

def MyThreadFunc(args):
    myObject = args[0]
    myObject.MyFunction()
    # note that the wrapper function returns a tuple
    return myObject.status, myObject.ID

if __name__ == '__main__':
    N = 10  # number of instances of MyObject
    myObjects = [MyObject() for _ in range(N)]
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = {executor.submit(MyThreadFunc, [o]): o for o in myObjects}
        for future in concurrent.futures.as_completed(futures):
            _status, _id = future.result()
            print(f'Status is {_status} for ID {_id}')
The class MyObject obviously doesn't do very much. The key features are that it has a string version of its id, a status and a function that does something but implicitly returns None.
We write a wrapper function that takes a reference to an instance of MyObject (the first element in the iterable args), executes MyFunction() on that particular instance, then returns that instance's status and ID as a tuple.
The main loop uses a pattern that I use a lot and I'm sure many others do too. Using a dictionary comprehension, we build the so-called "futures". Note that MyThreadFunc is written to take a single iterable argument, so we wrap each object in a list before handing it to submit(), even though it only needs one value.
We then wait for the threads to complete and get their return values.
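Since the question itself uses processes rather than threads, the same pattern should carry over to ProcessPoolExecutor. The sketch below is my adaptation (assuming MyObject and MyThreadFunc are defined at module level exactly as above); each worker gets a pickled copy of the object, so the parent's instance only learns its status because we copy the returned value back by hand:
import concurrent.futures

if __name__ == '__main__':
    myObjects = [MyObject() for _ in range(10)]
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = {executor.submit(MyThreadFunc, [o]): o for o in myObjects}
        for future in concurrent.futures.as_completed(futures):
            _status, _id = future.result()
            futures[future].status = _status   # write the result back onto the parent's instance
            print(f'Status is {_status} for ID {_id}')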
I'm trying to convert my class so other processes have access to it.
import multiprocessing as mp
from multiprocessing.managers import BaseManager
import time

class SharedType:
    def __init__(self, custom_arg):
        self.count = 0
        self.custom_arg = custom_arg
        self.builtin_arg = []

    def accum(self):
        self.count += 1

    def get_count(self):
        return self.count

    def get_custom_arg(self):
        return self.custom_arg

    def get_builtin_arg(self):
        return self.builtin_arg

class CustomArg:
    def __init__(self):
        self.v = []

    def append(self, i):
        self.v.append(i)

class MyManager(BaseManager):
    pass

MyManager.register('SharedType', SharedType)

def count(obj, uid):
    print(uid, 'start')
    # record uid
    obj.get_custom_arg().append(uid)
    obj.get_builtin_arg().append(uid)
    # do some work
    for _ in range(100_000):
        obj.accum()
    return uid

if __name__ == '__main__':
    print('run single process')
    st = time.time()
    c = 0
    for _ in range(10):
        for __ in range(100_000):
            c += 1
    print(c, f'elapse: {time.time() - st}')
    print()

    print('run multi process')
    st = time.time()
    pool = mp.Pool()
    with MyManager() as manager:
        shared_obj = manager.SharedType(CustomArg())
        # run
        ps = [pool.apply_async(count, args=(shared_obj, i)) for i in range(10)]
        print('waiting...')
        pool.close()
        pool.join()
        print(shared_obj.get_count(), f'elapse: {time.time() - st}')
        print(shared_obj.get_custom_arg().v, shared_obj.get_builtin_arg())
The result is as follows:
run single process
1000000 elapse: 0.14413094520568848
run multi process
waiting...
0 start
1 start
2 start
4 start
5 start
3 start
7 start
8 start
6 start
9 start
1000000 elapse: 36.199955463409424
[] []
What I expected was to see shared_obj's attributes remain shared, as shared_obj.count does, rather than being copied into the processes' own memory. So is there a way to share all the attributes, including complex ones? Or is such an idea total nonsense?
And additionally, what is the overhead that makes multiprocessing so slow in the above case?
Despite what multiprocessing sometimes tries to pretend, you cannot have an object exist in two processes at the same time in Python, at all.
In your code, the manager is associated with a server process, and the SharedType instance lives in that server process. The master's shared_obj is a proxy object that communicates with the server process for method calls. When you pass shared_obj to pool.apply_async, the workers also get proxies, created by pickling the master's proxy and unpickling the pickle in the worker.
When you call a method on such a proxy, all arguments are pickled, the pickled representations are sent over inter-process communication to the server, and the server unpickles them to construct new objects. The server then calls the method on the SharedType instance, pickles the return value, and sends the pickled data back to the process that requested the method call, which unpickles the pickle to construct its own copy of the method return value.
(All this pickling and unpickling and inter-process communication is really slow, which is why multiprocessing slowed your code down so much. multiprocessing is really slow.)
count isn't actually any more shared than the other attributes here. The difference is that you updated count through the proxy's accum method, which communicates with the server process and updates the count in the server's SharedType instance, but you tried to update custom_arg and builtin_arg by calling methods on copies returned by a proxy's get_custom_arg and get_builtin_arg methods. Modifying a copy does not notify the server of changes or do anything to the SharedType instance living in the server process.
If you want to be able to operate on the return value of get_custom_arg and get_builtin_arg and have that affect the master objects living in the server process, the proxy's get_custom_arg and get_builtin_arg have to return proxies too. That means you'll have to register list and CustomArg, and add methods to CustomArg to read its data without accessing v directly. I think you might also have to create builtin_arg and custom_arg through the manager, or use _method_to_typeid_ to get the SharedType's proxy to return proxies from get_builtin_arg and get_custom_arg - I haven't managed to work out all the bugs in my own attempt, so I'm not sure of the full details.
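For what it's worth, here is a rough, untested sketch of the registration described above, using the method_to_typeid argument of register() and a get_v() accessor added to CustomArg so its contents can be read through a proxy. This is an illustration of the idea, not a verified fix:
from multiprocessing.managers import BaseManager

class CustomArg:
    def __init__(self):
        self.v = []
    def append(self, i):
        self.v.append(i)
    def get_v(self):
        # reads must go through a method once the object sits behind a proxy
        return self.v

class MyManager(BaseManager):
    pass

# Typeids used only for values returned by SharedType's getters: no callable,
# and create_method=False so the manager itself gets no constructor for them.
MyManager.register('CustomArgProxy', create_method=False)
MyManager.register('ListProxy', create_method=False)

# Tell SharedType's proxy that these getters should hand back proxies
# instead of pickled copies.
MyManager.register(
    'SharedType', SharedType,   # SharedType as defined in the question
    method_to_typeid={'get_custom_arg': 'CustomArgProxy',
                      'get_builtin_arg': 'ListProxy'},
)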
Unrelated issue: managers don't eliminate the need for synchronization. If you're going to mutate shared objects like this, you need to use a lock to prevent race conditions.
I have a method which calculates a final result using multiple other methods. It has a while loop inside which continuously checks for new data, and if new data is received, it runs the other methods and calculates the result. This main method is the only one which is called by the user, and it stays active until the program is closed. The basic structure is as follows:
class sample:
    def __init__(self):
        self.results = []

    def main_calculation(self):
        while True:
            # code to get data
            if newdata != olddata:
                # insert code to prepare data for analysis
                res1 = self.calc1(prepped_data)
                res2 = self.calc2(prepped_data)
                final = res1 + res2
                self.results.append(final)
I want to run calc1 and calc2 in parallel, so that I can get the final result faster. However, I am unsure of how to implement multiprocessing in this way, since I'm not using a __main__ guard. Is there any way to run these processes in parallel?
This is likely not the best organization for this code, but it is what is easiest for the actual calculations I am running, since it is necessary that this code be imported and run from a different file. However, I can restructure the code if this is not a salvageable structure.
According to the documentation, the reason you need to use a __main__ guard is that when your program creates a multiprocessing.Process object, it starts up a whole new copy of the Python interpreter which will import a new copy of your program's modules. If importing your module calls multiprocessing.Process() itself, that will create yet another copy of the Python interpreter which interprets yet another copy of your code, and so on until your system crashes (or actually, until Python hits a non-reentrant piece of the multiprocessing code).
In the main module of your program, which usually calls some code at the top level, checking __name__ == '__main__' is the way you can tell whether the program is being run for the first time or is being run as a subprocess. But in a different module, there might not be any code at the top level (other than definitions), and in that case there's no need to use a guard because the module can be safely imported without starting a new process.
In other words, this is dangerous:
import multiprocessing as mp

def f():
    ...

p = mp.Process(target=f)
p.start()
p.join()
but this is safe:
import multiprocessing as mp

def f():
    ...

def g():
    p = mp.Process(target=f)
    p.start()
    p.join()
and this is also safe:
import multiprocessing as mp

def f():
    ...

class H:
    def g(self):
        p = mp.Process(target=f)
        p.start()
        p.join()
So in your example, you should be able to directly create Process objects in your function.
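For instance, here is a minimal sketch of what that could look like. The helper names run_once and _worker, and the toy calc1/calc2 bodies, are invented for illustration; on platforms that use the spawn start method this also relies on the instance being picklable:
import multiprocessing as mp

class sample:
    def __init__(self):
        self.results = []

    def calc1(self, data):
        return sum(data)              # stand-in for the real calculation

    def calc2(self, data):
        return max(data)              # stand-in for the real calculation

    def _worker(self, func, data, q, tag):
        # runs in the child process and ships one tagged result back
        q.put((tag, func(data)))

    def run_once(self, prepped_data):
        # called from inside the while loop of main_calculation()
        q = mp.Queue()
        p1 = mp.Process(target=self._worker, args=(self.calc1, prepped_data, q, 1))
        p2 = mp.Process(target=self._worker, args=(self.calc2, prepped_data, q, 2))
        p1.start()
        p2.start()
        partial = dict(q.get() for _ in range(2))   # read results before joining
        p1.join()
        p2.join()
        self.results.append(partial[1] + partial[2])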
However, I'd suggest making it clear in the documentation for the class that that method creates a Process, because whoever uses it (maybe you) needs to know that it's not safe to call that method at the top level of a module. It would be like doing this, which also falls in the "dangerous" category:
import multiprocessing as mp

def f():
    ...

class H:
    def g(self):
        p = mp.Process(target=f)
        p.start()
        p.join()

H().g()  # this creates a Process at the top level
You could also consider an alternative approach where you make the caller do all the process creation. In this approach, either your sample class constructor or the main_calculation() method could accept, say, a Pool object, and it can use the processes from that pool to do its calculations. For example:
class sample:
    def main_calculation(self, pool):
        while True:
            if newdata != olddata:
                res1_async = pool.apply_async(self.calc1, [prepped_data])
                res2_async = pool.apply_async(self.calc2, [prepped_data])
                res1 = res1_async.get()
                res2 = res2_async.get()
                # and so on
This pattern may also allow your program to be more efficient in its use of resources, if there are many different calculations happening, because they can all use the same pool of processes.
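A caller built around this might then look something like the sketch below (the import path is made up for illustration). Creating the pool under the __main__ guard keeps all process creation in the caller; note that apply_async has to pickle the bound methods, so the sample instance itself needs to be picklable.
import multiprocessing as mp
# from calculations import sample   # hypothetical module containing the class above

if __name__ == '__main__':
    with mp.Pool(processes=2) as pool:
        s = sample()
        s.main_calculation(pool)   # the method borrows workers from this pool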
I work on a machine learning input pipeline. I wrote a data loader that reads in data from a large .hdf file and returns slices, which takes roughly 2 seconds per slice. Therefore I would like to use a queue that takes in objects from several data loaders and can return single objects from the queue via a next function (like a generator). Furthermore, the processes that fill the queue should run somehow in the background, refilling the queue whenever it is not full. I can't get it to work properly. It worked with a single data loader, but gave me the same slices 4 times.
import multiprocessing as mp

class Queue_Generator():
    def __init__(self, data_loader_list):
        self.pool = mp.Pool(4)
        self.data_loader_list = data_loader_list
        self.queue = mp.Queue(maxsize=16)
        self.pool.map(self.fill_queue, self.data_loader_list)

    def fill_queue(self, gen):
        self.queue.put(next(gen))

    def __next__(self):
        yield self.queue.get()
What I get from this:
NotImplementedError: pool objects cannot be passed between processes or pickled
Thanks in advance
Your specific error means that you cannot have a pool as part of your class when you are passing class methods to a pool. What I would suggest is the following:
import multiprocessing as mp
from queue import Empty

class QueueGenerator(object):
    def __init__(self, data_loader_list):
        self.data_loader_list = data_loader_list
        self.queue = mp.Queue(maxsize=16)

    def __iter__(self):
        processes = list()
        for _ in range(4):
            pr = mp.Process(target=fill_queue, args=(self.queue, self.data_loader_list))
            pr.start()
            processes.append(pr)
        return self

    def __next__(self):
        try:
            # The timeout should have a value, otherwise this loop will never stop. Make it
            # long enough for the processes to refill the queue, but not so long that your
            # program freezes for an extended period after all information is processed.
            return self.queue.get(timeout=1)
        except Empty:
            raise StopIteration

# have fill_queue as a separate function
def fill_queue(queue, gen):
    while True:
        try:
            value = next(gen)
            queue.put(value)
        except StopIteration:  # assumes the given data_loader_list is an iterator
            break
    print('stopping')

gen = iter(range(70))
qg = QueueGenerator(gen)

for val in qg:
    print(val)

# test if it works several times:
for val in qg:
    print(val)
The next issue for you to solve, I think, is to make data_loader_list something that provides new information in every separate process. But since you have not given any information about that, I can't help you with it. The above does, however, give you a way to have the processes fill your queue, which is then passed out as an iterator.
Not quite sure why you are yielding in __next__, that doesn't look quite right to me. __next__ should return a value, not a generator object.
Here is a simple way that you can return the results of parallel functions as a generator. It may or may not meet your specific requirements but can be tweaked to suit. It will keep on processing data_loader_list until it is exhausted. This may use a lot of memory compared to keeping, for example, 4 items in a Queue at all times.
import multiprocessing as mp

def read_lines(data_loader):
    from time import sleep
    sleep(2)
    return f'did something with {data_loader}'

def make_gen(data_loader_list):
    with mp.Pool(4) as pool:
        for result in pool.imap(read_lines, data_loader_list):
            yield result

if __name__ == '__main__':
    data_loader_list = [i for i in range(15)]
    result_generator = make_gen(data_loader_list)
    print(type(result_generator))
    for i in result_generator:
        print(i)
Using imap means that the results can be processed as they are produced. map and map_async would block in the for loop until all results were ready. See this question for more.
I'm attempting to create unittests for my application which uses multiple processes, but have been having strange issues when attempting to run all the tests together. Basically when running tests individually they pass without issue but when run sequentially, such as when running all tests in the file, some tests will fail.
What I'm seeing is that many python processes are being created but they aren't closing when the test is reported as passed. For example, if 2 tests are run that each generate 5 processes, then 10 python processes show up in the system monitor.
I've tried using terminate and join but neither work. Is there a way to force a test to correctly close all processes that it generated before running the next test?
I'm running Python 2.7 in Ubuntu 16.04.
Edit:
It's a fairly large code base so here a simplified example.
import unittest
from multiprocessing import Pipe, Process

class BaseDevice:
    # Various methods
    pass

class BaseInstr(BaseDevice, Process):
    def __init__(self, pipe):
        Process.__init__(self)
        self.pipe = pipe

    def run(self):
        # Do stuff and wait for terminate message on pipe
        pass

    # Various other higher level methods

class BaseCompountInstrument(BaseInstr):
    def __init__(self, pipe):
        # Create multiple instruments, usually done with a config file but simplified here
        BaseInstr.__init__(self, pipe)
        instrlist = list()
        for _ in range(5):
            masterpipe, slavepipe = Pipe()
            instrlist.append([BaseInstr(slavepipe), masterpipe])

    def run(self):
        # Listen for messages from the pipe, send messages to sub-instruments
        pass

    def shutdown(self):
        # When shutdown message received, send to all sub-instruments
        pass

class test(unittest.TestCase):
    def setUp(self):
        # Load up a configuration file from the sample configs so that they're updated
        self.parentConn, self.childConn = Pipe()
        self.instr = BaseCompountInstrument(self.childConn)
        self.instr.start()

    def tearDown(self):
        self.parentConn.send("shutdown")  # Propagates to all sub-instruments

    def test1(self):
        pass

    def test2(self):
        pass
After struggling with this for a while (2 days, actually), I found a solution which is not technically wrong, but removes all the parallel code you have (only in the tests, only in the tests...).
I use the mock package to mock functions (which, I realize now, has been part of the unittest module since Python 3.3): you can pretend the execution of a certain function worked, fix a certain return value, or replace the function itself.
So I did the last option: change the function itself.
In my case I used a list of Process objects (because Pool didn't work in my case) and a Manager's list to share data between the processes.
My original code would be something like this:
import multiprocessing as mp

manager = mp.Manager()
list_data = manager.list()
list_return = manager.list()

def parallel_function(list_data, list_return):
    while len(list_data) > 0:
        # Do things and make sure to "pop" the data in list_data
        list_return.append(return_data)
    return None

# Create as many processes as images or cpus, the lesser number
processes = [mp.Process(target=parallel_function,
                        args=(list_data, list_return))
             for num_p in range(mp.cpu_count())]

for p in processes:
    p.start()

for p in processes:
    p.join(10)
So in my test I mock the Process.__init__ function from the multiprocessing module so that it runs my parallel_function instead of creating a new process.
In the test file, before any test you should define the same function you try to parallelize:
def fake_process(self, list_data, list_return):
    while len(list_data) > 0:
        # Do things and make sure to "pop" the data in list_data
        list_return.append(return_data)
    return None
And before the definition of any test method which is going to execute this part of the code, you have to add decorators that overwrite the Process.__init__ function.
@patch('multiprocessing.Process.__init__', new=fake_process)
@patch('multiprocessing.Process.start', new=lambda x: None)
@patch('multiprocessing.Process.join', new=lambda x, y: None)
def test_from_the_hell(self):
    # Do things
If you use Manager data structures, you don't need Locks just to access the data, because individual operations on those proxies are synchronized by the manager (though compound check-then-modify sequences can still race).
I hope this will help any other lost soul who is trying to test multiprocessing code.
I am trying to create a class that can run a separate process to go do some work that takes a long time, launch a bunch of these from a main module, and then wait for them all to finish. I want to launch the processes once and then keep feeding them things to do rather than creating and destroying processes. For example, maybe I have 10 servers running the dd command, then I want them all to scp a file, etc.
My ultimate goal is to create a class for each system that keeps track of the information for the system in which it is tied to like IP address, logs, runtime, etc. But that class must be able to launch a system command and then return execution back to the caller while that system command runs, to followup with the result of the system command later.
My attempt is failing because I cannot send an instance method of a class over the pipe to the subprocess via pickle. Those are not pickleable. I therefore tried to fix it various ways but I can't figure it out. How can my code be patched to do this? What good is multiprocessing if you can't send over anything useful?
Is there any good documentation of multiprocessing being used with class instances? The only way I can get the multiprocessing module to work is on simple functions. Every attempt to use it within a class instance has failed. Maybe I should pass events instead? I don't understand how to do that yet.
import multiprocessing
import sys
import re

class ProcessWorker(multiprocessing.Process):
    """
    This class runs as a separate process to execute worker's commands in parallel
    Once launched, it remains running, monitoring the task queue, until "None" is sent
    """

    def __init__(self, task_q, result_q):
        multiprocessing.Process.__init__(self)
        self.task_q = task_q
        self.result_q = result_q
        return

    def run(self):
        """
        Overloaded function provided by multiprocessing.Process. Called upon start() signal
        """
        proc_name = self.name
        print '%s: Launched' % (proc_name)
        while True:
            next_task_list = self.task_q.get()
            if next_task_list is None:
                # Poison pill means shutdown
                print '%s: Exiting' % (proc_name)
                self.task_q.task_done()
                break
            next_task = next_task_list[0]
            print '%s: %s' % (proc_name, next_task)
            args = next_task_list[1]
            kwargs = next_task_list[2]
            answer = next_task(*args, **kwargs)
            self.task_q.task_done()
            self.result_q.put(answer)
        return
# End of ProcessWorker class
class Worker(object):
    """
    Launches a child process to run commands from derived classes in separate processes,
    which sit and listen for something to do
    This base class is called by each derived worker
    """

    def __init__(self, config, index=None):
        self.config = config
        self.index = index

        # Launch the ProcessWorker for anything that has an index value
        if self.index is not None:
            self.task_q = multiprocessing.JoinableQueue()
            self.result_q = multiprocessing.Queue()

            self.process_worker = ProcessWorker(self.task_q, self.result_q)
            self.process_worker.start()
            print "Got here"
            # Process should be running and listening for functions to execute
        return

    def enqueue_process(target):  # No self, since it is a decorator
        """
        Used to place a command target from this class object into the task_q
        NOTE: Any function decorated with this must use fetch_results() to get the
        target task's result value
        """
        def wrapper(self, *args, **kwargs):
            self.task_q.put([target, args, kwargs])  # FAIL: target is a class instance method and can't be pickled!
        return wrapper

    def fetch_results(self):
        """
        After all processes have been spawned by multiple modules, this command
        is called on each one to retrieve the results of the call.
        This blocks until the execution of the item in the queue is complete
        """
        self.task_q.join()             # Wait for it to finish
        return self.result_q.get()     # Return the result

    @enqueue_process
    def run_long_command(self, command):
        print "I am running command '%s'" % command
        # In here, I will launch a subprocess to run a long-running system command
        # p = Popen(command), etc
        # p.wait(), etc
        return

    def close(self):
        self.task_q.put(None)
        self.task_q.join()
if __name__ == '__main__':
    config = ["some value", "something else"]
    index = 7
    workers = []
    for i in range(5):
        worker = Worker(config, index)
        worker.run_long_command("ls /")
        workers.append(worker)
    for worker in workers:
        worker.fetch_results()
    # Do more work... (this would actually be done in a distributor in another class)
    for worker in workers:
        worker.close()
Edit: I tried to move the ProcessWorker class and the creation of the multiprocessing queues outside of the Worker class and then tried to manually pickle the worker instance. Even that doesn't work and I get the error
RuntimeError: Queue objects should only be shared between processes through inheritance
But I am only passing references to those queues into the worker instance?? I am missing something fundamental. Here is the modified code from the main section:
import pickle

if __name__ == '__main__':
    config = ["some value", "something else"]
    index = 7
    workers = []
    for i in range(1):
        task_q = multiprocessing.JoinableQueue()
        result_q = multiprocessing.Queue()
        process_worker = ProcessWorker(task_q, result_q)
        worker = Worker(config, index, process_worker, task_q, result_q)
        something_to_look_at = pickle.dumps(worker)  # FAIL: Doesn't like queues??
        process_worker.start()
        worker.run_long_command("ls /")
So, the problem was that I was assuming that Python was doing some sort of magic that is somehow different from the way that C++/fork() works. I somehow thought that Python only copied the class, not the whole program into a separate process. I seriously wasted days trying to get this to work because all of the talk about pickle serialization made me think that it actually sent everything over the pipe. I knew that certain things could not be sent over the pipe, but I thought my problem was that I was not packaging things up properly.
This all could have been avoided if the Python docs gave me a 10,000 ft view of what happens when this module is used. Sure, it tells me what the methods of the multiprocessing module do and gives me some basic examples, but what I want to know is the "Theory of Operation" behind the scenes! Here is the kind of information I could have used. Please chime in if my answer is off. It will help me learn.
When you start a process using this module, the whole program is copied into another process. But since it is not the "__main__" process and my code was checking for that, it doesn't fire off yet another process infinitely. It just stops and sits out there waiting for something to do, like a zombie. Everything that was initialized in the parent at the time of calling multiprocessing.Process() is all set up and ready to go. Once you put something in the multiprocessing.Queue or shared memory, or pipe, etc. (however you are communicating), then the separate process receives it and gets to work. It can draw upon all imported modules and setup just as if it was the parent. However, once some internal state variables change in the parent or separate process, those changes are isolated. Once the process is spawned, it now becomes your job to keep them in sync if necessary, either through a queue, pipe, shared memory, etc.
I threw out the code and started over, but now I only add one extra function to the ProcessWorker, an "execute" method that runs a command line. Pretty simple. I don't have to worry about launching and then closing a bunch of processes this way, which has caused me all kinds of instability and performance issues in the past in C++. When I switched to launching processes at the beginning and then passing messages to those waiting processes, my performance improved and it was very stable.
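A minimal sketch of what that restructured worker might look like (only the "execute" name comes from the text above; the rest is assumed, and only plain strings cross the queue, so there is nothing unpicklable):
import multiprocessing
import subprocess

class CommandWorker(multiprocessing.Process):
    """Stays alive and runs whatever command string it receives from the task queue."""

    def __init__(self, task_q, result_q):
        multiprocessing.Process.__init__(self)
        self.task_q = task_q
        self.result_q = result_q

    def execute(self, command):
        # Run one system command and capture its output
        p = subprocess.Popen(command, shell=True,
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = p.communicate()
        return (p.returncode, out, err)

    def run(self):
        while True:
            command = self.task_q.get()
            if command is None:          # poison pill means shutdown
                self.task_q.task_done()
                break
            self.result_q.put(self.execute(command))
            self.task_q.task_done()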
BTW, I looked at this link to get help, which threw me off because the example made me think that methods were being transported across the queues: http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html
The second example of the first section used "next_task()" that appeared (to me) to be executing a task received via the queue.
Instead of attempting to send a method itself (which is impractical), try sending a name of a method to execute.
Provided that each worker runs the same code, it's a matter of a simple getattr(self, task_name).
I'd pass tuples (task_name, task_args), where task_args is a dict to be fed directly to the task method:
next_task_name, next_task_args = self.task_q.get()
if next_task_name:
    task = getattr(self, next_task_name)
    answer = task(**next_task_args)
    ...
else:
    # poison pill, shut down
    break
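On the sending side, the decorator from the question could then put the method's name on the queue instead of the bound method itself. A rough sketch (untested, reusing the names from the question; callers would then pass arguments by keyword, e.g. worker.run_long_command(command="ls /")):
def enqueue_process(target):  # still no self, it is a decorator
    def wrapper(self, *args, **kwargs):
        # The method *name* and a dict of kwargs are plain picklable objects,
        # unlike the bound method that failed before.
        self.task_q.put((target.__name__, kwargs))
    return wrapper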
REF: https://stackoverflow.com/a/14179779
The answer on Jan 6 at 6:03 by David Lynch is not factually correct when it says that he was misled by http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html.
The code and examples provided are correct and work as advertised. next_task() is executing a task received via the queue -- try and understand what the Task.__call__() method is doing.
In my case, what tripped me up were syntax errors in my implementation of run(). It seems that the sub-process will not report this and just fails silently -- leaving things stuck in weird loops! Make sure you have some kind of syntax checker running, e.g. Flymake/Pyflakes in Emacs.
Debugging via multiprocessing.log_to_stderr() helped me narrow down the problem.
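For reference, a minimal way to turn that logging on (the DEBUG level is my choice here, not something specified above):
import logging
import multiprocessing as mp

# Send multiprocessing's internal log messages to stderr; at DEBUG level this
# shows process start/exit and queue activity, which makes silently dying
# run() methods much easier to spot.
logger = mp.log_to_stderr()
logger.setLevel(logging.DEBUG)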