Sharing a Pool object across processes in python multiprocessing - python

I am trying to share a pool object between multiple processes using the following code
from multiprocessing import Process, Pool
import time
pool = Pool(5)
def print_hello():
time.sleep(1)
return "hello"
def pipeline():
print("In pipeline")
msg = pool.apply_async(print_hello()).get(timeout=1.5)
print("In pipeline")
print(msg)
def post():
p = Process(target = pipeline)
p.start()
return
if __name__ == '__main__':
post()
print("Returned from post")
However the code exists with the timout since get() doesnot return. I believe this has to do with pool being a globally accessible variable because it works just fine when I move pool to being local to pipeline function. Can anyone give me suggestions if there exists a workaround for this problem ?
Edit: finally got working with thread instead of process for running pipeline function.

Related

Python socketio with multiprocessing

So I have been struggling with this one error of pickle which is driving me crazy. I have the following master Engine class with the following code :
import eventlet
import socketio
import multiprocessing
from multiprocessing import Queue
from multi import SIOSerever
class masterEngine:
if __name__ == '__main__':
serverObj = SIOSerever()
try:
receiveData = multiprocessing.Process(target=serverObj.run)
receiveData.start()
receiveProcess = multiprocessing.Process(target=serverObj.fetchFromQueue)
receiveProcess.start()
receiveData.join()
receiveProcess.join()
except Exception as error:
print(error)
and I have another file called multi which runs like the following :
import multiprocessing
from multiprocessing import Queue
import eventlet
import socketio
class SIOSerever:
def __init__(self):
self.cycletimeQueue = Queue()
self.sio = socketio.Server(cors_allowed_origins='*',logger=False)
self.app = socketio.WSGIApp(self.sio, static_files={'/': 'index.html',})
self.ws_server = eventlet.listen(('0.0.0.0', 5000))
#self.sio.on('production')
def p_message(sid, message):
self.cycletimeQueue.put(message)
print("I logged : "+str(message))
def run(self):
eventlet.wsgi.server(self.ws_server, self.app)
def fetchFromQueue(self):
while True:
cycle = self.cycletimeQueue.get()
print(cycle)
As you can see I can trying to create two processes of def run and fetchFromQueue which i want to run independently.
My run function starts the python-socket server to which im sending some data from a html web page ( This runs perfectly without multiprocessing). I am then trying to push the data received to a Queue so that my other function can retrieve it and play with the data received.
I have a set of time taking operations that I need to carry out with the data received from the socket which is why im pushing it all into a Queue.
On running the master Engine class I receive the following :
Can't pickle <class 'threading.Thread'>: it's not the same object as threading.Thread
I ended!
[Finished in 0.5s]
Can you please help with what I am doing wrong?
From multiprocessing programming guidelines:
Explicitly pass resources to child processes
On Unix using the fork start method, a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.
Apart from making the code (potentially) compatible with Windows and the other start methods this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process. This might be important if some resource is freed when the object is garbage collected in the parent process.
Therefore, I slightly modified your example by removing everything unnecessary, but showing an approach where the shared queue is explicitly passed to all processes that use it:
import multiprocessing
MAX = 5
class SIOSerever:
def __init__(self, queue):
self.cycletimeQueue = queue
def run(self):
for i in range(MAX):
self.cycletimeQueue.put(i)
#staticmethod
def fetchFromQueue(cycletimeQueue):
while True:
cycle = cycletimeQueue.get()
print(cycle)
if cycle >= MAX - 1:
break
def start_server(queue):
server = SIOSerever(queue)
server.run()
if __name__ == '__main__':
try:
queue = multiprocessing.Queue()
receiveData = multiprocessing.Process(target=start_server, args=(queue,))
receiveData.start()
receiveProcess = multiprocessing.Process(target=SIOSerever.fetchFromQueue, args=(queue,))
receiveProcess.start()
receiveData.join()
receiveProcess.join()
except Exception as error:
print(error)
0
1
...

Using Multiprocessing with Modules

I am writing a module such that in one function I want to use the Pool function from the multiprocessing library in Python 3.6. I have done some research on the problem and the it seems that you cannot use if __name__=="__main__" as the code is not being run from main. I have also noticed that the python pool processes get initialized in my task manager but essentially are stuck.
So for example:
class myClass()
...
lots of different functions here
...
def multiprocessFunc()
do stuff in here
def funcThatCallsMultiprocessFunc()
array=[array of filenames to be called]
if __name__=="__main__":
p = Pool(processes=20)
p.map_async(multiprocessFunc,array)
I tried to remove the if __name__=="__main__" part but still no dice. any help would appreciated.
It seems to me that your have just missed out a self. from your code. I should think this will work:
class myClass():
...
# lots of different functions here
...
def multiprocessFunc(self, file):
# do stuff in here
def funcThatCallsMultiprocessFunc(self):
array = [array of filenames to be called]
p = Pool(processes=20)
p.map_async(self.multiprocessFunc, array) #added self. here
Now having done some experiments, I see that map_async could take quite some time to start up (I think because multiprocessing creates processes) and any test code might call funcThatCallsMultiprocessFunc and then quit before the Pool has got started.
In my tests I had to wait for over 10 seconds after funcThatCallsMultiprocessFunc before calls to multiprocessFunc started. But once started, they seemed to run just fine.
This is the actual code I've used:
MyClass.py
from multiprocessing import Pool
import time
import string
class myClass():
def __init__(self):
self.result = None
def multiprocessFunc(self, f):
time.sleep(1)
print(f)
return f
def funcThatCallsMultiprocessFunc(self):
array = [c for c in string.ascii_lowercase]
print(array)
p = Pool(processes=20)
p.map_async(self.multiprocessFunc, array, callback=self.done)
p.close()
def done(self, arg):
self.result = 'Done'
print('done', arg)
Run.py
from MyClass import myClass
import time
def main():
c = myClass()
c.funcThatCallsMultiprocessFunc()
for i in range(30):
print(i, c.result)
time.sleep(1)
if __name__=="__main__":
main()
The if __name__=='__main__' construct is an import protection. You want to use it, to stop multiprocessing from running your setup on import.
In your case, you can leave out this protection in the class setup. Be sure to protect the execution points of the class in the calling file like this:
def apply_async_with_callback():
pool = mp.Pool(processes=30)
for i in range(z):
pool.apply_async(parallel_function, args = (i,x,y, ), callback = callback_function)
pool.close()
pool.join()
print "Multiprocessing done!"
if __name__ == '__main__':
apply_async_with_callback()

Basic multiprocessing with infinity loop and queue

import random
import queue as Queue
import _thread as Thread
a = Queue.Queue()
def af():
while True:
a.put(random.randint(0,1000))
def bf():
while True:
if (not a.empty()): print (a.get())
def main():
Thread.start_new_thread(af, ())
Thread.start_new_thread(bf, ())
return
if __name__ == "__main__":
main()
the above code works fine with extreme high CPU usage, i tried to use multiprocessing with no avail. i have tried
def main():
multiprocessing.Process(target=af).run()
multiprocessing.Process(target=bf).run()
and
def main():
manager = multiprocessing.Manager()
a = manager.Queue()
pool = multiprocessing.Pool()
pool.apply_async(af)
pool.apply_async(bf)
both not working, can anyone please help me? thanks a bunch ^_^
def main():
multiprocessing.Process(target=af).run() # will not return
multiprocessing.Process(target=bf).run()
The above code does not work because af does not return; no chance to call bf. You need to separate run call to start/join so that both can run in parallel. (+ to make them share manage.Queue)
To make the second code work, you need to pass a (manager.Queue object) to functions. Otherwise they will use Queue.Queue global object which is not shared between processes; need to modify af, bf to accepts a, and main to pass a.
def af(a):
while True:
a.put(random.randint(0, 1000))
def bf(a):
while True:
print(a.get())
def main():
manager = multiprocessing.Manager()
a = manager.Queue()
pool = multiprocessing.Pool()
proc1 = pool.apply_async(af, [a])
proc2 = pool.apply_async(bf, [a])
# Wait until process ends. Uncomment following line if there's no waiting code.
# proc1.get()
# proc2.get()
In the first alternative main you use Process, but the method you should call to start the activity is not run(), as one would think, but rather start(). You will want to follow that up with appropriate join() statements. Following the information in multiprocessing (available here: https://docs.python.org/2/library/multiprocessing.html), here is a working sample:
import random
from multiprocessing import Process, Queue
def af(q):
while True:
q.put(random.randint(0,1000))
def bf(q):
while True:
if not q.empty():
print (q.get())
def main():
a = Queue()
p = Process(target=af, args=(a,))
c = Process(target=bf, args=(a,))
p.start()
c.start()
p.join()
c.join()
if __name__ == "__main__":
main()
To add to the accepted answer, in the original code:
while True:
if not q.empty():
print (q.get())
q.empty() is being called every time which is unnecessary since q.get() if the queue is empty will wait until something is available here documentation.
Similar answer here
I assume that this could affect the performance since calling the .empty() every iteration should consume more resources (it should be more noticeable if Thread was used instead of Process because Python Global Interpreter Lock (GIL))
I know it's an old question but hope it helps!

Does multiprocessing.Pipe have to be passed to subprocess through inheritance

I understand that multiprocessing.Queue has to be passed to subprocess through inheritance. However, when I try passing Pipe to a subprocess through message passing, like the following code, the error I got isn't saying that "Pipe can only be shared between processes through inheritance". Instead it fails at q.get() and the error says TypeError: Required argument 'handle' (pos 1) not found. I'm wondering is it at all possible to do so? Assuming that the pipes are implemented using linux named pipes, then all it matters is the name of the pipe and it could be the states to be serialized and passed between processes right?
from multiprocessing import Process, Pipe, Queue
def reader(q):
output_p = q.get()
msg = output_p.recv()
while msg is not None:
msg = output_p.recv()
if __name__ == '__main__':
q = Queue()
reader_p = Process(target=reader, args=(q,))
reader_p.start() # Launch the reader process
output_p, input_p = Pipe(True)
q.put(output_p)
input_p.send('MyMessage')
input_p.send(None)
reader_p.join()
This is a bug which has been fixed in Python 3.
Your code in Python 3 works flawlessly.
noxadofox gave the correct answer here. I'm adding an example I devised to validate that pipes do not require inheritance. In this example I create a second pipe after the executor started its two processes and pass it to the existing processes as a parameter.
""" Multiprocessing pipe and queue test """
import multiprocessing
import concurrent.futures
import time
class Example:
def __init__(self):
manager = multiprocessing.Manager()
q = manager.Queue()
executor = concurrent.futures.ProcessPoolExecutor(max_workers=2)
pipe_out_1, pipe_in_1 = multiprocessing.Pipe(duplex=True)
executor.submit(self.requester, q, pipe_in_1)
executor.submit(self.worker, q, pipe_out_1)
print(executor._processes)
pipe_out_2, pipe_in_2 = multiprocessing.Pipe(duplex=True)
executor.submit(self.requester, q, pipe_in_2)
executor.submit(self.worker, q, pipe_out_2)
print(executor._processes)
#staticmethod
def worker(q, pipe_out):
task = q.get()
print('worker got task {}'.format(task))
pipe_out.send(task + '-RESPONSE')
print('loop_proc sent')
#staticmethod
def requester(q, pipe_in):
q.put('TASK')
response = pipe_in.recv()
print('requester got response {}'.format(response))
time.sleep(2)
if __name__ == '__main__':
Example()
time.sleep(30)

python3 multiprocessing example crashed my pc :(

I am new to multiprocessing
I have run example code for two 'highly recommended' multiprocessing examples given in response to other stackoverflow multiprocessing questions. Here is an example of one (which i dare not run again!)
test2.py (running from pydev)
import multiprocessing
class MyFancyClass(object):
def __init__(self, name):
self.name = name
def do_something(self):
proc_name = multiprocessing.current_process().name
print(proc_name, self.name)
def worker(q):
obj = q.get()
obj.do_something()
queue = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
queue.put(MyFancyClass('Fancy Dan'))
# Wait for the worker to finish
queue.close()
queue.join_thread()
p.join()
When I run this my computer slows down imminently. It gets incrementally slower. After some time I managed to get into the task manager only to see MANY MANY python.exe under the processes tab. after trying to end process on some, my mouse stopped moving. It was the second time i was forced to reboot.
I am too scared to attempt a third example...
running - Intel(R) Core(TM) i7 CPU 870 # 2.93GHz (8 CPUs), ~2.9GHz on win7 64
If anyone know what the issue is and can provide a VERY SIMPLE example of multiprocessing (send a string too a multiprocess, alter it and send it back for printing) I would be very grateful.
From the docs:
Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process).
Thus, on Windows, you must wrap your code inside a
if __name__=='__main__':
block.
For example, this sends a string to the worker process, the string is reversed and the result is printed by the main process:
import multiprocessing as mp
def worker(inq,outq):
obj = inq.get()
obj = obj[::-1]
outq.put(obj)
if __name__=='__main__':
inq = mp.Queue()
outq = mp.Queue()
p = mp.Process(target=worker, args=(inq,outq))
p.start()
inq.put('Fancy Dan')
# Wait for the worker to finish
p.join()
result = outq.get()
print(result)
Because of the way multiprocessing works on Windows (child processes import the __main__ module) the __main__ module cannot actually run anything when imported -- any code that should execute when run directly must be protected by the if __name__ == '__main__' idiom. Your corrected code:
import multiprocessing
class MyFancyClass(object):
def __init__(self, name):
self.name = name
def do_something(self):
proc_name = multiprocessing.current_process().name
print(proc_name, self.name)
def worker(q):
obj = q.get()
obj.do_something()
if __name__ == '__main__':
queue = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
queue.put(MyFancyClass('Fancy Dan'))
# Wait for the worker to finish
queue.close()
queue.join_thread()
p.join()
Might I suggest this link? It's using threads, instead of multiprocessing, but many of the principles are the same.

Categories