My current project requires multiple processes, and I need to share an array between them. The array must be writable at any time, and it has to have multiple dimensions (example: [["test",2],[87209873,"howdy"]]). I've been looking for an answer to this for a few hours now, but I can't find anything. Please help. Thanks in advance!
Try this:
from multiprocessing import Pool, Manager

def worker(v, array):
    array.append(["test", v])

def main():
    foo = [["test", 2], [87209873, "howdy"]]
    array = Manager().list(foo)
    with Pool(processes=4) as pool:
        pool.starmap(worker, [(i, array) for i in range(4)])
    print(array)

if __name__ == "__main__":
    main()
[EDITED]
If you want the main program to keep running while the calculation is in progress, wrap the pooling in a separate thread:
from multiprocessing import Pool, Manager
from threading import Thread

def _worker(v, array):
    for i in range(10000):
        array.append(["test", v])

def processor(array):
    with Pool(processes=4) as pool:
        pool.starmap(_worker, [(i, array) for i in range(4)])

def main():
    foo = [["test", 2], [87209873, "howdy"]]
    array = Manager().list(foo)
    t = Thread(target=processor, args=(array,))
    t.start()
    print("Good day!")
    # Wait until the thread finishes; otherwise you would
    # print the array without knowing whether it is complete.
    t.join()
    print(array)

if __name__ == "__main__":
    main()
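One caveat that bites with nested data like this (not from the answer above; it is a documented property of Manager proxies): mutating an inner list obtained from a managed list does not propagate back, because indexing the proxy returns a local copy. Re-assign the modified row instead, roughly like so:

from multiprocessing import Manager

if __name__ == "__main__":
    manager = Manager()
    array = manager.list([["test", 2], [87209873, "howdy"]])

    row = array[0]          # a plain local copy, not a proxy
    row.append("extra")     # this alone would NOT be seen by other processes
    array[0] = row          # re-assignment pushes the change back to the manager
    print(array[0])         # ['test', 2, 'extra']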
First of all, a list is not an array. If you want to share a list between different processes, you can use a Manager from the multiprocessing module, for example:
import multiprocessing as mp

def remove_last_element(mp_list: list):
    mp_list.pop()

def append_list(mp_list: list):
    mp_list.append([12, 'New Hello'])

if __name__ == "__main__":
    mp_list = mp.Manager().list()
    mp_list.append(['Hello'])
    print("before multiprocessing:", mp_list)

    worker1 = mp.Process(target=remove_last_element, args=(mp_list,))
    worker2 = mp.Process(target=append_list, args=(mp_list,))

    worker1.start()
    worker2.start()
    worker1.join()
    worker2.join()

    print("after multiprocessing:", mp_list)
>>> before multiprocessing: [['Hello']]
>>> after multiprocessing: [[12, 'New Hello']]
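As an aside, if you ever need a genuine array (flat, fixed-type numeric data) rather than a list, multiprocessing.Array gives you a lock-protected shared buffer. A minimal sketch, noting that it cannot hold mixed nested data like the example above:

from multiprocessing import Process, Array

def worker(arr):
    arr[0] = 99                    # the write is visible to the parent process

if __name__ == "__main__":
    arr = Array('i', [1, 2, 3])    # 'i' means C signed int
    p = Process(target=worker, args=(arr,))
    p.start()
    p.join()
    print(arr[:])                  # [99, 2, 3]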
I want to use multiprocessing to do the following:
from multiprocessing import Pool, cpu_count

class myClass:
    def proc(self):
        # processing random numbers
        return a

    def gen_data(self):
        with Pool(cpu_count()) as q:
            data = q.map(self.proc, [_ for i in range(cpu_count())])  # What is the correct approach?
        return data
Try this:
def proc(self, i):
    # processing random numbers
    return a

def gen_data(self):
    with Pool(cpu_count()) as q:
        data = q.map(self.proc, [i for i in range(cpu_count())])
    return data
Since you don't have to pass an argument to the worker processes, there's no reason to use map; just call apply_async() as many times as needed.
Here's what I'm saying:
from multiprocessing import cpu_count
from multiprocessing.pool import Pool
from random import randint

class MyClass:
    def proc(self):
        # processing random numbers
        return randint(1, 10)

    def gen_data(self, num_procs):
        with Pool() as pool:  # The default pool size is the number of CPUs.
            results = [pool.apply_async(self.proc) for _ in range(num_procs)]
            pool.close()
            pool.join()  # Wait until all worker processes exit.
            return [result.get() for result in results]  # Gather results.

if __name__ == '__main__':
    obj = MyClass()
    print(obj.gen_data(8))
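A side note on why q.map(self.proc, ...) can work at all: the pool has to pickle the bound method, which means pickling the instance too. On Python 3 bound methods pickle by reference, so this works as long as the instance itself is picklable; it is a common failure point on Python 2 or when the instance holds unpicklable state. A quick way to check, as a sketch:

import pickle

class MyClass:
    def proc(self):
        return 42

obj = MyClass()
blob = pickle.dumps(obj.proc)    # pickles the instance plus the method name
print(pickle.loads(blob)())      # 42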
I know the basic usage of multiprocessing pools, and I use the apply_async() function to avoid blocking. My problem code looks like this:
from multiprocessing import Pool, Queue
import time

q = Queue(maxsize=20)
script = "my_path/my_exec_file"

def initQueue():
    ...

def test_func(queue):
    print 'Coming'
    while True:
        do_sth
        ...

if __name__ == '__main__':
    initQueue()
    pool = Pool(processes=3)
    for i in xrange(11, 20):
        result = pool.apply_async(test_func, (q,))
    pool.close()

    while True:
        if q.empty():
            print 'Queue is empty, quit'
            break
        print 'Main Process Listening'
        time.sleep(2)
The output is always 'Main Process Listening'; I never see the word 'Coming'. The code above has no syntax errors and raises no exceptions. Can anyone help? Thanks!
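A likely cause, offered as a hedged sketch rather than a confirmed diagnosis: a plain multiprocessing.Queue cannot be sent to pool workers as a task argument; pickling it raises a RuntimeError, and apply_async() swallows that error because result.get() is never called. A Manager().Queue() proxy is picklable and should work, along these lines:

from multiprocessing import Pool, Manager

def test_func(queue):
    print('Coming')                   # now reachable in the workers
    queue.put('hello')

if __name__ == '__main__':
    q = Manager().Queue(maxsize=20)   # proxy queue, safe to pass as an argument
    pool = Pool(processes=3)
    for i in range(11, 20):
        pool.apply_async(test_func, (q,))
    pool.close()
    pool.join()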
I have this code (it is a snippet from my program):
from multiprocessing import Process, Manager, cpu_count, Pool, Value, Lock

def grab_future_products(p, test):
    print("Starting process %s" % p)

if __name__ == "__main__":  # Main program
    n = 4
    test = Value('i', 0)
    pool = Pool(processes=n)  # a pool of n worker processes
    for i in range(n):
        pool.apply_async(grab_future_products, args=(i, test))
    pool.close()
    pool.join()
If I run it with python test.py I get no output, no errors, just nothing. I wanted to use the variable test as a shared integer between all processes, so that another process could do something like:

if test.value == X:
    break

Interestingly, if I replace args=(i, test) with args=(i, 1), it works as desired. So my question is: why can't I pass a Value() object into a process, and how can I solve this problem? Many thanks.
The trick is to use multiprocessing.Manager, as also mentioned here: Sharing a result queue among several processes:
from multiprocessing import Pool, Manager

def grab_future_products(p, test):
    print("Starting process %s, value=%i" % (p, test.value))

if __name__ == "__main__":  # Main program
    n = 4
    pool = Pool(processes=n)
    m = Manager()
    v = m.Value('i', 0)
    for i in range(n):
        res = pool.apply_async(grab_future_products, args=(i, v))
    pool.close()
    pool.join()
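A note on why the original version printed nothing, as a sketch (the exact error message may vary by Python version): a bare Value can only be shared through inheritance, not pickled into a pool task, and apply_async() hides the resulting error unless you fetch the result:

from multiprocessing import Pool, Value

def grab_future_products(p, test):
    print("Starting process %s" % p)

if __name__ == "__main__":
    test = Value('i', 0)
    with Pool(processes=4) as pool:
        res = pool.apply_async(grab_future_products, args=(0, test))
        try:
            res.get()   # re-raises the hidden error, something like:
                        # RuntimeError: Synchronized objects should only be
                        # shared between processes through inheritance
        except RuntimeError as e:
            print("pool swallowed:", e)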
I am trying to understand how to use the multiprocessing module in Python. The code below spawns four processes and outputs the results as they become available. It seems to me that there must be a better way to obtain the results from the Queue; some method that does not rely on counting how many items the Queue contains, but that just returns items as they become available and then gracefully exits once the queue is empty. The docs say that the Queue.empty() method is not reliable. Is there a better alternative for consuming the results from the queue?
import multiprocessing as mp
import time

def multby4_wq(x, queue):
    print "Starting!"
    time.sleep(5.0 / x)
    a = x * 4
    queue.put(a)

if __name__ == '__main__':
    queue1 = mp.Queue()
    for i in range(1, 5):
        p = mp.Process(target=multby4_wq, args=(i, queue1))
        p.start()
    for i in range(1, 5):  # This is what I am referring to as counting again
        print queue1.get()
Instead of using a queue, how about using a Pool?
For example,
import multiprocessing as mp
import time

def multby4_wq(x):
    print "Starting!"
    time.sleep(5.0 / x)
    a = x * 4
    return a

if __name__ == '__main__':
    pool = mp.Pool(4)
    for result in pool.map(multby4_wq, range(1, 5)):
        print result
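If you specifically want each result as soon as it is ready, rather than in input order, Pool.imap_unordered may be closer to what the question asks for. A small sketch:

import multiprocessing as mp
import time

def multby4_wq(x):
    time.sleep(5.0 / x)
    return x * 4

if __name__ == '__main__':
    pool = mp.Pool(4)
    # Results are yielded in completion order, not input order.
    for result in pool.imap_unordered(multby4_wq, range(1, 5)):
        print(result)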
Pass multiple arguments
Assume you have a function that accepts multiple parameters (add in this example). Make a wrapper function that passes the arguments on to add (add_wrapper).
import multiprocessing as mp
import time

def add(x, y):
    time.sleep(1)
    return x + y

def add_wrapper(args):
    return add(*args)

if __name__ == '__main__':
    pool = mp.Pool(4)
    for result in pool.map(add_wrapper, [(1, 2), (3, 4), (5, 6), (7, 8)]):
        print result
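For what it's worth, on Python 3.3 and later Pool.starmap unpacks the argument tuples itself, so the wrapper function is no longer needed:

import multiprocessing as mp
import time

def add(x, y):
    time.sleep(1)
    return x + y

if __name__ == '__main__':
    pool = mp.Pool(4)
    for result in pool.starmap(add, [(1, 2), (3, 4), (5, 6), (7, 8)]):
        print(result)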
I want a long-running process to return its progress over a Queue (or something similar), which I will feed to a progress-bar dialog. I also need the result when the process is completed. A test example below fails with RuntimeError: Queue objects should only be shared between processes through inheritance.
import multiprocessing, time

def task(args):
    count = args[0]
    queue = args[1]
    for i in xrange(count):
        queue.put("%d mississippi" % i)
    return "Done"

def main():
    q = multiprocessing.Queue()
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(x, q) for x in range(10)])
    time.sleep(1)
    while not q.empty():
        print q.get()
    print result.get()

if __name__ == "__main__":
    main()
I've been able to get this to work using individual Process objects (where I am allowed to pass a Queue reference), but then I don't have a pool to manage the many processes I want to launch. Any advice on a better pattern for this?
The following code seems to work:
import multiprocessing, time

def task(args):
    count = args[0]
    queue = args[1]
    for i in xrange(count):
        queue.put("%d mississippi" % i)
    return "Done"

def main():
    manager = multiprocessing.Manager()
    q = manager.Queue()
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(x, q) for x in range(10)])
    time.sleep(1)
    while not q.empty():
        print q.get()
    print result.get()

if __name__ == "__main__":
    main()
Note that the queue comes from manager.Queue() rather than multiprocessing.Queue(). Thanks Alex for pointing me in this direction.
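One refinement worth considering, as a sketch (q.empty() is documented as unreliable while producers are still running, and the fixed time.sleep(1) can miss messages): poll for progress while the tasks run, and rely on result.ready() to know when the final drain will see everything:

import multiprocessing, time

def task(args):
    count, queue = args
    for i in range(count):
        queue.put("%d mississippi" % i)
    return "Done"

def main():
    manager = multiprocessing.Manager()
    q = manager.Queue()
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(x, q) for x in range(10)])
    # Report progress while the tasks run...
    while not result.ready():
        while not q.empty():
            print(q.get())
        time.sleep(0.1)
    # ...then drain whatever is left; all producers have finished by now.
    while not q.empty():
        print(q.get())
    print(result.get())

if __name__ == "__main__":
    main()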
Making q global works...:
import multiprocessing, time

q = multiprocessing.Queue()

def task(count):
    for i in xrange(count):
        q.put("%d mississippi" % i)
    return "Done"

def main():
    pool = multiprocessing.Pool()
    result = pool.map_async(task, range(10))
    time.sleep(1)
    while not q.empty():
        print q.get()
    print result.get()

if __name__ == "__main__":
    main()
If you need multiple queues, e.g. to avoid mixing up the progress of the various pool processes, a global list of queues should work, as sketched below (of course, each process will then need to know which index in the list to use, but that's OK to pass as an argument;-).
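For concreteness, a minimal sketch of that global-list idea; names like NUM_TASKS are mine, for illustration. It relies on the workers inheriting the queues, so it assumes fork-based process creation (the default on Linux); under the spawn start method the module-level queues would be re-created in each child rather than shared:

import multiprocessing

NUM_TASKS = 4
queues = [multiprocessing.Queue() for _ in range(NUM_TASKS)]  # inherited by workers on fork

def task(args):
    index, count = args
    for i in range(count):
        queues[index].put("%d mississippi" % i)
    return "Done"

def main():
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(i, 5) for i in range(NUM_TASKS)])
    result.wait()
    # q.empty() is only a hint, but with the tasks done and this little data
    # it drains everything in practice.
    for index, q in enumerate(queues):
        while not q.empty():
            print("queue %d: %s" % (index, q.get()))
    print(result.get())

if __name__ == "__main__":
    main()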