Multiprocessing pool with numpy functions - python

I have an i5-8600k with 6 cores and am running Windows 10. I am trying to perform multiprocessing with 2 numpy functions. I posted an issue about this before but was not successful in getting it to run; the code below is from the answer to that issue. I am trying to run func1() and func2() at the same time, however when I run the code below it keeps running forever.
import multiprocessing as mp
import numpy as np

num_cores = mp.cpu_count()

Numbers = np.array([1,2,3,4,5,6,7,8,9,10,11,12])

def func1():
    Solution_1 = Numbers + 10
    return Solution_1

def func2():
    Solution_2 = Numbers * 10
    return Solution_2

# Getting ready my cores, I left one aside
pool = mp.Pool(num_cores-1)
# This is to use all functions easily
functions = [func1, func2]
# This is to store the results
solutions = []
for function in functions:
    solutions.append(pool.apply(function, ()))

There are several issues with the code. First, if you want to run this under a Jupyter Notebook on Windows, then you need to put your worker functions func1 and func2 in an external module, for example workers.py, and import them. That means you now need to either pass the Numbers array as an argument to the workers or initialize static storage of each process with the array when you initialize the pool. We will use the second method with a function called init_pool, which also has to be imported if we are running under a Notebook:
workers.py

def func1():
    Solution_1 = Numbers + 10
    return Solution_1

def func2():
    Solution_2 = Numbers * 10
    return Solution_2

def init_pool(n_array):
    global Numbers
    Numbers = n_array
The second issue is that when running under Windows, the code that creates sub-processes or a multiprocessing pool must be within a block governed by an if __name__ == '__main__': conditional. Third, it is wasteful to create a pool of size greater than 2 if you are only trying to run two parallel "jobs." And fourth, and finally, you are using the wrong pool method: apply blocks until the submitted "job" (i.e. the one processed by func1) completes, so you are not achieving any degree of parallelism at all. You should be using apply_async.
import multiprocessing as mp
import numpy as np
from workers import func1, func2, init_pool

if __name__ == '__main__':
    #num_cores = mp.cpu_count()
    Numbers = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
    pool = mp.Pool(2, initializer=init_pool, initargs=(Numbers,)) # more than 2 is wasteful
    # This is to use all functions easily
    functions = [func1, func2]
    # This is to store the results
    solutions = []
    results = [pool.apply_async(function) for function in functions]
    for result in results:
        solutions.append(result.get()) # wait for completion and get the result
    print(solutions)
Prints:
[array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]), array([ 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120])]

Related

Sync Shared Variable in Multiprocessing

I have created a shared class with a shared variable. My function is supposed to run two parallel processes and find the total number of perfect square integers. I'm able to get the count of perfect squares in each array, but when the processes are done, I'm not able to get the sum of both of these counts. Could you check where I went wrong? Creating the Shared class was unnecessary, but I just did it to check if it would work.
Here is my execution:
from multiprocessing import *
import multiprocessing
import math

class Shared:
    def __init__(self) -> None:
        self.total = multiprocessing.Value('f', 0)

    def setMP(self, value):
        self.total.value = value

    def getMP(self):
        return self.total

# global total
# total = multiprocessing.Value('f', 0) # using a synchronized value for all processes
shared = Shared()
shared.setMP(0)

# function to determine if the number is perfect square
def is_perfect(number):
    if float(math.sqrt(number)) *2 == int(math.sqrt(number))*2:
        return True # the number is perfect square
    return False

# function to find the total number of perfectly square numbers
def find_perfect(array):
    # loop through each element in the array
    for element in array:
        if is_perfect(element):
            # get value
            shared.getMP().acquire()
            i = shared.getMP().value + 1
            shared.setMP(i)
            shared.getMP().release()
    print(shared.getMP())

def perfectSquares(listA, listB):
    # multiprocess
    p1 = Process(target=find_perfect, args=(listA,))
    p2 = Process(target=find_perfect, args=(listB,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    return shared.getMP()

if __name__ == '__main__':
    list1 = [7, 8, 23, 64, 2, 3]
    list2 = [64, 54, 32, 35, 36]
    total = perfectSquares(list1, list2)
    print(total)
You are running under Windows, a platform that uses spawn rather than fork to create new processes. This means that when a new process is created, execution starts at the very top of the program. That is why the code that creates the new process must be within an if __name__ == '__main__': block (if it weren't, you would get into a recursive loop creating new processes). But it also means that each new process you create re-executes any code that is at global scope and therefore creates its own shared variable instance.
The easiest fix is to move the creation of shared into function perfectSquares and then pass shared as an argument to find_perfect. Be aware that although you have two processes running in parallel, one must finish before the other. The first process to finish will most likely print a count of 1.0 or 2.0 depending on which process completes first (although it could even be 3.0 if the two processes finish very close together), and the second process to finish must print a count of 3.0.
from multiprocessing import *
import multiprocessing
import math

class Shared:
    def __init__(self) -> None:
        self.total = multiprocessing.Value('f', 0)

    def setMP(self, value):
        self.total.value = value

    def getMP(self):
        return self.total

# function to determine if the number is perfect square
def is_perfect(number):
    if float(math.sqrt(number)) *2 == int(math.sqrt(number))*2:
        return True # the number is perfect square
    return False

# function to find the total number of perfectly square numbers
def find_perfect(array, shared):
    # loop through each element in the array
    for element in array:
        if is_perfect(element):
            # get value
            shared.getMP().acquire()
            i = shared.getMP().value + 1
            shared.setMP(i)
            shared.getMP().release()
    print(shared.getMP())

def perfectSquares(listA, listB):
    # global total
    # total = multiprocessing.Value('f', 0) # using a synchronized value for all processes
    shared = Shared()
    shared.setMP(0)
    # multiprocess
    p1 = Process(target=find_perfect, args=(listA, shared))
    p2 = Process(target=find_perfect, args=(listB, shared))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    return shared.getMP()

if __name__ == '__main__':
    list1 = [7, 8, 23, 64, 2, 3]
    list2 = [64, 54, 32, 35, 36]
    total = perfectSquares(list1, list2)
    print(total)
Prints:
<Synchronized wrapper for c_float(1.0)>
<Synchronized wrapper for c_float(3.0)>
<Synchronized wrapper for c_float(3.0)>
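Note that perfectSquares returns the Synchronized wrapper itself; if you only want the number, read its .value attribute in the main block, for example:

total = perfectSquares(list1, list2)
print(total.value)  # prints 3.0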

How to start functions in parallel, check if they are done, and start a new function in python?

I want to write a python code that does the following:
At first, it starts, say, 3 processes (or threads, or whatever) in parallel.
Then in a loop, python waits until any of the processes have finished (and returned some value)
Then, the python code starts a new function
In the end, I want 3 processes always running in parallel, until all functions I need to run are run. Here is some pseudocode:
import time
import random
from multiprocessing import Process

# some random function which can have different execution time
def foo():
    time.sleep(random.randint(10) + 2)
    return 42

# Start 3 functions
p = []
p.append(Process(target=foo))
p.append(Process(target=foo))
p.append(Process(target=foo))

while(True):
    # wait until one of the processes has finished
    ???

    # then add a new process so that always 3 are running in parallel
    p.append(Process(target=foo))
I am not sure this makes clear what I want. Please ask if anything is unclear.
What you really want is to start three processes and feed a queue with jobs that you want executed. Then there will only ever be three processes and when one is finished, it reads the next item from the queue and executes that:
import time
import random
from multiprocessing import Process, Queue

# some random function which can have different execution time
def foo(a):
    print('foo', a)
    time.sleep(random.randint(1, 10) + 2)
    print(a)
    return 42

def readQueue(q):
    while True:
        item = q.get()
        if item:
            f, *args = item
            f(*args)
        else:
            return

if __name__ == '__main__':
    q = Queue()
    for a in range(4):      # create 4 jobs
        q.put((foo, a))
    for _ in range(3):      # sentinel for 3 processes
        q.put(None)

    # Start 3 processes
    p = []
    p.append(Process(target=readQueue, args=(q,)))
    p.append(Process(target=readQueue, args=(q,)))
    p.append(Process(target=readQueue, args=(q,)))

    for j in p:
        j.start()
    #time.sleep(10)
    for j in p:
        j.join()
You can use the Pool of the multiprocessing module.
my_foos = [foo, foo, foo, foo]

def do_something(method):
    method()

from multiprocessing import Pool

with Pool(3) as p:
    p.map(do_something, my_foos)
The number 3 is the number of parallel worker processes.
map passes each item of the input list as the argument to do_something.
In your case do_something can be a function that calls whichever function you want processed, with the functions themselves passed in as the input list.
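As a self-contained sketch of this idea (foo here is just a stand-in for the functions you actually want to run, and the Pool is created under the __main__ guard so it also works on Windows):

from multiprocessing import Pool
import random
import time

def foo():
    # stand-in for real work: sleep for a random amount of time
    time.sleep(random.randint(1, 3))
    return 42

def do_something(method):
    # call the function object that was passed in and return its result
    return method()

if __name__ == '__main__':
    my_foos = [foo, foo, foo, foo]   # 4 jobs
    with Pool(3) as p:               # at most 3 run at the same time
        results = p.map(do_something, my_foos)
    print(results)                   # [42, 42, 42, 42]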

Multiprocessing - Shared Array

So I'm trying to implement multiprocessing in Python where I wish to have a Pool of 4-5 processes running a method in parallel. The purpose of this is to run a total of a thousand Monte Carlo simulations (200-250 simulations per process) instead of running all 1000 in a single process. I want each process to write to a common shared array by acquiring a lock on it as soon as it is done processing the result for one simulation, writing the result, and releasing the lock. So it should be a three step process:
Acquire lock
Write result
Release lock for other processes waiting to write to array.
Every time I pass the array to the processes, each process creates a copy of that array, which I do not want, as I want a common array. Can anyone help me with this by providing sample code?
Since you're only returning state from the child processes to the parent process, using a shared array and explicit locks is overkill. You can use Pool.map or Pool.starmap to accomplish exactly what you need. For example:
from multiprocessing import Pool

class Adder:
    """I'm using this class in place of a monte carlo simulator"""

    def add(self, a, b):
        return a + b

def setup(x, y, z):
    """Sets up the worker processes of the pool.

    Here, x, y, and z would be your global settings. They are only included
    as an example of how to pass args to setup. In this program they would
    be "some arg", "another" and 2
    """
    global adder
    adder = Adder()

def job(a, b):
    """wrapper function to start the job in the child process"""
    return adder.add(a, b)

if __name__ == "__main__":
    args = list(zip(range(10), range(10, 20)))
    # args == [(0, 10), (1, 11), ..., (8, 18), (9, 19)]

    with Pool(initializer=setup, initargs=["some arg", "another", 2]) as pool:
        # runs jobs in parallel and returns when all are complete
        results = pool.starmap(job, args)

    print(results)  # prints [10, 12, ..., 26, 28]
Not tested, but something like that should work.
The array and lock are shared between processes.
from multiprocessing import Process, Array, Lock

def f(array, lock, n):  # n is the dedicated location in the array
    lock.acquire()
    array[n] = -array[n]
    lock.release()

if __name__ == '__main__':
    size = 100
    arr = Array('i', [3, -7])
    lock = Lock()
    p = Process(target=f, args=(arr, lock, 0))
    q = Process(target=f, args=(arr, lock, 1))
    p.start()
    q.start()
    q.join()
    p.join()
    print(arr[:])
The documentation at https://docs.python.org/3.5/library/multiprocessing.html has plenty of examples to start with.

RawArray not modified by processes as shared memory for Python multiprocessing

I am working with Python multiprocessing, using Pool to start concurrent processes and RawArray to share an array between the concurrent processes. I do not need to synchronize access to the RawArray, that is, the array can be modified by any process at any time.
The test code for RawArray is below (do not mind the meaning of the program, as it is just a test):
from multiprocessing.sharedctypes import RawArray
import time

sieve = RawArray('i', (10 + 1)*[1])  # shared memory between processes

import multiprocessing as mp

def foo_pool(x):
    time.sleep(0.2)
    sieve[x] = x*x  # modify the shared memory array. seem not work ?
    return x*x

result_list = []

def log_result(result):
    result_list.append(result)

def apply_async_with_callback():
    pool = mp.Pool(processes = 4)
    for i in range(10):
        pool.apply_async(foo_pool, args = (i,), callback = log_result)
    pool.close()
    pool.join()
    print(result_list)
    for x in sieve:
        print(x)  # !!! sieve is [1, 1, ..., 1]

if __name__ == '__main__':
    apply_async_with_callback()
The code does not work as expected; I have commented the key statements. I have been stuck on this for a whole day. Any help or constructive advice would be very much appreciated.
time.sleep fails because you did not import time
use sieve[x] = x*x to modify the array instead of sieve[x].value = x*x
on Windows, your code creates a new sieve in each subprocess. You need to pass a reference to the shared array, for example like this:
def foo_init(s):
    global sieve
    sieve = s

def apply_async_with_callback():
    pool = mp.Pool(processes = 4, initializer=foo_init, initargs=(sieve,))

if __name__ == '__main__':
    sieve = RawArray('i', (10 + 1)*[1])
You could use multithreading instead of multiprocessing, as threads can share the memory of the main process natively.
If you are worried about Python's GIL, you can resort to the nogil option of numba.
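A minimal sketch of that threading suggestion (plain threads and an ordinary list, no numba; the names are illustrative):

import threading
import time

sieve = [1] * 11  # an ordinary list; threads share the parent process's memory

def foo_thread(x):
    time.sleep(0.2)
    sieve[x] = x * x  # visible to the main thread without any shared-memory machinery

if __name__ == '__main__':
    threads = [threading.Thread(target=foo_thread, args=(i,)) for i in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(sieve)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 1]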
Working version (using multiprocessing with a pool initializer):
from multiprocessing import Pool, RawArray
import time

def foo_pool(x):
    sieve[x] = x * x  # modify the shared memory array.

def foo_init(s):
    global sieve
    sieve = s

def apply_async_with_callback(loc_size):
    with Pool(processes=4, initializer=foo_init, initargs=(sieve,)) as pool:
        pool.map(foo_pool, range(loc_size))
    for x in sieve:
        print(x)

if __name__ == '__main__':
    size = 50
    sieve = RawArray('i', size * [1])  # shared memory between processes
    apply_async_with_callback(size)

Python Multiprocessing with a single function

I have a simulation that is currently running, but the ETA is about 40 hours -- I'm trying to speed it up with multiprocessing.
It iterates over 3 values of one variable (L), and over 99 values of a second variable (a). Using these values, it runs a complex simulation and returns 9 different standard deviations. Thus (even though I haven't coded it that way yet) it is essentially a function that takes two values as inputs (L, a) and returns 9 values.
Here is the essence of the code I have:
STD_1 = []
STD_2 = []
# etc.

for L in range(0,6,2):
    for a in range(1,100):
        ### simulation code ###
        STD_1.append(value_1)
        STD_2.append(value_2)
        # etc.
Here is what I can modify it to:
master_list = []

def simulate(a,L):
    ### simulation code ###
    return (a,L,STD_1, STD_2 etc.)

for L in range(0,6,2):
    for a in range(1,100):
        master_list.append(simulate(a,L))
Since each of the simulations is independent, this seems like an ideal place to implement some sort of multi-threading/processing.
How exactly would I go about coding this?
EDIT: Also, will everything be returned to the master list in order, or could it possibly be out of order if multiple processes are working?
EDIT 2: This is my code -- but it doesn't run correctly. It asks if I want to kill the program right after I run it.
import multiprocessing

data = []
for L in range(0,6,2):
    for a in range(1,100):
        data.append((L,a))
print(data)

def simulation(arg):
    # unpack the tuple
    a = arg[1]
    L = arg[0]
    STD_1 = a**2
    STD_2 = a**3
    STD_3 = a**4
    # simulation code #
    return((STD_1,STD_2,STD_3))

print("1")
p = multiprocessing.Pool()
print("2")
results = p.map(simulation, data)
EDIT 3: Also, what are the limitations of multiprocessing? I've heard that it doesn't work on OS X. Is this correct?
1. Wrap the data for each iteration up into a tuple.
2. Make a list data of those tuples.
3. Write a function f to process one tuple and return one result.
4. Create a p = multiprocessing.Pool() object.
5. Call results = p.map(f, data).
This will run as many instances of f as your machine has cores, in separate processes.
Edit1: Example:
from multiprocessing import Pool

data = [('bla', 1, 3, 7), ('spam', 12, 4, 8), ('eggs', 17, 1, 3)]

def f(t):
    name, a, b, c = t
    return (name, a + b + c)

p = Pool()
results = p.map(f, data)
print(results)
Edit2:
Multiprocessing should work fine on UNIX-like platforms such as OSX. Only platforms that lack os.fork (mainly MS Windows) need special attention. But even there it still works. See the multiprocessing documentation.
Here is one way to run it in parallel threads:
import threading

L_a = []
for L in range(0,6,2):
    for a in range(1,100):
        L_a.append((L,a))
        # Add the rest of your objects here

def RunParallelThreads():
    # Create an index list
    indexes = range(0,len(L_a))
    # Create the output list
    output = [None for i in indexes]
    # Create all the parallel threads
    threads = [threading.Thread(target=simulate,args=(output,i)) for i in indexes]
    # Start all the parallel threads
    for thread in threads: thread.start()
    # Wait for all the parallel threads to complete
    for thread in threads: thread.join()
    # Return the output list
    return output

def simulate(list,index):
    (L,a) = L_a[index]
    list[index] = (a,L) # Add the rest of your objects here

master_list = RunParallelThreads()
Use Pool().imap_unordered if ordering is not important. It returns an iterator that yields each result as soon as it is ready, regardless of submission order.
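A minimal sketch using imap_unordered with the (L, a) data from the question (the simulation body is only a placeholder):

from multiprocessing import Pool

def simulation(arg):
    L, a = arg           # unpack the (L, a) tuple
    return (L, a, a**2)  # placeholder for the real standard deviations

if __name__ == '__main__':
    data = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]
    with Pool() as p:
        # results are yielded in completion order, not submission order
        for result in p.imap_unordered(simulation, data):
            print(result)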
