I am currently studying the '''multiprocessing''' package. Here is some simple code I tried to compare '''multiprocessing.Process''' and '''multiprocessing.Pool'''.
import random
import multiprocessing
import time

def list_append(count, id, out_list):
    """
    Creates an empty list and then appends a
    random number to the list 'count' number
    of times. A CPU-heavy operation!
    """
    for i in range(count):
        out_list.append(random.random())
if __name__ == "__main__":
size = 10000000 # Number of random numbers to add
procs = 8 # Number of processes to create
# Create a list of jobs and then iterate through
# the number of processes appending each process to
# the job list
print('number of CPU: ', multiprocessing.cpu_count())
starting = time.time()
jobs = []
for i in range(procs):
out_list = list()
process = multiprocessing.Process(target=list_append,
args=(size, i, out_list))
jobs.append(process)
# Start the processes (i.e. calculate the random number lists)
for j in jobs:
j.start()
# Ensure all of the processes have finished
for j in jobs:
j.join()
print("jobs one done in {}".format(time.time()-starting))
starting = time.time()
for i in range(procs):
p = multiprocessing.Pool(8)
p.starmap(list_append, [(size, i, list())])
print('jobs two done in {}'.format(time.time()-starting))
My laptop has 12 CPU cores, so I expected job one and job two to finish in a similar time. However, job one finishes in 3 seconds, but job two finishes in 12 seconds. It looks to me like '''multiprocessing.Pool()''' does not actually run multiple processes... Is there something I did wrong?
In your job two, you are not using multiprocessing. starmap() distributes the given function (list_append) across the argument tuples provided in its second argument, but you only provide a list with one tuple, so each iteration of your for loop executes a single task. I think you meant to do:
p = multiprocessing.Pool(8)
p.starmap(list_append, [(size, i, list()) for i in range(procs)])
without the containing for loop.
Note, also, that starmap waits for the result, so in your for loop it waits for each single call to finish before the next one starts.
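If you want to submit the work without blocking on each call, starmap_async returns immediately and you only wait once at the end. A minimal, self-contained sketch of that idea (it mirrors the question's list_append, size and procs; not the OP's exact code):

import multiprocessing
import random

def list_append(count, id, out_list):
    # same CPU-heavy loop as in the question
    for i in range(count):
        out_list.append(random.random())

if __name__ == "__main__":
    size = 10000000
    procs = 8
    with multiprocessing.Pool(procs) as p:
        # starmap_async returns an AsyncResult immediately; we block once at the end
        async_result = p.starmap_async(
            list_append, [(size, i, list()) for i in range(procs)])
        async_result.wait()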
Related
I am learning multiprocessing in Python and thinking about a problem: for a shared list (nums = mp.Manager().list), is there any way to split the list across the processes automatically, so that they do not compute the same numbers in parallel?
Current code:
import multiprocessing as mp
import time

# multiple processes
nums = mp.Manager().list(range(10000))
results = mp.Queue()

def get_square(list_of_num, results_sharedlist):
    # simple get square
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num)))
start = time.time()
process1 = mp.Process(target=get_square, args=(nums, results))
process2 = mp.Process(target=get_square, args=(nums, results))
process1.start()
process2.start()
process1.join()
process2.join()
print(time.time()-start)

for i in range(results.qsize()):
    print(results.get())
Current behaviour
It computes the squares of the same list twice.
What I want
I want process 1 and process 2 to compute the squares of the nums list once, in parallel, without me defining the split.
You can make the function decide which data it needs to operate on. In the current scenario, you want the function to divide the square-calculation work on its own, based on how many processes are working in parallel.
To do so, the function needs to know which process it is running in and how many other processes are running alongside it, so that it only works on its own slice of the data. You can simply pass two more parameters to the function to convey this information: current_process and total_process.
If you have a list whose length is divisible by 2 and you want to calculate its squares using two processes, your function would look something like this:
def get_square(list_of_num, results_sharedlist, current_process, total_process):
    total_length = len(list_of_num)
    start = (total_length // total_process) * (current_process - 1)
    end = (total_length // total_process) * current_process
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num[start:end])))

TOTAL_PROCESSES = 2
process1 = mp.Process(target=get_square, args=(nums, results, 1, TOTAL_PROCESSES))
process2 = mp.Process(target=get_square, args=(nums, results, 2, TOTAL_PROCESSES))
The assumption I have made here is that the length of the list you are working on is a multiple of the number of processes you allocate. If it is not, the current logic will leave some numbers without output; one way around that is sketched below.
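A possible workaround (my own suggestion, not part of the answer above) is to slice with a stride instead of contiguous chunks, so every element is covered regardless of the list length:

def get_square(list_of_num, results_sharedlist, current_process, total_process):
    # stride slicing: process 1 handles indices 0, total_process, 2*total_process, ...
    chunk = list_of_num[current_process - 1::total_process]
    results_sharedlist.put([x**2 for x in chunk])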
Hope this answers your question!
I agree with Jake's answer here, but as a bonus:
if you are using a multiprocessing.Pool(), it keeps an internal counter of the worker processes it spawns, so you can avoid the extra parameter identifying the current process by reading _identity from multiprocessing's current_process(), like this:
from multiprocessing import current_process, Pool
p = current_process()
print('process counter:', p._identity[0])
More info in this answer.
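As a rough sketch of how that could look for the squaring example above (my own example, not from the answers; note that _identity is an internal attribute and may change between Python versions):

from multiprocessing import Pool, current_process

nums = list(range(10000))
TOTAL_PROCESSES = 2

def get_square_chunk(task_index):
    # _identity[0] is the pool's internal counter for this worker process
    print('running in pool worker:', current_process()._identity[0])
    # chunking is driven by the task index, so coverage stays correct
    chunk = nums[task_index::TOTAL_PROCESSES]
    return [x**2 for x in chunk]

if __name__ == "__main__":
    with Pool(TOTAL_PROCESSES) as pool:
        results = pool.map(get_square_chunk, range(TOTAL_PROCESSES))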
Let us consider the following code where I calculate the factorial of 4 really large numbers, saving each output to a separate .txt file (out_mp_{idx}.txt). I use multiprocessing (4 processes) to reduce the computation time. Though this works fine, I want to output all the 4 results in one file.
One way is to open each of the (4) generated files and append them to a new file, but that is not my choice (below is just a simplified version of my code; I have too many files to handle, which defeats the time-saving purpose of multiprocessing). Is there a better way to automate this so that the results from all the processes are dumped/appended to a single file? Also, in my case the result returned by each process can be several lines, so how do we avoid open-file conflicts when one process is appending its result to the output file and a second process finishes and wants to open/access the same file?
As an alternative, I tried the Pool.imap route, but that is not as computationally efficient as the code below. Something like this SO post.
from multiprocessing import Process
import os
import time

tic = time.time()

def factorial(n, idx):  # function to calculate the factorial
    num = 1
    while n >= 1:
        num *= n
        n = n - 1
    with open(f'out_mp_{idx}.txt', 'w') as f0:  # saving output to a separate file
        f0.writelines(str(num))

def My_prog():
    jobs = []
    N = [10000, 20000, 40000, 50000]  # numbers for which factorial is desired
    n_procs = 4

    # executing multiple processes
    for i in range(n_procs):
        p = Process(target=factorial, args=(N[i], i))
        jobs.append(p)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print(f'Exec. Time:{time.time()-tic} [s]')

if __name__ == '__main__':
    My_prog()
You can do this:
1) Create a Queue
   a) manager = Manager()
   b) data_queue = manager.Queue()
   c) put all the data in this queue.
2) Create a thread and start it before the multiprocessing
   a) create a function which waits on data_queue.
Something like:

def fun():
    while True:
        data = data_queue.get()
        if isinstance(data, Sentinel):
            break
        # write to a file

3) Remember to send a Sentinel object after all the multiprocesses are done.
You can also make this thread a daemon thread and skip the sentinel part.
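A rough sketch of how those steps could fit together for the factorial example (the names SENTINEL and writer are mine, not from the original code; the single writer thread is the only thing that touches the output file, so there are no file-access conflicts):

from multiprocessing import Process, Manager
from threading import Thread

SENTINEL = None  # any unique object the workers never put on the queue

def factorial(n, idx, data_queue):
    num = 1
    while n >= 1:
        num *= n
        n -= 1
    data_queue.put(f'{idx}: {num}\n')  # send the result instead of writing a file

def writer(data_queue):
    # single consumer: all results end up in one file
    with open('out_mp_all.txt', 'w') as f0:
        while True:
            data = data_queue.get()
            if data is SENTINEL:
                break
            f0.write(data)

if __name__ == '__main__':
    manager = Manager()
    data_queue = manager.Queue()
    t = Thread(target=writer, args=(data_queue,))
    t.start()
    jobs = [Process(target=factorial, args=(n, i, data_queue))
            for i, n in enumerate([10000, 20000, 40000, 50000])]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    data_queue.put(SENTINEL)  # tell the writer thread to stop
    t.join()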
I am pulling 0.8 million records in one go (this is a one-time process) from MongoDB using pymongo and performing some operations on them.
My code looks like this:
from multiprocessing import Process

procs = []
cnt = 0
for rec in cursor:  # cursor has 0.8 million rows
    print(cnt)
    cnt = cnt + 1
    url = rec['urlk']
    mkptid = rec['mkptid']
    cii = rec['cii']
    # self.process_single_layer(url, mkptid, cii)
    proc = Process(target=self.process_single_layer, args=(url, mkptid, cii))
    procs.append(proc)
    proc.start()

# complete the processes
for proc in procs:
    proc.join()
process_single_layer is a function which basically downloads URLs from the cloud and stores them locally.
Now the problem is that the downloading process is slow, since it has to hit a URL for each record, and the number of records is huge: processing 1k rows takes 6 minutes.
To reduce the time I wanted to implement multiprocessing, but it is hard to see any difference with the above code.
Please suggest how I can improve the performance in this scenario.
First of all you need to count all the rows in your input and then spawn a fixed number of processes (ideally matching the number of your processor cores), to which you feed, via queues (one for each process), a number of rows equal to total_number_of_rows / number_of_processes. The idea behind this approach is that you split the processing of those rows between multiple processes, hence achieving parallelism.
A way to find out the number of cores dynamically is by doing:
import multiprocessing as mp
cores_count = mp.cpu_count()
A slight improvement, which avoids the initial row count, is to distribute the rows cyclically: create the list of queues and then apply a cycle iterator over it, putting one row on each queue in turn.
A full example:
import queue
import multiprocessing as mp
import itertools as itools

cores_count = mp.cpu_count()

def dosomething(q):
    while True:
        try:
            row = q.get(timeout=5)
        except queue.Empty:
            break
        # ..do some processing here with the row
        pass

if __name__ == '__main__':
    processes = []
    queues = []

    # spawn the processes
    for i in range(cores_count):
        q = mp.Queue()
        queues.append(q)
        proc = mp.Process(target=dosomething, args=(q,))
        proc.start()
        processes.append(proc)

    # feed the rows to the queues cyclically
    queues_cycle = itools.cycle(queues)
    for row in cursor:  # cursor comes from the question's pymongo query
        q = next(queues_cycle)
        q.put(row)

    # do the join after feeding all the rows
    for p in processes:
        p.join()
It's easier to use a pool in this scenario.
Queues are not necessary, as you don't need to communicate between your spawned processes. We can use Pool.map to distribute the workload.
Pool.imap or Pool.imap_unordered might be faster with a larger chunksize (ref: https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.imap). You can use Pool.starmap if you want to get rid of the tuple unpacking.
from multiprocessing import Pool

def process_single_layer(data):
    # unpack the tuple and do the processing
    url, mkptid, cii = data
    return "downloaded" + url

def get_urls():
    # replace this code: iterate over cursor and yield the necessary data as a tuple
    for rec in range(8):
        url = "url:" + str(rec)
        mkptid = "mkptid:" + str(rec)
        cii = "cii:" + str(rec)
        yield (url, mkptid, cii)

if __name__ == '__main__':
    # you can come up with a suitable process count based on the number of CPUs.
    with Pool(processes=4) as pool:
        print(pool.map(process_single_layer, get_urls()))
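If the ordering of results does not matter, the same pool can use imap_unordered with a larger chunksize, as mentioned above. A minimal sketch reusing process_single_layer and get_urls from the block above (the chunksize value is only an illustration):

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # chunksize batches several tasks per worker, reducing IPC overhead
        for result in pool.imap_unordered(process_single_layer, get_urls(), chunksize=2):
            print(result)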
Please bear with me as this is a bit of a contrived example of my real application. Suppose I have a list of numbers and I wanted to add a single number to each number in the list using multiple (2) processes. I can do something like this:
import multiprocessing

my_list = list(range(100))
my_number = 5
data_line = [{'list_num': i, 'my_num': my_number} for i in my_list]

def worker(data):
    return data['list_num'] + data['my_num']

pool = multiprocessing.Pool(processes=2)
pool_output = pool.map(worker, data_line)
pool.close()
pool.join()
Now however, there's a wrinkle to my problem. Suppose that I wanted to alternate adding two numbers (instead of just adding one). So around half the time, I want to add my_number1 and the other half of the time I want to add my_number2. It doesn't matter which number gets added to which item on the list. However, the one requirement is that I don't want to be adding the same number simultaneously at the same time across the different processes. What this boils down to essentially (I think) is that I want to use the first number on Process 1 and the second number on Process 2 exclusively so that the processes are never simultaneously adding the same number. So something like:
my_num1 = 5
my_num2 = 100
data_line = [{'list_num': i, 'my_num1': my_num1, 'my_num2': my_num2} for i in my_list]

def worker(data):
    # if in Process 1:
    return data['list_num'] + data['my_num1']
    # if in Process 2:
    return data['list_num'] + data['my_num2']
    # and so forth
Is there an easy way to specify specific inputs per process? Is there another way to think about this problem?
multiprocessing.Pool allows you to pass an initializer function, which is executed in each worker process before the actual target function runs.
You can use it together with a global variable to let your function know which process it is running in.
You probably want to control which initial number each process gets; you can use a Queue to tell each process which number to pick up.
This solution is not optimal, but it works.
import multiprocessing

process_number = None

def initializer(queue):
    global process_number
    process_number = queue.get()  # atomically get the process index

def function(value):
    print("I'm process %s" % process_number)
    return value[process_number]

def main():
    queue = multiprocessing.Queue()
    for index in range(multiprocessing.cpu_count()):
        queue.put(index)
    pool = multiprocessing.Pool(initializer=initializer, initargs=[queue])
    tasks = [{0: 'Process-0', 1: 'Process-1', 2: 'Process-2'}, ...]
    print(pool.map(function, tasks))
My PC is dual core, so as you can see only Process-0 and Process-1 show up in the output.
I'm process 0
I'm process 0
I'm process 1
I'm process 0
I'm process 1
...
['Process-0', 'Process-0', 'Process-1', 'Process-0', ... ]
I have a complex problem with the Python multiprocessing module.
I have built a script that at one point has to call a multi-argument function (call_function) for each element in a specific list. My idea is to define an integer 'N' and split the problem into single sub-processes, with at most N running at a time.
li = [a, b, c, d, e]  # elements are ints
for element in li:
    call_function(element, string1, string2, int1)
call_summary_function()
The summary function will analyze the results obtained from all iterations of the loop. Now, I want each iteration to be carried out by a single sub-process, but there cannot be more than N sub-processes running at once. If the limit is reached, the main process should wait until one of the sub-processes ends and then start another iteration. Also, call_summary_function needs to be called after all the sub-processes finish.
I have tried my best with the multiprocessing module, locks and global variables to keep track of the actual number of running sub-processes (to compare with N), but I get an error every time.
//--------------EDIT-------------//
Firstly, the main process code:
MAX_PROCESSES = 3
lock = multiprocessing.Lock()
processes = 0
k = 0
while k < len(k_list):
    if processes <= MAX_PROCESSES:  # running processes <= 'N' set by me
        p = multiprocessing.Process(target=single_analysis, args=(k_list[k], main_folder, training_testing, subsets, positive_name, ratio_list, lock, processes))
        p.start()
        k += 1
    else:
        time.sleep(1)
while processes > 0:
    time.sleep(1)
Now: the function that is called by multiprocessing:
def single_analysis(k, main_folder, training_testing, subsets, positive_name, ratio_list, lock, processes):
    lock.acquire()
    processes += 1
    lock.release()
    # stuff to do
    lock.acquire()
    processes -= 1
    lock.release()
I get the error that the int value (the processes variable) is always equal to 0, since the single_analysis() function seems to create a new, local processes variable.
When I make processes global, declare it in single_analysis() with the global keyword, and print processes within the function, I get 1 printed len(li) times...
What you're describing is perfectly suited for multiprocessing.Pool - specifically its map method:
import multiprocessing
from functools import partial

def call_function(string1, string2, int1, element):
    # Do stuff here
    pass

if __name__ == "__main__":
    li = [a, b, c, d, e]
    p = multiprocessing.Pool(N)  # The pool will contain N worker processes.
    # Use partial so that we can pass a method that takes more than one argument to map.
    func = partial(call_function, string1, string2, int1)
    results = p.map(func, li)
    call_summary_function(results)
p.map will call call_function(string1, string2, int1, element) for each element in the li list. results will be a list containing the value returned by each call to call_function. You can pass that list to call_summary_function to process the results.
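If you prefer not to use partial, Pool.starmap accepts argument tuples directly. A rough sketch of the same idea, using the question's placeholders (a, b, c, d, e, N, string1, string2, int1), not a drop-in implementation:

import multiprocessing

if __name__ == "__main__":
    li = [a, b, c, d, e]
    with multiprocessing.Pool(N) as p:
        # each tuple is unpacked into call_function's parameters
        results = p.starmap(call_function,
                            [(string1, string2, int1, element) for element in li])
    call_summary_function(results)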