I need to add two large random vectors A and B (size between 10^7 and 10^12) using two threads or multiprocessing in Python, and store the result in a third vector C. I also have to time my code, and at the end find the minimum and the average of the final vector. I have tried many things and am currently working in an Anaconda Jupyter notebook. It accepts the code but gives me no output.
This is my code:
import time
import multiprocessing
import numpy as np

add_result = []
a = np.random.rand(10000000)
b = np.random.rand(10000000)

def calc_add(numbers):
    global add_result
    print('add ' + str(numbers))
    add_result.append(numbers)
    print('within a process result ' + str(add_result))

if __name__ == "__main__":
    start_time = time.time()
    arr = a + b
    p1 = multiprocessing.Process(target=calc_add, args=(arr,))
    p2 = multiprocessing.Process(target=calc_add, args=(arr,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print("elapsed: " + str(time.time() - start_time))
    print("result " + str(add_result))
    print("done!")
You can't do this kind of operation with multiprocessing, because (in Python) processes are separate and don't share anything between themselves. That means your global variable is only global inside each child process, which is why your add_result variable in the parent is still equal to [].
Please add your code to your question so we can help you rewrite it.
You should also take a look at Python's GIL to better understand why threads (and, for different reasons, processes used like this) can't help you with your task.
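As a sketch of what the question actually asks for (add two large vectors in parallel, store the result in C, time it, then report the minimum and mean), here is one possible approach using a multiprocessing.Pool that adds the two halves of the vectors in separate worker processes. The chunking scheme, the add_chunk helper and the 10**7 size are illustrative choices, not anything from the original code, and it assumes a fork-based platform so the workers inherit the arrays.

import time
import numpy as np
from multiprocessing import Pool

N = 10**7                      # illustrative size; 10**12 elements would not fit in RAM
a = np.random.rand(N)
b = np.random.rand(N)

def add_chunk(bounds):
    # runs in a worker process; on fork-based platforms the arrays are inherited
    lo, hi = bounds
    return a[lo:hi] + b[lo:hi]

if __name__ == "__main__":
    start = time.perf_counter()
    halves = [(0, N // 2), (N // 2, N)]                   # two chunks -> two workers
    with Pool(2) as pool:
        c = np.concatenate(pool.map(add_chunk, halves))   # the final vector C
    print("elapsed:", time.perf_counter() - start)
    print("min:", c.min(), "mean:", c.mean())

In practice a single vectorized c = a + b will usually be faster than this, because copying the chunks back from the workers costs more than the addition itself; the sketch is mainly meant to show the mechanics of splitting work between two processes and collecting the result.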
I am learning multiprocessing in Python and thinking about a problem. For a shared list (nums = mp.Manager().list), is there any way to automatically split the list across all the processes so that they do not compute on the same numbers in parallel?
Current code:
import time
import multiprocessing as mp

# multiple processes
nums = mp.Manager().list(range(10000))
results = mp.Queue()

def get_square(list_of_num, results_sharedlist):
    # put the squares of the whole list on the shared queue
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num)))

start = time.time()
process1 = mp.Process(target=get_square, args=(nums, results))
process2 = mp.Process(target=get_square, args=(nums, results))
process1.start()
process2.start()
process1.join()
process2.join()
print(time.time() - start)

for i in range(results.qsize()):
    print(results.get())
Current Behaviour
It computes the squares of the same list twice.
What I want
I want process 1 and process 2 to compute the squares of the nums list once, in parallel, without me defining the split myself.
You can make the function decide which part of the data it needs to work on. In the current scenario, you want the function to divide the square-calculation work on its own, based on how many processes are working in parallel.
To do so, you need to let the function know which process it is running in and how many processes are running alongside it, so that it only works on its own slice of the data. You can just pass two more parameters to the function that carry this information: current_process and total_process.
If you have a list whose length is divisible by 2 and you want to calculate the squares using two processes, then your function would look something like this:
def get_square(list_of_num, results_sharedlist, current_process, total_process):
    total_length = len(list_of_num)
    start = (total_length // total_process) * (current_process - 1)
    end = (total_length // total_process) * current_process
    results_sharedlist.put(list(map(lambda x: x**2, list_of_num[start:end])))
TOTAL_PROCESSES = 2
process1 = mp.Process(target=get_square, args=(nums, results, 1, TOTAL_PROCESSES))
process2 = mp.Process(target=get_square, args=(nums, results, 2, TOTAL_PROCESSES))
The assumption I have made here is that the length of the list you are going to work on is a multiple of the number of processes you are allocating. If it is not, the current logic will leave some numbers behind with no output.
Hope this answers your question!
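For completeness, here is a minimal usage sketch under the same assumptions: it reuses the get_square defined above, a Manager list, a Queue for results, and exactly two processes. The only parts not shown in the answer are the start/join calls and reading the two result chunks back (read before joining, since each worker puts exactly one item on the queue).

import multiprocessing as mp

if __name__ == "__main__":
    TOTAL_PROCESSES = 2
    nums = mp.Manager().list(range(10000))
    results = mp.Queue()

    process1 = mp.Process(target=get_square, args=(nums, results, 1, TOTAL_PROCESSES))
    process2 = mp.Process(target=get_square, args=(nums, results, 2, TOTAL_PROCESSES))
    process1.start()
    process2.start()

    # each worker puts exactly one list on the queue; collect both, then join
    chunks = [results.get() for _ in range(TOTAL_PROCESSES)]
    process1.join()
    process2.join()

    squares = chunks[0] + chunks[1]      # all 10000 squares, each computed once
    print(len(squares), squares[:3])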
I agree with the answer by Jake here, but as a bonus:
if you are using a multiprocessing.Pool(), it keeps an internal counter of the worker processes it spawns, so you can avoid the extra parameter identifying the current process by reading _identity from multiprocessing's current_process(), like this:
from multiprocessing import current_process, Pool

p = current_process()
# _identity[0] is the worker's 1-based counter; it only exists inside a Pool worker
print('process counter:', p._identity[0])
more info from this answer.
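A small runnable sketch of that idea (the square() worker function is just an assumed example): each Pool worker reports its own counter via current_process()._identity.

from multiprocessing import current_process, Pool

def square(x):
    worker_id = current_process()._identity[0]   # 1-based counter of this Pool worker
    print('worker', worker_id, 'handling', x)
    return x * x

if __name__ == "__main__":
    with Pool(2) as pool:
        print(pool.map(square, range(8)))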
I'm trying to understand the "multiprocessing" module better through examples before I start applying it to my main code, and I am a little confused by the execution sequence in this code.
The code:
import multiprocessing as mp
import time
import os

def square(nums, r, t1):
    print("square started at :")
    print("%.6f" % (time.clock() - t1))
    for n in nums:
        r.append(n * n)
    print("square endeded at :")
    print("%.6f" % (time.clock() - t1))

def cube(nums, r, t1):
    #time.sleep(2)
    print("cube started at :")
    print("%.6f" % (time.clock() - t1))
    for n in nums:
        r.append(n * n * n)
    print("cube endeded at :")
    print("%.6f" % (time.clock() - t1))

if __name__ == "__main__":
    numbers = range(1, 1000000)
    results1 = []
    results2 = []
    t1 = time.clock()

    # With multiprocessing :
    p1 = mp.Process(target=square, args=(numbers, results1, t1))
    p2 = mp.Process(target=cube, args=(numbers, results2, t1))
    p1.start()
    #time.sleep(2)
    p2.start()
    p1.join()
    print("After p1.join() :")
    print("%.6f" % (time.clock() - t1))
    p2.join()

    '''
    # Without multiprocessing :
    square(numbers, results1, t1)
    cube(numbers, results2, t1)
    '''

    print("square + cube :")
    print("%.6f" % (time.clock() - t1))
The code output was :
square started at :
0.000000
square endeded at :
0.637105
After p1.join() :
12.310289
cube started at :
0.000000
cube endeded at :
0.730428
square + cube :
13.057885
And I have a few questions:
According to the code and the timing above, shouldn't the output be in this order?
square started at :
cube started at :
square endeded at :
cube endeded at :
After p1.join() :
square + cube :
Why does it take the program so long to reach p1.join(), even though "square" finished several seconds earlier?
In other words, why do square & cube take around 13 seconds to run while their reported execution time is 0.7 s?
In my main code I would like to start the second function (cube in this example) with a one-second delay after the first function, so I tried to put a delay (time.sleep(1)) between p1.start() and p2.start(), but it did not work and both functions still report starting at 0.000000 s; then I placed the delay at the beginning of the cube function and it also did not work. So my question is: how do I achieve a delay between these two functions?
When dealing with multiprocessing, all sorts of other factors can impact what you are seeing. Since you are literally adding child processes to your OS's process scheduler, they will operate entirely separately from your running program, including having their own resources, scheduling priorities, pipes, etc.
1.) No. The reason is that each child process gets its own output buffer that it writes into, which only gets written back to the parent process later. Since you start both child processes and then tell the parent process to block until subprocess p1 completes, the p2 child process cannot write its buffer into the parent until the p1 process completes. This is why, despite waiting 12 seconds, the output of the p2 process still says 0.7 seconds.
2.) It is difficult to know for certain why it took 12 seconds for the subprocess to run its course. It may be something in your code or it may be a dozen other reasons, such as a completely different process hijacking your CPU for a time. First off, time.clock is probably not what you are looking for if you are trying to measure actual time rather than how much time the process has spent on the CPU. Other commenters have correctly recommended using a high-performance counter to accurately track timings and make sure there is no weirdness in the way you are measuring time. Furthermore, there is always some level of overhead, though certainly not 12 seconds' worth, when starting, running and terminating a new process. The best way to determine whether this 12 seconds is something you could have controlled for is to run the application multiple times and see if there is a wide variance in the total resulting times. If there is, it may be other conditions related to the computer running it.
3.) I am guessing that the issue is the time.clock measurement. The time.clock call calculates how much time a process has spent on the CPU. Since you are using multiple processes, time.clock is reset to 0 when a process is started. It is a relative time, not an absolute time, relative to the lifespan of the process. If you are jumping between processes or sleeping threads, time.clock won't necessarily increment the way you would expect with an absolute time measurement. You should use something like time.time() or, better yet, a high-performance counter such as time.perf_counter() to properly track real time.
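To make points 2 and 3 concrete, here is a small sketch (not the original code) that times everything with time.time(), which is an absolute clock and therefore comparable across processes, and that produces the asked-for delay by sleeping in the parent between the two start() calls; the worker bodies just sum squares and cubes to keep the example short.

import time
import multiprocessing as mp

def square(nums, t0):
    print("square started at %.6f" % (time.time() - t0))   # wall-clock, shared across processes
    total = sum(n * n for n in nums)
    print("square ended at %.6f" % (time.time() - t0))

def cube(nums, t0):
    print("cube started at %.6f" % (time.time() - t0))
    total = sum(n * n * n for n in nums)
    print("cube ended at %.6f" % (time.time() - t0))

if __name__ == "__main__":
    numbers = range(1, 1000000)
    t0 = time.time()
    p1 = mp.Process(target=square, args=(numbers, t0))
    p2 = mp.Process(target=cube, args=(numbers, t0))
    p1.start()
    time.sleep(1)          # the requested one-second delay before starting the second function
    p2.start()
    p1.join()
    p2.join()
    print("total wall-clock time: %.6f" % (time.time() - t0))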
I have the following code
class calculator:
    def __init__(self, value):
        print("hi")
        result = self.do_stuff()
        print(result)

if __name__ == '__main__':
    calculator(20)  # I want to do this 4 times in parallel
I have a class that calculates stuff. I have 4 processors, so I want to instantiate this class 4 times with the value 20. These 4 identical instances with the same input then calculate something in parallel. To make sure that it works asynchronously, I want to see "hi" printed to the console 4 times and, after a short while, the result printed 4 times.
But I don't really know how to do that with
import multiprocessing as mp
It seems that you can only parallelize functions, so I added the function
def start_calculator(value):
    calculator(value)
and the main block is now:
if __name__ == '__main__':
    p1 = mp.Process(target=start_calculator, args=(20,))
    p2 = mp.Process(target=start_calculator, args=(20,))
    p3 = mp.Process(target=start_calculator, args=(20,))
    p4 = mp.Process(target=start_calculator, args=(20,))

    p1.start()
    p2.start()
    p3.start()
    p4.start()

    p1.join()
    p2.join()
    p3.join()
    p4.join()
But this just looks incredibly bad.
Can I somehow create 4 processes, then start them all in a loop, and then join them all without writing p1, p2, p3, p4 every time?
I also want to make the number of processes variable, e.g. only 2 processes, so I can't hardcode it like this.
Ideally I would have an array with a fixed number of processes, and I would hand them the function asynchronously. And if one process finished, I would be able to give it the function again.
Can you use the pool?
pool = mp.Pool(4)  # with no argument it defaults to the number of CPUs
args = [20, 20, 20, 20]
output = pool.map(start_calculator, args)
pool.close()
pool.join()
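If you would rather keep explicit Process objects instead of a Pool, a list plus two loops avoids writing p1..p4 by hand and makes the number of processes a variable; this is just a sketch of that idea (it reuses the calculator class and start_calculator from the question, and NUM_PROCESSES is an assumed name):

import multiprocessing as mp

if __name__ == '__main__':
    NUM_PROCESSES = 4                              # change to 2, 8, ... as needed
    processes = [mp.Process(target=start_calculator, args=(20,))
                 for _ in range(NUM_PROCESSES)]
    for p in processes:                            # start them all
        p.start()
    for p in processes:                            # then wait for them all
        p.join()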
Is there a way to implement multithreading for multiple for loops within a single function? I am aware that it can be achieved with separate functions, but is it possible within the same function?
For example:
def sqImport():
    for i in (0, 50):
        do something specific to 0-49
    for i in (50, 100):
        do something specific to 50-99
    for i in (100, 150):
        do something specific to 100-149
If there are 3 separate functions for 3 different for loops then we can do:
threadA = Thread(target = loopA)
threadB = Thread(target = loopB)
threadC = Thread(target = loopC)
threadA.run()
threadB.run()
threadC.run()
# Do work independent of loopA and loopB
threadA.join()
threadB.join()
threadC.join()
But is there a way to achieve this under a single function?
First of all: I think you really should take a look at multiprocessing.pool.ThreadPool if you are going to use this in a production system. What I describe below is just a possible workaround (which might be simpler and therefore could be used for testing purposes).
You could pass an id to the function and use that to decide which loop you take like so:
from threading import Thread
def sqImport(tId):
    if tId == 0:
        for i in range(0, 50):
            print i
    elif tId == 1:
        for i in range(50, 100):
            print i
    elif tId == 2:
        for i in range(100, 150):
            print i
threadA = Thread(target = sqImport, args=[0])
threadB = Thread(target = sqImport, args=[1])
threadC = Thread(target = sqImport, args=[2])
threadA.start()
threadB.start()
threadC.start()
# Do work independent of loopA and loopB
threadA.join()
threadB.join()
threadC.join()
Note that I used start() instead of run(), because run() does not start a different thread but executes in the current thread context. Moreover, I changed your for i in (x, y) loops to for i in range(x, y) loops, because I think you want to iterate over a range and not over a tuple (which would iterate only over x and y).
An alternative solution using multiprocessing.dummy (a thread pool) might look like this:
from multiprocessing.dummy import Pool as ThreadPool
# The worker function
def sqImport(data):
    for i in data:
        print i
# The three ranges for the three different threads
ranges = [
range(0, 50),
range(50, 100),
range(100, 150)
]
# Create a threadpool with 3 threads
pool = ThreadPool(3)
# Run sqImport() on all ranges
pool.map(sqImport, ranges)
pool.close()
pool.join()
You can use multiprocessing.pool.ThreadPool, which will divide your tasks equally between the running threads.
Follow "Threading pool similar to the multiprocessing Pool?" for more on this.
If you are really looking for parallel execution, go for processes, because threads are limited by Python's GIL (Global Interpreter Lock).
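Since the last answer recommends processes for truly parallel execution, here is a minimal process-based variant of the same splitting idea using multiprocessing.Pool over the three ranges; the worker body is only a placeholder sum standing in for the range-specific work, and the snippet is written for Python 3.

from multiprocessing import Pool

def sqImport(data):
    # placeholder for the range-specific work
    return sum(i * i for i in data)

if __name__ == "__main__":
    ranges = [range(0, 50), range(50, 100), range(100, 150)]
    with Pool(3) as pool:
        print(pool.map(sqImport, ranges))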
I am new to Python multiprocessing and I want to understand why my code does not terminate (maybe a zombie or a deadlock) and how to fix it. The createChain function also executes a for loop and returns a tuple (value1, value2). Inside createChain there are calls to other functions. I don't think posting the createChain code will help, because inside that function I am not doing anything related to multiprocessing. I tried to make the processes daemons but it still didn't work. The strange thing is that if I decrease the value of maxChains, e.g. to 500 or 100, it works.
I just want the processes to do some heavy tasks and put the results into a data structure.
My version of Python is 2.7.
from multiprocessing import Process, JoinableQueue, cpu_count

def createTable(chainsPerCore, q, chainLength):
    for chain in xrange(chainsPerCore):
        q.put(createChain(chainLength, chain))

def initTable():
    maxChains = 1000
    chainLength = 10000
    resultsQueue = JoinableQueue()
    numOfCores = cpu_count()
    chainsPerCore = maxChains / numOfCores

    processes = [Process(target=createTable, args=(chainsPerCore, resultsQueue, chainLength,))
                 for x in range(numOfCores)]
    for p in processes:
        # p.daemon = True
        p.start()

    # Wait for hashing cores to finish
    for p in processes:
        p.join()

    resultsQueue.task_done()
    temp = [resultsQueue.get() for p in processes]
    print temp
Based on the very useful comments of Tadhg McDonald-Jensen, I understood my needs better, as well as how queues work and what purpose they should be used for.
I changed my code to:
from contextlib import closing
from multiprocessing import Pool

def initTable(output):
    maxChains = 1000
    results = []
    with closing(Pool(processes=8)) as pool:
        results = pool.map(createChain, xrange(maxChains))
        pool.terminate()
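For reference, a runnable version of that final approach, written for Python 3 and with createChain replaced by a trivial stand-in that takes only the chain index (the real function is not shown in the question); close()/join() are used instead of terminate() because the work is already finished when map returns.

from multiprocessing import Pool

def createChain(chainIndex):
    # stand-in for the real, unposted createChain(); returns a (value1, value2) tuple
    return (chainIndex, chainIndex * 2)

def initTable():
    maxChains = 1000
    pool = Pool(processes=8)
    results = pool.map(createChain, range(maxChains))   # blocks until every chain is done
    pool.close()
    pool.join()
    return results

if __name__ == "__main__":
    table = initTable()
    print(len(table), table[:3])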