Here is the code:
import time
from multiprocessing import Pool

def function(index):
    print('start process ' + str(index))
    time.sleep(1)
    print('end process ' + str(index))
    return str(index)

if __name__ == '__main__':
    pool = Pool(processes=3)
    for i in range(4):
        res = pool.apply_async(function, args=(i,))
        print(res.get())
    pool.close()
    print('done')
and the output:
start process 0
end process 0
0
start process 1
end process 1
1
start process 2
end process 2
2
start process 3
end process 3
3
done
In my opinion, if I don't use pool.join(), the code should only print 'done' and nothing else, because the purpose of pool.join() is to 'wait for the worker processes to exit'. But even without pool.join(), I get the same result.
I really don't understand why.
In your code, the get() method has the same effect as join(): it also waits for the process to finish, because that is the only way it can give you the result.
If you remove it from your code, you will see 'done' being printed first:
done
start process 0
res.get waits for the process to finish (how else would it get the return value?), which means that process 0 must finish before process 1 can start, and so on.
Remove res.get and you won't see the processes finish. Move res.get to a separate loop after the first one and you'll see they all start before any of them finish.
Also check out Pool.map.
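For illustration, a minimal sketch of that two-loop version (reusing the function and Pool from the question): submit every task first, then call get() in a second loop, so the workers are already running before any result is collected.

import time
from multiprocessing import Pool

def function(index):
    print('start process ' + str(index))
    time.sleep(1)
    print('end process ' + str(index))
    return str(index)

if __name__ == '__main__':
    pool = Pool(processes=3)
    # first loop: only submit the tasks (non-blocking)
    results = [pool.apply_async(function, args=(i,)) for i in range(4)]
    # second loop: collect the results after everything has been submitted
    for res in results:
        print(res.get())
    pool.close()
    pool.join()
    print('done')

With Pool.map the two loops collapse into a single blocking call, e.g. print(pool.map(function, range(4))).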
I have a program I want to split into 10 parts with multiprocessing. Each worker will be searching for the same answer using different variables to look for it (in this case it's brute-forcing a password). How do I get the processes to communicate their status, and how do I terminate all processes once one process has found the answer? Thank you!
If you are going to split it into 10 parts, then either you should have 10 cores or at least your worker function should not be 100% CPU-bound.
The following code initializes each process with a multiprocessing.Queue instance to which the worker function will write its result. The main process waits for the first entry written to the queue and then terminates all pool processes. For this demo, the worker function is passed arguments 1, 2, 3, ... 10, sleeps for that amount of time and returns the argument passed. So we would expect the worker that was passed the argument value 1 to complete first, and the total running time of the program should be slightly more than 1 second (it takes some time to create the 10 processes):
import multiprocessing
import time

def init_pool(q):
    # make the queue available as a global in each worker process
    global queue
    queue = q

def worker(x):
    time.sleep(x)
    # write result to queue
    queue.put_nowait(x)

def main():
    queue = multiprocessing.Queue()
    pool = multiprocessing.Pool(10, initializer=init_pool, initargs=(queue,))
    for i in range(1, 11):
        # non-blocking:
        pool.apply_async(worker, args=(i,))
    # wait for first result
    result = queue.get()
    pool.terminate()  # kill all tasks
    print('Result: ', result)

# required for Windows:
if __name__ == '__main__':
    t = time.time()
    main()
    print('total time =', time.time() - t)
Prints:
Result: 1
total time = 1.2548246383666992
I'm used to multiprocessing, but now I have a problem where mp.Pool isn't the tool that I need.
I have a process that prepares input and another process that uses it. I'm not using up all of my cores, so I want to have the two go at the same time, with the first getting the batch ready for the next iteration. How do I do this? And (importantly) what is this sort of thing called, so that I can go and google it?
Here's a dummy example. The following code takes 8 seconds:
import time

def make_input():
    time.sleep(1)
    return "cthulhu r'lyeh wgah'nagl fhtagn"

def make_output(input):
    time.sleep(1)
    return input.upper()

start = time.time()
for i in range(4):
    input = make_input()
    output = make_output(input)
    print(output)
print(time.time() - start)
CTHULHU R'LYEH WGAH'NAGL FHTAGN
CTHULHU R'LYEH WGAH'NAGL FHTAGN
CTHULHU R'LYEH WGAH'NAGL FHTAGN
CTHULHU R'LYEH WGAH'NAGL FHTAGN
8.018263101577759
If I were preparing input batches at the same time as I was doing the output, it would take four seconds. Something like this:
next_input = make_input()
start = time.time()
for i in range(4):
    res = do_at_the_same_time(
        output = make_output(next_input),
        next_input = make_input()
    )
    print(output)
print(time.time() - start)
But, obviously, that doesn't work. How can I accomplish what I'm trying to accomplish?
Important note: I tried the following, but it failed because the executing worker was working in the wrong scope (like, for my actual use-case). In my dummy use-case, it doesn't work because it prints in a different process.
def proc(i):
    if i == 0:
        return make_input()
    if i == 1:
        return make_output(next_input)

next_input = make_input()
for i in range(4):
    pool = mp.Pool(2)
    next_input = pool.map(proc, [0, 1])[0]
    pool.close()
So I need a solution where the second process happens in the same scope or environment as the for loop, and where the first has output that can be retrieved from that scope.
You should be able to use Pool. If I understand correctly, you want one worker to prepare the input for the next worker, which runs and does something more with it. Given your example functions, this should do just that:
pool = mp.Pool(2)
for i in range(4):
    next_input = pool.apply(make_input)
    pool.apply_async(make_output, (next_input, ), callback=print)
pool.close()
pool.join()
We prepare a pool with 2 workers; then we run the loop to execute our pair of tasks four times.
We delegate make_input to a worker using apply(), wait for the function to complete, and assign the result to next_input. Note: in this example we could have used a single-worker pool and just run next_input = make_input() in the same process your script runs in, delegating only make_output().
Now the more interesting bit: with apply_async() we ask a worker to run make_output, passing it the single parameter next_input, and we register print (or any function) as the callback, to be called with the result of make_output as its argument.
Then we close() the pool so that it accepts no more jobs, and join() it to wait for the processes to complete their work.
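For comparison, here is a rough sketch of the single-worker variant mentioned in the note above: make_input() runs in the main process while the one pool worker handles make_output(). The dummy functions are copied from the question; the timing comment is an estimate, not a measurement.

import multiprocessing as mp
import time

def make_input():
    time.sleep(1)
    return "cthulhu r'lyeh wgah'nagl fhtagn"

def make_output(input):
    time.sleep(1)
    return input.upper()

if __name__ == '__main__':
    start = time.time()
    pool = mp.Pool(1)                 # one worker, used only for make_output
    for i in range(4):
        next_input = make_input()     # prepared in the main process
        pool.apply_async(make_output, (next_input,), callback=print)
    pool.close()
    pool.join()
    print(time.time() - start)        # roughly 5 seconds instead of 8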
I have a Stata do file pyexample3.do, which uses its argument as a regressor to run a regression. The F-statistic from the regression is saved in a text file. The code is as follows:
clear all
set more off
local y `1'
display `"first parameter: `y'"'
sysuse auto
regress price `y'
local f=e(F)
display "`f'"
file open myhandle using test_result.txt, write append
file write myhandle "`f'" _n
file close myhandle
exit, STATA clear
Now I am trying to run the Stata do file in parallel in Python and write all the F-statistics to one text file. My CPU has 4 cores.
import multiprocessing
import subprocess

def work(staname):
    dofile = "pyexample3.do"
    cmd = ["StataMP-64.exe", "/e", "do", dofile, staname]
    return subprocess.call(cmd, shell=False)

if __name__ == '__main__':
    my_list = ["mpg", "rep78", "headroom", "trunk", "weight", "length", "turn", "displacement", "gear_ratio"]
    my_list.sort()
    print my_list

    # Get the number of processors available
    num_processes = multiprocessing.cpu_count()
    threads = []
    len_stas = len(my_list)
    print "+++ Number of stations to process: %s" % (len_stas)

    # run until all the threads are done, and there is no data left
    for list_item in my_list:
        # if we aren't using all the processors AND there is still data left to
        # compute, then spawn another thread
        if len(threads) < num_processes:
            p = multiprocessing.Process(target=work, args=[list_item])
            p.start()
            print p, p.is_alive()
            threads.append(p)
        else:
            for thread in threads:
                if not thread.is_alive():
                    threads.remove(thread)
Although the do file is supposed to run 9 times, as there are 9 strings in my_list, it only ran 4 times. So where did it go wrong?
In your for list_item in my_list loop, after the first 4 processes have been started, every later iteration goes into the else branch:
for thread in threads:
    if not thread.is_alive():
        threads.remove(thread)
As you can see, since thread.is_alive() doesn't block, this loop executes immediately, before any of those 4 processes have finished their task, so nothing is ever removed from threads and no new process is started. Therefore only the first 4 processes get executed in total.
You could simply use a while loop to constantly check process status with a small interval:
keep_checking = True
while keep_checking:
    for thread in threads:
        if not thread.is_alive():
            threads.remove(thread)
            keep_checking = False
    time.sleep(0.5)  # wait 0.5s
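Putting it together, one possible sketch of the spawning loop with this polling folded in; it assumes my_list, work and num_processes from the question, needs an import time at the top, and rewrites the check as a while loop that blocks until a slot frees up:

threads = []
for list_item in my_list:
    # wait here until at least one of the running processes has finished
    while len(threads) >= num_processes:
        for thread in threads[:]:        # iterate over a copy so removal is safe
            if not thread.is_alive():
                thread.join()
                threads.remove(thread)
        time.sleep(0.5)
    p = multiprocessing.Process(target=work, args=[list_item])
    p.start()
    threads.append(p)

# wait for the remaining processes to finish
for thread in threads:
    thread.join()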
Python's semaphore doesn't support negative initial values. How, then, do I make a thread wait until 8 other threads have done something? If Semaphore supported negative initial values, I could have just set it to -8 and have each thread increment the value by 1 until we reach 0, which unblocks the waiting thread.
I can manually increment a global counter inside a critical section and then use a condition variable, but I want to see if there are other suggestions.
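(For concreteness, the counter-plus-condition-variable version I mean would be roughly this sketch, with illustrative names:)

import threading

NUM_WORKERS = 8
counter = 0
cond = threading.Condition()

def worker():
    global counter
    # ... do the actual work here ...
    with cond:
        counter += 1
        cond.notify()            # wake the waiting thread so it can re-check

def wait_for_workers():
    with cond:
        while counter < NUM_WORKERS:
            cond.wait()
    # all 8 workers have done their part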
Surely it's late for an answer, but it may come in handy for someone else.
If you want to wait for 8 different threads to do something, you can just wait 8 times.
You initialize a semaphore at 0 with
s = threading.Semaphore(0)
and then
for _ in range(8):
    s.acquire()
will do the job.
Full example:
import threading
import time

NUM_THREADS = 4

s = threading.Semaphore(0)

def thread_function(i):
    print("start of thread", i)
    time.sleep(1)
    s.release()
    print("end of thread", i)

def main_thread():
    print("start of main thread")
    threads = [
        threading.Thread(target=thread_function, args=(i, ))
        for i
        in range(NUM_THREADS)
    ]
    [t.start() for t in threads]
    [s.acquire() for _ in range(NUM_THREADS)]
    print("end of main thread")

main_thread()
Possible output:
start of main thread
start of thread 0
start of thread 1
start of thread 2
start of thread 3
end of thread 0
end of thread 2
end of thread 1
end of thread 3
end of main thread
For any further readers: starting from Python 3.2 there is the Barrier class, which
provides a simple synchronization primitive for use by a fixed number of threads that need to wait for each other.
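A minimal sketch of the same wait-for-everyone pattern using a Barrier (the worker body is just the dummy one from the semaphore example above): the main thread is counted as one extra party, so its wait() returns only once every worker has also called wait().

import threading
import time

NUM_THREADS = 4

# one extra party for the main thread that waits for the workers
barrier = threading.Barrier(NUM_THREADS + 1)

def thread_function(i):
    print("start of thread", i)
    time.sleep(1)
    print("end of thread", i)
    barrier.wait()                # signal that this thread has done its part

def main_thread():
    print("start of main thread")
    threads = [threading.Thread(target=thread_function, args=(i,))
               for i in range(NUM_THREADS)]
    for t in threads:
        t.start()
    barrier.wait()                # returns once all workers have arrived
    print("end of main thread")

main_thread()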
import time
from multiprocessing import Process

def loop(limit):
    for i in xrange(limit):
        pass
    print i

limit = 100000000  # 100 million

start = time.time()
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
p.join()
end = time.time()
print end - start
I tried running this code, and this is the output I am getting:
99999999
99999999
2.73401999474
99999999
99999999
99999999
and sometimes
99999999
99999999
3.72434902191
99999999
99999999
99999999
99999999
99999999
In this case the loop function is called 7 times instead of 5. Why this strange behaviour?
I am also confused about the role of the p.join() statement. Is it ending any one process or all of them at the same time?
The join function here only waits for the last process you created to finish before moving on to the next section of code. If you walk through what you have done, you should see why you get the "strange" output.
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
This starts 5 new processes one after the other. These are all running at roughly the same time; it is down to the scheduler to decide which process is currently being run.
This means you have 5 processes running now:
Process 1
Process 2
Process 3
Process 4
Process 5
p.join()
This is going to wait for the process p to finish, and p is Process 5, as that was the last process assigned to p.
Let's now say that Process 2 finishes first, followed by Process 5, which is perfectly feasible as the scheduler could give those processes more time on the CPU.
Process 1
Process 2 prints 99999999
Process 3
Process 4
Process 5 prints 99999999
The p.join() line will now let execution move on to the next part, as p (Process 5) has finished.
end = time.time()
print end - start
This section prints the elapsed time, and there are still 3 processes running after this output.
The other processes then finish and print their 99999999.
To fix this behaviour you will need to .join() all the processes. To do this you could alter your code to this...
processes = []
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
    processes.append(p)

for process in processes:
    process.join()
This will wait for the first process, then the second, and so on. It won't matter if one process finishes before another, because every process in the list must be waited on before the script continues.
There are some problems with the way you are doing things; try this:
start = time.time()
procs = []
for i in xrange(5):
    p = Process(target=loop, args=(limit,))
    p.start()
    procs.append(p)
[p.join() for p in procs]
The problem is that you are not keeping track of the individual processes (the p variables inside the loop). You need to keep them around so you can interact with them. This update keeps them in a list and then joins all of them at the end.
Output looks like this:
99999999
99999999
99999999
99999999
99999999
6.29328012466
Note that now the time it took to run is also printed at the end of the execution.
Also, I ran your code and was not able to get the loop to execute more times than expected.