I am using python to run multiple subprocesses at the same time.
I want to get the run time of each process.
I am using the subprocess module.
What I did:
I created two separate for loops: the first starts each process, and the second waits for all processes to end.
ps = []
for prcs in batch:
    p = subprocess.Popen([prcs])
    ps.append(p)

for p in ps:
    p.wait()
This code works fine for running the processes simultaneously, but I do not know what to add to it in order to get the run time of each process separately.
Edit: Is there a way to get the run time through the module subprocess?
For example: runtime = p.runtime()
I agree with @quamrana that the easiest way to do this would be with threads.
First, we need to import some standard library modules:
import collections
import subprocess
import threading
import time
Instead of a list to store the processes, we use an ordered dictionary to keep track of the processes and their times. Since we don't know how long each thread will take, we need some way to keep track of the original order of our {process: time} pairs. The threads themselves can be stored in a list.
ps = collections.OrderedDict()
ts = []
Initializing the value paired to each process as the current time makes the whole thing cleaner, despite the fact that it is generally inadvisable to use the same variable for two different things (in this case, starting time followed by process duration). The target for our thread simply waits for the thread to finish and updates the ps ordered dictionary from the start time to the process duration.
def time_p(p):
    p.wait()
    ps[p] = time.time() - ps[p]

for prcs in batch:
    p = subprocess.Popen([prcs])
    ps[p] = time.time()
    ts.append(threading.Thread(target=time_p, args=(p,)))
Now, we just start each of the threads, then wait for them all to complete.
for t in ts:
    t.start()
for t in ts:
    t.join()
Once they are all complete, we can print out the results for each:
for prcs, p in zip(batch, ps):
    print('%s took %s seconds' % (prcs, ps[p]))
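Assembled into one self-contained, runnable sketch (the two commands in `batch` are made-up placeholders built from `sys.executable` so the example runs anywhere; substitute your own commands):

```python
import collections
import subprocess
import sys
import threading
import time

# placeholder commands so the sketch is self-contained; substitute your batch
batch = [
    [sys.executable, "-c", "import time; time.sleep(0.2)"],
    [sys.executable, "-c", "pass"],
]

ps = collections.OrderedDict()  # {process: start time, later duration}
ts = []

def time_p(p):
    p.wait()
    ps[p] = time.time() - ps[p]  # replace start time with duration

for cmd in batch:
    p = subprocess.Popen(cmd)
    ps[p] = time.time()
    ts.append(threading.Thread(target=time_p, args=(p,)))

for t in ts:
    t.start()
for t in ts:
    t.join()

for cmd, p in zip(batch, ps):
    print('%s took %.3f seconds' % (cmd, ps[p]))
```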
Related
I'd like to run a bunch of processes concurrently, but I never want to reuse an already existing process. So, basically, once a process has finished I'd like to create a new one. At all times the number of processes should not exceed N.
I don't think I can use multiprocessing.Pool for this since it reuses processes.
How can I achieve this?
One solution would be to run N processes and wait until all processes are done, then repeat until all tasks are done. This solution is not very good, since the processes can have very different runtimes and the whole batch waits for its slowest member.
Here is a naive solution that appears to work fine:
from multiprocessing import Process, Queue
import random
import os
from time import sleep
def f(q):
    print(f"{os.getpid()} Starting")
    sleep(random.choice(range(1, 10)))
    q.put("Done")

def create_proc(q):
    p = Process(target=f, args=(q,))
    p.start()

if __name__ == "__main__":
    q = Queue()
    N = 5

    for n in range(N):
        create_proc(q)

    while True:
        q.get()         # blocks until any process reports "Done"...
        create_proc(q)  # ...then start a fresh one to replace it
Pool can reuse a process a limited number of times, including only once when you pass maxtasksperchild=1. You might also try the initializer argument, to run the parts of your library that are picky about running once per process there, instead of inside your pool jobs.
Objective
- a process (.exe) with multiple input arguments
- multiple files; for each of them the above-mentioned process shall be executed
I want to use python to parallelize the process
I am using subprocess.Popen to create the processes and afterwards keep a maximum of N parallel processes.
For testing purposes, I want to parallelize a simple script like "cmd timeout 5".
State of work
import subprocess
count = 10
parallel = 2
processes = []
for i in range(0, count):
    while len(processes) >= parallel:
        for process in processes:
            if process.poll() is None:
                processes.remove(process)
                break
    process = subprocess.Popen(["cmd", "/c timeout 5"])
    processes.append(process)
[...]
I read somewhere that a good approach for checking whether a process is running would be is not None, as shown in the code.
Question
I am somehow struggling to set it up correctly, especially the Popen([...]) part. In some cases, all processes are executed without the maximum parallel count being respected, and in other cases it doesn't work at all.
I guess that there has to be a part where the process is closed if finished.
Thanks!
You will probably have a better time using the built-in multiprocessing module to manage the subprocesses running your tasks.
The reason I've wrapped the command in a dict is that imap_unordered doesn't have a starmap alternative, so it's easier to unpack a single "job" within the callable. (imap_unordered is faster than imap but doesn't guarantee ordered execution, since any worker process can grab any job; whether that's okay for you is your business problem.)
import multiprocessing
import subprocess
def run_command(job):
    # TODO: add other things here?
    subprocess.check_call(job["command"])

def main():
    with multiprocessing.Pool(2) as p:
        jobs = [{"command": ["cmd", "/c timeout 5"]} for x in range(10)]
        for result in p.imap_unordered(run_command, jobs):
            pass

if __name__ == "__main__":
    main()
This is my first time using multi-threading.
I wrote code to process every file in a directory, like this:
list_battle = []
start = time.time()

for filepath in pathlib.Path(dir_battle).glob('**/*'):
    battle_json = gzip.GzipFile(filepath, 'rb').read().decode("utf-8")
    battle_id = eval(type_cmd)
    list_battle.append((battle_id, battle_json))

end = time.time()
print(end - start)
It shows the code runs in 8.74 seconds.
Then, I tried to use multi-threading as follows:
# define a function to process each file
def get_file_data(path, cmd, result_list):
    data_json = gzip.GzipFile(path, 'rb').read().decode("utf-8")
    data_id = eval(cmd)
    result_list.append((data_id, data_json))

# start multi-threading
pool = Pool(5)
start = time.time()

for filepath in pathlib.Path(dir_battle).glob('**/*'):
    pool.apply_async(get_file_data(filepath, type_cmd, list_battle))

end = time.time()
print(end - start)
However, the result shows it takes 12.36 seconds!
In my view, with a single thread, each loop iteration waits for the current work to finish before starting the next. With multiprocessing, the first iteration dispatches work to worker 1, the second to worker 2, and so on; while work is being dispatched to the other four workers, worker 1 keeps running, and by the sixth iteration it should have finished and be free to take the next file.
So this should be quicker than a single thread. Why does the multi-processing version run even slower? How can I address this? What is wrong with my reasoning?
Any help is appreciated.
Multiprocessing does not reduce your processing time unless your process has a lot of dead time (waiting). The main purpose of multiprocessing is parallelism of different tasks, at the cost of context switching. Whenever you switch from one task to another, interrupting the previous one, your program needs to store all the variables of the former task and load the ones of the new one. This takes time as well.
This means the shorter the time you spend per task, the less efficient (in regards of computing time) your multiprocessing is.
Is it possible to have 2, 3 or more threads in Python execute something simultaneously, at the exact same moment? And if one of the threads is late, can the other wait for it, so that the last request is still executed at the same time?
Example: There are two threads that are calculating specific parameters, after they have done that they need to click one button at the same time (to send post request to the server).
"Exact the same time" is really difficult, at almost the same time is possible but you need to use multiprocessing instead of threads. Here one example.
from time import time
from multiprocessing import Pool

def f(*args):
    while time() < start + 5:  # synchronize the execution of each process
        pass
    print(time())

start = time()
with Pool(10) as p:
    p.map(f, range(10))
It prints
1495552973.6672032
1495552973.6672032
1495552973.669514
1495552973.667697
1495552973.6672032
1495552973.668086
1495552973.6693969
1495552973.6672032
1495552973.6677089
1495552973.669164
Note that some of the processes really are simultaneous (to within 1e-7 seconds). It's impossible to guarantee that all the processes will execute at the very same moment.
However, if you limit the number of processes to the number of cores you actually have, then most of the time they will run at exactly the same moment.
I want to do clustering on 10,000 models. Before that, I have to calculate the Pearson correlation coefficient for every pair of models. That's a large amount of computation, so I use multiprocessing to spawn processes, assigning the computing job to 16 CPUs. My code is like this:
import numpy as np
from multiprocessing import Process, Queue
def cc_calculator(begin, end, q):
    index = lambda i, j, n: i*n+j-i*(i+1)/2-i-1
    for i in range(begin, end):
        for j in range(i, nmodel):
            all_cc[i][j] = get_cc(i, j)
            q.put((index(i, j, nmodel), all_cc[i][j]))

def func(i):
    res = (16-i)/16
    res = res**0.5
    res = int(nmodel*(1-res))
    return res

nmodel = int(raw_input("Entering the number of models:"))
all_cc = np.zeros((nmodel, nmodel))
ncc = int(nmodel*(nmodel-1)/2)
condensed_cc = [0]*ncc
q = Queue()
mprocess = []

for i in range(16):
    begin = func(i)
    end = func(i+1)
    p = Process(target=cc_calculator, args=(begin, end, q))
    mprocess += [p]
    p.start()

for x in mprocess:
    x.join()

while not q.empty():
    (ind, value) = q.get()
    ind = int(ind)
    condensed_cc[ind] = value

np.save("condensed_cc", condensed_cc)
where get_cc(i, j) calculates the correlation coefficient of models i and j. all_cc is an upper triangular matrix and all_cc[i][j] stores the cc value. condensed_cc is another version of all_cc; I'll process it into condensed_dist to do the clustering. The func function helps assign almost the same amount of computation to each CPU.
I ran the program successfully with nmodel=20. When I try to run the program with nmodel=10,000, however, it seems that it never ends. I waited about two days and used the top command in another terminal window: no process with the command "python" is still running, yet the program has not exited and there is no output file. When I use Ctrl+C to force it to stop, it points to the line x.join(). nmodel=40 ran fast but failed with the same problem.
Maybe this problem has something to do with q, because if I comment out the line q.put(...) it runs successfully. Or something like this:

q.put(...)
q.get()

is also ok. But both workarounds give a wrong condensed_cc: they don't change all_cc or condensed_cc.
Another example with only one subprocess:
from multiprocessing import Process, Queue

def g(q):
    num = 10**2
    for i in range(num):
        print '='*10
        print i
        q.put((i, i+2))
        print "qsize: ", q.qsize()

q = Queue()
p = Process(target=g, args=(q,))
p.start()
p.join()

while not q.empty():
    q.get()
It is ok with num=100 but fails with num=10,000. Even with num=100**2 it did print every i and every qsize, yet it still hung. I cannot figure out why. Also, Ctrl+C gives a traceback pointing to p.join().
I want to say more about the size problem of the queue. The documentation introduces Queue as Queue([maxsize]), and says about the put method: "...block if necessary until a free slot is available". All this makes one think that the subprocess is blocked because the queue ran out of space. However, as I mentioned in the second example above, the qsize printed on the screen keeps increasing, meaning that the queue is not full. I added the line:
print q.full()
after the qsize print statement; it is always False for num=10,000, while the program is still stuck somewhere. To emphasize one thing: the top command in another terminal shows no process with the command python. That really puzzles me.
I'm using python 2.7.9.
I believe the problem you are running into is described in the multiprocessing programming guidelines: https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming
Specifically this section:
Joining processes that use queues
Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the “feeder” thread to the underlying pipe. (The child process can call the cancel_join_thread() method of the queue to avoid this behaviour.)
This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will be joined automatically.
An example which will deadlock is the following:
from multiprocessing import Process, Queue

def f(q):
    q.put('X' * 1000000)

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=f, args=(queue,))
    p.start()
    p.join()          # this deadlocks
    obj = queue.get()
A fix here would be to swap the last two lines (or simply remove the p.join() line).
You might also want to check out the section on "Avoid Shared State".
It looks like you are using .join to avoid the race condition of q.empty() returning True before anything has been added. You should not rely on .empty() at all when using multiprocessing (or multithreading). Instead, the worker process should signal the main process when it is done adding items to the queue. This is normally done by placing a sentinel value in the queue, but there are other options as well.