I have a fairly simple program that I am trying to run in parallel. The program works when I run it using 4 processes. The processor I have is a 4-core processor with 8 logical cores. When I increase the number of processes from 4 to 5 or more, the program runs with the increased number of processes up until it hits .join(), where it hangs; if I keep it at 4 or fewer, it never hangs and finishes properly. Are there different considerations that need to be made when spawning more processes than there are physical cores in your machine that I haven't thought of? Here's some example code of what I'm doing to create and run the processes.
import multiprocessing as mp

def f(list1, list2):
    # do something
    pass

if __name__ == '__main__':
    list1 = list()
    list2 = list()
    procs = list()
    num_cores = 4

    for i in xrange(0, num_cores):
        p = mp.Process(target=f, args=(list1, list2,))
        procs.append(p)

    for p in procs:
        p.start()

    for p in procs:
        p.join()  # hangs here with num_cores over 4
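For reference, multiprocessing counts logical cores rather than physical ones; a minimal check (just a sketch for illustration) looks like this and would print 8 on this machine, not 4:

import multiprocessing as mp

if __name__ == '__main__':
    # cpu_count() reports logical cores (8 here), not physical cores (4)
    print(mp.cpu_count())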
Related
I have 50 processes I want to run in parallel. I need to run the processes on a GPU. My machine has 8 GPUs, and I pass the device number to each process so it knows which device to run on. Once that process is done, I want to run another process on that device. The processes are run as subprocesses using Popen with the command below:
python special_process.py device
A simple way to do this would be
for group in groups:
    processes = [subprocess.Popen(f'python special_process.py {device}'.split()) for device in range(8)]
    [p.wait() for p in processes]
where groups is the list of 50 processes split into groups of 8.
The downside of this is that some processes take longer than others, and all processes in a group need to finish before moving on to the next group.
I was hoping to do something like multiprocess.spawn, but I need the last process to return the device number so it is clear which device is open to run on. I tried using Queue and Process from multiprocessing but I can't get more than 1 process to run at once.
Any help would be much appreciated. Thanks.
A simple while loop and building your own queue worked. Just don't call wait() until the end.
import subprocess

d = list(range(20))
num_gpus = 8
procs = []
gpus_free = set(range(num_gpus))
gpus_used = set()

while len(d) > 0:
    # Check running processes and reclaim GPUs from the finished ones
    for proc, gpu in list(procs):  # iterate over a copy so removing is safe
        poll = proc.poll()
        if poll is None:
            # Proc still running
            continue
        else:
            # Proc complete - pop from list
            procs.remove((proc, gpu))
            gpus_free.add(gpu)

    # Submit new processes
    if len(procs) < num_gpus:
        this_process = d.pop()
        gpu_for_this_process = gpus_free.pop()
        command = f"python3 inner_function.py {gpu_for_this_process} {this_process}"
        proc = subprocess.Popen(command, shell=True)
        procs.append((proc, gpu_for_this_process))

# Wait for whatever is still running once the job list is empty
[proc.wait() for proc, _ in procs]
print('DONE with all')
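If you would rather avoid the polling loop, a sketch of the same rolling-submission idea uses one worker thread per GPU pulling job ids from a shared queue, so a GPU picks up the next job as soon as its current one finishes (this assumes the same hypothetical inner_function.py as above):

import queue
import subprocess
import threading

jobs = queue.Queue()
for job_id in range(20):  # the same 20 jobs as in the example above
    jobs.put(job_id)

def gpu_worker(gpu_id):
    # each thread owns one GPU and keeps launching jobs on it until the queue is empty
    while True:
        try:
            job_id = jobs.get_nowait()
        except queue.Empty:
            return
        subprocess.run(["python3", "inner_function.py", str(gpu_id), str(job_id)], check=False)

threads = [threading.Thread(target=gpu_worker, args=(g,)) for g in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('DONE with all')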
I have written a very basic piece of code to test multiprocessing in Python.
When I try to run the code on my Windows machine it does not run, while it works fine on a Linux machine.
Below is the code and the error that it throws.
from multiprocessing import Process
import os
import time

# creating a list to store all the processes that will be created
processes = []

# Get count of CPU Cores
cores = os.cpu_count()

def square(n):  # just creating a random program for demonstration
    for i in range(n):
        print(i)
        i * i
        time.sleep(0.1)

# create a process.
for i in range(cores):
    p = Process(target=square, args=(100,))
    processes.append(p)

# starting all the processes
for proc in processes:
    proc.start()

# join process
for proc in processes:
    proc.join()

print("All processes are done")
With the spawn start method, each new Process starts a fresh Python interpreter that imports your module again. This means your module-level code is executed every time a child performs that import, so each child tries to create its own processes. (This only applies to the spawn start method, which is the default on Windows; the fork start method used by default on Linux does not re-run the module, which is why it worked there.)
In order to avoid this, you have to move your process-creating code under the if __name__ == "__main__": guard so that only the main process runs it and the child processes can be created correctly.
You fix it like so:
from multiprocessing import Process
import os
import time

def square(n):  # just creating a random program for demonstration
    for i in range(n):
        print(i)
        i * i
        time.sleep(0.1)

if __name__ == "__main__":
    # Get count of CPU Cores
    cores = os.cpu_count()

    # creating a list to store all the processes that will be created
    processes = []

    # create a process.
    for i in range(cores):
        p = Process(target=square, args=(100,))
        processes.append(p)

    # starting all the processes
    for proc in processes:
        proc.start()

    # join process
    for proc in processes:
        proc.join()

    print("All processes are done")
Result:
1
0
0
1
1
0
2
0
0
1
1
1
2
... Truncated
All processes are done
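As a side note, you can confirm which start method your platform uses with a quick check like this minimal sketch; it prints "spawn" on Windows and "fork" on most Linux setups, which is why the original code behaved differently on the two machines:

import multiprocessing as mp

if __name__ == "__main__":
    # "spawn" on Windows (and macOS since Python 3.8), "fork" on most Linux setups
    print(mp.get_start_method())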
I want to have a variable number of threads that run at the same time.
I tested multiple multithreading examples with multiprocessing, but they don't run at the same time.
To explain it better, here is an example:
from multiprocessing import Pool
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    with Pool(3) as p:
        for i in range(5):
            p.map(f, [i])
Result:
a 0
b 0
a 1
b 1
a 2
b 2
Here it prints a, waits 1 second, and then prints b, but I want all the a's to be printed first and then the b's (i.e., every thread runs at the same time), so that the result looks like this:
a0
a1
a2
b0
b1
b2
You mentioned threads but seem to be using processes. The threading module uses threads, while the multiprocessing module uses processes. The primary difference is that threads run in the same memory space, while processes have separate memory. If you are looking to use the multiprocessing library, try the code snippet below.
from multiprocessing import Process
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    for i in range(5):
        p = Process(target=f, args=(i,))
        p.start()
Processes are spawned by creating a Process object and then calling its start() method.
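If you actually want threads rather than processes (which is fine here, since the work is mostly sleeping rather than CPU-bound Python code), a minimal threading sketch of the same idea would be:

import threading
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    threads = [threading.Thread(target=f, args=(i,)) for i in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()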
First of all, this is not a thread pool but a process pool. If you want threads, you need to use multiprocessing.dummy.
Second, it seems like you misunderstood the map method. Most importantly, it is blocking. You are calling it with a single-element list each time, [i], so you don't actually use the Pool's power. You utilize just one process, wait for it to finish, and move on to the next number. To get the output you want, you should instead do:
if __name__ == '__main__':
    with Pool(3) as p:
        p.map(f, range(5))
But note that in this case you have a race between the number of processes and the size of the range. If you want all the a's and only then all the b's, use Pool(5).
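For completeness, a small sketch of the thread-based variant mentioned above, using multiprocessing.dummy (the same Pool API, but backed by threads):

from multiprocessing.dummy import Pool
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    # 5 threads, so all a's print before any b's
    with Pool(5) as p:
        p.map(f, range(5))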
I observed this behavior when trying to create nested child processes in Python. Here is the parent program parent_process.py:
import multiprocessing
import child_process

pool = multiprocessing.Pool(processes=4)
for i in range(4):
    pool.apply_async(child_process.run, ())
pool.close()
pool.join()
The parent program calls the "run" function in the following child program child_process.py:
import multiprocessing

def run():
    pool = multiprocessing.Pool(processes=4)
    print 'TEST!'
    pool.close()
    pool.join()
When I run the parent program, nothing is printed out and the program exits quickly. However, if print 'TEST!' is moved one line up (before the nested pool is created), 'TEST!' is printed 4 times.
Because errors in a child process won't print to the screen, this seems to show that the program crashes when a child process creates its own nested child processes.
Could anyone explain what happens behind the scenes? Thanks!
According to the documentation for multiprocessing, daemonic processes cannot spawn child processes.
multiprocessing.Pool uses daemonic processes to ensure that they don't leak when your program exits.
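A tiny sketch (Python 3, just for illustration) that makes this visible: Pool workers report daemon=True, which is why they are not allowed to start children of their own:

import multiprocessing

def report(_):
    # runs inside a Pool worker, which is a daemonic process
    return multiprocessing.current_process().daemon

if __name__ == "__main__":
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(report, range(2)))  # expected output: [True, True]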
As noxdafox said, multiprocessing.Pool uses daemonic processes. I found a simple workaround that uses multiprocessing.Process instead:
Parent program:
import multiprocessing
import child_process

processes = [None] * 4
for i in range(4):
    processes[i] = multiprocessing.Process(target=child_process.run, args=(i,))
    processes[i].start()
for i in range(4):
    processes[i].join()
Child program (with name child_process.py):
import multiprocessing

def test(info):
    print 'TEST', info[0], info[1]

def run(proc_id):
    pool = multiprocessing.Pool(processes=4)
    pool.map(test, [(proc_id, i) for i in range(4)])
    pool.close()
    pool.join()
The output is 16 lines of TEST:
TEST 0 0
TEST 0 1
TEST 0 3
TEST 0 2
TEST 2 0
TEST 2 1
TEST 2 2
TEST 2 3
TEST 3 0
TEST 3 1
TEST 3 3
TEST 3 2
TEST 1 0
TEST 1 1
TEST 1 2
TEST 1 3
I do not have enough reputation to post a comment, but since the Python version determines which options are available for hierarchical multiprocessing (e.g., a post from 2015), I wanted to share my experience. The above solution by Da Kuang worked for me with Python 3.7.1 running through Anaconda 3.
I made a small modification to child_process.py to make it use the CPU for a little while, so I could check the system monitor and verify that 16 simultaneous processes were running.
import multiprocessing

def test(info):
    print('TEST', info[0], info[1])
    aa = [1] * 100000
    a = [1 for i in aa if all([ii < 1 for ii in aa])]
    print('exiting')

def run(proc_id):
    pool = multiprocessing.Pool(processes=4)
    pool.map(test, [(proc_id, i) for i in range(4)])
    pool.close()
    pool.join()
I have a simple implementation using Python's multiprocessing module:
if __name__ == '__main__':
    jobs = []
    while True:
        for i in range(40):
            # fetch one by one from redis queue
            # item = item from redis queue
            p = Process(name='worker ' + str(i), target=worker, args=(item,))
            # if p is not running, start p
            if not p.is_alive():
                jobs.append(p)
                p.start()
        for j in jobs:
            j.join()
            jobs.remove(j)

def worker(url_data):
    """worker function"""
    print url_data['link']
What I expect this code to do:
run in an infinite loop, waiting on the Redis queue.
if the Redis queue is not empty, fetch an item.
create 40 multiprocessing.Process workers, no more, no less.
if a process has finished processing, start a new one, so that ~40 processes are running at all times.
I read that, to avoid zombie processes, children should be joined by the parent; that's what I expected to achieve in the second loop. But the issue is that on launching it spawns 40 processes, the workers finish processing and sit in a zombie state until all of the currently spawned processes have finished, and then in the next iteration of while True the same pattern continues.
So my question is:
How can I avoid zombie processes and spawn a new process as soon as 1 of the 40 has finished?
For a task like the one you described it is usually better to use a different approach, using a Pool.
You can have the main process fetch the data and the workers deal with it.
Following is an example of Pool from the Python docs:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)            # start 4 worker processes
    result = pool.apply_async(f, [10])  # evaluate "f(10)" asynchronously
    print result.get(timeout=1)         # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))        # prints "[0, 1, 4,..., 81]"
I also suggest using imap instead of map, as it seems your task can be asynchronous.
Roughly, your code will be:
p = Pool(40)

while True:
    items = items from redis queue
    p.imap_unordered(worker, items)  # unordered version is faster

def worker(url_data):
    """worker function"""
    print url_data['link']
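A slightly more fleshed-out sketch of that suggestion (Python 3 syntax; fetch_batch_from_redis is a hypothetical placeholder for however you read from your Redis queue): iterating the imap_unordered result is what lets a worker pick up a new item as soon as it finishes its current one.

from multiprocessing import Pool

def worker(url_data):
    """worker function"""
    print(url_data['link'])

def fetch_batch_from_redis():
    # hypothetical placeholder: pull a batch of items from the Redis queue
    return [{'link': 'http://example.com/%d' % i} for i in range(40)]

if __name__ == '__main__':
    with Pool(40) as p:
        while True:
            items = fetch_batch_from_redis()
            # consume the iterator so the pool keeps its 40 workers busy and
            # hands out new items as each one completes
            for _ in p.imap_unordered(worker, items):
                pass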