I observed some strange behavior when trying to create nested child processes in Python. Here is the parent program, parent_process.py:
import multiprocessing
import child_process

pool = multiprocessing.Pool(processes=4)
for i in range(4):
    pool.apply_async(child_process.run, ())
pool.close()
pool.join()
The parent program calls the "run" function in the following child program child_process.py:
import multiprocessing

def run():
    pool = multiprocessing.Pool(processes=4)
    print 'TEST!'
    pool.close()
    pool.join()
When I run the parent program, nothing is printed and the program exits quickly. However, if print 'TEST!' is moved one line up (before the nested pool is created), 'TEST!' is printed 4 times.
Because errors in a child process are not printed to the screen, this suggests that the program crashes when a child process tries to create its own nested child processes.
Could anyone explain what happens behind the scenes? Thanks!
According to the documentation for multiprocessing, daemonic processes cannot spawn child processes.
multiprocessing.Pool uses daemonic processes to ensure that they don't leak when your program exits.
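A minimal sketch to see this for yourself (the worker function and pool sizes here are mine, not from the question): a Pool worker reports itself as daemonic, and attempting to build a nested Pool inside it raises an error, which the worker can hand back to the parent instead of having it silently swallowed.

import multiprocessing

def worker(_):
    # Pool workers are daemonic, so this prints True
    print('daemonic worker?', multiprocessing.current_process().daemon)
    try:
        multiprocessing.Pool(processes=2)  # attempt to create grandchildren
    except Exception as exc:
        # on CPython this is typically an AssertionError:
        # "daemonic processes are not allowed to have children"
        return repr(exc)
    return 'no error'

if __name__ == '__main__':
    with multiprocessing.Pool(processes=1) as pool:
        print(pool.map(worker, [0]))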
As noxdafox said, multiprocessing.Pool uses daemonic processes. I found a simple workaround that uses multiprocessing.Process instead:
Parent program:
import multiprocessing
import child_process

processes = [None] * 4
for i in range(4):
    processes[i] = multiprocessing.Process(target=child_process.run, args=(i,))
    processes[i].start()
for i in range(4):
    processes[i].join()
Child program (with name child_process.py):
import multiprocessing

def test(info):
    print 'TEST', info[0], info[1]

def run(proc_id):
    pool = multiprocessing.Pool(processes=4)
    pool.map(test, [(proc_id, i) for i in range(4)])
    pool.close()
    pool.join()
The output is 16 lines of TEST:
TEST 0 0
TEST 0 1
TEST 0 3
TEST 0 2
TEST 2 0
TEST 2 1
TEST 2 2
TEST 2 3
TEST 3 0
TEST 3 1
TEST 3 3
TEST 3 2
TEST 1 0
TEST 1 1
TEST 1 2
TEST 1 3
I do not have enough reputation to post a comment, but since the Python version determines the options for running hierarchical multiprocessing (see, e.g., a post from 2015), I wanted to share my experience. The above solution by Da Kuang worked for me with Python 3.7.1 running through Anaconda 3.
I made a small modification to child_process.py to keep the CPU busy for a little while, so I could check the system monitor and verify that 16 simultaneous processes were running.
import multiprocessing

def test(info):
    print('TEST', info[0], info[1])
    # deliberately expensive busy work (~100k x 100k comparisons)
    # so the process stays visible in the system monitor for a while
    aa = [1] * 100000
    a = [1 for i in aa if all([ii < 1 for ii in aa])]
    print('exiting')

def run(proc_id):
    pool = multiprocessing.Pool(processes=4)
    pool.map(test, [(proc_id, i) for i in range(4)])
    pool.close()
    pool.join()
I want to have a variable number of threads that run at the same time. I tested multiple multithreading examples from multiprocessing, but they don't run at the same time. To explain it better, here is an example:
from multiprocessing import Pool
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    with Pool(3) as p:
        for i in range(5):
            p.map(f, [i])
Result:
a 0
b 0
a 1
b 1
a 2
b 2
Here it prints a, waits one second, and then prints b. But I want all the a's to be printed first and then all the b's (that is, every worker running at the same time), so that the result looks like this:
a0
a1
a2
b0
b1
b2
You mentioned threads but seem to be using processes. The threading module uses threads, the multiprocessing module uses processes. The primary difference is that threads run in the same memory space, while processes have separate memory. If you are looking to use processes, try the code snippet below.
from multiprocessing import Process
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    for i in range(5):
        p = Process(target=f, args=(i,))
        p.start()
Processes are spawned by creating a Process object and then calling its start() method.
First of all, this is not a thread pool but a process pool. If you want threads, you need to use multiprocessing.dummy.
Second, it seems like you misunderstood the map method. Most importantly, it is blocking. You are calling it with a single-element list each time - [i]. So you don't actually use the Pool's powers: you utilize just one process, wait for it to finish, and move on to the next number. To get the output you want, you should instead do:
if __name__ == '__main__':
    with Pool(3) as p:
        p.map(f, range(5))
But note that in this case you have a race between the number of processes and the range: with only 3 workers, some tasks must wait for a worker to free up. If you want all the a's and only then all the b's, try Pool(5), as in the sketch below.
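For completeness, here is a runnable version of that suggestion; it just combines the asker's f with Pool(5), nothing else is assumed. With one worker per task, every task starts immediately, so all the a's print before any b.

from multiprocessing import Pool
import time

def f(x):
    print("a", x)
    time.sleep(1)
    print("b", x)

if __name__ == '__main__':
    # one worker per task, so no task waits for a free worker
    with Pool(5) as p:
        p.map(f, range(5))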
I've stumbled across a weird timing issue while using the multiprocessing module.
Consider the following scenario. I have functions like this:
import multiprocessing as mp

def workerfunc(x):
    # timehook 3
    # something with x
    # timehook 4

def outer():
    # do something
    mygen = ... (some generator expression)
    pool = mp.Pool(processes=8)
    # time hook 1
    result = [pool.apply(workerfunc, args=(x,)) for x in mygen]
    # time hook 2

if __name__ == '__main__':
    outer()
I am using the time module to get a rough sense of how long my functions run. I successfully create 8 separate processes, which terminate without error. The longest time for a worker to finish is about 130 ms (measured between timehook 3 and 4).
I expected (as they are running in parallel) that the time between hook 1 and 2 would be approximately the same. Surprisingly, I get 600 ms as a result.
My machine has 32 cores and should be able to handle this easily. Can anybody give me a hint as to where this difference in time comes from?
Thanks!
You are using pool.apply, which is blocking. Use pool.apply_async instead; then the function calls will all run in parallel, and each call will return an AsyncResult object immediately. You can use that object to check when a process is done, and also to retrieve its result.
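A minimal sketch of the difference, with a sleep-based workerfunc and a pool of 8 standing in for the asker's real code: with apply_async, all submissions happen up front, so the total wall time is close to one task's duration rather than the sum of all of them.

import multiprocessing as mp
import time

def workerfunc(x):
    time.sleep(0.1)  # stand-in for the real work
    return x * x

if __name__ == '__main__':
    pool = mp.Pool(processes=8)
    start = time.time()
    # submit everything first; each call returns an AsyncResult immediately
    results = [pool.apply_async(workerfunc, args=(x,)) for x in range(8)]
    # collect afterwards; total time is roughly one task, not the sum of all
    print([r.get() for r in results])
    print('elapsed:', time.time() - start)
    pool.close()
    pool.join()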
Since you are using multiprocessing and not multithreading, your performance issue is not related to the GIL (Python's Global Interpreter Lock).
I've found an interesting link explaining this with an example; you can find it at the bottom of this answer.
The GIL does not prevent a process from running on a different processor of a machine. It simply only allows one thread to run at once within the interpreter.
So multiprocessing, not multithreading, will allow you to achieve true concurrency.
Let's understand this all through some benchmarking, because only that will lead you to believe what is said above. And yes, that should be the way to learn: experience it rather than just read it or understand it. Because if you experienced something, no amount of argument can convince you of the opposing thoughts.
import random
from threading import Thread
from multiprocessing import Process

size = 10000000  # Number of random numbers to add to each list
threads = 2      # Number of threads / processes to create

my_list = []
for i in xrange(0, threads):
    my_list.append([])

def func(count, mylist):
    for i in range(count):
        mylist.append(random.random())

def multithreaded():
    jobs = []
    for i in xrange(0, threads):
        thread = Thread(target=func, args=(size, my_list[i]))
        jobs.append(thread)
    # Start the threads
    for j in jobs:
        j.start()
    # Ensure all of the threads have finished
    for j in jobs:
        j.join()

def simple():
    for i in xrange(0, threads):
        func(size, my_list[i])

def multiprocessed():
    processes = []
    for i in xrange(0, threads):
        p = Process(target=func, args=(size, my_list[i]))
        processes.append(p)
    # Start the processes
    for p in processes:
        p.start()
    # Ensure all processes have finished execution
    for p in processes:
        p.join()

if __name__ == "__main__":
    multithreaded()
    #simple()
    #multiprocessed()
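To actually compare the three variants, a timing harness along these lines could replace the __main__ block (this sketch is mine, not part of the quoted article; the print call works in both Python 2 and 3):

import time

if __name__ == "__main__":
    # note: my_list keeps growing across runs, so this is only a rough comparison
    for fn in (simple, multithreaded, multiprocessed):
        start = time.time()
        fn()
        print('%s took %.2f seconds' % (fn.__name__, time.time() - start))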
Additional information
Here you can find the source of this information and a more detailed technical explanation (bonus: there are also Guido van Rossum quotes in it :) )
While trying to make my script multithreaded, I found out about multiprocessing. I wonder if there is a way to make multiprocessing work with threading?
cpu 1 -> 3 threads (worker A, B, C)
cpu 2 -> 3 threads (worker D, E, F)
...
I'm trying to do it myself, but I've hit so many problems. Is there a way to make these two work together?
You can generate a number of Processes, and then spawn Threads from inside them. Each Process can handle almost anything the standard interpreter thread can handle, so there's nothing stopping you from creating new Threads or even new Processes within each Process. As a minimal example:
import multiprocessing
import threading

def foo():
    print("Thread Executing!")

def bar():
    threads = []
    for _ in range(3):  # each Process creates a number of new Threads
        thread = threading.Thread(target=foo)
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()

if __name__ == "__main__":
    processes = []
    for _ in range(3):
        p = multiprocessing.Process(target=bar)  # create a new Process
        p.start()
        processes.append(p)
    for process in processes:
        process.join()
Communication between threads can be handled within each Process, and communication between the Processes can be handled at the root interpreter level using Queues or Manager objects.
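As a sketch of that last point (the two-process/three-thread split and the message format are just illustrative), a Manager Queue can be shared across Processes while the Threads inside each Process write to it:

import multiprocessing
import threading

def foo(q, name):
    q.put('result from %s' % name)  # each thread reports back through the shared queue

def bar(q, proc_id):
    threads = []
    for i in range(3):
        t = threading.Thread(target=foo, args=(q, 'process-%d/thread-%d' % (proc_id, i)))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    manager = multiprocessing.Manager()
    q = manager.Queue()  # a proxy queue that is safe to share across processes
    processes = [multiprocessing.Process(target=bar, args=(q, i)) for i in range(2)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    while not q.empty():
        print(q.get())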
You can define a function that takes a process and makes it run 3 threads, and then spawn your processes to target this function, for example:
def threader(process):
    for _ in range(3):
        threading.Thread(target=yourfunc).start()

def main():
    # spawn whatever processes here to target threader
I have a fairly simple program that I am trying to run in parallel. The program works when I run it using 4 processes. My processor has 4 physical cores and 8 logical cores. When I increase the number of processes from 4 to 5 or more, the program runs until it hits .join(), where it hangs; if I keep it at 4 or fewer, it never hangs and finishes properly. Are there different considerations to make when spawning more processes than physical cores that I haven't thought of? Here is some example code of what I'm doing to create and run the processes.
import multiprocessing as mp

def f(list1, list2):
    #do something

if __name__ == '__main__':
    list1 = list()
    list2 = list()
    procs = list()
    num_cores = 4
    for i in xrange(0, num_cores):
        p = mp.Process(target=f, args=(list1, list2,))
        procs.append(p)
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # hangs here with num_cores over 4
I want to stop all threads from a single worker.
I have a thread pool with 10 workers:
import multiprocessing
import sys

def myfunction(i):
    print(i)
    if i == 20:
        sys.exit()

p = multiprocessing.Pool(10, init_worker)  # init_worker is defined elsewhere
for i in range(100):
    p.apply_async(myfunction, (i,))
My program does not stop and the other processes continue working until all 100 iterations are complete. I want to stop the pool entirely from inside the thread that calls sys.exit(). The way it is currently written will only stop the worker that calls sys.exit().
This isn't working the way you're intending because calling sys.exit() in a worker process will only terminate the worker. It has no effect on the parent process or the other workers, because they're separate processes and raising SystemExit only affects the current process. You need to send a signal back to the parent process to tell it that it should shut down. One way to do this for your use case would be to use an Event created in a multiprocessing.Manager server:
import multiprocessing

def myfunction(i, event):
    if not event.is_set():
        print(i)
        if i == 20:
            event.set()

if __name__ == "__main__":
    p = multiprocessing.Pool(10)
    m = multiprocessing.Manager()
    event = m.Event()
    for i in range(100):
        p.apply_async(myfunction, (i, event))
    p.close()
    event.wait()   # We'll block here until a worker calls `event.set()`
    p.terminate()  # Terminate all processes in the Pool
Output:
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
As pointed out in Luke's answer, there is a race here: there's no guarantee that all the workers will run in order, so it's possible that myfunction(20, ..) will run before myfunction(19, ..), for example. It's also possible that workers past 20 will run before the main process can act on the event being set. I reduced the size of the race window by adding the if not event.is_set() check prior to printing i, but it still exists.
You can't do this.
Even if you were able to end all of your processes the moment i == 20, you couldn't be sure that only 20 numbers were printed, because your processes execute in a non-deterministic order.
If you want only 20 tasks to run, then you need to manage this from your master process (i.e. your control loop), as in the sketch below.
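A sketch of that control-loop idea, with a trivial myfunction as a placeholder for the real work: Pool.imap yields results in input order, so the master prints them itself and stops once it sees 20. Workers past 20 may still have started by then, but nothing past 20 is printed.

import multiprocessing

def myfunction(i):
    return i  # placeholder for the real work

if __name__ == '__main__':
    with multiprocessing.Pool(10) as p:
        # imap yields results in input order, so the master decides what to print
        for result in p.imap(myfunction, range(100)):
            print(result)
            if result == 20:
                break  # the with-block terminates the pool on exit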