I want to apply a function in parallel using multiprocessing.Pool.
The problem is that if one function call triggers a segmentation fault the Pool hangs forever.
Does anybody have an idea how I can make a Pool that detects when something like this happens and raises an error?
The following example shows how to reproduce it (requires scikit-learn > 0.14):
import numpy as np
from sklearn.ensemble import gradient_boosting
import time
from multiprocessing import Pool

class Bad(object):
    tree_ = None

def fit_one(i):
    if i == 3:
        # this will segfault
        bad = np.array([[Bad()] * 2], dtype=np.object)
        gradient_boosting.predict_stages(bad,
                                         np.random.rand(20, 2).astype(np.float32),
                                         1.0, np.random.rand(20, 2))
    else:
        time.sleep(1)
    return i

pool = Pool(2)
out = pool.imap_unordered(fit_one, range(10))
# we will never see 3
for o in out:
    print o
As described in the comments, this just works in Python 3 if you use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool: when a worker process dies, the pending futures fail with BrokenProcessPool rather than hanging.
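A minimal sketch of that behaviour (the crashing scikit-learn call from the question is replaced with a hypothetical crash_or_work function that kills its own process, which is enough to show the effect):

import os
import signal
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

def crash_or_work(i):
    # stand-in for fit_one: task 3 dies abruptly, like a segfault would
    if i == 3:
        os.kill(os.getpid(), signal.SIGSEGV)
    return i

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(crash_or_work, i) for i in range(10)]
        for f in futures:
            try:
                print(f.result())
            except BrokenProcessPool:
                print('a worker died; the pool is broken')

Instead of hanging, the executor marks itself broken and every unfinished future raises.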
If you're stuck on Python 2, the best option I've found is to use the timeout argument on the result objects returned by Pool.apply_async and Pool.map_async. For example:
pool = Pool(2)
results = [pool.apply_async(fit_one, (i,)) for i in range(10)]
for res in results:
    print res.get(timeout=1000)  # allow 1000 seconds max per task
This works as long as you have an upper bound for how long a child process should take to complete a task.
This is a known bug in Python, issue #22393. Until it is fixed there is no meaningful workaround as long as you're using multiprocessing.Pool. A patch is attached to that issue, but it has not been merged yet, so no stable release of Python fixes the problem.
Instead of using Pool.imap(), you could create the child processes yourself with Process(). The Process object lets you check each child's liveness (is_alive()) and exit status (exitcode), so you will know if one of them dies.
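A rough sketch of that approach, reusing the fit_one idea from the question (the body here is a stand-in; only the exit-code check matters). On Unix a child killed by a segfault reports a negative exitcode, -11 for SIGSEGV:

import os
import signal
import time
from multiprocessing import Process, Queue

def fit_one(i):
    # stand-in for the real fit_one: task 3 dies abruptly
    if i == 3:
        os.kill(os.getpid(), signal.SIGSEGV)
    time.sleep(1)
    return i

def worker(q, i):
    q.put(fit_one(i))

if __name__ == '__main__':
    queue = Queue()
    procs = [Process(target=worker, args=(queue, i)) for i in range(10)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
        # a non-zero exitcode means the child did not finish normally
        if p.exitcode != 0:
            print('%s died with exit code %s' % (p.name, p.exitcode))
    results = []
    while not queue.empty():
        results.append(queue.get())
    print(sorted(results))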
I haven't run your example to see if it can handle the error, but try concurrent.futures. Simply replace my_function(i) with your fit_one(i). Keep the if __name__ == '__main__': structure; concurrent.futures seems to need it. The code below is tested on my machine, so it will hopefully work straight away on yours.
import concurrent.futures

def my_function(i):
    print('function running')
    return i

def run():
    number_processes = 4
    executor = concurrent.futures.ProcessPoolExecutor(number_processes)
    futures = [executor.submit(my_function, i) for i in range(10)]
    concurrent.futures.wait(futures)
    for f in futures:
        print(f.result())

if __name__ == '__main__':
    run()
I am trying to create a pool inside a pool to parallelize a for loop, to see whether it is faster than running the loop with only one pool creation. My issue is that the code I wrote never seems to finish running, and I don't quite get why. Here is the code:
import numpy as np
import multiprocessing as mp
import time

cpus = mp.cpu_count() - 1

def f(x):
    lista = list(pool.map(time.sleep, [1, 2, 3] * x))
    print('done')
    return lista

pool = mp.Pool(cpus)
lista2 = pool.map(f, range(2))
pool.close()
pool.join()
From the docs: "Note that the methods of a pool should only ever be used by the process which created it."
https://docs.python.org/3.4/library/multiprocessing.html#module-multiprocessing.pool
Also note that the processes started by Pool are daemonic processes that are not allowed to have their own children, which may explain why you're experiencing a deadlock. According to this blog post you should have seen an exception raised over this; I'm not sure why you're not seeing one:
https://blog.mbedded.ninja/programming/languages/python/python-multiprocessing/
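If the goal is just to run all of the sleeps in parallel, a sketch of one restructuring (keeping the [1, 2, 3] * x workload from the question) is to flatten everything into a single task list handled by one Pool that only the parent process touches:

import multiprocessing as mp
import time

def work_item(args):
    x, seconds = args
    time.sleep(seconds)
    return (x, seconds)

if __name__ == '__main__':
    cpus = max(1, mp.cpu_count() - 1)
    # build every (x, sleep duration) pair up front instead of calling pool.map inside a worker
    tasks = [(x, s) for x in range(2) for s in [1, 2, 3] * x]
    with mp.Pool(cpus) as pool:
        results = pool.map(work_item, tasks)
    print(results)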
I've stumbled across a weird timing issue while using the multiprocessing module.
Consider the following scenario. I have functions like this:
import multiprocessing as mp

def workerfunc(x):
    # timehook 3
    # something with x
    # timehook 4
    pass

def outer():
    # do something
    mygen = ...  # (some generator expression)
    pool = mp.Pool(processes=8)
    # time hook 1
    result = [pool.apply(workerfunc, args=(x,)) for x in mygen]
    # time hook 2

if __name__ == '__main__':
    outer()
I am using the time module to get a rough feeling for how long my functions run. I successfully create 8 separate processes, which terminate without error. The longest time for a worker to finish is about 130 ms (measured between timehook 3 and 4).
I expected (as they are running in parallel) that the time between hook 1 and 2 would be approximately the same. Surprisingly, I get 600 ms as a result.
My machine has 32 cores and should be able to handle this easily. Can anybody give me a hint where this difference in time comes from?
Thanks!
You are using pool.apply, which is blocking. Use pool.apply_async instead; then the function calls will all run in parallel, and each call returns an AsyncResult object immediately. You can use that object to check when a task is done and also to retrieve its result.
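A sketch of that change, with a dummy workerfunc and mygen standing in for the ones in the question:

import multiprocessing as mp
import time

def workerfunc(x):
    # placeholder for the real work
    time.sleep(0.1)
    return x * x

def outer():
    mygen = range(16)  # stand-in for the generator expression
    pool = mp.Pool(processes=8)
    # time hook 1
    async_results = [pool.apply_async(workerfunc, args=(x,)) for x in mygen]
    results = [r.get() for r in async_results]  # blocks until every task has finished
    # time hook 2
    pool.close()
    pool.join()
    return results

if __name__ == '__main__':
    print(outer())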
Since you are using multiprocessing and not multithreading, your performance issue is not related to the GIL (Python's Global Interpreter Lock).
I've found an interesting link explaining this with an example; you can find it at the bottom of this answer.
The GIL does not prevent a process from running on a different
processor of a machine. It simply only allows one thread to run at
once within the interpreter.
So multiprocessing not multithreading will allow you to achieve true
concurrency.
Let's understand this through some benchmarking, because only that will lead you to believe what is said above. And yes, that is the way to learn: experience it rather than just read about it, because once you have experienced something, no amount of argument can convince you of the opposing view.
import random
from threading import Thread
from multiprocessing import Process

size = 10000000   # Number of random numbers to add to list
threads = 2       # Number of threads to create
my_list = []
for i in xrange(0, threads):
    my_list.append([])

def func(count, mylist):
    for i in range(count):
        mylist.append(random.random())

def multithreaded():
    jobs = []
    for i in xrange(0, threads):
        thread = Thread(target=func, args=(size, my_list[i]))
        jobs.append(thread)
    # Start the threads
    for j in jobs:
        j.start()
    # Ensure all of the threads have finished
    for j in jobs:
        j.join()

def simple():
    for i in xrange(0, threads):
        func(size, my_list[i])

def multiprocessed():
    processes = []
    for i in xrange(0, threads):
        p = Process(target=func, args=(size, my_list[i]))
        processes.append(p)
    # Start the processes
    for p in processes:
        p.start()
    # Ensure all processes have finished execution
    for p in processes:
        p.join()

if __name__ == "__main__":
    multithreaded()
    #simple()
    #multiprocessed()
Additional information
Here you can find the source of this information and a more detailed technical explanation (bonus: there are also Guido van Rossum quotes in it :) )
I am in the following setting: I have a method that takes an objective function f as input. As a subroutine of that method I want to evaluate f on a small set of points. Since f has high complexity, I considered doing that in parallel.
All the online examples hang even for trivial functions like squaring on sets of 5 points. They are using the multiprocessing library, and I don't know what I am doing wrong. I am not sure how to encapsulate the __name__ == "__main__" statement in my method (since it is part of a module, I guess instead of "__main__" I should use the module name?).
The code I have been using looks like this:
from multiprocessing.pool import Pool
from multiprocessing import cpu_count

x = [1, 2, 3, 4, 5]
num_cores = cpu_count()

def f(x):
    return x**2

if __name__ == "__main__":
    pool = Pool(num_cores)
    y = list(pool.map(f, x))
    pool.join()
    print(y)
When executing this code in Spyder it takes a very long time to finish.
So my main questions are: what am I doing wrong in this code, and how can I encapsulate the __name__ statement when this code is part of a bigger method?
Is it even worth parallelizing this? (One function evaluation can take multiple minutes, and in serial this adds up to a total runtime of hours...)
According to the documentation:
close()
Prevents any more tasks from being submitted to the pool. Once all the tasks have been completed the worker processes will exit.
terminate()
Stops the worker processes immediately without completing outstanding work. When the pool object is garbage collected
terminate() will be called immediately.
join()
Wait for the worker processes to exit. One must call close() or terminate() before using join().
So you should add a call to close() before join():
from multiprocessing.pool import Pool
from multiprocessing import cpu_count

x = [1, 2, 3, 4, 5]

def f(x):
    return x**2

if __name__ == "__main__":
    pool = Pool()
    y = list(pool.map(f, x))
    pool.close()
    pool.join()
    print(y)
You can call Pool() without any argument, and it will use cpu_count() by default:
If processes is None then the number returned by cpu_count() is used
About the if __name__ == "__main__" guard, read more information here.
So you need to think a bit about which code you want executed only in the main program. The most obvious example is that you want code that creates child processes to run only in the main program - so that should be protected by __name__ == '__main__'.
You might want to look into the chunksize argument of the map function that you are using.
On a large enough input list, a lot of your time is spent simply communicating the arguments to and from the separate parallel processes.
One symptom of this problem is that when you use something like htop all cores are firing but at < 100%.
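A sketch of what that looks like, reusing the squaring function from above but with a much longer input list; the chunksize of 1000 is only an illustration and should be tuned for your workload:

from multiprocessing import Pool

def f(x):
    return x**2

if __name__ == "__main__":
    data = range(1000000)
    with Pool() as pool:
        # each worker receives a batch of 1000 inputs per round trip instead of one at a time
        y = pool.map(f, data, chunksize=1000)
    print(y[:5])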
I am trying to use multiprocessing.Pool to parallelize my application. To share some variables I am using a Queue, as hinted here:
from multiprocessing import Pool, Queue

def get_prediction(data):
    # here the real calculation will be performed
    ...

def mainFunction():
    def get_prediction_init(q):
        print("a")
        get_prediction.q = q

    queue = Queue()
    pool = Pool(processes=16, initializer=get_prediction_init, initargs=[queue,])

if __name__ == '__main__':
    mainFunction()
This code is running perfectly on a Debian machine, but is not working at all on another Windows 10 device. It fails with the error
AttributeError: Can't pickle local object 'mainFunction.<locals>.get_prediction_init'
I do not really know what exactly is causing the error. How can I solve the problem so that I can run the code on the Windows device as well?
EDIT: The problem is solved if I create the get_prediction_init function on the same level as mainFunction. It only failed when I defined it as an inner function. Sorry for the confusion in my post.
The problem is in something you haven't shown us. For example, it's a mystery where "mainFunction" came from in the AttributeError message you showed.
Here's a complete, executable program based on the fragment you posted. Worked fine for me under Windows 10 just now, under Python 3.6.1 (I'm guessing you're using Python 3 from your print syntax), printing "a" 16 times:
import multiprocessing as mp

def get_prediction(data):
    # here the real calculation will be performed
    pass

def get_prediction_init(q):
    print("a")
    get_prediction.q = q

if __name__ == "__main__":
    queue = mp.Queue()
    pool = mp.Pool(processes=16, initializer=get_prediction_init, initargs=[queue,])
    pool.close()
    pool.join()
Edit
And, based on your edit, this program also works fine for me:
import multiprocessing as mp

def get_prediction(data):
    # here the real calculation will be performed
    pass

def get_prediction_init(q):
    print("a")
    get_prediction.q = q

def mainFunction():
    queue = mp.Queue()
    pool = mp.Pool(processes=16, initializer=get_prediction_init, initargs=[queue,])
    pool.close()
    pool.join()

if __name__ == "__main__":
    mainFunction()
Edit 2
And now you've moved the definition of get_prediction_init() into the body of mainFunction. Now I can see your error :-)
As shown, define the function at module level instead. Trying to pickle local function objects can be a nightmare. Perhaps someone wants to fight with that, but not me ;-)
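A tiny illustration of the underlying problem (not your code, just the pickling behaviour that bites on Windows, where Pool has to pickle the initializer for the spawned workers; outer and inner are hypothetical names):

import pickle

def outer():
    def inner(q):
        pass
    return inner

pickle.dumps(outer)        # fine: a module-level function is pickled by reference
try:
    pickle.dumps(outer())  # a function defined inside another function cannot be pickled
except AttributeError as exc:
    print(exc)             # Can't pickle local object 'outer.<locals>.inner'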
I'm trying to run a function with multiprocessing. This is the code:
import multiprocessing as mu

output = []

def f(x):
    output.append(x*x)

jobs = []
np = mu.cpu_count()
for n in range(np*500):
    p = mu.Process(target=f, args=(n,))
    jobs.append(p)

running = []
for i in range(np):
    p = jobs.pop()
    running.append(p)
    p.start()

while jobs != []:
    for r in running:
        if r.exitcode == 0:
            try:
                running.remove(r)
                p = jobs.pop()
                p.start()
                running.append(p)
            except IndexError:
                break

print "Done:"
print output
The output is [], while it should be [1, 4, 9, ...]. Does anyone see where I'm making a mistake?
You are using multiprocessing, not threading. So your output list is not shared between the processes.
There are several possible solutions:
Retain most of your program but use a multiprocessing.Queue instead of a list. Let the workers put their results in the queue, and read them back in the main program (a sketch follows this list). It copies data from process to process, so for big chunks of data this will have significant overhead.
You could use shared memory in the form of multiprocessing.Array. This might be the best solution if the processed data is large.
Use a Pool. This takes care of all the process management for you. Just like with a queue, it copies data from process to process. It is probably the easiest to use. IMO this is the best option if the data sent to/from each worker is small.
Use threading so that the output list is shared between threads. Threading in CPython has the restriction that only one thread at a time can be executing Python bytecode, so you might not get as much performance benefit as you'd expect. And unlike the multiprocessing solutions it will not take advantage of multiple cores.
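A sketch of the Queue option mentioned above, based on the question's f(x) = x*x but with only 10 worker processes to keep it short:

import multiprocessing as mu

def f(x, q):
    q.put(x * x)

if __name__ == '__main__':
    q = mu.Queue()
    jobs = [mu.Process(target=f, args=(n, q)) for n in range(10)]
    for p in jobs:
        p.start()
    output = [q.get() for _ in jobs]  # drain the queue before joining
    for p in jobs:
        p.join()
    print(output)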
Edit:
Thanks to @Roland Smith for pointing this out.
The main problem is with the function f(x): when a child process calls it, it appends to its own copy of output, so the parent's output list never changes (it is not shared between processes).
Edit:
As @cdarke said, with multiprocessing you have to carefully control any shared object that the child processes can access (perhaps with a lock), and that is pretty complicated and hard to debug.
Personally I suggest using the Pool.map method for this.
For instance, assuming that you run this code directly rather than importing it as a module, your code would be:
import multiprocessing as mu

def f(x):
    return x*x

if __name__ == '__main__':
    np = mu.cpu_count()
    args = [n for n in range(np*500)]
    pool = mu.Pool(processes=np)
    result = pool.map(f, args)
    pool.close()
    pool.join()
    print result
But there is something you must know: the if __name__ == '__main__': guard is important when you run this file directly, because Python will import the file as a module in the worker processes. If you do not place the function f outside the if __name__ == '__main__': block, the child processes will not be able to find it.
Edit: thanks to @Roland Smith for pointing out that we could use a tuple.
If you have more than one argument for the function f, you can pack the arguments into a tuple, for instance:
def f((x, y)):
    return x*y

args = [(n, 1) for n in range(np*500)]
result = pool.map(f, args)
Or check here for a more detailed discussion.