I found this code online:
import multiprocessing
import time

def foo(n):
    for i in range(10000 * n):
        print(i)
        time.sleep(1)

if __name__ == '__main__':
    p = multiprocessing.Process(target=foo, name="Foo", args=(10,))
    p.start()
    time.sleep(10)
    p.terminate()
    p.join()
Is it somehow possible to use that code inside a function? Something like:
def test():
    def foo(n):
        for i in range(10000 * n):
            print(i)
            time.sleep(1)
    if __name__ == '__main__':
        p = multiprocessing.Process(target=foo, name="Foo", args=(10,))
        p.start()
        time.sleep(10)
        p.terminate()
        p.join()
When I tried it, there was always an error, but I'm still quite new to Python and most of the time have no idea what I'm really doing. Is there maybe an alternative? (I'm using Windows, so SIGALRM is probably not an option.)
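For reference, a minimal sketch of one way this can work on Windows: the worker has to stay at module level (under the spawn start method the child process imports the module, so a function nested inside test() cannot be pickled), while the process management can live in an ordinary function. run_with_timeout below is a hypothetical helper name, not anything from the original code:

import multiprocessing
import time

def foo(n):
    # The target must be defined at module level so the spawned
    # child process can import it on Windows.
    for i in range(10000 * n):
        print(i)
        time.sleep(1)

def run_with_timeout(timeout=10):
    # Hypothetical helper: start the worker, wait up to `timeout`
    # seconds, then kill it if it is still running.
    p = multiprocessing.Process(target=foo, name="Foo", args=(10,))
    p.start()
    p.join(timeout)
    if p.is_alive():
        p.terminate()
        p.join()

if __name__ == '__main__':
    run_with_timeout()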
Current Code
import multiprocessing as mu
import time

global_array = []

def add_array1(array):
    while True:
        time.sleep(2.5)
        global_array.append(1)
        print(global_array)

def add_array2(array):
    while True:
        time.sleep(3)
        global_array.append(2)
        print(global_array)

def runInParallel(*fns):
    if __name__ == '__main__':
        proc = []
        for fn in fns:
            p = mu.Process(target=fn)
            p.start()
            proc.append(p)
        for p in proc:
            p.join()

runInParallel(
    add_array1(global_array),
    add_array2(global_array)
)
When running my code above, only the first function, add_array1(), appends values to the array and prints them instead of both functions doing so, giving this wrong output:
[1]
[1, 1]
[1, 1, 1]
The actual desired output is:
[1]
[1, 2]
[1, 2, 1]
[1, 2, 1, 2]
Your problem is that the call
    runInParallel(add_array1(global_array), add_array2(global_array))
executes both functions and passes their return values as parameters to runInParallel. As add_array1 is an endless loop, it never returns from execution. You need to pass your functions as functions, not the return values of calling them, to runInParallel(...).
Start with
runInParallel(add_array1, add_array2)  # names of the functions, don't call them
and change
def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = mu.Process(target=fn, args=(global_array,))  # provide the param here
        p.start()
        proc.append(p)
    for p in proc:
        p.join()
and then fix the "not joining" problem caused by your worker functions never returning.
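One possible way to handle that, sketched below under the assumption that stopping the workers after a fixed time is acceptable: join with a timeout and terminate whatever is still alive. (Keep in mind that separate processes do not share global_array; each child appends to its own copy, which is why the threading answer further down behaves differently.)

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = mu.Process(target=fn, args=(global_array,))
        p.start()
        proc.append(p)
    for p in proc:
        p.join(timeout=10)    # wait up to 10 seconds per process
        if p.is_alive():
            p.terminate()     # stop workers that loop forever
            p.join()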
Example from the official documentation of multiprocessing.Process:
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

# Function name is f
def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    # f is provided, and args is provided - not f("bob")
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
You can use the code below to get your desired output. It uses threads rather than processes: threads share the same memory, so both workers append to the same global_array, whereas separate processes would each work on their own copy of it:
import threading
import time

global_array = []

def add_array1():
    while True:
        time.sleep(2.5)
        global_array.append(1)
        print(global_array)

def add_array2():
    while True:
        time.sleep(3)
        global_array.append(2)
        print(global_array)

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = threading.Thread(target=fn)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    runInParallel(
        add_array1,
        add_array2
    )
You can use this simple single-process code to get your desired output:
import time

global_array = []

def add_array1(array):
    while True:
        time.sleep(2.5)
        if len(global_array) % 2 == 0:
            global_array.append(1)
            print(global_array)
        else:
            global_array.append(2)
            print(global_array)

add_array1(global_array)
Thanks to "How to run functions in parallel?", the following code works:
import time
from multiprocessing import Process

def worker():
    time.sleep(2)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = Process(target=fn)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel(worker, worker, worker, worker)
    print("Total time taken: ", time.time() - start)
However, if I add an argument to worker(), it does not run in parallel anymore:
import time
from multiprocessing import Process

def worker(ii):
    time.sleep(ii)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = Process(target=fn)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel(worker(2), worker(2), worker(2), worker(2))
    print("Total time taken: ", time.time() - start)
What might be the reason for that?
You should modify runInParallel to accept (function, arguments) tuples and do iterable unpacking:
import time
from multiprocessing import Process

def worker(ii):
    time.sleep(ii)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        func, *args = fn
        p = Process(target=func, args=args)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel((worker, 2), (worker, 3), (worker, 5), (worker, 2))
    print("Total time taken: ", time.time() - start)
It's because of the difference between worker and worker(2). The first is the function itself; the latter is a function call. What happens on the line runInParallel(worker(2), worker(2), worker(2), worker(2)) is that all four calls run to completion before the execution of runInParallel even begins. If you add a print(fns) at the beginning of runInParallel, you will see the difference.
Quick fix:
def worker_caller():
    worker(2)
and:
runInParallel(worker_caller, worker_caller, worker_caller, worker_caller)
That's not very convenient, but it's mostly intended to show what the problem is. The problem is not in the function worker; the problem is that you're mixing up passing a function and passing a function call. If you changed your first version to
    runInParallel(worker(), worker(), worker(), worker())
then you would run into exactly the same issue.
But you can do this (with one caveat: lambdas cannot be pickled, so this works under the fork start method used by default on Linux, but fails under spawn, the default on Windows):
    runInParallel(lambda: worker(2), lambda: worker(2), lambda: worker(2), lambda: worker(2))
Lambdas are very useful. Here is another version:
a = lambda: worker(2)
b = lambda: worker(4)
c = lambda: worker(3)
d = lambda: worker(1)
runInParallel(a, b, c, d)
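A picklable alternative that should work on every platform is functools.partial wrapping a module-level function; a minimal sketch, reusing the worker and runInParallel defined above:

from functools import partial

# partial objects that wrap a module-level function are picklable,
# so they survive the spawn start method used on Windows.
runInParallel(
    partial(worker, 2),
    partial(worker, 4),
    partial(worker, 3),
    partial(worker, 1),
)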
To pass arguments, you need to pass them to the Process constructor:
p = Process(target=fn, args=(arg1,))
The Process constructor accepts args and kwargs parameters, which are passed on to the target function when the process runs.
The documentation is quite clear about this.
So your code should be modified to something like this:
def worker(ii):
    time.sleep(ii)
    print("Working")

def runInParallel(*fns):
    proc = []
    for fn in fns:
        p = Process(target=fn, args=(2,))
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    start = time.time()
    runInParallel(worker, worker, worker, worker)
    print("Total time taken: ", time.time() - start)
Of course the parameters can be different for each process; you need to arrange for the right one to be passed to each in args (or kwargs for keyword parameters). This can be achieved by passing tuples, such as runInParallel((worker, 2), (worker, 3), (worker, 5), (worker, 1)), and then unpacking the tuples inside runInParallel.
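A minimal sketch of that idea; the (function, args, kwargs) job convention and the greeting parameter are illustrative choices here, not anything from the thread:

import time
from multiprocessing import Process

def worker(ii, greeting="Working"):
    time.sleep(ii)
    print(greeting)

def runInParallel(*jobs):
    proc = []
    for func, args, kwargs in jobs:  # each job is (function, args tuple, kwargs dict)
        p = Process(target=func, args=args, kwargs=kwargs)
        p.start()
        proc.append(p)
    for p in proc:
        p.join()

if __name__ == '__main__':
    runInParallel(
        (worker, (2,), {}),
        (worker, (1,), {"greeting": "Still working"}),
    )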
I have my demo code shown below. I realize that all subprocesses have finished, but they do not exit. Is there anything wrong with my code? Python version: 3.7.4, operating system: Windows 10.
import multiprocessing as mp

res_queue = mp.Queue()

def runCalculation(i):
    count_list = []
    total_count = i
    for k in range(100000):
        total_count += k
        count_list.append(total_count)
    print('task {} finished calculation, putting results to queue'.format(i))
    for item in count_list:
        res_queue.put(item)
    print('task {} has put all results to queue'.format(i))

def initPool(res_queue_):
    global res_queue
    res_queue = res_queue_

def mainFunc():
    p = mp.Pool(initializer=initPool, initargs=(res_queue,))
    for i in range(20):
        p.apply_async(runCalculation, args=(i,))
    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')

if __name__ == '__main__':
    mainFunc()
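For what it's worth, the behaviour described matches a pitfall called out in the multiprocessing documentation: a process that has put items on a Queue will not terminate until all buffered items have been flushed to the underlying pipe, so joining before the queue has been drained can deadlock. A minimal sketch of a drain-then-join variant of mainFunc, assuming the task and item counts shown above:

def mainFunc():
    p = mp.Pool(initializer=initPool, initargs=(res_queue,))
    for i in range(20):
        p.apply_async(runCalculation, args=(i,))
    p.close()
    # Drain the queue BEFORE joining: each of the 20 tasks puts
    # 100000 items, and a worker cannot exit until its queue
    # buffer has been consumed.
    results = [res_queue.get() for _ in range(20 * 100000)]
    p.join()
    print('All subprocesses done, got {} results.'.format(len(results)))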
Here is an example:
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    p.map(function, )
yields the error: TypeError: map() missing 1 required positional argument: 'iterable'
The function does not need any input, so I would rather not artificially force it to take one. Or does multiprocessing require some iterable?
The following code returns / prints nothing. Why?
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    p.map(function, ())
If you are only trying to perform a small number of tasks, it may be better to use Process, for reasons described here.
This site provides an excellent tutorial on the use of Process(), which I have found helpful. Here is an example from the tutorial using your function():
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=function)
        jobs.append(p)
        p.start()
If you have no arguments to pass in, you don't have to use map. You can simply use multiprocessing.Pool.apply instead:
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    p.apply(function)
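As for why the p.map(function, ()) version prints nothing: map calls the function once per element of the iterable, and an empty tuple has no elements, so the function is never called. Note also that apply runs the function exactly once, in a single worker; to run a no-argument function several times in parallel, one possible sketch uses apply_async:

import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    # Schedule five independent calls, then wait for all of them.
    results = [p.apply_async(function) for _ in range(5)]
    p.close()
    p.join()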
I know the basic usage of multiprocessing pools, and I use apply_async() to avoid blocking. My problem code (Python 2) is like this:
from multiprocessing import Pool, Queue
import time

q = Queue(maxsize=20)
script = "my_path/my_exec_file"

def initQueue():
    ...

def test_func(queue):
    print 'Coming'
    while True:
        do_sth
        ...

if __name__ == '__main__':
    initQueue()
    pool = Pool(processes=3)
    for i in xrange(11, 20):
        result = pool.apply_async(test_func, (q,))
    pool.close()
    while True:
        if q.empty():
            print 'Queue is empty, quit'
            break
        print 'Main Process Listening'
        time.sleep(2)
The output is always 'Main Process Listening'; I never see the word 'Coming'.
The code above has no syntax errors and raises no exceptions.
Can anyone help? Thanks!
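A likely cause worth noting here: a plain multiprocessing.Queue cannot be passed as an argument to a Pool task; the worker fails with RuntimeError: Queue objects should only be shared between processes through inheritance, and apply_async silently holds that exception until .get() is called on the result, which this code never does. A minimal sketch (in Python 3 syntax) that surfaces the error by calling .get() and uses a Manager queue, whose proxy can be passed to pool workers:

from multiprocessing import Pool, Manager

def test_func(queue):
    print('Coming')
    queue.put('done')

if __name__ == '__main__':
    manager = Manager()
    q = manager.Queue()  # a Manager queue proxy is picklable
    pool = Pool(processes=3)
    results = [pool.apply_async(test_func, (q,)) for _ in range(9)]
    pool.close()
    for r in results:
        r.get()          # re-raises any exception from the worker
    pool.join()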