Studying parallel programming in Python

import multiprocessing
from multiprocessing import Pool
from source.RUN import *

def func(r, grid, pos, h):
    return r, grid, pos, h

p = multiprocessing.Pool()  # Creates a pool with as many workers as you have CPU cores
results = []

if __name__ == '__main__':
    for i in pos[-1] < 2:
        results.append(Pool.apply_async(LISTE, (r, grid, pos[i, :], h)))
    p.close()
    p.join()
    for result in results:
        print('liste', result.get())
I want to create a Pool for (LISTE, (r, grid, pos[i, :], h)), where i indexes pos, an ndarray defined in a different file, and I have to call this whole function from another file inside a while loop. But this code gives an error, and if I use
if __name__ == '__main__':
execution never reaches the code below the if __name__ == '__main__': line.
Please give me an idea of how I can make this work.

I'm still having a somewhat difficult time understanding your question. But I think this is what you're looking for:
You want to be able to call a function that, given r, grid, pos, and h, creates a pool, iterates over pos, feeds the work to the Pool, and then returns the results. You also want to be able to call that function from different modules. If that's what you're asking, you can do it like this:
async_module.py:
from multiprocessing import Pool

# Not sure where the LISTE function gets defined, but it needs to be in here.
def do_LISTE(args):
    # args is a tuple containing (r, grid, pos[i, :], h);
    # we use tuple expansion (*args) to send each parameter to LISTE
    return LISTE(*args)

def async_process(r, grid, pos, h):
    p = Pool()  # Creates a pool with as many workers as you have CPU cores
    results = p.map(do_LISTE, [(r, grid, pos[i, :], h) for i in pos[-1] < 2])
    p.close()
    p.join()
    return results
Then in some other module:
from async_module import async_process

def do_async_processing():
    r = "something"
    grid = get_grid()
    pos = get_pos()
    h = 345
    results = async_process(r, grid, pos, h)

if __name__ == "__main__":
    do_async_processing()  # Make sure the entry point is protected by `if __name__ == "__main__":`.
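You also mentioned needing to call this from inside a while loop in another file. A minimal, hypothetical sketch of such a caller (run_in_loop, the loop condition, and max_iters are placeholders, not from the original question):
from async_module import async_process

def run_in_loop(r, grid, pos, h, max_iters=10):
    # Placeholder termination test; substitute the real loop condition.
    iteration = 0
    all_results = []
    while iteration < max_iters:
        all_results.append(async_process(r, grid, pos, h))
        iteration += 1
    return all_results
Note that async_process creates and destroys a Pool on every call; if the loop runs many times, it may be worth creating the Pool once outside the loop and reusing it.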

Related

Multiprocessing gets stuck after .get()

I'm trying to understand how multiprocessing works in python and having some issues.
This is the example:
import multiprocessing

def func():
    return 1

p = multiprocessing.Pool()
result = p.apply_async(func).get()
When .get() is called, the code just gets stuck. What am I doing wrong?
You need to put those 2 lines inside if __name__ == "__main__":,
so your code should look like this:
import multiprocessing

def func():
    return 1

if __name__ == "__main__":
    p = multiprocessing.Pool()
    result = p.apply_async(func).get()
When the module is imported (which each new worker process does), those lines would run again and cause an infinite sequence of new processes. Putting them inside the if block works because that block won't execute during import.
I do not have enough details to know exactly what the problem is,
but my strong guess is that putting those lines:
p = multiprocessing.Pool()
result = p.apply_async(func).get()
inside a function will fix your problem.
try this:
import multiprocessing

def func():
    return 1

def main():
    p = multiprocessing.Pool()
    result = p.apply_async(func).get()
    print(result)

if __name__ == '__main__':
    main()
tell me if it worked :)

Python - multiprocessing max # of processes

I would like to create and run at most N processes at once.
As soon as a process is finished, a new one should take its place.
The following code works (assuming Dostuff is the function to execute).
The problem is that I am using a loop and need time.sleep to allow
the processes to do their work. This is rather inefficient.
What's the best method for this task?
import time, multiprocessing

if __name__ == "__main__":
    Jobs = []
    for i in range(10):
        while len(Jobs) >= 4:
            NotDead = []
            for Job in Jobs:
                if Job.is_alive():
                    NotDead.append(Job)
            Jobs = NotDead
            time.sleep(0.05)
        NewJob = multiprocessing.Process(target=Dostuff)
        Jobs.append(NewJob)
        NewJob.start()
After a bit of tinkering, I thought about creating new threads and then
launching my processes from these threads like so:
import threading, multiprocessing, time

def processf(num):
    print("in process:", num)
    now = time.clock()
    while time.clock() - now < 2:
        pass  ##..Intensive processing..

def main():
    z = [0]
    lock = threading.Lock()
    def threadf():
        while z[0] < 20:
            lock.acquire()
            work = multiprocessing.Process(target=processf, args=(z[0],))
            z[0] = z[0] + 1
            lock.release()
            work.start()
            work.join()
    activet = []
    for i in range(2):
        newt = threading.Thread(target=threadf)
        activet.append(newt)
        newt.start()
    for i in activet:
        i.join()

if __name__ == "__main__":
    main()
This solution is better (it doesn't slow down the launched processes); however,
I wouldn't really trust code that I wrote in a field I don't know.
I've had to use a list (z = [0]) since an integer is immutable.
Is there a way to embed processf into main()? I'd prefer not needing an additional
global variable. If I simply copy/paste the function inside, I get a nasty error
(AttributeError: can't pickle local object 'main.<locals>.processf').
Why not use concurrent.futures.ThreadPoolExecutor?
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=20)
res = executor.submit(any_def)
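Since the original goal was to cap the number of processes rather than threads, a ProcessPoolExecutor sketch along the same lines may be closer to what was asked (Dostuff stands in for the real work, as in the question):
from concurrent.futures import ProcessPoolExecutor

def Dostuff():
    pass  # placeholder for the real work

if __name__ == "__main__":
    # At most 4 worker processes exist at once; as soon as a task finishes,
    # its worker picks up the next submitted task.
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(Dostuff) for _ in range(10)]
        for f in futures:
            f.result()  # wait for completion and surface any exceptions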

How do you use Python Multiprocessing for a function with zero positional arguments?

Here is an example:
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    p.map(function, )
yields the error: TypeError: map() missing 1 required positional argument: 'iterable'
The function does not need any input, so I'd rather not artificially force it to take one. Or does multiprocessing require some iterable?
The following code returns / prints nothing. Why?
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    p.map(function, ())
If you are only trying to perform a small number of tasks, it may be better to use Process, for reasons described here.
This site provides an excellent tutorial on the use of Process(), which I have found helpful. Here is an example from the tutorial using your function():
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=function)
        jobs.append(p)
        p.start()
If you have no arguments to pass in, you don't have to use map. You can simply use multiprocessing.Pool.apply instead:
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    p = multiprocessing.Pool(5)
    p.apply(function)
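Note that Pool.apply runs the function once in a single worker and blocks until it returns. If the goal was to run the zero-argument function several times in parallel, one option (a sketch, not taken from the answers above) is apply_async in a loop:
import multiprocessing

def function():
    for i in range(10):
        print(i)

if __name__ == '__main__':
    with multiprocessing.Pool(5) as p:
        # Submit 5 independent runs of the zero-argument function.
        async_results = [p.apply_async(function) for _ in range(5)]
        for res in async_results:
            res.get()  # wait for each run to finish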

Obtain results from processes using python multiprocessing

I am trying to understand how to use the multiprocessing module in Python. The code below spawns four processes and outputs the results as they become available. It seems to me that there must be a better way to obtain the results from the Queue; some method that does not rely on counting how many items the Queue contains, but that just returns items as they become available and then gracefully exits once the queue is empty. The docs say that the Queue.empty() method is not reliable. Is there a better alternative for how to consume the results from the queue?
import multiprocessing as mp
import time
def multby4_wq(x, queue):
    print "Starting!"
    time.sleep(5.0/x)
    a = x*4
    queue.put(a)

if __name__ == '__main__':
    queue1 = mp.Queue()
    for i in range(1, 5):
        p = mp.Process(target=multby4_wq, args=(i, queue1))
        p.start()
    for i in range(1, 5):  # This is what I am referring to as counting again
        print queue1.get()
Instead of using a queue, how about using a Pool?
For example,
import multiprocessing as mp
import time
def multby4_wq(x):
    print "Starting!"
    time.sleep(5.0/x)
    a = x*4
    return a

if __name__ == '__main__':
    pool = mp.Pool(4)
    for result in pool.map(multby4_wq, range(1, 5)):
        print result
Pass multiple arguments
Assume you have a function that accepts multiple parameters (add in this example). Make a wrapper function that passes the arguments through to add (add_wrapper).
import multiprocessing as mp
import time
def add(x, y):
    time.sleep(1)
    return x + y

def add_wrapper(args):
    return add(*args)

if __name__ == '__main__':
    pool = mp.Pool(4)
    for result in pool.map(add_wrapper, [(1,2), (3,4), (5,6), (7,8)]):
        print result
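On Python 3.3+, Pool.starmap makes the wrapper unnecessary; a minimal sketch of the same example in Python 3 syntax:
import multiprocessing as mp
import time

def add(x, y):
    time.sleep(1)
    return x + y

if __name__ == '__main__':
    with mp.Pool(4) as pool:
        # starmap unpacks each tuple into positional arguments for add
        for result in pool.starmap(add, [(1, 2), (3, 4), (5, 6), (7, 8)]):
            print(result)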

How do you pass a Queue reference to a function managed by pool.map_async()?

I want a long-running process to return its progress over a Queue (or something similar) which I will feed to a progress bar dialog. I also need the result when the process is completed. A test example here fails with a RuntimeError: Queue objects should only be shared between processes through inheritance.
import multiprocessing, time
def task(args):
    count = args[0]
    queue = args[1]
    for i in xrange(count):
        queue.put("%d mississippi" % i)
    return "Done"

def main():
    q = multiprocessing.Queue()
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(x, q) for x in range(10)])
    time.sleep(1)
    while not q.empty():
        print q.get()
    print result.get()

if __name__ == "__main__":
    main()
I've been able to get this to work using individual Process objects (where I am allowed to pass a Queue reference), but then I don't have a pool to manage the many processes I want to launch. Any advice on a better pattern for this?
The following code seems to work:
import multiprocessing, time
def task(args):
    count = args[0]
    queue = args[1]
    for i in xrange(count):
        queue.put("%d mississippi" % i)
    return "Done"

def main():
    manager = multiprocessing.Manager()
    q = manager.Queue()
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(x, q) for x in range(10)])
    time.sleep(1)
    while not q.empty():
        print q.get()
    print result.get()

if __name__ == "__main__":
    main()
Note that the Queue comes from manager.Queue() rather than multiprocessing.Queue(). Thanks Alex for pointing me in this direction.
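If the fixed time.sleep(1) feels fragile, one possible refinement (a sketch in Python 3 syntax, not from the original answer) is to drain the manager queue until the map_async result reports ready:
import multiprocessing
from queue import Empty

def task(args):
    count, queue = args
    for i in range(count):
        queue.put("%d mississippi" % i)
    return "Done"

def main():
    manager = multiprocessing.Manager()
    q = manager.Queue()
    pool = multiprocessing.Pool()
    result = pool.map_async(task, [(x, q) for x in range(10)])
    # Keep draining progress messages until every task has finished
    # and nothing is left in the queue.
    while not (result.ready() and q.empty()):
        try:
            print(q.get(timeout=0.1))
        except Empty:
            pass
    print(result.get())

if __name__ == "__main__":
    main()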
Making q global works...:
import multiprocessing, time
q = multiprocessing.Queue()
def task(count):
    for i in xrange(count):
        q.put("%d mississippi" % i)
    return "Done"

def main():
    pool = multiprocessing.Pool()
    result = pool.map_async(task, range(10))
    time.sleep(1)
    while not q.empty():
        print q.get()
    print result.get()

if __name__ == "__main__":
    main()
If you need multiple queues, e.g. to avoid mixing up the progress of the various pool processes, a global list of queues should work (of course, each process will then need to know what index in the list to use, but that's OK to pass as an argument;-).
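A minimal sketch of that idea in Python 3 syntax (it assumes a fork start method so the global queues are inherited by the pool workers, as in the answer above; the queue index travels alongside the count):
import multiprocessing, time

NUM_QUEUES = 2
queues = [multiprocessing.Queue() for _ in range(NUM_QUEUES)]  # global, inherited on fork

def task(args):
    count, queue_index = args
    for i in range(count):
        queues[queue_index].put("%d mississippi" % i)
    return "Done"

def main():
    pool = multiprocessing.Pool()
    # Spread the tasks across the queues so their progress doesn't get mixed up.
    result = pool.map_async(task, [(x, x % NUM_QUEUES) for x in range(10)])
    time.sleep(1)
    for index, q in enumerate(queues):
        while not q.empty():
            print("queue %d: %s" % (index, q.get()))
    print(result.get())

if __name__ == "__main__":
    main()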
