I've read the documentation here, and it seems that to make sure that the Value does not hang we need to use a lock. I did just that but it still gets stuck:
from multiprocessing import Process, Value, freeze_support, Lock
nb_threads = 3
nbloops = 10
v = Value('i', 0)
def run_process(lock):
    global nbloops
    i = 0
    while i < nbloops:
        # do stuff
        i += 1
        with lock:
            v.value += 1
        # wait for all the processes to finish doing something
        while v.value % nb_threads != 0:
            pass
if __name__ == '__main__':
    freeze_support()
    processes = []
    lock = Lock()
    for i in range(0, 3):
        processes.append( Process( target=run_process, args=(lock,) ) )
    for process in processes:
        process.start()
    for process in processes:
        process.join()
I've tried accessing the value using lock but it still blocks:
val = -1
while val % nb_threads != 0:
    with lock:
        val = v.value
How can I fix this? Thanks
Your code has a race condition; you do not guarantee that all three processes break free from the while v.value % nb_threads != 0 loop before allowing them to move on. This allows one or two of the processes to move on to the next iteration of the while i < nbloops loop, increment v.value, and then prevent the remaining process/processes from ever breaking out of their own while v.value % nb_threads != 0 loop. The kind of synchronization you're trying to do there is best handled by a Barrier, rather than looping and repeatedly checking the value.
Also, multiprocessing.Value has built-in synchronization by default, and you can explicitly access the Lock it uses for that by calling Value.get_lock(), so there is no need to explicitly pass a Lock of your own to each process. Putting it together, you have:
from multiprocessing import Process, Value, freeze_support, Lock, Barrier
nb_threads = 3
nbloops = 10
v = Value('i', 0)
def run_process(barrier):
    global nbloops
    i = 0
    while i < nbloops:
        # do stuff
        i += 1
        with v.get_lock():
            v.value += 1
        # wait for all the processes to finish doing something
        out = barrier.wait()
if __name__ == '__main__':
    freeze_support()
    processes = []
    b = Barrier(nb_threads)
    for i in range(0, nb_threads):
        processes.append( Process( target=run_process, args=(b,) ) )
    for process in processes:
        process.start()
    for process in processes:
        process.join()
The Barrier guarantees that no process can move on to the next iteration of the loop until all of them have called Barrier.wait(), at which point all three are simultaneously able to progress. The Barrier object supports re-use, so it can safely be called on each iteration.
import multiprocessing
global stop
stop = False
def makeprocesses():
    processes = []
    for _ in range(50):
        p = multiprocessing.Process(target=runprocess)
        processes.append(p)
    for _ in range(50):
        processes[_].start()
    runprocess()
def runprocess():
    global stop
    while stop == False:
        x = 1 #do something here
        if x == 1:
            stop = True
makeprocesses()
while stop == True:
x = 0
makeprocesses()
How could I make all the other 49 processes stop if just one changes stop to True?
I would think since stop is a global variable once one process changes stop all the others would stop.
No. Each process gets its own copy. It's global to the script, but not across processes. Remember that each process has a completely separate address space. It gets a COPY of the first process' data.
If you need to communicate across processes, you need to use one of the synchronization techniques in the multiprocessing documentation (https://docs.python.org/3/library/multiprocessing.html#synchronization-primitives), like an Event or a shared object.
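As a minimal sketch of the difference described above: a plain module-level global changed in a child process is not seen by the parent, while a multiprocessing.Value is. The names plain_flag, worker and run_demo are illustrative assumptions, not part of the question's code.

```python
import multiprocessing as mp

plain_flag = False  # ordinary global: every process works on its own copy


def worker(shared_flag):
    global plain_flag
    plain_flag = True        # changes only the child's copy
    shared_flag.value = 1    # lives in shared memory, so the parent sees it


def run_demo(ctx=None):
    ctx = ctx or mp.get_context()
    shared_flag = ctx.Value('i', 0)
    p = ctx.Process(target=worker, args=(shared_flag,))
    p.start()
    p.join()
    # Read both flags back in the parent after the child has finished.
    return plain_flag, shared_flag.value


if __name__ == '__main__':
    print(run_demo())
```

In the parent, the plain global keeps its original value while the shared Value reflects the child's write.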
Whenever you want to synchronise processes you need some shared context, and you need to make sure access to it is safe. As Tim Roberts mentioned, suitable primitives can be found in the multiprocessing documentation (https://docs.python.org/3/library/multiprocessing.html#synchronization-primitives).
Try something like this:
import multiprocessing
from multiprocessing import Event
from time import sleep
def makeprocesses():
    processes = []
    e = Event()
    for i in range(50):
        p = multiprocessing.Process(target=runprocess, args=(e, i))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
def runprocess(e, name=0):
    while not e.is_set():
        sleep(1)
        if name == 1:
            e.set() # here we make all other processes stop
    print("end")
if __name__ == '__main__':
    makeprocesses()
My favorite way is to use a cancellation token, which is an object wrapping what we did here.
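One possible shape for such a token, as a rough sketch only: the CancellationToken class and its method names are hypothetical (the standard library has no such class), and it simply wraps a multiprocessing Event.

```python
import multiprocessing as mp


class CancellationToken:
    """Hypothetical thin wrapper around a multiprocessing Event."""

    def __init__(self, ctx=None):
        self._event = (ctx or mp.get_context()).Event()

    def cancel(self):
        self._event.set()

    @property
    def cancelled(self):
        return self._event.is_set()

    def wait(self, timeout=None):
        # Returns True as soon as the token is cancelled, False on timeout.
        return self._event.wait(timeout)


def worker(token, name):
    while not token.cancelled:
        if name == 1:
            token.cancel()       # one worker asks everyone to stop
        else:
            token.wait(0.05)     # idle briefly, but wake early on cancellation


def run_demo(n_workers=4, ctx=None):
    ctx = ctx or mp.get_context()
    token = CancellationToken(ctx)
    procs = [ctx.Process(target=worker, args=(token, i)) for i in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join(timeout=10)
    return [p.exitcode for p in procs]


if __name__ == '__main__':
    print(run_demo())
```

Waiting on the token instead of calling sleep() lets the other workers react to the cancellation without finishing a full sleep interval first.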
I have a process that is essentially just an infinite loop and I have a second process that is a timer. How can I kill the loop process once the timer is done?
def action():
    x = 0
    while True:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
def timer(time):
    time.sleep(time)
    exit()
loop_process = multiprocessing.Process(target=action)
loop_process.start()
timer_process = multiprocessing.Process(target=timer, args=(time,))
timer_process.start()
I want the python script to end once the timer is done.
You could do it by sharing state between the processes: create a flag value that all the concurrent processes can access (although this may be somewhat inefficient).
Here's what I'm suggesting:
import multiprocessing as mp
import time
def action(run_flag):
    x = 0
    while run_flag.value:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
    print('action() terminating')
def timer(run_flag, secs):
    time.sleep(secs)
    run_flag.value = False
if __name__ == '__main__':
    run_flag = mp.Value('I', True)
    loop_process = mp.Process(target=action, args=(run_flag,))
    loop_process.start()
    timer_process = mp.Process(target=timer, args=(run_flag, 2.0))
    timer_process.start()
    loop_process.join()
    timer_process.join()
    print('done')
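A simple return statement in the else branch of action() would work: the loop then exits as soon as x reaches 1,000,000. Moreover, you had an error in your timer function: its argument had the same name as the standard library module time, so time.sleep could not be called inside it.
import multiprocessing
import time
def action():
    x = 0
    while True:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
            return # exit here; otherwise the loop runs forever
def timer(times):
    time.sleep(times)
    exit()
if __name__ == '__main__':
    loop_process = multiprocessing.Process(target=action)
    loop_process.start()
    # pass the function and its arguments separately; target=timer(10) would call timer in the parent
    timer_process = multiprocessing.Process(target=timer, args=(10,))
    timer_process.start()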
Hope this answers your question!!!
I think you don't need to make a second process just for a timer.
Graceful Timeout
In case you need to clean up before exiting in your action process, you can use a Timer thread and let the while loop check whether it is still alive. This allows your worker process to exit gracefully, but you pay with reduced performance because the repeated method call takes some time. That doesn't have to be an issue if it's not a tight loop, though.
from multiprocessing import Process
from datetime import datetime
from threading import Timer
def action(runtime, x=0):
    timer = Timer(runtime, lambda: None) # just returns None on timeout
    timer.start()
    while timer.is_alive():
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0
if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action, args=(RUNTIME,))
    p.start()
    print(f'{datetime.now()} {p.name} started')
    p.join()
    print(f'{datetime.now()} {p.name} ended')
Example Output:
2019-02-28 19:18:54.731207 Process-1 started
2019-02-28 19:18:55.738308 Process-1 ended
Termination on Timeout
If you don't have the need for a clean shut down (you are not using shared queues, working with DBs etc.), you can let the parent process terminate() the worker-process after your specified time.
terminate()
Terminate the process. On Unix this is done using the SIGTERM signal; on Windows TerminateProcess() is used. Note that exit handlers and finally clauses, etc., will not be executed.
Note that descendant processes of the process will not be terminated – they will simply become orphaned.
Warning If this method is used when the associated process is using a pipe or queue then the pipe or queue is liable to become corrupted and may become unusable by other process. Similarly, if the process has acquired a lock or semaphore etc. then terminating it is liable to cause other processes to deadlock. docs
If you don't have anything to do in the parent you can simply .join(timeout) the worker-process and .terminate() afterwards.
from multiprocessing import Process
from datetime import datetime
def action(x=0):
    while True:
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0
if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action)
    p.start()
    print(f'{datetime.now()} {p.name} started')
    p.join(RUNTIME)
    p.terminate()
    print(f'{datetime.now()} {p.name} terminated')
Example Output:
2019-02-28 19:22:43.705596 Process-1 started
2019-02-28 19:22:44.709255 Process-1 terminated
In case you want to use terminate(), but need your parent unblocked you could also use a Timer-thread within the parent for that.
from multiprocessing import Process
from datetime import datetime
from threading import Timer
def action(x=0):
    while True:
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0
def timeout(process, timeout):
    timer = Timer(timeout, process.terminate)
    timer.start()
if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action)
    p.start()
    print(f'{datetime.now()} {p.name} started')
    timeout(p, RUNTIME)
    p.join()
    print(f'{datetime.now()} {p.name} terminated')
Example Output:
2019-02-28 19:23:45.776951 Process-1 started
2019-02-28 19:23:46.778840 Process-1 terminated
I've read a number of answers here on Stackoverflow about Python multiprocessing, and I think this one is the most useful for my purposes: python multiprocessing queue implementation.
Here is what I'd like to do: poll the database for new work, put it in the queue and have 4 processes continuously do the work. What I'm unclear on is what happens when an item in the queue is done being processed. In the question above, the process terminates when the queue is empty. However, in my case, I'd just like to keep waiting until there is data in the queue. So do I just sleep and periodically check the queue? So my worker processes will never die? Is that good practice?
def mp_worker(queue):
    while True:
        if (queue.qsize() == 0):
            time.sleep(20)
        else:
            db_record = queue.get()
            process_file(db_record)
def mp_handler():
    num_workers = 4
    processes = [Process(target=mp_worker, args=(queue,)) for _ in range(num_workers)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
if __name__ == '__main__':
    DB_CONN = db.create_postgre_connection(DB_CONFIG)
    while True:
        db_records = db.retrieve_received_files(DB_CONN)
        if (len(db_records) > 0):
            for db_record in db_records:
                queue.put(db_record)
            mp_handler()
        else:
            time.sleep(20)
    DB_CONN.close()
Does it make sense?
Thanks.
Figured it out. Workers have to die, since otherwise they never return. But I start a new set of workers when there is data anyway, so that's not a problem. Updated code:
def mp_worker(queue):
    while queue.qsize() > 0 :
        db_record = queue.get()
        process_file(db_record)
def mp_handler():
    num_workers = 4
    if (queue.qsize() < num_workers):
        num_workers = queue.qsize()
    processes = [Process(target=mp_worker, args=(queue,)) for _ in range(num_workers)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
if __name__ == '__main__':
    while True:
        db_records = db.retrieve_received_files(DB_CONN)
        print(db_records)
        if (len(db_records) > 0):
            for db_record in db_records:
                queue.put(db_record)
            mp_handler()
        else:
            time.sleep(20)
    DB_CONN.close()
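An alternative worth considering, sketched here under assumptions: workers can stay alive without sleeping or calling qsize() (which is only approximate and can raise NotImplementedError on macOS), because queue.get() blocks until an item arrives. A sentinel value tells each worker when to exit. The process_file body, the result queue, and run_demo are hypothetical stand-ins for the real database work.

```python
import multiprocessing as mp

STOP = None  # sentinel: one per worker tells it to exit


def process_file(record):
    # Hypothetical stand-in for the real processing of a database record.
    return record * 2


def mp_worker(task_queue, result_queue):
    while True:
        record = task_queue.get()      # blocks until work arrives, no polling
        if record is STOP:
            break
        result_queue.put(process_file(record))


def run_demo(records, num_workers=4, ctx=None):
    ctx = ctx or mp.get_context()
    task_queue, result_queue = ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=mp_worker, args=(task_queue, result_queue))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    for record in records:             # in the real program: rows polled from the DB
        task_queue.put(record)
    # Collect results before joining so no worker blocks on a full pipe.
    results = [result_queue.get() for _ in records]
    for _ in workers:
        task_queue.put(STOP)
    for w in workers:
        w.join()
    return sorted(results)


if __name__ == '__main__':
    print(run_demo([1, 2, 3, 4, 5], num_workers=2))
```

With this layout the main loop only has to poll the database and put new records on the queue; the same workers keep serving it until they receive the sentinel.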
The following simple code should, as far I can see, always print out '0' in the end. However, when running it with "lock = True", it often prints out other positive or negative numbers.
import multiprocessing as mp
import sys
import time
num = mp.Value('d', 0.0, lock = False)
def func1():
    global num
    print ('start func1')
    #While num.value < 100000:
    for x in range(1000):
        num.value += 1
        #print(num.value)
    print ('end func1')
def func2():
    global num
    print ('start func2')
    #while num.value > -10000:
    for x in range(1000):
        num.value -= 1
        #print(num.value)
    print ('end func2')
if __name__=='__main__':
    ctx = mp.get_context('fork')
    p1 = ctx.Process(target=func1)
    p1.start()
    p2 = ctx.Process(target=func2)
    p2.start()
    p1.join()
    p2.join()
    sys.stdout.flush()
    time.sleep(25)
    print(num.value)
Can anyone offer any explanation?
To clarify: When lock is set to "False", it behaves as expected, printing out '0', however, when it is "True" it often does not.
This is more noticeable/happens more often for larger values of 'range'.
Tested this on two platforms (Mac OS X and Ubuntu 14.04.01), both with Python 3.6.
The docs for multiprocessing.Value are very explicit about this:
Operations like += which involve a read and write are not atomic. So if, for instance, you want to atomically increment a shared value it is insufficient to just do
counter.value += 1
Assuming the associated lock is recursive (which it is by default) you can instead do
with counter.get_lock():
    counter.value += 1
To your comment, this is not "1000 incrementations". This is 1000 iterations of:
# Take lock on num.value
temp_value = num.value # (1)
# release lock on num.value (anything can modify it now)
temp_value += 1 # (2)
# Take lock on num.value
num.value = temp_value # (3)
# release lock on num.value
That's what it means when it says += is not atomic.
If num.value is modified by another process during line 2, then line 3 will write the wrong value to num.value.
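As a minimal sketch of the locked read-modify-write that the docs recommend, applied to the increment/decrement pair from the question: the lock is held across steps (1) to (3), so no other process can slip in between the read and the write. The function names add and run_demo are illustrative assumptions.

```python
import multiprocessing as mp


def add(counter, delta, n):
    for _ in range(n):
        with counter.get_lock():       # hold the lock across the read AND the write
            counter.value += delta


def run_demo(n=1000, ctx=None):
    ctx = ctx or mp.get_context()
    counter = ctx.Value('d', 0.0)      # lock=True is the default
    p1 = ctx.Process(target=add, args=(counter, +1, n))
    p2 = ctx.Process(target=add, args=(counter, -1, n))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    return counter.value


if __name__ == '__main__':
    print(run_demo())
```

Because the default lock is recursive, accessing counter.value inside the with block does not deadlock.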
To give an example of a better way to approach what you're doing, here's a version using Queues that ensures everything stays tick-tock in lock step:
import multiprocessing as mp
import queue
import sys
# An increment process. Takes a value, increments it, passes it along
def func1(in_queue: mp.Queue, out_queue: mp.Queue):
    print('start func1')
    for x in range(1000):
        n = in_queue.get()
        n += 1
        print("inc", n)
        out_queue.put(n)
    print('end func1')
# A decrement process. Takes a value, decrements it, passes it along
def func2(in_queue: mp.Queue, out_queue: mp.Queue):
    print('start func2')
    for x in range(1000):
        n = in_queue.get()
        n -= 1
        print("dec", n)
        out_queue.put(n)
    print('end func2')
if __name__ == '__main__':
    ctx = mp.get_context('fork')
    # Create the queues from the same context as the processes
    queue1 = ctx.Queue()
    queue2 = ctx.Queue()
    # Make two processes and tie their queues back to back. They hand a value
    # back and forth until they've run their course.
    p1 = ctx.Process(target=func1, args=(queue1, queue2,))
    p1.start()
    p2 = ctx.Process(target=func2, args=(queue2, queue1,))
    p2.start()
    # Get it started
    queue1.put(0)
    # Wait for them to finish
    p1.join()
    p2.join()
    # Since this is a looping process, the result is on the queue we put() to.
    # (Using block=False because I'd rather throw an exception if something
    # went wrong rather than deadlock.)
    num = queue1.get(block=False)
    print("FINAL=%d" % num)
This is a very simplistic example. In more robust code you need to think about what happens in failure cases. For example, if p1 throws an exception, p2 will deadlock waiting for its value. In many ways that's a good thing since it means you can recover the system by starting a new p1 process with the same queues. This way of dealing with concurrency is called the Actor model if you want to study it further.
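One way to handle the failure case mentioned above, sketched under assumptions: give get() a timeout so that a process whose partner has died gives up instead of waiting forever. The relay function, the timeout value and the exit code 1 are illustrative choices, not part of the answer's code. The demo deliberately starts only one side, so its partner never delivers anything.

```python
import multiprocessing as mp
import queue


def relay(in_queue, out_queue, delta, rounds, timeout):
    for _ in range(rounds):
        try:
            n = in_queue.get(timeout=timeout)
        except queue.Empty:
            # The peer never delivered a value: stop instead of blocking forever.
            raise SystemExit(1)
        out_queue.put(n + delta)


def run_demo(timeout=0.2, ctx=None):
    ctx = ctx or mp.get_context()
    q1, q2 = ctx.Queue(), ctx.Queue()
    # Start only one side; nothing is ever put on q1, simulating a dead partner.
    p = ctx.Process(target=relay, args=(q1, q2, 1, 3, timeout))
    p.start()
    p.join()
    return p.exitcode


if __name__ == '__main__':
    print(run_demo())
```

The parent can inspect the exit code and decide whether to start a replacement process on the same queues.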
I have code that takes a long time to run and so I've been investigating Python's multiprocessing library in order to speed things up. My code also has a few steps that utilize the GPU via PyOpenCL. The problem is, if I set multiple processes to run at the same time, they all end up trying to use the GPU at the same time, and that often results in one or more of the processes throwing an exception and quitting.
In order to work around this, I staggered the start of each process so that they'd be less likely to bump into each other:
process_list = []
num_procs = 4
# break data into chunks so each process gets its own chunk of the data
data_chunks = chunks(data,num_procs)
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the process
    p = multiprocessing.Process(target=test, args=(arg1,arg2))
    # Sticks the process in a list so that it remains accessible
    process_list.append(p)
# Start processes
j = 1
for process in process_list:
    print('\nStarting process %i' % j)
    process.start()
    time.sleep(5)
    j += 1
for process in process_list:
    process.join()
I also wrapped a try except loop around the function that calls the GPU so that if two processes DO try to access it at the same time, the one who doesn't get access will wait a couple of seconds and try again:
wait = 2
n = 0
while True:
    try:
        gpu_out = GPU_Obj.GPU_fn(params)
    except:
        time.sleep(wait)
        print('\n Waiting for GPU memory...')
        n += 1
        if n == 5:
            raise Exception('Tried and failed %i times to allocate memory for opencl kernel.' % n)
        continue
    break
This workaround is very clunky and even though it works most of the time, processes occasionally throw exceptions and I feel like there should be a more efficient/elegant solution using multiprocessing.Queue or something similar. However, I'm not sure how to integrate it with PyOpenCL for GPU access.
Sounds like you could use a multiprocessing.Lock to synchronize access to the GPU:
data_chunks = chunks(data,num_procs)
lock = multiprocessing.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the process
    p = multiprocessing.Process(target=test, args=(arg1,arg2, lock))
    ...
Then, inside test where you access the GPU:
with lock: # Only one process will be allowed in this block at a time.
    gpu_out = GPU_Obj.GPU_fn(params)
Edit:
To do this with a pool, you'd do this:
# At global scope
lock = None
def init(_lock):
    global lock
    lock = _lock
data_chunks = chunks(data,num_procs)
lock = multiprocessing.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the process
    p = multiprocessing.Pool(initializer=init, initargs=(lock,))
    p.apply(test, args=(arg1, arg2))
    ...
Or:
data_chunks = chunks(data,num_procs)
m = multiprocessing.Manager()
lock = m.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the process
    p = multiprocessing.Pool()
    p.apply(test, args=(arg1, arg2, lock))
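For reference, a self-contained sketch of the initializer variant using a single pool for all chunks rather than one pool per chunk. This is an assumption about how the pieces might fit together, not the asker's actual code: gpu_fn is a hypothetical stand-in for GPU_Obj.GPU_fn(params), and process_chunk and run_demo are illustrative names.

```python
import multiprocessing as mp

gpu_lock = None  # set in every pool worker by init()


def init(lock):
    global gpu_lock
    gpu_lock = lock


def gpu_fn(chunk):
    # Hypothetical stand-in for the real GPU call.
    return sum(chunk)


def process_chunk(chunk):
    with gpu_lock:                     # only one worker runs the GPU step at a time
        return gpu_fn(chunk)


def run_demo(data_chunks, num_procs=4, ctx=None):
    ctx = ctx or mp.get_context()
    lock = ctx.Lock()
    non_empty = [c for c in data_chunks if len(c) > 0]
    # The lock is handed to each worker once, when the pool starts it.
    with ctx.Pool(num_procs, initializer=init, initargs=(lock,)) as pool:
        return pool.map(process_chunk, non_empty)


if __name__ == '__main__':
    print(run_demo([[1, 2], [], [3, 4, 5]], num_procs=2))
```

Any CPU-side work in process_chunk that sits outside the with block can still run in parallel; only the GPU step is serialized.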