Python Multiprocessing shared variables erratic behavior

The following simple code should, as far as I can see, always print '0' at the end. However, when running it with "lock = True", it often prints out other positive or negative numbers.
import multiprocessing as mp
import sys
import time

num = mp.Value('d', 0.0, lock=False)

def func1():
    global num
    print('start func1')
    #while num.value < 100000:
    for x in range(1000):
        num.value += 1
        #print(num.value)
    print('end func1')

def func2():
    global num
    print('start func2')
    #while num.value > -10000:
    for x in range(1000):
        num.value -= 1
        #print(num.value)
    print('end func2')

if __name__ == '__main__':
    ctx = mp.get_context('fork')
    p1 = ctx.Process(target=func1)
    p1.start()
    p2 = ctx.Process(target=func2)
    p2.start()
    p1.join()
    p2.join()
    sys.stdout.flush()
    time.sleep(25)
    print(num.value)
Can anyone offer any explanation?
To clarify: when lock is set to "False" it behaves as expected, printing '0'; however, when it is "True" it often does not.
This is more noticeable/happens more often for larger values of 'range'.
Tested this on two platforms (macOS and Ubuntu 14.04.01), both with Python 3.6.

The docs for multiprocessing.Value are very explicit about this:
Operations like += which involve a read and write are not atomic. So if, for instance, you want to atomically increment a shared value it is insufficient to just do
counter.value += 1
Assuming the associated lock is recursive (which it is by default) you can instead do
with counter.get_lock():
    counter.value += 1
To your comment, this is not "1000 incrementations". This is 1000 iterations of:
# Take lock on num.value
temp_value = num.value # (1)
# release lock on num.value (anything can modify it now)
temp_value += 1 # (2)
# Take lock on num.value
num.value = temp_value # (3)
# release lock on num.value
That's what it means when it says += is not atomic.
If num.value is modified by another process during step (2), then step (3) will write the wrong value back to num.value.
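For the original example, a minimal fix is to hold the Value's own lock across each read-modify-write. This is just a sketch, assuming the Value is created with the default lock=True so that get_lock() is available:

import multiprocessing as mp

num = mp.Value('d', 0.0)  # default lock=True, so num.get_lock() exists

def func1():
    print('start func1')
    for x in range(1000):
        with num.get_lock():   # hold the lock across the whole read-modify-write
            num.value += 1
    print('end func1')

def func2():
    print('start func2')
    for x in range(1000):
        with num.get_lock():
            num.value -= 1
    print('end func2')

if __name__ == '__main__':
    ctx = mp.get_context('fork')
    p1 = ctx.Process(target=func1)
    p2 = ctx.Process(target=func2)
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(num.value)  # reliably 0.0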
To give an example of a better way to approach what you're doing, here's a version using Queues that ensures everything stays tick-tock in lock step:
import multiprocessing as mp
import queue
import sys

# An increment process. Takes a value, increments it, passes it along.
def func1(in_queue: mp.Queue, out_queue: mp.Queue):
    print('start func1')
    for x in range(1000):
        n = in_queue.get()
        n += 1
        print("inc", n)
        out_queue.put(n)
    print('end func1')

# A decrement process. Takes a value, decrements it, passes it along.
def func2(in_queue: mp.Queue, out_queue: mp.Queue):
    print('start func2')
    for x in range(1000):
        n = in_queue.get()
        n -= 1
        print("dec", n)
        out_queue.put(n)
    print('end func2')

if __name__ == '__main__':
    ctx = mp.get_context('fork')
    queue1 = mp.Queue()
    queue2 = mp.Queue()

    # Make two processes and tie their queues back to back. They hand a value
    # back and forth until they've run their course.
    p1 = ctx.Process(target=func1, args=(queue1, queue2))
    p1.start()
    p2 = ctx.Process(target=func2, args=(queue2, queue1))
    p2.start()

    # Get it started
    queue1.put(0)

    # Wait for them to finish
    p1.join()
    p2.join()

    # Since this is a looping process, the result is on the queue we put() to.
    # (Using block=False because I'd rather throw an exception if something
    # went wrong than deadlock.)
    num = queue1.get(block=False)
    print("FINAL=%d" % num)
This is a very simplistic example. In more robust code you need to think about what happens in failure cases. For example, if p1 throws an exception, p2 will deadlock waiting for its value. In many ways that's a good thing since it means you can recover the system by starting a new p1 process with the same queues. This way of dealing with concurrency is called the Actor model if you want to study it further.
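One simple way to guard against that kind of hang, sketched on top of the code above (the timeout value and the handling are just illustrative), is to give the blocking get() a timeout and treat a timeout as a failure signal:

import multiprocessing as mp
import queue  # multiprocessing.Queue.get() raises queue.Empty on timeout

def func2_safe(in_queue: mp.Queue, out_queue: mp.Queue):
    print('start func2')
    for x in range(1000):
        try:
            n = in_queue.get(timeout=10)  # give up if the peer seems to have died
        except queue.Empty:
            print('func2: no value arrived in time, assuming the other process failed')
            return
        n -= 1
        out_queue.put(n)
    print('end func2')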

Related

python problem with multiprocessing and for

I'd like to check how much difference the for loop makes with multiprocessing. It doesn't look like the for loop in the function do_something is actually executed when I run the code. Please help me figure out which part I got wrong.
The sum result keeps coming out as zero.
import time
import multiprocessing
from sys import stdout

sum = 0

def do_something():
    for i in range(1000):
        global sum
        sum = sum + 1
        progress = 100*(i+1)/1000  # progress percentage
        stdout.write("\r ===== %d%% completed =====" % progress)  # progress percentage
        stdout.flush()
    stdout.write("\n")
    # str=StringVar()

if __name__ == '__main__':
    start = time.perf_counter()
    processes = []
    for _ in range(1):
        p = multiprocessing.Process(target=do_something)  ##
        p.start()
        processes.append(p)
    for process in processes:
        process.join()
    finish = time.perf_counter()
    print(f'{round(finish-start,2)} sec completed')
print(sum)
#Result
0.16 sec completed
0
As @tdelaney commented, the subprocess that is created will be updating an instance of sum that "lives" in its own address space, distinct from the address space of the main process that launched it. The usual solution would be to pass do_something a multiprocessing.Queue instance that it can write the sum to and that the main process can then read (which should be done before joining the subprocess).
In the code below, however, I am using a multiprocessing.Pipe, on which multiprocessing.Queue is built. It is not as flexible as a queue in that it only readily supports a single reader and a single writer, but for this application that is all you need, and it is a much better performer. The call to Pipe() returns two connections, one for sending objects and the other for receiving objects.
Note that in your code the final print statement needs to be indented.
You should also refrain from naming variables the same as built-in functions, e.g. sum.
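For example, once the name sum is rebound to an int, the built-in is no longer reachable; a quick illustration:

sum = 0
sum = sum + 1
print(sum([1, 2, 3]))  # TypeError: 'int' object is not callable

With that out of the way, here is the version using a Pipe: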
import time
import multiprocessing
from sys import stdout

def do_something(send_conn):
    the_sum = 0
    for i in range(1000):
        the_sum = the_sum + 1
        progress = 100*(i+1)/1000  # progress percentage
        stdout.write("\r ===== %d%% completed =====" % progress)  # progress percentage
        stdout.flush()
    stdout.write("\n")
    send_conn.send(the_sum)
    # str=StringVar()

if __name__ == '__main__':
    start = time.perf_counter()
    read_conn, send_conn = multiprocessing.Pipe(duplex=False)
    p = multiprocessing.Process(target=do_something, args=(send_conn,))  ##
    p.start()
    the_sum = read_conn.recv()
    p.join()
    finish = time.perf_counter()
    print(f'{round(finish-start,2)} sec completed')
    print(the_sum)
Prints:
===== 100% completed =====
0.16 sec completed
1000
Here is the same code using a multiprocessing.Queue:
import time
import multiprocessing
from sys import stdout

def do_something(queue):
    the_sum = 0
    for i in range(1000):
        the_sum = the_sum + 1
        progress = 100*(i+1)/1000  # progress percentage
        stdout.write("\r ===== %d%% completed =====" % progress)  # progress percentage
        stdout.flush()
    stdout.write("\n")
    queue.put(the_sum)
    # str=StringVar()

if __name__ == '__main__':
    start = time.perf_counter()
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=do_something, args=(queue,))  ##
    p.start()
    the_sum = queue.get()
    p.join()
    finish = time.perf_counter()
    print(f'{round(finish-start,2)} sec completed')
    print(the_sum)
Prints:
===== 100% completed =====
0.17 sec completed
1000

multiprocessing value hangs with lock

I've read the documentation here, and it seems that to make sure the Value does not hang we need to use a lock. I did just that, but it still gets stuck:
from multiprocessing import Process, Value, freeze_support, Lock

nb_threads = 3
nbloops = 10
v = Value('i', 0)

def run_process(lock):
    global nbloops
    i = 0
    while i < nbloops:
        # do stuff
        i += 1
        with lock:
            v.value += 1
        # wait for all the processes to finish doing something
        while v.value % nb_threads != 0:
            pass

if __name__ == '__main__':
    freeze_support()
    processes = []
    lock = Lock()
    for i in range(0, 3):
        processes.append(Process(target=run_process, args=(lock,)))
    for process in processes:
        process.start()
    for process in processes:
        process.join()
I've tried accessing the value using lock but it still blocks:
val = -1
while val % nb_threads != 0:
    with lock:
        val = v.value
How can I fix this? Thanks
Your code has a race condition; you do not guarantee that all three processes break free from the while v.value % nb_threads != 0 loop before allowing them to move on. This allows one or two of the processes to move on to the next iteration of the while i < nbloops loop, increment v.value, and then prevent the remaining process/processes from ever breaking out of their own while v.value % nb_threads != 0 loop. The kind of synchronization you're trying to do there is best handled by a Barrier, rather than looping and repeatedly checking the value.
Also, multiprocessing.Value has built-in synchronization by default, and you can explicitly access the Lock it uses for that by calling Value.get_lock(), so there is no need to explicitly pass a Lock of your own to each process. Putting it all together, you have:
from multiprocessing import Process, Value, freeze_support, Lock, Barrier

nb_threads = 3
nbloops = 10
v = Value('i', 0)

def run_process(barrier):
    global nbloops
    i = 0
    while i < nbloops:
        # do stuff
        i += 1
        with v.get_lock():
            v.value += 1
        # wait for all the processes to finish doing something
        out = barrier.wait()

if __name__ == '__main__':
    freeze_support()
    processes = []
    b = Barrier(nb_threads)
    for i in range(0, nb_threads):
        processes.append(Process(target=run_process, args=(b,)))
    for process in processes:
        process.start()
    for process in processes:
        process.join()
The Barrier guarantees that no process can move on to the next iteration of the loop until all of them have called Barrier.wait(), at which point all three are simultaneously able to progress. The Barrier object supports re-use, so it can safely be called on each iteration.
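As a side note, Barrier.wait() returns an integer in range(parties), with each waiter receiving a different value, so exactly one process per round gets 0. That return value (stored but unused as out above) can be used to let a single process do some per-round work. A small sketch of that pattern, reusing v, nbloops, and the Barrier setup from the code above (the per-round print is just illustrative, not part of the question):

def run_process(barrier):
    for i in range(nbloops):
        with v.get_lock():
            v.value += 1
        idx = barrier.wait()   # blocks until all nb_threads processes arrive
        if idx == 0:           # exactly one waiter gets index 0 each round
            print('round', i, 'finished, v =', v.value)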

Solving deadlock in python using multiprocessing subprocess?

I am supposed to modify this code without changing the main function to stop it from deadlocking. It is deadlocking because of how the locks end up waiting for each other, but I cannot figure out how to stop it. My professor's lecture talks about os.fork, which I can't use since I am on Windows.
I was looking into the Pool approach in multiprocessing but can't see how to implement it without changing the main function. I am pretty sure I am supposed to use subprocess, but again, she didn't include any information about it and I can't find a relevant example online.
import threading

x = 0

def task(lock1, lock2, count):
    global x
    for i in range(count):
        lock1.acquire()
        lock2.acquire()
        # Assume that a thread can update the x value
        # only after both locks have been acquired.
        x += 1
        print(x)
        lock2.release()
        lock1.release()

# Do not modify the main method
def main():
    global x
    count = 1000
    lock1 = threading.Lock()
    lock2 = threading.Lock()
    T1 = threading.Thread(target=task, args=(lock1, lock2, count))
    T2 = threading.Thread(target=task, args=(lock2, lock1, count))
    T1.start()
    T2.start()
    T1.join()
    T2.join()
    print(f"x = {x}")

main()
Edit: Changing task to this seems to have fixed it, although I do not think it was done the way she wanted...
def task(lock1, lock2, count):
    global x
    for i in range(count):
        lock1.acquire(False)
        lock2.acquire(False)
        # Assume that a thread can update the x value
        # only after both locks have been acquired.
        x += 1
        print(x)
        if lock2.locked():
            lock2.release()
        if lock1.locked():
            lock1.release()
Your threads need to lock the locks in a consistent order. You can do this by locking the one with the lower id value first:
def task(lock1, lock2, count):
    global x
    if id(lock1) > id(lock2):
        lock1, lock2 = lock2, lock1
    for i in range(count):
        lock1.acquire()
        lock2.acquire()
        # Assume that a thread can update the x value
        # only after both locks have been acquired.
        x += 1
        print(x)
        lock2.release()
        lock1.release()
With a consistent lock order, it's impossible for two threads to each be holding a lock the other needs.
(multiprocessing, subprocess, and os.fork are all unhelpful here. They would just add more issues.)
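If the same two-lock pattern shows up in more than one place, the ordering rule can be packaged in a small helper so callers can't get it wrong. A sketch (the name ordered_locks is made up here, not part of the assignment), used as a drop-in replacement for task above:

from contextlib import contextmanager

@contextmanager
def ordered_locks(a, b):
    # Always acquire in a fixed global order (here: by id) to rule out deadlock.
    first, second = (a, b) if id(a) <= id(b) else (b, a)
    with first:
        with second:
            yield

def task(lock1, lock2, count):
    global x
    for i in range(count):
        with ordered_locks(lock1, lock2):
            x += 1
            print(x)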

How to kill a process using the multiprocessing module?

I have a process that is essentially just an infinite loop and I have a second process that is a timer. How can I kill the loop process once the timer is done?
import multiprocessing
import time

def action():
    x = 0
    while True:
        if x < 1000000:
            x = x + 1
        else:
            x = 0

def timer(time):
    time.sleep(time)
    exit()

loop_process = multiprocessing.Process(target=action)
loop_process.start()
timer_process = multiprocessing.Process(target=timer, args=(time,))
timer_process.start()
I want the python script to end once the timer is done.
You could do it by using shared state between the processes and creating a flag value that all the concurrent processes can access (although this may be somewhat inefficient).
Here's what I'm suggesting:
import multiprocessing as mp
import time

def action(run_flag):
    x = 0
    while run_flag.value:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
    print('action() terminating')

def timer(run_flag, secs):
    time.sleep(secs)
    run_flag.value = False

if __name__ == '__main__':
    run_flag = mp.Value('I', True)
    loop_process = mp.Process(target=action, args=(run_flag,))
    loop_process.start()
    timer_process = mp.Process(target=timer, args=(run_flag, 2.0))
    timer_process.start()
    loop_process.join()
    timer_process.join()
    print('done')
A simple return statement after the else in action() would work perfectly. Moreover, you had an error in your timer function: your argument had the same name as the built-in time module.
import multiprocessing
import time

def action():
    x = 0
    while True:
        if x < 1000000:
            x = x + 1
        else:
            x = 0
            return  # to exit; otherwise it will loop forever

def timer(times):
    time.sleep(times)
    exit()

loop_process = multiprocessing.Process(target=action)
loop_process.start()
timer_process = multiprocessing.Process(target=timer, args=(10,))
timer_process.start()
Hope this answers your question!!!
I think you don't need to make a second process just for a timer.
Graceful Timeout
In case you need to clean up before exit in your action process, you can use a Timer-thread and let the while-loop check if it is still alive. This allows your worker process to exit gracefully, but you'll have to pay with reduced performance
because the repeated method call takes some time. That doesn't have to be an issue if it's not a tight loop, though.
from multiprocessing import Process
from datetime import datetime
from threading import Timer

def action(runtime, x=0):
    timer = Timer(runtime, lambda: None)  # just returns None on timeout
    timer.start()
    while timer.is_alive():
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0

if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action, args=(RUNTIME,))
    p.start()
    print(f'{datetime.now()} {p.name} started')
    p.join()
    print(f'{datetime.now()} {p.name} ended')
Example Output:
2019-02-28 19:18:54.731207 Process-1 started
2019-02-28 19:18:55.738308 Process-1 ended
Termination on Timeout
If you don't have the need for a clean shut down (you are not using shared queues, working with DBs etc.), you can let the parent process terminate() the worker-process after your specified time.
terminate()
Terminate the process. On Unix this is done using the SIGTERM signal; on Windows TerminateProcess() is used. Note that exit handlers and finally clauses, etc., will not be executed.
Note that descendant processes of the process will not be terminated – they will simply become orphaned.
Warning If this method is used when the associated process is using a pipe or queue then the pipe or queue is liable to become corrupted and may become unusable by other process. Similarly, if the process has acquired a lock or semaphore etc. then terminating it is liable to cause other processes to deadlock. docs
If you don't have anything to do in the parent you can simply .join(timeout) the worker-process and .terminate() afterwards.
from multiprocessing import Process
from datetime import datetime

def action(x=0):
    while True:
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0

if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action)
    p.start()
    print(f'{datetime.now()} {p.name} started')
    p.join(RUNTIME)
    p.terminate()
    print(f'{datetime.now()} {p.name} terminated')
Example Output:
2019-02-28 19:22:43.705596 Process-1 started
2019-02-28 19:22:44.709255 Process-1 terminated
In case you want to use terminate(), but need your parent unblocked you could also use a Timer-thread within the parent for that.
from multiprocessing import Process
from datetime import datetime
from threading import Timer

def action(x=0):
    while True:
        if x < 1_000_000_000:
            x += 1
        else:
            x = 0

def timeout(process, timeout):
    timer = Timer(timeout, process.terminate)
    timer.start()

if __name__ == '__main__':
    RUNTIME = 1
    p = Process(target=action)
    p.start()
    print(f'{datetime.now()} {p.name} started')
    timeout(p, RUNTIME)
    p.join()
    print(f'{datetime.now()} {p.name} terminated')
Example Output:
2019-02-28 19:23:45.776951 Process-1 started
2019-02-28 19:23:46.778840 Process-1 terminated

How to use Python multiprocessing queue to access GPU (through PyOpenCL)?

I have code that takes a long time to run and so I've been investigating Python's multiprocessing library in order to speed things up. My code also has a few steps that utilize the GPU via PyOpenCL. The problem is, if I set multiple processes to run at the same time, they all end up trying to use the GPU at the same time, and that often results in one or more of the processes throwing an exception and quitting.
In order to work around this, I staggered the start of each process so that they'd be less likely to bump into each other:
process_list = []
num_procs = 4

# break data into chunks so each process gets its own chunk of the data
data_chunks = chunks(data, num_procs)
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the process
    p = multiprocessing.Process(target=test, args=(arg1, arg2))
    # Sticks the process in a list so that it remains accessible
    process_list.append(p)

# Start processes
j = 1
for process in process_list:
    print('\nStarting process %i' % j)
    process.start()
    time.sleep(5)
    j += 1

for process in process_list:
    process.join()
I also wrapped a try/except block around the function that calls the GPU so that if two processes DO try to access it at the same time, the one that doesn't get access will wait a couple of seconds and try again:
wait = 2
n = 0
while True:
    try:
        gpu_out = GPU_Obj.GPU_fn(params)
    except:
        time.sleep(wait)
        print('\n Waiting for GPU memory...')
        n += 1
        if n == 5:
            raise Exception('Tried and failed %i times to allocate memory for opencl kernel.' % n)
        continue
    break
This workaround is very clunky, and even though it works most of the time, processes occasionally throw exceptions, and I feel like there should be a more efficient/elegant solution using multiprocessing.Queue or something similar. However, I'm not sure how to integrate it with PyOpenCL for GPU access.
Sounds like you could use a multiprocessing.Lock to synchronize access to the GPU:
data_chunks = chunks(data, num_procs)
lock = multiprocessing.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the process
    p = multiprocessing.Process(target=test, args=(arg1, arg2, lock))
    ...
Then, inside test where you access the GPU:
with lock:  # Only one process will be allowed in this block at a time.
    gpu_out = GPU_Obj.GPU_fn(params)
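Put together end to end, the pattern looks roughly like this. This is only a sketch: gpu_call is a dummy standing in for GPU_Obj.GPU_fn, and the data is made up, since test and chunks aren't shown in the question:

import multiprocessing
import time

def gpu_call(params):
    # placeholder for the real PyOpenCL call
    time.sleep(0.1)
    return params * 2

def test(chunk, lock):
    results = []
    for params in chunk:
        with lock:  # only one process touches the GPU at a time
            results.append(gpu_call(params))
    print('processed', results)

if __name__ == '__main__':
    lock = multiprocessing.Lock()
    data_chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
    procs = [multiprocessing.Process(target=test, args=(chunk, lock))
             for chunk in data_chunks]
    for p in procs:
        p.start()
    for p in procs:
        p.join()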
Edit:
To do this with a pool, you'd do this:
# At global scope
lock = None

def init(_lock):
    global lock
    lock = _lock

data_chunks = chunks(data, num_procs)
lock = multiprocessing.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the pool
    p = multiprocessing.Pool(initializer=init, initargs=(lock,))
    p.apply(test, args=(arg1, arg2))
    ...
Or:
data_chunks = chunks(data, num_procs)
m = multiprocessing.Manager()
lock = m.Lock()
for chunk in data_chunks:
    if len(chunk) == 0:
        continue
    # Instantiates the pool
    p = multiprocessing.Pool()
    p.apply(test, args=(arg1, arg2, lock))
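For completeness, here's a runnable sketch of that second Pool variant (worker is a stand-in for the real test function, and the chunks are made up):

import multiprocessing

def worker(args):
    chunk, lock = args
    with lock:  # serialize the GPU section across pool workers
        print('GPU section for chunk', chunk)

if __name__ == '__main__':
    m = multiprocessing.Manager()
    lock = m.Lock()  # a Manager lock proxy can be passed to pool tasks
    data_chunks = [[1, 2], [3, 4], [5, 6]]
    with multiprocessing.Pool() as pool:
        pool.map(worker, [(chunk, lock) for chunk in data_chunks])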
