I would like the main Python process to create a child process that continuously updates an object (Node). The object needs to be accessible from both the main process and the child process. Once I add an instance of my Node object to an instance of manager.dict(), the main process blocks when trying to retrieve the Node object from it.
Below is a simplified version of the code.
test.py
from multiprocessing import Process, Manager
import time

class Node(object):
    def __init__(self, host):
        self.host = host
        self.refreshed = 0

    def refresh(self):
        self.refreshed = int(time.time())

    def __repr__(self):
        return 'Node host:%s' % (self.host,)

man = Manager()
d = man.dict()

def worker(d):
    while True:
        node = d['n1']
        node.refresh()
        d['n1'] = node
        time.sleep(3)

proc = Process(target=worker, args=(d,))
run.py
import test
test.d['n1'] = test.Node('localhost')
test.proc.start()
If I drop to the interpreter here and do test.d.items(), it blocks.
Update
If I alter the code to use a primitive value instead of a Node instance, e.g. incrementing an int, it works fine.
Update
If I move the code from run.py to the bottom of test.py so that everything is in the same scope, it works fine.
Try to put your code behind if __name__ == "__main__":
for example:
if __name__ == "__main__":
    man = Manager()
    d = man.dict()
    proc = Process(target=worker, args=(d,))
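For example, a minimal sketch that merges test.py and run.py into one script, creating the Manager, the shared dict and the Process only under the guard and passing the dict to the worker as an argument (the sleep, print and terminate at the end are only there to keep the example finite):

from multiprocessing import Process, Manager
import time

class Node(object):
    def __init__(self, host):
        self.host = host
        self.refreshed = 0

    def refresh(self):
        self.refreshed = int(time.time())

    def __repr__(self):
        return 'Node host:%s' % (self.host,)

def worker(d):
    while True:
        node = d['n1']        # fetch a local copy through the proxy
        node.refresh()
        d['n1'] = node        # write it back so the manager sees the change
        time.sleep(3)

if __name__ == '__main__':
    man = Manager()
    d = man.dict()
    d['n1'] = Node('localhost')
    proc = Process(target=worker, args=(d,))
    proc.start()
    time.sleep(7)
    print(d.items())          # no longer blocks
    proc.terminate()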
I have a class (MyClass) which contains a queue (self.msg_queue) of actions that need to be run and I have multiple sources of input that can add tasks to the queue.
Right now I have three functions that I want to run concurrently:
MyClass.get_input_from_user()
Creates a window in tkinter that has the user fill out information; when the user presses submit, it pushes that message onto the queue.
MyClass.get_input_from_server()
Checks the server for a message, reads the message, and then puts it onto the queue. This method uses functions from MyClass's parent class.
MyClass.execute_next_item_on_the_queue()
Pops a message off of the queue and then acts upon it. It is dependent on what the message is, but each message corresponds to some method in MyClass or its parent which gets run according to a big decision tree.
Process description:
After the class has joined the network, I have it spawn three threads (one for each of the above functions). Each threaded function adds items to the queue with "self.msg_queue.put(message)" and removes items from the queue with "self.msg_queue.get_nowait()".
Problem description:
The issue I am having is that each thread seems to be modifying its own queue object (they are not sharing msg_queue, the queue of the class of which the functions are all members).
I am not familiar enough with multiprocessing to know which error messages are important; however, it states that it cannot pickle a weakref object (it gives no indication of which object is the weakref object), and that within the queue.put() call the line self._sem.acquire(block, timeout) yields a '[WinError 5] Access is denied' error. Would it be safe to assume that this failure is the result of the queue's reference not copying over properly?
[I am using Python 3.7.2 and the Multiprocessing package's Process and Queue]
[I have seen multiple Q/As about having threads shuttle information between classes--create a master harness that generates a queue and then pass that queue as an argument to each thread. If the functions didn't have to use other functions from MyClass I could see adapting this strategy by having those functions take in a queue and use a local variable rather than class variables.]
[I am fairly confident that this error is not the result of passing my queue to the tkinter object as my unit tests on how my GUI modifies its caller's queue work fine]
Below is a minimal reproducible example for the queue's error:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    def input_function_A(self):
        while True:
            self.my_q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self):
        while True:
            self.counter = 0
            self.my_q.put(self.counter)
            time.sleep(1)

    def output_function(self):
        while True:
            try:
                var = self.my_q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        process_A = Process(target=self.input_function_A)
        process_B = Process(target=self.input_function_B)
        process_C = Process(target=self.output_function)

        process_A.start()
        process_B.start()
        process_C.start()

        # without this it generates the WinError:
        # with this it still behaves as if the two input functions do not modify the queue
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
Indeed - these are not "threads", these are "processes". If you were using multithreading, and not multiprocessing, the self.my_q instance would be the same object, placed at the same memory address on the computer.
Multiprocessing instead forks (or, on Windows, spawns) a new process, and any data in the original process (the one executing the "run" call) is duplicated when it is used - so each subprocess sees its own "Queue" instance, unrelated to the others.
The correct way to have several processes share a multiprocessing.Queue object is to pass it as a parameter to the target methods. The simplest way to reorganize your code so that it works is thus:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MyTest:
    def __init__(self):
        self.my_q = Queue()
        self.counter = 0

    def input_function_A(self, q):
        while True:
            q.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    def input_function_B(self, q):
        while True:
            self.counter = 0
            q.put(self.counter)
            time.sleep(1)

    def output_function(self, q):
        while True:
            try:
                var = q.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)

    def run(self):
        process_A = Process(target=self.input_function_A, args=(self.my_q,))
        process_B = Process(target=self.input_function_B, args=(self.my_q,))
        process_C = Process(target=self.output_function, args=(self.my_q,))

        process_A.start()
        process_B.start()
        process_C.start()

        # keep the parent process alive while the workers run
        process_C.join()

if __name__ == '__main__':
    test = MyTest()
    test.run()
As you can see, since your class is not actually sharing any data through the instance's attributes, this "class" design does not make much sense for your application, except as a way of grouping the different workers in the same code block.
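A minimal demonstration of that duplication, separate from the Queue question (the Counter class here is made up purely for illustration): an ordinary instance attribute mutated in a child process is not visible back in the parent.

from multiprocessing import Process

class Counter:
    def __init__(self):
        self.value = 0

    def bump(self):
        # runs in the child: only the child's copy of the object changes
        self.value += 1
        print("child sees", self.value)   # prints 1

if __name__ == '__main__':
    c = Counter()
    p = Process(target=c.bump)
    p.start()
    p.join()
    print("parent sees", c.value)         # still 0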
It would be possible to have a magic multiprocess class with an internal method that actually starts the worker methods and shares the Queue instance - so if you have a lot of these in a project, there would be a lot less boilerplate.
Something along these lines:
from multiprocessing import Queue
from multiprocessing import Process
import queue
import time

class MPWorkerBase:
    def __init__(self, *args, **kw):
        self.queue = None
        self.is_parent_process = False
        self.is_child_process = False
        self.processes = []
        # ensure this can be used as a collaborative mixin
        super().__init__(*args, **kw)

    def run(self):
        if self.is_parent_process or self.is_child_process:
            # workers already initialized
            return
        self.queue = Queue()
        processes = []
        cls = self.__class__
        for name in dir(cls):
            method = getattr(cls, name)
            if callable(method) and getattr(method, "_MP_worker", False):
                process = Process(target=self._start_worker, args=(self.queue, name))
                processes.append(process)
                process.start()
        # Setting these attributes here ensures the child processes have the initial values for them.
        self.is_parent_process = True
        self.processes = processes

    def _start_worker(self, queue, method_name):
        # this method is called in a new spawned process - attribute
        # changes here no longer reflect attributes on the
        # object in the initial process

        # overwrite queue in this process with the queue object sent over the wire:
        self.queue = queue
        self.is_child_process = True
        # call the worker method
        getattr(self, method_name)()

    def __del__(self):
        for process in self.processes:
            process.join()


def worker(func):
    """decorator to mark a method as a worker that should
    run in its own subprocess
    """
    func._MP_worker = True
    return func


class MyTest(MPWorkerBase):
    def __init__(self):
        super().__init__()
        self.counter = 0

    @worker
    def input_function_A(self):
        while True:
            self.queue.put(self.counter)
            self.counter = self.counter + 1
            time.sleep(0.2)

    @worker
    def input_function_B(self):
        while True:
            self.counter = 0
            self.queue.put(self.counter)
            time.sleep(1)

    @worker
    def output_function(self):
        while True:
            try:
                var = self.queue.get_nowait()
            except queue.Empty:
                var = -1
            except:
                break
            print(var)
            time.sleep(1)


if __name__ == '__main__':
    test = MyTest()
    test.run()
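To illustrate the reduced boilerplate, a second, hypothetical worker class could reuse MPWorkerBase and the worker decorator above without any extra wiring (the Heartbeat name and its methods are invented for this example; it is started the same way as MyTest, with a plain run() call):

class Heartbeat(MPWorkerBase):
    @worker
    def beat(self):
        while True:
            self.queue.put("ping")
            time.sleep(1)

    @worker
    def listen(self):
        while True:
            # blocks until the other worker sends something
            print(self.queue.get())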
I have a GUI where I want to do multiprocessing. My problem comes when I want to share a variable of a class that I have created.
I am trying to share it between the two modules that run simultaneously by using the multiprocessing.Queue([maxsize]) class.
But it doesn't work...
sch = sched()
q = Queue()
q.put([sch])

def foo1():
    sch = q.get([sch])
    event = Event(8)  # another class created
    sch.add_list(event)
    q.put([sch])
    time.sleep(12)

def foo2():
    time.sleep(4)
    sch = q.get([sch])
    q.put([sch])
    print(sch.event_list)

if __name__ == '__main__':
    p1 = Process(target=foo1)
    p2 = Process(target=foo2)
    p1.start()
    p2.start()
You have to pass the Queue instance as an argument to the processes.
from multiprocessing import Process, Queue
import time

def foo1(q):
    sch = q.get()[0]          # the queue holds a single-element list
    event = Event(8)          # another class created (from the question)
    sch.add_list(event)
    q.put([sch])
    time.sleep(12)

def foo2(q):
    time.sleep(4)
    sch = q.get()[0]
    q.put([sch])
    print(sch.event_list)

if __name__ == "__main__":
    q = Queue()
    sch = sched()
    q.put([sch])
    p1 = Process(target=foo1, args=(q,))
    p2 = Process(target=foo2, args=(q,))
    p1.start()
    p2.start()
When Python forks a new process it gets its own namespace, so it is not possible to access variables of your main program from within a child process.
If you want to share one object without passing it around through Pipes or Queues, you can also use the Python Manager classes (see https://docs.python.org/3/library/multiprocessing.html?#managers). A manager creates a shared object which can be accessed from child processes. I suppose sch is a list, so I used manager.list() in my example. For other data types just check the Python docs.
from multiprocessing import Process, Manager

def foo():
    sch = sch_shared  # Read from shared list

if __name__ == "__main__":
    sch = sched()
    mgr = Manager()
    sch_shared = mgr.list(sch)  # Create a shared list and return proxy
    p = Process(target=foo)
    p.start()
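Note that reading the module-level name sch_shared inside foo relies on the child being forked; on spawn-based platforms (e.g. Windows) the __main__ block is not re-executed in the child, so that name would not exist there. A minimal variation that passes the proxy explicitly instead (the sched class from the question is left out for brevity):

from multiprocessing import Process, Manager

def foo(sch_shared):
    print(len(sch_shared))  # the proxy behaves like a normal list

if __name__ == "__main__":
    mgr = Manager()
    sch_shared = mgr.list()               # shared list proxy
    p = Process(target=foo, args=(sch_shared,))
    p.start()
    p.join()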
I am trying to make 2 processes communicate with each other using the multiprocessing package in Python, and more precisely the Queue() class. From the parent process, I want to get an updated value from the child process every 5 seconds. This child process is a class function. I have done a toy example where everything works fine.
However, when I try to implement this solution in my project, it seems that the Queue.put() call of the child process in the sub-module won't send anything to the parent process: the parent process never prints the desired value and the code never stops running. Actually, the parent process only prints the value sent to the child process, which is True here, but, as I said, never stops.
So my questions are:
Is there any error in my toy example?
How should I modify my project in order to get it working just like my toy example?
Toy example: works
main module
from multiprocessing import Process, Event, Lock, Queue, Pipe
import time
import test_mod as test

def loop(output):
    stop_event = Event()
    q = Queue()
    child_process = Process(target=test.child.sub, args=(q,))
    child_process.start()
    i = 0
    print("started at {} ".format(time.time()))
    while not stop_event.is_set():
        i += 1
        time.sleep(5)
        q.put(True)
        print(q.get())
        if i == 5:
            child_process.terminate()
            stop_event.set()
    output.put("main process looped")

if __name__ == '__main__':
    stop_event, output = Event(), Queue()
    k = 0
    while k < 5:
        loop_process = Process(target=loop, args=(output,))
        loop_process.start()
        print(output.get())
        loop_process.join()
        k += 1
submodule
from multiprocessing import Process, Event, Lock, Queue, Pipe
import time

class child(object):
    def __init__(self):
        pass

    def sub(q):
        i = 0
        while i < 2000:
            latest_value = time.time()
            accord = q.get()
            if accord == True:
                q.put(latest_value)
                accord = False
            time.sleep(0.0000000005)
            i += 1
Project code: doesn't work
main module
import neat  # package in which the submodule is
import *some other stuff*

def run(config_file):
    config = neat.Config(some configuration)
    p = neat.Population(config)

    **WHERE MY PROBLEM IS**
    stop_event = Event()
    q = Queue()
    pe = neat.ParallelEvaluator(**args)
    child_process = Process(target=p.run, args=(pe.evaluate, q, other args))
    child_process.start()
    i = 0
    while not stop_event.is_set():
        q.put(True)
        print(q.get())
        time.sleep(5)
        i += 1
        if i == 5:
            child_process.terminate()
            stop_event.set()

if __name__ == '__main__':
    run(config_file)
submodule
class Population(object):
    def __init__():
        *initialization*

    def run(self, q, other args):
        while n is None or k < n:
            *some stuff*
            accord = add_2.get()
            if accord == True:
                add_2.put(self.best_genome.fitness)
                accord = False
        return self.best_genome
NB:
I am not used to multiprocessing.
I have tried to give the most relevant parts of my project, given that the entire code would be far too long.
I have also considered using Pipe(); however, that option didn't work either.
If I see it correctly, your desired submodule is the class Population. However, you start your process with a parameter of type ParallelEvaluator. Next, I can't see that you supply your Queue q to the child process. That's what I see in the code provided:
stop_event = Event()
q = Queue()
pe = neat.ParallelEvaluator(**args)
child_process = Process(target=p.run, args=(pe.evaluate, **args))
child_process.start()
Moreover, the following lines create a race condition:
q.put(True)
print(q.get())
The get command is like a pop: it takes an element and deletes it from the queue. If your sub-process doesn't access the queue between these two lines (because it is busy), the parent's own get consumes the True it just put, so it never makes it to the child process. Hence, it is better to use multiple queues, one for each direction. Something like:
stop_event = Event()
q_in = Queue()
q_out = Queue()
pe = neat.ParallelEvaluator(**args)
child_process = Process(target=p.run, args=(pe.evaluate, **args))
child_process.start()
i = 0
while not stop_event.is_set():
    q_in.put(True)
    print(q_out.get())
    time.sleep(5)
    i += 1
    if i == 5:
        child_process.terminate()
        stop_event.set()
This is your submodule
class Population(object):
    def __init__():
        *initialization*

    def run(self, **args):
        while n is None or k < n:
            *some stuff*
            accord = add_2.get()  # add_2 = q_in
            if accord == True:
                add_3.put(self.best_genome.fitness)  # add_3 = q_out
                accord = False
        return self.best_genome
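For reference, here is a self-contained toy version of that two-queue request/response pattern, independent of the neat library (all names are illustrative):

from multiprocessing import Process, Queue
import queue
import time

def child(q_in, q_out):
    latest_value = 0
    while True:
        latest_value += 1                # stand-in for the real work
        try:
            request = q_in.get(timeout=0.1)
        except queue.Empty:
            continue
        if request is True:
            q_out.put(latest_value)      # answer the parent's request

if __name__ == '__main__':
    q_in, q_out = Queue(), Queue()
    p = Process(target=child, args=(q_in, q_out))
    p.start()
    for _ in range(3):
        q_in.put(True)                   # ask for an update
        print(q_out.get())               # block until the child answers
        time.sleep(1)
    p.terminate()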
I have the following situation: process = Process(target=sample_object.run). I then would like to edit a property of the sample_object: sample_object.edit_property(some_other_object).
class sample_object:
    def __init__(self):
        self.storage = []

    def edit_property(self, some_other_object):
        self.storage.append(some_other_object)

    def run(self):
        while True:
            if len(self.storage) != 0:
                print("1")
                # I know it's an infinite loop. It's just an example.
_______________________________________________________
from multiprocessing import Process
from sample import sample_object
from sample2 import some_other_object

class driver:
    if __name__ == "__main__":
        samp = sample_object()
        proc = Process(target=samp.run)
        proc.start()
        while True:
            some = some_other_object()
            samp.edit_property(some)
            # I know it's an infinite loop
The previous code never prints "1". How would I connect the Process to the sample_object so that an edit made to the object whose method the Process is calling is recognized by the process? In other words, is there a way to get .run to recognize the change in sample_object?
Thank you.
You can use multiprocessing.Manager to share Python data structures between processes.
from multiprocessing import Process, Manager

class A(object):
    def __init__(self, storage):
        self.storage = storage

    def add(self, item):
        self.storage.append(item)

    def run(self):
        while True:
            if self.storage:
                print(1)

if __name__ == '__main__':
    manager = Manager()
    storage = manager.list()
    a = A(storage)
    p = Process(target=a.run)
    p.start()
    for i in range(10):
        a.add({'id': i})
    p.join()
I was wondering if it would be possible to create some sort of static set in a Python Process subclass to keep track of the types of processes that are currently running asynchronously.
from multiprocessing import Process
import winsound
import win32api

class showError(Process):
    # Define some form of shared set that is shared by all Processes
    displayed_errors = set()

    def __init__(self, file_name, error_type):
        super(showError, self).__init__()
        self.error_type = error_type

    def run(self):
        if error_type not in displayed_errors:
            displayed_errors.add(error_type)
            message = 'Please try again. ' + str(self.error_type)
            winsound.MessageBeep(-1)
            result = win32api.MessageBox(0, message, 'Error', 0x00001000)
            if result == 0:
                displayed_errors.discard(error_type)
That way, when I create/start multiple showError processes with the same error_type, subsequent error windows will not be created. So how can we define this shared set?
You can use a multiprocessing.Manager.dict (there's no set object available, but you can use a dict in the same way) and share that between all your subprocesses.
import multiprocessing as mp

if __name__ == "__main__":
    m = mp.Manager()
    displayed_errors = m.dict()
    subp = showError("some filename", "some error type", displayed_errors)
Then change showError.__init__ to accept the shared dict:
def __init__(self, file_name, error_type, displayed_errors):
    super(showError, self).__init__()
    self.error_type = error_type
    self.displayed_errors = displayed_errors
Then this:
displayed_errors.add(error_type)
Becomes:
self.displayed_errors[error_type] = 1
And this:
displayed_errors.discard(error_type)
Becomes:
try:
    del self.displayed_errors[error_type]
except KeyError:
    pass
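Putting the pieces together, a minimal sketch of how the fragments above could fit (the MessageBeep/MessageBox calls from the question are replaced by a print so the example stays self-contained and cross-platform; everything else follows the snippets above):

import multiprocessing as mp
from multiprocessing import Process

class showError(Process):
    def __init__(self, file_name, error_type, displayed_errors):
        super(showError, self).__init__()
        self.error_type = error_type
        self.displayed_errors = displayed_errors

    def run(self):
        if self.error_type not in self.displayed_errors:
            self.displayed_errors[self.error_type] = 1
            # stand-in for winsound.MessageBeep / win32api.MessageBox
            print('Please try again. ' + str(self.error_type))
            try:
                del self.displayed_errors[self.error_type]
            except KeyError:
                pass

if __name__ == "__main__":
    m = mp.Manager()
    displayed_errors = m.dict()
    procs = [showError("some filename", "some error type", displayed_errors)
             for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()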