How can I pass a Lock object to a subclass of multiprocessing.Process? I've tried this and I faced a pickling error.
from multiprocessing import Process
from threading import Lock

class myProcess (Process):
    def setLock (self , lock) :
        self.lock = lock

    def run(self) :
        with self.lock :
            # do stuff

if __name__ == '__main__' :
    lock = Lock()
    proc1 = myProcess()
    proc1.setLock(lock)
    proc2 = myProcess()
    proc2.setLock(lock)
    proc1.start()
    proc2.start()
There are many answered questions about passing a lock to multiprocessing.Pool, but none of them solved my problem with the OOP approach of using Process. If I want to use a global lock, where should I define it and how can I pass it to the myProcess objects?
You can't use a threading.Lock for multiprocessing; you need a multiprocessing.Lock.
You get the pickling error because a threading.Lock can't be pickled and you are on an OS which uses "spawn" as the default start method for new processes (Windows, or macOS with Python 3.8+).
Note that on a forking OS (Linux, BSD...), using a threading.Lock would not raise a pickling error, but the lock would be silently replicated into each child, so it would not provide the synchronization between processes you intended.
Using a separate method to set the lock is possible, but I would prefer passing it as an argument to Process.__init__() along with any other arguments.
import time
from multiprocessing import Process, Lock, current_process


class MyProcess(Process):

    def __init__(self, lock, name=None, args=(), kwargs={}, daemon=None):
        super().__init__(
            group=None, name=name, args=args, kwargs=kwargs, daemon=daemon
        )
        # `args` and `kwargs` are stored as `self._args` and `self._kwargs`
        self.lock = lock

    def run(self):
        with self.lock:
            for i in range(3):
                print(current_process().name, *self._args)
                time.sleep(1)


if __name__ == '__main__':

    lock = Lock()

    p1 = MyProcess(lock=lock, args=("hello",))
    p2 = MyProcess(lock=lock, args=("world",))

    p1.start()
    p2.start()

    p1.join()  # don't forget joining to prevent parent from exiting too soon.
    p2.join()
Output:
MyProcess-1 hello
MyProcess-1 hello
MyProcess-1 hello
MyProcess-2 world
MyProcess-2 world
MyProcess-2 world
Process finished with exit code 0
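As a side note not covered in the question or the answer above: forcing the "spawn" start method makes the platform dependence explicit, and it also shows why a module-level "global" lock would not help, since under "spawn" each child re-imports the module and would build its own, unrelated lock. Passing the lock explicitly, as above, is the portable approach. A minimal sketch, assuming Python 3.8+ (names are illustrative):

import multiprocessing as mp
from threading import Lock as ThreadLock


def worker(lock):
    with lock:
        print("child acquired the lock")


if __name__ == "__main__":
    mp.set_start_method("spawn")  # behave like Windows/macOS even on Linux

    try:
        p = mp.Process(target=worker, args=(ThreadLock(),))
        p.start()  # pickling of the threading.Lock fails here (TypeError)
        p.join()
    except TypeError as exc:
        print("threading.Lock fails under spawn:", exc)

    p = mp.Process(target=worker, args=(mp.Lock(),))  # a multiprocessing.Lock pickles fine
    p.start()
    p.join()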
I have two Python scripts and I want them to communicate with each other. Specifically, I want script Communication.py to send an array to script Process.py when required by the latter. I've used multiprocessing.Process and multiprocessing.Pipe to make it work. My code works, but I want to handle SIGINT and SIGTERM gracefully. I've tried the following, but it does not exit gracefully:
Process.py
from multiprocessing import Process, Pipe
from Communication import arraySender
import time
import signal


class GracefulKiller:
    kill_now = False

    def __init__(self):
        signal.signal(signal.SIGINT, self.exit_gracefully)
        signal.signal(signal.SIGTERM, self.exit_gracefully)

    def exit_gracefully(self, *args):
        self.kill_now = True


def main():
    parent_conn, child_conn = Pipe()
    p = Process(target=arraySender, args=(child_conn, True))
    p.start()
    print(parent_conn.recv())


if __name__ == '__main__':
    killer = GracefulKiller()
    while not killer.kill_now:
        main()
Communication.py
import numpy
from multiprocessing import Process, Pipe


def arraySender(child_conn, sendData):
    if sendData:
        child_conn.send(numpy.random.randint(0, high=10, size=15, dtype=int))
        child_conn.close()
What am I doing wrong?
I strongly suspect you are running this under Windows because I think the code you have should work under Linux. This is why it is important to always tag your questions concerning Python and multiprocessing with the actual platform you are on.
The problem appears to be due to the fact that, in addition to your main process, the child process you create in function main is also receiving the signals. The solution would normally be to add calls like signal.signal(signal.SIGINT, signal.SIG_IGN) to your arraySender worker function. But there are two problems with this:
There is a race condition: the signal could be received by the child process before it has a chance to ignore signals.
Regardless, ignoring signals this way when you are using multiprocessing.Process does not seem to work (perhaps that class does its own signal handling that overrides these calls).
The solution is to create a multiprocessing pool and initialize each pool process so that it ignores signals before you submit any tasks. The other advantage of using a pool (although in this case we only need a pool size of 1, because you never have more than one task running at a time) is that you only need to create the process once, and it can then be reused.
As an aside, you have an inconsistency in your GracefulKiller class: you mix a class attribute kill_now with an instance attribute kill_now that gets created when you execute self.kill_now = True. So when the main process tests killer.kill_now, it is accessing the class attribute until self.kill_now is set to True, at which point it starts accessing the instance attribute.
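To illustrate that aside with a toy sketch (not part of the original code), attribute lookup falls back to the class until an instance attribute of the same name is created:

class Killer:
    kill_now = False          # class attribute

k = Killer()
print(k.kill_now)             # False -- found on the class
k.kill_now = True             # creates an instance attribute that shadows the class one
print(k.kill_now)             # True  -- now read from the instance
print(Killer.kill_now)        # False -- the class attribute itself never changed

The corrected version below therefore creates kill_now as an instance attribute in __init__ from the start: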
from multiprocessing import Pool, Pipe
import time
import signal
import numpy


class GracefulKiller:
    def __init__(self):
        self.kill_now = False  # Instance attribute
        signal.signal(signal.SIGINT, self.exit_gracefully)
        signal.signal(signal.SIGTERM, self.exit_gracefully)

    def exit_gracefully(self, *args):
        self.kill_now = True


def init_pool_processes():
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    signal.signal(signal.SIGTERM, signal.SIG_IGN)


def arraySender(sendData):
    if sendData:
        return numpy.random.randint(0, high=10, size=15, dtype=int)


def main(pool):
    result = pool.apply(arraySender, args=(True,))
    print(result)


if __name__ == '__main__':
    # Create pool with only 1 process:
    pool = Pool(1, initializer=init_pool_processes)
    killer = GracefulKiller()
    while not killer.kill_now:
        main(pool)
    pool.close()
    pool.join()
Ideally, GracefulKiller should be a singleton class so that, regardless of how many times GracefulKiller is instantiated by a process, signal.signal is called only once for each type of signal you want to handle:
class Singleton(type):
    def __init__(self, *args, **kwargs):
        self.__instance = None
        super().__init__(*args, **kwargs)

    def __call__(self, *args, **kwargs):
        if self.__instance is None:
            self.__instance = super().__call__(*args, **kwargs)
        return self.__instance


class GracefulKiller(metaclass=Singleton):
    def __init__(self):
        self.kill_now = False  # Instance attribute
        signal.signal(signal.SIGINT, self.exit_gracefully)
        signal.signal(signal.SIGTERM, self.exit_gracefully)

    def exit_gracefully(self, *args):
        self.kill_now = True
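A quick, illustrative check of the metaclass above: every call to GracefulKiller() now returns the same object, so the signal handlers are installed only once.

k1 = GracefulKiller()
k2 = GracefulKiller()
print(k1 is k2)  # True -- both names refer to the single shared instance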
I am using the multiprocessing.Pool class within an object and attempting the following:
from multiprocessing import Lock, Pool


class A:
    def __init__(self):
        self.lock = Lock()
        self.file = open('test.txt')

    def function(self, i):
        self.lock.acquire()
        line = self.file.readline()
        self.lock.release()
        return line

    def anotherfunction(self):
        pool = Pool()
        results = pool.map(self.function, range(10000))
        pool.close()
        pool.join()
        return results
However, I am getting a runtime error stating that lock objects should only be shared between processes through inheritance. I am fairly new to Python and multiprocessing. How can I get on the right track?
multiprocessing.Lock instances can be attributes of multiprocessing.Process instances. When a process is created in the main process with a lock attribute, the lock exists in the main process’s address space. When the process’s start method is invoked and runs a subprocess which invokes the process’s run method, the lock has to be serialized/deserialized to the subprocess address space. This works as expected:
from multiprocessing import Lock, Process


class P(Process):
    def __init__(self, *args, **kwargs):
        Process.__init__(self, *args, **kwargs)
        self.lock = Lock()

    def run(self):
        print(self.lock)


if __name__ == '__main__':
    p = P()
    p.start()
    p.join()
Prints:
<Lock(owner=None)>
Unfortunately, this does not work when you are dealing with multiprocessing.Pool instances. In your example, self.lock is created in the main process by the __init__ method. But when Pool.map is called to invoke self.function, the lock cannot be serialized/deserialized to the already-running pool process that will be running this method.
The solution is to initialize each pool process with a global variable set to this lock (there is no point in having the lock be an attribute of the class now). The way to do this is to use the initializer and initargs parameters of the Pool __init__ method. See the documentation:
from multiprocessing import Lock, Pool


def init_pool_processes(the_lock):
    '''Initialize each process with a global variable lock.
    '''
    global lock
    lock = the_lock


class Test:
    def function(self, i):
        lock.acquire()
        with open('test.txt', 'a') as f:
            print(i, file=f)
        lock.release()

    def anotherfunction(self):
        lock = Lock()
        pool = Pool(initializer=init_pool_processes, initargs=(lock,))
        pool.map(self.function, range(10))
        pool.close()
        pool.join()


if __name__ == '__main__':
    t = Test()
    t.anotherfunction()
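For completeness, a variant not shown in the original answer: a lock obtained from a multiprocessing.Manager is a proxy object, and proxies are picklable, so such a lock can be passed to pool workers as an ordinary task argument instead of through an initializer. A sketch (function names are illustrative):

from multiprocessing import Manager, Pool


def write_line(args):
    i, lock = args  # the lock here is a manager proxy
    with lock:
        with open('test.txt', 'a') as f:
            print(i, file=f)


if __name__ == '__main__':
    with Manager() as manager:
        lock = manager.Lock()  # picklable proxy to a lock living in the manager process
        with Pool() as pool:
            pool.map(write_line, [(i, lock) for i in range(10)])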
I want to call multiprocessing.Pool.map inside a process.
When the pool is created inside the run() method, it works. When it is created at instantiation (in __init__), it does not.
I cannot figure out the reason for this behavior. What happens in the process?
I am on Python 3.6.
from multiprocessing import Pool, Process, Queue


def DummyPrinter(key):
    print(key)


class Consumer(Process):
    def __init__(self, task_queue):
        Process.__init__(self)
        self.task_queue = task_queue
        self.p = Pool(1)

    def run(self):
        p = Pool(8)
        while True:
            next_task = self.task_queue.get()
            if next_task is None:
                break
            p.map(DummyPrinter, next_task)  # Works
            #self.p.map(DummyPrinter, next_task)  # Does not work
        return


if __name__ == '__main__':
    task_queue = Queue()
    Consumer(task_queue).start()
    task_queue.put(range(5))
    task_queue.put(None)
A multiprocessing.Pool cannot be shared between processes because it relies on pipes and threads for its functioning.
The __init__ method is executed in the parent process, whereas the run logic belongs to the child process.
I usually recommend against subclassing the Process object, as it is quite counter-intuitive.
A structure like the following better shows the actual division of responsibilities.
from multiprocessing import Pool, Process, Queue

def DummyPrinter(key):
    print(key)

def function(task_queue):
    """This runs in the child process."""
    p = Pool(8)
    while True:
        next_task = task_queue.get()
        if next_task is None:
            break
        p.map(DummyPrinter, next_task)

def main():
    """This runs in the parent process."""
    task_queue = Queue()
    process = Process(target=function, args=[task_queue])
    process.start()
    task_queue.put(range(5))
    task_queue.put(None)
    process.join()

if __name__ == '__main__':
    main()
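To make the __init__-versus-run point concrete, here is a small sketch (not from the original answer) that prints the process ID in both places; __init__ reports the parent's PID while run reports the child's:

import os
from multiprocessing import Process


class Worker(Process):
    def __init__(self):
        super().__init__()
        print("__init__ runs in pid", os.getpid())  # parent's pid

    def run(self):
        print("run runs in pid", os.getpid())  # child's pid, different from the parent


if __name__ == '__main__':
    print("parent pid", os.getpid())
    w = Worker()
    w.start()
    w.join()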
import multiprocessing as mp
import time as t


class MyProcess(mp.Process):
    def __init__(self, target, args, name):
        mp.Process.__init__(self, target=target, args=args)
        self.exit = mp.Event()
        self.name = name
        print("{0} initiated".format(self.name))

    def run(self):
        while not self.exit.is_set():
            pass
        print("Process {0} exited.".format(self.name))

    def shutdown(self):
        print("Shutdown initiated for {0}.".format(self.name))
        self.exit.set()


def f(x):
    while True:
        print(x)
        x = x+1


if __name__ == "__main__":
    p = MyProcess(target=f, args=[3], name="function")
    p.start()
    #p.join()
    t.sleep(2)
    p.shutdown()
I'm trying to extend the multiprocessing.Process class to add a shutdown method in order to be able to exit a function which could potentially have to run for an undefined amount of time. Following the instructions from Python Multiprocessing Exit Elegantly How? and adding the argument passing I came up with myself only gets me this output:
function initiated
Shutdown initiated for function.
Process function exited.
But there is no actual output from the method f(x). It seems that the actual process target doesn't get started. I'm obviously doing something wrong, but I just can't figure out what. Any ideas?
Thanks!
The sane way to handle this situation is, where possible, to have the background task cooperate in the exit mechanism by periodically checking the exit event. For that, there's no need to subclass Process: you can rewrite your background task to include that check. For example, here's your code rewritten using that approach:
import multiprocessing as mp
import time as t


def f(x, exit_event):
    while not exit_event.is_set():
        print(x)
        x = x+1
    print("Exiting")


if __name__ == "__main__":
    exit_event = mp.Event()
    p = mp.Process(target=f, args=(3, exit_event), name="function")
    p.start()
    t.sleep(2)
    exit_event.set()
    p.join()
If that's not an option (for example because you can't modify the code that's being run in the background job), then you can use the Process.terminate method. But you should be aware that using it is dangerous: the child process won't have an opportunity to clean up properly, so for example if it's terminated while holding a multiprocessing lock, no other process will be able to acquire that lock, giving a risk of deadlock. It's far better to have the child cooperate in the shutdown if possible.
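A common compromise, sketched here and not part of the original answer, is to request the cooperative shutdown first and fall back to terminate() only if the child does not exit within a timeout:

import multiprocessing as mp
import time


def task(exit_event):
    while not exit_event.is_set():
        time.sleep(0.1)  # stand-in for real work that periodically checks the event


if __name__ == "__main__":
    exit_event = mp.Event()
    p = mp.Process(target=task, args=(exit_event,))
    p.start()
    time.sleep(1)

    exit_event.set()   # ask the child to stop cooperatively
    p.join(timeout=5)  # give it a few seconds to finish cleanly
    if p.is_alive():
        p.terminate()  # last resort; skips any cleanup in the child
        p.join()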
The solution to this problem is to call super().run() in your class's run method.
Of course, with the while True loop in f this will make the function run forever, and the specified event will not end it.
You can use the Process.terminate() method to end your process instead.
import multiprocessing as mp
import time as t


class MyProcess(mp.Process):
    def __init__(self, target, args, name):
        mp.Process.__init__(self, target=target, args=args)
        self.name = name
        print("{0} initiated".format(self.name))

    def run(self):
        print("Process {0} started.".format(self.name))
        super().run()

    def shutdown(self):
        print("Shutdown initiated for {0}.".format(self.name))
        self.terminate()


def f(x):
    while True:
        print(x)
        t.sleep(1)
        x += 1


if __name__ == "__main__":
    p = MyProcess(target=f, args=(3,), name="function")
    p.start()
    # p.join()
    t.sleep(5)
    p.shutdown()
I am new to multiprocessing.
I have run the example code for two 'highly recommended' multiprocessing examples given in response to other Stack Overflow multiprocessing questions. Here is an example of one (which I dare not run again!):
test2.py (running from PyDev)
import multiprocessing


class MyFancyClass(object):
    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)


def worker(q):
    obj = q.get()
    obj.do_something()


queue = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
queue.put(MyFancyClass('Fancy Dan'))

# Wait for the worker to finish
queue.close()
queue.join_thread()
p.join()
When I run this, my computer slows down immediately. It gets incrementally slower. After some time I managed to get into the Task Manager, only to see MANY, MANY python.exe entries under the Processes tab. After trying to end some of them, my mouse stopped moving. It was the second time I was forced to reboot.
I am too scared to attempt a third example...
Running: Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz (8 CPUs), ~2.9GHz, on Windows 7 64-bit.
If anyone knows what the issue is and can provide a VERY SIMPLE example of multiprocessing (send a string to a worker process, alter it, and send it back for printing), I would be very grateful.
From the docs:
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
Thus, on Windows, you must wrap your code inside an if __name__ == '__main__': block.
For example, this sends a string to the worker process; the string is reversed there and the result is printed by the main process:
import multiprocessing as mp


def worker(inq, outq):
    obj = inq.get()
    obj = obj[::-1]
    outq.put(obj)


if __name__ == '__main__':
    inq = mp.Queue()
    outq = mp.Queue()
    p = mp.Process(target=worker, args=(inq, outq))
    p.start()
    inq.put('Fancy Dan')

    # Wait for the worker to finish
    p.join()

    result = outq.get()
    print(result)
Because of the way multiprocessing works on Windows (child processes import the __main__ module), the __main__ module cannot actually run anything when imported -- any code that should execute when run directly must be protected by the if __name__ == '__main__' idiom. Your corrected code:
import multiprocessing


class MyFancyClass(object):
    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)


def worker(q):
    obj = q.get()
    obj.do_something()


if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()
    queue.put(MyFancyClass('Fancy Dan'))

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()
Might I suggest this link? It uses threads instead of multiprocessing, but many of the principles are the same.
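For reference, a rough threading counterpart of the corrected example, sketched here rather than taken from the linked page: threads share memory within one process, so no pickling or __main__ guard is needed.

import threading
import queue


def worker(q):
    obj = q.get()
    print(threading.current_thread().name, obj[::-1])  # reverse the string and print it


q = queue.Queue()
t = threading.Thread(target=worker, args=(q,))
t.start()
q.put('Fancy Dan')
t.join()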