How to call same process multiple times in a program - python

I am using the Python 2.7 multiprocessing module. I want to start a process, then terminate it, and then start it again with new arguments.
p = Process(target=realwork, args=(up, down, middle, num))

def fun1():
    p.start()

def fun2():
    p.terminate()
And in the course of the program flow (through loops and events) I call the functions in this order:
fun1()
fun2()
fun1()
fun2()
If I try that, I get an error saying I cannot call the same process multiple times. Is there a workaround?

So - you have probably read somewhere that "using global variables is not a good practice" - and this is why. Your "p" variable only holds one "Process" instance, and it can only be started (and terminated) once.
If you refactor your code so that "fun1" and "fun2" take the process they act upon as a parameter, or, in an O.O. way, make fun1 and fun2 methods and the Process an instance variable, you would not have any of these problems.
In this case, O.O. is quick to see and straightforward to use:
class MyClass(object):
    def __init__(self):
        self.p = Process(target=realwork, args=(up, down, middle, num))

    def fun1(self):
        self.p.start()

    def fun2(self):
        self.p.terminate()
And then, wherever you need a pair of calls to "fun1" and "fun2", you do:
action = MyClass()
action.fun1()
action.fun2()
instead. This would work even inside a for loop, or whatever.
*edit - I just saw you are using this as the answer to a button press in a Tkinter program, so you have to record the Process instance somewhere between button clicks.
Without having your code to refactor, and supposing you intend to stick with your global-variable approach, you can simply create the Process instance inside "fun1" - this would work:
def fun1():
    global p
    p = Process(target=realwork, args=(up, down, middle, num))
    p.start()

def fun2():
    p.terminate()

Once you terminate a process, it's dead. You might be able to restructure your code so that you communicate with the process using pipes, for instance, instead of terminating it. That may be difficult to get right if the child process is not well-behaved. Otherwise, you can just create a new instance of Process for each set of arguments.
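For the second option, here is a minimal sketch of the "new Process per run" idea; the realwork stub below is only a placeholder standing in for the question's function and arguments:

from multiprocessing import Process
import time

def realwork(up, down, middle, num):
    # placeholder standing in for the question's real function
    while True:
        time.sleep(1)

def start_worker():
    # build a fresh Process each time; a terminated Process cannot be restarted
    worker = Process(target=realwork, args=(1, 2, 3, 4))
    worker.start()
    return worker

if __name__ == '__main__':
    worker = start_worker()
    time.sleep(2)
    worker.terminate()
    worker.join()            # reap the terminated child

    worker = start_worker()  # a brand-new Process object, safe to start again
    worker.terminate()
    worker.join()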

Related

Use multiprocessing to run functions inside a while loop in a class method

I have a method which calculates a final result using multiple other methods. It has a while loop inside which continuously checks for new data, and if new data is received, it runs the other methods and calculates the results. This main method is the only one called by the user, and it stays active until the program is closed. The basic structure is as follows:
class sample:
    def __init__(self):
        self.results = []

    def main_calculation(self):
        while True:
            # code to get data
            if newdata != olddata:
                # insert code to prepare data for analysis
                res1 = self.calc1(prepped_data)
                res2 = self.calc2(prepped_data)
                final = res1 + res2
                self.results.append(final)
I want to run calc1 and calc2 in parallel, so that I can get the final result faster. However, I am unsure of how to implement multiprocessing in this way, since I'm not using a __main__ guard. Is there any way to run these processes in parallel?
This is likely not the best organization for this code, but it is what is easiest for the actual calculations I am running, since it is necessary that this code be imported and run from a different file. However, I can restructure the code if this is not a salvageable structure.
According to the documentation, the reason you need to use a __main__ guard is that when your program creates a multiprocessing.Process object, it starts up a whole new copy of the Python interpreter which will import a new copy of your program's modules. If importing your module calls multiprocessing.Process() itself, that will create yet another copy of the Python interpreter which interprets yet another copy of your code, and so on until your system crashes (or actually, until Python hits a non-reentrant piece of the multiprocessing code).
In the main module of your program, which usually calls some code at the top level, checking __name__ == '__main__' is the way you can tell whether the program is being run for the first time or is being run as a subprocess. But in a different module, there might not be any code at the top level (other than definitions), and in that case there's no need to use a guard because the module can be safely imported without starting a new process.
In other words, this is dangerous:
import multiprocessing as mp
def f():
...
p = mp.Process(target=f)
p.start()
p.join()
but this is safe:
import multiprocessing as mp

def f():
    ...

def g():
    p = mp.Process(target=f)
    p.start()
    p.join()
and this is also safe:
import multiprocessing as mp

def f():
    ...

class H:
    def g(self):
        p = mp.Process(target=f)
        p.start()
        p.join()
So in your example, you should be able to directly create Process objects in your function.
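For instance, a rough sketch (the helper method, the placeholder calculations and the simplified signature are all illustrative, not the asker's code) could spawn the two calculations as Process objects and collect the answers through a queue, since a Process target cannot return a value directly:

import multiprocessing as mp

class sample:
    @staticmethod
    def _worker(func, data, result_q, tag):
        # run one calculation in the child and report back through the queue
        result_q.put((tag, func(data)))

    def calc1(self, data):
        return sum(data)   # placeholder calculation

    def calc2(self, data):
        return max(data)   # placeholder calculation

    def run_parallel(self, prepped_data):
        result_q = mp.Queue()
        p1 = mp.Process(target=self._worker,
                        args=(self.calc1, prepped_data, result_q, 'res1'))
        p2 = mp.Process(target=self._worker,
                        args=(self.calc2, prepped_data, result_q, 'res2'))
        p1.start()
        p2.start()
        results = dict(result_q.get() for _ in range(2))  # collect both answers
        p1.join()
        p2.join()
        return results['res1'] + results['res2']

This assumes the class lives at module level so the child processes can import it under the spawn start method.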
However, I'd suggest making it clear in the documentation for the class that that method creates a Process, because whoever uses it (maybe you) needs to know that it's not safe to call that method at the top level of a module. It would be like doing this, which also falls in the "dangerous" category:
import multiprocessing as mp

def f():
    ...

class H:
    def g(self):
        p = mp.Process(target=f)
        p.start()
        p.join()

H().g()  # this creates a Process at the top level
You could also consider an alternative approach where you make the caller do all the process creation. In this approach, either your sample class constructor or the main_calculation() method could accept, say, a Pool object, and it can use the processes from that pool to do its calculations. For example:
class sample:
    def main_calculation(self, pool):
        while True:
            if newdata != olddata:
                res1_async = pool.apply_async(self.calc1, [prepped_data])
                res2_async = pool.apply_async(self.calc2, [prepped_data])
                res1 = res1_async.get()
                res2 = res2_async.get()
                # and so on
This pattern may also allow your program to be more efficient in its use of resources, if there are many different calculations happening, because they can all use the same pool of processes.
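A possible way to drive that variant from the importing script (sketched here for Python 3, with names matching the example above) is to create the pool under the top-level file's __main__ guard:

from multiprocessing import Pool

if __name__ == '__main__':
    s = sample()
    with Pool(processes=2) as pool:   # one worker per calculation
        s.main_calculation(pool)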

Python Multiprocessing Async Can't Terminate Process

I have an infinite loop running async but I can't terminate it. Here is a similar version of my code:
from multiprocessing import Pool

test_pool = Pool(processes=1)

self.button1.clicked.connect(self.starter)
self.button2.clicked.connect(self.stopper)

def starter(self):
    global test_pool
    test_pool.apply_async(self.automatizer)

def automatizer(self):
    i = 0
    while i != 0:
        self.job1()
        # safe stop point
        self.job2()
        # safe stop point
        self.job3()
        # safe stop point

def job1(self):
    # doing some stuff

def job2(self):
    # doing some stuff

def job3(self):
    # doing some stuff

def stopper(self):
    global test_pool
    test_pool.terminate()
My problem is that terminate() inside the stopper function doesn't work. I tried to put terminate() inside the job1, job2, job3 functions, still not working; I tried putting it at the end of the loop in the starter function, again not working. How can I stop this async process?
While stopping the process at any time is good enough, is it possible to make it stop at the points I want? I mean, if a stop command (not sure what command it would be) is given to the process, I want it to complete the steps up to the "# safe stop point" marker and then terminate the process.
You really should be avoiding the use of terminate() in normal operation. It should only be used in unusual cases, such as hanging or unresponsive processes. The normal way to end a process pool is to call pool.close() followed by pool.join().
These methods do require the function that your pool is executing to return, and your call to pool.join() will block your main process until it does so. I would suggest you add a multiprocessing.Queue to give yourself a way to tell your subprocess to exit:
# this import is NOT the same as multiprocessing.Queue - this is here for the
# queue.Empty exception
import Queue

queue = multiprocessing.Queue()  # not the same as a Queue.Queue()

def stopper(self):
    # don't need "global" keyword to call a global object's method
    # it's only necessary if we want to modify a global
    queue.put("Stop")
    test_pool.close()
    test_pool.join()

def automatizer(self):
    while True:  # cleaner infinite loop - yours was never executing
        for func in [self.job1, self.job2, self.job3]:  # iterate over methods
            func()  # call each one
            # between each function call, check the queue for "poison pill"
            try:
                if queue.get(block=False) == "Stop":
                    return
            except Queue.Empty:
                pass
Since you didn't provide a more complete code sample, you'll have to figure out where to actually instantiate the multiprocessing.Queue and how to pass things around. Also, the comment from Janne Karila was correct. You should switch your code to use a single Process instead of a pool if you're only using one process at a time anyway. The Process class also uses a blocking join() method to tell it to end once it has returned. The only safe way to end processes at "known safe points" is to implement some kind of interprocess communication like I've done here. Pipes would work as well.
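For reference, a rough sketch of that single-Process variant might look like this; the Automator class and job stubs are only illustrative, and using a bound method as the target assumes a fork-based start (i.e. Linux under Python 2):

import multiprocessing
import Queue  # only needed for the Queue.Empty exception (Python 2)

class Automator(object):  # hypothetical container; your GUI class would play this role
    def __init__(self):
        self.queue = multiprocessing.Queue()
        self.proc = None

    def job1(self): pass  # placeholders for the real work
    def job2(self): pass
    def job3(self): pass

    def automatizer(self):
        # same queue-checking loop as above
        while True:
            for func in [self.job1, self.job2, self.job3]:
                func()
                try:
                    if self.queue.get(block=False) == "Stop":
                        return  # exit at a safe point
                except Queue.Empty:
                    pass

    def starter(self):
        self.proc = multiprocessing.Process(target=self.automatizer)
        self.proc.start()

    def stopper(self):
        self.queue.put("Stop")  # ask the worker to finish at its next safe point
        self.proc.join()        # wait for a clean exit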

Python simplest form of multiprocessing

I've been trying to read up on threading and multiprocessing, but all the examples are too intricate and advanced for my level of Python/programming knowledge. I want to run a function, which consists of a while loop, and while that loop runs I want to continue with the program and eventually change the condition for the while loop and end that process. This is the code:
class Example():
    def __init__(self):
        self.condition = False

    def func1(self):
        self.condition = True
        while self.condition:
            print "Still looping"
            time.sleep(1)
        print "Finished loop"

    def end_loop(self):
        self.condition = False
Then I make the following function calls:
ex = Example()
ex.func1()
time.sleep(5)
ex.end_loop()
What I want is for func1 to run for 5 s before end_loop() is called, which changes the condition and ends the loop and thus also the function. I.e., I want one process to start and "go" into func1, and at the same time I want time.sleep(5) to be called, so the processes "split" when arriving at func1: one process enters the function while the other continues down the program and starts with the time.sleep(5) execution.
This must be the most basic example of multiprocessing, still I've had trouble finding a simple way to do it!
Thank you
EDIT1: regarding do_something. In my real problem, do_something is replaced by some code that communicates with another program via a socket, receives packages with coordinates every 0.02 s, and stores them in member variables of the class. I want this constant updating of the coordinates to start and then be able to read the coordinates via other functions at the same time.
However that is not so relevant. What if do_something is replaced by:
time.sleep(1)
print "Still looping"
How do I solve my problem then?
EDIT2: I have tried multiprocessing like this:
from multiprocessing import Process
ex = Example()
p1 = Process(target=ex.func1())
p2 = Process(target=ex.end_loop())
p1.start()
time.sleep(5)
p2.start()
When I ran this, I never got to p2.start(), so that did not help. Even if it had, this is not really what I'm looking for either. What I want would be just to start the process p1, and then continue with time.sleep and ex.end_loop().
The first problem with your code is these calls:
p1 = Process(target=ex.func1())
p2 = Process(target=ex.end_loop())
With ex.func1() you're calling the function and passing the return value as the target parameter. Since the function doesn't return anything, you're effectively calling
p1 = Process(target=None)
p2 = Process(target=None)
which makes, of course, no sense.
After fixing that, the next problem will be shared data: when using the multiprocessing package, you implement concurrency using multiple processes which, by default, cannot simply share data afaik. Have a look at Sharing state between processes in the package's documentation to read about this. Especially take the first sentence into account: "when doing concurrent programming it is usually best to avoid using shared state as far as possible"!
So you might want to also have a look at Exchanging objects between processes to read about how to send/receive data between two different processes. So, instead of simply setting a flag to stop the loop, it might be better to send a message to signal the loop should be terminated.
Also note that processes are a heavyweight form of multiprocessing: they spawn multiple OS processes, which comes with a relatively big overhead. multiprocessing's main purpose is to avoid problems imposed by Python's Global Interpreter Lock (google this to read more...). If your problem isn't much more complex than what you've told us, you might want to use the threading package instead: threads come with less overhead than processes and also allow access to the same data (although you really should read about synchronization when doing this...)
I'm afraid multiprocessing is an inherently complex subject, so I think you will need to advance your programming/Python skills to successfully use it. But I'm sure you'll manage; the Python documentation about this is comprehensive and there are a lot of other resources about it.
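To make that concrete, here is a minimal threading sketch of the original Example class; threads share memory, so the plain condition flag is enough here:

import time
import threading

class Example():
    def __init__(self):
        self.condition = False

    def func1(self):
        self.condition = True
        while self.condition:
            print "Still looping"
            time.sleep(1)
        print "Finished loop"

    def end_loop(self):
        self.condition = False

ex = Example()
t = threading.Thread(target=ex.func1)  # note: no parentheses after func1
t.start()                              # the loop now runs in a background thread
time.sleep(5)
ex.end_loop()                          # the shared flag ends the loop
t.join()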
To tackle your EDIT2 problem, you could try using the shared memory map Value.
import time
from multiprocessing import Process, Value

class Example():
    def func1(self, cond):
        while (cond.value == 1):
            print('do something')
            time.sleep(1)
        return

if __name__ == '__main__':
    ex = Example()
    cond = Value('i', 1)
    proc = Process(target=ex.func1, args=(cond,))
    proc.start()
    time.sleep(5)
    cond.value = 0
    proc.join()
(Note the target=ex.func1 without the parentheses and the comma after cond in args=(cond,).)
But look at the answer provided by MartinStettner to find a good solution.

How to use multiprocessing with class instances in Python?

I am trying to create a class than can run a separate process to go do some work that takes a long time, launch a bunch of these from a main module and then wait for them all to finish. I want to launch the processes once and then keep feeding them things to do rather than creating and destroying processes. For example, maybe I have 10 servers running the dd command, then I want them all to scp a file, etc.
My ultimate goal is to create a class for each system that keeps track of the information for the system in which it is tied to like IP address, logs, runtime, etc. But that class must be able to launch a system command and then return execution back to the caller while that system command runs, to followup with the result of the system command later.
My attempt is failing because I cannot send an instance method of a class over the pipe to the subprocess via pickle. Those are not pickleable. I therefore tried to fix it various ways but I can't figure it out. How can my code be patched to do this? What good is multiprocessing if you can't send over anything useful?
Is there any good documentation of multiprocessing being used with class instances? The only way I can get the multiprocessing module to work is on simple functions. Every attempt to use it within a class instance has failed. Maybe I should pass events instead? I don't understand how to do that yet.
import multiprocessing
import sys
import re

class ProcessWorker(multiprocessing.Process):
    """
    This class runs as a separate process to execute worker's commands in parallel
    Once launched, it remains running, monitoring the task queue, until "None" is sent
    """

    def __init__(self, task_q, result_q):
        multiprocessing.Process.__init__(self)
        self.task_q = task_q
        self.result_q = result_q
        return

    def run(self):
        """
        Overloaded function provided by multiprocessing.Process. Called upon start() signal
        """
        proc_name = self.name
        print '%s: Launched' % (proc_name)
        while True:
            next_task_list = self.task_q.get()
            if next_task_list is None:
                # Poison pill means shutdown
                print '%s: Exiting' % (proc_name)
                self.task_q.task_done()
                break
            next_task = next_task_list[0]
            print '%s: %s' % (proc_name, next_task)
            args = next_task_list[1]
            kwargs = next_task_list[2]
            answer = next_task(*args, **kwargs)
            self.task_q.task_done()
            self.result_q.put(answer)
        return
# End of ProcessWorker class
class Worker(object):
    """
    Launches a child process to run commands from derived classes in separate processes,
    which sit and listen for something to do
    This base class is called by each derived worker
    """

    def __init__(self, config, index=None):
        self.config = config
        self.index = index

        # Launch the ProcessWorker for anything that has an index value
        if self.index is not None:
            self.task_q = multiprocessing.JoinableQueue()
            self.result_q = multiprocessing.Queue()

            self.process_worker = ProcessWorker(self.task_q, self.result_q)
            self.process_worker.start()
            print "Got here"
            # Process should be running and listening for functions to execute
        return

    def enqueue_process(target):  # No self, since it is a decorator
        """
        Used to place a command target from this class object into the task_q
        NOTE: Any function decorated with this must use fetch_results() to get the
        target task's result value
        """
        def wrapper(self, *args, **kwargs):
            self.task_q.put([target, args, kwargs])  # FAIL: target is a class instance method and can't be pickled!
        return wrapper

    def fetch_results(self):
        """
        After all processes have been spawned by multiple modules, this command
        is called on each one to retrieve the results of the call.
        This blocks until the execution of the item in the queue is complete
        """
        self.task_q.join()          # Wait for it to finish
        return self.result_q.get()  # Return the result

    @enqueue_process
    def run_long_command(self, command):
        print "I am running %s as process %s" % (command, self.name)
        # In here, I will launch a subprocess to run a long-running system command
        # p = Popen(command), etc
        # p.wait(), etc
        return

    def close(self):
        self.task_q.put(None)
        self.task_q.join()
if __name__ == '__main__':
    config = ["some value", "something else"]
    index = 7
    workers = []
    for i in range(5):
        worker = Worker(config, index)
        worker.run_long_command("ls /")
        workers.append(worker)
    for worker in workers:
        worker.fetch_results()
    # Do more work... (this would actually be done in a distributor in another class)
    for worker in workers:
        worker.close()
Edit: I tried to move the ProcessWorker class and the creation of the multiprocessing queues outside of the Worker class and then tried to manually pickle the worker instance. Even that doesn't work and I get the error "RuntimeError: Queue objects should only be shared between processes through inheritance". But I am only passing references of those queues into the worker instance?? I am missing something fundamental. Here is the modified code from the main section:
if __name__ == '__main__':
    config = ["some value", "something else"]
    index = 7
    workers = []
    for i in range(1):
        task_q = multiprocessing.JoinableQueue()
        result_q = multiprocessing.Queue()
        process_worker = ProcessWorker(task_q, result_q)
        worker = Worker(config, index, process_worker, task_q, result_q)
        something_to_look_at = pickle.dumps(worker)  # FAIL: Doesn't like queues??
        process_worker.start()
        worker.run_long_command("ls /")
So, the problem was that I was assuming that Python was doing some sort of magic that is somehow different from the way that C++/fork() works. I somehow thought that Python only copied the class, not the whole program into a separate process. I seriously wasted days trying to get this to work because all of the talk about pickle serialization made me think that it actually sent everything over the pipe. I knew that certain things could not be sent over the pipe, but I thought my problem was that I was not packaging things up properly.
This all could have been avoided if the Python docs gave me a 10,000 ft view of what happens when this module is used. Sure, it tells me what the methods of multiprocess module does and gives me some basic examples, but what I want to know is what is the "Theory of Operation" behind the scenes! Here is the kind of information I could have used. Please chime in if my answer is off. It will help me learn.
When you start a process using this module, the whole program is copied into another process. But since it is not the "__main__" process and my code was checking for that, it doesn't fire off yet another process infinitely. It just stops and sits out there waiting for something to do, like a zombie. Everything that was initialized in the parent at the time of calling multiprocessing.Process() is all set up and ready to go. Once you put something in the multiprocessing.Queue or shared memory, or pipe, etc. (however you are communicating), then the separate process receives it and gets to work. It can draw upon all imported modules and setup just as if it was the parent. However, once some internal state variables change in the parent or separate process, those changes are isolated. Once the process is spawned, it now becomes your job to keep them in sync if necessary, either through a queue, pipe, shared memory, etc.
I threw out the code and started over, but now I am only putting one extra function out in the ProcessWorker, an "execute" method that runs a command line. Pretty simple. I don't have to worry about launching and then closing a bunch of processes this way, which has caused me all kinds of instability and performance issues in the past in C++. When I switched to launching processes at the beginning and then passing messages to those waiting processes, my performance improved and it was very stable.
BTW, I looked at this link to get help, which threw me off because the example made me think that methods were being transported across the queues: http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html
The second example of the first section used "next_task()" that appeared (to me) to be executing a task received via the queue.
Instead of attempting to send a method itself (which is impractical), try sending a name of a method to execute.
Provided that each worker runs the same code, it's a matter of a simple getattr(self, task_name).
I'd pass tuples (task_name, task_args), where task_args were a dict to be directly fed to the task method:
next_task_name, next_task_args = self.task_q.get()
if next_task_name:
    task = getattr(self, next_task_name)
    answer = task(**next_task_args)
    ...
else:
    # poison pill, shut down
    break
REF: https://stackoverflow.com/a/14179779
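The sending side then only has to put picklable pieces on the queue; a rough sketch (the "run_long_command_impl" name is illustrative, not from the original code):

class Worker(object):
    # ... queue setup as in the question ...

    def run_long_command(self, command):
        # "run_long_command_impl" would be a method defined on the ProcessWorker
        # subclass; only its name (a string) and a dict of arguments cross the pipe
        self.task_q.put(("run_long_command_impl", {"command": command}))

    def close(self):
        self.task_q.put((None, None))  # poison pill matching the check above
        self.task_q.join()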
The answer on Jan 6 at 6:03 by David Lynch is not factually correct when he says that he was misled by http://www.doughellmann.com/PyMOTW/multiprocessing/communication.html.
The code and examples provided are correct and work as advertised. next_task() is executing a task received via the queue -- try to understand what the Task.__call__() method is doing.
In my case, what tripped me up was syntax errors in my implementation of run(). It seems that the sub-process will not report this and just fails silently -- leaving things stuck in weird loops! Make sure you have some kind of syntax checker running, e.g. Flymake/Pyflakes in Emacs.
Debugging via multiprocessing.log_to_stderr() helped me narrow down the problem.

Python and multiprocessing... how to call a function in the main process?

I would like to implement an async callback style function in python... This is what I came up with but I am not sure how to actually return to the main process and call the function.
funcs = {}

def runCallback(uniqueId):
    '''
    I want this to be run in the main process.
    '''
    funcs[uniqueId]()

def someFunc(delay, uniqueId):
    '''
    This function runs in a separate process and just sleeps.
    '''
    time.sleep(delay)
    ### HERE I WANT TO CALL runCallback IN THE MAIN PROCESS ###
    # This does not work... It calls runCallback in the separate process:
    runCallback(uniqueId)

def setupCallback(func, delay):
    uniqueId = id(func)
    funcs[uniqueId] = func
    proc = multiprocessing.Process(target=someFunc, args=(delay, uniqueId))
    proc.start()
    return uniqueId
Here is how I want it to work:
def aFunc():
return None
setupCallback(aFunc, 10)
### some code that gets run before aFunc is called ###
### aFunc runs 10s later ###
There is a gotcha here, because I want this to be a bit more complex. Basically, when the code in the main process is done running, I want to examine the funcs dict and then run any of the callbacks that have not yet run. This means that runCallback also needs to remove entries from the funcs dict... the funcs dict is not shared with the separate processes, so I think runCallback needs to be called in the main process???
It is unclear why you use the multiprocessing module here.
To call a function with a delay in the same process you could use threading.Timer.
threading.Timer(10, aFunc).start()
Timer has .cancel() method if you'd like to cancel the callback later:
t = threading.Timer(10, runCallback, args=[uniqueId, funcs])
t.start()
timers.append((t, uniqueId))

# do other stuff
# ...

# run callbacks right now
for t, uniqueId in timers:
    t.cancel()  # after this the `runCallback()` won't be called by Timer()
                # if it's not been called already
    runCallback(uniqueId, funcs)
Where runCallback() is modified to remove functions to be called:
def runCallback(uniqueId, funcs):
    f = funcs.pop(uniqueId, None)  # GIL protects this code with some caveats
    if f is not None:
        f()
To do exactly what you're trying to do, you're going to need to set up a signal handler in the parent process to run the callback (or just remove the callback function that the child runs, if it doesn't need access to any of the parent process's memory) and have the child process send a signal. But if your logic gets any more complex, you'll probably need to use another type of inter-process communication (IPC), such as pipes or sockets.
Another possibility is using threads instead of processes, then you can just run the callback from the second thread. You'll need to add a lock to synchronize access to the funcs dict.
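A rough sketch of that thread-based variant (the lock, the pop() and the example callback are illustrative additions, not the asker's code):

import time
import threading

funcs = {}
funcs_lock = threading.Lock()

def runCallback(uniqueId):
    with funcs_lock:                   # guard the shared dict
        f = funcs.pop(uniqueId, None)
    if f is not None:
        f()

def someFunc(delay, uniqueId):
    time.sleep(delay)
    runCallback(uniqueId)              # same process, so funcs is visible

def setupCallback(func, delay):
    uniqueId = id(func)
    with funcs_lock:
        funcs[uniqueId] = func
    t = threading.Thread(target=someFunc, args=(delay, uniqueId))
    t.start()
    return uniqueId

def aFunc():
    print("callback ran")

setupCallback(aFunc, 10)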
