I have a couple of different scripts that require opening a MongoDB instance that go something like this:
mongod = Popen(
["mongod", "--dbpath", '/path/to/db'],
)
#Do some stuff
mongod.terminate()
And this works great when the code I'm executing works, but while I'm tinkering, errors inevitably arise. Then the Mongod instance remains running, and the next time I attempt to run the script, it detects that and doesn't open a new one.
I can terminate the process from the command line, but this is somewhat tedious. Or I can wrap everything in a try loop, but for some of the scripts, I have to do this a bunch, since every function depends on every other one. Is there a more elegant way to force close the process even in the event of an error somewhere else in the code?
EDIT: Did some testing based on tdelaney's comment, it looks like when I run these scripts in Sublime text and en error is generated, the script doesn't actually finish - it hits the error and then waits with the mongod instance open... i think. Once I kill the process in the terminal, sublime text tells me "finished in X seconds with exit code1"
EDIT2: On Kirby's suggestion, tried:
def testing():
mongod = Popen(
["mongod", "--dbpath", '/Users/KBLaptop/computation/db/'],
)
#Stuff that generates error
mongod.terminate()
def cleanup():
for proc in subprocess._active[:]:
try: proc.terminate()
except: pass
atexit.register(cleanup)
testing()
The error in testing() seems to prevent anything from continuing, so the atexit never registers and the process keeps running. Am I missing something obvious?
If you're running under CPython, you can cheat and take advantage of Python's destructors:
class PopenWrapper(object):
def __del__(self):
if self._child_created:
self.terminate()
This is slightly ucky, though. My preference would be to atexit:
import atexit
mongod = Popen(...)
def cleanup():
for proc in subprocess._active[:]:
try: proc.terminate()
except: pass
atexit.register(cleanup)
Still slightly hack-ish, though.
EDIT: Try this:
from subprocess import Popen
import atexit
started = []
def auto_popen(*args, **kw):
p = Popen(*args, **kw)
started.append(p)
return p
def testing():
mongod = auto_popen(['blah blah'], shell=True)
assert 0
#Stuff that generates error
mongod.terminate()
def cleanup():
for proc in started:
if proc.poll() is None:
try: proc.kill()
except: pass
atexit.register(cleanup)
testing()
Related
I am trying to work with subprocess routine that spawns an interactive child process which expects user inputs. This process normally hangs immediately if I try to read its stdout stream directly.
I read through many solutions using fcntl, asynchronous operations, pexpect and file output and reading redirections. Although temporary log files should work, I don't want to go through that route as I would like to keep the process interactive within the Python interface. From all of those, threads seemed to be the most easiest and straightforward way (I could not get pexpect to work properly, although it seemed to be a good option, too).
Indeed, when I implemented the following code (stolen from Non-blocking read on a subprocess.PIPE in python):
import os
import subprocess as sp
from threading import Thread
from queue import Queue, Empty
class App:
def __init__(self):
proc = sp.Popen(['app'], stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE, encoding='utf8')
out = NonBlockingStreamReader(proc.stdout)
print(out.readline(1))
class NonBlockingStreamReader:
def __init__(self, stream):
self.s = stream
self.q = Queue()
def populateQueue(stream, queue):
while True:
line = stream.readline()
if line:
queue.put(line)
else:
raise UnexpectedEndOfStream
self.t = Thread(target = populateQueue, args = (self.s, self.q))
self.t.daemon = True
self.t.start()
def readline(self, timeout = None):
try:
return self.q.get(block = timeout is not None, timeout = timeout)
except Empty:
return None
class UnexpectedEndOfStream(Exception):
pass
everything worked, flawlessly. Well, the problem is -- it worked on Linux only, even though the solution should be Windows compatible.
When I try to run this implementation on Windows, the newly created thread hangs the moment it tries to execute stream.readline(), never gets to actually populate the queue and thus the output of out.readline(1) read from the main thread is None.
How can I make this work on Windows?
I want code like this:
if True:
run('ABC.PY')
else:
if ScriptRunning('ABC.PY):
stop('ABC.PY')
run('ABC.PY'):
Basically, I want to run a file, let's say abc.py, and based on some conditions. I want to stop it, and run it again from another python script. Is it possible?
I am using Windows.
You can use python Popen objects for running processes in a child process
So run('ABC.PY') would be p = Popen("python 'ABC.PY'")
if ScriptRunning('ABC.PY) would be if p.poll() == None
stop('ABC.PY') would be p.kill()
This is a very basic example for what you are trying to achieve
Please checkout subprocess.Popen docs to fine tune your logic for running the script
import subprocess
import shlex
import time
def run(script):
scriptArgs = shlex.split(script)
commandArgs = ["python"]
commandArgs.extend(scriptArgs)
procHandle = subprocess.Popen(commandArgs, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
return procHandle
def isScriptRunning(procHandle):
return procHandle.poll() is None
def stopScript(procHandle):
procHandle.terminate()
time.sleep(5)
# Forcefully terminate the script
if isScriptRunning(procHandle):
procHandle.kill()
def getOutput(procHandle):
# stderr will be redirected to stdout due "stderr=subprocess.STDOUT" argument in Popen call
stdout, _ = procHandle.communicate()
returncode = procHandle.returncode
return returncode, stdout
def main():
procHandle = run("main.py --arg 123")
time.sleep(5)
isScriptRunning(procHandle)
stopScript(procHandle)
print getOutput(procHandle)
if __name__ == "__main__":
main()
One thing that you should be aware about is stdout=subprocess.PIPE.
If your python script has a very large output, the pipes may overflow causing your script to block until .communicate is called over the handle.
To avoid this, pass a file handle to stdout, like this
fileHandle = open("main_output.txt", "w")
subprocess.Popen(..., stdout=fileHandle)
In this way, the output of the python process will be dumped into the file.(You will have to modily the getOutput() function too for this)
import subprocess
process = None
def run_or_rerun(flag):
global process
if flag:
assert(process is None)
process = subprocess.Popen(['python', 'ABC.PY'])
process.wait() # must wait or caller will hang
else:
if process.poll() is None: # it is still running
process.terminate() # terminate process
process = subprocess.Popen(['python', 'ABC.PY']) # rerun
process.wait() # must wait or caller will hang
I have this code:
class ExtendedProcess(multiprocessing.Process):
def __init__(self):
super(ExtendedProcess, self).__init__()
self.stop_request = multiprocessing.Event()
def join(self, timeout=None):
logging.debug("stop request received")
self.stop_request.set()
super(ExtendedProcess, self).join(timeout)
def run(self):
logging.debug("process has started")
while not self.stop_request.is_set():
print "doing something"
logging.debug("proc is stopping")
When I call start() on the process it should be running forever, since self.stop_request() is not set. After some miliseconds join() is being called by itself and breaking run. What is going on!? why is join being called by itself?
Moreover, when I start a debugger and go line by line it's suddenly working fine.... What am I missing?
OK, thanks to ely's answer the reason hit me:
There is a race condition -
new process created...
as it's starting itself and about to run logging.debug("process has started") the main function hits end.
main function calls sys exit and on sys exit python calls for all finished processes to close with join().
since the process didn't actually hit "while not self.stop_request.is_set()" join is called and "self.stop_request.set()". Now stop_request.is_set and the code closes.
As mentioned in the updated question, this is because of a race condition. Below I put an initial example highlighting a simplistic race condition where the race is against the overall program exit, but this could also be caused by other types of scope exits or other general race conditions involving your process.
I copied your class definition and added some "main" code to run it, here's my full listing:
import logging
import multiprocessing
import time
class ExtendedProcess(multiprocessing.Process):
def __init__(self):
super(ExtendedProcess, self).__init__()
self.stop_request = multiprocessing.Event()
def join(self, timeout=None):
logging.debug("stop request received")
self.stop_request.set()
super(ExtendedProcess, self).join(timeout)
def run(self):
logging.debug("process has started")
while not self.stop_request.is_set():
print("doing something")
time.sleep(1)
logging.debug("proc is stopping")
if __name__ == "__main__":
p = ExtendedProcess()
p.start()
while True:
pass
The above code listing runs as expected for me using both Python 2.7.11 and 3.6.4. It loops infinitely and the process never terminates:
ely#eschaton:~/programming$ python extended_process.py
doing something
doing something
doing something
doing something
doing something
... and so on
However, if I instead use this code in my main section, it exits right away (as expected):
if __name__ == "__main__":
p = ExtendedProcess()
p.start()
This exits because the interpreter reaches the end of the program, which in turn triggers automatically destroying the p object as it goes out of scope of the whole program.
Note this could also explain why it works for you in the debugger. That is an interactive programming session, so after you start p, the debugger environment allows you to wait around and inspect it ... it would not be automatically destroyed unless you somehow invoked it within some scope that is exited while stepping through the debugger.
Just to verify the join behavior too, I also tried with this main block:
if __name__ == "__main__":
log = logging.getLogger()
log.setLevel(logging.DEBUG)
p = ExtendedProcess()
p.start()
st_time = time.time()
while time.time() - st_time < 5:
pass
p.join()
print("Finished!")
and it works as expected:
ely#eschaton:~/programming$ python extended_process.py
DEBUG:root:process has started
doing something
doing something
doing something
doing something
doing something
DEBUG:root:stop request received
DEBUG:root:proc is stopping
Finished!
I would like to repeatedly execute a subprocess as fast as possible. However, sometimes the process will take too long, so I want to kill it.
I use signal.signal(...) like below:
ppid=pipeexe.pid
signal.signal(signal.SIGALRM, stop_handler)
signal.alarm(1)
.....
def stop_handler(signal, frame):
print 'Stop test'+testdir+'for time out'
if(pipeexe.poll()==None and hasattr(signal, "SIGKILL")):
os.kill(ppid, signal.SIGKILL)
return False
but sometime this code will try to stop the next round from executing.
Stop test/home/lu/workspace/152/treefit/test2for time out
/bin/sh: /home/lu/workspace/153/squib_driver: not found ---this is the next execution; the program wrongly stops it.
Does anyone know how to solve this? I want to stop in time not execute 1 second the time.sleep(n) often wait n seconds. I do not want that I want it can execute less than 1 second
You could do something like this:
import subprocess as sub
import threading
class RunCmd(threading.Thread):
def __init__(self, cmd, timeout):
threading.Thread.__init__(self)
self.cmd = cmd
self.timeout = timeout
def run(self):
self.p = sub.Popen(self.cmd)
self.p.wait()
def Run(self):
self.start()
self.join(self.timeout)
if self.is_alive():
self.p.terminate() #use self.p.kill() if process needs a kill -9
self.join()
RunCmd(["./someProg", "arg1"], 60).Run()
The idea is that you create a thread that runs the command and to kill it if the timeout exceeds some suitable value, in this case 60 seconds.
Here is something I wrote as a watchdog for subprocess execution. I use it now a lot, but I'm not so experienced so maybe there are some flaws in it:
import subprocess
import time
def subprocess_execute(command, time_out=60):
"""executing the command with a watchdog"""
# launching the command
c = subprocess.Popen(command)
# now waiting for the command to complete
t = 0
while t < time_out and c.poll() is None:
time.sleep(1) # (comment 1)
t += 1
# there are two possibilities for the while to have stopped:
if c.poll() is None:
# in the case the process did not complete, we kill it
c.terminate()
# and fill the return code with some error value
returncode = -1 # (comment 2)
else:
# in the case the process completed normally
returncode = c.poll()
return returncode
Usage:
return = subprocess_execute(['java', '-jar', 'some.jar'])
Comments:
here, the watchdog time out is in seconds; but it's easy to change to whatever needed by changing the time.sleep() value. The time_out will have to be documented accordingly;
according to what is needed, here it maybe more suitable to raise some exception.
Documentation: I struggled a bit with the documentation of subprocess module to understand that subprocess.Popen is not blocking; the process is executed in parallel (maybe I do not use the correct word here, but I think it's understandable).
But as what I wrote is linear in its execution, I really have to wait for the command to complete, with a time out to avoid bugs in the command to pause the nightly execution of the script.
I guess this is a common synchronization problem in event-oriented programming with threads and processes.
If you should always have only one subprocess running, make sure the current subprocess is killed before running the next one. Otherwise the signal handler may get a reference to the last subprocess run and ignore the older.
Suppose subprocess A is running. Before the alarm signal is handled, subprocess B is launched. Just after that, your alarm signal handler attempts to kill a subprocess. As the current PID (or the current subprocess pipe object) was set to B's when launching the subprocess, B gets killed and A keeps running.
Is my guess correct?
To make your code easier to understand, I would include the part that creates a new subprocess just after the part that kills the current subprocess. That would make clear there is only one subprocess running at any time. The signal handler could do both the subprocess killing and launching, as if it was the iteration block that runs in a loop, in this case event-driven with the alarm signal every 1 second.
Here's what I use:
class KillerThread(threading.Thread):
def __init__(self, pid, timeout, event ):
threading.Thread.__init__(self)
self.pid = pid
self.timeout = timeout
self.event = event
self.setDaemon(True)
def run(self):
self.event.wait(self.timeout)
if not self.event.isSet() :
try:
os.kill( self.pid, signal.SIGKILL )
except OSError, e:
#This is raised if the process has already completed
pass
def runTimed(dt, dir, args, kwargs ):
event = threading.Event()
cwd = os.getcwd()
os.chdir(dir)
proc = subprocess.Popen(args, **kwargs )
os.chdir(cwd)
killer = KillerThread(proc.pid, dt, event)
killer.start()
(stdout, stderr) = proc.communicate()
event.set()
return (stdout,stderr, proc.returncode)
A bit more complex, I added an answer to solve a similar problem: Capturing stdout, feeding stdin, and being able to terminate after some time of inactivity and/or after some overall runtime.
I need to do the following in Python. I want to spawn a process (subprocess module?), and:
if the process ends normally, to continue exactly from the moment it terminates;
if, otherwise, the process "gets stuck" and doesn't terminate within (say) one hour, to kill it and continue (possibly giving it another try, in a loop).
What is the most elegant way to accomplish this?
The subprocess module will be your friend. Start the process to get a Popen object, then pass it to a function like this. Note that this only raises exception on timeout. If desired you can catch the exception and call the kill() method on the Popen process. (kill is new in Python 2.6, btw)
import time
def wait_timeout(proc, seconds):
"""Wait for a process to finish, or raise exception after timeout"""
start = time.time()
end = start + seconds
interval = min(seconds / 1000.0, .25)
while True:
result = proc.poll()
if result is not None:
return result
if time.time() >= end:
raise RuntimeError("Process timed out")
time.sleep(interval)
There are at least 2 ways to do this by using psutil as long as you know the process PID.
Assuming the process is created as such:
import subprocess
subp = subprocess.Popen(['progname'])
...you can get its creation time in a busy loop like this:
import psutil, time
TIMEOUT = 60 * 60 # 1 hour
p = psutil.Process(subp.pid)
while 1:
if (time.time() - p.create_time()) > TIMEOUT:
p.kill()
raise RuntimeError('timeout')
time.sleep(5)
...or simply, you can do this:
import psutil
p = psutil.Process(subp.pid)
try:
p.wait(timeout=60*60)
except psutil.TimeoutExpired:
p.kill()
raise
Also, while you're at it, you might be interested in the following extra APIs:
>>> p.status()
'running'
>>> p.is_running()
True
>>>
I had a similar question and found this answer. Just for completeness, I want to add one more way how to terminate a hanging process after a given amount of time: The python signal library
https://docs.python.org/2/library/signal.html
From the documentation:
import signal, os
def handler(signum, frame):
print 'Signal handler called with signal', signum
raise IOError("Couldn't open device!")
# Set the signal handler and a 5-second alarm
signal.signal(signal.SIGALRM, handler)
signal.alarm(5)
# This open() may hang indefinitely
fd = os.open('/dev/ttyS0', os.O_RDWR)
signal.alarm(0) # Disable the alarm
Since you wanted to spawn a new process anyways, this might not be the best soloution for your problem, though.
A nice, passive, way is also by using a threading.Timer and setting up callback function.
from threading import Timer
# execute the command
p = subprocess.Popen(command)
# save the proc object - either if you make this onto class (like the example), or 'p' can be global
self.p == p
# config and init timer
# kill_proc is a callback function which can also be added onto class or simply a global
t = Timer(seconds, self.kill_proc)
# start timer
t.start()
# wait for the test process to return
rcode = p.wait()
t.cancel()
If the process finishes in time, wait() ends and code continues here, cancel() stops the timer. If meanwhile the timer runs out and executes kill_proc in a separate thread, wait() will also continue here and cancel() will do nothing. By the value of rcode you will know if we've timeouted or not. Simplest kill_proc: (you can of course do anything extra there)
def kill_proc(self):
os.kill(self.p, signal.SIGTERM)
Koodos to Peter Shinners for his nice suggestion about subprocess module. I was using exec() before and did not have any control on running time and especially terminating it. My simplest template for this kind of task is the following and I am just using the timeout parameter of subprocess.run() function to monitor the running time. Of course you can get standard out and error as well if needed:
from subprocess import run, TimeoutExpired, CalledProcessError
for file in fls:
try:
run(["python3.7", file], check=True, timeout=7200) # 2 hours timeout
print("scraped :)", file)
except TimeoutExpired:
message = "Timeout :( !!!"
print(message, file)
f.write("{message} {file}\n".format(file=file, message=message))
except CalledProcessError:
message = "SOMETHING HAPPENED :( !!!, CHECK"
print(message, file)
f.write("{message} {file}\n".format(file=file, message=message))