subprocess becomes defunct, `communicate()` hangs - python

In python 2.7 on Ubuntu 14.04, I launch a process like this:
bag_process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for i in range(5):
    print "Countdown: {}".format(5 - i - 1)
    time.sleep(1)
print "Sending SIGINT to PID {}".format(bag_process.pid)
bag_process.send_signal(signal.SIGINT)
(bag_out, bag_err) = bag_process.communicate()
The program hangs on the communicate() line. When I open another terminal, I run ps -ef | grep ### to find the pid of the subprocess, and I see it's <defunct>.
Why is the child program becoming defunct, and the parent program hanging on communicate()? Provided that the child truly exits after receiving SIGINT, how can I make the parent program reliably handle that without hanging?

The problem was that only the direct child was being signaled. Don't kill a process like this:
bag_process.send_signal(signal.SIGINT)
Instead, kill the process and all of its sub-processes like this:
parent = psutil.Process(bag_process.pid)
# psutil 2.0 renamed get_children() to children(); use children() on current versions
for child in parent.get_children(recursive=True):
    child.send_signal(signal.SIGINT)
bag_process.send_signal(signal.SIGINT)
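A likely reason for the hang, sketched here rather than taken from the answer above: communicate() reads until every copy of the pipe's write end is closed, so a grandchild that inherited the pipes keeps it blocked even after the direct child has died (the dead child shows up as <defunct> until it is reaped). The sketch assumes Python 3 on a POSIX system. The function name and the sh command line are made up for illustration, and SIGKILL is used instead of SIGINT so the result does not depend on how the shell treats SIGINT.

```python
import os
import signal
import subprocess


def demo_pipe_held_by_grandchild():
    """Return (hung_after_child_kill, finished_after_group_kill)."""
    # The shell starts a background `sleep` (the grandchild), which inherits
    # the stdout/stderr pipes, then prints a marker and waits.
    proc = subprocess.Popen(
        ["/bin/sh", "-c", "sleep 30 & echo started; wait"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        start_new_session=True,  # child becomes leader of its own process group
    )
    proc.stdout.readline()  # block until the grandchild has been started

    # Signal only the direct child, which is all send_signal()/kill() can reach.
    proc.kill()
    try:
        proc.communicate(timeout=1)
        return False, True  # no hang: nothing else was holding the pipes
    except subprocess.TimeoutExpired:
        # The shell is dead, but `sleep` still holds the write end of the pipes.
        pass

    # Signal the whole process group; the pipes close and communicate() returns.
    os.killpg(proc.pid, signal.SIGKILL)
    proc.communicate(timeout=5)
    return True, proc.returncode is not None
```

Starting the child with start_new_session=True makes its PID double as the process-group ID, which is what lets os.killpg() reach the grandchild without psutil. Descendants that move themselves into another group or session would still escape.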

Related

subprocess.Popen makes terminal crash after KeyboardInterrupt

I wrote a simple Python script, ./vader-shell, which uses subprocess.Popen to launch a spark-shell. I have to deal with KeyboardInterrupt, since otherwise the child process would not die:
command = ['/opt/spark/current23/bin/spark-shell']
command.extend(params)
p = subprocess.Popen(command)
try:
    p.communicate()
except KeyboardInterrupt:
    p.terminate()
This is what I see with ps f
When I actually interrupt with Ctrl-C, I see the processes dying (most of the time). However, the terminal starts acting weird: I don't see any cursor, and all the lines start to appear randomly.
I am really lost as to the best way to run a subprocess with this library and how to handle killing the child processes. What I want to achieve is basic: whenever my Python process is killed with Ctrl-C, I want the whole family of processes to be killed. I googled several solutions (os.kill, p.wait() after termination, calling subprocess.Popen(['reset']) after termination), but none of them worked.
Do you know the best way to kill the child when KeyboardInterrupt happens? Or do you know any other, more reliable library to use to spin up processes?
There is nothing blatantly wrong with your code. The problem is that the command you are launching tries to do stuff with the current terminal, and does not correctly restore the settings when shutting down. Replacing your command with a "sleep" like below will run just fine and stop on Ctrl+C without problems:
import subprocess
command = ['/bin/bash']
command.extend(['-c', 'sleep 600'])
p = subprocess.Popen(command)
try:
    p.communicate()
except KeyboardInterrupt:
    p.terminate()
I don't know what you're trying to do with spark-shell, but if you don't need its output you could try to redirect it to /dev/null so that it doesn't mess up the terminal display (subprocess.DEVNULL requires Python 3.3 or later):
p = subprocess.Popen(command, stdout=subprocess.DEVNULL)
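If the program's output is needed and it still leaves the terminal in a bad state, another option is to save the terminal settings before launching it and put them back afterwards. This is a minimal sketch, assuming Python 3 on a POSIX terminal; run_preserving_terminal is a hypothetical helper name, not something subprocess provides.

```python
import subprocess
import sys
import termios


def _save_terminal_settings():
    """Return (fd, attributes) for stdin, or None when stdin is not a terminal."""
    try:
        fd = sys.stdin.fileno()
        return fd, termios.tcgetattr(fd)
    except (AttributeError, ValueError, OSError, termios.error):
        return None  # stdin is a pipe, a file, closed, or replaced


def run_preserving_terminal(command):
    """Run command in the foreground and restore the terminal settings afterwards."""
    saved = _save_terminal_settings()
    proc = subprocess.Popen(command)
    try:
        return proc.wait()
    except KeyboardInterrupt:
        proc.terminate()
        return proc.wait()  # reap the child before touching the terminal
    finally:
        if saved is not None:
            fd, attributes = saved
            # TCSADRAIN applies the change once pending output has been written.
            termios.tcsetattr(fd, termios.TCSADRAIN, attributes)
```

This only undoes changes to the tty attributes (echo, line mode and so on). It does not clear the screen or undo escape sequences the program may have printed.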

How to kill subprocess after time.sleep()? [duplicate]

I am running some shell scripts with the subprocess module in Python. If a shell script runs too long, I would like to kill the subprocess. I thought it would be enough to pass timeout=30 to my run(..) statement.
Here is the code:
try:
    result = run(['utilities/shell_scripts/{0} {1} {2}'.format(
                      self.language_conf[key][1], self.proc_dir, config.main_file)],
                 shell=True,
                 check=True,
                 stdout=PIPE,
                 stderr=PIPE,
                 universal_newlines=True,
                 timeout=30,
                 bufsize=100)
except TimeoutExpired as timeout:
    ...
I have tested this call with some shell scripts that run for 120s. I expected the subprocess to be killed after 30s, but in fact the process finishes the 120s script and only then raises the timeout exception. Now the question: how can I kill the subprocess on timeout?
The documentation explicitly states that the process should be killed:
from the docs for subprocess.run:
"The timeout argument is passed to Popen.communicate(). If the timeout expires, the child process will be killed and waited for. The TimeoutExpired exception will be re-raised after the child process has terminated."
But in your case you're using shell=True, and I've seen issues like that before, because the blocking process is a child of the shell process.
I don't think you need shell=True if you decompose your arguments properly and your scripts have the proper shebang. You could try this:
result = run(
    [os.path.join('utilities/shell_scripts', self.language_conf[key][1]), self.proc_dir, config.main_file],  # don't compose argument line yourself
    shell=False,  # no shell wrapper
    check=True,
    stdout=PIPE,
    stderr=PIPE,
    universal_newlines=True,
    timeout=30,
    bufsize=100)
Note that I can reproduce this issue very easily on Windows (using Popen, but it's the same thing):
import subprocess,time
p=subprocess.Popen("notepad",shell=True)
time.sleep(1)
p.kill()
=> notepad stays open, probably because it manages to detach from the parent shell process.
import subprocess,time
p=subprocess.Popen("notepad",shell=False)
time.sleep(1)
p.kill()
=> notepad closes after 1 second
Funnily enough, if you remove time.sleep(), kill() works even with shell=True probably because it successfully kills the shell which is launching notepad.
I'm not saying you have exactly the same issue, I'm just demonstrating that shell=True is evil for many reasons, and not being able to kill/timeout the process is one more reason.
However, if you need shell=True for a reason, you can use psutil to kill all the children in the end. In that case, it's better to use Popen so you get the process id directly:
import subprocess,time,psutil
parent=subprocess.Popen("notepad",shell=True)
for _ in range(30): # 30 seconds
    if parent.poll() is not None: # process just ended
        break
    time.sleep(1)
else:
    # the for loop ended without break: timeout
    parent = psutil.Process(parent.pid)
    for child in parent.children(recursive=True): # or parent.children() for recursive=False
        child.kill()
    parent.kill()
(source: how to kill process and child processes from python?)
That example kills the notepad instance as well.
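On POSIX systems the same idea can be sketched without psutil by putting the shell into its own process group and killing the group when the timeout expires. This is an illustration under assumptions, not the documented behaviour of subprocess.run(): it needs Python 3, run_shell_with_timeout is a hypothetical helper, and it will miss descendants that move themselves into a different group or session.

```python
import os
import signal
import subprocess


def run_shell_with_timeout(command, timeout):
    """Run a shell command line; on timeout, kill the shell and what it started."""
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        start_new_session=True,  # new session, so proc.pid is also the group ID
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(proc.pid, signal.SIGKILL)  # shell and its descendants
        except ProcessLookupError:
            pass  # the group vanished between the timeout and the kill
        proc.communicate()  # pipes are closed now, so this returns promptly
        raise
    return proc.returncode, out, err
```

Because the group is killed before the exception is re-raised, the caller sees TimeoutExpired after roughly `timeout` seconds rather than after the script has run to completion.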

Launch a completely independent process

I want to initiate a process from my python script main.py. Specifically, I want to run the below command:
`nohup python ./myfile.py &`
and the file myfile.py should continue running, even after the main.py script exits.
I also wish to get the pid of the new process.
I tried:
os.spawnl*
os.exec*
subprocess.Popen
and all are terminating the myfile.py when the main.py script exits.
Update: Can I use os.startfile with xdg-open? Is it the right approach?
Example
a = subprocess.Popen([sys.executable, "nohup /usr/bin/python25 /long_process.py &"],\
stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
print a.pid
If I check ps aux | grep long_process, there is no process running.
long_process.py which keeps on printing some text: no exit.
Am I doing anything wrong here?
You open your long-running process and keep a pipe to it, so you are expected to talk to it. When your launcher script exits, nothing is left to read from that pipe. The long-running process receives a SIGPIPE the next time it writes to it, and exits.
The following just worked for me (Linux, Python 2.7).
Create a long-running executable:
$ echo "sleep 100" > ~/tmp/sleeper.sh
Run Python REPL:
$ python
>>>
import subprocess
import os
p = subprocess.Popen(['/bin/sh', os.path.expanduser('~/tmp/sleeper.sh')])
# look ma, no pipes!
print p.pid
# prints 29893
Exit the REPL and see the process still running:
>>> ^D
$ ps ax | grep sleeper
29893 pts/0 S 0:00 /bin/sh .../tmp/sleeper.sh
29917 pts/0 S+ 0:00 grep --color=auto sleeper
If you want to first communicate to the started process and then leave it alone to run further, you have a few options:
Handle SIGPIPE in your long-running process, do not die on it. Live without stdin after the launcher process exits.
Pass whatever you wanted using arguments, environment, or a temporary file.
If you want bidirectional communication, consider using a named pipe (man mkfifo) or a socket, or writing a proper server.
Make the long-running process fork after the initial bidirectional communication phase is done.
You can use os.fork().
import os
pid=os.fork()
if pid==0: # new process
    os.system("nohup python ./myfile.py &")
    exit()
# parent process continues
I could not see any process running.
You don't see any process running because the child python process exits immediately. The Popen arguments are incorrect as user4815162342 says in the comment.
To launch a completely independent process, you could use python-daemon package or use systemd/supervisord/etc:
#!/usr/bin/python25
import daemon
from long_process import main
with daemon.DaemonContext():
    main()
Though in your case it might be enough to start the child with the correct Popen arguments:
import os
import sys
import time
from subprocess import Popen, STDOUT
with open(os.devnull, 'r+b', 0) as DEVNULL:
    p = Popen(['/usr/bin/python25', '/path/to/long_process.py'],
              stdin=DEVNULL, stdout=DEVNULL, stderr=STDOUT, close_fds=True)
time.sleep(1) # give it a second to launch
if p.poll(): # the process already finished and it has nonzero exit code
    sys.exit(p.returncode)
If the child process doesn't require python2.5 then you could use sys.executable instead (to use the same Python version as the parent).
Note: the code closes DEVNULL in the parent without waiting for the child process to finish (it has no effect on the child).
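On Python 3.3 or later, a similar launch can be sketched with subprocess.DEVNULL and start_new_session=True, which runs the child in its own session so that a Ctrl+C aimed at the launcher's process group does not reach it. The helper names below are hypothetical, and this is an illustration rather than a replacement for python-daemon or a service manager.

```python
import subprocess
import sys


def launch_detached(args):
    """Start args in its own session with no pipes tied to this process."""
    return subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # setsid() is called in the child before exec
        close_fds=True,
    )


def launch_python_script(path):
    """Convenience wrapper: run a script with the interpreter running this code."""
    return launch_detached([sys.executable, path])
```

The returned Popen object carries the new process ID in its pid attribute, for example launch_python_script("./myfile.py").pid, and the launcher can exit without waiting for the child.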

Kill a chain of sub processes on KeyboardInterrupt

I'm having a strange problem I've encountered as I wrote a script to start my local JBoss instance.
My code looks something like this:
with open("/var/run/jboss/jboss.pid", "wb") as f:
    process = subprocess.Popen(["/opt/jboss/bin/standalone.sh", "-b=0.0.0.0"])
    f.write(str(process.pid))
    try:
        process.wait()
    except KeyboardInterrupt:
        process.kill()
It should be fairly simple to understand: write the PID to a file while it's running and, once I get a KeyboardInterrupt, kill the child process.
The problem is that JBoss keeps running in the background after I send the kill signal, as it seems that the signal doesn't propagate down to the Java process started by standalone.sh.
I like the idea of using Python to write system management scripts, but there are a lot of weird edge cases like this where if I would have written it in Bash, everything would have just worked™.
How can I kill the entire subprocess tree when I get a KeyboardInterrupt?
You can do this using the psutil library:
import psutil
#..
proc = psutil.Process(process.pid)
for child in proc.children(recursive=True):
    child.kill()
proc.kill()
As far as I know the subprocess module does not offer any API function to retrieve the children spawned by subprocesses, nor does the os module.
A better way of killing the processes would probably be the following:
proc = psutil.Process(process.pid)
procs = proc.children(recursive=True)
procs.append(proc)
for proc in procs:
    proc.terminate()
gone, alive = psutil.wait_procs(procs, timeout=1)
for p in alive:
    p.kill()
This would give a chance to the processes to terminate correctly and when the timeout ends the remaining processes will be killed.
Note that psutil also provides a Popen class that has the same interface as subprocess.Popen plus all the extra functionality of psutil.Process. You may want to simply use that instead of subprocess.Popen. It is also safer because psutil checks that PIDs don't get reused if a process terminates, while subprocess doesn't.

Kill a running subprocess call

I'm launching a program with subprocess on Python.
In some cases the program may freeze. This is out of my control. The only thing I can do from the command line it is launched from is Ctrl+Esc, which kills the program quickly.
Is there any way to emulate this with subprocess? I am using subprocess.Popen(cmd, shell=True) to launch the program.
Well, there are a couple of methods on the object returned by subprocess.Popen() which may be of use: Popen.terminate() and Popen.kill(), which send a SIGTERM and SIGKILL respectively.
For example...
import subprocess
import time
process = subprocess.Popen(cmd, shell=True)
time.sleep(5)
process.terminate()
...would terminate the process after five seconds.
Or you can use os.kill() to send other signals, like SIGINT to simulate CTRL-C, with...
import subprocess
import time
import os
import signal
process = subprocess.Popen(cmd, shell=True)
time.sleep(5)
os.kill(process.pid, signal.SIGINT)
p = subprocess.Popen("echo 'foo' && sleep 60 && echo 'bar'", shell=True)
p.kill()
Check out the docs on the subprocess module for more info: http://docs.python.org/2/library/subprocess.html
You can use two signals to kill a running subprocess call, namely signal.SIGTERM and signal.SIGKILL. For example:
import subprocess
import os
import signal
import time
..
process = subprocess.Popen(..)
..
# killing all processes in the group
# (os.killpg() takes a process-group ID; process.pid is one only if the child
# was started as a group leader, e.g. with preexec_fn=os.setsid)
os.killpg(process.pid, signal.SIGTERM)
time.sleep(2)
if process.poll() is None: # Force kill if process is still alive
    time.sleep(3)
    os.killpg(process.pid, signal.SIGKILL)
Your question is not too clear, but I assume that you are about to launch a process which may turn into a zombie, and that you want to be able to handle that at some point in your script. If that is the case, I propose the following:
p = subprocess.Popen([cmd_list], shell=False)
Passing the command through the shell is not really recommended.
I would suggest you use shell=False; that way there is less risk of an overflow.
# Get the process id & try to terminate it gracefully
pid = p.pid
p.terminate()
# Check if the process has really terminated & force kill if not.
try:
    os.kill(pid, 0)
    p.kill()
    print "Forced kill"
except OSError, e:
    print "Terminated gracefully"
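One caveat with the os.kill(pid, 0) check: right after terminate() the process usually still exists (or is a zombie that has not been reaped yet), so the check tends to succeed and the forced kill runs anyway. A variant that gives the process a grace period could look like the sketch below. It assumes Python 3.3 or later for wait(timeout=...), and stop_process is a hypothetical helper name.

```python
import subprocess


def stop_process(proc, grace=5.0):
    """Send SIGTERM; send SIGKILL if proc is still alive after `grace` seconds.

    Returns "graceful" or "forced".
    """
    proc.terminate()
    try:
        proc.wait(timeout=grace)  # also reaps the child if it exits in time
        return "graceful"
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
        return "forced"
```

Waiting on the Popen object instead of probing the PID also avoids signalling an unrelated process if the PID has been reused in the meantime.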
The following command worked for me:
os.system("pkill -TERM -P %s"%process.pid)
Try wrapping your subprocess.Popen call in a try except block. Depending on why your process is hanging, you may be able to cleanly exit. Here is a list of exceptions you can check for: Python 3 - Exceptions Handling
