Stopping a subprocess when a specific string is found in the logs - python

This is a question about subprocesses.
I'm working on using this asynchronous script for some of the work I'm doing: https://github.com/ethereum/trinity/blob/master/scripts/peer.py
The functionality of the script doesn't matter as much as the way I want to use it.
Since it's asynchronous, I want to run it in a subprocess with different values, and for each subprocess I want to wait for a certain timeout before checking the script's logs for a string. If I find the string I'm looking for, I stop the subprocess for that parameter, pass in a new parameter, and repeat the process.
From a high level, this is the subprocess script I'm trying out.
import subprocess

enode = 'enode://ecd3f3de6fc1a69fdbb459ccfeedb9ae4b#127.0.0.1:30303'
command = [
    'python',
    '-m',
    'scripts.peer',
    '-mainnet',
    '-enode',
    enode
]
proc = subprocess.Popen(command)
try:
    outs, errs = proc.communicate(timeout=15)
except subprocess.TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
print(outs)
print(errs)
This code doesn't work and never exits. How can I use subprocess with an async script so that I can halt it as soon as the value I'm grepping for shows up in the subprocess's logs?
The string I'm looking for in the logs is: failed DAO fork check validation, which I'd use as my trigger to stop the script.

You'll need to explicitly set the output and error streams of the subprocess to subprocess.PIPE in order to read from them, like so:
# ...
proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
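From there, a minimal sketch of the whole loop might look like the following. This is only an outline: it assumes the peer script writes its log lines to stdout/stderr and that merging the two streams is acceptable, and the helper name and the enode list are placeholders, not part of the original script.

import subprocess

TARGET = 'failed DAO fork check validation'

def run_until_match(enode):
    # Launch scripts.peer for one enode and stop as soon as TARGET shows up in its logs.
    command = ['python', '-m', 'scripts.peer', '-mainnet', '-enode', enode]
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge the log stream into stdout
        universal_newlines=True,
    )
    try:
        for line in proc.stdout:
            print(line, end='')
            if TARGET in line:
                break               # trigger string found, stop this run
    finally:
        proc.kill()                 # stop this subprocess before moving on
        proc.wait()                 # reap it so no zombie is left behind

for enode in ['enode://...#127.0.0.1:30303']:  # placeholder list of parameters
    run_until_match(enode)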

Related

How to make a generic method in Python to execute multiple piped shell commands?

I have many shell commands that need to be executed in my python script. I know that I shouldn't use shell=True as mentioned here, and that I can wire up the standard outputs and inputs when there are pipes in the command, as mentioned here.
But the problem is that my shell commands are complex and full of pipes, so I'd like to make a generic method to be used by my script.
I made a small test below, but it hangs after printing the result (I simplified it just to post here). Can somebody please let me know:
Why it is hanging.
If there's a better method of doing this.
Thanks.
PS: This is just a small portion of a big python project and there are business reasons why I'm trying to do this. Thanks.
#!/usr/bin/env python3
import subprocess as sub
from subprocess import Popen, PIPE
import shlex

def exec_cmd(cmd, p=None, isFirstLoop=True):
    if not isFirstLoop and not p:
        print("Error, p is null")
        exit()
    if "|" in cmd:
        cmds = cmd.split("|")
        while "|" in cmd:
            # separates what is before and what is after the first pipe
            now_cmd = cmd.split('|', 1)[0].strip()
            next_cmd = cmd.split('|', 1)[-1].strip()
            try:
                if isFirstLoop:
                    p1 = sub.Popen(shlex.split(now_cmd), stdout=PIPE)
                    exec_cmd(next_cmd, p1, False)
                else:
                    p2 = sub.Popen(shlex.split(now_cmd), stdin=p.stdout, stdout=PIPE)
                    exec_cmd(next_cmd, p2, False)
            except Exception as e:
                print("Error executing command '{0}'.\nOutput:\n:{1}".format(cmd, str(e)))
                exit()
            # Adjust cmd to execute the next part
            cmd = next_cmd
    else:
        proc = sub.Popen(shlex.split(cmd), stdin=p.stdout, stdout=PIPE, universal_newlines=True)
        (out, err) = proc.communicate()
        if err:
            print(str(err).strip())
        else:
            print(out)

exec_cmd("ls -ltrh | awk '{print $9}' | wc -l ")
Instead of using a shell string and trying to parse it with your own means, I'd ask the user to provide the commands as separate entities. This avoids the obvious trap of detecting a | that is part of a command rather than used as a shell pipe. Whether you ask for the commands as lists of strings or as single strings that you shlex.split afterwards is up to the interface you want to expose; I'd choose the former for its simplicity in the following example.
Once you have the individual commands, a simple for loop is enough to pipe outputs of the previous commands to inputs of the next ones, as you have found yourself:
def pipe_subprocesses(*commands):
    if not commands:
        return

    next_input = None
    for command in commands:
        p = subprocess.Popen(command, stdin=next_input, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        next_input = p.stdout

    out, err = p.communicate()
    if err:
        print(err.decode().strip())
    else:
        print(out.decode())
Usage being:
>>> pipe_subprocesses(['ls', '-lhtr'], ['awk', '{print $9}'], ['wc', '-l'])
25
Now this is a quick and dirty way to get it set up and have it seemingly work as you want. But there are at least two issues with this code:
You leak zombie processes / open process handles, because no process's exit code but the last one's is collected, and the OS keeps resources open for you to do so;
You can't access the information of a process that fails midway through.
To avoid that, you need to maintain a list of opened processes and explicitly wait for each of them. Because I don't know your exact use case, I'll just return the first process that failed (if any) or the last process (if not) so you can act accordingly:
def pipe_subprocesses(*commands):
    if not commands:
        return

    processes = []
    next_input = None
    for command in commands:
        if isinstance(command, str):
            command = shlex.split(command)
        p = subprocess.Popen(command, stdin=next_input, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        next_input = p.stdout
        processes.append(p)

    for p in processes:
        p.wait()

    for p in processes:
        if p.returncode != 0:
            return p
    return p  # return the last process in case everything went well
I also threw in some shlex usage as an example so you can mix raw strings and already-parsed lists:
>>> pipe_subprocesses('ls -lhtr', ['awk', '{print $9}'], 'wc -l')
25
This unfortunately has a few edge cases in it that the shell takes care of for you, or alternatively, that the shell completely ignores for you. Some concerns:
The function should always wait() for every process to finish, or else you will get what are called zombie processes.
The commands should be connected to each other using real pipes, that way the entire output doesn't need to be read into memory at once. This is the normal way pipes work.
The read end of every pipe should be closed in the parent process, so children can properly SIGPIPE when the next process closes its input. Without this, the parent process can keep the pipe open and the child does not know to exit, and it may run forever.
Errors in child processes should be raised as exceptions, except SIGPIPE. It is left as an exercise to the reader to raise exceptions for SIGPIPE on the final process because SIGPIPE is not expected there, but ignoring it is not harmful.
Note that subprocess.DEVNULL does not exist prior to Python 3.3. I know there are some of you out there still living with 2.x, you will have to open a file for /dev/null manually or just decide that the first process in the pipeline gets to share stdin with the parent process.
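For the pre-3.3 fallback mentioned above, a minimal sketch is simply opening os.devnull yourself and passing the file object as stdin (the wc command here is just an arbitrary stand-in):

import os
import subprocess

# Python < 3.3 has no subprocess.DEVNULL, so open /dev/null manually.
devnull = open(os.devnull, 'rb')
try:
    proc = subprocess.Popen(['wc', '-l'], stdin=devnull, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
finally:
    devnull.close()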
Here is the code:
import signal
import subprocess

def run_pipe(*cmds):
    """Run a pipe that chains several commands together."""
    pipe = subprocess.DEVNULL
    procs = []
    try:
        for cmd in cmds:
            proc = subprocess.Popen(cmd, stdin=pipe,
                                    stdout=subprocess.PIPE)
            procs.append(proc)
            if pipe is not subprocess.DEVNULL:
                pipe.close()
            pipe = proc.stdout
        stdout, _ = proc.communicate()
    finally:
        # Must call wait() on every process, otherwise you get
        # zombies.
        for proc in procs:
            proc.wait()
    # Fail if any command in the pipe failed, except due to SIGPIPE
    # which is expected.
    for proc in procs:
        if (proc.returncode
                and proc.returncode != -signal.SIGPIPE):
            raise subprocess.CalledProcessError(
                proc.returncode, proc.args)
    return stdout
Here we can see it in action. You can see that the pipeline correctly terminates with yes (which runs until SIGPIPE) and correctly fails with false (which always fails).
In [1]: run_pipe(["yes"], ["head", "-n", "1"])
Out[1]: b'y\n'
In [2]: run_pipe(["false"], ["true"])
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
<ipython-input-2-db97c6876cd7> in <module>()
----> 1 run_pipe(["false"], ["true"])
~/test.py in run_pipe(*cmds)
22 for proc in procs:
23 if proc.returncode and proc.returncode != -signal.SIGPIPE:
---> 24 raise subprocess.CalledProcessError(proc.returncode, proc.args)
25 return stdout
CalledProcessError: Command '['false']' returned non-zero exit status 1

How to validate subprocess.check_output()?

I am executing commands through subprocess.check_output() because I want the output to be stored in a buffer.
Now, while doing this, if the command fails or there is any error, it causes a problem for my whole application.
What I want is that, even if the command fails, it should just print the error and go on to the next instruction.
Can anybody help me to solve it?
Below is the sample code.
from subprocess import check_output

buff = check_output(["command", "argument"])
if buff == "condition":
    print "Do Some Task "
One way to do this would be to use the Popen.communicate method on a Popen instance.
from subprocess import Popen, PIPE

proc = Popen(["command", "argument"], stdout=PIPE, stderr=PIPE)
out, err = proc.communicate()  # Blocks until finished
if proc.returncode != 0:  # failed in some way
    pass  # handle however you want
# continue here
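If you would rather keep check_output itself, catching subprocess.CalledProcessError gives the same "print and carry on" behaviour. A minimal sketch, reusing "command" and "argument" as the same placeholders from the question:

from subprocess import check_output, CalledProcessError

try:
    buff = check_output(["command", "argument"])
except CalledProcessError as exc:
    # The command failed: report it and move on to the next instruction.
    print("command failed with exit code %d" % exc.returncode)
    buff = exc.output  # whatever the command printed before failing
if buff == "condition":
    print("Do Some Task")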

Filter out command that needs a terminal in Python subprocess module

I am developing a robot that accepts commands from network (XMPP) and uses subprocess module in Python to execute them and sends back the output of commands. Essentially it is an SSH-like XMPP-based non-interactive shell.
The robot only executes commands from authenticated trusted sources, so arbitrary shell commands are allowed (shell=True).
However, when I accidentally send some command that needs a tty, the robot is stuck.
For example:
subprocess.check_output(['vim'], shell=False)
subprocess.check_output('vim', shell=True)
When either of the above commands is received, the robot gets stuck, and the terminal from which the robot is run is broken.
Though the robot only receives commands from authenticated trusted sources, humans err. How could I make the robot filter out those commands that will break itself? I know there is os.isatty, but how could I utilize it? Is there a way to detect those "bad" commands and refuse to execute them?
TL;DR:
Say, there are two kinds of commands:
Commands like ls: does not need a tty to run.
Commands like vim: needs a tty; breaks subprocess if no tty is given.
How can I tell whether a command is ls-like or vim-like, and refuse to run it if it is vim-like?
What you expect is a function that receives a command as input and returns meaningful output by running the command.
Since the command is arbitrary, requiring a tty is just one of many bad cases that may happen (others include running an infinite loop). Your function should only be concerned with the command's running period; in other words, whether a command is "bad" should be determined by whether it ends in a limited time. Since subprocess is asynchronous by nature, you can just run the command and handle it from a higher level.
Demo code to play, you can change the cmd value to see how it performs differently:
#!/usr/bin/env python
# coding: utf-8

import time
import subprocess
from subprocess import PIPE

#cmd = ['ls']
#cmd = ['sleep', '3']
cmd = ['vim', '-u', '/dev/null']

print 'call cmd'
# cmd is a list of arguments, so run it directly without shell=True
p = subprocess.Popen(cmd,
                     stdin=PIPE, stderr=PIPE, stdout=PIPE)
print 'called', p

time_limit = 2
timer = 0
time_gap = 0.2
ended = False
while True:
    time.sleep(time_gap)
    returncode = p.poll()
    print 'process status', returncode
    timer += time_gap
    if timer >= time_limit:
        print 'timeout, kill process'
        p.kill()
        break
    if returncode is not None:
        ended = True
        break
if ended:
    print 'process ended by', returncode
    print 'read'
    out, err = p.communicate()
    print 'out', repr(out)
    print 'error', repr(err)
else:
    print 'process failed'
Three points are notable in the above code:
We use Popen instead of check_output to run the command. Unlike check_output, which waits for the process to end, Popen returns immediately, so we can do further things to control the process.
We implement a timer to check the process's status; if it runs for too long, we kill it manually, because we consider a process not meaningful if it cannot end in a limited time. In this way your original problem is solved, as vim never ends and will definitely be killed as an "unmeaningful" command.
After the timer helps us filter out bad commands, we can get the command's stdout and stderr by calling the communicate method of the Popen object; after that it's your choice what to return to the user.
Conclusion
tty simulation is not needed. We should run the subprocess asynchronously, then control it with a timer to determine whether it should be killed; for those that ended normally, it's safe and easy to get the output.
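On Python 3, the same timer idea can be expressed directly with communicate(timeout=...), which raises subprocess.TimeoutExpired when the limit is hit. A minimal sketch of that variant, mirroring the 2-second limit and the vim command from the demo above:

import subprocess

cmd = ['vim', '-u', '/dev/null']
p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
try:
    out, err = p.communicate(timeout=2)   # wait at most 2 seconds
    print('process ended by', p.returncode)
except subprocess.TimeoutExpired:
    print('timeout, kill process')
    p.kill()
    out, err = p.communicate()            # collect whatever was produced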
Well, SSH is already a tool that will allow users to remotely execute commands and be authenticated at the same time. The authentication piece is extremely tricky; please be aware that building the software you're describing is a bit risky from a security perspective.
There isn't a way to determine whether a process is going to need a tty or not. And os.isatty won't help, because running a subprocess that needed a tty wouldn't mean there was one. :)
In general, it would probably be safer from a security perspective and also a solution to this problem if you were to consider a white list of commands. You could choose that white list to avoid things that would need a tty, because I don't think you'll easily get around this.
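A minimal sketch of such a white list (the allowed set and the helper name here are made up for illustration):

import shlex
import subprocess

ALLOWED = {'ls', 'cat', 'grep', 'df', 'uptime'}  # commands known not to need a tty

def run_whitelisted(command_line):
    # Run the command only if its program name is on the white list.
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED:
        return 'refused: command is not on the white list'
    return subprocess.check_output(argv, universal_newlines=True)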
Thanks a lot for J.F. Sebastian's help (see comments under the question); I've found a solution (workaround?) for my case.
The reason why vim breaks the terminal while ls does not is that vim needs a tty. As Sebastian says, we can feed vim a pty using pty.openpty(). Feeding a pty guarantees the command will not break the terminal, and we can add a timeout to auto-kill such processes. Here is a (dirty) working example:
#!/usr/bin/env python3
import pty
from subprocess import STDOUT, check_output, TimeoutExpired

master_fd, slave_fd = pty.openpty()

try:
    output1 = check_output(['ls', '/'], stdin=slave_fd, stderr=STDOUT, universal_newlines=True, timeout=3)
    print(output1)
except TimeoutExpired:
    print('Timed out')

try:
    output2 = check_output(['vim'], stdin=slave_fd, stderr=STDOUT, universal_newlines=True, timeout=3)
    print(output2)
except TimeoutExpired:
    print('Timed out')
Note it is stdin that we need to take care of, not stdout or stderr.
You can refer to my answer at https://stackoverflow.com/a/43012138/3555925, which uses a pseudo-terminal to make stdout non-blocking and uses select to handle stdin/stdout.
I can just modify the command variable to 'vim', and the script works fine.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import sys
import select
import termios
import tty
import pty
from subprocess import Popen

command = 'vim'

# save original tty setting then set it to raw mode
old_tty = termios.tcgetattr(sys.stdin)
tty.setraw(sys.stdin.fileno())

# open pseudo-terminal to interact with subprocess
master_fd, slave_fd = pty.openpty()

# use os.setsid() to make the process the leader of a new session,
# or bash job control will not be enabled
p = Popen(command,
          preexec_fn=os.setsid,
          stdin=slave_fd,
          stdout=slave_fd,
          stderr=slave_fd,
          universal_newlines=True)

while p.poll() is None:
    r, w, e = select.select([sys.stdin, master_fd], [], [])
    if sys.stdin in r:
        d = os.read(sys.stdin.fileno(), 10240)
        os.write(master_fd, d)
    elif master_fd in r:
        o = os.read(master_fd, 10240)
        if o:
            os.write(sys.stdout.fileno(), o)

# restore tty settings back
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_tty)

How to run a subprocess with Python, wait for it to exit and get the full stdout as a string?

So I noticed that while subprocess.call waits for the command to finish before proceeding with the python script, I have no way of getting the stdout, except with subprocess.Popen. Are there any alternative function calls that would wait until it finishes? (I also tried Popen.wait)
NOTE: I'm trying to avoid os.system call
result = subprocess.Popen([commands...,
                           self.tmpfile.path()], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = result.communicate()
print out + "HIHIHI"
my output:
HIHIHI
NOTE: I am trying to run wine with this.
I am using the following construct, although you might want to avoid shell=True. This gives you the output and error message for any command, and the error code as well:
process = subprocess.Popen(cmd, shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
# wait for the process to terminate
out, err = process.communicate()
errcode = process.returncode
subprocess.check_output(...)
calls the process, raises if its error code is nonzero, and otherwise returns its stdout. It's just a quick shorthand so you don't have to worry about PIPEs and things.
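For example, a minimal usage sketch (the ls invocation here is just an arbitrary stand-in):

import subprocess

# Waits for the command to finish, raises CalledProcessError on a non-zero
# exit status, and returns the captured stdout.
out = subprocess.check_output(['ls', '-l', '/tmp'], universal_newlines=True)
print(out)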
If your process gives a huge stdout and no stderr, communicate() might be the wrong way to go due to memory restrictions.
Instead,
process = subprocess.Popen(cmd, shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
# consume the output line by line as it is produced
for line in process.stdout:
    do_something(line)
errcode = process.wait()  # then wait for the process to terminate
might be the way to go.
process.stdout is a file-like object which you can treat as any other such object, mainly:
you can read() from it
you can readline() from it and
you can iterate over it.
The latter is what I do above in order to get its contents line by line.
I'd try something like:
#!/usr/bin/python
from __future__ import print_function

import shlex
from subprocess import Popen, PIPE

def shlep(cmd):
    '''shlex split and popen
    '''
    parsed_cmd = shlex.split(cmd)
    ## if parsed_cmd[0] not in approved_commands:
    ##     raise ValueError("Bad User! No output for you!")
    proc = Popen(parsed_cmd, stdout=PIPE, stderr=PIPE)
    out, err = proc.communicate()
    return (proc.returncode, out, err)
... In other words, let shlex.split() do most of the work. I would NOT attempt to parse the shell's command line, find pipe operators, and set up your own pipeline. If you're going to do that, then you'll basically have to write a complete shell syntax parser and you'll end up doing an awful lot of plumbing.
Of course this raises the question, why not just use Popen with the shell=True (keyword) option? This will let you pass a string (no splitting nor parsing) to the shell and still gather up the results to handle as you wish. My example here won't process any pipelines, backticks, file descriptor redirection, etc. that might be in the command; they'll all appear as literal arguments to the command. Thus it is still safer than running with shell=True ... I've given a silly example of checking the command against some sort of "approved command" dictionary or set, though it would make more sense to normalize that into an absolute path unless you intend to require that the arguments be normalized prior to passing the command string to this function.
With Python 3.8 this works for me, for instance to execute a python script within the venv:
import subprocess
import sys

res = subprocess.run(
    [
        sys.executable,  # venv3.8/bin/python
        'main.py',
        '--help',
    ],
    stdout=subprocess.PIPE,
    text=True
)
print(res.stdout)

Killing python ffmpeg subprocess breaks cli output

I'm trying to execute a system command with subprocess and reading the output.
But if the command takes more than 10 seconds I want to kill the subprocess.
I've tried doing this in several ways.
My last try was inspired by this post: https://stackoverflow.com/a/3326559/969208
Example:
import os
import signal
from subprocess import Popen, PIPE

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def pexec(args):
    p = Popen(args, stdout=PIPE, stderr=PIPE)
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(10)
    stdout = stderr = ''
    try:
        stdout, stderr = p.communicate()
        signal.alarm(0)
    except Alarm:
        try:
            os.kill(p.pid, signal.SIGKILL)
        except:
            pass
    return (stdout, stderr)
The problem is: After the program exits no chars are shown in the cli until I hit return. And hitting return will not give me a new line.
I suppose this has something to do with the stdout and stderr pipe.
I've tried flushing and reading from the pipe (p.stdout.flush())
I've also tried with different Popen args, but might've missed something. Just thought I'd keep it simple here.
I'm running this on a Debian server.
Am I missing something here?
EDIT:
It seems this is only the case when killing an ongoing ffmpeg process. If the ffmpeg process exits normally before 10 seconds, there is no problem at all.
I've tried executing a couple of different commands that take longer than 10 seconds: one that prints output, one that doesn't, and an ffmpeg command that checks the integrity of a file.
args = ['sleep', '12s'] # Works fine
args = ['ls', '-R', '/var'] # Works fine, prints lots for a long time
args = ['ffmpeg', '-v', '1', '-i', 'large_file.mov','-f', 'null', '-'] # Breaks cli output
I believe ffmpeg prints using \r and prints everything on the stderr pipe. Can this be the cause? Any ideas how to fix it?
Well, your code surely works fine on my Ubuntu server
(which is a close cousin or brother of Debian, I suppose).
I added a few more lines so that I could test your code.
import os
import signal
from subprocess import Popen, PIPE

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def pexec(args):
    p = Popen(args, stdout=PIPE, stderr=PIPE)
    signal.signal(signal.SIGALRM, alarm_handler)
    signal.alarm(1)
    stdout = stderr = ''
    try:
        stdout, stderr = p.communicate()
        signal.alarm(0)
    except Alarm:
        print "Done!"
        try:
            os.kill(p.pid, signal.SIGKILL)
        except:
            pass
    return (stdout, stderr)

args = ('find', '/', '-name', '*')
stdout = pexec(args)
print "----------------------result--------------------------"
print stdout
print "----------------------result--------------------------"
Works like a charm.
If this code works on your server, I guess the problem actually lies in the
command-line application from which you are trying to retrieve data.
I have the same problem. I can't get a running FFmpeg to terminate gracefully from a python subprocess, so I am using <process>.kill(). However I think this means FFmpeg does not restore the mode of the tty properly (as described here: https://askubuntu.com/a/172747)
You can get your shell back by running reset at the bash prompt, but that clears the screen so you can't see your script's output as you continue to work.
Better is to run stty echo which turns echoing back on for your shell session.
You can even run this in your script after you've nuked FFmpeg. I am doing:
ffmpeg_popen.kill()
ffmpeg_popen.wait()
subprocess.call(["stty", "echo"])
This works for me on Ubuntu with bash as my shell. YMMV, but I hope it helps. It smells hacky but it's the best solution I've found.
I ran into a similar issue with ffmpeg. It seems that if ffmpeg is killed using Popen.kill() it does not properly close and does not reinstate echoing on your terminal.
We can solve this using a pipe to stdin, and writing q to close ffmpeg as we would in a cli session:
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p.stdin.write(b"q")
It's probably preferable to use Popen.communicate in order to avoid a deadlock. The following will also work:
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p.communicate(b'q')
But it seems like even the following works:
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
p.kill()
I'm not sure what causes ffmpeg to close cleanly if it has an input pipe. Perhaps it has something to do with what causes this bug in the first place?
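Putting those pieces together, here is a hedged sketch of a "quit gracefully, then kill" helper. The helper name and the 5-second limit are my own choices, and it assumes the process was opened with stdin=PIPE as above:

import subprocess

def stop_ffmpeg(proc, timeout=5):
    # Ask ffmpeg to quit by sending 'q' on stdin, then fall back to kill().
    try:
        proc.communicate(input=b'q', timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()
    # In case the terminal was still left without echo, restore it.
    subprocess.call(['stty', 'echo'])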
