A: Why does it block?
B: How may I massage this slightly so that it will run without blocking?
#!/usr/bin/env python
import subprocess as sp
import os

kwds = dict(
    stdin=sp.PIPE,
    stdout=sp.PIPE,
    stderr=sp.PIPE,
    cwd=os.path.abspath(os.getcwd()),
    shell=True,
    executable='/bin/bash',
    bufsize=1,
    universal_newlines=True,
)
cmd = '/bin/bash'
proc = sp.Popen(cmd, **kwds)
proc.stdin.write('ls -lashtr\n')
proc.stdin.flush()

# This blocks and never returns
proc.stdout.read()
I need this to run interactively.
This is a simplified example, but the reality is that I have a long-running process, and I'd like to start up a shell script that can more or less run arbitrary code (because it's an installation script).
EDIT:
I would like to effectively take a .bash_history spanning several different logins, clean it up so it is a single script, and then execute the newly crafted shell script line by line within a shell managed by a Python script.
For example:
> ... ssh to remote aws system ...
> sudo su -
> apt-get install stuff
> su - $USERNAME
> ... create and enter a docker snapshot ...
> ... install packages, update configurations
> ... install new services, update service configurations ...
> ... drop out of snapshot ...
> ... commit the snapshot ...
> ... remove the snapshot ...
> ... update services ...
> ... restart services ...
> ... drop into a tmux within the new docker ...
This takes hours manually; it should be automated.
A: Why does it block?
It blocks because that's what .read() does: it reads all of the bytes until an end-of-file indication. Since the process never indicates end of file, the .read() never returns.
B: How may I massage this slightly (emphasis on slightly) so that it will run without blocking?
One thing to do is to cause the process to indicate end of file. A small change is to cause the subprocess to exit.
proc.stdin.write('ls -lashtr; exit\n')
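If the shell needs to stay alive across several commands, another small change (a sketch of my own, not part of the original code; the __CMD_DONE__ sentinel is an arbitrary marker) is to echo a sentinel after each command and read line by line until it appears, so that .read()'s wait-for-EOF behavior is never triggered:

# Hedged sketch: relies on the universal_newlines=True, bufsize=1
# Popen settings from the question, which make proc.stdout iterable
# line by line.
sentinel = '__CMD_DONE__'
proc.stdin.write('ls -lashtr; echo %s\n' % sentinel)
proc.stdin.flush()
for line in proc.stdout:
    if line.strip() == sentinel:
        break
    print(line, end='')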
This is an example from another answer of mine: https://stackoverflow.com/a/43012138/3555925, which does not use pexpect. You can find more detail in that answer.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import sys
import select
import termios
import tty
import pty
from subprocess import Popen

command = 'bash'
# command = 'docker run -it --rm centos /bin/bash'.split()

# save original tty setting, then set it to raw mode
old_tty = termios.tcgetattr(sys.stdin)
tty.setraw(sys.stdin.fileno())

# open pseudo-terminal to interact with subprocess
master_fd, slave_fd = pty.openpty()

# use os.setsid() to make it run in a new process group,
# or bash job control will not be enabled
p = Popen(command,
          preexec_fn=os.setsid,
          stdin=slave_fd,
          stdout=slave_fd,
          stderr=slave_fd,
          universal_newlines=True)

while p.poll() is None:
    r, w, e = select.select([sys.stdin, master_fd], [], [])
    if sys.stdin in r:
        d = os.read(sys.stdin.fileno(), 10240)
        os.write(master_fd, d)
    elif master_fd in r:
        o = os.read(master_fd, 10240)
        if o:
            os.write(sys.stdout.fileno(), o)

# restore tty settings
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_tty)
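To connect this back to running a prepared script line by line, here is a rough sketch of my own (an assumption, not part of the answer above): feed the commands to master_fd programmatically instead of copying them from the interactive terminal. script_lines is a hypothetical stand-in for the cleaned-up .bash_history, and writing all lines up front can fill the pty buffer on long scripts, so treat this only as a starting point.

#!/usr/bin/env python3
import os
import pty
import select
from subprocess import Popen

script_lines = ['echo hello', 'ls -lashtr', 'exit']  # hypothetical script

master_fd, slave_fd = pty.openpty()
p = Popen('bash', preexec_fn=os.setsid,
          stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)

# feed the script to the shell through the pty
for line in script_lines:
    os.write(master_fd, (line + '\n').encode())

# drain output until the shell exits
while p.poll() is None:
    r, _, _ = select.select([master_fd], [], [], 0.1)
    if master_fd in r:
        try:
            data = os.read(master_fd, 10240)
        except OSError:  # EIO once the slave side closes
            break
        if data:
            os.write(1, data)

os.close(master_fd)
os.close(slave_fd)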
Related
What specific syntax needs to be added to the Python 3 script below in order for the script to filter through each line of the results and evaluate whether any of the lines of output contain specific substrings?
Here is the code which now successfully runs a git clone command:
import subprocess

newpath = "C:\\path\\to\\destination\\"
cloneCommand = 'git clone https://github.com/someuser/somerepo.git'
proc = subprocess.check_call(cloneCommand, stdout=subprocess.PIPE, shell=True, cwd=newpath, timeout=None)
The above successfully clones the intended repo. But the problem is that there is no error handling.
I would like to be able to have the script listen for the words deltas and done in each line of output so that it can indicate success when the following line is printed in the output:
Resolving deltas: 100% (164/164), done.
subprocess.Popen(...) allows us to filter each line of the streaming output. However, it does not seem to work when we run remote commands like git clone, because it does not wait to receive the return from a remote call like git clone.
What syntax do we need to use to filter the output from calls to subprocess.check_call(...)?
A small script that we can execute to test our Popen code. It generates some STDOUT and STDERR before exiting with a code of our choosing, optionally with some delay:
from sys import stdout, stderr, exit, argv
from time import sleep
stdout.write('OUT 1\nOUT 2\nOUT 3\n')
sleep(2)
stderr.write('err 1\nerr 2\n')
exit(int(argv[1]))
A script demonstrating how to use Popen. The arguments to this script will be the external command that we want to execute.
import sys
from subprocess import Popen, PIPE

# A function that takes some subprocess command arguments (list, tuple,
# or string), runs that command, and returns a dict containing STDOUT,
# STDERR, PID, and exit code. Error handling is left to the caller.
def run_subprocess(cmd_args, shell=False):
    p = Popen(cmd_args, stdout=PIPE, stderr=PIPE, shell=shell)
    stdout, stderr = p.communicate()
    return dict(
        stdout=stdout.decode('utf-8').split('\n'),
        stderr=stderr.decode('utf-8').split('\n'),
        pid=p.pid,
        exit_code=p.returncode,
    )

# Run a command.
cmd = sys.argv[1:]
d = run_subprocess(cmd)

# Inspect the returned dict.
for k, v in d.items():
    print('\n#', k)
    print(v)
If the first script is called other_program.py and this script is called demo.py, you would run the whole thing along these lines:
python demo.py python other_program.py 0 # Exit with success.
python demo.py python other_program.py 1 # Exit with failure.
python demo.py python other_program.py X # Exit with a Python error.
Usage example with git clone as discussed with OP in comments:
$ python demo.py git clone --progress --verbose https://github.com/hindman/jump
# stdout
['']
# stderr
["Cloning into 'jump'...", 'POST git-upload-pack (165 bytes)', 'remote: Enumerating objects: 70, done. ', 'remote: Total 70 (delta 0), reused 0 (delta 0), pack-reused 70 ', '']
# pid
7094
# exit_code
0
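To answer the filtering question itself: with the dict returned by run_subprocess() you can scan the lines for the substrings you care about. A sketch under two assumptions of mine: the repo URL is the asker's placeholder, and git needs --progress for the "Resolving deltas" line to appear when its output is piped (git suppresses progress output when stderr is not a terminal). Note that git clone writes those messages to stderr, as the demo output above shows.

d = run_subprocess(['git', 'clone', '--progress',
                    'https://github.com/someuser/somerepo.git'])
# git clone reports its progress on stderr, not stdout
resolved = any('deltas' in line and 'done' in line for line in d['stderr'])
if d['exit_code'] == 0 and resolved:
    print('Clone succeeded')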
I am trying to use Sailfish, which takes multiple fastq files as arguments, in a ruffus pipeline. I execute Sailfish using the subprocess module in Python, but <() in the subprocess call does not work even when I set shell=True.
This is the command I want to execute using python:
sailfish quant [options] -1 <(cat sample1a.fastq sample1b.fastq) -2 <(cat sample2a.fastq sample2b.fastq) -o [output_file]
or (preferably):
sailfish quant [options] -1 <(gunzip sample1a.fastq.gz sample1b.fastq.gz) -2 <(gunzip sample2a.fastq.gz sample2b.fastq.gz) -o [output_file]
A generalization:
someprogram <(someprocess) <(someprocess)
How would I go about doing this in python? Is subprocess the right approach?
To emulate the bash process substitution:
#!/usr/bin/env python
from subprocess import check_call

check_call('someprogram <(someprocess) <(anotherprocess)',
           shell=True, executable='/bin/bash')
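Applied to the question's command, it might look like the sketch below ([options] is the asker's placeholder and must be filled in; gunzip -c is an assumption on my part, so that the decompressed data goes to the pipe instead of replacing the .gz files in place):

check_call(
    'sailfish quant [options]'
    ' -1 <(gunzip -c sample1a.fastq.gz sample1b.fastq.gz)'
    ' -2 <(gunzip -c sample2a.fastq.gz sample2b.fastq.gz)'
    ' -o output_dir',
    shell=True, executable='/bin/bash')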
In Python, you could use named pipes:
#!/usr/bin/env python
from subprocess import Popen

with named_pipes(n=2) as paths:
    someprogram = Popen(['someprogram'] + paths)
    processes = []
    for path, command in zip(paths, ['someprocess', 'anotherprocess']):
        with open(path, 'wb', 0) as pipe:
            processes.append(Popen(command, stdout=pipe, close_fds=True))
    for p in [someprogram] + processes:
        p.wait()
where named_pipes(n) is:
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def named_pipes(n=1):
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)
Another, more preferable way (no need to create a named entry on disk) to implement bash process substitution is to use /dev/fd/N filenames (if they are available), as suggested by @Dunes. On FreeBSD, fdescfs(5) (/dev/fd/#) creates entries for all file descriptors opened by the process. To test availability, run:
$ test -r /dev/fd/3 3</dev/null && echo /dev/fd is available
If it fails, try to symlink /dev/fd to proc(5), as is done on some Linux systems:
$ ln -s /proc/self/fd /dev/fd
Here's a /dev/fd-based implementation of the someprogram <(someprocess) <(anotherprocess) bash command:
#!/usr/bin/env python3
from contextlib import ExitStack
from subprocess import CalledProcessError, Popen, PIPE

def kill(process):
    if process.poll() is None:  # still running
        process.kill()

with ExitStack() as stack:  # for proper cleanup
    processes = []
    for command in [['someprocess'], ['anotherprocess']]:  # start child processes
        processes.append(stack.enter_context(Popen(command, stdout=PIPE)))
        stack.callback(kill, processes[-1])  # kill on someprogram exit
    fds = [p.stdout.fileno() for p in processes]
    someprogram = stack.enter_context(
        Popen(['someprogram'] + ['/dev/fd/%d' % fd for fd in fds], pass_fds=fds))
    for p in processes:  # close pipes in the parent
        p.stdout.close()
# exit stack: wait for processes
if someprogram.returncode != 0:  # errors shouldn't go unnoticed
    raise CalledProcessError(someprogram.returncode, someprogram.args)
Note: on my Ubuntu machine, the subprocess code works only in Python 3.4+, despite pass_fds being available since Python 3.2.
Whilst J.F. Sebastian has provided an answer using named pipes, it is possible to do this with anonymous pipes.
import shlex
from subprocess import Popen, PIPE

inputcmd0 = "zcat hello.gz"  # gzipped file containing "hello"
inputcmd1 = "zcat world.gz"  # gzipped file containing "world"

def get_filename(file_):
    return "/dev/fd/{}".format(file_.fileno())

def get_stdout_fds(*processes):
    return tuple(p.stdout.fileno() for p in processes)

# set up producer processes
inputproc0 = Popen(shlex.split(inputcmd0), stdout=PIPE)
inputproc1 = Popen(shlex.split(inputcmd1), stdout=PIPE)

# set up consumer process
# pass the input processes' pipes by "filename", e.g. /dev/fd/5
cmd = "cat {file0} {file1}".format(file0=get_filename(inputproc0.stdout),
                                   file1=get_filename(inputproc1.stdout))
print("command is:", cmd)

# the pass_fds argument tells Popen to let the child process inherit the pipes' fds
someprogram = Popen(shlex.split(cmd), stdout=PIPE,
                    pass_fds=get_stdout_fds(inputproc0, inputproc1))

output, error = someprogram.communicate()

for p in [inputproc0, inputproc1, someprogram]:
    p.wait()

assert output == b"hello\nworld\n"
If possible, I would like to not use subprocess.Popen. The reason I want to capture the stdout of the process started by the child is that I need to save the output of the child in a variable to display it back later. However, I have yet to find a way to do so anywhere. I also need to activate multiple programs without necessarily closing the one that's active, and I need to control the child process with the parent process.
I'm launching a subprocess like this:
import os
import sys

listProgram = ["./perroquet.py"]
listOutput = ["", "", ""]

tubePerroquet = os.pipe()
pipeMain = os.pipe()
pipeAge = os.pipe()
pipeSavoir = os.pipe()

pid = os.fork()
process = 1
if pid == 0:
    os.close(tubePerroquet[1])
    os.dup2(tubePerroquet[0], 0)
    sys.stdout = os.fdopen(pipeMain[1], 'w')
    os.execvp("./perroquet.py", listProgram)
Now, as you can see, I'm launching the program with os.execvp and using os.dup2() to redirect the stdout of the child. However, I'm not sure of what I've done in the code, and I want to know the correct way to redirect stdout with os.dup2 so that I can then read it in the parent process.
Thank you for your help.
I cannot understand why you do not want to use the excellent subprocess module, which could save you a lot of boilerplate code (and just as many error possibilities...). Anyway, I assume perroquet.py is a Python script, not an executable program. The shell knows how to find the correct interpreter for scripts, but the exec family are low-level functions that expect a real executable program.
You should at least have something like:
listProgram = ["python", "./perroquet.py", "", ""]
...
os.execvp("python", listProgram)
But I'd rather use:
prog = subprocess.Popen(("python", "./perroquet.py", "", ""), stdout=PIPE)
or even, as you are already in Python, import it and directly call its functions from there.
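For that last option, a rough sketch (assuming perroquet.py is importable and exposes a main() function, which is a hypothetical name, and Python 3.4+ for contextlib.redirect_stdout):

import io
from contextlib import redirect_stdout

import perroquet  # assumes perroquet.py is on the import path

buf = io.StringIO()
with redirect_stdout(buf):
    perroquet.main()  # hypothetical entry point
output = buf.getvalue()  # the captured output, ready to display later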
EDIT:
It looks like what you really want is:
user gives you a command (can be almost anything)
[you validate that the command is safe] - unsure whether you intend to do this, but you should...
you make the shell execute the command and get its output - you may want to read stderr too and check the exit code
You should try something like:
import subprocess

while True:
    cmd = raw_input("commande :")  # use input() on Python 3
    if cmd.strip().lower() == "exit":
        break
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, shell=True)
    out, err = proc.communicate()
    code = proc.returncode
    print("OUT", out, "ERR", err, "CODE", code)
It is absolutely unsafe, since this code executes any command just as the underlying shell would (including rm -rf *, rd /s/q ., ...), but it gives you the output, the errors, and the return code of the command, and it can be used in a loop. The only limitation is that, as you use a new shell for each command, you cannot use commands that change the shell environment - they will be executed but will have no effect.
Here's a solution if you need to extract any changes to the environment:
from subprocess import Popen, PIPE
import os

def execute_and_get_env(cmd, initial_env=None):
    if initial_env is None:
        initial_env = os.environ
    r_fd, w_fd = os.pipe()
    write_env = "; env >&{}".format(w_fd)
    p = Popen(cmd + write_env, shell=True, env=initial_env,
              pass_fds=[w_fd], stdout=PIPE, stderr=PIPE)
    output, error = p.communicate()
    # this will cause problems if the environment gets very large, as
    # writing to the pipe will hang because it gets full and we only
    # read from the pipe when the process is over
    os.close(w_fd)
    with open(r_fd) as f:
        env = dict(line[:-1].split("=", 1) for line in f)
    return output, error, env
export_cmd = "export my_var='hello world'"
echo_cmd = "echo $my_var"
out, err, env = execute_and_get_env(export_cmd)
out, err, env = execute_and_get_env(echo_cmd, env)
print(out)
I've got a python script that calls ffmpeg via subprocess to do some mp3 manipulations. It works fine in the foreground, but if I run it in the background, it gets as far as the ffmpeg command, which itself gets as far as dumping its config into stderr. At this point, everything stops and the parent task is reported as stopped, without raising an exception anywhere. I've tried a few other simple commands in the place of ffmpeg, they execute normally in foreground or background.
This is the minimal example of the problem:
import subprocess

inf = "3HTOSD.mp3"
outf = "out.mp3"
args = ["ffmpeg",
        "-y",
        "-i", inf,
        "-ss", "0",
        "-t", "20",
        outf]
print "About to do"
result = subprocess.call(args)
print "Done"
I really can't work out why or how a wrapped process can cause the parent to terminate without at least raising an error, and how it only happens in so niche a circumstance. What is going on?
Also, I'm aware that ffmpeg isn't the nicest of packages, but I'm interfacing with something that has ffmpeg compiled into it, so using it again seems sensible.
It might be related to “Linux process in background - Stopped in jobs?”, e.g., using parent.py:
from subprocess import check_call
check_call(["python", "-c", "import sys; sys.stdin.readline()"])
should reproduce the issue: "parent.py script shown as stopped" if you run it in bash as a background job:
$ python parent.py &
[1] 28052
$ jobs
[1]+ Stopped python parent.py
If the parent process is in an orphaned process group, then it is stopped on receiving the SIGTTIN signal (a signal to stop).
The solution is to redirect the input:
import os
from subprocess import check_call, STDOUT

try:
    from subprocess import DEVNULL
except ImportError:  # Python 2
    DEVNULL = open(os.devnull, 'r+b', 0)

check_call(["python", "-c", "import sys; sys.stdin.readline()"], stdin=DEVNULL)
If you don't need to see ffmpeg's stdout/stderr, you could also redirect them to /dev/null:
check_call(ffmpeg_cmd, stdin=DEVNULL, stdout=DEVNULL, stderr=STDOUT)
I like to use the commands module (note: it is Python 2 only; it was removed in Python 3 in favor of subprocess). It's simpler to use, in my opinion.
import commands

cmd = "ffmpeg -y -i %s -ss 0 -t 20 %s 2>&1" % (inf, outf)
status, output = commands.getstatusoutput(cmd)
if status != 0:
    raise Exception(output)
As a side note, sometimes PATH can be an issue, and you might want to use an absolute path to the ffmpeg binary.
matt@goliath:~$ which ffmpeg
/opt/local/bin/ffmpeg
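You can also resolve the absolute path from Python itself; a small sketch of my own, using shutil.which (available since Python 3.3):

import shutil

ffmpeg_path = shutil.which('ffmpeg')
if ffmpeg_path is None:
    raise RuntimeError('ffmpeg not found on PATH')
cmd = "%s -y -i %s -ss 0 -t 20 %s 2>&1" % (ffmpeg_path, inf, outf)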
From the Python subprocess.call documentation:
Wait for command to complete, then return the returncode attribute.
So as long as the process you called does not exit, your program does not go on.
You should set up a Popen process object, put its standard output and error into different buffers/streams, and, when there is an error, terminate the process.
Maybe something like this works:
proc = subprocess.Popen(args, stderr=subprocess.PIPE)  # puts stderr into a new stream

while proc.poll() is None:
    try:
        err = proc.stderr.read()
    except Exception:
        continue
    else:
        if err:
            proc.terminate()
            break
I have the following Python code that hangs:
cmd = ["ssh", "-tt", "-vvv"] + self.common_args
cmd += [self.host]
cmd += ["cat > %s" % (out_path)]
p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate(in_string)
It is supposed to save a string (in_string) into a remote file over ssh.
The file is correctly saved but then the process hangs. If I use
cmd += ["echo"] instead of
cmd += ["cat > %s" % (out_path)]
the process does not hang, so I am pretty sure that I misunderstand something about the way communicate decides that the process has exited.
Do you know how I should write the command so that the "cat > file" does not make communicate hang?
The -tt option allocates a tty, which prevents the child process from exiting when .communicate() closes p.stdin (the EOF is ignored). This works:
import pipes
from subprocess import Popen, PIPE
cmd = ["ssh", self.host, "cat > " + pipes.quote(out_path)] # no '-tt'
p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
stdout, stderr = p.communicate(in_string)
You could use paramiko, a pure-Python ssh library, to write data to a remote file via ssh:
#!/usr/bin/env python
import os
import posixpath
import sys
from contextlib import closing
from paramiko import SSHConfig, SSHClient

hostname, out_path, in_string = sys.argv[1:]  # get from command-line

# load parameters to set up the ssh connection
config = SSHConfig()
with open(os.path.expanduser('~/.ssh/config')) as config_file:
    config.parse(config_file)
d = config.lookup(hostname)

# connect
with closing(SSHClient()) as ssh:
    ssh.load_system_host_keys()
    ssh.connect(d['hostname'], username=d.get('user'))
    with closing(ssh.open_sftp()) as sftp:
        makedirs_exists_ok(sftp, posixpath.dirname(out_path))
        with sftp.open(out_path, 'wb') as remote_file:
            remote_file.write(in_string)
where the makedirs_exists_ok() function mimics os.makedirs():
import posixpath
from functools import partial
from stat import S_ISDIR

def isdir(ftp, path):
    try:
        return S_ISDIR(ftp.stat(path).st_mode)
    except EnvironmentError:
        return None

def makedirs_exists_ok(ftp, path):
    def exists_ok(mkdir, name):
        """Don't raise an error if name is already a directory."""
        try:
            mkdir(name)
        except EnvironmentError:
            if not isdir(ftp, name):
                raise

    # from os.makedirs()
    head, tail = posixpath.split(path)
    if not tail:
        assert path.endswith(posixpath.sep)
        head, tail = posixpath.split(head)
    if head and tail and not isdir(ftp, head):
        exists_ok(partial(makedirs_exists_ok, ftp), head)  # recursive call

    # do create directory
    assert isdir(ftp, head)
    exists_ok(ftp.mkdir, path)
It makes sense that the cat command hangs: it is waiting for an EOF. I tried sending an EOF in the string but couldn't get it to work. Upon researching this question, I found a great module for streamlining the use of SSH for command-line tasks like your cat example. It might not be exactly what you need for your use case, but it does do what your question asks.
Install fabric with
pip install fabric
Inside a file called fabfile.py, put:
from fabric.api import run

def write_file(in_string, path):
    run('echo {} > {}'.format(in_string, path))
And then run this from the command prompt with:
fab -H username@host write_file:in_string=test,path=/path/to/file
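If you would rather drive this from Python instead of the fab command line, a sketch (assuming Fabric 1.x, whose fabric.api the snippet above uses; Fabric 2 changed this API, and username@host is a placeholder):

from fabric.api import env, execute
from fabfile import write_file

env.hosts = ['username@host']  # placeholder host
execute(write_file, in_string='test', path='/path/to/file')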