Shell piping with subprocess in Python

I read every thread I found on StackOverflow on invoking shell commands from Python using subprocess, but I couldn't find an answer that applies to my situation below:
I would like to do the following from Python:
Run shell command command_1 and collect its output in variable result_1.
Shell-pipe result_1 into command_2 and collect the output in result_2. In other words, run command_1 | command_2 using the result I obtained when running command_1 in the previous step.
Do the same for a third command: pipe result_1 into command_3 and collect the result in result_3.
So far I have tried:
p = subprocess.Popen(command_1, stdout=subprocess.PIPE, shell=True)
result_1 = p.stdout.read()
p = subprocess.Popen("echo " + result_1 + ' | ' + command_2,
                     stdout=subprocess.PIPE, shell=True)
result_2 = p.stdout.read()
This doesn't work; the reason seems to be that "echo " + result_1 does not simulate the process of obtaining the output of a command for piping.
Is this at all possible using subprocess? If so, how?

You can do:
pipe = Popen(command_2, shell=True, stdin=PIPE, stdout=PIPE)
result_2 = pipe.communicate(input=result_1)[0]
instead of the line that builds the echo pipeline. communicate() writes result_1 to command_2's stdin, collects its output, and waits for the process to finish.
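
Putting the whole flow together, a minimal sketch (command_1, command_2, and command_3 stand for your actual command strings); passing the data through communicate() also avoids the deadlocks that writing to stdin by hand can cause:

from subprocess import Popen, PIPE

# Step 1: run command_1 and capture its output.
p1 = Popen(command_1, shell=True, stdout=PIPE)
result_1 = p1.communicate()[0]

# Steps 2 and 3: feed result_1 into command_2 and command_3,
# exactly as the shell pipeline command_1 | command_2 would.
p2 = Popen(command_2, shell=True, stdin=PIPE, stdout=PIPE)
result_2 = p2.communicate(input=result_1)[0]

p3 = Popen(command_3, shell=True, stdin=PIPE, stdout=PIPE)
result_3 = p3.communicate(input=result_1)[0]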

Related

Unix Popen.communicate not able to gzip large file

I need to gzip files larger than 10 GB using Python on top of shell commands, and hence decided to use subprocess.Popen.
Here is my code:
outputdir = '/mnt/json/output/'
inp_cmd = 'gzip -r ' + outputdir
pipe = Popen(["bash"], stdout=PIPE, stdin=PIPE, stderr=PIPE)
cmd = bytes(inp_cmd.encode('utf8'))
stdout_data, stderr_data = pipe.communicate(input=cmd)
It is not gzip-ing the files within output directory.
Any way out?
The best way is to use subprocess.call() instead of Popen() with communicate().
call() waits until the command has executed completely, while with Popen() one has to explicitly call wait() (or communicate()) for the execution to finish.
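For instance, a minimal sketch along those lines (the directory is taken from the question):

import subprocess

# call() blocks until gzip finishes and returns its exit status.
status = subprocess.call(['gzip', '-r', '/mnt/json/output/'])
if status != 0:
    print('gzip failed with exit status', status)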
Have you tried it like this:
import subprocess

output_dir = "/mnt/json/output/"
cmd = "gzip -r {}".format(output_dir)
proc = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    stdin=subprocess.PIPE,
    shell=True,
)
out, err = proc.communicate()
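If the directory still isn't compressed, checking the exit status and stderr usually reveals why; a small follow-up sketch (not part of the original answer):

if proc.returncode != 0:
    # err holds gzip's stderr as bytes; decode it for a readable message.
    print('gzip failed:', err.decode('utf8', errors='replace'))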

python subprocess popen synchronous commands

I am trying to use popen to kick off a subprocess that calls two commands (with multiple arguments) one after the other. The second command relies on the first command having run, so I was hoping to use a single subprocess to run both rather than spawning two processes and waiting on the first.
But I am running into issues because I am not sure how to give two command inputs or how to separate the commands as one single object.
Also, I am trying to avoid setting shell to true if possible.
This is essentially, what I am trying to do:
for test in resources:
    command = [
        'pgh',
        'resource',
        'create',
        '--name', test['name'],
        '--description', test['description'],
    ]
    command2 = [
        'pgh',
        'assignment',
        'create',
        '--name', test['name'],
        '--user', test['user'],
    ]
    p = Popen(command, stdout=PIPE, stderr=PIPE)
    stdout, stderr = p.communicate()
    print(stdout)
    print(stderr)
As per my understanding, the following should work for you.
To chain the commands so that the first one's output is piped into the second, use:
p1 = subprocess.Popen(command, stdout=subprocess.PIPE)
p2 = subprocess.Popen(command2, stdin=p1.stdout, stdout=subprocess.PIPE)
print(p2.communicate()[0])
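One refinement, borrowed from the pipe examples later in this thread rather than from this answer: close the parent's copy of p1.stdout once p2 owns it, so that p1 receives SIGPIPE if p2 exits early:

p1 = subprocess.Popen(command, stdout=subprocess.PIPE)
p2 = subprocess.Popen(command2, stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # p2's stdin is now the only handle on p1's stdout
print(p2.communicate()[0])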
Alternatively, if the second command only has to start after the first finishes (rather than consume its output), launch each command and wait for it to complete before launching the next one, repeating this for every command.
This can be done as:
ps = [Popen(c, stdout=PIPE, stderr=PIPE).communicate()
      for c in [command, command2]]
Note that this launches the next command irrespective of whether the first command succeeded or failed. If you want to launch the next command only if the previous command succeeds, then use:
def check_execute(commands):
    return_code = 0
    for c in commands:
        p = Popen(c, stdout=PIPE, stderr=PIPE)
        result = p.communicate()
        yield result
        return_code = p.returncode
        if return_code != 0:
            break
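Because check_execute is a generator, nothing runs until you iterate over it; a usage sketch, with command and command2 as defined in the question:

for stdout, stderr in check_execute([command, command2]):
    print(stdout)
    print(stderr)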

Using subprocess to get output

Using the subprocess module how do I get the following command to work?
isql -v -b -d, DSN_NAME "DOMAIN\username" password <<<
"SELECT column_name, data_type
FROM database_name.information_schema.columns
WHERE table_name = 'some_table';"
This command works perfectly when I run it in a bash shell but I can't get it to work when running from within Python. I'm trying to do this from within Python because I need to be able to modify the query and get different result sets back and then process them in Python. I can't use one of the nice Python database connectors for various reasons which leaves me trying to pipe output from isql.
My code currently looks similar to the following:
bash_command = '''
isql -v -b -d, DSN_NAME "DOMAIN\username" password <<<
"SELECT column_name, data_type
FROM database_name.information_schema.columns
WHERE table_name = 'some_table';"
'''
process = subprocess.Popen(bash_command,
                           shell=True,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
output, error = process.communicate()
However I have tried lots of variations:
Using the entire command as a string, or as a list of strings.
Using check_output vs Popen.
Using communicate() to try and send the query to the isql command or having the query be part of the command string using a heredoc.
Using shell = True or not.
Specifying /bin/bash or using the default /bin/sh.
Lots of different quoting and escaping patterns.
And pretty much every permutation of the above.
In no case do I receive the output of the query that I'm looking for. I'm pretty sure that the command isn't being sent to the shell as is but I can't tell what is being sent to the shell.
I feel like this should be pretty simple, send a command to the shell and get the output back, but I just can't make it work. I can't even see what command is being sent to the shell, even using pdb.
shell=True makes subprocess use /bin/sh by default. <<< "here-string" is a bash-ism; pass executable='/bin/bash':
>>> import subprocess
>>> subprocess.call(u'cat <<< "\u0061"', shell=True)
/bin/sh: 1: Syntax error: redirection unexpected
2
>>> subprocess.call(u'cat <<< "\u0061"', shell=True, executable='/bin/bash')
a
0
You should also use raw-string literals to avoid escaping backslashes: "\\u0061" == r"\u0061" != u"\u0061":
>>> subprocess.call(r'cat <<< "\u0061"', shell=True, executable='/bin/bash')
\u0061
0
Though you don't need shell=True here. You could pass the input as a string using process.communicate(input=input_string):
>>> process = subprocess.Popen(['cat'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
>>> process.communicate(br"\u0061")
('\\u0061', None)
The result could look like:
#!/usr/bin/env python
import shlex
from subprocess import Popen, PIPE
cmd = shlex.split(r'isql -v -b -d, DSN_NAME "DOMAIN\username" password')
process = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)
output, errors = process.communicate(
    b"SELECT column_name, data_type "
    b"FROM database_name.information_schema.columns "
    b"WHERE table_name = 'some_table';")
Try giving this a shot:
import shlex
from subprocess import Popen, PIPE, STDOUT
sql_statement = '''SELECT column_name, data_type
FROM database_name.information_schema.columns
WHERE table_name = 'some_table';'''
isqlcommand = r'isql -v -b -d, DSN_NAME "DOMAIN\username" password'
isqlcommand_args = shlex.split(isqlcommand)
process = Popen(isqlcommand_args, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
output = process.communicate(input=sql_statement)[0]
print(output)
The idea here is to separate the here-string redirection from the isql command execution. This example will pipe the here-string into the stdin of process via process.communicate(). I'm also using shlex.split() to tokenize the command and its arguments.
Edit: removed shell=True after reviewing the comment from J.F. Sebastian.

How to avoid passing shell constructs to executable using Popen

I am trying to call an executable called foo, and pass it some command line arguments. An external script calls into the executable and uses the following command:
./main/foo --config config_file 2>&1 | /usr/bin/tee temp.log
The script uses Popen to execute this command as follows:
from subprocess import Popen
from subprocess import PIPE
def run_command(command, returnObject=False):
    cmd = command.split(' ')
    print('%s' % cmd)
    p = None
    print('command : %s' % command)
    if returnObject:
        p = Popen(cmd)
    else:
        p = Popen(cmd)
        p.communicate()
        print('returncode: %s' % p.returncode)
        return p.returncode
    return p

command = "./main/foo --config config_file 2>&1 | /usr/bin/tee temp.log"
run_command(command)
However, this passes extra arguments ['2>&1', '|', '/usr/bin/tee', 'temp.log'] to the foo executable.
How can I get rid of these extra arguments getting passed to foo while maintaining the functionality?
I have tried shell=True but read about avoiding it for security purposes (shell injection attack). Looking for a neat solution.
Thanks
UPDATE:
- Updated the file following the tee command
The string
./main/foo --config config_file 2>&1 | /usr/bin/tee >temp.log
...is full of shell constructs. These have no meaning to anything without a shell in play. Thus, you have two options:
Set shell=True
Replace them with native Python code.
For instance, 2>&1 is the same thing as passing stderr=subprocess.STDOUT to Popen, and your tee -- since its output is redirected and it's passed no arguments -- could just be replaced with stdout=open('temp.log', 'w').
Thus:
p = subprocess.Popen(['./main/foo', '--config', 'config_file'],
                     stderr=subprocess.STDOUT,
                     stdout=open('temp.log', 'w'))
...or, if you really did want the tee command, but were just using it incorrectly (that is, if you wanted tee temp.log, not tee >temp.log):
p1 = subprocess.Popen(['./main/foo', '--config', 'config_file'],
                      stderr=subprocess.STDOUT,
                      stdout=subprocess.PIPE)
p2 = subprocess.Popen(['tee', 'temp.log'], stdin=p1.stdout)
p1.stdout.close()  # drop our own handle so p2's stdin is the only handle on p1.stdout
stdout, _ = p2.communicate()
Wrapping this in a function, and checking success for both ends might look like:
def run():
    p1 = subprocess.Popen(['./main/foo', '--config', 'config_file'],
                          stderr=subprocess.STDOUT,
                          stdout=subprocess.PIPE)
    p2 = subprocess.Popen(['tee', 'temp.log'], stdin=p1.stdout)
    p1.stdout.close()  # drop our own handle so p2's stdin is the only handle on p1.stdout
    # True if both processes were successful, False otherwise
    return p2.wait() == 0 and p1.wait() == 0
By the way -- if you want to use shell=True and return the exit status of foo, rather than tee, things get a bit more interesting. Consider the following:
p = subprocess.Popen(['bash', '-c', 'set -o pipefail; ' + command_str])
...the pipefail bash extension will force the shell to exit with the status of the first pipeline component to fail (and 0 if no components fail), rather than using only the exit status of the final component.
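A quick illustration of the difference (a hypothetical example, not from the original answer):

import subprocess

# Without pipefail, the pipeline reports tee's status (0) even though false failed.
print(subprocess.call(['bash', '-c', 'false | tee /dev/null']))  # 0
# With pipefail, the failing component's status (1) wins.
print(subprocess.call(['bash', '-c', 'set -o pipefail; false | tee /dev/null']))  # 1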
Here are a couple of "neat" code examples in addition to the explanation in @Charles Duffy's answer.
To run the shell command in Python:
#!/usr/bin/env python
from subprocess import check_call
check_call("./main/foo --config config_file 2>&1 | /usr/bin/tee temp.log",
           shell=True)
without the shell:
#!/usr/bin/env python
from subprocess import Popen, PIPE, STDOUT
tee = Popen(["/usr/bin/tee", "temp.log"], stdin=PIPE)
foo = Popen("./main/foo --config config_file".split(),
            stdout=tee.stdin, stderr=STDOUT)
tee.stdin.close()  # drop the parent's handle so tee sees EOF when foo exits
pipestatus = [foo.wait(), tee.wait()]
Note: don't use "command arg".split() with non-literal strings.
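If the command line isn't a literal, shlex.split() is the safer tokenizer, since it respects quoting that a plain str.split() would break apart (a small illustration; the grep command here is hypothetical):

import shlex

cmd = 'grep -e "two words" file.txt'
print(cmd.split())       # ['grep', '-e', '"two', 'words"', 'file.txt'] -- wrong
print(shlex.split(cmd))  # ['grep', '-e', 'two words', 'file.txt']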
See How do I use subprocess.Popen to connect multiple processes by pipes?
You may combine answers to two StackOverflow questions:
1. piping together several subprocesses (the x | y problem)
2. Merging a Python script's subprocess' stdout and stderr (while keeping them distinguishable) (the 2>&1 problem)

running bash command from python shell

I want to run a bash command from the Python shell.
My bash command is:
grep -Po "(?<=<cite>).*?(?=</cite>)" /tmp/file1.txt | awk -F/ '{print $1}' | awk '!x[$0]++' > /tmp/file2.txt
what I tried is:
#!/usr/bin/python
import commands
commands.getoutput('grep ' + '-Po ' + '\"\(?<=<dev>\).*?\(?=</dev>\)\" ' + '/tmp/file.txt ' + '| ' + 'awk \'!x[$0]++\' ' + '> ' + '/tmp/file2.txt')
But I don't get any result.
Thank you
If you want to avoid splitting your arguments and worrying about pipes, you can use the shell=True option:
cmd = "grep -Po \"(?<=<dev>).*?(?=</dev>)\" /tmp/file.txt | awk -F/ '{print $1}' | awk '!x[$0]++' > file2.txt"
out = subprocess.check_output(cmd, shell=True)
This will run a subshell which understands all your directives, including "|" for piping and ">" for redirection. If you do not do this, these symbols, which are normally parsed by the shell, will just be passed on to the grep program as arguments.
Otherwise, you have to create the pipes yourself. For example (untested code below):
grep_p = subprocess.Popen(["grep", "-Po", "(?<=<dev>).*?(?=</dev>)", "/tmp/file.txt"], stdout=subprocess.PIPE)
awk_p = subprocess.Popen(["awk", "-F/", "{print $1}"], stdin=grep_p.stdout, stdout=subprocess.PIPE)
file2_fh = open("file2.txt", "w")
awk_p_2 = subprocess.Popen(["awk", "!x[$0]++"], stdout=file2_fh, stdin=awk_p.stdout)
awk_p_2.communicate()
file2_fh.close()
However, you're missing the point of python if you are doing this. You should instead look into the re module: re.match, re.sub, re.search, though I'm not familiar enough with awk to translate your commands.
The recommended way to run system commands in Python is to use the subprocess module.
import subprocess
a = ['grep', '-Po', '(?<=<dev>).*?(?=</dev>)', '/tmp/file.txt']
b = ['awk', '-F/', '{print $1}']
c = ['awk', '!x[$0]++']
p1 = subprocess.Popen(a,stdout=subprocess.PIPE)
p2 = subprocess.Popen(b,stdin=p1.stdout,stdout=subprocess.PIPE)
p3 = subprocess.Popen(c,stdin=p2.stdout,stdout=subprocess.PIPE)
p1.stdout.close()
p2.stdout.close()
out,err=p3.communicate()
print(out)
The point of creating pipes between the subprocesses is for security and debugging reasons. It also makes the code much clearer in terms of which process gets its input from, and sends its output to, which other process.
Let us write a simple function to easily deal with these messy pipes for us:
def subprocess_pipes(pipes, last_pipe_out=None):
    import subprocess
    from subprocess import PIPE
    last_p = None
    for cmd in pipes:
        out_pipe = PIPE if not (cmd == pipes[-1] and last_pipe_out) else open(last_pipe_out, "w")
        cmd = cmd if isinstance(cmd, list) else cmd.split(" ")
        in_pipe = last_p.stdout if last_p else None
        p = subprocess.Popen(cmd, stdout=out_pipe, stdin=in_pipe)
        last_p = p
    comm = last_p.communicate()
    return comm
Then we run,
subprocess_pipes(("ps ax", "grep python"), last_pipe_out = "test.out.2")
The result is a "test.out.2" file with the contents of piping "ps ax" into "grep python".
In your case,
a = ["grep", "-Po", "(?<=<cite>).*?(?=</cite>)", "/tmp/file1.txt"]
b = ["awk", "-F/", "{print $1}"]
c = ["awk", "!x[$0]++"]
subprocess_pipes((a, b, c), last_pipe_out = "/tmp/file2.txt")
The commands module is obsolete now.
If you don't actually need the output of your command you can use
import os
exit_status = os.system("your-command")
Otherwise you can use
import subprocess
out, err = subprocess.Popen("your | commands", stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell = True).communicate()
Note: for your command you send stdout to file2.txt, so I wouldn't expect to see anything in out. You will, however, still see error messages on stderr, which will go into err.
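Since the command redirects stdout into file2.txt, read that file afterwards to see the result (a small follow-up sketch):

with open('file2.txt') as f:
    print(f.read())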
You can also use:
import os
os.system(command)
I think what you are looking for is something like:
subprocess.check_output() takes the same arguments as Popen(); use it the same way you would use a Popen command, and it returns the output of the program being called.
For more details here is a link: http://freefilesdl.com/how-to-call-a-shell-command-from-python/
