I have a script in which I am trying to use subprocess.call to execute a series of shell commands, but some of the commands appear to be omitted when the script runs.
Specifically:
#!/usr/bin/python
import tempfile
import subprocess
import os
import re
grepfd, grepfpath = tempfile.mkstemp(suffix=".xx")
sedfd, sedfpath = tempfile.mkstemp(suffix=".xx")
# grepoutfile = open( grepfpath, 'w')
sedoutfile = open( sedfpath, 'w' )
subprocess.call(['cp','/Users/bobby/Downloads/sample.txt', grepfpath])
sedcmd = [ 'sort',
           grepfpath,
           '|',
           'uniq',
           '|',
           'sed',
           '-e',
           '"s/bigstring of word/ smaller /"',
           '|',
           'column',
           '-t',
           '-s',
           '"=>"' ]
print "sedcmd = ", sedcmd
subprocess.call( ['ls', grepfpath ] )
subprocess.call( ['sort', '|', 'uniq' ], stdin = grepfd )
subprocess.call( sedcmd, stdout = sedoutfile )
And it generates this as output:
python d3.py
sedcmd =  ['sort', '/var/folders/3h/_0xwt5bx0hx8tgx06cmq9h_4f183ql/T/tmp5Gp0ff.xx', '|', 'uniq', '|', 'sed', '-e', '"s/bigstring of word/ smaller /"', '|', 'column', '-t', '-s', '"=>"']
/var/folders/3h/_0xwt5bx0hx8tgx06cmq9h_4f183ql/T/tmp5Gp0ff.xx
sort: open failed: |: No such file or directory
sort: invalid option -- e
Try `sort --help' for more information.
The first error, 'sort: open failed: |: No such file or directory', comes from the first subprocess call: subprocess.call(['sort', '|', 'uniq'], stdin=grepfd).
The 'sort: invalid option -- e' error comes from the second subprocess call (sedcmd).
I have seen a lot of examples that use pipes in this context -- so what am I doing wrong?
Thanks!
This is a class that will run a command with an arbitrary number of pipes:
pipeline.py
import shlex
import subprocess


class Pipeline(object):
    def __init__(self, command):
        self.command = command
        self.command_list = self.command.split('|')
        self.output = None
        self.errors = None
        self.status = None
        self.result = None

    def run(self):
        process_list = list()
        previous_process = None
        for command in self.command_list:
            args = shlex.split(command)
            if previous_process is None:
                process = subprocess.Popen(args, stdout=subprocess.PIPE)
            else:
                process = subprocess.Popen(args,
                                           stdin=previous_process.stdout,
                                           stdout=subprocess.PIPE)
            process_list.append(process)
            previous_process = process
        last_process = process_list[-1]
        self.output, self.errors = last_process.communicate()
        self.status = last_process.returncode
        self.result = (0 == self.status)
        return self.result
This example shows how to use the class:
harness.py
from pipeline import Pipeline

if __name__ == '__main__':
    command = '|'.join([
        "sort %s",
        "uniq",
        "sed -e 's/bigstring of word/ smaller /'",
        "column -t -s '=>'"
    ])
    command = command % 'sample.txt'
    pipeline = Pipeline(command)
    if not pipeline.run():
        print "ERROR: Pipeline failed"
    else:
        print pipeline.output
I created this sample file for testing:
sample.txt
word1>word2=word3
list1>list2=list3
a>bigstring of word=b
blah1>blah2=blah3
Output
a smaller b
blah1 blah2 blah3
list1 list2 list3
word1 word2 word3
So if you want to use shell pipes in a command, you can pass shell=True to subprocess,
like this:
sedcmd = 'sort /var/folders/3h/_0xwt5bx0hx8tgx06cmq9h_4f183ql/T/tmp5Gp0ff.xx | uniq | sed -e "s/bigstring of word/ smaller /" | column -t -s "=>" '
subprocess.call(sedcmd, shell=True)
But be careful with shell=True; using it is strongly discouraged: see the subprocess official documentation.
If you want to use pipes without shell=True, you can use subprocess.PIPE for stdout; here's an example of how to do it: stackoverflow answer
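For instance, here is a rough, untested sketch of the asker's pipeline built from chained Popen objects instead of shell=True. It assumes the grepfpath and sedoutfile variables from the script in the question, and note that in list form the sed pattern and the column separator lose their extra double quotes:
import subprocess

# grepfpath and sedoutfile are the variables defined in the question's script;
# each stage is its own Popen, and each stage's stdout feeds the next stage's stdin
p_sort = subprocess.Popen(['sort', grepfpath], stdout=subprocess.PIPE)
p_uniq = subprocess.Popen(['uniq'], stdin=p_sort.stdout, stdout=subprocess.PIPE)
p_sed = subprocess.Popen(['sed', '-e', 's/bigstring of word/ smaller /'],
                         stdin=p_uniq.stdout, stdout=subprocess.PIPE)
p_col = subprocess.Popen(['column', '-t', '-s', '=>'],
                         stdin=p_sed.stdout, stdout=sedoutfile)

# close the parent's copies of the intermediate pipes so each stage sees EOF/SIGPIPE
p_sort.stdout.close()
p_uniq.stdout.close()
p_sed.stdout.close()
p_col.wait()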
Is there any way I can get the PID by process name in Python?
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3110 meysam 20 0 971m 286m 63m S 14.0 7.9 14:24.50 chrome
For example I need to get 3110 by chrome.
You can get the pid of processes by name using pidof through subprocess.check_output:
from subprocess import check_output
def get_pid(name):
    return check_output(["pidof", name])
In [5]: get_pid("java")
Out[5]: '23366\n'
check_output(["pidof",name]) will run the command as "pidof process_name", If the return code was non-zero it raises a CalledProcessError.
To handle multiple entries and cast to ints:
from subprocess import check_output
def get_pid(name):
    return map(int, check_output(["pidof", name]).split())
In [21]: get_pid("chrome")
Out[21]:
[27698, 27678, 27665, 27649, 27540, 27530, 27517, 14884, 14719, 13849, 13708, 7713, 7310, 7291, 7217, 7208, 7204, 7189, 7180, 7175, 7166, 7151, 7138, 7127, 7117, 7114, 7107, 7095, 7091, 7087, 7083, 7073, 7065, 7056, 7048, 7028, 7011, 6997]
Or pass the -s flag to get a single pid:
def get_pid(name):
    return int(check_output(["pidof", "-s", name]))
In [25]: get_pid("chrome")
Out[25]: 27698
You can use psutil package:
Install
pip install psutil
Usage:
import psutil

process_name = "chrome"
pid = None

for proc in psutil.process_iter():
    if process_name in proc.name():
        pid = proc.pid
        break

print("Pid:", pid)
You can also use pgrep; with pgrep you can also give a pattern to match:
import subprocess
child = subprocess.Popen(['pgrep', 'program_name'], stdout=subprocess.PIPE)
result = child.communicate()[0]
You can also use awk with ps, like this:
ps aux | awk '/name/{print $2}'
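If you want to run that same ps/awk pipeline from Python without shell=True, a rough sketch would be the following (the '/name/' pattern below is a placeholder for the process name you are looking for):
import subprocess

ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE)
awk = subprocess.Popen(['awk', '/name/{print $2}'], stdin=ps.stdout, stdout=subprocess.PIPE)
ps.stdout.close()  # let ps receive SIGPIPE if awk exits first
pids = awk.communicate()[0].split()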
For POSIX systems (Linux, BSD, etc.; only the /proc directory needs to be mounted), it's easier to work with the files in /proc directly.
It's pure Python, with no need to call shell programs.
It works on Python 2 and 3 (the only 2-to-3 difference is the exception tree, hence the bare "except Exception", which I dislike but kept for compatibility; a custom exception could also be created).
#!/usr/bin/env python
import os
import sys
for dirname in os.listdir('/proc'):
    if dirname == 'curproc':
        continue

    try:
        with open('/proc/{}/cmdline'.format(dirname), mode='rb') as fd:
            content = fd.read().decode().split('\x00')
    except Exception:
        continue

    for i in sys.argv[1:]:
        if i in content[0]:
            print('{0:<12} : {1}'.format(dirname, ' '.join(content)))
Sample Output (it works like pgrep):
phoemur ~/python $ ./pgrep.py bash
1487 : -bash
1779 : /bin/bash
Complete example based on the excellent @Hackaholic's answer:
import subprocess


def get_process_id(name):
    """Return process ids found by (partial) name or regex.

    >>> get_process_id('kthreadd')
    [2]
    >>> get_process_id('watchdog')
    [10, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61] # ymmv
    >>> get_process_id('non-existent process')
    []
    """
    child = subprocess.Popen(['pgrep', '-f', name], stdout=subprocess.PIPE, shell=False)
    response = child.communicate()[0]
    return [int(pid) for pid in response.split()]
To improve on Padraic's answer: when check_output returns a non-zero code, it raises a CalledProcessError. This happens when the process does not exist or is not running.
What I would do to catch this exception is:
#!/usr/bin/python
from subprocess import check_output, CalledProcessError


def getPIDs(process):
    try:
        pidlist = map(int, check_output(["pidof", process]).split())
    except CalledProcessError:
        pidlist = []
    print 'list of PIDs = ' + ', '.join(str(e) for e in pidlist)


if __name__ == '__main__':
    getPIDs("chrome")
The output:
$ python pidproc.py
list of PIDS = 31840, 31841, 41942
If you're using Windows, you can get the PID of a process/app from its image name with this code:
from subprocess import Popen, PIPE


def get_pid_of_app(app_image_name):
    final_list = []
    command = Popen(['tasklist', '/FI', f'IMAGENAME eq {app_image_name}', '/fo', 'CSV'],
                    stdout=PIPE, shell=False)
    msg = command.communicate()
    output = str(msg[0])

    if 'INFO' not in output:
        output_list = output.split(app_image_name)
        for i in range(1, len(output_list)):
            j = int(output_list[i].replace("\"", '')[1:].split(',')[0])
            if j not in final_list:
                final_list.append(j)

    return final_list
It will return all the PIDs of an app such as firefox or chrome, e.g.
>>> get_pid_of_app("firefox.exe")
[10908, 4324, 1272, 6936, 1412, 2824, 6388, 1884]
let me know if it helped
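Since /fo CSV makes tasklist emit ordinary CSV (a quoted header row, then one quoted row per process with the PID in the second column), the csv module can also do the parsing. This is just an alternative, untested sketch, not the answer above:
import csv
from subprocess import Popen, PIPE

def get_pids_by_image_name(app_image_name):
    # let the csv module parse tasklist's CSV output instead of splitting strings by hand
    command = Popen(['tasklist', '/FI', f'IMAGENAME eq {app_image_name}', '/fo', 'CSV'],
                    stdout=PIPE, universal_newlines=True)
    output = command.communicate()[0]
    rows = csv.reader(output.splitlines())
    # skip the header row; if nothing matches, tasklist prints an INFO message instead of CSV
    return [int(row[1]) for row in rows if len(row) > 1 and row[1].isdigit()]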
If your OS is Unix-based, use this code:
import os


def check_process(name):
    output = []
    cmd = "ps -aef | grep -i '%s' | grep -v 'grep' | awk '{ print $2 }' > /tmp/out"
    os.system(cmd % name)
    with open('/tmp/out', 'r') as f:
        line = f.readline()
        while line:
            output.append(line.strip())
            line = f.readline()
        if line.strip():
            output.append(line.strip())
    return output
Then call it and pass it a process name to get all PIDs.
>>> check_process('firefox')
['499', '621', '623', '630', '11733']
Since Python 3.5, subprocess.run() is recommended over subprocess.check_output():
>>> int(subprocess.run(["pidof", "-s", "your_process"], stdout=subprocess.PIPE).stdout)
Also, since Python 3.7, you can use the capture_output=True parameter to capture stdout and stderr:
>>> int(subprocess.run(["pidof", "-s", "your_process"], capture_output=True).stdout)
On Unix, you can use pyproc2 package.
Installation
pip install pyproc2
Usage
import pyproc2
chrome_pid=pyproc2.find("chrome").pid #Returns PID of first process with name "chrome"
From a Python script, I need to call a PL->EN translation service. The translation requires 3 steps: tokenization, translation, detokenization.
From Linux, I can achieve this using 3 processes by running the following commands in the order shown:
/home/nlp/opt/moses/scripts/tokenizer/tokenizer.perl -l pl < path_to_input.txt > path_to_output.tok.txt
/home/nlp/opt/moses/bin/moses -f /home/nlp/Downloads/TED/tuning/moses.tuned.ini.1 -drop-unknown -input-file path_to_output.tok.txt -th 8 > path_to_output.trans.txt
/home/nlp/opt/moses/scripts/tokenizer/detokenizer.perl -l en < path_to_output.trans.txt > path_to_output.final.txt
which translates the file path_to_input.txt and outputs to path_to_output.final.txt
I have made the following script for combining the 3 processes:
import shlex
import subprocess
from subprocess import STDOUT, PIPE
import os
import socket


class Translator:
    @staticmethod
    def pl_to_en(input_file, output_file):
        # Tokenize
        print("Tokenization started")
        with open("tokenized.txt", "w+") as tokenizer_output:
            with open(input_file) as tokenizer_input:
                cmd = "/home/nlp/opt/moses/scripts/tokenizer/tokenizer.perl -l pl"
                args = shlex.split(cmd)
                p = subprocess.Popen(args, stdin=tokenizer_input, stdout=tokenizer_output)
                p.wait()
        print("Tokenization finished")

        # Translate
        print("Translation started")
        with open("translated.txt", "w+") as translator_output:
            cmd = "/home/nlp/opt/moses/bin/moses -f /home/nlp/Downloads/TED/tuning/moses.tuned.ini.1 -drop-unknown -input-file tokenized.txt -th 8"
            args = shlex.split(cmd)
            p = subprocess.Popen(args, stdout=translator_output)
            p.wait()
        print("Translation finished")

        # Detokenize
        print("Detokenization started")
        with open("translated.txt") as detokenizer_input:
            with open("detokenized.txt", "w+") as detokenizer_output:
                cmd = "/home/nlp/opt/moses/scripts/tokenizer/detokenizer.perl -l en"
                args = shlex.split(cmd)
                p = subprocess.Popen(args, stdin=detokenizer_input, stdout=detokenizer_output)
                p.wait()
        print("Detokenization finished")


translator = Translator()
translator.pl_to_en("some_input_file.txt", "some_output_file.txt")
But only the tokenization part works.
The translator just outputs an empty file translated.txt. When looking at the output in the terminal, it looks like the translator loads the file tokenized.txt correctly, and does a translation. The problem is just how I collect the output from that process.
I would try something like the following: send the output of the translator process to a pipe, and make the detokenizer read from that pipe instead of from the files.
import shlex
import subprocess
from subprocess import STDOUT, PIPE
import os
import socket


class Translator:
    @staticmethod
    def pl_to_en(input_file, output_file):
        # Tokenize
        print("Tokenization started")
        with open("tokenized.txt", "w+") as tokenizer_output:
            with open(input_file) as tokenizer_input:
                cmd = "/home/nlp/opt/moses/scripts/tokenizer/tokenizer.perl -l pl"
                args = shlex.split(cmd)
                p = subprocess.Popen(args, stdin=tokenizer_input, stdout=tokenizer_output)
                p.wait()
        print("Tokenization finished")

        # Translate
        print("Translation started")
        cmd = "/home/nlp/opt/moses/bin/moses -f /home/nlp/Downloads/TED/tuning/moses.tuned.ini.1 -drop-unknown -input-file tokenized.txt -th 8"
        args = shlex.split(cmd)
        translate_p = subprocess.Popen(args, stdout=subprocess.PIPE)

        # Detokenize, reading directly from the translator's stdout.
        # Note: do not wait() on translate_p before the detokenizer is consuming its
        # output, otherwise the pipe buffer can fill up and both processes block.
        print("Detokenization started")
        with open("detokenized.txt", "w+") as detokenizer_output:
            cmd = "/home/nlp/opt/moses/scripts/tokenizer/detokenizer.perl -l en"
            args = shlex.split(cmd)
            detokenizer_p = subprocess.Popen(args, stdin=translate_p.stdout, stdout=detokenizer_output)
            translate_p.wait()
            print("Translation finished")
            detokenizer_p.wait()
        print("Detokenization finished")


translator = Translator()
translator.pl_to_en("some_input_file.txt", "some_output_file.txt")
I know how to run a command using cmd = subprocess.Popen and then subprocess.communicate.
Most of the time I use a string tokenized with shlex.split as 'argv' argument for Popen.
Example with "ls -l":
import subprocess
import shlex
print subprocess.Popen(shlex.split(r'ls -l'), stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.PIPE).communicate()[0]
However, pipes seem not to work... For instance, the following example returns nothing:
import subprocess
import shlex
print subprocess.Popen(shlex.split(r'ls -l | sed "s/a/b/g"'), stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.PIPE).communicate()[0]
Can you tell me what I am doing wrong please?
Thx
I think you want to instantiate two separate Popen objects here, one for 'ls' and the other for 'sed'. You'll want to pass the first Popen object's stdout attribute as the stdin argument to the 2nd Popen object.
Example:
p1 = subprocess.Popen('ls ...', stdout=subprocess.PIPE)
p2 = subprocess.Popen('sed ...', stdin=p1.stdout, stdout=subprocess.PIPE)
print p2.communicate()
You can keep chaining this way if you have more commands:
p3 = subprocess.Popen('prog', stdin=p2.stdout, ...)
See the subprocess documentation for more info on how to work with subprocesses.
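Applied to the exact command from the question, an untested sketch would look like this (closing p1.stdout in the parent follows the pattern shown in the subprocess docs, so ls can get SIGPIPE if sed exits early):
import subprocess
import shlex

p1 = subprocess.Popen(shlex.split(r'ls -l'), stdout=subprocess.PIPE)
p2 = subprocess.Popen(shlex.split(r'sed "s/a/b/g"'), stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # the parent no longer needs its copy of the read end
print p2.communicate()[0]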
I've made a little function to help with the piping, hope it helps. It will chain Popens as needed.
from subprocess import Popen, PIPE
import shlex


def run(cmd):
    """Runs the given command locally and returns the output, err and exit_code."""
    if "|" in cmd:
        cmd_parts = cmd.split('|')
    else:
        cmd_parts = []
        cmd_parts.append(cmd)
    i = 0
    p = {}
    for cmd_part in cmd_parts:
        cmd_part = cmd_part.strip()
        if i == 0:
            p[i] = Popen(shlex.split(cmd_part), stdin=None, stdout=PIPE, stderr=PIPE)
        else:
            p[i] = Popen(shlex.split(cmd_part), stdin=p[i-1].stdout, stdout=PIPE, stderr=PIPE)
        i = i + 1
    (output, err) = p[i-1].communicate()
    exit_code = p[0].wait()
    return str(output), str(err), exit_code


output, err, exit_code = run("ls -lha /var/log | grep syslog | grep gz")

if exit_code != 0:
    print "Output:"
    print output
    print "Error:"
    print err
    # Handle error here
else:
    # Be happy :D
    print output
shlex only splits the string on whitespace according to shell quoting rules; it does not deal with pipes.
It should, however, work this way:
import subprocess
import shlex
sp_ls = subprocess.Popen(shlex.split(r'ls -l'), stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
sp_sed = subprocess.Popen(shlex.split(r'sed "s/a/b/g"'), stdin = sp_ls.stdout, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
sp_ls.stdin.close() # makes it similar to /dev/null
output = sp_sed.communicate()[0] # which makes you ignore any errors.
print output
According to help(subprocess)'s "Replacing shell pipe line" section:
Replacing shell pipe line
-------------------------
output=`dmesg | grep hda`
==>
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]
HTH
"""
Why don't you use shell
"""
def output_shell(line):
try:
shell_command = Popen(line, stdout=PIPE, stderr=PIPE, shell=True)
except OSError:
return None
except ValueError:
return None
(output, err) = shell_command.communicate()
shell_command.wait()
if shell_command.returncode != 0:
print "Shell command failed to execute"
return None
return str(output)
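A hypothetical usage example, passing the whole pipeline as a single string since shell=True is used:
print output_shell("ls -l | grep '.py'")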
Thanks to @hernvnc, @glglgl, and @Jacques Gaudin for the answers. I fixed the code from @hernvnc. His version will cause hanging in some scenarios.
import shlex
from subprocess import PIPE
from subprocess import Popen


def run(cmd, input=None):
    """Runs the given command locally and returns the output, err and exit_code."""
    if "|" in cmd:
        cmd_parts = cmd.split('|')
    else:
        cmd_parts = []
        cmd_parts.append(cmd)
    i = 0
    p = {}
    for cmd_part in cmd_parts:
        cmd_part = cmd_part.strip()
        if i == 0:
            if input:
                p[i] = Popen(shlex.split(cmd_part), stdin=PIPE, stdout=PIPE, stderr=PIPE)
            else:
                p[i] = Popen(shlex.split(cmd_part), stdin=None, stdout=PIPE, stderr=PIPE)
        else:
            p[i] = Popen(shlex.split(cmd_part), stdin=p[i-1].stdout, stdout=PIPE, stderr=PIPE)
        i = i + 1
    # close the stdin explicitly, otherwise, the following case will hang.
    if input:
        p[0].stdin.write(input)
        p[0].stdin.close()
    (output, err) = p[i-1].communicate()
    exit_code = p[0].wait()
    return str(output), str(err), exit_code


# test case below
inp = b'[ CMServer State ]\n\nnode node_ip instance state\n--------------------------------------------\n1 linux172 10.90.56.172 1 Primary\n2 linux173 10.90.56.173 2 Standby\n3 linux174 10.90.56.174 3 Standby\n\n[ ETCD State ]\n\nnode node_ip instance state\n--------------------------------------------------\n1 linux172 10.90.56.172 7001 StateFollower\n2 linux173 10.90.56.173 7002 StateLeader\n3 linux174 10.90.56.174 7003 StateFollower\n\n[ Cluster State ]\n\ncluster_state : Normal\nredistributing : No\nbalanced : No\ncurrent_az : AZ_ALL\n\n[ Datanode State ]\n\nnode node_ip instance state | node node_ip instance state | node node_ip instance state\n------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n1 linux172 10.90.56.172 6001 P Standby Normal | 2 linux173 10.90.56.173 6002 S Primary Normal | 3 linux174 10.90.56.174 6003 S Standby Normal'
cmd = "grep -E 'Primary' | tail -1 | awk '{print $3}'"
run(cmd, input=inp)
I am trying to capture the output of a tcpdump/grep pipeline from Python. I am using Python 2.6 on Mac OS 10.6.7.
When I try it with dmesg/grep, the caller receives output from the subprocesses, as expected.
When I try it with tcpdump/grep, select never returns anything.
What am I doing wrong?
#! /usr/bin/python

def tcpdump():
    import subprocess, fcntl, os
    # This works
    # cmd1 = ['sudo', 'dmesg']
    # cmd2 = ['grep', '-E', '.*']
    # This doesn't work
    # sudo tcpdump -i en0 -n -s 0 -w - | grep -a -o -E "Host\: .*|GET \/.*"
    cmd1 = ['sudo', 'tcpdump', '-i', 'en0', '-n', '-s', '0', '-w', '-']
    cmd2 = ['grep', '-a', '-o', '-E', 'Host\: .*|GET \/.*']
    p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(cmd2, stdout=subprocess.PIPE, stdin=p1.stdout)
    # set stdout file descriptor to nonblocking
    flags = fcntl.fcntl(p2.stdout.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(p2.stdout.fileno(), fcntl.F_SETFL, (flags | os.O_NDELAY | os.O_NONBLOCK))
    return p2

def poll_tcpdump(proc):
    import select
    txt = None
    while True:
        # wait 1/10 of a second and check whether proc has written anything to stdout
        readReady, _, _ = select.select([proc.stdout.fileno()], [], [], 0.1)
        if not len(readReady):
            break
        for line in iter(proc.stdout.readline, ""):
            if txt is None:
                txt = ''
            txt += line
            break
    return txt

proc = tcpdump()
while True:
    text = poll_tcpdump(proc)
    if text:
        print '>>>> ' + text
Try
cmd2 = ['grep', '--line-buffered', '-a', '-o', '-E', 'Host\: .*|GET \/.*']
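The reason this helps: when grep writes to a pipe rather than a terminal it block-buffers its output, so nothing reaches p2.stdout until several kilobytes have accumulated and select() keeps timing out; --line-buffered makes grep flush after every line. An untested sketch of the pipeline with that change, simply blocking on readline instead of polling:
import subprocess

cmd1 = ['sudo', 'tcpdump', '-i', 'en0', '-n', '-s', '0', '-w', '-']
cmd2 = ['grep', '--line-buffered', '-a', '-o', '-E', 'Host\: .*|GET \/.*']

p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
p2 = subprocess.Popen(cmd2, stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # let tcpdump get SIGPIPE if grep exits

# print each matched line as soon as grep flushes it
for line in iter(p2.stdout.readline, ''):
    print '>>>> ' + line.rstrip()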
I have command like this.
wmctrl -lp | awk '/gedit/ { print $1 }'
And I want its output within python script, i tried this code
>>> import subprocess
>>> proc = subprocess.Popen(["wmctrl -lp", "|","awk '/gedit/ {print $1}"], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>>> proc.stdout.readline()
'0x0160001b -1 6504 beer-laptop x-nautilus-desktop\n'
>>> proc.stdout.readline()
'0x0352f117 0 6963 beer-laptop How to get output from external command combine with Pipe - Stack Overflow - Chromium\n'
>>> proc.stdout.readline()
'0x01400003 -1 6503 beer-laptop Bottom Expanded Edge Panel\n'
>>>
It seems my code is wrong; only wmctrl -lp was executed, and | awk '/gedit/ {print $1}' was omitted.
My expect output would like 0x03800081
$ wmctrl -lp | awk '/gedit/ {print $1}'
0x03800081
Can someone please help?
With shell=True, you should use a single command line instead of an array, otherwise your additional arguments are interpreted as shell arguments. From the subprocess documentation:
On Unix, with shell=True: If args is a string, it specifies the command string to execute through the shell. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional shell arguments.
So your call should be:
subprocess.Popen("wmctrl -lp | sed /gedit/ '{print $1}'", shell=True, ...
I think you may also have an unbalanced single quote in there.
Because you are passing a sequence in for the program, it thinks that the pipe is an argument to wmcrtrl, such as if you did
wmctrl -lp "|"
and thus the actual pipe operation is lost.
Making it a single string should indeed give you the correct result:
>>> import subprocess as s
>>> proc = s.Popen("echo hello | grep e", shell=True, stdout=s.PIPE, stderr=s.PIPE)
>>> proc.stdout.readline()
'hello\n'
>>> proc.stdout.readline()
''
After some research, I have the following code which works very well for me. It basically prints both stdout and stderr in real time. Hope it helps someone else who needs it.
import subprocess
import sys
import threading

stdout_result = 1
stderr_result = 1


def stdout_thread(pipe):
    global stdout_result
    while True:
        out = pipe.stdout.read(1)
        stdout_result = pipe.poll()
        if out == '' and stdout_result is not None:
            break
        if out != '':
            sys.stdout.write(out)
            sys.stdout.flush()


def stderr_thread(pipe):
    global stderr_result
    while True:
        err = pipe.stderr.read(1)
        stderr_result = pipe.poll()
        if err == '' and stderr_result is not None:
            break
        if err != '':
            sys.stdout.write(err)
            sys.stdout.flush()


def exec_command(command, cwd=None):
    if cwd is not None:
        print '[' + ' '.join(command) + '] in ' + cwd
    else:
        print '[' + ' '.join(command) + ']'

    p = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd
    )

    out_thread = threading.Thread(name='stdout_thread', target=stdout_thread, args=(p,))
    err_thread = threading.Thread(name='stderr_thread', target=stderr_thread, args=(p,))

    err_thread.start()
    out_thread.start()

    out_thread.join()
    err_thread.join()

    return stdout_result + stderr_result
When needed, it's easy to collect the output or errors into a string and return them.
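If you would rather return the collected text than echo it, one possible variation, sketched in the same Python 2 style as the code above rather than taken from it, is to have the reader threads append to shared buffers and join them at the end:
import subprocess
import threading

def reader_thread(stream, chunks):
    # read until EOF, keeping everything the child writes on this stream
    while True:
        line = stream.readline()
        if not line:
            break
        chunks.append(line)

def exec_command_capture(command, cwd=None):
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd)
    out_chunks, err_chunks = [], []
    t_out = threading.Thread(target=reader_thread, args=(p.stdout, out_chunks))
    t_err = threading.Thread(target=reader_thread, args=(p.stderr, err_chunks))
    t_out.start()
    t_err.start()
    t_out.join()
    t_err.join()
    p.wait()
    return p.returncode, ''.join(out_chunks), ''.join(err_chunks)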