I was trying to run a Java program and process its stdout, and found that my Python script waited forever. I then wrote a new test script to exercise subprocess and found that, again, I see no output when running this:
$ cat test.py
#!/usr/bin/env python
import subprocess
c = ['/usr/bin/tail', '-f', '/var/log/dmesg']
proc = subprocess.Popen(c,
                        bufsize=1,
                        shell=False,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT)
for line in proc.stdout:
    print line
Why is subprocess ignoring my bufsize argument? Is there some intermediate buffering I'm failing to take into account? I expect to read the first 10 lines from tail and then wait forever until new lines are appended to the dmesg file. My user does have permissions; running the command in bash gives output.
Changing tail to yes seems to fill some buffers and I can see lots of output.
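You can use iter(proc.stdout.readline, ''):
for line in iter(proc.stdout.readline, ''):
    print line
In Python 2, for line in proc.stdout uses an internal read-ahead buffer, so lines are not handed to the loop as they arrive; readline() returns each line as soon as it is available.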
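A minimal sketch of the readline-based loop, assuming Python 3 and text mode (so the end-of-file sentinel is ''). The helper name stream_lines and the small child program standing in for tail -f are illustrative only, not part of the original question.

```python
import subprocess
import sys


def stream_lines(cmd):
    """Yield each line of the child's output as soon as readline() returns it."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # text mode, so the EOF sentinel is ''
    )
    with proc.stdout:
        for line in iter(proc.stdout.readline, ''):
            yield line.rstrip('\n')
    proc.wait()


# Hypothetical stand-in for `tail -f`: a child that prints three lines and exits.
child = "for i in range(3): print('line', i, flush=True)"
lines = list(stream_lines([sys.executable, "-c", child]))
for line in lines:
    print(line)
```

With a long-running command such as tail -f, the generator would simply keep yielding until the child exits or the loop is abandoned.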
I am having difficulty calling a command line from my script. I run the script but I don't get any result. With this command line I want to run a tool that produces a folder containing the output files for each line. The inputpath is already defined. Can you please help me?
for line in inputFile:
cmd = 'python3 CRISPRcasIdentifier.py -f %s/%s.fasta -o %s/%s.csv -st dna -co %s/'%(inputpath,line.strip(),outputfolder,line.strip(),outputfolder)
os.system(cmd)
You really want to use the Python standard library module subprocess. Using functions from that module, you can construct your command line as a list of strings, and each element is passed as one file name, option or value. This bypasses the shell's escaping and eliminates the need to massage your script arguments before calling.
Besides, your code would not work, because the body block of the for statement is not indented. Python would simply not accept this code (it could be that you pasted it into the question without the proper indentation).
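As a rough sketch of the list-of-strings approach, assuming Python 3.5+ for subprocess.run: the helper names build_command and run_all are made up for illustration, and the placeholder values at the bottom are not from the question. Only the flags already present in the question's command are used.

```python
import subprocess


def build_command(inputpath, name, outputfolder):
    """Return the argument list for one input name (no shell quoting needed)."""
    return [
        "python3", "CRISPRcasIdentifier.py",
        "-f", "%s/%s.fasta" % (inputpath, name),
        "-o", "%s/%s.csv" % (outputfolder, name),
        "-st", "dna",
        "-co", "%s/" % outputfolder,
    ]


def run_all(names, inputpath, outputfolder):
    """Run the tool once per name; check=True raises if the tool exits non-zero."""
    for name in names:
        subprocess.run(build_command(inputpath, name.strip(), outputfolder),
                       check=True)


# Placeholder values, only to show what the list looks like.
cmd = build_command("in dir", "sample1", "out")
print(cmd)
```

Because each value is its own list element, a path containing spaces (as in the placeholder "in dir") reaches the tool as a single argument.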
As mentioned before, executing a command via os.system(command) is not recommended. Please use subprocess instead (see the subprocess module docs in the Python documentation). See the code here:
for command in input_file:
    p = subprocess.Popen(command, stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
    # use this if you want to communicate with child process
    # p = subprocess.Popen(command, stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
    p.communicate()
    # --- do the rest
I usually do it like this for a static command:
from subprocess import check_output
def sh(command):
    return check_output(command, shell=True, universal_newlines=True)
output = sh('echo hello world | sed s/h/H/')
BUT THIS IS NOT SAFE!!! It's vulnerable to shell injection. You should do this instead:
from subprocess import check_output
from shlex import split
def sh(command):
    return check_output(split(command), universal_newlines=True)
output = sh('echo hello world')
The difference is subtle but important. shell=True creates a new shell, so pipes and other shell features work. I use it when I have a long command line with pipes that is static, meaning it does not depend on user input. That variant is vulnerable to shell injection: a user can enter something like "; rm -rf /" and it will run.
The second variant accepts only a single command. It does not spawn a shell but runs the command directly, so pipes and other shell features will not work, and it is safer.
universal_newlines=True returns the output as a string instead of bytes. Use it for text output; if you need binary output, just omit it. The default is False.
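To make the two points above concrete, here is a small sketch, assuming Python 3. The user_input string and the argument-printing child are hypothetical; the child just echoes back the arguments it received, so nothing dangerous is run.

```python
import shlex
import subprocess
import sys

# Hypothetical untrusted input that tries to smuggle in a second command.
user_input = "hello; echo INJECTED"

# The child only prints the arguments it was given.
show_args = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

# No shell is involved, so ';' is ordinary text inside an argument.
as_text = subprocess.check_output(show_args + shlex.split(user_input),
                                  universal_newlines=True)
as_bytes = subprocess.check_output(show_args + shlex.split(user_input))

print(as_text)         # a str, because universal_newlines=True
print(type(as_bytes))  # bytes, because universal_newlines was omitted
```

The semicolon arrives as part of the first argument rather than ending a command, which is why the shell-free variant is the safer default.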
So here is the full example
from subprocess import check_output
from shlex import split
def sh(command):
    return check_output(split(command), universal_newlines=True)
for line in inputFile:
    cmd = 'python3 CRISPRcasIdentifier.py -f %s/%s.fasta -o %s/%s.csv -st dna -co %s/'%(inputpath,line.strip(),outputfolder,line.strip(),outputfolder)
    sh(cmd)
Ps: I didn't test this
As an add-on to my old question: when I run cmd.exe and execute the target program, the output is printed to cmd nicely and it exits with code 0 at the end.
Since the program continuously prints to my cmd.exe stdout, how come I can't mimic that behaviour in Python?
The following code is how I parse lines from my target executable.
res = subprocess.Popen(command, universal_newlines=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, bufsize=1)
with res.stdout:
    for line in iter(res.stdout.readline, b''):
        print line
res.wait()
The Python parsing doesn't even read the same things as cmd.exe does!
It doesn't print the last 5-10 lines (the ones telling me the process is complete).
Do I have to subprocess popen cmd.exe then call the target program? Are there any other alternatives?
I tried to write code that can run Python scripts easily, but when I used the subprocess library like this:
import subprocess
print(subprocess.Popen("py setup.py install", shell = True, stdout = subprocess.PIPE).stdout.read())
print(subprocess.Popen("py setup.py py2exe", shell = True, stdout = subprocess.PIPE).stdout.read())
I saw just this result
b''
please help me please
Most likely the commands you are trying to run are writing to stderr, which your code does not display. You can redirect stderr to stdout if you don't want to handle it separately.
I'll use a different command in the subprocess that is relatively safe. And I will break it up a little instead of having one long line.
import subprocess
p = subprocess.Popen("python filedoesntexist",
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
print(p.stdout.read())
Note that I added the parameter stderr=subprocess.STDOUT, which sends all the error messages to stdout. The subprocess tries to run "python filedoesntexist", and since filedoesntexist is a file that doesn't exist, it prints this message:
b"python: can't open file 'filedoesntexist': [Errno 2] No such file or directory\n"
But you might just want to get the string instead of bytes, and you can add the parameter universal_newlines=True like this:
p = subprocess.Popen("python filedoesntexist",
                     shell=True,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     universal_newlines=True)
print(p.stdout.read())
Now it prints just the string like this:
python: can't open file 'filedoesntexist': [Errno 2] No such file or directory
For additional information, visit the python documentation
Edit
The documentation recommends using run(), which can be done like this (updated after comments from J.F. Sebastian) :
subprocess.run(["python", "filedoesntexist"])
If you need to handle stdout in some way, add parameters described earlier in the Popen examples.
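For instance, a sketch of run() combined with the parameters from the Popen examples, assuming Python 3.5+ and that no file named filedoesntexist is present in the current directory. sys.executable is used here in place of "python" only so that the example does not depend on what is on PATH.

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "filedoesntexist"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,   # merge error messages into stdout
    universal_newlines=True,    # get str instead of bytes
)

print(result.returncode)  # non-zero, because the script could not be opened
print(result.stdout)      # the interpreter's error message
```

run() waits for the command to finish and returns a CompletedProcess, so the exit code and the captured output are both available on one object.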
I'm trying to print stdout in real time for a subprocess, but it looks like stdout is buffered even with bufsize=0 and I can't figure out how to make it work. There is always a delay.
The code I tried :
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     bufsize=0)
line = p.stdout.readline()
while line:
    sys.stdout.write(line)
    sys.stdout.flush()
    # DO OTHER STUFF
    line = p.stdout.readline()
Also tried with for line in iter(p.stdout.readline, b'') instead of the while loop and with read(1) instead of readline(). Always the same result, the output gets delayed by a lot of seconds or minutes and multiple lines appear suddenly at once.
What I think happens :
bufsize is set to 0 (it is 0 by default according to the docs), so the lines piped to p.stdout should be available immediately. But since p.stdout.readline() doesn't return immediately when a new line is piped, it must be buffered somewhere, hence the multiple lines at once when the buffer is finally flushed to p.stdout.
What can I do to make it work ?
Thanks to pobrelkey who found the source of the problem. Indeed, the delay is due to the fact that the child is buffering its write to stdout because it is not writing to a tty. The child uses stdio which is line buffered when writing to a tty, else it is fully buffered.
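A quick way to see what the child sees, as a sketch assuming Python 3: the child below is a hypothetical one-liner that reports whether its own stdout is a terminal. Started through a pipe, it is not, which is the condition under which stdio-based programs typically switch to full buffering.

```python
import subprocess
import sys

# Hypothetical child: report whether its stdout is attached to a terminal.
child = "import sys; print(sys.stdout.isatty())"

out = subprocess.check_output([sys.executable, "-c", child],
                              universal_newlines=True)
print(out.strip())  # prints "False": the child's stdout is a pipe, not a tty
```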
I managed to get it to work by using pexpect instead of subprocess. pexpect uses a pseudo-tty and that's exactly what we need here :
p = pexpect.spawn(cmd,args,timeout=None)
line = p.readline()
while line:
    sys.stdout.write(line)
    sys.stdout.flush()
    # DO OTHER STUFF
    line = p.readline()
Or even better in my case :
p = pexpect.spawn(cmd,args,timeout=None,logfile=sys.stdout)
line = p.readline()
while line:
    # DO OTHER STUFF
    line = p.readline()
No more delay !
More info about pexpect: wiki
I would first make sure the subprocess itself doesn't buffer its output. If the subprocess is in turn a Python program, proceed to the paragraph below to see how to disable output buffering for Python processes.
As for Python, the usual problem is that it block-buffers stdout by default when stdout is not a terminal, so output sits in the child's buffer until the buffer fills, the program exits, or the code calls .flush(). The solution is to pass -u to Python when starting your program.
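A small sketch of the -u flag in action, assuming Python 3 on both sides. The child is a hypothetical script that prints a line without flushing and then waits for the parent; with -u the parent can read that first line while the child is still running. Without -u the readline() below would most likely block, because the line would still be in the child's buffer.

```python
import subprocess
import sys

# Hypothetical child: print, wait for a line on stdin, print again.
child = (
    "import sys\n"
    "print('first')\n"         # no explicit flush
    "sys.stdin.readline()\n"   # wait until the parent answers
    "print('second')\n"
)

proc = subprocess.Popen(
    [sys.executable, "-u", "-c", child],  # -u: unbuffered stdout/stderr
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)

first = proc.stdout.readline()      # arrives while the child is still running
rest, _ = proc.communicate("go\n")  # unblock the child and collect the rest
print(repr(first), repr(rest))
```

Setting the PYTHONUNBUFFERED environment variable for the child is the documented equivalent of -u when the command line cannot be changed.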
Also, in Python 3 you can just use for line in p.stdout instead of the tricky while loop (in Python 2 that iterator reads ahead, so keep readline() there).
P.S. actually I tried running your code (with cmd = ['cat', '/dev/urandom']) and without -u and it outputted everything in real time already; this is on OS X 10.8.
If you just want stdout of your child process to go to your stdout, why not just have the child process inherit stdout from your process?
subprocess.Popen(cmd, stdout=None, stderr=subprocess.STDOUT)
I have been fighting against Popen in Python for a couple of days now, so I decided to put all my doubts here; hopefully all of them can be clarified by Python experts.
Initially I used Popen to execute a command and grep the result (as one command using a pipe, something like xxx | grep yyy) with shell=False. As you can imagine, that doesn't work well. Following the guide in this post, I changed my code to the following:
checkCmd = ["sudo", "pyrit", "-r", self.capFile, "analyze"]
checkExec = Popen(checkCmd, shell=False, stdout=PIPE, stderr=STDOUT)
grepExec = Popen(["grep", "good"], stdin=checkExec.stdout, stdout=PIPE)
output = grepExec.stdout.readline()
output = grepExec.communicate()[0]
But I realized that checkExec runs slowly, and since Popen is non-blocking, grepExec always gets executed before checkExec shows any result, so the grep output is always blank. How can I postpone the execution of grepExec until checkExec is finished?
In another Popen in my program, I try to keep a service running in the background, so I use a separate thread to execute it. When all the tasks are done, I notify this thread to quit, and I explicitly call Popen.kill() to stop the service. However, my system ends up with a zombie process that is not reaped. Is there a nice way to clean up everything in this background thread after it finishes?
What are the differences between Popen.communicate()[0] and Popen.stdout.readline()? Can I use a loop to keep reading output from both of them?
Your example would work if you do it like this:
checkCmd = ["sudo", "pyrit", "-r", self.capFile, "analyze"]
checkExec = Popen(checkCmd, shell=False, stdout=PIPE, stderr=STDOUT)
grepExec = Popen(["grep", "good"], stdin=checkExec.stdout, stdout=PIPE)
for line in grepExec.stdout:
    pass  # do something with line
You use communicate when you want to give some input to a process and read all output on stdout, stderr of the process at the same time. This is probably not what you want for your case. communicate is more for the cases where you want to start an application, feed all the input it needs to it and read its output.
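A minimal sketch of that use case, assuming Python 3: the child is a hypothetical program that reads all of its stdin and writes it back upper-cased. communicate() sends the input, closes the child's stdin, reads everything until end-of-file and waits for the process to exit.

```python
import subprocess
import sys

# Hypothetical child: read all of stdin and write it back upper-cased.
child = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", child],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,
)

out, err = proc.communicate("hello\n")  # err is None: stderr was not piped
print(repr(out))
```

Since communicate() only returns once the child has finished, it suits run-to-completion commands rather than reading output incrementally.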
As other answers have pointed out, you can use shell=True to create the pipeline in your call to subprocess, but an alternative I would prefer is to leverage Python and, instead of setting up a pipeline, do:
checkCmd = ["sudo", "pyrit", "-r", self.capFile, "analyze"]
checkExec = Popen(checkCmd, shell=False, stdout=PIPE, stderr=STDOUT)
for line in checkExec.stdout:
    if 'good' in line:
        pass  # do something with the matched line here
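The same idea as a self-contained sketch, assuming Python 3 and text mode. The producer below is a made-up stand-in for the pyrit command, and the sample lines it prints are invented for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for the pyrit command: print a few lines and exit.
producer = ("print('bad packet'); print('good handshake'); "
            "print('another good one')")

proc = subprocess.Popen(
    [sys.executable, "-c", producer],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    universal_newlines=True,
)

matches = []
with proc.stdout:
    for line in proc.stdout:
        if 'good' in line:          # the filtering grep would have done
            matches.append(line.rstrip('\n'))
proc.wait()                         # reap the child so no zombie is left

print(matches)
```

Calling wait() after the loop also collects the child's exit status, which is what keeps a finished process from lingering as a zombie.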
Use the subprocess module instead of popen; then you can simplify things drastically by passing the complete command line.
http://docs.python.org/library/subprocess.html
e.g.
import subprocess as sub
f = open('/dev/null', 'w')
proc = sub.call("cat file | grep string", executable="/bin/bash", shell=True)
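If a shell is not wanted, the same kind of pipeline can be wired up by hand. This is a sketch assuming Python 3; the producer and consumer are hypothetical Python one-off programs standing in for cat file and grep string, so the example does not depend on those tools being installed.

```python
import subprocess
import sys

# Hypothetical stand-in for `cat file`.
producer = [sys.executable, "-c",
            "print('alpha'); print('good beta'); print('gamma')"]

# Hypothetical stand-in for `grep good`.
consumer = [sys.executable, "-c",
            "import sys\n"
            "for line in sys.stdin:\n"
            "    if 'good' in line:\n"
            "        sys.stdout.write(line)"]

p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
p2 = subprocess.Popen(consumer, stdin=p1.stdout, stdout=subprocess.PIPE,
                      universal_newlines=True)
p1.stdout.close()             # let p1 receive SIGPIPE if p2 exits early
output = p2.communicate()[0]  # read everything the consumer wrote
p1.wait()

print(repr(output))
```

The second process simply blocks on its stdin until the first one produces data, so there is no need to delay starting it.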