I am using the Popen constructor from subprocess to capture the output of the command that I am running in my Python script:
import os
from subprocess import Popen, PIPE
p = Popen(["my-cli", "ls", "/mypics/"], stdin=PIPE, stdout=PIPE, stderr=PIPE)
output, err = p.communicate()
print(output)
print(output.count('jpg'))
My objective is to save the output file names as an array of strings.
However, when I print the output, I notice that instead of saving each file name as a string, the script treats each character of each file name as a separate string. The printed output therefore looks like this:
f
i
l
e
.
j
p
g
1
So instead of printing one filename file.jpg I am getting a printout of the 8 separate characters that make up the filename. But running the ls command in the terminal directly will just list the filenames row by row as it should.
What am I doing wrong in this script, and what is the workaround here? I am running Python 2.7. Any suggestions would be appreciated.
What is that my-cli inside your Popen list? It looks like a newline character is being appended after each character of output. Just remove that my-cli and this could work for you:
p = Popen(["ls", "/mypics/"], stdin=PIPE, stdout=PIPE,stderr=PIPE)
I hope this will work for you.
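If the per-character printing comes from iterating over the output string itself, splitting the output on newlines gives you the array of file names you want. A minimal sketch (a Python child process stands in for the real my-cli command, since I can't run it here):

```python
import sys
from subprocess import Popen, PIPE

# Stand-in for "my-cli ls /mypics/": a child process that prints
# one filename per line, just as ls would.
p = Popen([sys.executable, "-c", "print('a.jpg'); print('b.jpg')"],
          stdout=PIPE, stderr=PIPE)
output, err = p.communicate()

# communicate() returns the whole output as one string (bytes on
# Python 3, so decode first; on Python 2.7 output is already a str).
# splitlines() turns it into a list of filename strings.
filenames = output.decode().splitlines()
print(filenames)
print(sum(name.endswith('.jpg') for name in filenames))
```

Iterating over `filenames` then yields one whole file name per loop step instead of one character, and `count`/`endswith` checks work per file name.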
I need to implement a program that takes two files as input (one file with data, and the other with the relationships between those data). The program then finds the shared pairs (a, b) and outputs the result as a new file, using subprocess.check_output().
I don't know if I have to use:
subprocess.check_output()
OR
proc = subprocess.Popen('ls', stdout=subprocess.PIPE, shell=True)
output = proc.stdout.read()
print(output)
Could someone help me with that, perhaps with a link or some documentation?
Thank you in advance
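For a one-shot command where you only need its output, check_output() is the more direct of the two options. A minimal sketch (using a Python child process as a portable stand-in for the real command):

```python
import subprocess
import sys

# check_output() runs a command, waits for it to finish, and returns
# its stdout as bytes in a single call; it raises CalledProcessError
# if the command exits with a non-zero status.
out = subprocess.check_output([sys.executable, "-c", "print('a.txt')"])
print(out.decode())
```

The Popen/stdout.read() form from the question does the same thing in more steps; it is only needed when you want finer control, e.g. streaming output while the process runs.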
I have a binary executable named as "abc" and I have a input file called as "input.txt". I can run these with following bash command:
./abc < input.txt
How can I run this bash command in Python, I tried some ways but I got errors.
Edit:
I also need the store the output of the command.
Edit2:
I solved it this way, thanks for the help:
import subprocess

# input_path is the path of the input.txt file
out = subprocess.Popen(["./abc"], stdin=open(input_path),
                       stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
stdout, stderr = out.communicate()
print(stdout)
Use os.system:
import os
os.system("echo test from shell")
Using subprocess is the best way to invoke system commands and executables. It provides better control than os.system() and is intended to replace it. The python documentation link below provides additional information.
https://docs.python.org/3/library/subprocess.html
Here is a bit of code that uses subprocess to read the output of head, returning the first 100 rows from a text file and processing them row by row. It gives you the output (out) and any errors (err).
import subprocess

mycmd = 'head -100 myfile.txt'
(out, err) = subprocess.Popen(mycmd, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE, shell=True).communicate()
myrows = out.decode("utf-8").split("\n")
for myrow in myrows:
    # do something with myrow
    print(myrow)
This can be done with the os module. The following code works fine:
import os
path = "path of the executable 'abc' and 'input.txt' file"
os.chdir(path)
os.system("./abc < input.txt")
Hope this works :)
I have over 14000 fasta files, and I want to keep only the ones containing 5 sequences. I know I can use the following bash command to obtain the number of sequences in a single fasta file:
grep -c "^>" filename.fasta
So my approach was to write the filename and count of sequences in each file to a text file, which I could then use to isolate only the sequences I want. To run the grep command on so many files, I am using subprocess.call:
import subprocess
import os
with open("five_seqs.txt", "w") as f:
    for file in os.listdir("/Users/vivaksoni1/Downloads/DA_CDS/fasta_files"):
        f.write(file)
        subprocess.call(["grep", "-c", "^>", file], stdout=f)
Part of my problem is that the grep pattern is "^>", but subprocess requires each argument to be quoted individually. How can I use "^>" when I would essentially be entering ""^>"" as an argument?
Also, do I have to add f.write("\n") after f.write(file)? Currently my output is just a text file with each entry next to the other, and the subprocess command just prints each file name to the terminal and states that no file was found, like this:
grep: MZ23900789.fasta: No such file or directory
Try the following code, it should work for your example. It will write the filename plus a tab separator and the number of sequences (i.e. > characters).
Using Popen and communicate gives better flexibility in handling the output. Tested on Ubuntu.
import subprocess
import os

fasta_dir = "/Users/vivaksoni1/Downloads/DA_CDS/fasta_files/"
with open("five_seqs.txt", "w") as f:
    for file in os.listdir(fasta_dir):
        f.write(file + '\t')
        grep = subprocess.Popen(["grep", "-c", "^>", fasta_dir + file],
                                stdout=subprocess.PIPE)
        out, err = grep.communicate()
        # communicate() returns bytes on Python 3, so decode before writing
        f.write(out.decode() + '\n')
I am using a python script to run a process using subprocess.Popen and simultaneously store the output in a text file as well as print it on the console. This is my code:
result = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
# read and store result in log file
for line in result.stdout.readlines():
    openfile.write("%s\n" % line)
    print("%s" % line)
The above code works fine, but it first waits for the process to complete and stores the whole output in the result variable; only after that does the for loop store and print the output.
But I want the output at runtime (my process can take hours to complete, and I don't get any output during all those hours).
So is there any other function that gives me the output dynamically (at runtime), meaning as soon as the process emits its first line, it should get printed?
The problem here is that .readlines() gets the entire output before returning, as it constructs a full list. Just iterate directly:
for line in result.stdout:
    print(line)
.readlines() returns a list of all the lines the process will return while open, i.e., it doesn't return anything until all output from the subprocess is received. To read line by line in "real time":
import sys
from subprocess import Popen, PIPE

proc = Popen(cmd, shell=True, bufsize=1, stdout=PIPE)
for line in proc.stdout:
    openfile.write(line)
    sys.stdout.buffer.write(line)
    sys.stdout.buffer.flush()
proc.stdout.close()
proc.wait()
Note: if the subprocess uses block buffering when run in non-interactive mode, you might need the pexpect or pty modules, or the stdbuf, unbuffer, or script commands.
Note: on Python 2, you might also need to use iter(), to get "real time" output:
for line in iter(proc.stdout.readline, ""):
    openfile.write(line)
    print line,
You can iterate over the lines one by one by using readline on the pipe:
while True:
    line = result.stdout.readline()
    if not line:
        break
    print line.strip()
The lines contain a trailing \n which I stripped for printing.
When the process terminates, readline returns an empty string, so you know when to stop.
I am trying to run the grep command from my Python module using the subprocess library. Since I am doing this operation on a doc file, I am using the catdoc third-party tool to extract the content into a plain text file. I want to store that content in a file. I don't know where I am going wrong, but the program fails to generate the plain text file and consequently to get the grep result. I have gone through the error log, but it is empty. Thanks for all the help.
def search_file(name, keyword):
    # Extract and save the text from the doc file
    catdoc_cmd = ['catdoc', '-w', name, '>', 'testing.txt']
    catdoc_process = subprocess.Popen(catdoc_cmd, stdout=subprocess.PIPE,
                                      stderr=subprocess.PIPE, shell=True)
    output = catdoc_process.communicate()[0]
    grep_cmd = []
    # Search for the keyword in the text file
    grep_cmd.extend(['grep', '%s' % keyword, 'testing.txt'])
    print grep_cmd
    p = subprocess.Popen(grep_cmd, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, shell=True)
    stdoutdata = p.communicate()[0]
    print stdoutdata
On UNIX, specifying shell=True will cause the first argument to be treated as the command to execute, with all subsequent arguments treated as arguments to the shell itself. Thus, the > won't have any effect (since with /bin/sh -c, all arguments after the command are ignored).
Therefore, you should actually use
catdoc_cmd = ['catdoc -w "%s" > testing.txt' % name]
A better solution, though, would probably be to just read the text out of the subprocess' stdout, and process it using re or Python string operations:
catdoc_cmd = ['catdoc', '-w', name]
catdoc_process = subprocess.Popen(catdoc_cmd, stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
for line in catdoc_process.stdout:
    if keyword in line:
        print line.strip()
I think you're trying to pass the > to the shell, but that's not going to work the way you've done it. If you want the spawned process's standard output redirected, that's easy to arrange: open the file you want the output written to and pass it to Popen via the stdout keyword argument, instead of PIPE (which attaches a pipe that you then read with communicate()).
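That redirection approach looks like this; a minimal sketch, using a Python child process as a stand-in for catdoc (which may not be installed here):

```python
import subprocess
import sys

# Stand-in for "catdoc -w name": a child that writes extracted text
# to its stdout.
cmd = [sys.executable, "-c", "print('some document text')"]

# Passing an open file as stdout= redirects the child's output into
# the file directly; no shell and no ">" are involved.
with open("testing.txt", "w") as outfile:
    subprocess.call(cmd, stdout=outfile)

# The file now holds exactly what the child wrote to its stdout.
with open("testing.txt") as f:
    print(f.read())
```

After this, the grep step (or a plain `if keyword in line` check over the file) works on testing.txt as intended.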