Avoid subprocess.Popen auto escaping my backslashes in grep - python

I'm trying to write an svn pre-commit hook in python. Part of this involves checking the diff file to see if there are any actual file changes (as opposed to just property changes).
I have a working grep command which I can execute fine on the shell
grep "^\(Added: \|Modified: \|Deleted: \)" diff filename | grep -v 'svn:'
However when I put it through subprocess.POpen it escapes all my backslashes, which knackers the regexp.
Executing command: ['grep', '"^\\Added: \\|Modified: \\|Deleted: \\)", ...]
How do I avoid this?
NB: I'm aware that I can pipe results between subprocesses and I can do the two greps that way. I need help getting the first one working first though :/
NB2: I also tried using filterdiff --clean instead and couldn't get it to work. Searching for Added, Modified or Deleted lines, removing those with 'svn:' in and checking I had some results seemed to work though.
Python code:
command = ['grep', '"^\(Added: \|Modified: \|Deleted: \)"', filename]
sys.stdout.write('Executing command: %s\n' % (command))
p = subprocess.Popen(command,
stdin = subprocess.PIPE
stdout = subprocess.PIPE
stderr = subprocess.STDOUT
shell = True)
data = p.stdout.read()
if len(data) == 0:
sys.stdout.write("Diff does not contain any file modifications./n")
exit(0)

You need to consider what you want grep to see in its command line arguments.
The first argument needs to be the literal string "^\(Added: \|Modified: \|Deleted: \)", so that means that it shouldn't include the double quotes but should include the backslashes.
The way to express this kind of string is to use Python raw strings:
command = ['grep', r'^\(Added: \|Modified: \|Deleted: \)', filename]
A good way to check what you're actually running is to replace grep by echo so you can at least see what you're passing to the command.

Related

How to separate the parts of this unix command to use in Python subprocess module?

I would like to add a Linux command to my Python script (using the subprocess module). This is the command I would like to add:
awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < input.fa
And this is currently what my code looks like:
out = subprocess.Popen(["awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < input.fa"],
stdout = subprocess.PIPE,
stderr = subprocess.STDOUT)
stdout, stderr = out.communicate()
print(stdout)
print(stderr)
However, it appears that my syntax is wrong somewhere. In some other examples I have seen, it seems like the command needs to be split into sections with apostrophes - however, this command is quite complex, and I do not know where the splitting would be required.
I would appreciate any help in identifying the underlying issue and correcting the syntax of this script.
I have managed to circumvent my problem by using a command which does the same thing but does not include any "\n", which is what seems to have been the cause of my issues. Also, using different quotation marks (i.e. " and ') helps to prevent this causing issues. The code now looks like this and does what I need it to do:
print(subprocess.Popen('perl -ne "chomp;print;" chr11', shell=True, stdout=subprocess.PIPE).stdout.read())

Save output of a command execution with subprocess

I am trying to get output of subprocess.Popen in variable.
It is working fine for pwd command, but not working for pwdx $(pgrep -U $USER -f SimpleHTTPServer) command.
This works:
(Pdb++) p = subprocess.Popen("pwd", stdout=subprocess.PIPE)
(Pdb++) result = p.communicate()[0]
(Pdb++) result
'xyz'
This is not working:
(Pdb++) subprocess.Popen("pwdx $(pgrep -U $USER -f SimpleHTTPServer)", stdout=subprocess.PIPE)
*** OSError: [Errno 2] No such file or directory
Can someone please let me know how can I save the output of it to a variable?
If you want to pass a command with arguments to Popen(), you have to pass it as a list, like so:
subprocess.Popen(['/bin/ls', '-lat'])
If you just pass a single string as in your example, it assumes the entire thing is the command name, and obviously there is no command literally named pwdx $(pgrep -U $USER -f SimpleHTTPServer).
As the docs and previous answers already state the command you want to execute via subprocess.Popen() needs to be passed as a list.
From the docs:
Note in particular that options (such as -input) and arguments (such
as eggs.txt) that are separated by whitespace in the shell go in
separate list elements, while arguments that need quoting or backslash
escaping when used in the shell (such as filenames containing spaces) are single list elements.
Another useful tip you can get from the docs is to use shlex.split() to help you to properly split your command into a list.
Beware though that the use of special shell parameters (e.g. $USER) might not work well with subprocess.Popen() unless you would set the shell=True option, which you shouldn't do without reading the doc's Security Considerations.

Running bash command on server

I am trying to run the bash command pdfcrack in Python on a remote server. This is my code:
bashCommand = "pdfcrack -f pdf123.pdf > myoutput.txt"
import subprocess
process = subprocess.Popen(bashCommand.split(), stdout=subprocess.PIPE)
output, error = process.communicate()
I, however, get the following error message:
Non-option argument myoutput2.txt
Error: file > not found
Can anybody see my mistake?
The first argument to Popen is a list containing the command name and its arguments. > is not an argument to the command, though; it is shell syntax. You could simply pass the entire line to Popen and instruct it to use the shell to execute it:
process = subprocess.Popen(bashCommand, shell=True)
(Note that since you are redirecting the output of the command to a file, though, there is no reason to set its standard output to a pipe, because there will be nothing to read.)
A better solution, though, is to let Python handle the redirection.
process = subprocess.Popen(['pdfcrack', '-f', 'pdf123.pdf'], stdout=subprocess.PIPE)
with open('myoutput.txt', 'w') as fh:
for line in process.stdout:
fh.write(line)
# Do whatever else you want with line
Also, don't use str.split as a replacement for the shell's word splitting. A valid command line like pdfcrack -f "foo bar.pdf" would be split into the incorrect list ['pdfcrack', '-f', '"foo', 'bar.pdf"'], rather than the correct list ['pdfcrack', '-f', 'foo bar.pdf'].
> is interpreted by shell, but not valid otherwise.
So, that would work (don't split, use as-is):
process = subprocess.Popen(bashCommand, shell=True)
(and stdout=subprocess.PIPE isn't useful since all output is redirected to the output file)
But it could be better with native python for redirection to output file and passing arguments as list (handles quote protection if needed)
with open("myoutput.txt","w") as f:
process = subprocess.Popen(["pdfcrack","-f","pdf123.pdf"], stdout=subprocess.PIPE)
f.write(process.read())
process.wait()
Your mistake is > in command.
It doesn't treat this as redirection to file because normally bash does it and now you run it without using bash.
Try with shell=True if you whan to use bash. And then you don't have to split command into list.
subprocess.Popen("pdfcrack -f pdf123.pdf > myoutput.txt", shell=True)

Python subprocess library: Running grep command from Python

I am trying to run grep command from my Python module using the subprocess library. Since, I am doing this operation on the doc file, I am using Catdoc third party library to get the content in a plan text file. I want to store the content in a file. I don't know where I am going wrong but the program fails to generate a plain text file and eventually to get the grep result. I have gone through the error log but its empty. Thanks for all the help.
def search_file(name, keyword):
#Extract and save the text from doc file
catdoc_cmd = ['catdoc', '-w' , name, '>', 'testing.txt']
catdoc_process = subprocess.Popen(catdoc_cmd, stdout=subprocess.PIPE,stderr=subprocess.PIPE, shell=True)
output = catdoc_process.communicate()[0]
grep_cmd = []
#Search the keyword through the text file
grep_cmd.extend(['grep', '%s' %keyword , 'testing.txt'])
print grep_cmd
p = subprocess.Popen(grep_cmd,stdout=subprocess.PIPE,stderr=subprocess.PIPE, shell=True)
stdoutdata = p.communicate()[0]
print stdoutdata
On UNIX, specifying shell=True will cause the first argument to be treated as the command to execute, with all subsequent arguments treated as arguments to the shell itself. Thus, the > won't have any effect (since with /bin/sh -c, all arguments after the command are ignored).
Therefore, you should actually use
catdoc_cmd = ['catdoc -w "%s" > testing.txt' % name]
A better solution, though, would probably be to just read the text out of the subprocess' stdout, and process it using re or Python string operations:
catdoc_cmd = ['catdoc', '-w' , name]
catdoc_process = subprocess.Popen(catdoc_cmd, stdout=subprocess.PIPE,stderr=subprocess.PIPE)
for line in catdoc_process.stdout:
if keyword in line:
print line.strip()
I think you're trying to pass the > to the shell, but that's not going to work the way you've done it. If you want to spawn a process, you should arrange for its standard out to be redirected. Fortunately, that's really easy to do; all you have to do is open the file you want the output to go to for writing and pass it to popen using the stdout keyword argument, instead of PIPE, which causes it to be attached to a pipe which you can read with communicate().

When I write a python script to run Devenv with configure "Debug|Win32" it does nothing

Update: When I use the subprocess.call instead of subprocess.Popen, the problem is solved - does anybody know what's the cause? And there came another problem: I can't seem to find a way to control the output... Is there a way to redirect the output from subprocess.call to a string or something like that? Thanks!
I'm trying to use Devenv to build projects, and it runs just fine when i type it in command prompt like devenv A.sln /build "Debug|Win32" - but when I use a python to run it using Popen(cmd,shell=true) where cmd is the same line as above, it shows nothing. If I remove the |, change it to "Debug" only, it works....
Does anybody know why this happens? I've tried putting a \ before |, but still nothing happened..
This is the code I am using:
from subprocess import Popen, PIPE
cmd = ' "C:\\Program Files\\Microsoft Visual Studio 8\\Common7\\IDE\\devenv" solution.sln /build "Debug|Win32" '
sys.stdout.flush()
p = Popen(cmd,shell=True,stdout=PIPE,stderr=PIPE)
lines = []
for line in p.stdout.readlines():
lines.append(line)
out = string.join(lines)
print out
if out.strip():
print out.strip('\n')
sys.stdout.flush()
...which doesn't work, however, if I swap Debug|Win32 with Debug, it works perfectly..
Thanks for every comment here
There is a difference between devenv.exe and devenv.com, both of which are executable and live in the same directory (sigh). The command lines used in the question and some answers don't say which they want so I'm not sure which will get used.
If you want to call from the command line then you need to ensure you use devenv.com, otherwise you're likely to get a GUI popping up. I think this might be the cause of some (but not all) of the confusion.
See section 17.1.5.1. in the python documentation.
On Windows, Python automatically adds the double quotes around the project configuration argument i.e Debug|win32 is passed as "Debug|win32" to devenv. You DON'T need to add the double quotes and you DON'T need to pass shell=True to Popen.
Use ProcMon to view the argument string passed to devenv.
When shell = False is used, it will treat the string as a single command, so you need to pass the command/arugments as a list.. Something like:
from subprocess import Popen, PIPE
cmd = [
r"C:\Program Files\Microsoft Visual Studio 8\Common7\IDE\devenv", # in raw r"blah" string, you don't need to escape backslashes
"solution.sln",
"/build",
"Debug|Win32"
]
p = Popen(cmd, stdout=PIPE, stderr=PIPE)
out = p.stdout.read() # reads full output into string, including line breaks
print out
try double quoting like: 'devenv A.sln /build "Debug|Win32"'
Looks like Windows' shell is taking that | as a pipe (despite the quotes and escapes). Have you tried shell=False instead?

Categories