I am attempting to create a Python script that in turn runs the shell script "js2coffee" to convert some javascript into coffeescript.
From the command line I can run this, and get coffeescript back again...
echo "var myNumber = 100;" | js2coffee
What I need to do is use this same pattern from Python.
In Python, I've come to something like this:
command = "echo '" + myJavascript + "' | js2coffee"
result = os.popen(command).read()
This works sometimes, but there are issues related to special characters (mostly quotes, I think) not being properly escaped in the myJavascript. There has got to be a standard way of doing this. Any ideas? Thanks!
Use the input stream of a process to feed it the data. That way you avoid the shell and don't need to escape your JavaScript. Additionally, you're not vulnerable to shell injection attacks:
pr = subprocess.Popen(['js2coffee'],
                      stdin=subprocess.PIPE,
                      stdout=subprocess.PIPE)
# On Python 3, communicate() expects bytes unless text=True is passed to Popen.
result, stderrdata = pr.communicate(b'var myNumber = 100;')
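On Python 3 the same idea can be written with subprocess.run, which takes the data through its input parameter. This is a minimal sketch, assuming js2coffee is on PATH; the helper name pipe_through is hypothetical and not part of any library.

```python
import subprocess


def pipe_through(argv, text):
    """Run argv without a shell, send text to its stdin, return its stdout."""
    completed = subprocess.run(
        argv,
        input=text,           # written to the child's stdin
        capture_output=True,  # collect stdout and stderr (Python 3.7+)
        text=True,            # exchange str instead of bytes
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout


# Assumes js2coffee is installed and on PATH:
# coffee = pipe_through(["js2coffee"], "var myNumber = 100;")
```

Because the JavaScript never passes through a shell, quotes and dollar signs in it need no escaping.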
The subprocess module is the way to go:
http://docs.python.org/library/subprocess.html#frequently-used-arguments
Please note the following:
args is required for all calls and should be a string, or a sequence of program arguments. Providing a sequence of arguments is generally preferred, as it allows the module to take care of any required escaping and quoting of arguments (e.g. to permit spaces in file names)
For example, I am using ffplay and want to pass this option: -bufsize[:stream_specifier] integer (output,audio,video)
At the moment I have this:
subprocess.call(["ffplay", "-vn", "-nodisp","-bufsize 4096", "%s" % url])
But this says it is invalid.
As JBernardo mentioned in a comment, separate the "-bufsize 4096" argument into two, "-bufsize", "4096". Each argument needs to be separated when subprocess.call is used with shell=False (the default). You can also specify shell=True and give the whole command as a single string, but this is not recommended due to potential security vulnerabilities.
You should not need to use string formatting where you have "%s" % url. If url is a string, pass it directly, otherwise call str(url) to get a string representation.
This is the way to go:
url = 'http://www.whatever.com'
cmd = 'ffplay -vn -nodisp -bufsize 4096 '.split()
subprocess.call(cmd + [str(url)], shell=False)
While using shlex.split() is overkill for your use case, many of the comments seem to be asking about the use of spaces in parameters in cases where a CLI allows you to pass in quoted strings containing spaces (i.e. git commit -m "Commit message here").
Here is a quick python function that can be used to run commands including parameters with spaces:
import shlex, subprocess
def run_command(command):
    subprocess.call(shlex.split(command))
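To see why shlex.split matters for quoted parameters, compare it with a plain str.split on the git example mentioned above. This is only an illustration; nothing is executed.

```python
import shlex

command = 'git commit -m "Commit message here"'

# str.split breaks on every space and keeps the quote characters.
naive = command.split()

# shlex.split follows POSIX shell quoting rules, so the message stays whole.
args = shlex.split(command)
print(args)  # ['git', 'commit', '-m', 'Commit message here']
```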
When trying to run the tasklist command with grep by using subprocess:
command = ("tasklist | grep edpa.exe | gawk \"{ print $2 }\"")
p = subprocess.Popen(command, stdout=subprocess.PIPE)
text = p.communicate(timeout=600)[0]
print(text)
I get this error:
ERROR: Invalid argument/option - '|'.
Type "TASKLIST /?" for usage.
It works fine when I run the command directly from cmd, but when using subprocess something goes wrong.
How can it be fixed? I need to use the output of the command, so I cannot use os.system.
Two options:
Use the shell=True option of the Popen(); this will pass it through the shell, which is the part that interprets things like the |
Just run tasklist in the Popen(), then do the processing in Python rather than invoking grep and awk
Of the two, the latter is probably the better approach in this particular instance, since these grep and awk commands are easily translated into Python.
Your linters may also complain that shell=True is prone to security issues, although this particular usage would be OK.
In the absence of shell=True, subprocess runs a single subprocess. In other words, you are passing | and grep etc as arguments to tasklist.
The simplest fix is to add shell=True; but a much better fix is to do the trivial text processing in Python instead. This also coincidentally gets rid of the useless grep.
# text=True makes check_output return str rather than bytes (Python 3.7+)
for line in subprocess.check_output(['tasklist'], text=True, timeout=600).splitlines():
    if 'edpa.exe' in line:
        text = line.split()[1]
        print(text)
I have assumed you really want to match edpa.exe literally, anywhere in the output line; your regex would match edpa followed by any character followed by exe. The code could be improved by doing the split first and then looking for the search string only in the process name field (if that is indeed your intent).
Perhaps notice also how you generally want to avoid the low-level Popen whenever you can use one of the higher-level functions.
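The "split first, then compare the process name field" idea can be sketched as a small parsing function. The function name find_pids and the sample table are hypothetical, and the sketch assumes the image name contains no spaces, so that the PID is the second whitespace-separated field.

```python
def find_pids(tasklist_output, image_name):
    """Return the PID column of rows whose first column equals image_name."""
    pids = []
    for line in tasklist_output.splitlines():
        fields = line.split()
        # Compare only the process name field, case-insensitively.
        if len(fields) >= 2 and fields[0].lower() == image_name.lower():
            pids.append(fields[1])
    return pids


# Hypothetical sample in the default tasklist table layout.
sample = """\
Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
System Idle Process              0 Services                   0          8 K
edpa.exe                      4321 Services                   0     12,345 K
notepad.exe                   5678 Console                    1      9,876 K
"""
print(find_pids(sample, "edpa.exe"))  # ['4321']
```

On a real system the first argument would come from something like subprocess.check_output(['tasklist'], text=True, timeout=600).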
When using os.system() it's often necessary to escape filenames and other arguments passed as parameters to commands. How can I do this? Preferably something that would work on multiple operating systems/shells but in particular for bash.
I'm currently doing the following, but am sure there must be a library function for this, or at least a more elegant/robust/efficient option:
def sh_escape(s):
    return s.replace("(","\\(").replace(")","\\)").replace(" ","\\ ")

os.system("cat %s | grep something | sort > %s"
          % (sh_escape(in_filename),
             sh_escape(out_filename)))
Edit: I've accepted the simple answer of using quotes, don't know why I didn't think of that; I guess because I came from Windows where ' and " behave a little differently.
Regarding security, I understand the concern, but, in this case, I'm interested in a quick and easy solution which os.system() provides, and the source of the strings is either not user-generated or at least entered by a trusted user (me).
shlex.quote() does what you want since Python 3.3.
(Use pipes.quote to support both Python 2 and Python 3, though note that pipes has been deprecated since 3.11 and was removed in 3.13.)
This is what I use:
def shellquote(s):
    return "'" + s.replace("'", "'\\''") + "'"
The shell will always accept a quoted filename and remove the surrounding quotes before passing it to the program in question. Notably, this avoids problems with filenames that contain spaces or any other kind of nasty shell metacharacter.
Update: If you are using Python 3.3 or later, use shlex.quote instead of rolling your own.
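As an illustration, the pipeline from the question could be built with shlex.quote like this. It is a sketch for POSIX shells only, and the helper name build_command is hypothetical.

```python
import shlex


def build_command(in_filename, out_filename):
    """Build the shell pipeline from the question with both names quoted."""
    return "cat {} | grep something | sort > {}".format(
        shlex.quote(in_filename), shlex.quote(out_filename)
    )


print(build_command("my file (1).txt", "out.txt"))
# cat 'my file (1).txt' | grep something | sort > out.txt
```

Names made only of safe characters are left unquoted; anything else is wrapped in single quotes.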
Perhaps you have a specific reason for using os.system(). But if not you should probably be using the subprocess module. You can specify the pipes directly and avoid using the shell.
The following is from PEP324:
Replacing shell pipe line
-------------------------
output=`dmesg | grep hda`
==>
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
output = p2.communicate()[0]
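The same pattern wrapped in a function, as a minimal sketch. The name pipe_two is hypothetical; closing p1.stdout in the parent follows the pipeline recipe in the current subprocess documentation.

```python
import subprocess


def pipe_two(first_argv, second_argv):
    """Connect first_argv's stdout to second_argv's stdin; return the final output."""
    p1 = subprocess.Popen(first_argv, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second_argv, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()  # lets p1 receive SIGPIPE if p2 exits early
    output = p2.communicate()[0]
    p1.wait()  # reap the first process
    return output


# Equivalent of `dmesg | grep hda` (assumes both tools exist on the system):
# output = pipe_two(["dmesg"], ["grep", "hda"])
```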
Maybe subprocess.list2cmdline is a better shot?
Note that pipes.quote is actually broken in Python 2.5 and Python 3.1 and not safe to use. It doesn't handle zero-length arguments.
>>> from pipes import quote
>>> args = ['arg1', '', 'arg3']
>>> print 'mycommand %s' % (' '.join(quote(arg) for arg in args))
mycommand arg1  arg3
See Python issue 7476; it has been fixed in Python 2.6 and 3.2 and newer.
I believe that os.system just invokes whatever command shell is configured for the user, so I don't think you can do it in a platform independent way. My command shell could be anything from bash, emacs, ruby, or even quake3. Some of these programs aren't expecting the kind of arguments you are passing to them and even if they did there is no guarantee they do their escaping the same way.
Notice: This is an answer for Python 2.7.x.
According to the source, pipes.quote() is a way to "Reliably quote a string as a single argument for /bin/sh". (Although it is deprecated since version 2.7 and finally exposed publicly in Python 3.3 as the shlex.quote() function.)
On the other hand, subprocess.list2cmdline() is a way to "Translate a sequence of arguments into a command line string, using the same rules as the MS C runtime".
Here we are, the platform independent way of quoting strings for command lines.
import sys
mswindows = (sys.platform == "win32")
if mswindows:
    from subprocess import list2cmdline
    quote_args = list2cmdline
else:
    # POSIX
    from pipes import quote
    def quote_args(seq):
        return ' '.join(quote(arg) for arg in seq)
Usage:
# Quote a single argument
print quote_args(['my argument'])
# Quote multiple arguments
my_args = ['This', 'is', 'my arguments']
print quote_args(my_args)
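For Python 3, where pipes.quote is no longer available in recent versions, the same approach might look like the sketch below. It swaps in shlex.quote for pipes.quote; subprocess.list2cmdline is an undocumented helper, so treat its use as an assumption.

```python
import shlex
import subprocess
import sys


def quote_args(seq):
    """Join arguments into one command-line string for the current platform."""
    if sys.platform == "win32":
        # Undocumented helper that follows the MS C runtime quoting rules.
        return subprocess.list2cmdline(seq)
    # POSIX: quote each argument for /bin/sh.
    return " ".join(shlex.quote(arg) for arg in seq)


print(quote_args(["This", "is", "my arguments"]))
```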
The function I use is:
def quote_argument(argument):
    return '"%s"' % (
        argument
        .replace('\\', '\\\\')
        .replace('"', '\\"')
        .replace('$', '\\$')
        .replace('`', '\\`')
    )
that is: I always enclose the argument in double quotes, and then backslash-quote the only characters special inside double quotes.
On UNIX shells like Bash, you can use shlex.quote in Python 3 to escape special characters that the shell might interpret, like whitespace and the * character:
import os
import shlex
os.system("rm " + shlex.quote(filename))
However, this is not enough for security purposes! You still need to be careful that the command argument is not interpreted in unintended ways. For example, what if the filename is actually a path like ../../etc/passwd? Running os.system("rm " + shlex.quote(filename)) might delete /etc/passwd when you only expected it to delete filenames found in the current directory! The issue here isn't with the shell interpreting special characters, it's that the filename argument isn't interpreted by the rm as a simple filename, it's actually interpreted as a path.
Or what if the valid filename starts with a dash, for example, -f? It's not enough to merely pass the escaped filename, you need to disable options using -- or you need to pass a path that doesn't begin with a dash like ./-f. The issue here isn't with the shell interpreting special characters, it's that the rm command interprets the argument as a filename or a path or an option if it begins with a dash.
Here is a safer implementation:
if os.sep in filename:
    raise Exception("Did not expect to find file path separator in file name")
os.system("rm -- " + shlex.quote(filename))
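The same two checks also work without a shell at all, in which case no quoting is needed. This is a sketch rather than a drop-in replacement; the names rm_argv and remove_file are hypothetical.

```python
import os
import subprocess


def rm_argv(filename):
    """Build an argv list that removes a plain file name in the current directory."""
    if os.sep in filename or (os.altsep and os.altsep in filename):
        raise ValueError("Did not expect to find file path separator in file name")
    # "--" ends option parsing, so a name like "-f" is treated as a file.
    return ["rm", "--", filename]


def remove_file(filename):
    """Run rm directly; the list form bypasses the shell entirely."""
    subprocess.run(rm_argv(filename), check=True)
```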
I think these answers are a bad idea for escaping command-line arguments on Windows. Based on the results: people are trying to apply a black-list approach to filtering 'bad' characters, assuming (and hoping) they got them all. Windows is very complex and there could be all manner of characters found in the future that might allow an attacker to hijack command line arguments.
I've already seen some answers neglect to filter basic meta-characters in Windows (like the semi-colon.) The approach I take is far simpler:
Make a list of allowed ASCII characters.
Remove all chars that aren't in that list.
Escape slashes and double-quotes.
Surround entire command with double quotes so the command argument cannot be maliciously broken and commandeered with spaces.
A basic example:
def win_arg_escape(arg, allow_vars=0):
    allowed_list = """'"/\\abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-. """
    if allow_vars:
        allowed_list += "~%$"

    # Filter out anything that isn't a
    # standard character.
    buf = ""
    for ch in arg:
        if ch in allowed_list:
            buf += ch

    # Escape all slashes.
    buf = buf.replace("\\", "\\\\")

    # Escape double quotes.
    buf = buf.replace('"', '""')

    # Surround entire arg with quotes.
    # This avoids spaces breaking a command.
    buf = '"%s"' % (buf)

    return buf
The function has an option to enable use of environment variables and other shell variables. Enabling this poses more risk, so it's disabled by default.
Platform: Windows
Grep: http://gnuwin32.sourceforge.net/packages/grep.htm
Python: 2.7.2
Windows command prompt used to execute the commands.
I am searching for the following pattern "2345$" in a file.
Contents of the file are as follows:
abcd 2345
2345
abcd 2345$
grep "2345$" file.txt
grep returns 2 lines (first and second) successfully.
When I try to run the above command through python I don't see any output.
Python code snippet is as follows:
temp = open('file.txt', "r+")
grep_cmd = []
grep_cmd.extend([grep, '"2345$"' ,temp.name])
print grep_cmd
p = subprocess.Popen(grep_cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
stdoutdata = p.communicate()[0]
print stdoutdata
If I have
grep_cmd.extend([grep, '2345$' ,temp.name])
in my python script, I get the correct answer.
The question is why the grep command with double quotes,
grep_cmd.extend([grep, '"2345$"' ,temp.name])
fails when executed from Python. Isn't Python supposed to execute the command as it is?
Thanks
Gudge.
Do not put double quotes around your pattern. It is only needed on the command line to quote shell metacharacters. When calling a program from python, you do not need this.
You also do not need to open the file yourself - grep will do that:
grep_cmd.extend([grep, '2345$', 'file.txt'])
To understand the reason for the double quotes not being needed and causing your command to fail, you need to understand the purpose of the double quotes and how they are processed.
The shell uses double quotes to prevent special processing of some shell metacharacters. Shell metacharacters are those characters that the shell handles specially and does not pass literally to the programs it executes. The most commonly used shell metacharacter is "space". The shell splits a command on space boundaries to build an argument vector to execute a program with. If you want to include a space in an argument, it must be quoted in some way (single or double quotes, backslash, etc). Another is the dollar sign ($), which is used to signify variable expansion.
When you are executing a program without the shell involved, all these rules about quoting and shell metacharacters are not relevant. In python, you are building the argument vector yourself, so the relevant quoting rules are python quoting rules (e.g. to include a double quote inside a double-quoted string, prefix the double quote with a backslash - the backslash will not be in the final string). The characters in each element of the argument vector when you have completed constructing it are the literal characters that will be passed to the program you are executing.
Grep does not treat double quotes as special characters, so if grep gets double quotes in its search pattern, it will attempt to match double quotes from its input.
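A quick way to see this is to replace grep with a child process that prints the argument it receives. The sketch below uses the running Python interpreter as that stand-in child, which is an assumption made for illustration; it is not grep.

```python
import subprocess
import sys

# A child process that prints its first argument exactly as received.
show_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

plain = subprocess.run(show_arg + ["2345$"], capture_output=True, text=True)
quoted = subprocess.run(show_arg + ['"2345$"'], capture_output=True, text=True)

print(plain.stdout.strip())   # 2345$
print(quoted.stdout.strip())  # "2345$"
```

In the second call the double quotes are part of the argument itself, so a program like grep would search for them literally.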
My original answer's reference to shell=True was incorrect - first I did not notice that you had originally specified shell=True, and secondly I was coming from the perspective of a Unix/Linux implementation, not Windows.
The python subprocess module page has this to say about shell=True and Windows:
On Windows: the Popen class uses CreateProcess() to execute the child program, which operates on strings. If args is a sequence, it will be converted to a string in a manner described in Converting an argument sequence to a string on Windows.
That linked section on converting an argument sequence to a string on Windows does not make sense to me. First, a string is a sequence, and so is a list, yet the Frequently Used Arguments section says this about arguments:
args is required for all calls and should be a string, or a sequence of program arguments. Providing a sequence of arguments is generally preferred, as it allows the module to take care of any required escaping and quoting of arguments (e.g. to permit spaces in file names).
This contradicts the conversion process described in the Python documentation, and given the behaviour you have observed, I'd say the documentation is wrong, and only applies to an argument string, not an argument vector. I cannot verify this myself as I do not have Windows or the source code for Python lying around.
I suspect that if you call subprocess.Popen like:
p = subprocess.Popen(grep + ' "2345$" file.txt', stdout=..., shell=True)
you may find that the double quotes are stripped out as part of the documented argument conversion.
You can use python-textops3 :
from textops import *
print('\n'.join(cat('file.txt') | grep('2345$')))
With python-textops3 you can use Unix-like commands with pipes within Python, so there is no need to fork a process, which is very heavy.
I'm having a problem with subprocess and printing quotes.
My Python script takes user input, mashes it around a bit, and I need it to send its results to a bash script in this manner.
myscript.sh 'var1 == a var2 == b; othervar == c' /path/to/other/files
Where I'm getting hung up on is the single quotes. Python tries to rip them out.
I used this for my test.
subprocess.Popen([myscript.sh 'var=11; ignore all' /path/to/files], shell=True, executable="/bin/bash")
which returns an invalid syntax pointing at the 2nd single quote. I've also tried the above without the brackets and using single quotes outside and double quotes inside, etc.
Other - would-like.
As I was saying above the 'var == a var == b; othervar == c' is derived from the python script (in string format) - and I'll need to call that in the subprocess like this.
subprocess.Popen([myscript.sh myvariables /path/to/files], shell=True, executable="/bin/bash")
I just have to put the single quotes around the value of myvariables like the first example.
Any pointers as to where I'm going off the correct method?
Thank you.
When shell=True is passed to Popen, you pass whatever you would send on the command line. That means your list should only have one element. So for example:
subprocess.Popen(['myscript.sh "var=11; ignore all" /path/to/files'], shell=True, executable="/bin/bash")
Or if /path/to/files is a variable in your Python environment:
subprocess.Popen(['myscript.sh "var=11; ignore all" %s' % path_to_files], shell=True, executable="/bin/bash")
Having said that I STRONGLY encourage you not to use the shell argument. The reason is fragility. You'll get a much more robust way of doing it like this:
subprocess.Popen(["/bin/bash", "myscript.sh", "var=11; ignore all", path_to_files])
Note that "var=11; ignore all" is passed as one argument to your script. If those are separate arguments, make them separate list elements.
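To check what shell command a given argument list corresponds to, shlex.join (Python 3.8+) can render it. This sketch only formats the list for inspection and does not run anything.

```python
import shlex

args = ["myscript.sh", "var=11; ignore all", "/path/to/files"]

# Each list element becomes one shell word; quotes are added only where needed.
print(shlex.join(args))  # myscript.sh 'var=11; ignore all' /path/to/files
```

The single quotes appear only in the shell rendering. The list element itself contains none, which is why none are needed when passing the list to Popen.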
I haven't checked why this works, but it does and without the need for shell=True.
subprocess.Popen(["/bin/bash", "myscript.sh", '""' + string_to_be_quoted + '""', path_to_files])
That's a list, those are strings in it, so they need quotes:
["myscript.sh", "var=11; ignore all", "/path/to/files"]
That should work. If your script really somehow relies on quotes, then try this (I don't know the details of how subprocess works):
["myscript.sh", "'var=11; ignore all'", "/path/to/files"]