I'm trying a very simple python script with only positional arguments, handled by docopt.
#!/usr/bin/env python
opt_spec = """Test

Usage: docopt_test (import | export <output_file> <output_format>)
       docopt_test (-h | --help)
       docopt_test (-v | --version)

Options:
  -h --help     Show this screen.
  -v --version  Show version.
"""

from docopt import docopt

if __name__ == '__main__':
    arguments = docopt(opt_spec, version='Test 1.0')
    print(arguments)
When run it will print:
./docopt_test.py export file.xml xml
{'--help': False,
'--version': False,
'<output_file>': 'file.xml',
'<output_format>': 'xml',
'export': True,
'import': False}
The problem is that the output_file and output_format arguments retain the < and > delimiters in the name, making calls like args['output_file'] impossible. Removing the delimiters from the usage string changes the semantics, making the options into keywords.
Is there a way to solve this without having to resort to usage like args['<output_file>']?
I think that's considered part of dispatching, which may have all kinds of language-specific implications. There's an experimental docopt-dispatch project, which seems to handle it nicely with Python decorators. I have a program called snippets where I remap the argument names within my own style of dispatch, but I do so by accessing args just the way you did.
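If all you want is bare key names, you can also normalize the parsed dictionary yourself after the fact. A minimal sketch, reusing opt_spec from the question (the key-stripping convention is mine, not docopt's):

from docopt import docopt

arguments = docopt(opt_spec, version='Test 1.0')
# Strip the <...> around positionals and the leading -- from options
args = {key.strip('<>').lstrip('-'): val for key, val in arguments.items()}
print(args['output_file'])  # instead of arguments['<output_file>']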
Related
I am building a command line tool which should work as follows:
mytool [-h] [-c|--config FILE] [-l|--list] ACTION
positional arguments:
ACTION the action to be performed
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-l, --list show list of configured actions and exit
-c, --config CONFIG use CONFIG instead of the default configuration file
I am using argparse to parse the command line and I am facing what seems to be a limitation of the library. Let me explain through some use-cases.
Use-case 1
$ mytool -h
$ mytool -c path/to/file -h
$ mytool -l -h
$ mytool some-action -h
All the above invocations of mytool shall print the help message and exit, exactly as shown above; importantly, the help must show ACTION as mandatory.
Use-case 2
$ mytool -l
$ mytool -c path/to/file --list
$ mytool --list some-action
$ mytool --list --config path/to/file
All the above invocations must list the configured actions given the content of the configuration files, and exit. The allowed values of ACTION depend on the content of the configuration file, they are not simply hard-coded in the program.
Notice that even if an action is given, it is ignored, because -l|--list has higher precedence, similar to how -h works against other flags and arguments.
Also, please note that solutions such as this one, which implement custom argparse.Action subclasses, won't work, because the action of the listing flag needs the value of the configuration flag, so the parsing must complete (at least partially) before the listing can begin.
Finally, the absence of the otherwise required positional argument ACTION does not cause the parser to abort with an error.
Use-case 3
In the absence of -l|--list the parser works as expected:
$ mytool -c path/to/file # exits with error, positional argument missing
$ mytool some-action # ok
In simple words, I am trying to make the -l|--list flag "disable" the mandatory enforcing of the positional argument ACTION. Moreover I am looking for a solution that allows me to perform the listing (the action of the -l|--list flag) after the parsing has (at least partially) completed, so the value of the -c|--config flag is available.
-h works the way it does because it uses a special action, argparse._HelpAction. (Similarly, -v uses argparse._VersionAction.) This action causes the script to exit while parsing, so no following arguments are processed. (Note that any side effects produced while parsing previous arguments may still occur; the only notion of precedence parse_args has is that arguments are processed from left to right.)
If you want -l to similarly show available actions and exit, you need to define your own custom action. For example,
import argparse

class ListAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        print("Allowable actions are...")
        print("foo")
        print("bar")
        print("baz")
        parser.exit()

p = argparse.ArgumentParser()
# nargs=0 makes -l a pure flag that consumes no value, like -h
p.add_argument("-l", "--list", action=ListAction, nargs=0)
...
args = p.parse_args()
In particular, p.parse_args(["-l", "-h"]) will list actions without displaying the help message, and p.parse_args(["-h", "-l"]) will print the help message but not list actions. Which one gets processed and terminates your script depends solely on which one appears first in the argument list.
As shown in the usage, ACTION will be required, unless the user uses '-h' or '-v', which exit in the middle of parsing.
You could define ACTION as nargs='?', or as a flagged argument, in which case it is optional.
I was going to suggest giving ACTION choices, which will then appear in the help. But if they depend on '--config' value that could be more awkward (though not impossible).
'-l' could have a custom action class that behaves like 'version', but prints the desired list and exits. But then you can't provide an action as well.
Creating arguments that depend on each other in some way is awkward, though not impossible. It is easiest to handle interdependencies after parsing, when all arguments have been read. Keep in mind that argparse tries to accept arguments in any order; '-c' could come before or after '-l', or after the positional ACTION.
In theory, though, your custom list action could check the Namespace for a 'config' value and base its output on that. That only works if you can count on the user providing '-c' before '-l'; argparse won't enforce that for you.
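A minimal sketch of that idea (load_actions is a hypothetical helper that reads the configuration file; it is not part of argparse):

import argparse

class ListAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        # Only sees -c/--config if it appeared earlier on the command line
        config = getattr(namespace, 'config', None)
        for action in load_actions(config):  # hypothetical helper
            print(action)
        parser.exit()

p = argparse.ArgumentParser()
p.add_argument('-c', '--config')
p.add_argument('-l', '--list', action=ListAction, nargs=0)
p.add_argument('action', nargs='?')  # optional, so -l alone doesn't error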
As a general rule it's best to think of argparse as a tool for finding out what the user wants, with limited assistance in policing that, and an ability to automatically format usage and help. Don't expect it to help with complicated interactions and logical mixes of arguments.
This script will print env vars.
Using Python 3.9.
The goal is to be able to run arbitrary subcommands if desired. The error I am getting is that when any additional short flags are added, the parser tries to match them against options like -i/--ignore-environment. I don't want this; additional short flags should go to whatever follows --eval.
parser.py
import argparse, os

def parseargs(p):
    p.usage = '%(prog)s [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]'
    p.add_argument(
        "-i",
        "--ignore-environment",
        action="store_const",
        const=dict(),
        dest="env",
        help="start with an empty environment",
        default=os.environ,
    )
    p.add_argument(
        "--export",
        nargs=1,
        help="Set argument with --export NAME=VALUE"
    )
    p.add_argument(
        "--eval",
        nargs="+",
        help="Run any commands with newly updated environment, "
             "--eval COMMAND ARGS"
    )
    return p
Execution is as follows:
>>> p = argparse.ArgumentParser()
>>> parseargs(p) # assigns arguments to parser
>>> p.parse_args('--export FOO=bar --eval cat test.py'.split()) # This is ok and works correctly. cat is the bash command
Namespace([os.environs..], eval=['cat', 'test.py'], export=['FOO=bar'])
>>> p.parse_args('--export FOO=bar --eval ls -l'.split()) # This fails
error: unrecognized arguments: -l
How do I get "-l" to be overlooked by -i/--ignore-environment but passed to --eval, the way cat test.py is? I have tried using subparsers, but to no avail; the same error occurs.
The problem is that parse_args tries to identify possible options lexically before ever considering the semantics of any actual option.
Since an option taking a variable number of arguments pretty much has to be the last option used anyway, consider making --eval a flag which tells your program how to interpret the remaining positional arguments. Then ls and -l can be set off by --, preventing parse_args from thinking -l is an undefined option.
p.add_argument(
    "--eval",
    action='store_true',
    help="Run any commands with newly updated environment, "
         "--eval -- COMMAND ARGS"
)
# Zero or more, so that you don't have to provide a dummy argument
# when the lack of --eval makes a command unnecessary.
# Wart: you can still use --eval without specifying any commands.
# I don't believe argparse alone is capable of handling this,
# at least not in a way that is simpler than just validating
# arguments after calling parse_args().
p.add_argument('cmd_and_args', nargs='*')
Then your command line could look like
>>> p.parse_args('--export FOO=bar --eval -- ls -l'.split())
or even
>>> p.parse_args('--eval --export FOO=bar -- ls -l'.split())
Later, you'll use the boolean value of args.eval to decide how to treat the list args.cmd_and_args.
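For instance, the post-parse dispatch might look something like this (a sketch that reuses the env attribute set up by -i/--ignore-environment in the question; the validation is manual, as noted above):

import subprocess

args = p.parse_args()
if args.eval:
    if not args.cmd_and_args:
        p.error('--eval requires a command')  # manual validation
    # Run the command under the (possibly emptied) environment
    subprocess.run(args.cmd_and_args, env=args.env)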
Important: One wrinkle with this is that you are attaching these options to arbitrary pre-existing parsers, which may have their own positional arguments defined, so getting this to play nicely with the original parser might be difficult, if not impossible.
The other option is to take a single argument to be parsed internally:
import shlex

p.add_argument("--eval")
...
args = p.parse_args()
cmd_and_args = shlex.split(args.eval)  # or similar
Then
>>> p.parse_args(['--export', 'FOO=bar', '--eval', 'ls -l'])
(Note that using str.split isn't going to work for a command line like --export FOO=bar --eval "ls -l".)
From the Argparse documentation:
If you have positional arguments that must begin with - and don’t look like negative numbers, you can insert the pseudo-argument '--' which tells parse_args() that everything after that is a positional argument [...]
So in your case there are no changes to make to how you add or define the arguments; the string you provide to be parsed should simply have -- preceding the arguments to the eval option, as such:
--export FOO=bar --eval ls -- -l
I have a simple python script that uses docopt to parse command line arguments. It looks like this:
#!/usr/bin/env python
__doc__ = """
Usage: mycopy <src>... <dest>
"""
from docopt import docopt
options = docopt(__doc__)
When I run it:
./mycopy source1/ source2/ destination/
it just prints the usage info, meaning that the command line arguments I passed it were wrong. Is something wrong with the usage spec? Is it even possible to do something like this using docopt?
If you put <dest> before <src>..., it works. Accordingly, run it with ./mycopy destination/ source1/ source2/.
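In other words, move the destination to the front of the pattern:

#!/usr/bin/env python
__doc__ = """
Usage: mycopy <dest> <src>...
"""
from docopt import docopt

options = docopt(__doc__)
# ./mycopy destination/ source1/ source2/ yields:
# {'<dest>': 'destination/', '<src>': ['source1/', 'source2/']}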
I think docopt hasn't implemented support for ARGS... ARG; that case adds some complexity to the implementation. But I agree 'copy src1 src2 ... dest' is the more natural usage, so maybe you could raise a request with the project: https://github.com/docopt/docopt
Consider the following script:
import argparse
parser1 = argparse.ArgumentParser()
parser1.add_argument('-a')
args1 = parser1.parse_args()
parser2 = argparse.ArgumentParser()
parser2.add_argument('-b')
args2 = parser2.parse_args()
I have several questions:
1. Is parse_args a one-time method, or is there a way to clear the arguments before adding new ones? (e.g. something like args1.clear() or parser1.clear())
2. The result of this script is unusable. Although the script accepts the -a argument, it does not accept any value for it, nor does it accept a -b argument. Is there some way to make any of the arguments really work?
3. This is my actual scenario: I have two scripts. Both import the same file which holds initialization code (load config files, create loggers, etc.); let's call it init.py. This init.py file also parses the arguments, only because it needs one value from them. The problem is that I need one of the scripts to accept other arguments as well. Since init.py does something with one argument, I cannot postpone parse_args. How can I make it work?
Edit:
Here is the output of my script:
[prompt]# python2.7 myscript.py -a
usage: a.py [-h] [-a A]
myscript.py: error: argument -a: expected one argument
[prompt]# python2.7 myscript.py -a 1
Namespace(a='1')
usage: a.py [-h] [-b B]
myscript.py: error: unrecognized arguments: -a 1
Your scenario is quite unclear, but I guess what you're looking for is parse_known_args.
Here I assume that init.py is called from the other files, say caller1.py and caller2.py, and that init.py only parses the -a argument, while the original script parses the rest.
You can do something like this:
In init.py, put something like this in the do_things function:
import argparse

def do_things(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('-a')
    # parse_known_args returns a (namespace, leftovers) tuple
    parsed, _ = parser.parse_known_args(argv)
    print 'From init.py: %s' % parsed.a
In caller1.py:
import sys
import argparse
import init

init.do_things(sys.argv)

parser = argparse.ArgumentParser()
parser.add_argument('-b')
parsed, _ = parser.parse_known_args(sys.argv)
print 'From caller1.py: %s' % parsed.b
If you call caller1.py as follows: python caller1.py -a foo -b bar, the result will be:
From init.py: foo
From caller1.py: bar
But if your scenario is not actually like this, I would suggest using @Michael0x2a's answer below, which is to use a single ArgumentParser object in caller1.py and pass the appropriate value into init.py.
This doesn't really make sense, because for all intents and purposes, the parser object is stateless. There's nothing to clear, since all it does is takes in the console arguments, and returns a Namespace object (a pseudo-dict) without ever modifying anything in the process.
Therefore, you can consider parse_args() to be idempotent. You can repeatedly call it over and over, and the same output will occur. By default, it will read the arguments from sys.argv, which is where the console arguments are stored.
However, note that you can pipe in custom arguments by passing a list to the parse_args function, so that the parser will use something other than sys.argv as input.
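For instance:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-a')
# Parses the supplied list instead of reading sys.argv
print(parser.parse_args(['-a', '15']))  # Namespace(a='15')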
I'm not sure what you mean. If you call python myscript.py -a 15, args1 will equal Namespace(a='15'). You can then do args1.a to obtain the value '15'. If you want to make the flag act as a toggle, call parser.add_argument('-a', action='store_true'). Here is a list of all available actions.
I would try and confine all the console/interface code into a single module and into a single parser. Basically, remove the code to parse the command line from init.py and the second file into an independent little section. Once you run the parser, which presents a unified interface for everything in your program, pass in the appropriate variables to functions inside init.py. This has the added advantage of keeping the UI separate and more easily interchangeable with the rest of the code.
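As a rough sketch of that refactoring (the module names come from the question; the reworked do_things signature, receiving a plain value instead of parsing argv itself, is my assumption):

# caller1.py: one parser owns the whole command-line interface
import argparse
import init

parser = argparse.ArgumentParser()
parser.add_argument('-a')
parser.add_argument('-b')
args = parser.parse_args()

init.do_things(args.a)  # init.py no longer calls parse_args itself
# ...use args.b for caller-specific work...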
When I'm writing shell scripts, I often find myself spending most of my time (especially when debugging) dealing with argument processing. Many scripts I write or maintain are easily more than 80% input parsing and sanitization. I compare that to my Python scripts, where argparse handles most of the grunt work for me, and lets me easily construct complex option structures and sanitization / string parsing behavior.
I'd love, therefore, to be able to have Python do this heavy lifting, and then get these simplified and sanitized values in my shell script, without needing to worry any further about the arguments the user specified.
To give a specific example, many of the shell scripts where I work have been defined to accept their arguments in a specific order. You can call start_server.sh --server myserver --port 80 but start_server.sh --port 80 --server myserver fails with You must specify a server to start. - it makes the parsing code a lot simpler, but it's hardly intuitive.
So a first pass solution could be something as simple as having Python take in the arguments, sort them (keeping their parameters next to them) and returning the sorted arguments. So the shell script still does some parsing and sanitization, but the user can input much more arbitrary content than the shell script natively accepts, something like:
# script.sh -o -aR --dir /tmp/test --verbose
#!/bin/bash
args=$(order.py "$@")
# args is set to "-a --dir /tmp/test -o -R --verbose"
# simpler processing now that we can guarantee the order of parameters
There are some obvious limitations here, notably that order.py can't distinguish between a final option with an argument and the start of positional arguments, but that doesn't seem that terrible.
So here's my question: 1) Is there any existing (Python preferably) utility to enable CLI parsing by something more powerful than bash, which can then be accessed by the rest of my bash script after sanitization, or 2) Has anyone done this before? Are there issues or pitfalls or better solutions I'm not aware of? Care to share your implementation?
One (very half-baked) idea:
#!/bin/bash
# Some sort of simple syntax to describe to Python what arguments to accept
opts='
"a", "append", boolean, help="Append to existing file"
"dir", str, help="Directory to run from"
"o", "overwrite", boolean, help="Overwrite duplicates"
"R", "recurse", boolean, help="Recurse into subdirectories"
"v", "verbose", boolean, help="Print additional information"
'
# Takes in CLI arguments and outputs a sanitized structure (JSON?) or fails
p=$(parse.py "Runs complex_function with nice argument parsing" "$opts" "$@")
if [ $? -ne 0 ]; then exit 1; fi # while parse.py prints usage to stderr
# Takes the sanitized structure and an argument to get
append=$(arg.py "$p" append)
overwrite=$(arg.py "$p" overwrite)
recurse=$(arg.py "$p" recurse)
verbose=$(arg.py "$p" verbose)
cd $(python arg.py "$p" dir)
complex_function $append $overwrite $recurse $verbose
Two lines of code, along with concise descriptions of the arguments to expect, and we're on to the actual script behavior. Maybe I'm crazy, but that seems way nicer than what I feel like I have to do now.
I've seen Parsing shell script arguments and things like this wiki page on easy CLI argument parsing, but many of these patterns feel clunky and error prone, and I dislike having to re-implement them every time I write a shell script, especially when Python, Java, etc. have such nice argument processing libraries.
You could potentially take advantage of associative arrays in bash to help obtain your goal.
declare -A opts=($(getopts.py "$@"))
cd ${opts[dir]}
complex_function ${opts[append]} ${opts[overwrite]} ${opts[recurse]} \
${opts[verbose]} ${opts[args]}
To make this work, getopts.py should be a python script that parses and sanitizes your arguments. It should print a string like the following:
[dir]=/tmp
[append]=foo
[overwrite]=bar
[recurse]=baz
[verbose]=fizzbuzz
[args]="a b c d"
You could also set aside values for checking that the options were parsed and sanitized properly.
Returned from getopts.py:
[__error__]=true
Added to bash script:
if ${opts[__error__]}; then
exit 1
fi
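For reference, here is a hypothetical getopts.py along these lines, built on argparse (the option names are just examples matching the earlier snippet, not a fixed interface):

#!/usr/bin/env python
import argparse
import sys

try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2

parser = argparse.ArgumentParser(add_help=False)
parser.add_argument('--dir', default='')
parser.add_argument('--append', default='')
parser.add_argument('--overwrite', default='')
parser.add_argument('--recurse', default='')
parser.add_argument('--verbose', default='')

try:
    # Unknown strings are collected rather than rejected
    opts, extra = parser.parse_known_args()
except SystemExit:
    # e.g. an option was given without its required value
    print('[__error__]=true')
    sys.exit(1)

for key, val in vars(opts).items():
    print('[%s]=%s' % (key, quote(val)))
print('[args]=%s' % quote(' '.join(extra)))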
If you would rather work with the exit code from getopts.py, you could play with eval:
getopts=$(getopts.py "$@") || exit 1
eval declare -A opts=($getopts)
Alternatively:
getopts=$(getopts.py "$@")
if [[ $? -ne 0 ]]; then
exit 1;
fi
eval declare -A opts=($getopts)
Having the very same needs, I ended up writing an optparse-inspired parser for bash (which actually uses python internally); you can find it here:
https://github.com/carlobaldassi/bash_optparse
See the README at the bottom for a quick explanation. You may want to check out a simple example at:
https://github.com/carlobaldassi/bash_optparse/blob/master/doc/example_script_simple
From my experience, it's quite robust (I'm super-paranoid), feature-rich, etc., and I'm using it heavily in my scripts. I hope it may be useful to others. Feedback/contributions welcome.
Edit: I haven't used it (yet), but if I were posting this answer today I would probably recommend https://github.com/docopt/docopts instead of a custom approach like the one described below.
I've put together a short Python script that does most of what I want. I'm not convinced it's production quality yet (notably error handling is lacking), but it's better than nothing. I'd welcome any feedback.
It takes advantage of the set builtin to re-assign the positional arguments, allowing the remainder of the script to still handle them as desired.
bashparse.py
#!/usr/bin/env python
import optparse, sys
from pipes import quote

'''
Uses Python's optparse library to simplify command argument parsing.

Takes in a set of optparse arguments, separated by newlines, followed by
command line arguments, as argv[2] and argv[3:], and outputs a series of
bash commands to populate associated variables.
'''

class _ThrowParser(optparse.OptionParser):
    def error(self, msg):
        """Overrides optparse's default error handling
        and instead raises an exception which will be caught upstream
        """
        raise optparse.OptParseError(msg)

def gen_parser(usage, opts_ls):
    '''Takes a list of strings which can be used as the parameters to
    optparse's add_option function. Returns a parser object able to parse
    those options.
    '''
    parser = _ThrowParser(usage=usage)
    for opts in opts_ls:
        if opts:
            # yes, I know it's evil, but it's easy
            eval('parser.add_option(%s)' % opts)
    return parser

def print_bash(opts, args):
    '''Takes the result of optparse and outputs commands to update a shell'''
    for opt, val in opts.items():
        if val:
            # str() so boolean flags (store_true) don't break quote()
            print('%s=%s' % (opt, quote(str(val))))
    print("set -- %s" % " ".join(quote(a) for a in args))

if __name__ == "__main__":
    if len(sys.argv) < 3:
        sys.stderr.write("Needs at least a usage string and a set of options to parse\n")
        sys.exit(2)
    parser = gen_parser(sys.argv[1], sys.argv[2].split('\n'))
    (opts, args) = parser.parse_args(sys.argv[3:])
    print_bash(opts.__dict__, args)
Example usage:
#!/bin/bash
usage="[-f FILENAME] [-t|--truncate] [ARGS...]"
opts='
"-f"
"-t", "--truncate",action="store_true"
'
echo "$(./bashparse.py "$usage" "$opts" "$#")"
eval "$(./bashparse.py "$usage" "$opts" "$#")"
echo
echo OUTPUT
echo $f
echo $@
echo $0 $2
Which, if run as: ./run.sh one -f 'a_filename.txt' "two' still two" three outputs the following (notice that the internal positional variables are still correct):
f=a_filename.txt
set -- one 'two'"'"' still two' three
OUTPUT
a_filename.txt
one two' still two three
./run.sh two' still two
Disregarding the debugging output, you're looking at approximately four lines to construct a powerful argument parser. Thoughts?
The original premise of my question assumes that delegating to Python is the right approach to simplify argument parsing. If we drop the language requirement we can actually do a decent job* in Bash, using getopts and a little eval magic:
main() {
    local _usage='foo [-a] [-b] [-f val] [-v val] [args ...]'
    eval "$(parse_opts 'f:v:ab')"
    echo "f=$f v=$v a=$a b=$b -- $#: $*"
}

main "$@"
The implementation of parse_opts is in this gist, but the basic approach is to convert options into local variables which can then be handled like normal. All the standard getopts boilerplate is hidden away, and error handling works as expected.
Because it uses local variables within a function, parse_opts is not just useful for command line arguments, it can be used with any function in your script.
* I say "decent job" because Bash's getopts is a fairly limited parser and only supports single-letter options. Elegant, expressive CLIs are still better implemented in other languages like Python. But for reasonably small functions or scripts this provides a nice middle ground without adding too much complexity or bloat.