Parsing limited switches with python argparse - python

Is there a way to parse only limited number of switches in a function using argparse? Say, my command is:
python sample.py -t abc -r dfg -h klm -n -p qui
And I want argparse to parse from -t to -h and leave the remaining, also show help for these only.
Next I want to parse any switch after -h into another function and see corresponding help there.
Is this behavior possible in argparse? Also is there a way I can modify sys.arg it is using internally?
Thanks.

python sample.py -t abc -r dfg -h klm -n -p qui
And I want argparse to parse from -t to -h and leave the remaining, also show help for these only. Next I want to parse any switch after -h into another function and see corresponding help there.
There are some issues with your specification:
Is -h the regular help? If so it has priority, producing the help without parsing the other arguments. The string after -h suggests you are treating it like a normal user define argument, which would then require initiating the parser with help turned off. But then how would you ask for help?
What sets the break between the two parsings/help? The number of arguments, the -h flag (regardless of order), or the id of the flags. Remember argparse accepts flagged arguments in any order.
You could define one parser that knows about -t and -r, and another that handles -n and -p. Calling each with parse_known_args lets it operate without raising a unknown argument error.
You can also modify the sys.argv. parse_args (and the known variant), takes an optional argv argument. If that is none, then it uses sys.argv[1:]. So you could either modify sys.argv itself (deleting items), or you could pass a subset of sys.argv to the parser.
parser1.parse_known_args(sys.argv[1:5])
parser2.parse_known_args(['-n','one','-o','two'])
parser3.parse_args(sys.argv[3:])
Play with those ideas, and come back to us if there are further questions.

You can always modify sys.args and put anything you wish there.
As for your main question, you can have two parsers. One of them will have arguments -t to -h, the second -n and -p. Then you can use argparse's parse_known_args() method on each parser, which will parse only the arguments defined for each of them.

Related

Is there a way to disable positional arguments given that a flag is true with Python argparse?

I am building a command line tool which should work as follows:
mytool [-h] [-c|--config FILE] [-l|--list] ACTION
positional arguments:
ACTION the action to be performed
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-l, --list show list of configured actions and exit
-c, --config CONFIG use CONFIG instead of the default configuration file
I am using argparse to parse the command line and I am facing what seems to be a limitation of the library. Let me explain through some use-cases.
Use-case 1
$ mytool -h
$ mytool -c path/to/file -h
$ mytool -l -h
$ mytool some-action -h
All the above invocations of mytool shall print the help message and exit, exactly as it is shown above, more importantly showing ACTIONS to be mandatory.
Use-case 2
$ mytool -l
$ mytool -c path/to/file --list
$ mytool --list some-action
$ mytool --list --config path/to/file
All the above invocations must list the configured actions given the content of the configuration files, and exit. The allowed values of ACTION depend on the content of the configuration file, they are not simply hard-coded in the program.
Notice that even if an action is given, it is ignored because -l|--list has a higher precendance, similar to how -h works against other flags and arguments.
Also, please note that solutions such as this, which implement custom argparse.Action sub-classes won't work because the action of the listing flag needs the value of the configuration flag, so the parsing must complete (at least partially) for the listing to begin.
Finally, the absence of the otherwise required positional argument ACTION does not cause the parser to abort with an error.
Use-case 3
In the absence of -l|--list the parser works as expected:
$ mytool -c path/to/file # exits with error, positional argument missing
$ mytool some-action # ok
In simple words, I am trying to make the -l|--list flag "disable" the mandatory enforcing of the positional argument ACTION. Moreover I am looking for a solution that allows me to perform the listing (the action of the -l|--list flag) after the parsing has (at least partially) completed, so the value of the -c|--config flag is available.
-h works the way it does because it uses a special action, argparse._HelpAction. (Simiarly, -v uses argparse._VersionAction.) This action causes the script to exit while parsing, so no folllowing arguments are processed. (Note that any side effects produced while parsing previous arguments may still occur; the only notion of precedence parse_args has is that arguments are processed from left to right.)
If you want -l to similarly show available actions and exit, you need to define your own custom action. For example,
class ListAction(argparse.Action):
def __call__(self, parser, namespace, values, option_string):
print("Allowable actions are...")
print("foo")
print("bar")
print("baz")
parser.exit()
p = argparse.ArgumentParser()
p.add_argument("-l", "--list", action=ListAction)
...
args = p.parse_args()
In particular, p.parse_args(["-l", "-h"]) will list actions without displaying the help message, and p.parse_args(["-h", "-l"]) will print the help message but not list actions. Which one gets processed and terminates your script depends solely on which one appears first in the argument list.
As shown in the usage, ACTION will be required, unless the user uses '-h' or '-v', which exit in the middle of parsing.
You could define ACTION as nargs='?', or as a flagged argument, in which case it is optional.
I was going to suggest giving ACTION choices, which will then appear in the help. But if they depend on '--config' value that could be more awkward (though not impossible).
'-l' could have a custom action class that behaves like 'version', but prints the desired list and exits. But then you can't provide an action as well.
Creating arguments that depend on each other in some way is awkward, though not impossible. It easiest to handle interdependencies after parsing, when all arguments have been read. Keep in mind that argparse tries to accept arguments in any order, '-c' could be before, or after '-l', or Action.
In theory though your custom list action, could check the Namespace for a 'config' value, and base its action on that value. That would work only if you can count on the user providing '-c' before '-l'; argparse won't enforce that for you.
As a general rule it's best to think of argparse as a tool for finding out what the user wants, with limited assistance in policing that, and an ability to automatically format usage and help. Don't expect it to help with complicated interactions and logical mixes of arguments.

Argparse: Ignore dashes in unknown arguments or collect values (potentially starting with dashes) until the next known command

I have a Python script that will later call multiple Bash scripts with supprocess.run. When calling the Python script, the user should be able to specify lists of arguments (some of which might start with hyphens) for the Bash scripts like
python script.py \
--bash-args1 --param1 val1 --param2 val2 \
--bash-args2 bla --param3 val3 --blu
Argparse should parse this into Namespace(bash_args1=['--param1', 'val1', '--param2', 'val2'], bash_args2=['bla', '--param3', 'val3', '--blu']). Is there a canonical way of achieving this? I cannot use nargs=argparse.REMAINDER or parser.parse_known_args because I need to collect the arguments for more than one Bash script and a simple nargs='+' will fail if the secondary arguments start with dashes.
I guess I would need one of the following:
Either something similar to REMAINDER that causes argparse to collect all strings up to the next known argument
an option that tells argparse to ignore dashes in unknown arguments when using nargs='+'.
For posterity:
There are a few ways of working around this limitation of argparse. I wrapped one up and published it on PyPI. You can use it just like argparse. The only difference is that there is an extra option you can supply as nargs parameter in add_argument. When this option is used, the parser collects all unknown arguments (regardless of whether they start with a hyphen or not) until the next known argument. For more info check out the repo on github.

argparse add subparser that starts with a dash

I'm trying to create a Pacman wrapper in Python. I'm having trouble to parse the arguments in the same way Pacman does. (Described at https://man.archlinux.org/man/pacman.8)
In order to parse the arguments, I need to create a sub parser that starts with a dash. E.g. Pacman allows us to do:
$ pacman --database --asdeps which
Here --asdeps is specific to the --database operation. The following is invalid, if I use the --files operation instead of --database:
$ pacman --files --asdeps
error: invalid option '--asdeps'
Python has the feature request Issue 34046: subparsers -> add_parser doesn't support hyphen char '-' - Python tracker, which was rejected.
Is there some way I can do this with argparse? Or is there some other more flexible argument parsing library, short of parsing the arguments manually?
Update I'm guessing in this case, that parsing arguments manually isn't too bad, as the manual says the operation must be the first argument. This makes sense for Pacman and for argparse. Otherwise argparse can't easily know if a flag is the name of a sub command, or an argument of that sub command.

Using Python to parse complex arguments to shell script

When I'm writing shell scripts, I often find myself spending most of my time (especially when debugging) dealing with argument processing. Many scripts I write or maintain are easily more than 80% input parsing and sanitization. I compare that to my Python scripts, where argparse handles most of the grunt work for me, and lets me easily construct complex option structures and sanitization / string parsing behavior.
I'd love, therefore, to be able to have Python do this heavy lifting, and then get these simplified and sanitized values in my shell script, without needing to worry any further about the arguments the user specified.
To give a specific example, many of the shell scripts where I work have been defined to accept their arguments in a specific order. You can call start_server.sh --server myserver --port 80 but start_server.sh --port 80 --server myserver fails with You must specify a server to start. - it makes the parsing code a lot simpler, but it's hardly intuitive.
So a first pass solution could be something as simple as having Python take in the arguments, sort them (keeping their parameters next to them) and returning the sorted arguments. So the shell script still does some parsing and sanitization, but the user can input much more arbitrary content than the shell script natively accepts, something like:
# script.sh -o -aR --dir /tmp/test --verbose
#!/bin/bash
args=$(order.py "$#")
# args is set to "-a --dir /tmp/test -o -R --verbose"
# simpler processing now that we can guarantee the order of parameters
There's some obvious limitations here, notably that parse.py can't distinguish between a final option with an argument and the start of indexed arguments, but that doesn't seem that terrible.
So here's my question: 1) Is there any existing (Python preferably) utility to enable CLI parsing by something more powerful than bash, which can then be accessed by the rest of my bash script after sanitization, or 2) Has anyone done this before? Are there issues or pitfalls or better solutions I'm not aware of? Care to share your implementation?
One (very half-baked) idea:
#!/bin/bash
# Some sort of simple syntax to describe to Python what arguments to accept
opts='
"a", "append", boolean, help="Append to existing file"
"dir", str, help="Directory to run from"
"o", "overwrite", boolean, help="Overwrite duplicates"
"R", "recurse", boolean, help="Recurse into subdirectories"
"v", "verbose", boolean, help="Print additional information"
'
# Takes in CLI arguments and outputs a sanitized structure (JSON?) or fails
p=$(parse.py "Runs complex_function with nice argument parsing" "$opts" "$#")
if [ $? -ne 0 ]; exit 1; fi # while parse outputs usage to stderr
# Takes the sanitized structure and an argument to get
append=$(arg.py "$p" append)
overwrite=$(arg.py "$p" overwrite)
recurse=$(arg.py "$p" recurse)
verbose=$(arg.py "$p" verbose)
cd $(python arg.py "$p" dir)
complex_function $append $overwrite $recurse $verbose
Two lines of code, along with concise descriptions of the arguments to expect, and we're on to the actual script behavior. Maybe I'm crazy, but that seems way nicer than what I feel like I have to do now.
I've seen Parsing shell script arguments and things like this wiki page on easy CLI argument parsing, but many of these patterns feel clunky and error prone, and I dislike having to re-implement them every time I write a shell script, especially when Python, Java, etc. have such nice argument processing libraries.
You could potentially take advantage of associative arrays in bash to help obtain your goal.
declare -A opts=($(getopts.py $#))
cd ${opts[dir]}
complex_function ${opts[append]} ${opts[overwrite]} ${opts[recurse]} \
${opts[verbose]} ${opts[args]}
To make this work, getopts.py should be a python script that parses and sanitizes your arguments. It should print a string like the following:
[dir]=/tmp
[append]=foo
[overwrite]=bar
[recurse]=baz
[verbose]=fizzbuzz
[args]="a b c d"
You could set aside values for checking that the options were able to be properly parsed and sanitized as well.
Returned from getopts.py:
[__error__]=true
Added to bash script:
if ${opts[__error__]}; then
exit 1
fi
If you would rather work with the exit code from getopts.py, you could play with eval:
getopts=$(getopts.py $#) || exit 1
eval declare -A opts=($getopts)
Alternatively:
getopts=$(getopts.py $#)
if [[ $? -ne 0 ]]; then
exit 1;
fi
eval declare -A opts=($getopts)
Having the very same needs, I ended up writing an optparse-inspired parser for bash (which actually uses python internally); you can find it here:
https://github.com/carlobaldassi/bash_optparse
See the README at the bottom for a quick explanation. You may want to check out a simple example at:
https://github.com/carlobaldassi/bash_optparse/blob/master/doc/example_script_simple
From my experience, it's quite robust (I'm super-paranoid), feature-rich, etc., and I'm using it heavily in my scripts. I hope it may be useful to others. Feedback/contributions welcome.
Edit: I haven't used it (yet), but if I were posting this answer today I would probably recommend https://github.com/docopt/docopts instead of a custom approach like the one described below.
I've put together a short Python script that does most of what I want. I'm not convinced it's production quality yet (notably error handling is lacking), but it's better than nothing. I'd welcome any feedback.
It takes advantage of the set builtin to re-assign the positional arguments, allowing the remainder of the script to still handle them as desired.
bashparse.py
#!/usr/bin/env python
import optparse, sys
from pipes import quote
'''
Uses Python's optparse library to simplify command argument parsing.
Takes in a set of optparse arguments, separated by newlines, followed by command line arguments, as argv[2] and argv[3:]
and outputs a series of bash commands to populate associated variables.
'''
class _ThrowParser(optparse.OptionParser):
def error(self, msg):
"""Overrides optparse's default error handling
and instead raises an exception which will be caught upstream
"""
raise optparse.OptParseError(msg)
def gen_parser(usage, opts_ls):
'''Takes a list of strings which can be used as the parameters to optparse's add_option function.
Returns a parser object able to parse those options
'''
parser = _ThrowParser(usage=usage)
for opts in opts_ls:
if opts:
# yes, I know it's evil, but it's easy
eval('parser.add_option(%s)' % opts)
return parser
def print_bash(opts, args):
'''Takes the result of optparse and outputs commands to update a shell'''
for opt, val in opts.items():
if val:
print('%s=%s' % (opt, quote(val)))
print("set -- %s" % " ".join(quote(a) for a in args))
if __name__ == "__main__":
if len(sys.argv) < 2:
sys.stderr.write("Needs at least a usage string and a set of options to parse")
sys.exit(2)
parser = gen_parser(sys.argv[1], sys.argv[2].split('\n'))
(opts, args) = parser.parse_args(sys.argv[3:])
print_bash(opts.__dict__, args)
Example usage:
#!/bin/bash
usage="[-f FILENAME] [-t|--truncate] [ARGS...]"
opts='
"-f"
"-t", "--truncate",action="store_true"
'
echo "$(./bashparse.py "$usage" "$opts" "$#")"
eval "$(./bashparse.py "$usage" "$opts" "$#")"
echo
echo OUTPUT
echo $f
echo $#
echo $0 $2
Which, if run as: ./run.sh one -f 'a_filename.txt' "two' still two" three outputs the following (notice that the internal positional variables are still correct):
f=a_filename.txt
set -- one 'two'"'"' still two' three
OUTPUT
a_filename.txt
one two' still two three
./run.sh two' still two
Disregarding the debugging output, you're looking at approximately four lines to construct a powerful argument parser. Thoughts?
The original premise of my question assumes that delegating to Python is the right approach to simplify argument parsing. If we drop the language requirement we can actually do a decent job* in Bash, using getopts and a little eval magic:
main() {
local _usage='foo [-a] [-b] [-f val] [-v val] [args ...]'
eval "$(parse_opts 'f:v:ab')"
echo "f=$f v=$v a=$a b=$b -- $#: $*"
}
main "$#"
The implementation of parse_opts is in this gist, but the basic approach is to convert options into local variables which can then be handled like normal. All the standard getopts boilerplate is hidden away, and error handling works as expected.
Because it uses local variables within a function, parse_opts is not just useful for command line arguments, it can be used with any function in your script.
* I say "decent job" because Bash's getopts is a fairly limited parser and only supports single-letter options. Elegant, expressive CLIs are still better implemented in other languages like Python. But for reasonably small functions or scripts this provides a nice middle ground without adding too much complexity or bloat.

How to make a custom command line interface using OptionParser?

I am using the OptionParser from optparse module to parse my command that I get using the raw_input().
I have these questions.
1.) I use OptionParser to parse this input, say for eg. (getting multiple args)
my prompt> -a foo -b bar -c spam eggs
I did this with setting the action='store_true' in add_option() for '-c',now if there is another option with multiple argument say -d x y z then how to know which arguments come from which option? also if one of the arguments has to be parsed again like
my prompt> -a foo -b bar -c spam '-f anotheroption'
2.) if i wanted to do something like this..
my prompt> -a foo -b bar
my prompt> -c spam eggs
my prompt> -d x y z
now each entry must not affect the other options set by the previous command. how to accomplish these?
For part 2: you want a new OptionParser instance for each line you process. And look at the cmd module for writing a command loop like this.
You can also solve #1 using the nargs option attribute as follows:
parser = OptionParser()
parser.add_option("-c", "", nargs=2)
parser.add_option("-d", "", nargs=3)
optparse solves #1 by requiring that an argument always have the same number of parameters (even if that number is 0), variable-parameter arguments are not allowed:
Typically, a given option either takes
an argument or it doesn’t. Lots of
people want an “optional option
arguments” feature, meaning that some
options will take an argument if they
see it, and won’t if they don’t. This
is somewhat controversial, because it
makes parsing ambiguous: if "-a" takes
an optional argument and "-b" is
another option entirely, how do we
interpret "-ab"? Because of this
ambiguity, optparse does not support
this feature.
You would solve #2 by not reusing the previous values to parse_args, so it would create a new values object rather than update.

Categories