Python argparse : how to detect duplicated optional argument? - python

I'm using argparse with optional parameter, but I want to avoid having something like this : script.py -a 1 -b -a 2
Here we have twice the optional parameter 'a', and only the second parameter is returned. I want either to get both values or get an error message.
How should I define the argument ?
[Edit]
This is the code:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-a', dest='alpha', action='store', nargs='?')
parser.add_argument('-b', dest='beta', action='store', nargs='?')
params, undefParams = self.parser.parse_known_args()

append action will collect the values from repeated use in a list
parser.add_argument('-a', '--alpha', action='append')
producing an args namespace like:
namespace(alpha=['1','3'], b='4')
After parsing you can check args.alpha, and accept or complain about the number of values. parser.error('repeated -a') can be used to issue an argparse style error message.
You could implement similar functionality in a custom Action class, but that requires understanding the basic structure and operation of such a class. I can't think anything that can be done in an Action that can't just as well be done in the appended list after.
https://stackoverflow.com/a/23032953/901925 is an answer with a no-repeats custom Action.
Why are you using nargs='?' with flagged arguments like this? Without a const parameter this is nearly useless (see the nargs=? section in the docs).
Another similar SO: Python argparse with nargs behaviour incorrect

Related

Is it possible to validate `argparse` default argument values?

Is it possible to tell argparse to give the same errors on default argument values as it would on user-specified argument values?
For example, the following will not result in any error:
parser = argparse.ArgumentParser()
parser.add_argument('--choice', choices=['a', 'b', 'c'], default='invalid')
args = vars(parser.parse_args()) # args = {'choice': 'invalid'}
whereas omitting the default, and having the user specify --choice=invalid on the command-line will result in an error (as expected).
Reason for asking is that I would like to have the user to be able to specify default command-line options in a JSON file which are then set using ArgumentParser.set_defaults(), but unfortunately the behaviour demonstrated above prevents these user-specified defaults from being validated.
Update: argparse is inconsistent and I now consider the behavior above to be a bug. The following does trigger an error:
parser = argparse.ArgumentParser()
parser.add_argument('--num', type=int, default='foo')
args = parser.parse_args() # triggers exception in case --num is not
# specified on the command-line
I have opened a bug report for this: https://github.com/python/cpython/issues/100949
I took the time to dig into the source code, and what is happening is that a check is only happening for arguments you gave on the command line. The only way to enforce a check, in my opinion, is to subclass ArgumentParser and have it do the check when you add the argument:
class ValidatingArgumentParser(argparse.ArgumentParser):
def add_argument(self, *args, **kwargs):
super().add_argument(*args, **kwargs)
self._check_value(self._actions[-1],kwargs['default'])
No. Explicit arguments need to be validated because they originate from outside the source code. Default values originate in the source code, so it's the job of the programmer, not the argument parser, to ensure they are valid.
(This is the difference between validation and debugging.)
(Using set_defaults on unvalidated user input still falls under the purview of debugging, as it's not the argument parser itself adding the default values, but the programmer.)

Most common way to work with argparse parsed values

In using argparse this is the first time I've come across a 'Namespace' object. What is the most common way to work with these objects? For example, if I have this initialization code:
import argparse
parser = argparse.ArgumentParser(description='Dedupe library.', allow_abbrev=True)
parser.add_argument( '-a', '--all', nargs='+', type=int, help='(Optional) Enter one or more IDs.')
parser.add_argument( '-r', '--reverse', nargs='+', help='(Optional) Enter one or more IDs.')
It seems like the library adds a property on every --long option (if it exists, otherwise the short -s option), so something like the following works:
# test.py
p = parser.parse_args()
print (p.all, p.reverse)
# -------------------------------------
$ python test.py -a 2 3 -r asdf
# [2, 3] ['asdf']
Is this the most common way to work with the argparse output, or how is this usually done?
Every argument performs some kind of action, specified by the action argument to add_argument. The default is a store action.
Each store action saves one (or more) values to an attribute in the resulting namespace. You can specify which attribute with the dest argument to add_argument, but more commonly the name is inferred from the first long option name (or the first short name, if there are no long names).
Note that you can have multiple options that affect the same attribute. A common use is to have multiple store_const actions that save a different hard-coded value to a single attribute.
p.add_argument("--high", action='store_const', dest='level', const='high')
p.add_argument("--med", action='store_const', dest='level', const='medium')
p.add_argument("--low", action='store_const', dest='level', const='low')
You could consider this as providing a series of aliases for an option that takes an explicit argument to specify a level:
p.add_argument("--level", choices=['high', 'medium', low'])
where --high has the same effect as --level high.
>>> p.parse_args(["--level", "high"]).level
'high'
>>> p.parse_args(["--high"]).level
'high'

Parsing argparse input bit by bit

I am using Argparse to parse shell input to my Python function.
The tricky part is that this script first reads in a file that partially determines what kind of arguments are available to Argparse (it's a JSON file containing criteria by which the user can specify what data to output).
But before these arguments are added to my parser, I would like to read in some arguments relating to the file reading itself. (e.g. whether to fix the formatting of the input file). Kinda like this:
test.py (fix_formatting=True, **more arguments added later)
When I try to run args = parser.parse_args() twice, after the initial input and after adding more keys, things fall apart: Argparse quite predictably complains that some of the user input are unrecognized arguments:. I thought I might use subparsers to that end.
So I tried variations of (following the example in the docs as best as I could):
def main():
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(help='sub-command help')
settingsparser = subparsers.add_parser('settings') #i want a subparser called 'settings'
settingsparser.add_argument('--fix_formatting', action='store_true') #this subparser shall have a --fix_formatting
Then I try to parse only the "settings" part like so:
settings=parser.parse_args(['settings'])
This seems to work. But then I add my other keys and things break:
keys=['alpha','beta','gamma','delta']
for key in keys:
parser.add_argument("--"+key, type=str, help="X")
args = parser.parse_args()
If I parse any input for any of the arguments from keys, Argparse complains that I make an invalid choice: [...] (choose from 'settings'). Now I don't understand why I have to choose from "settings"; the docs say that the parse
will only contain attributes for the main parser and the subparser that was selected by the command line (and not any other subparsers)
what is my error of understanding here?
and if this is the wrong approach, how would one go about parsing one bit of input before another bit?
Any help is much appreciated!
parse_args calls parse_known_args. This returns the args namesparse along with a list of strings (from sys.argv) that it could not process (extras). parse_args raises this error if this list is not empty.
https://docs.python.org/3/library/argparse.html#partial-parsing
Thus parse_known_args is useful if you want to parse some of the input.
sys.argv remains unchanged. Subsequent calls to a parser (whether it was the original one or not) use that again, unless you pass the extras.
I don't think subparsers help you here. They aren't meant for delayed or two stage parsing. I'd suggest playing with the documentation examples for subparsers first.
To the main parser, the subparsers look like
subparsers = parser.add_argument('cmd', choices=['select',...])
In other words, it adds a positional argument where the choices are the subparser names that you define. That may help you see why it expects you to name select. Positionals are normally required.
(there's an exception to this in recent versions, https://stackoverflow.com/a/22994500/901925)

How ca I get Python ArgParse to stop overwritting positional arguments in child parser

I am attempting to get my script working, but argparse keeps overwriting my positional arguments from the parent parser. How can I get argparse to honor the parent's value for these? It does keep values from optional args.
Here is a very simplified version of what I need. If you run this, you will see that the args are overwritten.
testargs.py
#! /usr/bin/env python3
import argparse
import sys
def main():
preparser = argparse.ArgumentParser(add_help=False)
preparser.add_argument('first',
nargs='?')
preparser.add_argument('outfile',
nargs='?',
type=argparse.FileType('w', encoding='utf-8'),
default=sys.stdout,
help='Output file')
preparser.add_argument(
'--do-something','-d',
action='store_true')
# Parse args with preparser, and find config file
args, remaining_argv = preparser.parse_known_args()
print(args)
parser = argparse.ArgumentParser(
parents=[preparser],
description=__doc__)
parser.add_argument(
'--clear-screen', '-c',
action='store_true')
args = parser.parse_args(args=remaining_argv,namespace=args )
print(args)
if __name__ == '__main__':
main()
And call it with testargs.py something /tmp/test.txt -d -c
You will see it keeps the -d but drops both the positional args and reverts them to defaults.
EDIT: see additional comments in the accepted answer for some caveats.
When you specify parents=[preparser] it means that parser is an extension of preparser, and will parse all arguments relevent to preparser which it is never given.
Lets say the preparser only has one positional argument first and the parser only has one positional argument second, when you make parser a child of preparser it expects both arguments:
import argparse
parser1 = argparse.ArgumentParser(add_help=False)
parser1.add_argument("first")
parser2 = argparse.ArgumentParser(parents=[parser1])
parser2.add_argument("second")
args2 = parser2.parse_args(["arg1","arg2"])
assert args2.first == "arg1" and args2.second == "arg2"
However passing only the remaining arguments that are left over from parser1 would just be ['second'] which is not the correct arguments to parser2:
parser1 = argparse.ArgumentParser(add_help=False)
parser1.add_argument("first")
args1, remaining_args = parser1.parse_known_args(["arg1","arg2"])
parser2 = argparse.ArgumentParser(parents=[parser1])
parser2.add_argument("second")
>>> args1
Namespace(first='arg1')
>>> remaining_args
['arg2']
>>> parser2.parse_args(remaining_args)
usage: test.py [-h] first second
test.py: error: the following arguments are required: second
To only process the arguments that were not handled by the first pass, do not specify it as the parent to the second parser:
parser1 = argparse.ArgumentParser(add_help=False)
parser1.add_argument("first")
args1, remaining_args = parser1.parse_known_args(["arg1","arg2"])
parser2 = argparse.ArgumentParser() #parents=[parser1]) #NO PARENT!
parser2.add_argument("second")
args2 = parser2.parse_args(remaining_args,args1)
assert args2.first == "arg1" and args2.second == "arg2"
The 2 positionals are nargs='?'. A positional like that is always 'seen', since an empty list matches that nargs.
First time through 'text.txt' matches with first and is put in the Namespace. Second time through there isn't any string to match, so the default is used - same as if you had not given that string the first time.
If I change first to have the default nargs, I get
error: the following arguments are required: first
from the 2nd parser. Even though there's a value in the Namespace it still tries to get a value from the argv. (it's like a default, but not quite).
Defaults for positionals with nargs='?' (or *) are tricky. They are optional, but not in quite the same way as optionals. The positional Actions are still called, but with a empty list of values.
I don't think the parents feature does anything for you. preparser already handles that set of arguments; there's no need to handle them again in parser, especially since all the relevant argument strings have been stripped out.
Another option is to leave the parents in, but use the default sys.argv[1:] in the 2nd parser. (but beware of side effects like opening files)
args = parser.parse_args(namespace=args )
A third option is to parse the arguments independently and merge them with a dictionary update.
adict = vars(preparse_args)
adict.update(vars(parser_args))
# taking some care in who overrides who
For more details look in argparse.py file at ArgumentParser._get_values, specifically the not arg_strings cases.
A note about the FileType. That type works nicely for small scripts where you will use the files right away and exit. It isn't so good on large programs where you might want to close the file after use (close stdout???), or use files in a with context.
edit - note on parents
add_argument creates an Action object, and adds it to the parser's list of actions. parse_args basically matches input strings with these actions.
parents just copies those Action objects (by reference) from parent to child. To the child parser it is just as though the actions were created with add_argument directly.
parents is most useful when you are importing a parser and don't have direct access to its definition. If you are defining both parent and child, then parents just saves you some typing/cut-n-paste.
This and other SO questions (mostly triggered the by-reference copy) show that the developers did not intend you to use both the parent and child to do parsing. It can be done, but there are glitches that the they did not consider.
===================
I can imagine defining a custom Action class that would 'behave' in a situation like this. It might, for example, check the namespace for some not default value before adding its own (possibly default) value.
Consider, for example if I changed the action of first to 'append':
preparser.add_argument('first', action='append', nargs='?')
The result is:
1840:~/mypy$ python3 stack37147683.py /tmp/test.txt -d -c
Namespace(do_something=True, first=['/tmp/test.txt'], outfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)
Namespace(clear_screen=True, do_something=True, first=['/tmp/test.txt', None], outfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)
From the first parser, first=['/tmp/test.txt']; from the second, first=['/tmp/test.txt', None].
Because of the append, the item from the first is preserved, and a new default has been added by the second parser.

argparse key=value parameters

This first link has the same question in the first section, but it is unanswered
(python argparse: parameter=value). And this second question is similar, but I can't seem to get it working for my particular case
( Using argparse to parse arguments of form "arg= val").
So my situation is this -- I am re-writing a Python wrapper which is used by many other scripts (I would prefer not to modify these other scripts). Currently, the Python wrapper is called with command line arguments of the form --key=value for a number of different arguments, but was parsed manually. I would like to parse them with argparse.
N.B. The argument names are unwieldy, so I am renaming using the dest option in add_argument.
parser = argparse.ArgumentParser(description='Wrappin Ronnie Reagan')
parser.add_argument("--veryLongArgName1", nargs=1, dest="arg1", required=True)
parser.add_argument("--veryLongArgName2", nargs=1, dest="arg2")
parser.add_argument("--veryLongArgName3", nargs=1, dest="arg3")
userOpts = vars(parser.parse_args())
Which, while apparently parsing the passed command lines correctly, displays this as the help:
usage: testing_argsparse.py [-h] --veryLongArgName1 ARG1
[--veryLongArgName2 ARG2]
[--veryLongArgName3 ARG3]
testing_argsparse.py: error: argument --veryLongArgName1 is required
But what I want is that all parameters are specified with the --key=value format, not --key value. i.e.
usage: testing_argsparse.py [-h] --veryLongArgName1=ARG1
[--veryLongArgName2=ARG2]
[--veryLongArgName3=ARG3]
testing_argsparse.py: error: argument --veryLongArgName1 is required
testing_argsparse.py --veryLongArgName1=foo
works. argparse module accepts both --veryLongArgName1=foo and --veryLongArgName1 foo formats.
What exact command line arguments are you trying to pass to argparse that's causing it to not work?
A little late but for anyone with a similar request as the OP you could use a custom HelpFormatter.
class ArgFormatter(argparse.HelpFormatter):
def _format_args(self, *args):
result = super(ArgFormatter, self)._format_args(*args)
return result and '%%%' + result
def _format_actions_usage(self, *args):
result = super(ArgFormatter, self)._format_actions_usage(*args)
return result and result.replace(' %%%', '=')
This can then be passed to ArgumentParser to give the wanted behavior.
parser = argparse.ArgumentParser(
description='Wrappin Ronnie Reagan',
formatter_class=ArgFormatter)
This intercepts the args (ARG1, ARG2, ...) and adds a custom prefix which is later replaced (along with the unwanted space) for an = symbol. The and in the return statements makes sure to only modify the result if it's non-empty.

Categories