Argparse positional multiple choices default subset: Invalid Choice - python

I'm using argparse to obtain a subset of elements using choices with nargs='*'. If no element is introduced, I want to return by default a subset of the elements, but I get an Invalid Choice error.
Here is my sample code:
from argparse import ArgumentParser
parser = ArgumentParser()
parser.add_argument(
    "choices",
    nargs="*",
    type=int,
    choices=[1, 2, 3, 4, 5],
    default=[1, 3, 5],
)
if __name__ == "__main__":
    print(parser.parse_args().choices)
The output I get is:
$ ./thing.py 3
[3]
$ ./thing.py 1 2
[1, 2]
$ ./thing.py
usage: thing.py [-h] [{1,2,3,4,5} ...]
thing.py: error: argument choices: invalid choice: [1, 3, 5] (choose from 1, 2, 3, 4, 5)
I can set default to an int, but not a list. Is there any workaround to this?
Edit: This only seems to happen with positional arguments. There aren't any problems if using nargs="+" and required=False in a non-positional argument.

To elaborate on my comments.
Normal processing of defaults, as @Tomerikoo quotes, is to pass string defaults through both the type and choices checks. The defaults are placed in the args namespace at the start of parsing. At the end, if the defaults have not been overwritten by user input, they are conditionally converted. Non-string defaults are left as is, without type or choices checking.
But a positional with nargs='*' is different. It is "always seen", since an empty list (i.e. none) of strings satisfies its nargs. With usual handling this would overwrite the default, resulting in Namespace(choices=[]).
To get around that the _get_values method replaces the [] with the default. This new value is passed through the choices test, but not through type. This results in the behavior that the OP encountered.
The relevant parts of _get_values are quoted below. The first handles nargs='?', the second '*', and the third normal cases. _get_value does the type test. _check_value does the choices test.
def _get_values(self, action, arg_strings):
    ....
    # optional argument produces a default when not present
    if not arg_strings and action.nargs == OPTIONAL:
        if action.option_strings:
            value = action.const
        else:
            value = action.default
        if isinstance(value, str):
            value = self._get_value(action, value)
            self._check_value(action, value)
    # when nargs='*' on a positional, if there were no command-line
    # args, use the default if it is anything other than None
    elif (not arg_strings and action.nargs == ZERO_OR_MORE and
          not action.option_strings):
        if action.default is not None:
            value = action.default
        else:
            value = arg_strings
        self._check_value(action, value)
    # single argument or optional argument produces a single value
    elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]:
        arg_string, = arg_strings
        value = self._get_value(action, arg_string)
        self._check_value(action, value)
I should add another block, which is used for '*' and non-empty user input:
    # all other types of nargs produce a list
    else:
        value = [self._get_value(action, v) for v in arg_strings]
        for v in value:
            self._check_value(action, v)
Note that here, individual strings are type-converted and checked one by one, whereas in the empty nargs='*' case above, it's the whole default that is checked at once.
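The difference between the two checks comes down to a plain membership test. A minimal sketch, independent of argparse internals (the variable names are only for illustration):

```python
choices = [1, 2, 3, 4, 5]
default = [1, 3, 5]

# Checking the whole default at once asks "is this list an element of choices?"
whole_list_ok = default in choices

# Checking item by item, as is done for values typed by the user
each_item_ok = all(item in choices for item in default)

print(whole_list_ok, each_item_ok)
```

The list as a whole is not one of the allowed values, even though each of its items is.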
There have been bug reports related to this, but no one has come up with a good reason to change it.
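One possible workaround, sketched here as an assumption rather than something proposed in the answer above, is to drop choices and do the membership test inside the type callable. The type function only ever sees individual command-line strings, so a list default is never tested against it. The names ALLOWED and allowed_int are hypothetical.

```python
import argparse

ALLOWED = [1, 2, 3, 4, 5]  # hypothetical name for the permitted values


def allowed_int(text):
    """Convert one command-line string to int and validate it (illustrative helper)."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"not an integer: {text!r}")
    if value not in ALLOWED:
        raise argparse.ArgumentTypeError(
            f"invalid choice: {value} (choose from {ALLOWED})"
        )
    return value


parser = argparse.ArgumentParser()
# No choices= here: validation happens per item inside allowed_int
parser.add_argument("choices", nargs="*", type=allowed_int, default=[1, 3, 5])

print(parser.parse_args([]).choices)
print(parser.parse_args(["1", "2"]).choices)
```

The trade-off is that the usage line no longer lists the allowed values automatically, so they would have to be mentioned in the help text by hand.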

Related

Why does argparse not accept "--" as argument?

My script takes -d, --delimiter as argument:
parser.add_argument('-d', '--delimiter')
but when I pass it -- as delimiter, it is empty
script.py --delimiter='--'
I know -- is special in argument/parameter parsing, but I am using it in the form --option='--' and quoted.
Why does it not work?
I am using Python 3.7.3
Here is test code:
#!/bin/python3
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--delimiter')
parser.add_argument('pattern')
args = parser.parse_args()
print(args.delimiter)
When I run it as script --delimiter=-- AAA it prints empty args.delimiter.
This looks like a bug. You should report it.
This code in argparse.py is the start of _get_values, one of the primary helper functions for parsing values:
if action.nargs not in [PARSER, REMAINDER]:
    try:
        arg_strings.remove('--')
    except ValueError:
        pass
The code receives the -- argument as the single element of a list ['--']. It tries to remove '--' from the list, because when using -- as an end-of-options marker, the '--' string will end up in arg_strings for one of the _get_values calls. However, when '--' is the actual argument value, the code still removes it anyway, so arg_strings ends up being an empty list instead of a single-element list.
The code then goes through an else-if chain for handling different kinds of argument (branch bodies omitted to save space here):
# optional argument produces a default when not present
if not arg_strings and action.nargs == OPTIONAL:
    ...
# when nargs='*' on a positional, if there were no command-line
# args, use the default if it is anything other than None
elif (not arg_strings and action.nargs == ZERO_OR_MORE and
      not action.option_strings):
    ...
# single argument or optional argument produces a single value
elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]:
    ...
# REMAINDER arguments convert all values, checking none
elif action.nargs == REMAINDER:
    ...
# PARSER arguments convert all values, but check only the first
elif action.nargs == PARSER:
    ...
# SUPPRESS argument does not put anything in the namespace
elif action.nargs == SUPPRESS:
    ...
# all other types of nargs produce a list
else:
    ...
This code should go through the 3rd branch,
# single argument or optional argument produces a single value
elif len(arg_strings) == 1 and action.nargs in [None, OPTIONAL]:
but because the argument is missing from arg_strings, len(arg_strings) is 0. It instead hits the final case, which is supposed to handle a completely different kind of argument. That branch ends up returning an empty list instead of the '--' string that should have been returned, which is why args.delimiter ends up being an empty list instead of a '--' string.
This bug manifests with positional arguments too. For example,
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('a')
parser.add_argument('b')
args = parser.parse_args(["--", "--", "--"])
print(args)
prints
Namespace(a='--', b=[])
because when _get_values handles the b argument, it receives ['--'] as arg_strings and removes the '--'. When handling the a argument, it receives ['--', '--'], representing one end-of-options marker and one actual -- argument value, and it successfully removes the end-of-options marker, but when handling b, it removes the actual argument value.
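The effect of that remove call on the two lists described above can be reproduced with plain Python lists. This is only a sketch of the list behaviour, not of argparse itself; the variable names are illustrative.

```python
# Strings handed over for 'a': one end-of-options marker plus one real '--' value
for_a = ['--', '--']
for_a.remove('--')  # list.remove drops only the first occurrence

# Strings handed over for 'b': just the real '--' value
for_b = ['--']
for_b.remove('--')  # the value itself is dropped

print(for_a, for_b)
```

After the call, the list for a still holds one '--', while the list for b is empty, which matches the Namespace shown above.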
Existing bug report
Patches have been suggested, but none has been applied. See the issue "Argparse incorrectly handles '--' as argument to option".
Some simple examples:
In [1]: import argparse
In [2]: p = argparse.ArgumentParser()
In [3]: a = p.add_argument('--foo')
In [4]: p.parse_args(['--foo=123'])
Out[4]: Namespace(foo='123')
The unexpected case:
In [5]: p.parse_args(['--foo=--'])
Out[5]: Namespace(foo=[])
A fully quoted value passes through, but I won't get into how you might achieve this via a shell call:
In [6]: p.parse_args(['--foo="--"'])
Out[6]: Namespace(foo='"--"')
'--' as separate string:
In [7]: p.parse_args(['--foo','--'])
usage: ipython3 [-h] [--foo FOO]
ipython3: error: argument --foo: expected one argument
...
another example of the double quote:
In [8]: p.parse_args(['--foo','"--"'])
Out[8]: Namespace(foo='"--"')
In _parse_known_args, the input is scanned and classified as "O" or "A". The '--' is handled as
# all args after -- are non-options
if arg_string == '--':
    arg_string_pattern_parts.append('-')
    for arg_string in arg_strings_iter:
        arg_string_pattern_parts.append('A')
I think the '--' strings are stripped out after that, but I haven't found that part of the code yet. I'm also not finding where the '--foo=...' version is handled.
I vaguely recall some bug reports about the handling of multiple occurrences of '--'. With the migration to GitHub, I'm not following argparse developments as much as I used to.
edit
_get_values starts with:
def _get_values(self, action, arg_strings):
    # for everything but PARSER, REMAINDER args, strip out first '--'
    if action.nargs not in [PARSER, REMAINDER]:
        try:
            arg_strings.remove('--')
        except ValueError:
            pass
Why that results in an empty list will require more thought and testing.
The '=' is handled in _parse_optional, which is used during the first scan:
# if the option string before the "=" is present, return the action
if '=' in arg_string:
    option_string, explicit_arg = arg_string.split('=', 1)
    if option_string in self._option_string_actions:
        action = self._option_string_actions[option_string]
        return action, option_string, explicit_arg
old bug issues
argparse handling multiple "--" in args improperly
argparse: Allow the use of -- to break out of nargs and into subparser
It calls parse_args which calls parse_known_args which calls _parse_known_args.
Then, on line 2078 (or something similar), it does this (inside a while loop going through the string):
start_index = consume_optional(start_index)
which calls consume_optional (which makes sense, because it is parsing an optional argument right now), defined earlier in the method _parse_known_args. When given --delimiter='--', it builds this action_tuples list:
# if the action expect exactly one argument, we've
# successfully matched the option; exit the loop
elif arg_count == 1:
    stop = start_index + 1
    args = [explicit_arg]
    action_tuples.append((action, args, option_string))
    break
##
## The above code gives you the following:
##
action_tuples=[(_StoreAction(option_strings=['-d', '--delimiter'], dest='delimiter', nargs=None, const=None, default=None, type=None, choices=None, help=None, metavar=None), ['--'], '--delimiter')]
That list is then iterated over, and each entry is fed to take_action on line 2009:
assert action_tuples
for action, args, option_string in action_tuples:
    take_action(action, args, option_string)
return stop
The take_action function will then call self._get_values(action, argument_strings) on line 1918, which, as mentioned in the answer by @hpaulj, removes the --. Then, you're left with the empty list.

Using argparse to get a list of unique values from choices

I have a program in which an argument can take values from a defined list. Every value can only be passed once. At least one value must be chosen. If no values are chosen, the default is that all values were chosen. The following code exhibits the desired behavior.
import argparse
############################# Argument parsing
all_targets = ["bar", "baz", "buzz"]
p = argparse.ArgumentParser()
p.add_argument(
"--targets", default=all_targets, choices=all_targets, nargs="+",
)
args = p.parse_args()
targets = args.targets
if len(targets) > len(set(targets)):
    raise ValueError("You may only foo every target once!")
############################## Actual program
for target in targets:
    print(f"You just foo'ed the {target}!")
However, now I am dealing with argument validation after parsing, which seems like an anti-pattern, since argparse typically does type validation etc. in the parsing step.
Can I get the same behavior as above from the ArgumentParser instead of making post-checks on the arguments?
@chepner provided some alternatives. The one that felt most natural to me was the use of Actions. Incorporating the uniqueness check can be done as below:
import argparse
class CheckUniqueStore(argparse.Action):
    """Checks that the list of arguments contains no duplicates, then stores"""
    def __call__(self, parser, namespace, values, option_string=None):
        if len(values) > len(set(values)):
            raise argparse.ArgumentError(
                self,
                "You cannot specify the same value multiple times. "
                + f"You provided {values}",
            )
        setattr(namespace, self.dest, values)
############################# Argument parsing
all_targets = ["bar", "baz", "buzz"]
p = argparse.ArgumentParser()
p.add_argument(
    "--targets",
    default=all_targets,
    choices=all_targets,
    nargs="+",
    action=CheckUniqueStore,
)
args = p.parse_args()
############################## Actual program
for target in args.targets:
    print(f"You just foo'ed a {target}!")
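To see how such an Action behaves without touching the real command line, the parser can be fed explicit argument lists. The sketch below repeats the class so that it is self-contained; the wrapper parse_targets is a hypothetical helper added only for the demonstration. With default parser settings, the ArgumentError raised inside the action is reported through parser.error, which exits with SystemExit.

```python
import argparse


class CheckUniqueStore(argparse.Action):
    """Checks that the list of arguments contains no duplicates, then stores."""

    def __call__(self, parser, namespace, values, option_string=None):
        if len(values) > len(set(values)):
            raise argparse.ArgumentError(
                self, f"You cannot specify the same value multiple times: {values}"
            )
        setattr(namespace, self.dest, values)


all_targets = ["bar", "baz", "buzz"]


def parse_targets(argv):
    """Parse an explicit argument list and return the targets (hypothetical helper)."""
    p = argparse.ArgumentParser(prog="demo")
    p.add_argument("--targets", default=all_targets, choices=all_targets,
                   nargs="+", action=CheckUniqueStore)
    return p.parse_args(argv).targets


print(parse_targets([]))                            # falls back to the default
print(parse_targets(["--targets", "bar", "buzz"]))  # unique values are stored

try:
    parse_targets(["--targets", "bar", "bar"])      # duplicate value
except SystemExit:
    print("duplicates were rejected during parsing")
```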

Make argparse treat dashes and underscore identically

argparse replaces dashes in optional arguments by underscores to determine their destination:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--use-unicorns', action='store_true')
args = parser.parse_args(['--use-unicorns'])
print(args) # returns: Namespace(use_unicorns=True)
However the user has to remember whether the option is --use-unicorns or --use_unicorns; using the wrong variant raises an error.
This can cause some frustration as the variable args.use_unicorns in the code does not make it clear which variant was defined.
How can I make argparse accept both --use-unicorns and --use_unicorns as valid ways to define this optional argument?
parser.add_argument('--use-unicorns', action='store_true')
args = parser.parse_args(['--use-unicorns'])
print(args) # returns: Namespace(use_unicorns=True)
argparse translates the '-' to '_' because the use of '-' in flags is well-established POSIX practice. But args.use-unicorns is not acceptable Python. In other words, it does the translation so the dest will be a valid Python variable or attribute name.
Note that argparse does not perform this translation with positionals. In that case the programmer has full control over the dest parameter, and can choose anything that's convenient. Since the argparse only uses getattr and setattr when accessing the Namespace, the constraints on a valid dest are minimal.
There are two users. There's you, the programmer, and there's your end user. What's convenient for you might not be optimal for the other.
You can also specify a dest when defining an optional. And metavar gives you further control over the help display.
It's parser._get_optional_kwargs that performs the '-' replace:
if dest is None:
    ....
    dest = dest.replace('-', '_')
parser.add_argument accepts more than one flag for an argument. One easy way to make the parser accept both variants is to declare the argument as
parser.add_argument('--use-unicorns', '--use_unicorns', action='store_true')
However both options will show up in the help, and it is not very elegant as it forces one to write the variants manually.
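If the duplicate entry in the help output is the main concern, one variation (an assumption on my part, not part of the answer above) is to register the underscore spelling as a second argument that writes to the same dest and hide it with help=argparse.SUPPRESS. The prog name and help text below are placeholders.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--use-unicorns", action="store_true", help="enable unicorns")
# Hidden alias: same dest, but kept out of the help output
parser.add_argument("--use_unicorns", dest="use_unicorns", action="store_true",
                    help=argparse.SUPPRESS)

print(parser.parse_args(["--use-unicorns"]))
print(parser.parse_args(["--use_unicorns"]))
print(parser.format_help())
```

The variants still have to be written out by hand, and since the two spellings are separate actions, an abbreviated prefix such as --use may be reported as ambiguous.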
An alternative is to subclass argparse.ArgumentParser to make the matching invariant to replacing dashes by underscores. This requires a little bit of fiddling, as both argparse.ArgumentParser._parse_optional and argparse.ArgumentParser._get_option_tuples have to be modified to handle this matching and abbreviations, e.g. --use_unic.
I ended up with the following subclassed methods, where the matching of abbreviations is delegated from _parse_optional to _get_option_tuples:
from gettext import gettext as _
import argparse
class ArgumentParser(argparse.ArgumentParser):
    def _parse_optional(self, arg_string):
        # if it's an empty string, it was meant to be a positional
        if not arg_string:
            return None
        # if it doesn't start with a prefix, it was meant to be positional
        if not arg_string[0] in self.prefix_chars:
            return None
        # if it's just a single character, it was meant to be positional
        if len(arg_string) == 1:
            return None
        option_tuples = self._get_option_tuples(arg_string)
        # if multiple actions match, the option string was ambiguous
        if len(option_tuples) > 1:
            options = ', '.join([option_string
                for action, option_string, explicit_arg in option_tuples])
            args = {'option': arg_string, 'matches': options}
            msg = _('ambiguous option: %(option)s could match %(matches)s')
            self.error(msg % args)
        # if exactly one action matched, this segmentation is good,
        # so return the parsed action
        elif len(option_tuples) == 1:
            option_tuple, = option_tuples
            return option_tuple
        # if it was not found as an option, but it looks like a negative
        # number, it was meant to be positional
        # unless there are negative-number-like options
        if self._negative_number_matcher.match(arg_string):
            if not self._has_negative_number_optionals:
                return None
        # if it contains a space, it was meant to be a positional
        if ' ' in arg_string:
            return None
        # it was meant to be an optional but there is no such option
        # in this parser (though it might be a valid option in a subparser)
        return None, arg_string, None
    def _get_option_tuples(self, option_string):
        result = []
        if '=' in option_string:
            option_prefix, explicit_arg = option_string.split('=', 1)
        else:
            option_prefix = option_string
            explicit_arg = None
        if option_prefix in self._option_string_actions:
            action = self._option_string_actions[option_prefix]
            tup = action, option_prefix, explicit_arg
            result.append(tup)
        else:  # imperfect match
            chars = self.prefix_chars
            if option_string[0] in chars and option_string[1] not in chars:
                # short option: if single character, can be concatenated with arguments
                short_option_prefix = option_string[:2]
                short_explicit_arg = option_string[2:]
                if short_option_prefix in self._option_string_actions:
                    action = self._option_string_actions[short_option_prefix]
                    tup = action, short_option_prefix, short_explicit_arg
                    result.append(tup)
            underscored = {k.replace('-', '_'): k for k in self._option_string_actions}
            option_prefix = option_prefix.replace('-', '_')
            if option_prefix in underscored:
                action = self._option_string_actions[underscored[option_prefix]]
                tup = action, underscored[option_prefix], explicit_arg
                result.append(tup)
            elif self.allow_abbrev:
                for option_string in underscored:
                    if option_string.startswith(option_prefix):
                        action = self._option_string_actions[underscored[option_string]]
                        tup = action, underscored[option_string], explicit_arg
                        result.append(tup)
        # return the collected option tuples
        return result
A lot of this code is directly derived from the corresponding methods in argparse (from the CPython implementation here). Using this subclass should make the matching of optional arguments invariant to using dashes - or underscores _.
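A lighter alternative that avoids overriding private methods is to normalize the argument list before handing it to an unmodified parser. This is a different technique from the subclass above and rests on the assumption that all long options are declared with dashes; normalize_flags and the --stable-name option are hypothetical names for the sketch.

```python
import argparse


def normalize_flags(argv):
    """Rewrite '_' to '-' in long option names, leaving '=value' parts untouched."""
    normalized = []
    for position, arg in enumerate(argv):
        if arg == "--":
            # everything from the end-of-options marker onwards is kept as is
            normalized.extend(argv[position:])
            break
        if arg.startswith("--"):
            name, sep, value = arg.partition("=")
            arg = name.replace("_", "-") + sep + value
        normalized.append(arg)
    return normalized


parser = argparse.ArgumentParser()
parser.add_argument("--use-unicorns", action="store_true")
parser.add_argument("--stable-name")

args = parser.parse_args(normalize_flags(["--use_unicorns", "--stable_name=my_horse"]))
print(args)
```

One limitation: a value passed as a separate argument that itself starts with "--" would also be rewritten, so this is only a sketch for simple command lines.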

Is there a way to create argument in python's argparse that returns true in case no values given

Currently the --resize flag that I created is a boolean, and means that all my objects will be resized:
parser.add_argument("--resize", action="store_true", help="Do dictionary resize")
# ...
# if resize flag is true I'm re-sizing all objects
if args.resize:
    for object in my_objects:
        object.do_resize()
Is there a way to implement an argparse argument that returns True if passed as a boolean flag (--resize), but contains the value if passed with one (--resize 10)?
Example:
python ./my_script.py --resize # Will contain True that means, resize all the objects
python ./my_script.py --resize <index> # Will contain index, that means resize only specific object
In order to optionally accept a value, you need to set nargs to '?'. This will make the argument consume one value if it is specified. If the argument is specified but without value, then the argument will be assigned the argument’s const value, so that’s what you need to specify too:
parser = argparse.ArgumentParser()
parser.add_argument('--resize', nargs='?', const=True)
There are now three cases for this argument:
Not specified: The argument will get its default value (None by default):
>>> parser.parse_args(''.split())
Namespace(resize=None)
Specified without a value: The argument will get its const value:
>>> parser.parse_args('--resize'.split())
Namespace(resize=True)
Specified with a value: The argument will get the specified value:
>>> parser.parse_args('--resize 123'.split())
Namespace(resize='123')
Since you are looking for an index, you can also specify type=int so that the argument value will be automatically parsed as an integer. This will not affect the default or const case, so you still get None or True in those cases:
>>> parser = argparse.ArgumentParser()
>>> parser.add_argument('--resize', nargs='?', type=int, const=True)
>>> parser.parse_args('--resize 123'.split())
Namespace(resize=123)
Your usage would then look something like this:
if args.resize is True:
    for object in my_objects:
        object.do_resize()
elif args.resize is not None:
    # "is not None" rather than plain truthiness, so that index 0 is handled too
    my_objects[args.resize].do_resize()
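Put together, the three cases can be exercised as follows. This is a minimal sketch: objects_to_resize is a hypothetical helper and the strings stand in for real objects. The identity test `is True` matters because `True == 1` in Python, so an equality test would confuse the bare flag with index 1.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--resize", nargs="?", type=int, const=True)


def objects_to_resize(objects, resize):
    """Return the objects selected by the parsed --resize value (hypothetical helper)."""
    if resize is True:        # bare --resize: every object
        return list(objects)
    if resize is not None:    # --resize N: one object, index 0 included
        return [objects[resize]]
    return []                 # flag absent: nothing to resize


my_objects = ["a", "b", "c"]  # stand-ins for real objects
for argv in ([], ["--resize"], ["--resize", "0"], ["--resize", "1"]):
    args = parser.parse_args(argv)
    print(argv, objects_to_resize(my_objects, args.resize))
```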
You can add default=False, const=True and nargs='?' to the argument definition and remove action. This way, if you don't pass --resize it stores False, if you pass --resize with no argument it stores True, and otherwise it stores the passed argument. You will still have to refactor the code a bit to tell whether you have an index to resize or should resize all objects.
Use nargs to accept different number of command-line arguments
Use default and const to set the default value of resize
see here for details: https://docs.python.org/3/library/argparse.html#nargs
parser.add_argument('-resize', dest='resize', type=int, nargs='?', default=False, const=True)
>>tmp.py -resize 1
args.resize: 1
>>tmp.py -resize
args.resize: True
>>tmp.py
args.resize: False

Parse Args that aren't declared

I'm writing a utility for running bash commands that essentially takes as input a string and a list of optional arguments, and uses the optional arguments to interpolate the string.
I'd like it to work like this:
interpolate.py Hello {user_arg} my name is {computer_arg} %% --user_arg=john --computer_arg=hal
The %% is a separator, it separates the string to be interpolated from the arguments. What follows is the arguments used to interpolate the string. In this case the user has chosen user_arg and computer_arg as arguments. The program can't know in advance which argument names the user will choose.
My problem is how to parse the arguments. I can trivially split the input arguments on the separator, but I can't figure out how to get optparse to just give me the optional args as a dictionary, without specifying them in advance. Does anyone know how to do this without writing a lot of regex?
Well, if you use '--' to separate options from arguments instead of %%, optparse/argparse will just give you the arguments as a plain list (treating them as positional arguments instead of switches). After that it's not 'a lot of' regex, it's a mere split:
for argument in args:
    if not argument.startswith("--"):
        # decide what to do in this case...
        continue
    arg_name, arg_value = argument.split("=", 1)
    arg_name = arg_name[2:]
    # use the argument any way you like
With argparse you could use the parse_known_args method to consume predefined arguments and any additional arguments. For example, using the following script
import sys
import argparse
def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('string', type=str, nargs='*',
                        help="""String to process. Optionally with interpolation
                        (explain this here...)""")
    args, opt_args = parser.parse_known_args(argv)
    print(args)
    print(opt_args)
    return 0
if __name__ == '__main__':
    sys.exit(main(sys.argv[1:]))
and calling with
python script.py Hello, my name is {name} --name=chris
yields the following output:
Namespace(string=['Hello,', 'my', 'name', 'is', '{name}'])
['--name=chris']
All that is left to do is to loop through the args namespace looking for strings of the form {...} and replacing them with the corresponding element in opt_args, if present. (I'm not sure if argparse can do argument interpolation automatically, the above example is the only immediate solution which comes to mind).
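That remaining step could look roughly like the sketch below. It assumes every extra argument has the form --name=value and uses str.format for the substitution; the function name interpolate is hypothetical.

```python
import argparse


def interpolate(argv):
    """Join the positional words and fill {name} fields from --name=value extras."""
    parser = argparse.ArgumentParser()
    parser.add_argument("string", nargs="*")
    args, opt_args = parser.parse_known_args(argv)

    values = {}
    for opt in opt_args:
        # only handle the assumed "--name=value" shape; skip anything else
        if opt.startswith("--") and "=" in opt:
            name, value = opt[2:].split("=", 1)
            values[name] = value

    return " ".join(args.string).format(**values)


print(interpolate(["Hello,", "my", "name", "is", "{name}", "--name=chris"]))
```

A placeholder without a matching --name=value argument makes str.format raise KeyError, so a real tool would probably want to catch that and report it.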
For something like this, you really don't need optparse or argparse. The benefits of such libraries (things like lone -v type arguments, checking for invalid options, value validation and so on) are of little use in this circumstance.
def partition_list(lst, sep):
    """Slices a list in two, cutting on index matching "sep"
    >>> partition_list(['a', 'b', 'c'], sep='b')
    (['a'], ['c'])
    """
    if sep in lst:
        idx = lst.index(sep)
        return (lst[:idx], lst[idx+1:])
    else:
        # no separator: return an empty second half so callers can always unpack two values
        return (lst[:], [])
def args_to_dict(args):
    """Crudely parses "--blah=123" type arguments into dict like
    {'blah': '123'}
    """
    ret = {}
    for a in args:
        key, _, value = a.partition("=")
        key = key.replace("--", "", 1)
        ret[key] = value
    return ret
if __name__ == '__main__':
    import sys
    # Get stuff before/after the "%%" separator
    string, args = partition_list(sys.argv[1:], "%%")
    # Join input string
    string_joined = " ".join(string)
    # Parse --args=stuff
    d = args_to_dict(args)
    # Do string-interpolation
    print(string_joined.format(**d))
