Parsing args with different set of default values - python

I have a set of parameters for training and a set of parameters for tuning. They share the same name but different default values. I'd like to use argparse to define which group of default values to use and also parse the values.
I have learned it is possible by using add_subparsers to set subparser for each mode. However, their names are identical which means I'll have to set the same parameters twice (which is very long).
I also tried to include two parsers, the first one parse a few args to determine which group of default values to use, and then use parser.set_defaults(**defaults) to set the default values for the second parser, like this:
train_defaults = dict(
optimizer='AdamW',
lr=1e-3,
strategy='linear',
warmup_steps=5_000,
weight_decay=0.3
)
tune_defaults = dict(
optimizer='SGD',
lr=1e-2,
strategy='cosine',
warmup_steps=500,
weight_decay=0.0
)
selector = argparse.ArgumentParser(description='Mode Selector')
mode = selector.add_mutually_exclusive_group()
mode.add_argument('-t', '--train', action='store_true', help='train model')
mode.add_argument('-u', '--tune', action='store_true', help='tune model')
select, unknown = selector.parse_known_args()
defaults = tune_defaults if select.tune else select.train
parser.set_defaults(**defaults)
args, unknown = parser.parse_known_args()
But two parsers will conflict on some args, for example, -td refers to the --train_data in parser, but it will also be parsed by selector which will raise an Exception:
usage: run.py [-h] [-pt | -pa] [-t] [-u] [-v]
run.py: error: argument -t/--train: ignored explicit argument 'd'
(This is a MWE, the actual args could be vary.

The multiple parsers solution, as you are finding, can be error-prone. I see two alternatives:
Use environment variables
Something like this:
import os
do_tuning = os.getenv("DO_THE_TUNING_MODE", None) is not None
...
defaults = tune_defaults if do_tuning else select.train
parser = argparse.ArgumentParser()
...
parser.set_defaults(**defaults)
args, unknown = parser.parse_known_args()
Use like
DO_THE_TUNING_MODE=1 run.py <options>
or
export DO_THE_TUNING_MODE=1
run.py <options>
(or of course, don't set for training mode)
Pros:
Tuning/selection method is outside the parser so you don't get conflicts
A user can set a "state" in their shell session to tuning or training and not have to continuously set the option when running
Cons:
An environment variable is less straightforward to set than calling a command-line option for one-time use
It is easy to forget what your environment variable is set to
Use subparsers
This is probably the best solution. You indicated that you did not want to do that because you have so many options, but that's what functions are for.
def add_parsing_options(parser):
# All your 40 options go here
parser.add_argument(...)
parser.add_argument(...)
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
tuning_parser = subparser.add_parser("tune")
training_parser = subparser.add_parser("train")
add_parsing_options(tuning_parser)
add_parsing_options(training_parser)
tuning_parser.set_defaults(**tune_defaults)
training_parser.set_defaults(**train_defaults)
args, unknown = parser.parse_known_args()
Call like
run.py train <options>
or
run.py tune <options>
Pros:
It is explicit when using the tool which mode is being used
Cons:
It is an extra parameter to type every time the tool is used

I partially resolved the question by some hard code, i.e.
Since the first parser is only used to set the default parameters of the second parser, there is only a few arguments, in my case, 2.
So what I did is to split the sys.argv to two parts:
import sys
select, unknown = selector.parse_known_args(sys.argv[:3])
args, unknown = parser.parse_known_args(sys.argv[3:])
Pros:
have most if not every pros of other methods
no extra parameter to type every time
Cons:
the value 3 is a hyper parameter

Related

Is it possible to validate `argparse` default argument values?

Is it possible to tell argparse to give the same errors on default argument values as it would on user-specified argument values?
For example, the following will not result in any error:
parser = argparse.ArgumentParser()
parser.add_argument('--choice', choices=['a', 'b', 'c'], default='invalid')
args = vars(parser.parse_args()) # args = {'choice': 'invalid'}
whereas omitting the default, and having the user specify --choice=invalid on the command-line will result in an error (as expected).
Reason for asking is that I would like to have the user to be able to specify default command-line options in a JSON file which are then set using ArgumentParser.set_defaults(), but unfortunately the behaviour demonstrated above prevents these user-specified defaults from being validated.
Update: argparse is inconsistent and I now consider the behavior above to be a bug. The following does trigger an error:
parser = argparse.ArgumentParser()
parser.add_argument('--num', type=int, default='foo')
args = parser.parse_args() # triggers exception in case --num is not
# specified on the command-line
I have opened a bug report for this: https://github.com/python/cpython/issues/100949
I took the time to dig into the source code, and what is happening is that a check is only happening for arguments you gave on the command line. The only way to enforce a check, in my opinion, is to subclass ArgumentParser and have it do the check when you add the argument:
class ValidatingArgumentParser(argparse.ArgumentParser):
def add_argument(self, *args, **kwargs):
super().add_argument(*args, **kwargs)
self._check_value(self._actions[-1],kwargs['default'])
No. Explicit arguments need to be validated because they originate from outside the source code. Default values originate in the source code, so it's the job of the programmer, not the argument parser, to ensure they are valid.
(This is the difference between validation and debugging.)
(Using set_defaults on unvalidated user input still falls under the purview of debugging, as it's not the argument parser itself adding the default values, but the programmer.)

Disable argparse arguments from being overwritten

I have a legacy Python application which uses some options in its CLI, using argparse, like:
parser = argparse.ArgumentParser()
parser.add_argument('-f', default='foo')
Now I need to remove this option, since now its value cannot be overwritten by users but it has to assume its default value (say 'foo' in the example). Is there a way to keep the option but prevent it to show up and be overwritten by users (so that I can keep the rest of the code as it is)?
It's not entirely clear what you can do, not with the parser. Can you edit the setup? Or just modify results of parsing?
If you can edit the setup, you could replace the add_argument line with a
parser.setdefaults(f='foo')
https://docs.python.org/3/library/argparse.html#parser-defaults
The -f won't appear in the usage or help, but it will appear in the args
Or you could leave it in, but suppress the help display
parser.add_argument('-f', default='foo', help=argparse.SUPPRESS)
https://docs.python.org/3/library/argparse.html#help
Setting the value after parsing is also fine.
Yes you can do that. After the parser is parsed (args = parser.parse_args()) it is a NameSpace so you can do this:
parser = argparse.ArgumentParser()
args = parser.parse_args()
args.foo = 'foo value'
print(args)
>>> Namespace(OTHER_OPTIONS, foo='foo value')
I assumed that you wanted to add test to your parser, so your original code will still work, but you do not want it as an option for the user.
I think it doesn't make sense for the argparse module to provide this as a standard option, but there are several easy ways to achieve what you want.
The most obvious way is to just overwrite the value after having called parse_args() (as already mentioned in comments and in another answer):
args.f = 'foo'
However, the user may not be aware that the option is not supported anymore and that the application is now assuming the value "foo". Depending on the use case, it might be better to warn the user about this. The argparse module has several options to do this.
Another possibility is to use an Action class to do a little magic. For example, you could print a warning if the user provided an option that is not supported anymore, or even use the built-in error handling.
import argparse
class FooAction(argparse.Action):
def __call__(self, parser, namespace, values, option_string=None):
if values != 'foo':
print('Warning: option `-f` has been removed, assuming `-f foo` now')
# Or use the built-in error handling like this:
# parser.error('Option "-f" is not supported anymore.')
# You could override the argument value like this:
# setattr(namespace, self.dest, 'foo')
parser = argparse.ArgumentParser()
parser.add_argument('-f', default='foo', action=FooAction)
args = parser.parse_args()
print('option=%s' % args.f)
You could also just limit the choices to only "foo" and let argparse create an error for other values:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-f', default='foo', choices=['foo'])
args = parser.parse_args()
print('option=%s' % args.f)
Calling python test.py -f bar would then result in:
usage: test.py [-h] [-f {foo}]
test.py: error: argument -f: invalid choice: 'bar' (choose from 'foo')

Most common way to work with argparse parsed values

In using argparse this is the first time I've come across a 'Namespace' object. What is the most common way to work with these objects? For example, if I have this initialization code:
import argparse
parser = argparse.ArgumentParser(description='Dedupe library.', allow_abbrev=True)
parser.add_argument( '-a', '--all', nargs='+', type=int, help='(Optional) Enter one or more IDs.')
parser.add_argument( '-r', '--reverse', nargs='+', help='(Optional) Enter one or more IDs.')
It seems like the library adds a property on every --long option (if it exists, otherwise the short -s option), so something like the following works:
# test.py
p = parser.parse_args()
print (p.all, p.reverse)
# -------------------------------------
$ python test.py -a 2 3 -r asdf
# [2, 3] ['asdf']
Is this the most common way to work with the argparse output, or how is this usually done?
Every argument performs some kind of action, specified by the action argument to add_argument. The default is a store action.
Each store action saves one (or more) values to an attribute in the resulting namespace. You can specify which attribute with the dest argument to add_argument, but more commonly the name is inferred from the first long option name (or the first short name, if there are no long names).
Note that you can have multiple options that affect the same attribute. A common use is to have multiple store_const actions that save a different hard-coded value to a single attribute.
p.add_argument("--high", action='store_const', dest='level', const='high')
p.add_argument("--med", action='store_const', dest='level', const='medium')
p.add_argument("--low", action='store_const', dest='level', const='low')
You could consider this as providing a series of aliases for an option that takes an explicit argument to specify a level:
p.add_argument("--level", choices=['high', 'medium', low'])
where --high has the same effect as --level high.
>>> p.parse_args(["--level", "high"]).level
'high'
>>> p.parse_args(["--high"]).level
'high'

How to make conditional arguments using argparse? [duplicate]

I have done as much research as possible but I haven't found the best way to make certain cmdline arguments necessary only under certain conditions, in this case only if other arguments have been given. Here's what I want to do at a very basic level:
p = argparse.ArgumentParser(description='...')
p.add_argument('--argument', required=False)
p.add_argument('-a', required=False) # only required if --argument is given
p.add_argument('-b', required=False) # only required if --argument is given
From what I have seen, other people seem to just add their own check at the end:
if args.argument and (args.a is None or args.b is None):
# raise argparse error here
Is there a way to do this natively within the argparse package?
I've been searching for a simple answer to this kind of question for some time. All you need to do is check if '--argument' is in sys.argv, so basically for your code sample you could just do:
import argparse
import sys
if __name__ == '__main__':
p = argparse.ArgumentParser(description='...')
p.add_argument('--argument', required=False)
p.add_argument('-a', required='--argument' in sys.argv) #only required if --argument is given
p.add_argument('-b', required='--argument' in sys.argv) #only required if --argument is given
args = p.parse_args()
This way required receives either True or False depending on whether the user as used --argument. Already tested it, seems to work and guarantees that -a and -b have an independent behavior between each other.
You can implement a check by providing a custom action for --argument, which will take an additional keyword argument to specify which other action(s) should become required if --argument is used.
import argparse
class CondAction(argparse.Action):
def __init__(self, option_strings, dest, nargs=None, **kwargs):
x = kwargs.pop('to_be_required', [])
super(CondAction, self).__init__(option_strings, dest, **kwargs)
self.make_required = x
def __call__(self, parser, namespace, values, option_string=None):
for x in self.make_required:
x.required = True
try:
return super(CondAction, self).__call__(parser, namespace, values, option_string)
except NotImplementedError:
pass
p = argparse.ArgumentParser()
x = p.add_argument("--a")
p.add_argument("--argument", action=CondAction, to_be_required=[x])
The exact definition of CondAction will depend on what, exactly, --argument should do. But, for example, if --argument is a regular, take-one-argument-and-save-it type of action, then just inheriting from argparse._StoreAction should be sufficient.
In the example parser, we save a reference to the --a option inside the --argument option, and when --argument is seen on the command line, it sets the required flag on --a to True. Once all the options are processed, argparse verifies that any option marked as required has been set.
Your post parsing test is fine, especially if testing for defaults with is None suits your needs.
http://bugs.python.org/issue11588 'Add "necessarily inclusive" groups to argparse' looks into implementing tests like this using the groups mechanism (a generalization of mutuall_exclusive_groups).
I've written a set of UsageGroups that implement tests like xor (mutually exclusive), and, or, and not. I thought those where comprehensive, but I haven't been able to express your case in terms of those operations. (looks like I need nand - not and, see below)
This script uses a custom Test class, that essentially implements your post-parsing test. seen_actions is a list of Actions that the parse has seen.
class Test(argparse.UsageGroup):
def _add_test(self):
self.usage = '(if --argument then -a and -b are required)'
def testfn(parser, seen_actions, *vargs, **kwargs):
"custom error"
actions = self._group_actions
if actions[0] in seen_actions:
if actions[1] not in seen_actions or actions[2] not in seen_actions:
msg = '%s - 2nd and 3rd required with 1st'
self.raise_error(parser, msg)
return True
self.testfn = testfn
self.dest = 'Test'
p = argparse.ArgumentParser(formatter_class=argparse.UsageGroupHelpFormatter)
g1 = p.add_usage_group(kind=Test)
g1.add_argument('--argument')
g1.add_argument('-a')
g1.add_argument('-b')
print(p.parse_args())
Sample output is:
1646:~/mypy/argdev/usage_groups$ python3 issue25626109.py --arg=1 -a1
usage: issue25626109.py [-h] [--argument ARGUMENT] [-a A] [-b B]
(if --argument then -a and -b are required)
issue25626109.py: error: group Test: argument, a, b - 2nd and 3rd required with 1st
usage and error messages still need work. And it doesn't do anything that post-parsing test can't.
Your test raises an error if (argument & (!a or !b)). Conversely, what is allowed is !(argument & (!a or !b)) = !(argument & !(a and b)). By adding a nand test to my UsageGroup classes, I can implement your case as:
p = argparse.ArgumentParser(formatter_class=argparse.UsageGroupHelpFormatter)
g1 = p.add_usage_group(kind='nand', dest='nand1')
arg = g1.add_argument('--arg', metavar='C')
g11 = g1.add_usage_group(kind='nand', dest='nand2')
g11.add_argument('-a')
g11.add_argument('-b')
The usage is (using !() to mark a 'nand' test):
usage: issue25626109.py [-h] !(--arg C & !(-a A & -b B))
I think this is the shortest and clearest way of expressing this problem using general purpose usage groups.
In my tests, inputs that parse successfully are:
''
'-a1'
'-a1 -b2'
'--arg=3 -a1 -b2'
Ones that are supposed to raise errors are:
'--arg=3'
'--arg=3 -a1'
'--arg=3 -b2'
For arguments I've come up with a quick-n-dirty solution like this.
Assumptions: (1) '--help' should display help and not complain about required argument and (2) we're parsing sys.argv
p = argparse.ArgumentParser(...)
p.add_argument('-required', ..., required = '--help' not in sys.argv )
This can easily be modified to match a specific setting.
For required positionals (which will become unrequired if e.g. '--help' is given on the command line) I've come up with the following: [positionals do not allow for a required=... keyword arg!]
p.add_argument('pattern', ..., narg = '+' if '--help' not in sys.argv else '*' )
basically this turns the number of required occurrences of 'pattern' on the command line from one-or-more into zero-or-more in case '--help' is specified.
Here is a simple and clean solution with these advantages:
No ambiguity and loss of functionality caused by oversimplified parsing using the in sys.argv test.
No need to implement a special argparse.Action or argparse.UsageGroup class.
Simple usage even for multiple and complex deciding arguments.
I noticed just one considerable drawback (which some may find desirable): The help text changes according to the state of the deciding arguments.
The idea is to use argparse twice:
Parse the deciding arguments instead of the oversimplified use of the in sys.argv test. For this we use a short parser not showing help and the method .parse_known_args() which ignores unknown arguments.
Parse everything normally while reusing the parser from the first step as a parent and having the results from the first parser available.
import argparse
# First parse the deciding arguments.
deciding_args_parser = argparse.ArgumentParser(add_help=False)
deciding_args_parser.add_argument(
'--argument', required=False, action='store_true')
deciding_args, _ = deciding_args_parser.parse_known_args()
# Create the main parser with the knowledge of the deciding arguments.
parser = argparse.ArgumentParser(
description='...', parents=[deciding_args_parser])
parser.add_argument('-a', required=deciding_args.argument)
parser.add_argument('-b', required=deciding_args.argument)
arguments = parser.parse_args()
print(arguments)
Until http://bugs.python.org/issue11588 is solved, I'd just use nargs:
p = argparse.ArgumentParser(description='...')
p.add_argument('--arguments', required=False, nargs=2, metavar=('A', 'B'))
This way, if anybody supplies --arguments, it will have 2 values.
Maybe its CLI result is less readable, but code is much smaller. You can fix that with good docs/help.
This is really the same as #Mira 's answer but I wanted to show it for the case where when an option is given that an extra arg is required:
For instance, if --option foo is given then some args are also required that are not required if --option bar is given:
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument('--option', required=True,
help='foo and bar need different args')
if 'foo' in sys.argv:
parser.add_argument('--foo_opt1', required=True,
help='--option foo requires "--foo_opt1"')
parser.add_argument('--foo_opt2', required=True,
help='--option foo requires "--foo_opt2"')
...
if 'bar' in sys.argv:
parser.add_argument('--bar_opt', required=True,
help='--option bar requires "--bar_opt"')
...
It's not perfect - for instance proggy --option foo --foo_opt1 bar is ambiguous but for what I needed to do its ok.
Add additional simple "pre"parser to check --argument, but use parse_known_args() .
pre = argparse.ArgumentParser()
pre.add_argument('--argument', required=False, action='store_true', default=False)
args_pre=pre.parse_known_args()
p = argparse.ArgumentParser()
p.add_argument('--argument', required=False)
p.add_argument('-a', required=args_pre.argument)
p.add_argument('-b', required=not args_pre.argument)

Python argparse : how to detect duplicated optional argument?

I'm using argparse with optional parameter, but I want to avoid having something like this : script.py -a 1 -b -a 2
Here we have twice the optional parameter 'a', and only the second parameter is returned. I want either to get both values or get an error message.
How should I define the argument ?
[Edit]
This is the code:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-a', dest='alpha', action='store', nargs='?')
parser.add_argument('-b', dest='beta', action='store', nargs='?')
params, undefParams = self.parser.parse_known_args()
append action will collect the values from repeated use in a list
parser.add_argument('-a', '--alpha', action='append')
producing an args namespace like:
namespace(alpha=['1','3'], b='4')
After parsing you can check args.alpha, and accept or complain about the number of values. parser.error('repeated -a') can be used to issue an argparse style error message.
You could implement similar functionality in a custom Action class, but that requires understanding the basic structure and operation of such a class. I can't think anything that can be done in an Action that can't just as well be done in the appended list after.
https://stackoverflow.com/a/23032953/901925 is an answer with a no-repeats custom Action.
Why are you using nargs='?' with flagged arguments like this? Without a const parameter this is nearly useless (see the nargs=? section in the docs).
Another similar SO: Python argparse with nargs behaviour incorrect

Categories