In using argparse this is the first time I've come across a 'Namespace' object. What is the most common way to work with these objects? For example, if I have this initialization code:
import argparse
parser = argparse.ArgumentParser(description='Dedupe library.', allow_abbrev=True)
parser.add_argument( '-a', '--all', nargs='+', type=int, help='(Optional) Enter one or more IDs.')
parser.add_argument( '-r', '--reverse', nargs='+', help='(Optional) Enter one or more IDs.')
It seems like the library adds a property on every --long option (if it exists, otherwise the short -s option), so something like the following works:
# test.py
p = parser.parse_args()
print (p.all, p.reverse)
# -------------------------------------
$ python test.py -a 2 3 -r asdf
# [2, 3] ['asdf']
Is this the most common way to work with the argparse output, or how is this usually done?
Every argument performs some kind of action, specified by the action argument to add_argument. The default is a store action.
Each store action saves one (or more) values to an attribute in the resulting namespace. You can specify which attribute with the dest argument to add_argument, but more commonly the name is inferred from the first long option name (or the first short name, if there are no long names).
Note that you can have multiple options that affect the same attribute. A common use is to have multiple store_const actions that save a different hard-coded value to a single attribute.
p.add_argument("--high", action='store_const', dest='level', const='high')
p.add_argument("--med", action='store_const', dest='level', const='medium')
p.add_argument("--low", action='store_const', dest='level', const='low')
You could consider this as providing a series of aliases for an option that takes an explicit argument to specify a level:
p.add_argument("--level", choices=['high', 'medium', low'])
where --high has the same effect as --level high.
>>> p.parse_args(["--level", "high"]).level
'high'
>>> p.parse_args(["--high"]).level
'high'
Related
I have a set of parameters for training and a set of parameters for tuning. They share the same name but different default values. I'd like to use argparse to define which group of default values to use and also parse the values.
I have learned it is possible by using add_subparsers to set subparser for each mode. However, their names are identical which means I'll have to set the same parameters twice (which is very long).
I also tried to include two parsers, the first one parse a few args to determine which group of default values to use, and then use parser.set_defaults(**defaults) to set the default values for the second parser, like this:
train_defaults = dict(
optimizer='AdamW',
lr=1e-3,
strategy='linear',
warmup_steps=5_000,
weight_decay=0.3
)
tune_defaults = dict(
optimizer='SGD',
lr=1e-2,
strategy='cosine',
warmup_steps=500,
weight_decay=0.0
)
selector = argparse.ArgumentParser(description='Mode Selector')
mode = selector.add_mutually_exclusive_group()
mode.add_argument('-t', '--train', action='store_true', help='train model')
mode.add_argument('-u', '--tune', action='store_true', help='tune model')
select, unknown = selector.parse_known_args()
defaults = tune_defaults if select.tune else select.train
parser.set_defaults(**defaults)
args, unknown = parser.parse_known_args()
But two parsers will conflict on some args, for example, -td refers to the --train_data in parser, but it will also be parsed by selector which will raise an Exception:
usage: run.py [-h] [-pt | -pa] [-t] [-u] [-v]
run.py: error: argument -t/--train: ignored explicit argument 'd'
(This is a MWE, the actual args could be vary.
The multiple parsers solution, as you are finding, can be error-prone. I see two alternatives:
Use environment variables
Something like this:
import os
do_tuning = os.getenv("DO_THE_TUNING_MODE", None) is not None
...
defaults = tune_defaults if do_tuning else select.train
parser = argparse.ArgumentParser()
...
parser.set_defaults(**defaults)
args, unknown = parser.parse_known_args()
Use like
DO_THE_TUNING_MODE=1 run.py <options>
or
export DO_THE_TUNING_MODE=1
run.py <options>
(or of course, don't set for training mode)
Pros:
Tuning/selection method is outside the parser so you don't get conflicts
A user can set a "state" in their shell session to tuning or training and not have to continuously set the option when running
Cons:
An environment variable is less straightforward to set than calling a command-line option for one-time use
It is easy to forget what your environment variable is set to
Use subparsers
This is probably the best solution. You indicated that you did not want to do that because you have so many options, but that's what functions are for.
def add_parsing_options(parser):
# All your 40 options go here
parser.add_argument(...)
parser.add_argument(...)
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
tuning_parser = subparser.add_parser("tune")
training_parser = subparser.add_parser("train")
add_parsing_options(tuning_parser)
add_parsing_options(training_parser)
tuning_parser.set_defaults(**tune_defaults)
training_parser.set_defaults(**train_defaults)
args, unknown = parser.parse_known_args()
Call like
run.py train <options>
or
run.py tune <options>
Pros:
It is explicit when using the tool which mode is being used
Cons:
It is an extra parameter to type every time the tool is used
I partially resolved the question by some hard code, i.e.
Since the first parser is only used to set the default parameters of the second parser, there is only a few arguments, in my case, 2.
So what I did is to split the sys.argv to two parts:
import sys
select, unknown = selector.parse_known_args(sys.argv[:3])
args, unknown = parser.parse_known_args(sys.argv[3:])
Pros:
have most if not every pros of other methods
no extra parameter to type every time
Cons:
the value 3 is a hyper parameter
I am attempting to get my script working, but argparse keeps overwriting my positional arguments from the parent parser. How can I get argparse to honor the parent's value for these? It does keep values from optional args.
Here is a very simplified version of what I need. If you run this, you will see that the args are overwritten.
testargs.py
#! /usr/bin/env python3
import argparse
import sys
def main():
preparser = argparse.ArgumentParser(add_help=False)
preparser.add_argument('first',
nargs='?')
preparser.add_argument('outfile',
nargs='?',
type=argparse.FileType('w', encoding='utf-8'),
default=sys.stdout,
help='Output file')
preparser.add_argument(
'--do-something','-d',
action='store_true')
# Parse args with preparser, and find config file
args, remaining_argv = preparser.parse_known_args()
print(args)
parser = argparse.ArgumentParser(
parents=[preparser],
description=__doc__)
parser.add_argument(
'--clear-screen', '-c',
action='store_true')
args = parser.parse_args(args=remaining_argv,namespace=args )
print(args)
if __name__ == '__main__':
main()
And call it with testargs.py something /tmp/test.txt -d -c
You will see it keeps the -d but drops both the positional args and reverts them to defaults.
EDIT: see additional comments in the accepted answer for some caveats.
When you specify parents=[preparser] it means that parser is an extension of preparser, and will parse all arguments relevent to preparser which it is never given.
Lets say the preparser only has one positional argument first and the parser only has one positional argument second, when you make parser a child of preparser it expects both arguments:
import argparse
parser1 = argparse.ArgumentParser(add_help=False)
parser1.add_argument("first")
parser2 = argparse.ArgumentParser(parents=[parser1])
parser2.add_argument("second")
args2 = parser2.parse_args(["arg1","arg2"])
assert args2.first == "arg1" and args2.second == "arg2"
However passing only the remaining arguments that are left over from parser1 would just be ['second'] which is not the correct arguments to parser2:
parser1 = argparse.ArgumentParser(add_help=False)
parser1.add_argument("first")
args1, remaining_args = parser1.parse_known_args(["arg1","arg2"])
parser2 = argparse.ArgumentParser(parents=[parser1])
parser2.add_argument("second")
>>> args1
Namespace(first='arg1')
>>> remaining_args
['arg2']
>>> parser2.parse_args(remaining_args)
usage: test.py [-h] first second
test.py: error: the following arguments are required: second
To only process the arguments that were not handled by the first pass, do not specify it as the parent to the second parser:
parser1 = argparse.ArgumentParser(add_help=False)
parser1.add_argument("first")
args1, remaining_args = parser1.parse_known_args(["arg1","arg2"])
parser2 = argparse.ArgumentParser() #parents=[parser1]) #NO PARENT!
parser2.add_argument("second")
args2 = parser2.parse_args(remaining_args,args1)
assert args2.first == "arg1" and args2.second == "arg2"
The 2 positionals are nargs='?'. A positional like that is always 'seen', since an empty list matches that nargs.
First time through 'text.txt' matches with first and is put in the Namespace. Second time through there isn't any string to match, so the default is used - same as if you had not given that string the first time.
If I change first to have the default nargs, I get
error: the following arguments are required: first
from the 2nd parser. Even though there's a value in the Namespace it still tries to get a value from the argv. (it's like a default, but not quite).
Defaults for positionals with nargs='?' (or *) are tricky. They are optional, but not in quite the same way as optionals. The positional Actions are still called, but with a empty list of values.
I don't think the parents feature does anything for you. preparser already handles that set of arguments; there's no need to handle them again in parser, especially since all the relevant argument strings have been stripped out.
Another option is to leave the parents in, but use the default sys.argv[1:] in the 2nd parser. (but beware of side effects like opening files)
args = parser.parse_args(namespace=args )
A third option is to parse the arguments independently and merge them with a dictionary update.
adict = vars(preparse_args)
adict.update(vars(parser_args))
# taking some care in who overrides who
For more details look in argparse.py file at ArgumentParser._get_values, specifically the not arg_strings cases.
A note about the FileType. That type works nicely for small scripts where you will use the files right away and exit. It isn't so good on large programs where you might want to close the file after use (close stdout???), or use files in a with context.
edit - note on parents
add_argument creates an Action object, and adds it to the parser's list of actions. parse_args basically matches input strings with these actions.
parents just copies those Action objects (by reference) from parent to child. To the child parser it is just as though the actions were created with add_argument directly.
parents is most useful when you are importing a parser and don't have direct access to its definition. If you are defining both parent and child, then parents just saves you some typing/cut-n-paste.
This and other SO questions (mostly triggered the by-reference copy) show that the developers did not intend you to use both the parent and child to do parsing. It can be done, but there are glitches that the they did not consider.
===================
I can imagine defining a custom Action class that would 'behave' in a situation like this. It might, for example, check the namespace for some not default value before adding its own (possibly default) value.
Consider, for example if I changed the action of first to 'append':
preparser.add_argument('first', action='append', nargs='?')
The result is:
1840:~/mypy$ python3 stack37147683.py /tmp/test.txt -d -c
Namespace(do_something=True, first=['/tmp/test.txt'], outfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)
Namespace(clear_screen=True, do_something=True, first=['/tmp/test.txt', None], outfile=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)
From the first parser, first=['/tmp/test.txt']; from the second, first=['/tmp/test.txt', None].
Because of the append, the item from the first is preserved, and a new default has been added by the second parser.
I'm using argparse with optional parameter, but I want to avoid having something like this : script.py -a 1 -b -a 2
Here we have twice the optional parameter 'a', and only the second parameter is returned. I want either to get both values or get an error message.
How should I define the argument ?
[Edit]
This is the code:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('-a', dest='alpha', action='store', nargs='?')
parser.add_argument('-b', dest='beta', action='store', nargs='?')
params, undefParams = self.parser.parse_known_args()
append action will collect the values from repeated use in a list
parser.add_argument('-a', '--alpha', action='append')
producing an args namespace like:
namespace(alpha=['1','3'], b='4')
After parsing you can check args.alpha, and accept or complain about the number of values. parser.error('repeated -a') can be used to issue an argparse style error message.
You could implement similar functionality in a custom Action class, but that requires understanding the basic structure and operation of such a class. I can't think anything that can be done in an Action that can't just as well be done in the appended list after.
https://stackoverflow.com/a/23032953/901925 is an answer with a no-repeats custom Action.
Why are you using nargs='?' with flagged arguments like this? Without a const parameter this is nearly useless (see the nargs=? section in the docs).
Another similar SO: Python argparse with nargs behaviour incorrect
I need to load the argument even if there is a option and if not a option.
#!/usr/bin/python
import optparse
parser = optparse.OptionParser()
parser.add_option('-i', dest='name', help='some')
parser.add_option('-c', dest='name', help='some')
parser.add_option('-p', action='store', help='password')
print parser.parse_args()
[root#server tmp]# ./test -i abc
(<Values at 0x4011368: {'p': None, 'name': 'abc'}>, [])
[root#server tmp]# ./test abc
(<Values at 0x5855368: {'p': None, 'name': None}>, ['abc'])
Now I need have the value "abc" even if I am not using any option. So please let know how can I access that value.
Based solely on your output, you should be able to see that parse_args returns a tuple. The first element of that tuple is an object containing values for defined options and the second element is a list of arguments leftover after parsing options. You can read more about it in the official tutorial.
Having this in mind, you can simply write
options, arguments = parser.parse_args()
and use arguments to do whatever you want with that list.
However, your problem seems to be that when you supply an option, argument is parsed as an option's value. This is caused by your way of defining options because options -i and -c need values.
If you want those options to be boolean, you need to define that manually. Example for one option code would be something like
# This defines an option which set name to True if option is provided, otherwise
# name is set to False
parser.add_option('-i', dest='name', help='some', action="store_true", default=False)
This would also mean that you don't need to provide value for that option, so argument won't be consumed when the parser reads options.
If you want your options to be non-boolean, but don't want to proved values for them, then I'm not sure I get what you're trying to do.
See this maybe :
15.5.2.3. Handling boolean (flag) options
Here is my argparse sample say sample.py
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-p", nargs="+", help="Stuff")
args = parser.parse_args()
print args
Python - 2.7.3
I expect that the user supplies a list of arguments separated by spaces after the -p option. For example, if you run
$ sample.py -p x y
Namespace(p=['x', 'y'])
But my problem is that when you run
$ sample.py -p x -p y
Namespace(p=['y'])
Which is neither here nor there. I would like one of the following
Throw an exception to the user asking him to not use -p twice instead just supply them as one argument
Just assume it is the same option and produce a list of ['x','y'].
I can see that python 2.7 is doing neither of them which confuses me. Can I get python to do one of the two behaviours documented above?
Note: python 3.8 adds an action="extend" which will create the desired list of ['x','y']
To produce a list of ['x','y'] use action='append'. Actually it gives
Namespace(p=[['x'], ['y']])
For each -p it gives a list ['x'] as dictated by nargs='+', but append means, add that value to what the Namespace already has. The default action just sets the value, e.g. NS['p']=['x']. I'd suggest reviewing the action paragraph in the docs.
optionals allow repeated use by design. It enables actions like append and count. Usually users don't expect to use them repeatedly, or are happy with the last value. positionals (without the -flag) cannot be repeated (except as allowed by nargs).
How to add optional or once arguments? has some suggestions on how to create a 'no repeats' argument. One is to create a custom action class.
I ran into the same issue. I decided to go with the custom action route as suggested by mgilson.
import argparse
class ExtendAction(argparse.Action):
def __call__(self, parser, namespace, values, option_string=None):
if getattr(namespace, self.dest, None) is None:
setattr(namespace, self.dest, [])
getattr(namespace, self.dest).extend(values)
parser = argparse.ArgumentParser()
parser.add_argument("-p", nargs="+", help="Stuff", action=ExtendAction)
args = parser.parse_args()
print args
This results in
$ ./sample.py -p x -p y -p z w
Namespace(p=['x', 'y', 'z', 'w'])
Still, it would have been much neater if there was an action='extend' option in the library by default.