The standard way to pass a configuration to Hydra is to decorate a function with @hydra.main, define it as main(cfg), and then call main with no arguments in the main block. But suppose I also want to accept non-Hydra command line args, so I need to pass both my own args and the config. Is this viewed as morally impure and discouraged? Here is an example (which does not work) of the sort of thing I mean; it errors out, claiming that the cfg argument is missing.
from omegaconf import DictConfig, OmegaConf
import hydra
@hydra.main(version_base=None)
def my_app(foo, cfg: DictConfig) -> None:
    print(foo)
    print(OmegaConf.to_yaml(cfg))
if __name__ == "__main__":
    my_app(5)
This is not supported.
In principle, before the config is composed there is nothing on which to base different values for a parameter like your 5.
If you want to run additional logic before composing the config, look at the Compose API as an alternative to @hydra.main() (or in addition to it).
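As a rough sketch of that idea (not something from the Hydra docs verbatim): you could parse your own flags with argparse first and treat whatever is left over as Hydra-style overrides. The --foo flag, the split_cli helper, and the conf/config.yaml layout mentioned in the comments are all hypothetical names chosen for illustration.

```python
import argparse


def split_cli(argv):
    """Separate our own flags from the leftover, Hydra-style overrides."""
    parser = argparse.ArgumentParser()
    # Hypothetical non-Hydra option, standing in for the `foo` parameter.
    parser.add_argument("--foo", type=int, default=5)
    # parse_known_args returns (namespace, list_of_unrecognised_strings).
    own_args, overrides = parser.parse_known_args(argv)
    return own_args, overrides


own_args, overrides = split_cli(["--foo", "7", "db.user=alice", "seed=1"])
print(own_args.foo)
print(overrides)

# The leftover strings could then be handed to the Compose API, for example
# (assuming a hypothetical conf/config.yaml next to the script):
#   from hydra import compose, initialize
#   with initialize(version_base=None, config_path="conf"):
#       cfg = compose(config_name="config", overrides=overrides)
```

With this split, any logic that depends on foo runs before the config exists, which is the case the decorator cannot handle.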
Related
For a project involving multiple scripts (data processing, model tuning, model training, testing, etc.) I keep all my argparse.ArgumentParser objects as the return values of per-task functions in a module I name cli.py. However, in the calling script itself (__main__), after calling args = parser.parse_args(), typing args. in VS Code suggests no attributes that are specific to that object. How can I get suggested attributes of argparse.Namespace objects in VS Code?
E.g.,
"""Module for command line interface for different scripts."""
import argparse
def some_task_parser(description: str) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument('some_arg')
    return parser
then
"""Script for executing some task."""
from cli import some_task_parser
if __name__ == '__main__':
    parser = some_task_parser('arbitrary task')
    args = parser.parse_args()
    print(args.) # I WANT SUGGESTED ATTRIBUTES (SUCH AS args.some_arg) TO POP UP!!
I don't know about VS Code, but in an IPython session tab completion does show the available attributes of a Namespace. argparse.Namespace is a relatively simple class. What you want, I think, are just the attributes of this object.
In [238]: args = argparse.Namespace(test='one', foo='three', bar=3)
In [239]: args
Out[239]: Namespace(bar=3, foo='three', test='one')
In [240]: args.
bar test
foo args.txt
But those attributes are not available during development. They are set by the parsing action at run time, and they cannot be reliably deduced from the parser setup. argparse certainly does not provide that kind of help.
I often recommend that developers include a
print(args)
during the debugging phase, so they get a clear idea of what the parser does.
Use args.__dict__ (or, equivalently, vars(args)) to get the names and values. If you need just the argument names, list(vars(args)) gives them as a list.
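One workaround that is not part of the answers above, offered only as a sketch: declare the expected attributes on a Namespace subclass and pass an instance through the namespace= parameter of parse_args. Editors that read type annotations can then suggest args.some_arg. The SomeTaskArgs class and parse_some_task_args helper are hypothetical names, and the annotations have to be kept in sync with the parser by hand.

```python
import argparse


class SomeTaskArgs(argparse.Namespace):
    """Hypothetical typed namespace mirroring the options of some_task_parser."""
    some_arg: str


def some_task_parser(description: str) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description=description)
    parser.add_argument('some_arg')
    return parser


def parse_some_task_args(argv=None) -> SomeTaskArgs:
    parser = some_task_parser('arbitrary task')
    # parse_args fills in and returns the namespace object it is given.
    return parser.parse_args(argv, namespace=SomeTaskArgs())


args = parse_some_task_args(['hello'])
print(args.some_arg)       # annotated attribute, visible to static tooling
print(sorted(vars(args)))  # attribute names actually set at run time
```

The trade-off is duplication: argparse still cannot derive the attribute names for you, so a renamed option must be renamed in the class as well.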
In pytest, the standard convention for passing command line args to tests is to add an option using parser.addoption and define a fixture with the same name using @pytest.fixture().
Example:
# In conftest.py:
import pytest
def pytest_addoption(parser):
    parser.addoption("--input1", action="store", default="default input1")
@pytest.fixture
def input1(request):
    return request.config.getoption("--input1")
For this reason, our conftest.py has become pretty huge (think AWS CLI). I was wondering whether, instead of defining each option explicitly in conftest.py, there is a way to parameterize and pass all command line arguments to the tests, kind of like **kwargs.
According to the documentation, pytest uses argparse, which provides a parse_known_args function that returns both known and unknown options. So technically this should be possible, right?
As far as I know, you still have to register each option in pytest_addoption, but you can access all options by using a single fixture:
@pytest.fixture
def options(request):
    yield request.config.option
This will provide the options namespace, and you can access each option using the dot notation:
def test_something(options):
    print(options.input1)
(provided you have added the option input1 as shown in your question)
The name of the option is either the name you provided using the dest argument in addoption or, if dest is not given, derived from the option name itself (by stripping the leading hyphens and converting other characters not allowed in a variable name into underscores).
This behaves very similarly to argparse's parse_known_args, which also returns a namespace with all options. Note that the namespace will additionally contain a number of other options from pytest and pytest plugins.
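For comparison, here is roughly what plain argparse does with parse_known_args, as a small standalone sketch. The --dry-run and --not-registered options are made up for the illustration; only --input1 comes from the question.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--input1", action="store", default="default input1")
# No dest given, so the attribute name is derived from the flag: dry_run.
parser.add_argument("--dry-run", action="store_true")

# Registered options end up on the namespace; everything else is returned
# untouched in a separate list.
known, unknown = parser.parse_known_args(
    ["--input1", "data.csv", "--dry-run", "--not-registered", "x"]
)
print(known.input1, known.dry_run)
print(unknown)
```

The unknown list is the part that pytest's options namespace does not give you, which is why each option still has to be registered in pytest_addoption.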
The below paste contains relevant snippets from three separate Python files. The first is a script called from the command line which instantiates CIPuller given certain arguments. What happens is that the script gets called with something like:
script.py ci (other args to be swallowed by argparse).
The second is part of a subclass called Puller. The third is part of a subclass of Puller called CIPuller.
This works wonderfully, as the correct subclass is called, and any user using the wrong other args gets to see the correct args for their given subclass, plus the generic arguments from the superclass. (Although I was made aware offline that perhaps I should use argparse sub-commands for this.)
I'm stuck trying to write tests for these classes. Currently, I need an ArgumentParser to instantiate the classes, but in testing I'm not instantiating things from the command line, hence my ArgumentParser is useless.
I tried creating an ArgumentParser in the test harness to pass to CIPuller's constructor in the test code, but if I use add_argument there, argparse understandably complains about double (duplicate) arguments when it calls add_argument in the CIPuller constructor.
What would be a suitable design to test these classes with arguments?
#!/usr/bin/env python
from ci_puller import CIPuller
import argparse
import sys
# Using sys.argv[1] for the argument here, as we don't want to pass that onto
# the subclasses, which should receive a vanilla ArgumentParser
puller_type = sys.argv.pop(1)
parser = argparse.ArgumentParser(
    description='Throw data into Elasticsearch.'
)
if puller_type == 'ci':
    puller = CIPuller(parser, 'single')
else:
    raise ValueError("First parameter must be a supported puller. Exiting.")
puller.run()
class Puller(object):
    def __init__(self, parser, insert_type):
        self.add_arguments(parser)
        self.args = parser.parse_args()
        self.insert_type = insert_type
    def add_arguments(self, parser):
        parser.add_argument(
            "-d", "--debug",
            help="print debug info to stdout",
            action="store_true"
        )
        parser.add_argument(
            "--dontsend",
            help="don't actually send anything to Elasticsearch",
            action="store_true"
        )
        parser.add_argument(
            "--host",
            help="override the default host that the data is sent to",
            action='store',
            default='kibana.munged.tld'
        )
class CIPuller(Puller):
    def __init__(self, parser, insert_type):
        self.add_arguments(parser)
        self.index_prefix = "code"
        self.doc_type = "cirun"
        self.build_url = ""
        self.json_url = ""
        self.result = []
        super(CIPuller, self).__init__(parser, insert_type)
    def add_arguments(self, parser):
        parser.add_argument(
            '--buildnumber',
            help='CI build number',
            action='store',
            required=True
        )
        parser.add_argument(
            '--testtype',
            help='Job type per CI e.g. minitest / feature',
            choices=['minitest', 'feature'],
            required=True
        )
        parser.add_argument(
            '--app',
            help='App e.g. sapi / stats',
            choices=['sapi', 'stats'],
            required=True
        )
Unit testing argparse code is tricky. There is a test/test_argparse.py file that is run as part of the overall Python unittest suite, but it has a complicated custom testing harness to handle most cases.
There are three basic issues: 1) calling parse_args with test values, 2) testing the resulting args, 3) testing for errors.
Testing the resulting args is relatively easy, and the argparse.Namespace class has a simple __eq__ method, so you can test one namespace against another.
There are two ways of testing inputs. One is to modify sys.argv. Initially sys.argv holds the strings meant for the test runner.
self.args = parser.parse_args()
parses sys.argv[1:] by default. So if you change sys.argv you can test custom values.
But you can also give parse_args a custom list. The argparse docs use this in most of their examples.
self.args = parser.parse_args(args=myargv)
If myargv is None it uses sys.argv[1:]. Otherwise it uses that custom list.
Testing errors requires either a custom parser.error method (see docs) or wrapping the parse_args call in a try/except block that catches the SystemExit exception raised by sys.exit.
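The three points above might look like this in a unittest file. This is a minimal sketch: build_parser is a hypothetical stand-in that registers two of the options from the question, and it assumes the classes can be refactored so that the parser is built separately from the call to parse_args.

```python
import argparse
import contextlib
import io
import unittest


def build_parser():
    """Hypothetical helper that only builds the parser, without parsing."""
    parser = argparse.ArgumentParser(prog="puller")
    parser.add_argument("-d", "--debug", action="store_true")
    parser.add_argument("--buildnumber", required=True)
    return parser


class ParserTests(unittest.TestCase):
    def test_values(self):
        # 1) custom argument list, 2) compare against an expected Namespace
        args = build_parser().parse_args(["--buildnumber", "42", "-d"])
        self.assertEqual(args, argparse.Namespace(debug=True, buildnumber="42"))

    def test_missing_required_exits(self):
        # 3) argparse reports errors on stderr and then raises SystemExit(2)
        with contextlib.redirect_stderr(io.StringIO()) as err:
            with self.assertRaises(SystemExit) as ctx:
                build_parser().parse_args([])
        self.assertEqual(ctx.exception.code, 2)
        self.assertIn("--buildnumber", err.getvalue())


# Run the two tests programmatically so the snippet works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParserTests)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.wasSuccessful())
```

Redirecting stderr also keeps the usage message out of the test output.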
How do you write tests for the argparse portion of a python module?
python unittest for argparse
Argparse unit tests: Suppress the help message
Unittest with command-line arguments
Using unittest to test argparse - exit errors
I have a main function specified as entry point in my package's setup.py which uses the argparse package in order to pass command line arguments (see discussion here):
# file with main routine specified as entry point in setup.py
import argparse
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('a', type=str, help='mandatory argument a')
    args = parser.parse_args()
Ideally, I would like to use the same main function in the package's tests as suggested here. In the latter context, I would like to call the main function from within the test class and set (some of) the command line arguments prior to the function call (which otherwise will fail, due to missing arguments).
# file in the tests folder calling the above main function
class TestConsole(TestCase):
    def test_basic(self):
        set_value_of_a()
        main()
Is that possible?
The argparse module reads its input from a special variable, the argument vector (ARGV for short). In Python it is usually accessed as sys.argv from the sys module.
This variable is an ordinary list, so you can append your command line parameters to it like this:
import sys
sys.argv.append(SOME_VALUE)  # 'a' is a positional argument, so pass only its value
main()
However, messing with sys.argv at runtime is not a good way of testing.
A much cleaner way to replace sys.argv for a limited scope is the unittest.mock.patch context manager, like this:
with unittest.mock.patch('sys.argv', ['prog', SOME_VALUE]):  # argv[0] is the program name
    main()
Read more about unittest.mock.patch in the documentation.
Also, check this SO question:
How do I set sys.argv so I can unit test it?
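A self-contained version of the patching approach, as a sketch: main here returns args.a so that there is something to check, which the main in the question does not do, and 'prog' is just a placeholder for the program name in argv[0].

```python
import argparse
import sys
from unittest import mock


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('a', type=str, help='mandatory argument a')
    args = parser.parse_args()  # reads sys.argv[1:]
    return args.a               # returned only so the caller can inspect it


original_argv = list(sys.argv)

# sys.argv is replaced only inside the with block and restored afterwards.
with mock.patch.object(sys, 'argv', ['prog', 'hello']):
    result = main()

print(result)
```

Because the patch is undone on exit, other tests in the same process still see the original arguments.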
@William Fernandes: Just for the sake of completeness, I'll post the full solution in the way that was suggested (checking for an empty dict rather than kwargs is None):
import argparse
def main(**kwargs):
    a = None
    if not kwargs:
        # No keyword arguments given: fall back to the command line.
        parser = argparse.ArgumentParser()
        parser.add_argument('a', type=str, help='mandatory argument a')
        args = parser.parse_args()
        a = args.a
    else:
        a = kwargs.get('a')
    print(a)
From within the test class the main function can then be called with arguments:
# file in the tests folder calling the above main function
class TestConsole(TestCase):
    def test_basic(self):
        main(a=42)
When called from the command line without kwargs, the positional command line argument a must still be supplied.
Add kwargs to main and if they're None, you set them to the parse_args.
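A common alternative to the kwargs approach, shown here only as a sketch and not as what either answer proposed, is to let main accept an optional argument list and forward it to parse_args. Returning args.a is an addition for the sake of the example.

```python
import argparse


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument('a', type=str, help='mandatory argument a')
    # With argv=None, argparse falls back to sys.argv[1:], so the console
    # entry point keeps working unchanged.
    args = parser.parse_args(argv)
    return args.a


# In a test, pass the arguments explicitly instead of touching sys.argv.
value = main(['42'])
print(value)
```

This keeps a single code path, so the tests exercise the same parsing (including type=str) that the command line does.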
I have a system in which you can choose which modules are loaded and run (a "module" is not necessarily a Python module; it can combine several of them). Say the program can run modules A and B. Now I want every module to be able to define (add) its own parameters. Let's say A wants -n and B wants -s for something, but there is one common parameter, -c, which the main system itself needs. What is the best way to achieve this?
So far I have been using a single optparse.OptionParser instance and passing it to every module when it is initialized. The module can then modify it (add a new parameter) if needed.
You should consider moving to a library that supports the concept of sub-parsers, such as argparse (which deprecates optparse anyway), so that each library can create its own parser rules, and the main program can just combine them.
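One way to let each module define its own rules and have the main program combine them is argparse's parents= parameter, rather than sub-commands. The following is a sketch under assumptions: the function names, the int type for -n, and the sample values are invented, and only the flags -n, -s and -c come from the question.

```python
import argparse


def module_a_parser():
    # add_help=False so that only the combined parser defines -h/--help.
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('-n', type=int, default=1, help='option owned by module A')
    return parser


def module_b_parser():
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument('-s', default='', help='option owned by module B')
    return parser


def build_main_parser(module_parsers):
    # The main parser inherits every argument defined by the parent parsers.
    parser = argparse.ArgumentParser(parents=module_parsers)
    parser.add_argument('-c', required=True, help='option needed by the main system')
    return parser


parser = build_main_parser([module_a_parser(), module_b_parser()])
args = parser.parse_args(['-c', 'conf.ini', '-n', '3', '-s', 'fast'])
print(args.c, args.n, args.s)
```

Each module only has to expose a function that returns its parser, and the main program decides which of them to include.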
When I had this problem I ended up using a class derived from ArgumentParser that added the ability to register callback functions that would be executed once the arguments were parsed:
import argparse
class ArgumentParser(argparse.ArgumentParser):
    def __init__(self, *p, **kw):
        super(ArgumentParser, self).__init__(*p, **kw)
        self._reactions = []
    def add_reaction(self, handler):
        self._reactions.append(handler)
    def parse_known_args(self, args=None, namespace=None):
        (args, argv) = super(ArgumentParser, self).parse_known_args(args, namespace)
        for reaction in self._reactions:
            reaction(args)
        return (args, argv)
This way the parser object still needs to be passed to all the modules to register their command line switches, but the modules can react to the switches "on their own":
def arguments_parsed(args):
    if args.datafile:
        load_stuff(args.datafile)
def add_arguments(ap):
    ap.add_argument('--datafile',
                    help="Load additional input data")
    ap.add_reaction(arguments_parsed)
This uses argparse, but the same could probably be done with optparse.
It is not tested with advanced features like subparsers and probably won't work there, but could easily be extended to do so.
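Put together, the pieces above could be exercised as follows. This is a sketch: the subclass is renamed ReactiveParser to avoid shadowing argparse.ArgumentParser, and load_stuff is a hypothetical stand-in that merely records the path it is given.

```python
import argparse


class ReactiveParser(argparse.ArgumentParser):
    """ArgumentParser that runs registered callbacks after parsing."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._reactions = []

    def add_reaction(self, handler):
        self._reactions.append(handler)

    def parse_known_args(self, args=None, namespace=None):
        parsed, extras = super().parse_known_args(args, namespace)
        for reaction in self._reactions:
            reaction(parsed)
        return parsed, extras


loaded = []


def load_stuff(path):
    # Hypothetical stand-in for the module's real loading logic.
    loaded.append(path)


def arguments_parsed(args):
    if args.datafile:
        load_stuff(args.datafile)


def add_arguments(ap):
    ap.add_argument('--datafile', help="Load additional input data")
    ap.add_reaction(arguments_parsed)


parser = ReactiveParser()
add_arguments(parser)
# parse_args delegates to parse_known_args, so the reactions fire here too.
parser.parse_args(['--datafile', 'extra.csv'])
print(loaded)
```

The module registers both its switch and its reaction in one place, and the main program never needs to know that --datafile exists.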