Use config not packaged with installed Hydra app - python

When packaging a Hydra application and exposing a script as an entry point, the script can be called from the command line anywhere. When called, the script uses the configuration that was bundled with the package. Is there a way to modify the config search path so that a different configuration can be passed to the script?
For example:
app.py
import hydra

@hydra.main(config_path="conf", config_name="config")
def my_func(cfg):
    print(cfg)
config.yaml
key: value
another_key: second_value
The package is created and installed with entry_points={"console_scripts": ["hydra_app_test = hydra_app_test.app:my_func"]}.
$ hydra_app_test
{'key': 'value', 'another_key': 'second_value'}
I would like to be able to define a local config and pass it to hydra_app_test. Is something like this possible?
some/config/path/config.yaml
key: not_value
mon: key
$ hydra_app_test --config-override=some/config/path
{'key': 'not_value', 'mon': 'key'}
I have tried using the --config-dir and --config-path overrides, but without any luck.
$ hydra_app_test --config-dir=some/config/path
{'key': 'value', 'another_key': 'second_value'}
$ hydra_app_test --config-path=some/config/path
Primary config module 'hydra_app_test.some.config.path' not found.
Check that it's correct and contains an __init__.py file
Interestingly enough, this pattern works if you do not use the installed app but run app.py as a script (with the requisite if __name__ == "__main__" logic):
python path/to/app.py --config-path=some/config/path
{'key': 'not_value', 'mon': 'key'}
Perhaps I am just missing something here, but it would seem you should be able to replicate the same behavior for both installed package scripts and python scripts.

Use --config-dir|-cd to add a directory to the config search path for a specific run.
You can also use the hydra.searchpath config variable, which can be overridden from the command line or set in the primary config.
There is a page dedicated to the config search path in the Hydra documentation.
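For example, a minimal sketch of the searchpath override (the directory is the hypothetical one from the question; entries are usually given with a file:// or pkg:// prefix):
$ hydra_app_test 'hydra.searchpath=[file:///absolute/path/to/some/config/path]'
The same entry can instead be added to the primary config:
hydra:
  searchpath:
    - file:///absolute/path/to/some/config/path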

I have verified that this solution works. I found this response in a feature request thread on the Hydra GitHub.
Thread:
https://github.com/facebookresearch/hydra/issues/874
Solution:
https://github.com/facebookresearch/hydra/issues/874#issue-678621609
Basically, any area of the config that you want to be overridable is set up via a default config; this then allows the --config-dir flag to work in conjunction with a specified config file. The only caveat to this answer is that the new configs must be laid out in a directory structure that mirrors the default configs.
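A sketch of that pattern (the config group name db and the file names below are made up for illustration). The packaged primary config selects a default from a group:
conf/config.yaml (packaged with the app)
defaults:
  - db: local
conf/db/local.yaml (packaged)
key: value
another_key: second_value
The local override lives in the same group structure under the extra directory:
some/config/path/db/prod.yaml (not packaged; note the matching db/ subdirectory)
key: not_value
mon: key
It can then be selected at run time:
$ hydra_app_test --config-dir some/config/path db=prod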

Related

How to use different .env files with python-decouple

I am working on a Django project that I need to run with Docker. In this project I have multiple .env files: .env.dev, .env.prod, .env.staging. Is there a right way to manage all these files with the package python-decouple? I've searched for a workaround to deal with this challenge and did not find any kind of answer, not even in the official documentation.
Can I use something like:
# doesn't work that way, it's just a dummy example
python manage.py runserver --env-file=.env.prod
or maybe any way to setting or override the file I need to use?
Instead of importing decouple.config and doing the usual config('SOME_ENV_VAR'), create a new decouple.Config object using RepositoryEnv('/path/to/.env.prod').
from decouple import Config, RepositoryEnv
DOTENV_FILE = '/home/user/my-project/.env.prod'
env_config = Config(RepositoryEnv(DOTENV_FILE))
# use the Config().get() method as you normally would since
# decouple.config uses that internally.
# i.e. config('SECRET_KEY') = env_config.get('SECRET_KEY')
SECRET_KEY = env_config.get('SECRET_KEY')
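If the goal is to switch between .env.dev, .env.staging and .env.prod without touching the code, one option (a sketch, not part of python-decouple itself) is to pick the file from an environment variable set per container, e.g. in docker-compose; ENV_FILE below is a hypothetical variable name:
import os
from decouple import Config, RepositoryEnv

# e.g. ENV_FILE=.env.prod is set for the production service
DOTENV_FILE = os.environ.get('ENV_FILE', '.env.dev')
env_config = Config(RepositoryEnv(DOTENV_FILE))

SECRET_KEY = env_config.get('SECRET_KEY')
DEBUG = env_config.get('DEBUG', default=False, cast=bool)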

How to call salt-ssh (SSHClient) via Python API

I installed Salt in a Python 3 virtual environment and created a Salt configuration that uses a non-root folder for everything (/home/user/saltenv). When using the salt-ssh command inside the venv, e.g. salt-ssh '*' test.ping, everything works as expected. (Please note that the config dir is resolved via a Saltfile, so the -c option is omitted, but that should not matter.)
When calling the SSHClient directly via Python, however, I get no results. I have already figured out that the roster file is not read, obviously resulting in an empty target list. I am stuck and the documentation is not that helpful.
Here is the code:
import salt.config
from salt.client.ssh.client import SSHClient

def main():
    c_path = '/home/user/saltenv/etc/salt/master'
    master_opts = salt.config.client_config(c_path)
    c = SSHClient(c_path=c_path, mopts=master_opts)
    res = c.cmd(tgt='*', fun='test.ping')
    print(res)

if __name__ == '__main__':
    main()
As it turns out, the processing of some options differs between the CLI and the client: salt-ssh does not use the SSHClient. Instead, the class salt.client.ssh.SSH is used directly.
While salt-ssh adds the config_dir from the Saltfile to the opts dictionary to resolve the master config file, the SSHClient reads the config file passed to the constructor directly, and config_dir is not added to the options (resulting in the roster file not being found).
My solution is to include config_dir in the master config file as well. The code from the question then works unchanged.
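For reference, a minimal sketch of that addition to the master config (the paths are the ones from the question):
# /home/user/saltenv/etc/salt/master
config_dir: /home/user/saltenv/etc/salt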
Alternative 1: If you only have one Salt configuration, it is also possible to set the environment variable SALT_CONFIG_DIR.
Alternative 2: The mopts argument of SSHClient can be used to pass a custom configuration directory, but it requires more lines of code:
import os
import salt.config
from salt.client.ssh.client import SSHClient

config = '/home/user/saltenv/etc/salt/master'
# start from the default master options and point config_dir at the custom location
defaults = dict(salt.config.DEFAULT_MASTER_OPTS)
defaults['config_dir'] = os.path.dirname(config)
master_opts = salt.config.client_config(config, defaults=defaults)
c = SSHClient(mopts=master_opts)

Python unit tests not discovered in VSCode

I've written a Python test file called scraping_test.py, with a single unittest test class called TestScrapingUtils:
"""Tests for the scraping app"""
import unittest
from bs4 import BeautifulSoup as bs4
from mosque_scraper.management.commands import scraping_utils
from mosque_scraper.selectors import MOSQUE_INFO_ROWS_SELECTOR
class TestScrapingUtils(unittest.TestCase):
"""Test scraping_utils.py """
def setup(self):
"""Setup McSetupface."""
pass
def test_get_keys_from_row(self):
""" Test that we extract the correct keys from the supplied rows."""
test_page_name = "test_page.html"
with open(test_page_name) as test_page_file:
test_mosque = bs4(test_page_file, 'html.parser')
rows = test_mosque.select(MOSQUE_INFO_ROWS_SELECTOR)
field_dict = scraping_utils.get_fields_from_rows(rows)
self.assertDictEqual(field_dict, {})
My settings for unit tests are:
{
    "python.unitTest.unittestEnabled": true,
    "python.unitTest.unittestArgs": [
        "-v",
        "-s",
        ".",
        "-p",
        "*test.py"
    ]
}
It looks like it should work, but when I click to run the tests in VSCode it says that no tests were discovered:
No tests discovered, please check the configuration settings for the tests.
How do I make it work?
You have to run it once using the shortcut Shift+Ctrl+P and typing "Python: Run All Unit Tests".
It won't show up in the editor until it has been successfully executed at least once, or until you use the discover unit tests command.
However, one thing that has caught me many times is that the Python file has to be a valid Python file. IntelliSense in VS Code for Python is not as sophisticated as for JavaScript or TypeScript, and it won't highlight every syntax error. You can verify this by forcing it to run all unit tests and observing the Python Test Log window.
What caught me is that an __init__.py file must be created in every subdirectory, from the root folder specified with the -s option (in the example, the current directory ".") down to the subdirectory where the test module is located. Only then was I able to discover tests successfully.
In the question's example, both project_dir/ and project_dir/scraping_app/ should contain __init__.py. This assumes that settings.json is located in project_dir/.vscode and the tests are run from the project_dir/ directory.
Edit: Alternatively, use "-s", "./scraping_app/" as the root test directory so you don't have to put __init__.py in project_dir/.
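A sketch of the layout described above (directory names are taken from the answer and the question's imports, everything else is illustrative):
project_dir/
    .vscode/
        settings.json
    __init__.py
    scraping_app/
        __init__.py
        scraping_test.py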
Instead of the file name scraping_test.py, it should be test_scraping.py: the file name must start with the test prefix.
I had the same error with a slightly different configuration. (I am posting this here because this is the question that comes up when you search for this error.)
In addition to what was said above, it is also important not to use periods in the test file names (e.g. use module_test.py instead of module.test.py).
You can add the DJANGO_SETTINGS_MODULE variable and django.setup() inside the __init__.py file of the tests package.
import os
import django
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'your_app.settings')
django.setup()
In my case, the problem was that my test was importing a module which was reading an environment variable using os.environ['ENV_NAME']. If the variable does not exist, it throws an error. But VS Code does not log anything (or at least I couldn't find it).
The reason was that my .env file was NOT in the workspace root, so I had to add "python.envFile": "${workspaceFolder}/path/to/.env" to the settings.json file.
After that, the test was discovered successfully.
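For reference, that entry sits in settings.json like this (the path is the placeholder from above):
{
    "python.envFile": "${workspaceFolder}/path/to/.env"
}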
I also had this issue.
For me the fix was to make sure there were no errors, and to comment out all code in files that rely on pytest, just for the initial load.
Another issue that causes the unit tests not to be discovered is using a conda environment that contains an explicit dependency on the conda package itself. This happens when the environment.yml contains the line:
- conda
Removing this line and creating the environment from scratch makes the unit tests discoverable. I have created a bug report on GitHub for this: https://github.com/microsoft/vscode-python/issues/19643
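A minimal sketch of the kind of environment.yml in question (the name and the other dependencies are made up):
name: my-project
dependencies:
  - python=3.10
  - pytest
  - conda   # removing this line made the tests discoverable again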
(This is my second solution to this issue; I decided to create another answer since this is entirely different from the previous one.)
This is my first time using unittest in VS Code. I found that the file names cannot contain spaces or dots, and cannot start with numbers.
For the dots, I guess anything after the dot is treated as the file extension by unittest.
For the spaces, I guess the file name is not wrapped in quotes.
For me, discovering the unit tests did the trick:
Shift+Ctrl+P and execute "Python: Discover Unit Tests".
After running this I get the "Run Test | Debug Test" links over each test function.

Get list of used configuration files from Nose

From the code that runs the tests using nose, how do I retrieve a list of config files that have been passed on the command line (without parsing the args myself since nose should expose these values somewhere) as in,
nosetests -c default.ini -c staging.ini
which would then result in,
[default.ini, staging.ini]
I can't seem to find these values on the nose.config object.
It seems your problem is that you're naming your configuration files differently from what the default nose configuration files should be named.
From nose.config
config_files = [
    # Linux users will prefer this
    "~/.noserc",
    # Windows users will prefer this
    "~/nose.cfg"
]

def user_config_files():
    """Return path to any existing user config files
    """
    return filter(os.path.exists,
                  map(os.path.expanduser, config_files))

def all_config_files():
    """Return path to any existing user config files, plus any setup.cfg
    in the current working directory.
    """
    user = user_config_files()
    if os.path.exists('setup.cfg'):
        return user + ['setup.cfg']
    return user
The short of this is that nose looks for default configuration files named ~/.noserc or ~/nose.cfg. If they're not named like this, nose will not pick them up and you will have to specify the configuration file names manually, as you are doing on the command line.
Now say, for instance, that you have some object config which is an instance of nose.config.Config; then the best way to get your config file names would be to say:
>>> from nose.config import Config
>>> c = Config()
>>> c.configure(argv=["nosetests", "-c", "foo.txt"])
>>> c.options.files
['foo.txt']

In the Pyramid web framework, how do I source sensitive settings into development.ini / production.ini from an external file?

I'd like to keep development.ini and production.ini under version control, but for security reasons I would not want the sqlalchemy.url connection string to be stored, as it would contain the username and password used for the database connection.
What's the canonical way, in Pyramid, of sourcing this setting from an additional external file?
Edit
In addition to the solution using the environment variable, I came up with this solution after asking around on #pyramid:
from configparser import ConfigParser

def main(global_config, **settings):
    """ This function returns a Pyramid WSGI application.
    """
    # Read db password from config file outside of version control
    secret_cfg = ConfigParser()
    secret_cfg.read(settings['secrets'])
    dbpass = secret_cfg.get("secrets", "dbpass")
    settings['sqlalchemy.url'] = settings['connstr'] % (dbpass,)
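To make that snippet concrete, the ini files it assumes could look roughly like this (file names and values are hypothetical; the %% doubling is there because PasteDeploy-style ini files use ConfigParser interpolation, so %%s becomes a literal %s in the loaded setting):
development.ini (under version control)
[app:main]
use = egg:myproject
secrets = /etc/myproject/secrets.ini
connstr = postgresql://myuser:%%s@localhost/mydb
secrets.ini (outside version control)
[secrets]
dbpass = s3cret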
I looked into this a lot and played with a lot of different approaches. However, Pyramid is so flexible, and the .ini config parser is so minimal in what it does for you, that there doesn't seem to be a de facto answer.
In my scenario, I tried having a production.example.ini in version control at first that got copied on the production server with the details filled in, but this got hairy, as updates to the example didn't get translated to the copy, and so the copy had to be re-created any time a change was made. Also, I started using Heroku, so files not in version control never made it into the deployment.
Then, there's the encrypted config approach, which I don't like as a paradigm. Imagine a sysadmin being responsible for maintaining the production environment, but unable to change the location of a database or an environment-specific setting without running it back through version control. It's really nice to have as much separation between environment and code as possible, so those changes can be made on the fly without version control revisions.
My ultimate solution was to have some values that looked like this:
[app:main]
sqlalchemy.url = ${SQLALCHEMY_URL}
Then, on the production server, I would set the environment variable SQLALCHEMY_URL to point to the database. This even allowed me to use the same configuration file for staging and production, which is nice.
In my Pyramid init, I just expanded the environment variable value using os.path.expandvars:
import os
from sqlalchemy import create_engine

sqlalchemy_url = os.path.expandvars(settings.get('sqlalchemy.url'))
engine = create_engine(sqlalchemy_url)
And, if you want to get fancy with it and automatically replace all the environment variables in your settings dictionary, I made this little helper method for my projects:
def expandvars_dict(settings):
    """Expands all environment variables in a settings dictionary."""
    return dict((key, os.path.expandvars(value))
                for key, value in settings.items())
Use it like this in your main app entry point:
settings = expandvars_dict(settings)
The whole point of the separate ini files in Pyramid is that you do not have to version control all of them and that they can contain different settings for different scenarios (development/production/testing). Your production.ini almost always should not be in the same VCS as your source code.
I found this way of loading secrets from an extra configuration file and from the environment.
from pyramid.config import Configurator
from paste.deploy import appconfig
from os import path

__all__ = ["main"]

def _load_secrets(global_config, settings):
    """ Helper to load secrets from a secrets config and
    from env (in that order).
    """
    if "drawstack.secrets" in settings:
        secrets_config = appconfig('config:' + settings["drawstack.secrets"],
                                   relative_to=path.dirname(global_config['__file__']))
        for k, v in secrets_config.items():
            if k == "here" or k == "__file__":
                continue
            settings[k] = v
    if "ENV_DB_URL" in global_config:
        settings["sqlalchemy.url"] = global_config["ENV_DB_URL"]

def main(global_config, **settings):
    """ This function returns a Pyramid WSGI application.
    """
    _load_secrets(global_config, settings)
    config = Configurator(settings=settings)
    config.include('pyramid_jinja2')
    config.include('.models')
    config.include('.routes')
    config.scan()
    return config.make_wsgi_app()
The code above will load any variables from the config referenced by the config key drawstack.secrets and, after that, it tries to load ENV_DB_URL from the environment.
drawstack.secrets can be relative to the original config file OR absolute.
