How to properly manage application configurations

What is the most universal and best method for managing application configuration? For "good configuration management" I want these properties:
1. A list of all available properties and their default values in one place.
2. A list of properties which can be changed by an app user, also in one place.
3. When I retrieve a specific property, its value is returned from the 2nd list (user-changeable configs) or, if it's not there, from the first list.
So far, I have hard-coded the 1st list as an object (more specifically, a dict), written a .conf file read by ConfigParser so that an app user can easily change some of the properties (the 2nd list), and written a public method on the config object that retrieves a property by its name or, if it's not there, raises an exception. In the end, one object was responsible for everything (parsing the file, raising exceptions, overriding properties, etc.). But I was wondering whether there's a built-in library which does more or less the same thing, or an even better way to manage configuration that respects KISS, DRY and similar principles (I don't always manage that with this method).
Thanks in advance.

Create a default settings module which contains your desired default settings. Create a second module intended to be edited by the user, with a from default_settings import * statement at the top, and instruct the user to write any replacements into this module instead.
Python is rather expressive, so in most cases, if you can expect the user to understand it on any level, you can use a Python module itself as the configuration file.
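If the users are not expected to read Python, the layering the question describes (defaults in one place, user overrides in a .conf file, lookup falling back to the defaults) can also be had from the standard library alone. The following is only a rough sketch of that alternative, not the module-based approach above: the setting names, the [app] section name and the load_config helper are made up for illustration, and the .conf contents are inlined as a string so the example runs on its own.

```python
from collections import ChainMap
from configparser import ConfigParser

# List 1: every available property with its default, in one place.
# (Names and values here are hypothetical.)
DEFAULTS = {"host": "localhost", "port": "8080", "debug": "false"}

# List 2: stand-in for the text of a user-editable .conf file.
USER_CONF = """
[app]
port = 9090
"""


def load_config(conf_text):
    """Layer user overrides on top of DEFAULTS; reject unknown keys."""
    parser = ConfigParser()
    parser.read_string(conf_text)
    user = dict(parser["app"]) if parser.has_section("app") else {}
    unknown = set(user) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    # ChainMap searches `user` first, then falls back to DEFAULTS.
    return ChainMap(user, DEFAULTS)


config = load_config(USER_CONF)
print(config["port"])  # taken from the user file
print(config["host"])  # falls back to the default
```

Looking up a name that is in neither mapping raises KeyError, which covers the "raise an exception" requirement. Note that ConfigParser returns every value as a string, so numeric settings still need converting.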

How do I navigate the Python documentation to find out the syntax for os.environ.get()?

I saw this line in our code, our python version is 3.7.
example.py
PORT = os.environ.get("PORT", 5000)
I need to add an environment variable, but I need a default, and my background isn't Python. My guess is that 5000 above is a default value if there is no PORT in environment variables, but I want to double check this.
So I googled "os.environ.get python docs." This brought me to the os documentation on the Python website, and I had to text-search from there for environ until I found a paragraph dedicated to os.environ.
A mapping object where keys and values are strings that represent the process environment. For example, environ['HOME'] is the pathname of your home directory (on some platforms), and is equivalent to getenv("HOME") in C.
I had hoped for explicit documentation on environ.get, but I figured that environ is itself some sort of Python data structure whose documentation I'd need to look up. So I clicked the mapping object link. That brought me to another paragraph:
A container object that supports arbitrary key lookups and implements the methods specified in the Mapping or MutableMapping abstract base classes. Examples include dict, collections.defaultdict, collections.OrderedDict and collections.Counter.
At this point I'm a bit at a loss, because I don't think the docs ever said which kind of mapping object os.environ specifically is. So since it says "implements the methods specified in the Mapping... abstract base classes" I clicked the abstract base classes link.
On that page I didn't see any reference to get(), so now I was confused because I was expecting a list of methods.
An alternative google search of course demonstrated that the second argument to .get() is the default should no such environment variable exist, but I'm curious what I did wrong there. I tried to do it "right" by looking up the information on my own (rtfm), but I failed. What was the actual proper way to use the Python documentation here?

If the docs say that something returns a mapping type, it means it behaves very much like the ubiquitous dict type, so dict.get is the method to look up:
https://docs.python.org/3/library/stdtypes.html#dict.get
The os module doc mentions:
This mapping is captured the first time the os module is imported, typically during Python startup as part of processing site.py. Changes to the environment made after this time are not reflected in os.environ, except for changes made by modifying os.environ directly
So, reading the os.environ docs, it seems that, for example, writes to os.environ are reflected in the process environment. A regular dict would not do this, of course, so that is one important difference between this custom mapping type and a dict.
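To double-check the behaviour of the second argument, here is a small sketch you can run yourself. EXAMPLE_PORT is a made-up variable name, used so the demo does not touch a real PORT setting.

```python
import os

VAR = "EXAMPLE_PORT"  # hypothetical variable name for this demo

# Make sure the variable is unset, then ask for it with a default.
os.environ.pop(VAR, None)
unset_value = os.environ.get(VAR, 5000)
print(repr(unset_value))  # the default is returned unchanged, as an int

# Once the variable exists, its value always comes back as a string.
os.environ[VAR] = "8000"
set_value = os.environ.get(VAR, 5000)
print(repr(set_value))

# Converting explicitly gives the same type in both cases.
port = int(os.environ.get(VAR, 5000))
print(port)
```

So a line like PORT = os.environ.get("PORT", 5000) yields an int when the variable is missing and a str when it is set, which is why wrapping the call in int() is a common precaution.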

How to store application settings across modules [duplicate]

This question already has answers here:
How to save application settings in a config file?
(3 answers)
Closed 9 years ago.
I received a project from developer who left our company. Not too complex, but it doesn't look very nice.
So here is the question:
Application has some modules and one is "settings" which stores some app. options (not all possible options, lets say just two: foo and bar).
When application is started it reads options from command line (using argparse):
parser.add_argument('--foo', action='store_true')
parser.add_argument('--bar', action='store_true')
parser.add_argument('--baz', action='store_true')
And then it performs this nasty thing:
for name, val in parser.parse_args(sys.argv[1:])._get_kwargs():
    setattr(sys.modules['settings'], name, val)
First, I think this is a dirty, non-Pythonic hack. Second, such code is simply inconvenient to use, because when I need settings.baz, the IDE complains that it doesn't exist.
The intention of this hack is to make the options parsed from the command line available to all the other modules the application uses.
I'm thinking about something like the singleton pattern, but I have only used it once in PHP, and I don't know if it is the correct solution in Python. And even if it is, can someone show an example?
I'm noob in python and on SO, please be kind to me :)
Thanks.
p.s. I'm sorry for possible mistakes in my English

Modules in Python are singleton objects, and using one to store the settings used by the other modules would be very Pythonic.
The second line of the "nasty thing" is just setting the attributes of a module named settings and so isn't that bad. What's worse is the _get_kwargs() part of the first line which is accessing a private attribute of the argparse.Namespace object returned by parser.parse_args() to get the names and values of the settings parsed from the command-line. A slightly better way to do it might be something like this:
import settings  # possibly empty .py file
# vars() returns the parsed options as a plain dict, with no private API involved
for name, val in vars(parser.parse_args(sys.argv[1:])).items():
    setattr(settings, name, val)
However, this won't fix your IDE problems, because the IDE doesn't know the names of settings that are added dynamically. A simple way to fix that would be to define all the possible attributes with some kind of default values in a settings.py module instead of having an empty one.
The first time a module is imported an entry for it is added to the sys.modules dictionary with its name as the key and an instance of types.ModuleType as a value. Subsequent imports will first check to see if an entry for it already exists and will skip reloading the file if it does -- which is why I claim they're basically singleton objects. Modifications made to its attributes will immediately be visible to other modules that have imported it or do so afterwards, so it's generally a good data sharing mechanism within an application.
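Here is a rough, single-file sketch of that idea. To keep it runnable on its own, a types.ModuleType object registered in sys.modules stands in for a real settings.py file (that stand-in and the apply_cli_args helper are assumptions of the demo); in a real project you would simply write the three defaults into settings.py.

```python
import argparse
import sys
import types

# Stand-in for a settings.py file that defines every option with a default.
settings = types.ModuleType("settings")
settings.foo = False
settings.bar = False
settings.baz = False
sys.modules["settings"] = settings  # what the first real import would do


def apply_cli_args(argv):
    """Parse argv and copy each option onto the settings module."""
    parser = argparse.ArgumentParser()
    for name in ("foo", "bar", "baz"):
        parser.add_argument(f"--{name}", action="store_true")
    for name, val in vars(parser.parse_args(argv)).items():
        setattr(settings, name, val)


apply_cli_args(["--foo", "--baz"])

# Any other module that imports settings gets the very same object.
import settings as same_settings

print(same_settings is settings)
print(same_settings.foo, same_settings.bar, same_settings.baz)
```

Because the import is answered from sys.modules, the second import does not create a new object, and the values set from the command line are visible through it.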

Take a look at Config (a hierarchical, easy-to-use, powerful configuration module for Python).
Detailed doc & examples

Is there a django idiom to store app-related variables in the DB?

I'm quite new to django, and moved to it from Drupal.
In Drupal it is possible to define module-level variables (read "application" for django) which are stored in the DB and use one of Drupal's "core tables". The idiom would be something like:
variable_set('mymodule_variablename', $value);
variable_get('mymodule_variablename', $default_value);
variable_del('mymodule_variablename');
The idea is that it wouldn't make sense to have each module (app) to instantiate a whole "module table" to just store one value, so the core provides a common one to be shared across modules.
To the best of my newbie understanding of django, django lacks such functionality, but since it is a common pattern, I thought I'd turn to the SO community to check whether there is a typical/standard/idiomatic way that django devs use to solve this problem.
(BTW: the value is not a constant that I could put in a settings file. It's a value that should be refreshed daily, and should be read at each request).

There are apps to achieve this, but I'd like to recommend django-modeldict from Disqus, as it's brief:
ModelDict is a very efficient way to store things like settings in
your database. The entire model is transformed into a dictionary
(lazily) as well as stored in your cache. It's invalidated only when
it needs to be (both in process and based on CACHE_BACKEND).

Data that is not static is stored in a model. If you need to share data or functions between apps I have seen the convention of making a shared app, something like 'common'. This would house shared models, or utility functions.
In the django projects I have seen, the data is usually specific. The data you are storing should be in a model that is representative of that data; I would rather have an explicit model/object representing my data than a generic object that houses vastly different data.
If you are only defining one or two variables which are changed daily, perhaps just a key/value store like memcached would work for you?

Another +1 for ModelDict. Another potential, similar solution is Django Constance:
https://github.com/jazzband/django-constance
It's meant to store app config parameters in the database and has the advantage that it exposes a nice backend to edit them for administrators (with the right permissions), handles default values and also has caching etc.
EDIT:
In case it's not clear from the documentation (which it isn't), you can set settings the 'Pythonic' way, i.e. to set a setting to a value, you do
from constance import config
config.variable_name = value

Dynamic change in Configuration to reflect in user defined Data structures

In my program, I read a "configuration file" and initialize many classes from it. I need a way for dynamic changes in the configuration file to update all of those classes.
What is the best way to achieve this in Python?
As an example:
The /etc/passwd file consists of
Username:Password:User ID:Group ID:User ID Info:Home directory:Shell
My program initializes user-defined classes for each user based on the entries in the /etc/passwd file. If one or more attributes of a user entry change in the file dynamically, how could this be transparently applied to re-initialize the user-defined classes?
PS - The actual program is much more complex than the above example, so transparently propagating the configuration changes to the user-defined classes is not possible.

You can watch the changes to the file using e.g. pyinotify (Linux) or watchdog (cross-platform). Once the change is detected, you can update your data structures.
Updating the username (or other "derived" information stored elsewhere) is better handled by not storing it at all: store and use an invariant that never changes (e.g. the UID), fetch the "derived" information on demand through the relevant API, and use it for display purposes only.
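If adding a third-party watcher is not an option, a cruder stdlib-only variant of the same idea is to check the file's modification time and size on each access and re-parse only when they differ. This is just a sketch under that assumption, not how pyinotify or watchdog work; the ReloadingFile class and parse_passwd helper are hypothetical names, and a temporary file stands in for /etc/passwd. Records are keyed by UID, in line with the advice above.

```python
import os
import tempfile


def parse_passwd(text):
    """Return {uid: record} for passwd-style lines, keyed by the invariant UID."""
    fields = ("name", "password", "uid", "gid", "info", "home", "shell")
    users = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        record = dict(zip(fields, line.split(":")))
        users[int(record["uid"])] = record
    return users


class ReloadingFile:
    """Re-parse a file whenever its (mtime, size) signature changes."""

    def __init__(self, path, parse):
        self.path = path
        self.parse = parse
        self._signature = None
        self._data = None

    def get(self):
        st = os.stat(self.path)
        signature = (st.st_mtime_ns, st.st_size)
        if signature != self._signature:
            with open(self.path) as f:
                self._data = self.parse(f.read())
            self._signature = signature
        return self._data


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "passwd")
    with open(path, "w") as f:
        f.write("alice:x:1000:1000:Alice:/home/alice:/bin/sh\n")
    users = ReloadingFile(path, parse_passwd)
    first_name = users.get()[1000]["name"]

    # The user is renamed in the file; the UID stays the same.
    with open(path, "w") as f:
        f.write("alicia:x:1000:1000:Alice:/home/alice:/bin/sh\n")
    second_name = users.get()[1000]["name"]

print(first_name, second_name)
```

Code that holds on to the UID and calls users.get() when it needs the name picks up the rename without any of its own objects having to be rebuilt.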

What's a good general way to log SQLAlchemy transactions, complete with authenticated user, etc?

I'm using SQLAlchemy's declarative extension. I'd like all changes to tables logged, including changes in many-to-many relationships (mapping tables). Each table should have a separate "log" table with a similar schema, but with additional columns specifying when the change was made, who made the change, etc.
My programming model would be something like this:
row.foo = 1
row.log_version(username, change_description, ...)
Ideally, the system wouldn't allow the transaction to commit without row.log_version being called.
Thoughts?

There are too many questions in one, so full answers to all of them won't fit the StackOverflow answer format. I'll try to give short hints; ask a separate question for any of them if that's not enough.
Assigning user and description to transaction
The most popular way to do so is to assign the user (and other info) to some global object (threading.local() in a threaded application). This is a very bad approach that causes hard-to-discover bugs.
A better way is to assign the user to the session. This is OK when a session is created for each web request (in fact, it's the best design for an application with authentication anyway), since only one user is using that session. But passing the description this way is not as good.
My favorite solution is to extend the Session.commit() method to accept an optional user (and probably other info) parameter and assign it to the current transaction. This is the most flexible option, and it is well suited to passing the description too. Note that the info is bound to a single transaction and is passed in an obvious way when the transaction is closed.
Discovering changes
sqlalchemy.orm.attributes.instance_state(obj) returns a state object that contains all the information you need. The most useful part for you is probably the state.committed_state dictionary, which contains the original state of changed fields (including many-to-many relations!). There is also the state.get_history() method (or the sqlalchemy.orm.attributes.get_history() function), which returns a history object with a has_changes() method and added and deleted properties for the new and old values respectively. In the latter case, use state.manager.keys() (or state.manager.attributes) to get a list of all fields.
Automatically storing changes
SQLAlchemy supports mapper extensions that can provide hooks before and after update, insert and delete. You need to provide your own extension with all the before hooks (you can't use the after hooks, since the state of objects is changed on flush). With the declarative extension it's easy to write a subclass of DeclarativeMeta that adds a mapper extension to all your models. Note that you have to flush changes twice if you use mapped objects for the log, since a unit of work doesn't account for objects created in hooks.

We have a pretty comprehensive "versioning" recipe at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/LogVersions . It seems some other users have contributed some variants on it. The mechanics of "add a row when something changes at the ORM level" are all there.
Alternatively you can also intercept at the execution level using ConnectionProxy, search through the SQLA docs for how to use that.
edit: versioning is now an example included with SQLA: http://docs.sqlalchemy.org/en/rel_0_8/orm/examples.html#versioned-objects
