Is there any way in Python to initialize a variable only once and then just import it into the remaining modules?
I have the following Python project structure:
api
    v1
        init.py
    v2
        init.py
    init.py
    logging.py
logging.py:

from raven import Client

sentry = None

def init_sentry():
    global sentry
    sentry = Client('some_dsn')
api/init.py:

from app import logging

logging.init_sentry()

# run flask server (v1, v2)
api/{v1,v2}/init.py:

from logging import sentry

try:
    1 / 0
except ZeroDivisionError:
    sentry.captureException()
In api/v1/init.py and api/v2/init.py I get a NoneType error on the sentry variable. I know I could call init_sentry in every file where I use it, but I'm looking for a better way.
Thanks
First, I think you misspelled init.py; it should be __init__.py.
It is bad programming style to pass data between modules via a bare variable. You should use a class or a function to handle shared data. That way you have an API, and it is clear how the variable may be modified by other modules.
But for your question: one option (which I don't really recommend) is to create a module data.py with a shared = {} dictionary. Other modules can then share the data simply by importing it. By checking a variable, or just a flag like moduleA_initialized, you can decide whether the module still needs to be initialized.
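For illustration, a minimal sketch of that data.py idea, reusing the file layout from the question (the flag name is just a placeholder):

# data.py
shared = {}

# api/init.py - initialize exactly once at startup
import data
from raven import Client

if not data.shared.get('sentry_initialized'):
    data.shared['sentry'] = Client('some_dsn')
    data.shared['sentry_initialized'] = True

# api/v1/init.py - consumers look the object up at call time
import data

data.shared['sentry'].captureException()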
As an alternative, you can write directly to the globals() dictionary. Note: this is even worse programming practice, and you should check the names carefully so that there are no conflicts with any library you may use. gettext writes to it, but that is a pretty special case.
Here is one way to encapsulate the sentry variable and make sure that it is always calling into something, instead of accessing None:
logging.py:
from raven import Client

class Sentry(object):
    _client = None

    @classmethod
    def _set_dsn(cls, dsn):
        # build the real client once the DSN is known
        cls._client = Client(dsn)

    def __getattr__(self, item):
        # delegate attribute lookups (e.g. captureException) to the real
        # client at call time, instead of binding to None at import time
        return getattr(self._client, item)

sentry = Sentry()

def init_sentry():
    Sentry._set_dsn('some_dsn')
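A minimal sketch of how the versioned packages could then use the wrapper (assuming logging.py is importable as app.logging, as suggested by the question's from app import logging):

# api/v1/init.py (same for api/v2/init.py)
from app.logging import sentry

try:
    1 / 0
except ZeroDivisionError:
    # resolved at call time, so this works as long as init_sentry()
    # ran somewhere during application startup
    sentry.captureException()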
Note:
The other answer is also right that you likely want __init__.py, not init.py.
Related
Problem
I'm seeing examples that use global variables in Blueprints to store objects used by the Blueprint. However, this causes problems in unit testing when multiple instances of an application are created.
To illustrate, below is an example where parsed files are cached, based on a setting during blueprint initialization (Note: I know there are many issues with the below code, it's just an example):
from pathlib import Path
import json

from flask import Blueprint, jsonify
from whatevermodule import JSONFileCache

bp = Blueprint('cached_file_loader', __name__, url_prefix='/files')
json_cache = JSONFileCache()

@bp.record
def init(setup_state):
    if not setup_state.options['use_caching']:
        json_cache.disable()

@bp.route('/<path:path>')
def view_json(path):
    localpath = Path('/tmp', path.lstrip('/'))
    data = json_cache.get(localpath)
    if data is None:
        with open(localpath, 'r') as jsonfile:
            data = json.load(jsonfile)
        json_cache.set(localpath, data)
    return jsonify(data)
The problem is that in this example a global variable, json_cache, is used. This means that if I create multiple instances of an app with this blueprint, using different caching settings, they will interfere with each other, causing tests to fail.
What I've tried/considered

1. Keep using global variables, and just use one application per test file. This is actually what I'm using currently. However, since pytest collects all tests in one process, I have to do
   find ./tests -iname "test_*.py" -print0 | xargs -0 -n1 pytest
   to make sure everything runs.
2. Storing the objects (in this case the json_cache) on the Blueprint instance. This is however not possible, since the 'bp' variable is also a global variable that will be reused...
3. Inside the 'init' function, storing the objects inside setup_state.app.config (a rough sketch follows after this list). This should work, since this variable is initialized per application. The docs for configuration handling actually say:
   "That way you can create multiple instances of your application with different configurations attached which makes unit testing a lot easier."
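A rough sketch of option (3), reusing the example above (JSONFileCache and the use_caching option come from it; the 'JSON_CACHE' config key is just a placeholder name):

from pathlib import Path
import json

from flask import Blueprint, current_app, jsonify
from whatevermodule import JSONFileCache

bp = Blueprint('cached_file_loader', __name__, url_prefix='/files')

@bp.record
def init(setup_state):
    # one cache per application object instead of one per module
    cache = JSONFileCache()
    if not setup_state.options['use_caching']:
        cache.disable()
    setup_state.app.config['JSON_CACHE'] = cache

@bp.route('/<path:path>')
def view_json(path):
    # look the cache up on the application handling this request
    json_cache = current_app.config['JSON_CACHE']
    localpath = Path('/tmp', path.lstrip('/'))
    data = json_cache.get(localpath)
    if data is None:
        with open(localpath, 'r') as jsonfile:
            data = json.load(jsonfile)
        json_cache.set(localpath, data)
    return jsonify(data)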
Although option (3) should work, it looks very 'hacky' to me to store functional objects inside a dictionary marked as 'config', where I would only expect configuration variables.
Question
Is there a better way, one that looks less 'hacky' than option (3), to store objects used inside blueprints at the app level during initialization?
I want to define a bunch of config variables that can be imported in all the modules in my project. The values of those variables will be constant during runtime but are not known before runtime; they depend on the input. Usually I'd define a dict in my top module which would be passed to all functions and classes from other modules; however, I was thinking it may be cleaner to simply create a blank config.py module which would be dynamically filled with config variables by the top module:
# top.py
import config
config.x = x
# config.py
x = None
# other.py
import config
print(config.x)
I like this approach because I don't have to save the parameters as attributes of classes in my other modules; which makes sense to me because parameters do not describe classes themselves.
This works but is it considered bad practice?
The question as such may be disputed, but I would generally say yes, it's "bad practice", because the scope and impact of a change get really blurred. Note that the use case you're describing is not really about sharing configuration, but about different parts of the program (functions, objects, modules) exchanging data, and as such it's a bit of a variation on a (meta) global variable.
Reading common configuration values could be fine, but changing them along the way... you may lose track of what happened where, and in which order, as modules get imported and values get modified. For instance, assume the config.py above and two modules, m1.py:
import config

print(config.x)
config.x = 1
and m2.py:
import config

print(config.x)
config.x = 2
and a main.py that just does:
import m1
import m2
import config
print(config.x)
or:
import m2
import m1
import config
print(config.x)
The state in which you find config in each module, and really in any other (incl. main.py here), depends on the order in which the imports occurred and on who assigned what value when. Even for a program entirely under your control, this may get confusing (and become a source of mistakes) rather quickly.
For runtime data and for passing information between objects and modules (and your example is really that, not configuration that is predefined and shared between modules), I would suggest you look into describing the information in a custom state (config) object and passing it around through an appropriate interface. But really, just a function / method argument may be all that is needed. The exact form depends on what exactly you're trying to achieve and what your overall design is.
In your example, other.py behaves differently when called or imported before top.py, which may still seem obvious and manageable in a minimal example, but really is not a very sound design. Anyone reading the code (incl. future you) should be able to follow its logic, and this IMO breaks its flow.
The most trivial (and procedural) example of what you've described would be an other.py recreating your current behavior:
def do_stuff(value):
    print(value)  # We did something useful here

if __name__ == "__main__":
    do_stuff(None)  # Could also use config with defaults
And your top.py, presumably being the entry point and orchestrating imports and execution, doing:
import other
x = get_the_value()
other.do_stuff(x)
You can of course introduce an interface to configure do_stuff, perhaps a dict or a custom class, even with a default implementation in config.py:
class Params:
    def __init__(self, x=None):
        self.x = x
and your other.py:
import config

def do_stuff(params=config.Params()):
    print(params.x)  # We did something useful here
And in your top.py you can use:
params = config.Params(get_the_value())
other.do_stuff(params)
But you could also have any use case specific source of value(s):
class TopParams:
    def __init__(self, url):
        self.x = get_value_from_url(url)

params = TopParams("https://example.com/value-source")
other.do_stuff(params)
x could even be a property which you retrieve every time you access it... or lazily when needed and then cached... Again, it really then is a matter of what you need to do.
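For instance, a minimal sketch of the lazily-computed-and-cached variant using the standard library (get_value_from_url is the same placeholder as above):

from functools import cached_property

class TopParams:
    def __init__(self, url):
        self._url = url

    @cached_property
    def x(self):
        # computed on first access only, then cached on the instance
        return get_value_from_url(self._url)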
"Is it bad practice to modify attributes of one module from another module?"
Yes, it is considered bad practice: a violation of the Law of Demeter, which in essence means "talk to friends, not to strangers".
Objects should expose behaviour and functions, but should HIDE the data.
DataStructures should EXPOSE data, but should not have any (exposed) methods. The Law of Demeter does not apply to such DataStructures. OOP purists might cover such DataStructures with setters and getters, but that really adds no value in Python.
There is a lot of literature about that, for example: https://en.wikipedia.org/wiki/Law_of_Demeter
And of course, a must-read: "Clean Code" by Robert C. Martin (Uncle Bob); check it out on YouTube as well.
For procedural programming it is perfectly normal to keep data in a DataStructure which does not have any (exposed) methods.
The procedures in the program work with that data. Consider using the attrs module (see https://www.attrs.org/en/stable/) for easy creation of such classes.
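For example, a minimal sketch with attrs (assuming a recent attrs release that provides the attrs.define API) might look like this:

import attrs

@attrs.define
class ConfXY:
    x: int = 1
    name: str = "default"

conf_xy = ConfXY()
print(conf_xy.x)  # 1 - with type hints, repr and eq generated for you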
My preferred method for keeping config is (here without using attrs):
# conf_xy.py
"""
Config is code - so why use damned parsers, text files, xml, yaml, toml and all that,
if you can just use testable code as config that delivers the correct types, etc.,
as well as hinting in your favorite IDE?
Here, for demonstration, without using the attrs package - usually I use attrs (read the docs).
"""

class ConfXY(object):
    def __init__(self) -> None:
        self.x: int = 1
        self.z: float = get_z_from_input()  # whatever runtime source you use
        ...

conf_xy = ConfXY()

# other.py
from conf_xy import conf_xy
...
y = conf_xy.x * 2
...
I have a small function as follows:
def write_snapshot_backup_monitoring_values():
    try:
        snapshot_backup_result = 'my result'
        with open(config.MONITOR_SNAPSHOT_BACKUP_FILE, "w") as snapshot_backup_file:
            snapshot_backup_file.write(snapshot_backup_result)
    except Exception as exception:
        LOG.exception(exception)
where config.MONITOR_SNAPSHOT_BACKUP_FILE is declared in a config file with the value /home/result.log.
When I try to write a test case using pytest, I call this function as follows:
constants.MONITOR_SNAPSHOT_BACKUP_FILE = "/tmp/result.log"

@pytest.mark.functional_test
def test_write_snapshot_backup_monitoring_values():
    utils.write_snapshot_backup_monitoring_values()...
I want to monkey patch the value of config.MONITOR_SNAPSHOT_BACKUP_FILE with constants.MONITOR_SNAPSHOT_BACKUP_FILE, which I have declared in the test case file. Basically, while running the test case it should create /tmp/result.log and not /home/result.log. How can I do that? I am new to monkey patching in Python.
You don't make clear what config is, so I assume it is another module you have imported. There's no special technique for monkey patching; you just assign the value. It's just a name for adding/modifying attributes at runtime.

config.MONITOR_SNAPSHOT_BACKUP_FILE = constants.MONITOR_SNAPSHOT_BACKUP_FILE

However, there's one thing to keep in mind here: Python caches imported modules. If you change this value, it will change for all other Python modules that have imported config and run in the same runtime. So be careful that you don't cause any side effects.
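If you also want the change undone automatically after each test, pytest's built-in monkeypatch fixture does the same assignment and restores the original value when the test finishes. A small sketch, assuming config and utils are importable from the test file:

import pytest

import config
import utils

@pytest.mark.functional_test
def test_write_snapshot_backup_monitoring_values(monkeypatch):
    # patched only for the duration of this test
    monkeypatch.setattr(config, "MONITOR_SNAPSHOT_BACKUP_FILE", "/tmp/result.log")
    utils.write_snapshot_backup_monitoring_values()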
Occasionally I want lazy module loading in Python. Usually because I want to keep runtime requirements or start-up times low and splitting the code into sub-modules would be cumbersome. A typical use case and my currently preferred implementation is this:
jinja2 = None

class Handler(...):
    ...
    def render_with_jinja2(self, values, template_name):
        global jinja2
        if not jinja2:
            import jinja2
        env = jinja2.Environment(...)
        ...
I wonder: is there a canonical/better way to implement lazy module loading?
There's no reason for you to keep track of imports manually -- the VM maintains a list of modules that have already been imported, and any subsequent attempts to import that module result in a quick dict lookup in sys.modules and nothing else.
The difference between your code and
def render_with_jinja2(self, values, template_name):
    import jinja2
    env = jinja2.Environment(...)
is zero -- when we hit that code, if jinja2 hasn't been imported, it is imported then. If it already has been, execution continues on.
class Handler(...):
    ...
    def render_with_jinja2(self, values, template_name):
        import jinja2
        env = jinja2.Environment(...)
        ...
There's no need to cache the imported module; Python does that already.
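A tiny illustration of that caching, using the standard library's json module:

import sys

import json                    # first import: the module is executed and cached
assert 'json' in sys.modules

import json as json_again      # later imports: just a lookup in sys.modules
assert json_again is sys.modules['json']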
The other answers have covered the actual details but if you are interested in a lazy loading library, check out apipkg which is part of the py package (py.test fame).
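A rough sketch of how apipkg is typically used (mypkg and its submodule here are hypothetical names; see the apipkg docs for the exact semantics):

# mypkg/__init__.py
import apipkg

# the exported names are resolved - and the real modules imported - only on first access
apipkg.initpkg(__name__, {
    'render': {
        'to_html': 'mypkg.rendering:to_html',
    },
})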
Nice pattern from sqlalchemy: dependency injection:
@util.dependencies("sqlalchemy.orm.query")
def merge_result(query, *args):
    # ...
    query.Query(...)
Instead of declaring all "import" statements at the top of the module, it will only import a module when it's actually needed by a function.
This can resolve circular dependency problems.
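util.dependencies is internal to SQLAlchemy, but the idea is easy to sketch yourself. A simplified stand-in (not SQLAlchemy's actual implementation):

import functools
import importlib

def dependencies(*module_names):
    """Resolve the named modules when the function is called and pass them in."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            modules = [importlib.import_module(name) for name in module_names]
            return fn(*modules, *args, **kwargs)
        return wrapper
    return decorator

@dependencies("json")
def dump_payload(json, payload):
    # 'json' is the module object, resolved only when dump_payload() is called
    return json.dumps(payload)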
I want to replace settings.py in my Django system with a module implemented as a directory settings containing __init__.py. This will try to import a module named after the server, thus allowing for per-server settings.
If I don't know the name of a module before I import it then I can't use the import keyword but must instead use the __import__ function. But this does not add the contents of the module to the settings module. I need the equivalent of from MACHINE_NAME import *. Or I need a way to iterate over vars(m) (where m is the loaded module) and add them to the current namespace. But I can't work out how to refer to the current namespace in order to make the assignment. In other words, I can't use setattr(x, ..) or modify x.__dict__, because I don't know what to use for x.
I can't think of much else to try now apart from using exec. This seems a little feeble to me. Am I missing some aspect of Pythonic introspection that would allow me to manipulate the current scope while still in it?
For a similar situation, where I import different messages into a messages.py module based on a lang setting, it is something like this:
# set values in current namespace
for name in vars(messages):
    v = getattr(messages, name)
    globals()[name] = v
Btw, why do you want to create a package for settings? Couldn't whatever you want to do be done directly in settings.py?
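Applied to the per-server settings case, a hypothetical settings/__init__.py could look something like this (the hostname-to-module naming scheme and the default fallback are assumptions, not part of the question):

# settings/__init__.py
import importlib
import socket

# e.g. "web-01.example.com" -> "web_01_example_com"
_machine = socket.gethostname().replace('-', '_').replace('.', '_')

try:
    _mod = importlib.import_module('.' + _machine, package=__name__)
except ImportError:
    _mod = importlib.import_module('.default', package=__name__)

# the equivalent of "from <machine module> import *" into this namespace
globals().update(
    {name: value for name, value in vars(_mod).items() if not name.startswith('_')}
)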