Sharing information in a Django application - Python

I have a very simple model that has data in it that I need to use in various places in my application:
class Setting(models.Model):
    name = models.CharField(max_length=255)  # max_length is required for CharField
    value = models.TextField()
I'd like to be able to load this information into a dictionary, then ship that data around my application so I don't have to make duplicate calls to the database. My attempt at doing so was wrapping the logic in a module like so (the print statement is there for debugging):
my_settings.py
from myapp import models

class Settings:
    __settings = {}

    def __init__(self):
        if not self.__class__.__settings:
            print("===== Loading settings from table =====")
            qs = models.Setting.objects.all()
            for x in qs:
                self.__class__.__settings[x.name] = x.value

    def get(self, key, default=None):
        return self.__class__.__settings.get(key, default)

    def getint(self, key, default=0):
        return int(self.__class__.__settings.get(key, default))
Using this module would then look like the following:
from my_settings import Settings
# Down in some view somewhere...
settings = Settings()
data = settings.get("some_key")
...
# Now we might be in a helper function somewhere, but still in the
# same view context as above. Note that we should not have made
# a database round trip here; we're using our memory store instead.
settings = Settings()
data = settings.get("another_key")
This seems to work fine, but it has the drawback that the data is loaded once (and only once) at the initial instantiation. If any of the data in the settings database table should change, those changes won't be reflected in the corresponding dictionary held by this class.
Is there a better approach here? I don't mind having a single database query per request, but I also don't want to have to pass the dictionary around from function to function. I was hoping a module-level wrapper would get me the "singleton"-ness that I desire, but it's apparently caching things more aggressively than I thought it would.

I would just not worry about it for now; once you go into production, either do blanket caching using cachalot or write your own rough cache for just this model.
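For illustration, a rough cache for just this model might look something like the sketch below; the 60-second TTL and the module layout are assumptions, not part of the original suggestion:

# my_settings.py: a rough time-based cache for the Setting model (sketch)
import time

from myapp import models

_CACHE = {}
_LOADED_AT = 0.0
_TTL_SECONDS = 60  # assumption: settings may be up to a minute stale

def _refresh_if_stale():
    global _CACHE, _LOADED_AT
    if not _CACHE or time.monotonic() - _LOADED_AT > _TTL_SECONDS:
        _CACHE = {s.name: s.value for s in models.Setting.objects.all()}
        _LOADED_AT = time.monotonic()

def get(key, default=None):
    _refresh_if_stale()
    return _CACHE.get(key, default)

Any part of the application can then call my_settings.get("some_key") without passing a dictionary around, and the table is re-read at most once per TTL window.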


Hook every variable reference and execute code

I have a Python app split across different files. One of them, models.py, contains, alongside PyQt5 table models, several maps that are referenced from several PyQt5 form files:
# first lines:
agents_id_map = \
    {agent.name: agent.id for agent in db.session.query(db.Agent, db.Agent.id)}
# ....
# ~2,000 more lines like this
I want to keep these maps centralized in a single place. I'm also using SQLAlchemy; the Agent class is defined in a db.py file. I use these maps to fill in the foreign key on another object, say an invoice, like:
invoice = db.Invoice()
# Here is a reference
invoice.agent_id = models.agents_id_map[agent_combo.currentText()]
# ...
db.session.add(invoice)
db.session.commit()
The problem is that the models.py module gets cached, so several parts of the application access stale data: if one running instance A of the app creates a new agent and another running instance B wants to create a new invoice, B won't see the new Agent created by A unless the app is restarted. The same thing happens if a user in a single running instance creates an agent and then wants to create an invoice. My candidate solutions are:
Reload the module, to get the whole code executed again, but this could be very expensive.
Isolate the code that builds those maps in another file, say maps.py, which would be cheaper to reload, and change all code that references it through refactoring.
Is there a solution that would allow me to touch only the code building those maps and the rest of the application remains ignorant of the change, and every time the map is referenced from another module or even the same, the code gets executed, effectively re-building maps with fresh data?
Certainly: put your maps inside a function, or even better, a class.
If I understand this problem correctly, you have stateful data (maps) which need regenerating under some condition (every time they are accessed? Or just every time the db is updated?). I would do something like this:
class Mappings:
    def __init__(self, db):
        self._db = db
        ...  # do any initial db stuff you need to here

    def id_map(self, thing):
        db_thing = getattr(self._db, thing.title())
        return {x.name: x.id for x in self._db.session.query(db_thing, db_thing.id)}

    def other_property_map(self, prop):
        ...  # etc

mapping = Mappings(db)
mapping.id_map("agent")
This assumes that the mapping example you've given is your major use-case, but this model could easily be adapted for almost any other mapping you might want.
You would write a method for every kind of 'mapping' you need, and it would return the desired dictionary. Note that here I've assumed you handle setting up the db elsewhere and pass a fully initialised db access object to the class, which is probably what you want to do; this class is just about encapsulating mapper state, not re-inventing your ORM.
Caching
I have not provided any caching. But if you have complete control over the db, it is easy enough to run a hook before you do any db commits, check whether you've touched any particular model, and mark the corresponding mappings as needing a rebuild. Something like this:
class DbAccess(Mappings):
    def __init__(self, db, models):
        super().__init__(db)
        self._cached_map = {model: {} for model in models}

    def db_update(self, model: str, params: dict):
        try:
            self._cached_map[model] = {}  # wipe cache
        except KeyError:
            pass
        self._db.update_with_model(model, params)  # dummy fn

    def id_map(self, thing: str):
        try:
            return self._cached_map[thing]["id"]
        except KeyError:
            self._cached_map[thing]["id"] = super().id_map(thing)
            return self._cached_map[thing]["id"]
I don't really think DbAccess should inherit from Mappings---put it all in one class, or have a DB class and a Mappings mixin and inherit from both. I just didn't want to write everything out again.
I've not written any real db access routines (hence my dummy fn), as I don't know how you're doing it (though you are clearly using an ORM). But the basic idea is just to handle the caching yourself: store the mapping every time it is built, and delete the stored mappings whenever you commit a transaction involving the model in question (thus rebuilding the cache as needed); one way of wiring such a hook is sketched below.
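Since the question already uses a SQLAlchemy session, that commit hook could be attached with SQLAlchemy's session events. This is only a sketch; invalidate_cache_for is a hypothetical stand-in for whatever cache-wiping method you end up with (e.g. DbAccess wiping its _cached_map entry):

from itertools import chain

from sqlalchemy import event
from sqlalchemy.orm import Session

@event.listens_for(Session, "before_commit")
def _wipe_touched_model_caches(session):
    # Names of the model classes touched in this transaction
    # (new, modified, or deleted rows).
    touched = {type(obj).__name__
               for obj in chain(session.new, session.dirty, session.deleted)}
    for model_name in touched:
        invalidate_cache_for(model_name)  # hypothetical cache-wiping callback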
Aside
Note that if you really do have 2,000 lines of manually declared mappings of the form thing.name: thing.id, you really should generate them at runtime anyhow. Declarative is all very well and good, but writing out 2,000 permutations of the same thing isn't declarative, it's just time-consuming, and it's a job a simple loop could do for you at startup, putting the data in RAM; see the sketch below.
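For illustration, such a startup loop might look like this; the particular model classes listed are hypothetical examples, not taken from the question:

# Build name -> id maps for several models at startup instead of
# declaring each dictionary by hand (model names below are hypothetical).
id_maps = {}
for model in (db.Agent, db.Customer, db.Product):
    id_maps[model.__name__] = {
        row.name: row.id for row in db.session.query(model.name, model.id)
    }

# Usage then mirrors the original module-level dicts, e.g.:
# invoice.agent_id = id_maps["Agent"][agent_combo.currentText()]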

Where in Django can I run startup code to load data?

I have a Django application that loads a huge array into memory after Django is started, for further processing.
What I do now is, after loading a specific view, execute the code that loads the array as follows:
try:
    load_model = load_model_func()
except:
    # Some exception
    pass
The code is now very bad. I want to create a class that loads this model once after Django starts, and I want to be able to access this model in all other methods of the application.
So, is there a good practice for loading data into memory after Django starts?
@Ken4scholar's answer is a good response in terms of "good practice for loading data AFTER Django starts". It doesn't really address storing/accessing it.
If your data doesn't expire/change unless you reload the application, then using cache as suggested in comments is redundant, and depending on the cache storage you use you might end up still pulling from some db, creating overhead.
You can simply store it in-memory, like you are already doing. Adding an extra class to the mix won't really add anything, a simple global variable could do.
Consider something like this:
# my_app/heavy_model.py
data = None

def load_model():
    global data
    data = load_your_expensive_model()

def get_data():
    if data is None:
        raise Exception("Expensive model not loaded")
    return data
and in your config (you can learn more about applications and config here):
# my_app/app_config.py
from django.apps import AppConfig

from my_app.heavy_model import load_model

class MyAppConfig(AppConfig):
    # ...
    def ready(self):
        load_model()
then you can simply call my_app.heavy_model.get_data directly in your views.
⚠️ You can also access the global data variable directly, but a wrapper around it may be nicer to have (also in case you want to create more abstractions around it later).
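For example, a view that reads the preloaded data might look like this (a minimal sketch; the view name and response shape are illustrative only, not part of the original answer):

# my_app/views.py (sketch)
from django.http import JsonResponse

from my_app.heavy_model import get_data

def model_summary(request):
    data = get_data()  # raises if ready() has not loaded the model yet
    return JsonResponse({"size": len(data)})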
You can run a startup script when apps are initialized. To do this, call the script in the ready method of your Config class in apps.py like this:
class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        <import and run script>
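For instance, assuming the startup logic lives in a hypothetical myapp/startup.py module exposing a run() function, the hook could look like this:

# myapp/apps.py (sketch; myapp.startup and its run() are hypothetical)
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        from myapp import startup  # imported here to avoid app-registry issues
        startup.run()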

Mocking elasticsearch-py calls

I'm writing a CLI to interact with elasticsearch using the elasticsearch-py library. I'm trying to mock elasticsearch-py functions in order to test my functions without calling my real cluster.
I read this question and this one but I still don't understand.
main.py
Escli inherits from cliff's App class
import elasticsearch5
from cliff.app import App

class Escli(App):
    _es = elasticsearch5.Elasticsearch()
settings.py
from escli.main import Escli

class Settings:
    def get(self, sections):
        raise NotImplementedError()

class ClusterSettings(Settings):
    def get(self, setting, persistency='transient'):
        settings = Escli._es.cluster \
            .get_settings(include_defaults=True, flat_settings=True) \
            .get(persistency) \
            .get(setting)
        return settings
settings_test.py
from unittest import TestCase
from unittest.mock import patch

import escli.settings

class TestClusterSettings(TestCase):
    def setUp(self):
        self.patcher = patch('elasticsearch5.Elasticsearch')
        self.MockClass = self.patcher.start()

    def test_get(self):
        # Note this is an empty dict to show my point;
        # it will contain child dicts to allow my .get(persistency).get(setting)
        self.MockClass.return_value.cluster.get_settings.return_value = {}
        cluster_settings = escli.settings.ClusterSettings()
        ret = cluster_settings.get('cluster.routing.allocation.node_concurrent_recoveries', persistency='transient')
        # ret should contain a subset of my dict defined above
# ret should contain a subset of my dict defined above
I want Escli._es.cluster.get_settings() to return what I want (a dict object) so that the real HTTP call is not made, but it keeps making it.
What I know:
In order to mock an instance method I have to do something like
MagicMockObject.return_value.InstanceMethodName.return_value = ...
I cannot patch Escli._es.cluster.get_settings because Python tries to import Escli as a module, which cannot work. So I'm patching the whole lib.
I desperately tried to put some return_value everywhere but I cannot understand why I can't mock that thing properly.
You should be mocking with respect to where you are testing. Based on the example provided, this means that the Escli class you are using in the settings.py module needs to be mocked with respect to settings.py. So, more practically, your patch call would look like this inside setUp instead:
self.patcher = patch('escli.settings.Escli')
With this, you are now mocking what you want in the right place based on how your tests are running.
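As a rough illustration (not part of the original answer), the test could then configure the mocked class attribute along these lines, assuming the settings shape used by ClusterSettings above:

from unittest import TestCase
from unittest.mock import patch

import escli.settings

class TestClusterSettings(TestCase):
    def setUp(self):
        self.patcher = patch('escli.settings.Escli')
        self.mock_escli = self.patcher.start()
        self.addCleanup(self.patcher.stop)

    def test_get(self):
        # The mocked class attribute _es is itself a Mock, so the whole call
        # chain can be pointed at a plain dict.
        self.mock_escli._es.cluster.get_settings.return_value = {
            'transient': {'cluster.routing.allocation.node_concurrent_recoveries': '2'}
        }
        ret = escli.settings.ClusterSettings().get(
            'cluster.routing.allocation.node_concurrent_recoveries')
        self.assertEqual(ret, '2')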
Furthermore, to add more robustness to your testing, you might want to consider speccing for the Elasticsearch instance you are creating in order to validate that you are in fact calling valid methods that correlate to Elasticsearch. With that in mind, you can do something like this, instead:
self.patcher = patch('escli.settings.Escli', Mock(Elasticsearch))
To read a bit more about what exactly is meant by spec, check the patch section in the documentation.
As a final note, if you are interested in exploring the great world of pytest, there is a pytest-elasticsearch plugin created to assist with this.

Flask: Calling a class that takes a Resource

I have an endpoint that looks like:
api.add_resource(UserForm,'/app/user/form/<int:form_id>', endpoint='user_form')
My UserForm looks like:
class UserForm(Resource):
    def get(self, form_id):
        # GET stuff here
        return user_form_dictionary
Suppose I have a function called get_user_form(form_id) and I want to retrieve the return value of UserForm's get method based on the form_id parameter passed in. Is there a way in Flask to call UserForm's get method from within the program?
def get_user_form(form_id):
    user_form_dictionary = ...  # some way to call the UserForm class
    # user_form_dictionary will store the dictionary returned by
    # UserForm's get, something like: {'a': 'blah', 'b': 'blah'}
I'm not sure if there is a way to directly access the get method of the UserForm class from within your app; the only thing that springs to mind is to call the URL for that resource, but I don't recommend doing that.
Are you using the flask-restful extension by any chance? If so, the below is based on the suggested intermediate project structure from their site here.
In a common module (this contains functions that will be used throughout your application)
common\util.py
def get_user_form(form_id):
    ...  # logic to return the form data
Then, in the .py file that contains the UserForm class, import the util.py file from the common module and do the below:
class UserForm(Resource):
    def get(self, form_id):
        user_form_dictionary = get_user_form(form_id)
        # any additional logic. I try to keep it to a minimum as the function called
        # would contain it. Also, this way maintenance is easier.
        return user_form_dictionary
Then somewhere else in your app after importing the common module you can reuse the same function(s).
def another_function(form_id):
    user_form_dictionary = get_user_form(form_id)
    # any additional logic.
    # same rules as before
    return user_form_dictionary

python in Google App Engine - having the class-instead-of-instance problem, but using db.Model so can't create init method

I am writing an app to compare products, using Python and GAE. The products will belong to a set of similar products, and the app calculates the best value in each set.
When I create a new product, it can be added to an existing set or a new set can be created.
When testing the app, the first set gets created just fine. I populate an instance of the set with the name of the product. I use a form on one web page to POST the data into the "suppbook" page. I'm still not clear on how a web page can be a class but that's a different question.
There's more code around all of this but I'm trying to make my question as clear as possible.
class Supp(db.Model):
    name = db.StringProperty(multiline=False)
    # a bunch of other attributes using Google's DB Model

class SuppSet(db.Model):
    name = db.StringProperty(default='')
    supp_list = set([])
    # a bunch of other attributes using Google's DB Model

    # i tried to add this after reading a few questions on SO but GAE doesn't like it
    def __init__(self,):
        self.name = 'NoName'
        self.best_value = 'NoBestValue'
        self.supp_list = set([])
Class Suppbook(webapp.RequestHandler):
    def post(self):
        supp = Supp()
        suppSet = SuppSet()
        ...
        supp.name = self.request.get('name')
        supp.in_set = self.request.get('newset')
        suppSet.name = supp.in_set
        suppSet.supp_list.add(supp.name)
        self.response.out.write('%s now contains %s<p>' % (suppSet.name, suppSet.supp_list))
This works well the first time around, and if I only use one SuppSet, I can add many supps to it. If I create another SuppSet, though, both suppSets will have the same contents for their supp_list. I have been looking through the questions on here and I think (know) I'm doing something wrong regarding class vs. instance attribute access. I tried to create an __init__ method for SuppSet but GAE complained: AttributeError: 'SuppSet' object has no attribute '_entity'
Also, I am using the GAE datastore to put() and get() the Supps and SuppSets, so I'm not clear why I'm not acting on the unique instances that I should be pulling from the DB.
I am not sure if I am providing enough information but I wanted to get started on this issue. Please let me know if more info is needed to help debug this.
I'm also open to the idea that i'm going about this completely wrong. I'm considering re-writing the whole thing, but I'm so close to being "finished" with basic functionality that I'd like to try to solve this issue.
Thanks
In your __init__ you will need to call the superclass's __init__; db.Model has some important stuff to do in its __init__, and you will have to match its signature.
However, you likely shouldn't be setting up things like defaults in there. Try to just use the datastore Properties' ability to set a default, as in the sketch below.
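For illustration (a sketch, not the question's actual model), defaults can be declared on the Properties themselves so no __init__ override is needed:

from google.appengine.ext import db

class SuppSet(db.Model):
    # Defaults live on the Properties, so no custom __init__ is required.
    name = db.StringProperty(default='NoName')
    best_value = db.StringProperty(default='NoBestValue')
    supp_list = db.StringListProperty()  # a per-entity list, not a shared class-level set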
You've got some (I assume) typos in your code. Python is sensitive to case and white-space. The attribute names you use also don't match your defs, such as in_set. When possible, post actual working examples demonstrating your problem.
class Supp(db.Model):
    name = db.StringProperty(multiline=False)
    in_set = db.StringProperty(multiline=False)
    # your other stuff ...

class SuppSet(db.Model):
    name = db.StringProperty(default='')
    supp_list = db.StringListProperty()
    # your other stuff ...

    # In Python, you need to explicitly call the parent's __init__ with your args.
    # Note that this is NOT needed here.
    def __init__(self, **kwargs):
        db.Model.__init__(self, **kwargs)

class Suppbook(webapp.RequestHandler):
    def post(self):
        # This will create a NEW Supp and SuppSet every request,
        # it won't fetch anything from the datastore.
        # These are also NOT needed (included for explanation)
        supp = Supp()
        suppSet = SuppSet()

        # It sounds like you want something like:
        product_name = self.request.get('name')
        product_set = self.request.get('newset')

        # check for missing name / set:
        if not product_name or not product_set:
            # handle the error
            self.error(500)
            return

        # Build the keys and batch fetch.
        supp_key = db.Key.from_path('Supp', product_name)
        suppset_key = db.Key.from_path('SuppSet', product_set)
        supp, suppset = db.get([supp_key, suppset_key])

        if not supp:
            supp = Supp(key_name=product_name,
                        name=product_name)
        if not suppset:
            suppset = SuppSet(key_name=product_set,
                              name=product_set)

        # Update the entities
        supp.in_set = product_set
        if product_name not in suppset.supp_list:
            suppset.supp_list.append(product_name)

        # Batch put...
        db.put([supp, suppset])
        self.response.out.write('%s now contains %s<p>' % (suppset.name, str(suppset.supp_list)))
