How (and when/where) to init the Django cache? - python

I'm trying to use the Django LocMemCache for storing a few simple values, but I'm not sure how to initialize the cache when Django starts. Django 1.7 comes with Applications, allowing you to run some code in AppConfig.ready(). That place would be perfect to initialize the cache, but according to the Django 1.7 documentation:
" ... Although you can access model classes as described above, avoid
interacting with the database in your ready() implementation."
So, let's say I want to store the results of some DB queries on my model:
x = MyModel.objects.count()
y = MyModel.(a really expensive query)
How and when should I init the cache? Is there a recommended "best practice" for doing that?
Currently, I have just added the following cache.py to my application, but I'm not sure whether my code hits the database only once (on the first request) and then uses the cached value for the following requests.
# cache.py
from django.core.cache import caches
from .models import MyModel

class Cache(object):
    def __init__(self):
        self.__count = MyModel.objects.count()
        self.cache = caches['cache-storage']

    @property
    def total_count(self):
        return self.cache.get('total_count', self.__count)
Then I use the cached values in this way:
# view.py
from .cache import Cache

cache = Cache()
...
(some view)
counter = cache.total_count
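Note that, as written, nothing ever calls cache.set(), so cache.get('total_count', self.__count) always returns the fallback computed in __init__. A variant that actually stores the value on first access might look like this (a sketch; 'cache-storage' is the alias from the question):
# cache.py
from django.core.cache import caches
from .models import MyModel

class Cache(object):
    def __init__(self):
        self.cache = caches['cache-storage']

    @property
    def total_count(self):
        count = self.cache.get('total_count')
        if count is None:
            # First access: hit the database once, then store the result.
            count = MyModel.objects.count()
            self.cache.set('total_count', count)
        return count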

That tip about using databases in ready() is generally good advice, but it isn't always going to work out. Some applications need to access the database on a schedule rather than in response to a view. The only way I've seen to implement that is from within ready().
This does create an issue, because all manage.py commands run the code in ready(). That means the application would be accessing the database with your runtime code while running migrations or creating superusers, which isn't ideal.
I avoid this issue by adding a sys.argv check in ready().
import sys
from django.apps import AppConfig

class MyApp(AppConfig):
    name = 'MyApp'

    def ready(self):
        if 'runserver' in sys.argv:
            pass  # your code here
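Tying this back to the original LocMemCache question, a minimal sketch of warming the cache from ready() could look like the following (assuming the 'cache-storage' alias from the question is configured in CACHES; note that LocMemCache is per-process, so each worker warms its own copy):
import sys
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        if 'runserver' in sys.argv:
            # Import here, not at module level, so the app registry is ready.
            from django.core.cache import caches
            from .models import MyModel
            cache = caches['cache-storage']
            # One DB hit at startup; timeout=None means "cache forever".
            cache.set('total_count', MyModel.objects.count(), timeout=None)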

Related

Where in Django can I run startup code to load data?

I have a Django application that loads a huge array into memory after Django starts, for further processing.
What I do now is, after loading a specific view, I execute the code to load the array as follows:
try:
    load_model = load_model_func()
except:
    # Some exception
    pass
The code is now very bad. I want to create a class that loads this model once after starting Django, and I want to be able to get this model in all other methods of the application.
So, is there a good practice for loading data into memory after Django starts?
@Ken4scholar's answer is a good response in terms of "good practice for loading data AFTER Django starts". It doesn't really address storing/accessing the data, though.
If your data doesn't expire/change unless you reload the application, then using a cache as suggested in the comments is redundant, and depending on the cache storage you use you might still end up pulling from some db, creating overhead.
You can simply store it in-memory, like you are already doing. Adding an extra class to the mix won't really add anything; a simple global variable could do.
Consider something like this:
# my_app/heavy_model.py
data = None

def load_model():
    global data
    data = load_your_expensive_model()

def get_data():
    if data is None:
        raise Exception("Expensive model not loaded")
    return data
and in your config (you can learn more about applications and config here):
# my_app/app_config.py
from django.apps import AppConfig
from my_app.heavy_model import load_model

class MyAppConfig(AppConfig):
    # ...
    def ready(self):
        load_model()
then you can simply call my_app.heavy_model.get_data directly in your views.
⚠️ You can also access the global data variable directly, but a wrapper around it may be nicer to have (also in case you want to create more abstractions around it later).
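A quick usage sketch in a view, using the names from the snippet above (the JsonResponse view is just for illustration):
# my_app/views.py
from django.http import JsonResponse
from my_app.heavy_model import get_data

def model_info(request):
    data = get_data()  # raises if load_model() never ran
    return JsonResponse({'size': len(data)})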
You can run a startup script when apps are initialized. To do this, call the script in the ready method of your Config class in apps.py, like this:
class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # <import and run script>
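A concrete version of that placeholder might look like this (a sketch; the startup module and its run() function are hypothetical):
from django.apps import AppConfig

class MyAppConfig(AppConfig):
    name = 'myapp'

    def ready(self):
        # Imported here so the script runs once, after the app registry is ready.
        from . import startup  # hypothetical module containing the startup code
        startup.run()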

Django 1.10 seed database without using fixtures

So I've looked at the documentation, as well as this SO question, and the django-seed package, but none of these seem to fit what I'm trying to do.
Basically, I want to programmatically seed my Games model from an external API, but all the information I can find seems to be reliant on generating a fixture first, which seems like an unnecessary step.
For example, in Ruby/Rails you can write directly to seed.rb and seed the database in any way that's desired.
Is similar functionality available in Django, or do I need to generate the fixture first from the API and then import it?
You can use a data migration for this. First create an empty migration for your app:
$ python manage.py makemigrations yourappname --empty
In your empty migration, create a function to load your data and add a migrations.RunPython operation. Here's a modified version of the one from the Django documentation on migrations:
from __future__ import unicode_literals
from django.db import migrations

def stream_from_api():
    ...

def load_data(apps, schema_editor):
    # We can't import the Person model directly as it may be a newer
    # version than this migration expects. We use the historical version.
    Person = apps.get_model('yourappname', 'Person')
    for item in stream_from_api():
        person = Person(first=item['first'], last=item['last'], age=item['age'])
        person.save()

class Migration(migrations.Migration):
    dependencies = [('yourappname', '0009_something')]
    operations = [migrations.RunPython(load_data)]
If you have a lot of simple data, you might benefit from the bulk-creation methods:
from __future__ import unicode_literals
from django.db import migrations

def stream_from_api():
    ...

def load_data(apps, schema_editor):
    # We can't import the Person model directly as it may be a newer
    # version than this migration expects. We use the historical version.
    Person = apps.get_model('yourappname', 'Person')

    def stream_people():
        for item in stream_from_api():
            yield Person(first=item['first'], last=item['last'], age=item['age'])

    # Adjust (or remove) the batch size depending on your needs.
    # You won't be able to use this method if your objects depend on one another.
    Person.objects.bulk_create(stream_people(), batch_size=10000)

class Migration(migrations.Migration):
    dependencies = [('yourappname', '0009_something')]
    operations = [migrations.RunPython(load_data)]
Migrations have the added benefit of being automatically enclosed in a transaction, so you can stop the migration at any time and it won't leave your database in an inconsistent state.
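The data migration then runs like any other migration:
$ python manage.py migrate yourappname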
Would it work for you to write a classmethod on the Games model that creates the data? Presumably this method would query the external API, package up the Games() objects into a list called games, and then use Games.objects.bulk_create(games) to insert them into the database.
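A rough sketch of that classmethod approach (the Games field and stream_from_api() are placeholders, not part of the original question):
# games/models.py
from django.db import models

def stream_from_api():
    # Placeholder for the external API call.
    return []

class Games(models.Model):
    title = models.CharField(max_length=200)

    @classmethod
    def seed_from_api(cls):
        # Build unsaved instances from the API payload, then insert in bulk.
        games = [cls(title=item['title']) for item in stream_from_api()]
        cls.objects.bulk_create(games)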

Cross module variable sharing without duplicating imports

I have a module models.py with some data logic:
db = PostgresqlDatabase(database='database', user='user')
# models logic
and a Flask app which actually interacts with the database:
from models import db, User, ...
But I want to move all the initialization settings into one config file in the Flask app,
so that I can separate importing db from the other stuff (I need this to access the module variable db in models):
import models
from models import User, ...
app.config.from_object(os.environ['APP_SETTINGS'])
models.db = PostgresqlDatabase(database=app.config['db'],
                               user=app.config['db'])
and from then on use db as models.db.
But that seems kind of ugly: duplicated imports, inconsistent access to module-level state...
Is there a better way to handle this situation?
I'd recommend one level of indirection, so that your code becomes like this:
import const
import runtime

def foo():
    runtime.db.execute("SELECT ... LIMIT ?", (..., const.MAX_ROWS))
You get:
- clear separation of the leaf module const
- mocking and/or reloading is possible
- uniform and concise access in all user modules
To get a rich API on runtime, use the "replace module with object at import" trick (see __getattr__ on a module).
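A minimal sketch of those two modules under the question's assumptions (peewee's PostgresqlDatabase, with db bound exactly once at startup):
# const.py -- leaf module, safe to import from anywhere
MAX_ROWS = 1000

# runtime.py -- holds objects created at startup
from peewee import PostgresqlDatabase

db = None

def init(database, user):
    # Called once from the app entry point; rebinds the module-level db.
    global db
    db = PostgresqlDatabase(database, user=user)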

How to get Registry().settings during Pyramid app startup time?

I am used to developing web applications on Django and gunicorn.
In the case of Django, any application module in a Django project can get deployment settings through django.conf.settings. The "settings.py" is written in Python, so any arbitrary settings and pre-processing can be defined dynamically.
In the case of gunicorn, it has three configuration places, in order of precedence, and one settings registry class instance combines them (but usually these settings are used only for gunicorn, not the application):
1. Command line parameters.
2. Configuration file (like Django, written in Python, which can have arbitrary settings defined dynamically).
3. Paster application settings.
In the case of Pyramid, according to the Pyramid documentation, deployment settings are usually put into pyramid.registry.Registry().settings. But it seems they can only be accessed once a pyramid.router.Router() instance exists.
That is, pyramid.threadlocal.get_current_registry().settings returns None during the startup process in an application's "main.py".
For example, I usually define some business logic in SQLAlchemy model modules, which requires deployment settings, as follows.
myapp/models.py
from sqlalchemy import Table, Column, Types
from sqlalchemy.orm import mapper
from pyramid.threadlocal import get_current_registry
from myapp.db import session, metadata

settings = get_current_registry().settings

mytable = Table('mytable', metadata,
    Column('id', Types.INTEGER, primary_key=True),
    (other columns)...
)

class MyModel(object):
    query = session.query_property()
    external_api_endpoint = settings['external_api_uri']
    timezone = settings['timezone']

    def get_api_result(self):
        (interact with external api ...)

mapper(MyModel, mytable)
But "settings['external_api_endpoint']" raises a TypeError exception, because settings is None.
I thought of two solutions.
Define a callable in "models.py" which accepts a "config" argument, and have "main.py" call it with a Configurator() instance.
myapp/models.py
from sqlalchemy import Table, Column, Types
from sqlalchemy.orm import mapper
from myapp.db import session, metadata

_g = globals()

def initialize(config):
    settings = config.get_settings()
    mytable = Table('mytable', metadata,
        Column('id', Types.INTEGER, primary_key=True),
        (other columns ...)
    )

    class MyModel(object):
        query = session.query_property()
        external_api_endpoint = settings['external_api_endpoint']

        def get_api_result(self):
            (interact with external api)...

    mapper(MyModel, mytable)
    _g['MyModel'] = MyModel
    _g['mytable'] = mytable
Or, create an empty module "app/settings.py", and put settings into it later.
myapp/__init__.py
from pyramid.config import Configurator
from .resources import RootResource

def main(global_config, **settings):
    config = Configurator(
        settings=settings,
        root_factory=RootResource,
    )
    import myapp.settings
    myapp.settings.settings = config.get_settings()
    (other configurations ...)
    return config.make_wsgi_app()
Both of these (and other) solutions meet the requirements, but they feel cumbersome. What I want is the following.
development.ini
defines rough settings, because development.ini can hold only string-type constants.
[app:myapp]
use = egg:myapp
env = dev0
api_signature = xxxxxx
myapp/settings.py
defines detailed settings based on development.ini, because arbitrary variables (of any type) can be set.
import datetime, urllib
from pytz import timezone
from pyramid.threadlocal import get_current_registry

pyramid_settings = get_current_registry().settings

if pyramid_settings['env'] == 'production':
    api_endpoint_uri = 'http://api.external.com/?{0}'
    timezone = timezone('US/Eastern')
elif pyramid_settings['env'] == 'dev0':
    api_endpoint_uri = 'http://sandbox0.external.com/?{0}'
    timezone = timezone('Australia/Sydney')
elif pyramid_settings['env'] == 'dev1':
    api_endpoint_uri = 'http://sandbox1.external.com/?{0}'
    timezone = timezone('Asia/Tokyo')

api_endpoint_uri = api_endpoint_uri.format(urllib.urlencode({'signature': pyramid_settings['api_signature']}))
Then, other modules can get arbitrary deployment settings through "import myapp.settings".
Or, if Registry().settings is preferable to "settings.py", the **settings kwargs and "settings.py" may be combined and registered into Registry().settings during the "main.py" startup process.
Anyway, how do I get the settings dictionary during startup time? Or does Pyramid gently force us to put every piece of code which requires deployment settings into "view" callables, which can get the settings dictionary anytime through request.registry.settings?
EDIT
Thanks, Michael and Chris.
I at last understand why Pyramid uses threadlocal variables (registry and request), and in particular the registry object, to support more than one Pyramid application.
In my opinion, however, deployment settings usually affect business logic that may define application-specific behavior. That logic is usually put in one or more Python modules other than "app/__init__.py" or "app/views.py", which can easily get access to Config() or Registry(). Those Python modules are normally "global" at the Python process level.
That is, even when more than one Pyramid application coexists, despite their own threadlocal variables, they have to share those "global" Python modules, which may contain application-specific state at the Python process level.
Of course, every such module can have an "initialize()" callable which the application's "main" callable invokes with a Configurator(), or passing a Registry() or Request() object through a long series of function calls can meet the usual requirements. But I guess Pyramid beginners (like me), or developers who have a large application or very many settings, may find that troublesome, although it is Pyramid's design.
So, I think Registry().settings should hold only real "thread-local" variables, and should not hold normal business-logic settings. Responsibility for segregating application-specific modules, classes, callables, variables, etc. should be taken by the developer.
As of now, from my viewpoint, I will take Chris's answer. Or, in the "main" callable, do "execfile('settings.py', settings, settings)" and put the result in some "global" space.
Another option, if you enjoy global configuration via Python: create a settings.py file. If it needs values from the ini file, parse the ini file and grab them out (at module scope, so it runs at import time):
from paste.deploy.loadwsgi import appconfig

config = appconfig('config:development.ini', 'myapp', relative_to='.')

if config['env'] == 'production':
    api_endpoint_uri = 'http://api.external.com/?{0}'
    timezone = timezone('US/Eastern')
# .. and so on ...
'config:development.ini' is the name of the ini file (prefixed with 'config:'). 'myapp' is the section name in the config file representing your app (e.g. [app:myapp]). "relative_to" is the directory name in which the config file can be found.
The pattern that I use is to pass the Configurator to modules that need to be initialized. Pyramid doesn't use any global variables because a design goal is to be able to run multiple instances of Pyramid in the same process. The threadlocals are global, but they are local to the current request, so different Pyramid apps can push to them at the same time from different threads.
With this in mind, if you do want a global settings dictionary you'll have to take care of that yourself. You could even push the registry onto the threadlocal manager yourself by calling config.begin().
I think the major thing to take away here is that you shouldn't be calling get_current_registry() at the module level, because at import time you aren't really guaranteed that the threadlocals are initialized. However, if you call get_current_registry() in your init_model() function, you'd be fine, provided you previously called config.begin().
Sorry this is a little convoluted, but it's a common question and the best answer is: pass the configurator to your submodules that need it and allow them to add stuff to the registry/settings objects for use later.
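A minimal sketch of that pattern (module and function names are illustrative):
# myapp/__init__.py
from pyramid.config import Configurator

def main(global_config, **settings):
    config = Configurator(settings=settings)
    from myapp import models
    models.initialize(config)  # the submodule reads config.get_settings() here
    return config.make_wsgi_app()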
Pyramid uses static configuration via PasteDeploy, unlike Django.
Your [EDIT] part is a nice solution; I think the Pyramid community should consider such usage.

Call method dynamically if it exists in Django

I'm creating a website based on Django (I know it's pure Python, so maybe this could also be answered by people who know Python well) and I need to call some methods dynamically.
For example, I have a few applications (modules) in my website with a "do_search()" method in their views.py. Then I have one module called, for example, "search", and there I want an action which can call all the existing "do_search()" methods in the other applications. Of course I don't want to add each application to the imports and call the methods directly; I need a better way to do it dynamically.
Can I read the INSTALLED_APPS variable from settings and somehow run through all of the installed apps and look for the specific method? A piece of code would help a lot here :)
Thanks in advance!
Ignas
I'm not sure if I truly understand the question, but please clarify in a comment to my answer if I'm off.
# search.py
searchables = []

def search(search_string):
    return [s.do_search(search_string) for s in searchables]

def register_search_engine(searchable):
    if hasattr(searchable, 'do_search'):
        # you want to see if this is callable also
        searchables.append(searchable)
    else:
        # raise some error perhaps
        pass

# views.py
def do_search(search_string):
    # search somehow, and return result
    pass

# models.py
# You need to ensure this method runs before any attempt at searching can begin,
# e.g. in models.py if this app is within INSTALLED_APPS. The reason being that
# this module may not have been imported before the call to search.
import search
from views import do_search
search.register_search_engine(do_search)
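Usage would then be a single call, fanning out to every registered engine:
import search

results = search.search('django')  # one result set per registered do_search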
As for where to register your search engine, there is some helpful documentation in the signals docs for Django which relates to this:
You can put signal handling and registration code anywhere you like. However, you'll need to make sure that the module it's in gets imported early on so that the signal handling gets registered before any signals need to be sent. This makes your app's models.py a good place to put registration of signal handlers.
So your models.py file should be a good place to register your search engine.
Alternative answer that I just thought of:
In your settings.py, you can have a setting that declares all your search functions. Like so:
# settings.py
SEARCH_ENGINES = ('app1.views.do_search', 'app2.views.do_search')

# search.py
from django.conf import settings
from django.utils import importlib  # on modern Django, use the stdlib importlib instead

def search(search_string):
    search_results = []
    for engine in settings.SEARCH_ENGINES:
        i = engine.rfind('.')
        module, attr = engine[:i], engine[i+1:]
        mod = importlib.import_module(module)
        do_search = getattr(mod, attr)
        search_results.append(do_search(search_string))
    return search_results
This works somewhat similarly to registering MIDDLEWARE_CLASSES and TEMPLATE_CONTEXT_PROCESSORS. The above is all untested code, but if you look around the Django source, you should be able to flesh it out and remove any errors.
If you can import the other applications through
import other_app
then it should be possible to perform
method = getattr(other_app, 'do_' + method_name)
result = method()
However your approach is questionable.
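Since the title asks about calling the method only if it exists, a guarded variant might look like this (a sketch; the helper name is illustrative):
import importlib

def call_if_exists(module_name, method_name, *args):
    mod = importlib.import_module(module_name)
    method = getattr(mod, 'do_' + method_name, None)
    if callable(method):
        return method(*args)
    return None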
