global counter in Django Application? - python

I was wondering if there is "global counter" in Django application, like the way I store "global counter" in Servlet Context scope in Tomcat.
something like
getServletContext().getAttribute("counter");
counter++;

When you write a django application (or any wsgi application, for that matter), you don't know beforehand if your application will end up running standalone on a single server, or multithreaded, or multiprocessed, or even in multiple separate machines as part of a load balancing strategy.
If you're going to make the constraint "my application only works on single-process servers" then you can use something like this:
from django import settings
settings.counter += 1
However that constraint is often not feasible. So you must use external storage to your counter.
If you want to keep it on memory, maybe a memcached
Maybe you just log the requests to this view. So when you want the counter just count the number of entries in the log.
The log could be file-based, or it could be a table in the database, just define a new model on your models.py.

Related

Does python with wsgi (uwsgi) under nginx have some small default cache?

In my small web-site I feel need to make some data widely available, to avoid exchanging with database for every request made. E.g. this could be the list of current users show in the bottom of every page or the time of last update of ranking.
The stuff works in Python (Flask) running upon nginx + uwsgi (this docker image).
I wonder, do I have some small cache or shared memory for keeping such information "out of the box", or I need to take care of explicitly setting up some dedicated cache? Or perhaps some thing like this is provided by nginx?
alternatively I still can use database for it has its own cache I think, anyway
Sorry if question seems to be naive/silly - for I come from java world (where things a bit different as we serve all requests with one fat instance of java application) - and have some difficulty grasping what powers does wsgi/uwsgi provide. Thanks in advance!
Firstly, nginx has cache:
https://www.nginx.com/blog/nginx-caching-guide/
But for flask cacheing you also have options:
https://pythonhosted.org/Flask-Cache/
http://flask.pocoo.org/docs/1.0/patterns/caching/
Did you have a look at caching section from Flask docs?
It literally says:
Flask itself does not provide caching for you, but Werkzeug, one of the libraries it is based on, has some very basic cache support
You create a cache object once and keep it around, similar to how Flask objects are created. If you are using the development server you can create a SimpleCache object, that one is a simple cache that keeps the item stored in the memory of the Python interpreter:
from werkzeug.contrib.cache import SimpleCache
cache = SimpleCache()
-- UPDATE --
Or you could solve on the frontend side storing data in the web browser local storage.
If there's nothing in the local storage you call the DB, else you use the information from local storage rather than making db call.
Hope it helps.

What is a good way to organize your models, connections if one wants to use SQLAlchemy to connect several databases to various applications?

Background:
This is the situation I am facing and so far my current solution seems rather clunky. I want to improve on it. Right now:
I setup connections to each database in the main function of the Pyramid application:
def main(global_config, **settings):
a_engine = engine_from_config(settings, 'A.')
b_engine = engine_from_config(settings, 'B.')
ASession.configure(bind=a_engine)
BSession.configure(bind=b_engine)
"ASession" and "BSession" are simply globally defined scoped_session in /models/init.py.
ASession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
I define model base class like so. For example:
ABase = declarative_base()
class user(ABase):
id = Column(Integer, primary_key=True)
name = Column(String)
This somehow already doesn't feel very clean. But now that this model is supposed to be accessed from a different application, I also need to define the engine and connection again in that application. This feels extremely redundant.
Problem Abstracted:
Assume that there are 2 different databases:
A and B
Also assume that you want A and B to be accessible from 2 different applications (e.g.: Pyramid application, Bokeh Server App which uses Tornado) using the same model.
In short, how would one best pattern objects/models/classes/functions to produce clean non-redundant code in Python3?
Initial Thought After The Question Was Posted:
Thinking about this a bit more, I think I want each model to be somehow "self-contained". The model should bring with it methods for initiating connections. In other words, the initiation of db connections should be decoupled from the web application itself.
And it should be done in an instance kind of manner. So that multiple applications can use the same models. Each application would have its own session connection to either DB.
How would the community pattern this? Friday afternoons don't lend themselves to find answers to these kinds of questions for myself at least.
I have done this. My recommendation below is how I like doing it, but is not the only way. I would ditch scoped sessions and the transaction manager and make explicit session management objects, with request lifecycle callbacks handling creation, closing, committing, or rolling back your sessions. Basically scoped sessions are a way to simulate a global by getting the same item for that thread of execution. The other way to do this in Pyramid is to attach things to the registry and the request, because you have those everywhere. You attach shared components to the registry (the ZCA) and per-request objects to the request.
When you have multiple sessions, I've found it much easier to reason about them and keep track of them if they are handled by components that wrap up everything for that engine. So for a case like that you describe, I've made two different DB engine components, that are created on start up, attached to the registry, and have a method for getting a fresh session. If you create these components properly, they should be usable in any application, whether it's Pyramid, Tornado, or your test script. You just make sure it has a constructor with some sane way of passing in settings for setting up the engine, whether it's a settings dict or kwargs. I then make my data model(s) live in their own python packages and it's easy to have any app in the family import the model, instantiate the engine components and go to town. Note that if you like using the ZCA registry (and I love it, it's a fantastic DI system), there's nothing preventing you from using it in non-pyramid apps, you just set it up manually in your server start up code.
In Pyramid specifically, I make a custom Request class and use the reify decorator to allow other pyramid code to get the session(s) for that request. The request class has end-of-life-callbacks attached to close out the sessions, and to do rollbacks or commits. There is a bit more boilerplate, but for me it's cleaner in that I can very easily trace where and when in code and time my session management is happening. It's also a good way for testing.
That said, there are lots of smart folks in SQLAlchemy/Pyramid land who swear by scoped sessions and the transaction manager, so there are other valid approaches. Hope that helps.

How to change settings in a middleware?

I'd like to change the environment variable DJANGO_SETTINGS_MODULE (along with a few others) and then have ALL relevant modules like django.conf, django.db etc reloaded to reflect the information from the new settings module. The new settings module will have different database. I will be doing this in a middleware.
I was able to achieve this by reloading a few modules along with django.conf and django.db. All new SQL statements were fired against the new DB.
But this appears to be so hackish.
The main reason for me wanting to do this is to have the same apache child process serve requests for different django applications (different settings and not different apps) without having to recreate a new apache child process which reloads the whole thing.
Is there a clean way of achieving what I want to do?
Thanks,
UPDATE (19-Sept-2014): I have accepted Daniel Roseman's answer as that seems to be the reality in the context of the question asked. The Router approach suggested by him was something that I explored but couldn't use because django's transaction classes don't use the router. The router I presume exists for a different reason. The application code base I'm working on, which is pretty large, has tons of transaction.commit_manually for the default or a specific db alias. I was trying to get this to support multiple client databases without changing the application code.
However, I did manage to solve the main problem which was to support multiple client DBs and other settings. I don't try to change the settings on the fly nor do I use the router. I instead have a single settings.py with all client DB information. I monkey patched the connection handler to return a different database connection for 'default' alias (or other specific alias used by the code) based on certain env variables set in the middleware. So far this has worked fine. I will post an update if I run into any issues or if someone else can point out a potential issue with the approach.
No, there is no way to do this, and that's a very good thing as it is a bad idea. There is no reason to use the same Apache process for different sites: instead you should have different virtual hosts for each of your sites, and let Apache manage them.

Global variables in Python and Apache mod_wsgi

I know there are frameworks that exist, but I am trying to use wsgi directly to improve my own understanding.
I have my wsgi handler, and up at the top I have declared a variable i = 0.
In my application(environ, start_response) function, I declare global i, then I increment i whenever a button is pressed.
It is my understanding that the state of this variable is preserved as long as the server is running, so all users of the web app see the same i.
If I declare i inside the application function, then the value of i resets to 0 any time a request is made.
I'm wondering, how do you preserve i between a single user's requests, but not have it preserved across different users' sessions? So, a single user can make multiple posts, and i will increment, but if another user visits the web app, they start with i=0.
And I know you could store i in a database between posts, but is it possible to just keep i in memory between posts?
Web-apps are generally “shared-nothing”. In the context of WSGI, that means you have no idea how many times your application (and the counter with it) will be instantiated; that choice is up to the WSGI server which acts as your app's container.
If you want some notion of user sessions, you have to implement it explicitly, typically on top of cookies. If you want persistence, you need a component that explicitly supports it; that can be a shared database, or it can piggy-back on your cookie sessions.
Add WSGI session middleware and store the value these.

Django statelessness?

I'm just wondering if Django was designed to be a fully stateless framework?
It seems to encourage statelessness and external storage mechanisms (databases and caches) but I'm wondering if it is possible to store some things in the server's memory while my app is in develpoment and runs via manage.py runserver.
Sure it's possible. But if you are writing a web application you probably won't want to do that because of threading issues.
That depends on what you mean by "store things in the server's memory." It also depends on the type of data. If you can, you're better off storing "global data" in a database or in the file system somewhere. Unless it is needed every request it doesn't really make sense to store it in the Django instance itself. You'll need to implement some form of locking to prevent race conditions, but you'd need to worry about race conditions if you stored everything on the server object anyway.
Of course, if you're talking about user-by-user data, Django does support sessions. Or, and this is another perfectly good option if you're willing to make the user save the data, cookies.
The best way to maintain state in a django app on a per-user basis is request.session (see django sessions) which is a dictionary you can use to remember things about the current user.
For Application-wide state you should use the a persistent datastore (database or key/value store)
example view for sessions:
def my_view(request):
pages_viewed = request.session.get('pages_viewed', 1) + 1
request.session['pages_viewed'] = pages_viewed
...
If you wanted to maintain local variables on a per app-instance basis you can just define module level variables like so
# number of times my_view has been served since by this server
# instance since the last restart
served_since_restart = 0
def my_view(request):
served_since_restart += 1
...
If you wanted to maintain some server state across ALL app servers (like total number of pages viewed EVER) you should probably use a persistent key/value store like redis, memcachedb, or riak. There is a decent comparison of all these options here: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
You can do it with redis (via redis-py) like so (assuming your redis server is at "127.0.0.1" (localhost) and it's port 6379 (the default):
import redis
def my_view(request):
r = redis.Redis(host='127.0.0.1', port="6379")
served = r.get('pages_served_all_time', 0)
served += 1
r.set('pages_served_all_time', served)
...
There is LocMemCache cache backend that stores data in-process. You can use it with sessions (but with great care: this cache is not cross-process so you will have to use single process for deployment because it will not be guaranteed that subsequent requests will be handled by the same process otherwise). Global variables may also work (use threadlocals if they shouldn't be shared for all process threads; the warning about cross-process communication also applies here).
By the way, what's wrong with external storage? External storage provides easy cross-process data sharing and other features (like memory limiting algorithms for cache or persistance with databases).

Categories