Django: Keep a persistent reference to an object? - python

I'm happy to accept that this might not be possible, let alone sensible, but is it possible to keep a persistent reference to an object I have created?
For example, in a few of my views I have code that looks a bit like this (simplified for clarity):
api = Webclient()
api.login(GPLAY_USER,GPLAY_PASS)
url = api.get_stream_urls(track.stream_id)[0]
client = mpd.MPDClient()
client.connect("localhost", 6600)
client.clear()
client.add(url)
client.play()
client.disconnect()
It would be really neat if I could just keep one reference to api and client throughout my project, especially to avoid repeated API logins with gmusicapi. Can I declare them in settings.py (I'm guessing this is a terrible idea), or keep a persistent connection to them by some other means?
Ideally I would then have functions like get_api() which would check that the existing object is still OK and return it, or create a new one as required.

You can't have anything that's instantiated once per application, because you'll almost certainly have more than one server process, and objects aren't easily shared across processes. However, one per process is definitely possible, and worthwhile. To do that, you only need to instantiate it at module level in the relevant file (e.g. views.py). That means it will be automatically instantiated when Django first imports that file (in that process), and you can refer to it as a global variable in that file. It will persist as long as the process does, and when a new process is created, a new global variable will be instantiated.
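A minimal sketch of the get_api() helper the question asks for, built on the module-level approach described above. FakeClient, its methods, and the credentials are hypothetical stand-ins so that the example runs on its own; a real project would construct and log in the actual client at that point.

```python
class FakeClient:
    """Hypothetical stand-in for an expensive client that needs a login."""

    def __init__(self):
        self.logged_in = False

    def login(self, user, password):
        # A real client would perform a network login here.
        self.logged_in = True
        return True

    def is_authenticated(self):
        return self.logged_in


# Module-level variable: created once per process, on first use.
_api = None


def get_api():
    """Return the cached client, creating a new one if it is missing or stale."""
    global _api
    if _api is None or not _api.is_authenticated():
        client = FakeClient()
        client.login("user", "secret")  # placeholder credentials
        _api = client
    return _api
```

Views in the same process would call get_api() instead of building a client themselves. Each server process still ends up with its own copy, as the answer explains.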

You could make them attributes of your application object, or of some
other object that is created at the top level of your project, before
anything else needs it.
If you put them into a class that is instantiated on the first import,
every module that imports it afterwards gets the same instance and can
access them.
Either way, they live for as long as the process runs.

You can't persist the object reference itself, but you can store data in Django's cache, using either the local-memory backend or memcached.
Django Cache: https://docs.djangoproject.com/en/dev/topics/cache/
See also: Creating a Persistent Data Object In Django

Related

Is there a way I can persist context locals for sub-threads?

Currently I am creating a library that records backend calls, such as those made to the boto3 and requests libraries, and then populates a global "data" object based on details like the status code of responses.
I originally had the data object as a global, but then I realized this was a bad idea: when the application runs in parallel, the data object is modified simultaneously (which could corrupt it), whereas I want to keep this object separate for each invocation of my application.
So I looked into Flask context locals, similar to what Flask does for its global "request" object. I managed to implement this using LocalProxy the way they did, so it now works fine with parallel requests to my application. The issue now, though, is that whenever the application spawns a new sub-thread, it creates an entirely new context, and thus I can't retrieve the data object from its parent thread for that request session. Basically, I need to copy and modify the same data object that is local to the main thread for that particular application request.
To clarify, I was able to do this when I previously had data as a true "global" object: multiple sub-threads could properly modify the same object. However, that did not handle simultaneous requests made to the application, as I mentioned. I managed to fix that, but now the sub-threads are not able to modify the same data object any more *sad face*
I looked at some solutions like the one below, but this did not help me because the decorator approach only works for "local" functions. Since the functions that I need to decorate are "global" functions like requests.request, which threads across various application requests will use, I think I need another approach where I can temporarily copy the same thread context for use in sub-threads (and my understanding is that it should not overwrite or decorate the function, as this is a "global" one that will be used by simultaneous requests to the application). I would appreciate any help or ideas on how I can make this work for my use case.
Thanks.
Flask throwing 'working outside of request context' when starting sub thread
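One possible direction for the question above, sketched with the standard-library contextvars module rather than Flask's own context machinery (that substitution is an assumption). The names request_data, record_call, run_in_child and handle_request are hypothetical. The copied context is shallow, so the child thread sees the same list object as its parent, and only the code that starts the thread is changed, not the recorded function.

```python
import contextvars
import threading

# Hypothetical per-request variable holding the recorded data.
request_data = contextvars.ContextVar("request_data")


def record_call(name, status):
    """Append to the data object of whichever context is current."""
    request_data.get().append((name, status))


def run_in_child(func, *args):
    """Start a thread that runs func inside a copy of the caller's context."""
    ctx = contextvars.copy_context()
    thread = threading.Thread(target=ctx.run, args=(func, *args))
    thread.start()
    return thread


def handle_request(request_id):
    """Simulate one request that records a call in a sub-thread and in itself."""
    data = []
    request_data.set(data)
    child = run_in_child(record_call, f"child-{request_id}", 200)
    record_call(f"parent-{request_id}", 200)
    child.join()
    return data
```

A thread started without the copied context would raise LookupError on request_data.get(), which corresponds to the "entirely new context" behaviour described in the question.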

Updating a Flask extension and maintaining state

I am finding that maintaining changes [of object variables] to a Flask extension is very frustrating. Surely there are multiple use cases where updates to an extension's properties need to endure beyond a single request? Moreover, even if the framework wishes to force statelessness upon the developer, it seems ridiculous to prevent any override mechanism.
To give context, I've adapted the redis Flask extension for redis-py-cluster. The extension is initialized when the Flask app is initialized. If the cluster nodes are updated at a later stage, re-connection works fine, but does not persist across the API calls. In other words, the original object state at app initialization time is used on each request instead of the updated state. Yes, the object is the same object (simple ID checking proves this), but the state of the object's members is always reset.
Which redis cluster nodes are used has nothing at all to do with maintaining statelessness across API calls. It has zero impact on the API calls themselves, hence I ask: why must my object/extension be bound to this almost reincarnation-like behaviour of state?
And more importantly, is there a way to get around this?

Is middleware in Django thread-safe?

Recently I've read this article:
http://blog.roseman.org.uk/2010/02/01/middleware-post-processing-django-gotcha/
I don't understand: why does the solution described there work?
Why does instantiating separate objects make the data chunk thread-safe?
I have two guesses:
Django explicitly holds middleware objects in shared memory and does not do this for other objects, so other objects are thread-safe.
In the second example in the article, the lifetime of the thread-safety-critical data is much shorter than in the first example, so thread-unsafe operations probably just have no time to occur.
There are also issues with thread-safety in Django templates.
My question is: how can I tell where Django is thread-safe and where it is not? Is there any logic to it, or any conventions? Another question: I know that the request object is thread-safe. That much is clear, because if it weren't, websites built with Django would not be able to operate. But what exactly makes it thread-safe?
The point, as I note in that article, is that the middleware is instantiated once per process. In most methods of deploying Django, a process lasts for multiple requests. Note that you never instantiate the middleware object yourself: Django takes care of that. That's a clue that it's being done outside the request/response cycle.
The extra object I used there is being instantiated within the process_response method. So, as soon as that method returns, the new object goes out of scope and is destroyed, and there are no thread-safety issues.
Generally speaking, the only objects you have to worry about thread-safety on are those you instantiate at module or class level rather than inside a function/method, and those you don't instantiate yourself, like the middleware here. And even there, requests are explicitly an exception: you can count on those being per-request (naturally).
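A simplified sketch of the pattern described in this answer. It is not real Django middleware (the process_response signature is reduced to a single string, and both class names are hypothetical). The point is only that per-response state lives on an object created inside the method, not on the long-lived middleware instance.

```python
import threading


class ResponseProcessor:
    """Per-call helper: holds state for exactly one response."""

    def __init__(self, body):
        self.body = body
        self.parts = []

    def run(self):
        for word in self.body.split():
            self.parts.append(word.upper())
        return " ".join(self.parts)


class UppercaseMiddleware:
    """Hypothetical middleware; one instance serves every request in the process."""

    def process_response(self, body):
        # State lives on a fresh object, not on self, so threads cannot clash.
        return ResponseProcessor(body).run()


# Created once and shared, the way the framework would create it.
middleware = UppercaseMiddleware()


def worker(body, results, index):
    """Run one simulated request and store its result."""
    results[index] = middleware.process_response(body)
```

If parts were stored on the middleware instance instead, concurrent requests would append to the same list.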

How to handle local long-living objects in WSGI env

INTRO
I've recently switched to Python, after about 10 years of PHP development and habits.
E.g. in Symfony2, every request to the server (Apache, for instance) has to load and instantiate the container class in order to construct the rest of the objects.
As far as I understand (I hope) Python's WSGI env, an app is created once, and until that app closes, every request just calls methods/functions.
This means that I can have, e.g., one instance of some class that can be accessed every time a request is dispatched, without having to instantiate it on every request. Am I right?
QUESTION
I want to have one instance of a class, since the call to __init__ is very expensive (in both computation and resource lockup). In PHP, instantiating this on every request degrades performance. Am I right that with Python's WSGI I can instantiate it once, on app startup, and use it across requests? If so, how do I achieve this?
WSGI is merely a standardized interface that makes it possible to build the various components of a web-server architecture so that they can talk to each other.
Pyramid is a framework whose components are glued with each other through WSGI.
Pyramid, like other WSGI frameworks, makes it possible to choose the actual server part of the stack, like gunicorn, Apache, or others. That choice is for you to make, and there lies the ultimate answer to your question.
What you need to know is whether your server is multi-threaded or multi-process. In the latter case, it's not enough to check whether a global variable has been instantiated in order to initialize costly resources, because subsequent requests might end up in separate processes, that don't share state.
If your model is multi-threaded, then you might indeed rely on global state, but be aware of the fact that you are introducing a strong dependency in your code. Maybe a singleton pattern coupled with dependency-injection can help to keep your code cleaner and more open to change.
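For the multi-threaded case, a minimal sketch of lazily creating one shared instance per process behind a lock. ExpensiveService and get_service are hypothetical names, and the instance counter exists only to make the behaviour observable.

```python
import threading


class ExpensiveService:
    """Hypothetical class whose __init__ is costly."""

    instances = 0  # counts constructions, for demonstration only

    def __init__(self):
        # Only ever called while _lock is held, so this increment is safe.
        ExpensiveService.instances += 1


_service = None
_lock = threading.Lock()


def get_service():
    """Return the shared instance, creating it at most once per process."""
    global _service
    if _service is None:
        with _lock:
            # Re-check inside the lock: another thread may have won the race.
            if _service is None:
                _service = ExpensiveService()
    return _service
```

With a multi-process server, each worker process would still build its own instance. Passing the result of get_service() into handlers as an argument, rather than calling it from deep inside them, is one way to keep the dependency explicit.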
The best method I found was mentioned (and I missed it earlier) in Pyramid docs:
From Pyramid Docs#Startup
Note that an augmented version of the values passed as **settings to the Configurator constructor will be available in Pyramid view callable code as request.registry.settings. You can create objects you wish to access later from view code, and put them into the dictionary you pass to the configurator as settings. They will then be present in the request.registry.settings dictionary at application runtime.
There are a number of ways to do this in pyramid, depending on what you want to accomplish in the end. It might be useful to look closely at the Pyramid/SQLAlchemy tutorial as an example of how to handle an expensive initialization (database connection and metadata setup) and then pass that into the request-handling engine.
Note that in the referenced link, the important part for your question is the __init__.py file's handling of initialize_sql and the subsequent creation of DBSession.

Managing global objects with side effects when reloading a module in Python

I am looking for a way to correctly manage module level global variables that use some operating system resource (like a file or a thread).
The problem is that when the module is reloaded, my resource must be properly disposed (e.g. the file closed or the thread terminated) before creating the new one.
So I need a better pattern to manage those singleton objects.
I've been reading the docs on module reload and this is quite interesting:
When a module is reloaded, its dictionary (containing the module’s
global variables) is retained. Redefinitions of names will override
the old definitions, so this is generally not a problem. If the new
version of a module does not define a name that was defined by the old
version, the old definition remains. This feature can be used to the
module’s advantage if it maintains a global table or cache of objects
— with a try statement it can test for the table’s presence and skip
its initialization if desired:
try:
    cache
except NameError:
    cache = {}
So I could just check if the objects already exist, and dispose them before creating the new ones.
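A minimal sketch of that idea, assuming the resource exposes a close() method. replace_resource and log_buffer are hypothetical names, and io.StringIO is only a placeholder for a real file or worker thread.

```python
import io


def replace_resource(namespace, name, factory):
    """Close any object already stored under `name`, then store a new one."""
    old = namespace.get(name)
    if old is not None:
        old.close()  # dispose of the previous resource first
    namespace[name] = factory()
    return namespace[name]


# At module level: runs on first import and again on every reload.
# On reload the module dictionary is retained, so the old object is found.
log_buffer = replace_resource(globals(), "log_buffer", io.StringIO)
```

On the first import nothing is stored under the name yet, so only the creation step runs.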
You need to monkeypatch or fork Django to hook into the dev server's reloading feature and do the proper cleanup (closing files, etc.).
But since you are developing a Django application, if you mean to use a proper server to serve your app in the future, you should think about how you manage your global variables, about semaphores, and all that jazz.
But before going down this route and implementing all this difficult code that is prone to errors and hair loss, you should consider other solutions, like NoSQL databases (redis, mongodb, neo4j, hadoop...) and background process managers like celery and gearman. If none of this fits your use case(s) and you can't avoid creating and managing files and global variables yourself, then consider the client/server pattern, where the clients are web server threads, unless you want to mess with NFS.
