I'm looking for a way to keep the equivalent of persistent global variables in App Engine (Python). What I'm doing is creating a global kind that I initialize once (i.e. when I reset all my database objects while testing). I keep things in there like global counters, or the next ID to assign to certain kinds I create.
Is this a decent way to do this sort of thing or is there generally another approach that is used?
The datastore is the only place you can have guaranteed-persistent data that are also modifiable. So you can have a single large object, or several smaller ones (with a name attribute and others), depending on your desired access patterns -- but live in the datastore it must. You can use memcache for faster cache that usually persists across queries, but any memcache entry could go away any time, so you'll always need it to be backed by the datastore (in particular, any change must go to the datastore, not just to memcache).
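As a rough sketch of that pattern, reads go to the cache first and fall back to the datastore, while every write goes to the datastore as well as the cache. The class name `PersistentGlobals` is hypothetical, and the two plain dicts are only stand-ins for the real datastore and memcache clients:

```python
class PersistentGlobals:
    """Read-through / write-through access to named global values.

    `datastore` and `cache` are plain dicts standing in for the real
    datastore and memcache clients (an assumption for illustration).
    """

    def __init__(self, datastore, cache):
        self.datastore = datastore
        self.cache = cache

    def get(self, name, default=None):
        # Try the fast cache first; its entries may vanish at any time.
        if name in self.cache:
            return self.cache[name]
        # Fall back to the authoritative copy and repopulate the cache.
        if name in self.datastore:
            value = self.datastore[name]
            self.cache[name] = value
            return value
        return default

    def set(self, name, value):
        # Every change goes to the datastore first, then to the cache.
        self.datastore[name] = value
        self.cache[name] = value


datastore, cache = {}, {}
store = PersistentGlobals(datastore, cache)
store.set("next_id", 1)
cache.clear()                   # simulate a memcache eviction
next_id = store.get("next_id")  # still served, from the datastore
```

With a real datastore, a counter update (read, increment, write) would presumably also need to run inside a transaction so that concurrent requests don't lose increments.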
Related
Old habits being what they are, I would declare global variables, and probably use lists to store records. I appreciate this is not the best way of doing this these days, and that Python actively discourages you from doing this by having to constantly declare 'global' throughout.
So what should I be doing? I'm thinking I should maybe use instances, but I know of no way to create a unique instance name based on an identifier (all the records will have a unique ID) and then find out how many instances I have.
I could use dictionaries maybe?
The most important thing is that the values are accessible anywhere in my code, and that I can list the number of records and easily refer to / change the values.
It's tough to tell, without a bit more information about the problem that you are trying to solve, the scope of your code, and your code's architecture.
Generally speaking:
If you're writing a script that's reasonably small in scope, there really is nothing wrong with declaring variables in the global namespace of the script.
If you're writing a larger system - something that goes beyond one module or file, but is not running in a multi-process/multi-thread environment, then you can create a module (as a separate file) that handles storage of your re-used data. Anytime you need to read or write that data you can import that module. The module could just expose simple variables, or it can wrap the data in classes and expose methods for creation/reading/updating/deletion.
If you are running in a multi-process/multi-thread environment, then any sort of global in-memory variable storage is going to be problematic. You'll need to use an external store of some sort - Redis or the like for temporary storage, or a database of some sort for permanent storage.
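For the middle case (a larger, single-process program), a minimal sketch of such a storage module might look like the following. The module name `records.py` and the function names are hypothetical; the records live in a dictionary keyed by their unique ID, which also makes counting them trivial:

```python
# records.py (hypothetical module name): one shared, in-memory store.
_records = {}  # maps a unique record ID to a dict of field values


def create(record_id, **fields):
    """Add a new record; refuse to overwrite an existing ID."""
    if record_id in _records:
        raise KeyError(f"record {record_id!r} already exists")
    _records[record_id] = dict(fields)


def read(record_id):
    """Return the stored dict for this ID (raises KeyError if missing)."""
    return _records[record_id]


def update(record_id, **fields):
    """Change some fields of an existing record."""
    _records[record_id].update(fields)


def delete(record_id):
    """Remove a record."""
    del _records[record_id]


def count():
    """Return how many records are stored."""
    return len(_records)
```

Any other file can then do `import records` and call `records.create(42, name="Ann")` or `records.count()`; because Python imports a module only once per process, every importer sees the same dictionary.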
I've been racking my brain on this for the last few weeks and I just can't seem to understand it. I'm hoping you folks here can give me some clarity.
A LITTLE BACKGROUND
I've built an API to help serve a large website and, like all of us, I'm trying to keep the API as efficient as possible. Part of this efficiency is to NOT create an object that contains custom business logic over and over again (example: a service class) as requests are made. To give some personal background, I come from the Java world, so I'm used to using IoC or DI to help handle object creation and injection into my classes, to ensure classes are NOT created over and over on a per-request basis.
WHAT I'VE READ
While looking at many Python IoC and DI posts I've become rather confused on how to best approach creating a given class and not having to worry about the server getting overloaded with too many objects based on the amount of requests it may be handling.
Some people say an IoC or DI really isn't needed. But as I run my Django app, I find that unless I construct the object I want globally (top of file) for views.py to use later, rather than within each view class or def within views.py, I run the chance of creating multiple objects of the same type, which from what I understand would cause memory bloat on the server.
So what's the right, Pythonic way to keep objects from being built over and over? Should I invest in using an IoC / DI or not? Can I safely rely on setting up my service.py files to just contain defs instead of classes that contain defs? Is the garbage collector just THAT efficient, so that I don't even have to worry about it?
I've purposely not placed any code in this post since this seems like a general question, but I can provide a few code examples if that helps.
Thanks
From a confused engineer that wants to be as pythonic as possible
You come from a background where everything needs to be a class. I've programmed web apps in Java too, and sometimes it's harder to unlearn old things than to learn new things; I understand.
In Python / Django you wouldn't make anything a class unless you need many instances and need to keep state.
For a service that's hardly the case. You'll sometimes notice that in Java-like web apps some services are made singletons, which is just a workaround, and a rather big anti-pattern in Python.
Pythonic
Python is flexible enough that a "services class" isn't required; you'd just have a Python module (e.g. services.py) with a number of functions, the emphasis being on each one being a function that takes something in and returns something, in a completely stateless fashion.
# services.py
# This is a module; it doesn't keep any state within.
# It may read and write to the DB, do some processing etc., but it doesn't remember things.
from .models import Score  # assumes a Score model defined in this app

def get_scores(student_id):
    return Score.objects.filter(student=student_id)

# views.py
# receives HTTP requests
from . import services

def view_scores(request, student_id):
    scores = services.get_scores(student_id)
    # e.g. use the scores queryset in a template and return an HTML page
Notice how if you need to swap out the service, you'll just be swapping out a single Python module (just a file really), so Pythonistas hardly bother with explicit interfaces and other abstractions.
Memory
Now per each "django worker process", you'd have that one services module, that is used over and over for all requests that come in, and when the Score queryset is used and no longer pointed at in memory, it'll be cleaned up.
I saw your other post, and, well, instantiating a ScoreService object for each request, or keeping an instance of it in the global scope, is just unnecessary. The above example does the job with one module in memory, and doesn't need us to be smart about it.
And if you did need to keep state in between several requests, keeping it in live instances of ScoreService would be a bad idea anyway, because every user might then need their own instance, and that's not viable (too many live objects keeping context). Not to mention that such an instance is only accessible from the same process unless you have some sharing mechanism in place.
Keep state in a datastore
In case you want to keep state in between requests, you'd keep the state in a datastore. When the request comes in, you hit the services module again to get the context back from the datastore, pick up where you left off, do your business, and return your HTTP response; then unused things will get garbage collected.
The emphasis being on keeping things stateless, where any given HTTP request can be processed on any given django process, and all state objects are garbage collected after the response is returned and objects go out of scope.
This may not be the fastest request/response cycle we can pull off, but it's scalable as hell.
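A minimal sketch of that stateless cycle, assuming a SQLite file as a stand-in for the datastore (the table name `user_state` and the function names are hypothetical). Two separate connections play the role of two worker processes, and either one can serve the next request because the state lives outside both:

```python
import json
import os
import sqlite3
import tempfile


def open_store(path):
    """Open (and if needed create) the table that holds per-user state."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_state "
        "(user_id TEXT PRIMARY KEY, data TEXT)"
    )
    return conn


def handle_request(conn, user_id):
    """Load the user's state, advance it, save it, and return a response."""
    row = conn.execute(
        "SELECT data FROM user_state WHERE user_id = ?", (user_id,)
    ).fetchone()
    state = json.loads(row[0]) if row else {"visits": 0}
    state["visits"] += 1
    conn.execute(
        "INSERT OR REPLACE INTO user_state (user_id, data) VALUES (?, ?)",
        (user_id, json.dumps(state)),
    )
    conn.commit()
    # Nothing is kept in the process once this function returns.
    return f"visit number {state['visits']}"


# Two separate connections stand in for two worker processes.
db_path = os.path.join(tempfile.mkdtemp(), "state.db")
worker_a = open_store(db_path)
worker_b = open_store(db_path)
first = handle_request(worker_a, "alice")
second = handle_request(worker_b, "alice")
worker_a.close()
worker_b.close()
```

In a Django project the same role would normally be played by a model or by the session framework rather than by hand-written SQL.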
Look at some major web apps written in Django
I suggest you look at some open source Django projects and at how they're organized. You'll see that a lot of the things you're busting your brains over are things Djangonauts just don't bother with.
I have a couple of quick questions -
* When I used to code in Java, we tried to reduce the use of session variables, as they slowed the engine and occupied quite some space. In Python/Django, when I was trying to access one variable in two functions, I saw that request.session['variable_name'] is used to solve this. Is there any other way to achieve what I want, or is request.session the only way? If request.session is the only approach, will sessions slow down the engine? (I apologize if it's a lame question.)
* I have a list whose values have to be saved in a db table, so the list has to be iterated, the model instantiated, and finally saved. If the list is iterated (say 100 times), it makes a db call 100 times. To avoid that, this is what I am doing:
with transaction.atomic():
    for lcc in list_course_content:
        print lcc
        c = Course_Content(TITLE=lcc, COURSE_ID_id=crse.id)
        c.save()
Am I on the right path, or is there a better approach?
You say that you used to reduce the usage of session variables in Java, but you don't say how you did it. If it worked there, in Python it would also work.
Anyway, to be able to use a variable across different requests, you have to store that variable somewhere; the language doesn't matter. In Django you can choose the session backend, which can be based on in-memory storage, files, or a database, so it's your choice.
Of course, you can also store a variable without using sessions.
I am writing a reusable Django application for returning JSON results for jQuery UI autocomplete.
Currently I am storing the class/function for getting the result in a dictionary, with a unique key for each class/function.
When a request comes in, I select the corresponding class/function from the dict and return its output.
My question is whether the above is best practice, or whether there are other tricks to obtain the same result.
Sample GIST : https://gist.github.com/ajumell/5483685
You seem to be talking about a form of memoization.
This is OK, as long as you don't rely on that result being in the dictionary. This is because the memory will be local to each process, and you can't guarantee subsequent requests being handled by the same process. But if you have a fallback where you generate the result, this is a perfectly good optimization.
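A small sketch of that "dictionary plus fallback" idea, with hypothetical names (`build_result` stands in for whatever expensive lookup produces the result). The dictionary is only an optimization; a miss simply regenerates the value:

```python
_cache = {}  # local to this process; another worker has its own copy
calls = []   # records how often the expensive path runs (illustration only)


def build_result(term):
    """Hypothetical stand-in for the expensive lookup."""
    calls.append(term)
    words = ("apple", "apricot", "banana")
    return [word for word in words if word.startswith(term)]


def get_result(term):
    """Return the cached result if this process has one, else build it."""
    try:
        return _cache[term]
    except KeyError:
        result = _cache[term] = build_result(term)
        return result


first = get_result("ap")   # computed by build_result
second = get_result("ap")  # served from the dictionary
```

If a later request lands on a different process, that process just takes the `build_result` path once and fills its own dictionary.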
That's a very general question. It primarily depends on the infrastructure of your code: the way your classes and models are defined, and the dynamics of the application.
Second, it is important to take into account the resources of the server where your application is running: how much memory and how much disk space you have available, so you can judge what would be better for the application.
Last but not least, it's important to take into account how many operations it takes to put all these resources in memory. Memory is volatile, so if your application restarts you'll have to instantiate all the classes again, and maybe this is too much work.
In summary, as an optimization it is a very good choice to keep in memory the objects that are queried often (that's what a cache is all about), but you have to take all of the previous points into account.
Storing a series of functions in a dictionary and conditionally selecting one based on the request is a perfectly acceptable way to handle it.
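One way this dictionary-of-functions approach might look, as a sketch only: the `register` decorator, the example sources, and the `autocomplete` function are all hypothetical names, not taken from the linked gist.

```python
import json

_sources = {}  # maps a unique key to the callable that produces results


def register(key):
    """Decorator that stores a function in the registry under `key`."""
    def decorator(func):
        _sources[key] = func
        return func
    return decorator


@register("fruits")
def fruits(term):
    names = ("apple", "apricot", "banana")
    return [name for name in names if name.startswith(term)]


@register("colors")
def colors(term):
    names = ("amber", "blue")
    return [name for name in names if name.startswith(term)]


def autocomplete(key, term):
    """Pick the function for `key` and return its output as JSON."""
    func = _sources.get(key)
    if func is None:
        return json.dumps({"error": "unknown source"})
    return json.dumps(func(term))
```

Since the registry is filled at import time, every worker process builds the same dictionary, so nothing here depends on which process handles a given request.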
If you would like a more specific answer, it would be very helpful to post your actual code. Also, this might be better suited to codereview.stackexchange.
I'm quite new to django, and moved to it from Drupal.
In Drupal it is possible to define module-level variables (read "application" for Django), which are stored in the DB using one of Drupal's "core tables". The idiom would be something like:
variable_set('mymodule_variablename', $value);
variable_get('mymodule_variablename', $default_value);
variable_del('mymodule_variablename');
The idea is that it wouldn't make sense for each module (app) to instantiate a whole "module table" just to store one value, so the core provides a common one to be shared across modules.
To the best of my newbie understanding of Django, Django lacks such functionality, but since it is a common pattern, I thought I'd turn to the SO community to check whether there is a typical/standard/idiomatic way that Django devs use to solve this problem.
(BTW: the value is not a constant that I could put in a settings file. It's a value that should be refreshed daily, and should be read at each request).
There are apps to achieve this, but I'd like to recommend django-modeldict from Disqus, as it's brief. From its description:
ModelDict is a very efficient way to store things like settings in
your database. The entire model is transformed into a dictionary
(lazily) as well as stored in your cache. It's invalidated only when
it needs to be (both in process and based on CACHE_BACKEND).
Data that is not static is stored in a model. If you need to share data or functions between apps I have seen the convention of making a shared app, something like 'common'. This would house shared models, or utility functions.
In the Django projects I have seen, the data is usually specific. The data you are storing should be in a model that is representative of that data; I would rather have an explicit model/object representing my data than a generic object that houses vastly different data.
If you are only defining one or two variables which are changed daily, perhaps just a key/value store like memcached would work for you?
Another +1 for ModelDict. Another potential, similar solution is Django Constance:
https://github.com/jazzband/django-constance
It's meant to store app config parameters in the database and has the advantage that it exposes a nice backend to edit them for administrators (with the right permissions), handles default values and also has caching etc.
EDIT:
In case it's not clear from the documentation (which it isn't), you can set settings the 'Pythonic way', i.e. to set a setting to a value, you do
from constance import config
config.variable_name = value