Suppose I have a QuerySet of all db entries:
all_db_entries = Entry.objects.all()
And then I want to get some specific objects from it by calling get(param=value) (or any other method).
The problem is that in the documentation of QuerySet methods it says: "These methods do not use a cache. Rather, they query the database each time they're called."
But what I want to achieve is to load all the data once (like doing a SELECT *) and only then do some searches on it. I don't want to open a connection to the db every time I call get(), in order to avoid a heavy load on it.
You can use values to convert your resulting queryset into an ordinary Python list, which you can then use to do searches etc., e.g.:
list(MyModel.objects.values('pk', 'field'))
Wrapping it in list() evaluates the queryset once; after that, all searches happen in memory.
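For example, here is a minimal sketch of that approach; Entry is the model from the question, while the headline field is just an assumed example field:
# Entry is the model from the question; 'headline' is an assumed field name
# one query: evaluate the queryset and keep the rows as plain dicts
all_entries = list(Entry.objects.values('pk', 'headline'))

# every lookup below is plain Python and never touches the database
by_pk = {row['pk']: row for row in all_entries}
entry_five = by_pk.get(5)
django_entries = [row for row in all_entries if 'django' in row['headline'].lower()]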
Related
Many times, one needs to check if there is at least one element inside a QuerySet. Mostly, I use exists:
if queryset.exists():
...
However, I've seen colleagues use Python's any function:
if any(queryset):
...
Is using Python's any function suboptimal?
My intuition tells me that this is a similar dilemma to the one between using count and len: any will iterate through the QuerySet and will therefore need to evaluate it. If we are going to use the QuerySet items anyway, this doesn't create any slowdown. However, if we just need to check whether any data satisfying the query exists, it might load data that we do not need.
Is using Python's any function suboptimal?
The most Pythonic way would be:
if queryset:
# …
Indeed, a QuerySet has truthiness True if it contains at least one item, and False otherwise.
If you later want to enumerate over the queryset (with a for loop, for example), checking its truthiness will load the items into the cache, so for example:
if queryset:
for item in queryset:
# …
will only make one query to the database: the if queryset check fetches all items, and the loop then reuses that cache without making a second query.
If you do not consume the queryset later in the process, then you can work with .exists() [Django-doc]: this will not load records into memory, but only make a query that checks whether at least one such record exists, which is less expensive in terms of bandwidth between the application and the database. If you do have to consume the queryset later, however, using .exists() is not a good idea, since then you make two queries.
Using any(queryset), however, is nonsensical: you can check whether a queryset contains elements through its truthiness, so using any() will usually only make that check slightly less efficient.
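To make the difference concrete, a small sketch with a hypothetical Book model (title and published are made-up names, just for this sketch):
# Book, title and published are hypothetical names for this example
books = Book.objects.filter(published=True)

# the queryset is consumed afterwards: the truthiness check fills the
# cache and the for loop reuses it, so only one query is made
if books:
    for book in books:
        print(book.title)

# the queryset is not consumed afterwards: .exists() issues a cheap
# EXISTS query and loads no rows into memory
if Book.objects.filter(published=True).exists():
    print('there is at least one published book')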
I need to make a function that will be launched in Celery and will take records from the model one by one, check something, and write data to another model with a one-to-one relationship. There are a lot of entries, so using model_name.objects.all() is not appropriate (it would take a lot of memory and time). How do I do this correctly?
You can use an iterator over the queryset (https://docs.djangoproject.com/en/dev/ref/models/querysets/#iterator) so your records are fetched one by one:
model_iterator = your_model.objects.all().iterator()
for record in model_iterator:
    do_something(record)
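A slightly fuller sketch of such a task; SourceModel, ResultModel, the source one-to-one field and check_record are all hypothetical names, and chunk_size is just the number of rows fetched per round trip:
from celery import shared_task

# SourceModel, ResultModel and check_record are hypothetical names
@shared_task
def process_records():
    # iterator() streams the rows instead of caching the whole queryset
    # in memory, so memory usage stays flat however many records exist
    for record in SourceModel.objects.all().iterator(chunk_size=2000):
        value = check_record(record)  # whatever check the task needs
        # write the result into the one-to-one related model
        ResultModel.objects.update_or_create(
            source=record,
            defaults={'value': value},
        )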
I'm using Django REST framework to serve up JSON content for a website front end. On the back end, I have two Django models, Player and Match, that each reference multiple of the other. A Match contains multiple Players, and a Player contains multiple Matches. This data is originally retrieved from a third-party API.
Matches and Players must be fetched separately from the API, and can only be fetched one at a time. When an object is fetched, its data is converted from the external JSON format into my Django model. At this point, the Match/Player will live forever in Django. The hard part is that I want this external fetching to be seamless. If I query for a player or match and it's in the DB, then just serve what we have there. Otherwise, I want to fetch that object from the external DB.
My question is, does Django provide any convenient way of handling this? Ideally, any query along the lines of Match.objects.get(id=...) will handle this API fallback transparently (I don't mind the fact that this query may take significantly longer in some cases).
Whether a way is "convenient" depends on your expectations ...
You could create a custom QuerySet where you override the get() method to include your fetch-from-API logic. Then you create a custom manager based on that QuerySet, like the docs show here.
Finally add that custom manager to your model.
See also this question from 2011.
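Roughly, such a custom QuerySet could look like the sketch below; fetch_match_from_api is a hypothetical helper that calls the third-party API and returns the field values, and the lookup is assumed to be by id, as in Match.objects.get(id=...) from the question:
from django.db import models

def fetch_match_from_api(match_id):
    # hypothetical helper: fetch the match from the third-party API and
    # return a dict of model field values
    raise NotImplementedError

class MatchQuerySet(models.QuerySet):
    def get(self, *args, **kwargs):
        try:
            return super().get(*args, **kwargs)
        except self.model.DoesNotExist:
            # not in our database yet: fetch it from the external API,
            # store it, then serve it like any locally kept Match
            data = fetch_match_from_api(kwargs['id'])
            return self.model.objects.create(id=kwargs['id'], **data)

class Match(models.Model):
    played_at = models.DateTimeField(null=True)  # assumed field, just for illustration
    objects = MatchQuerySet.as_manager()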
I need to cache a mid-sized queryset (about 500 rows). I had a look at some solutions, django-cache-machine being the most promising.
Since the queryset is pretty much static (it's a table of cities that's been populated in advance and gets updated only by me and anyway, almost never), I just need to serve the same queryset at every request for filtering.
In my search, one detail was really not clear to me: is the cache a sort of singleton object, which is available to every request? By which I mean, if two different users access the same page, and the queryset is evaluated for the first user, does the second one get the cached queryset?
I could not figure out what problem you are exactly facing; what you describe is the classical use case for caching. Memcached and Redis are the two most popular options. You just need to write a method or function that first tries to load the result from the cache and, if it is not there, queries the database. E.g.:
from django.core.cache import cache

def cache_user(userid):
    key = "user_{0}".format(userid)
    value = cache.get(key)
    if value is None:
        # fetch value from db here, then store it under the key
        cache.set(key, value)
    return value
Although for simplicity I have written this as a function, ideally it should be a manager method on the concerned model.
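For the cities table from the question, a manager-method variant could look roughly like this; the City model, the cache key and the 15-minute timeout are assumptions for the sketch:
from django.core.cache import cache
from django.db import models

# City, the 'all_cities' key and the timeout are assumptions for this sketch
class CityManager(models.Manager):
    def cached_all(self):
        cities = cache.get('all_cities')
        if cities is None:
            # evaluate the queryset once and store the resulting list;
            # with a backend like Memcached or Redis the cache lives
            # outside the Django process, so every request (and every
            # user) gets the same cached copy back
            cities = list(self.get_queryset())
            cache.set('all_cities', cities, 60 * 15)  # 15 minutes, arbitrary
        return cities

class City(models.Model):
    name = models.CharField(max_length=100)
    objects = CityManager()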
So I'm currently working in Python/Django and I have a problem where Django caches querysets "within a session".
If I run python manage.py shell and do so:
>>> from myproject.services.models import *
>>> test = TestModel.objects.filter(pk = 5)
>>> print test[0].name
John
Now, if I then update it directly in SQL to Bob and run it again, it will still say John. If, however, I CTRL+D out (exit) and run the same thing, it will have updated and will now print Bob.
My problem is that I'm running a SOAP service in a screen and it'll always return the same result, even if the data gets changed.
I need a way to force the query to actually pull the data from the database again, not just return the cached data. I could just use raw queries, but that doesn't feel like a solution to me. Any ideas?
The queryset is not cached 'within a session'.
The Django documentation on Caching and QuerySets mentions:
Each QuerySet contains a cache to minimize database access. Understanding how it works will allow you to write the most efficient code.
In a newly created QuerySet, the cache is empty. The first time a QuerySet is evaluated – and, hence, a database query happens – Django saves the query results in the QuerySet’s cache and returns the results that have been explicitly requested (e.g., the next element, if the QuerySet is being iterated over). Subsequent evaluations of the QuerySet reuse the cached results.
Keep this caching behavior in mind, because it may bite you if you don’t use your QuerySets correctly.
(emphasis mine)
For more information on when querysets are evaluated, refer to this link.
If it is critical for your application that the querysets get updated, you have to evaluate them each time, be it within a single view function or with AJAX.
It is like running a SQL query again and again, as in the old days when querysets did not exist and you kept the data in some structure that you had to refresh.
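For the shell/SOAP-service case from the question, a minimal sketch of how to pull fresh data, reusing the TestModel name from the question: rebuild the queryset right before you need the data instead of holding on to one evaluated queryset.
# TestModel and pk=5 are taken from the question above
def get_current_name(pk):
    # a new queryset is constructed on every call, so its result cache
    # starts empty and the database is queried again
    return TestModel.objects.get(pk=pk).name

print(get_current_name(5))  # reflects whatever is in the database right now

# if you are holding on to an already evaluated queryset, .all() returns
# a copy with an empty cache, so iterating over it hits the database again
test = TestModel.objects.filter(pk=5)
fresh = list(test.all())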