django substitute dictionary with a faster JSON serializer - python

I'm using a Django raw queryset to select data from the database.
I need to translate one field (using ugettext) before I return the JSON-serialized data through Django REST framework as an API.
However, I have a performance problem: manually appending a dictionary to a list for every row takes quite a while when there are many database rows.
After some searching I found ujson, a library that claims to serialize JSON faster. However, I'm struggling to use it because I need the raw query to return the translated name of a field (fruits).
Does anyone have an idea how to replace this dictionary approach with a faster way to serialize the JSON data?
all_fruits = []
activate("en")
raw_query = MyObject.objects.raw("select id, fruits from my_table")
for each_name in raw_query:
    json_obj = dict(
        id=each_name.id,
        fruits=ugettext(each_name.fruits),
    )
    all_fruits.append(json_obj)

It's better to use raw SQL only when you really have no other solution. The Django QuerySet API is full-featured for database queries, and the following could fit your needs:
all_fruits = []
activate("en")
# values() yields dictionaries, so use key access rather than attribute access
for obj in MyObject.objects.values("id", "fruits"):
    all_fruits.append({"id": obj["id"], "fruits": ugettext(obj["fruits"])})
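If the same fruit name appears in many rows, one possible way to reduce per-row work is to translate each distinct value only once. The following is a minimal sketch under assumptions: `rows` stands in for what `values("id", "fruits")` would yield, and `translate` with its `FAKE_CATALOG` is a hypothetical stand-in for `ugettext`. It makes no claim about the speedup you would actually see.

```python
import json

# Hypothetical stand-in for ugettext; a real project would pass
# django.utils.translation.ugettext instead.
FAKE_CATALOG = {"manzana": "apple", "pera": "pear"}
calls = []  # records how often the translation function is invoked


def translate(text):
    calls.append(text)
    return FAKE_CATALOG.get(text, text)


def serialize_rows(rows, translate_func):
    """Build the JSON payload, translating each distinct name once."""
    cache = {}
    result = []
    for row in rows:
        name = row["fruits"]
        if name not in cache:
            cache[name] = translate_func(name)
        result.append({"id": row["id"], "fruits": cache[name]})
    return json.dumps(result)


# Assumed shape of the rows returned by values("id", "fruits")
rows = [
    {"id": 1, "fruits": "manzana"},
    {"id": 2, "fruits": "pera"},
    {"id": 3, "fruits": "manzana"},
]
payload = serialize_rows(rows, translate)
print(payload)
```

Here three rows trigger only two translation calls, because "manzana" is looked up from the cache the second time.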

Related

How to query a JSONField comprising a list of values in DJANGO

I am using JSONField to store the configuration parameter(user_types) as a list as follows:
["user_type1", "user_type2", "user_type3"]
How do I query for rows whose list contains "user_type1"? The following query is not working:
rows=ConfigUserTable.objects.filter(user_types__in=["user_type1"])
Thanks
Use the contains lookup, which is overridden on JSONField. For example, the following may work:
ConfigUserTable.objects.filter(user_types__contains="user_type1")
However, this might depend on how you store the JSON data in the field. If you store the data as a dict, querying on a key will certainly work. For example, data stored in the field user_types in this format:
{"types": ["user_type1", "user_type2", "user_type3"]}
could be queried like so:
ConfigUserTable.objects.filter(user_types__types__contains="user_type1")
Reference: https://docs.djangoproject.com/en/dev/topics/db/queries/#std:fieldlookup-jsonfield.contains
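To illustrate why the `__in` query from the question finds nothing while `contains` can match, here is a simplified plain-Python emulation of the two lookups for a flat JSON list. It is only a sketch of the semantics, not Django code, and the helper names `matches_in` and `matches_contains` are made up for this example.

```python
def matches_in(field_value, candidates):
    """Emulate field__in=candidates: the WHOLE stored value must equal a candidate."""
    return any(field_value == candidate for candidate in candidates)


def matches_contains(field_value, wanted):
    """Simplified emulation of field__contains=wanted for a flat JSON list."""
    wanted_items = wanted if isinstance(wanted, list) else [wanted]
    return isinstance(field_value, list) and all(
        item in field_value for item in wanted_items
    )


stored = ["user_type1", "user_type2", "user_type3"]

# The stored list is not equal to the string "user_type1", so __in does not match.
print(matches_in(stored, ["user_type1"]))      # False
# The stored list does contain "user_type1", so contains matches.
print(matches_contains(stored, "user_type1"))  # True
```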

django orm group by json key in json field

I'm using a JSON field on my Django model:
class JsonTable(models.Model):
    data = JSONField()
    type = models.IntegerField()
I tried the following query, which works for normal SQL fields:
JsonTable.objects.filter(type=1).values('type').annotate(Avg('data__superkey'))
But this throws the following error:
FieldError: Cannot resolve keyword 'superkey' into field. Join on 'data' not permitted.
Is there a way to group by a JSON key using the Django ORM or some Python library, without using raw SQL?
Versions: Django 1.9b, PostgreSQL 9.4
UPDATE
Example 2:
JsonTable.objects.filter(type=1).values('data__happykey').annotate(Avg('data__superkey'))
throws the same error on happykey
After some research I found the following solution:
from django.db.models import Count
from django.contrib.postgres.fields.jsonb import KeyTextTransform
superkey = KeyTextTransform('superkey', 'data')
table_items = JsonTable.objects.annotate(superkey = superkey).values('superkey').annotate(Count('id')).order_by()
I am not sure about order_by(), but the documentation says it is needed: it clears any default ordering, which would otherwise be added to the GROUP BY.
For other aggregation functions a type cast is needed:
from django.db.models import IntegerField
from django.db.models.functions import Cast
superkey = Cast(KeyTextTransform('superkey', 'data'), IntegerField())
I tested this with another model, so I hope I wrote this code without misprints. PostgreSQL 9.6, Django 2.0.7
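To make the reason for the cast concrete: the key is extracted as text, so a numeric aggregate needs a conversion first. The following is a plain-Python emulation of what such a grouped query computes, not Django code; the sample rows and the helper names `key_text` and `average_by_key` are made up for illustration.

```python
from collections import defaultdict

# Made-up rows standing in for JsonTable records
rows = [
    {"type": 1, "data": {"happykey": "a", "superkey": 10}},
    {"type": 1, "data": {"happykey": "a", "superkey": 20}},
    {"type": 1, "data": {"happykey": "b", "superkey": 5}},
    {"type": 2, "data": {"happykey": "a", "superkey": 100}},
]


def key_text(data, key):
    """Mimic text extraction of a JSON key: the value comes back as a string."""
    return str(data[key])


def average_by_key(rows, group_key, value_key, type_filter):
    """Group rows by one JSON key and average another, like GROUP BY + AVG."""
    groups = defaultdict(list)
    for row in rows:
        if row["type"] != type_filter:
            continue
        group = key_text(row["data"], group_key)
        # The text value must be converted before averaging (the "Cast" step).
        value = int(key_text(row["data"], value_key))
        groups[group].append(value)
    return {group: sum(values) / len(values) for group, values in groups.items()}


averages = average_by_key(rows, "happykey", "superkey", type_filter=1)
print(averages)  # {'a': 15.0, 'b': 5.0}
```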
If you are using this package, https://github.com/bradjasper/django-jsonfield,
nothing in its code handles such simulated related lookups (data__some_json_key).
Because the JSON data is stored as text, you will have to use raw SQL or, better, the queryset extra() method, but parsing JSON in SQL seems difficult.

Does django core pagination retrieve all data first?

I am using Django 1.5 and need to split data into pages. I read the docs here, but I am not sure whether the paginator retrieves all the data first. Since I have a large table, it would be better to use something like 'limit'. Thanks.
EDIT
I am using a queryset in a model manager.
example:
class KeywordManager(models.Manager):
    def currentkeyword(self, kw, bd, ed):
        wholeres = super(KeywordManager, self).get_query_set() \
            .values("sc", "begindate", "enddate") \
            .filter(keyword=kw, begindate__gte=bd, enddate__lte=ed) \
            .order_by('enddate')
        return wholeres
First, a queryset is a lazy object: Django retrieves the data only when you actually request it, and until then it does not hit the DB. If you call a list operation such as len() on a queryset, you evaluate the whole queryset and force Django to retrieve all the data.
If you pass a queryset to the Paginator, it does not retrieve all the data because, as the docs say, it uses the queryset's .count() method instead of converting the queryset into a list and calling len().
If your data is not coming from the database, then yes - Paginator will have to load all the information first in order to determine how to "split" it.
If your data does come from the database and you're simply using Django's auto-generated SQL, then the Paginator performs a query to determine the number of items in the database (i.e. an SQL COUNT()) and uses the page size you supplied to determine how many pages to generate. Example: count() returns 43, and you want pages of 10 results; the number of pages generated is ceil(43 / 10) = 5.
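The page arithmetic can be sketched in plain Python. This is a simplified illustration rather than the Paginator's actual code: the helper names `page_count` and `page_bounds` are made up, and options such as orphans are ignored.

```python
import math


def page_count(total_items, per_page):
    """Number of pages needed; an empty result still yields one (empty) page."""
    return max(1, math.ceil(total_items / per_page))


def page_bounds(page_number, per_page, total_items):
    """Return (offset, limit) for a 1-based page number, as used by OFFSET/LIMIT."""
    offset = (page_number - 1) * per_page
    limit = max(0, min(per_page, total_items - offset))
    return offset, limit


print(page_count(43, 10))      # 5
print(page_bounds(1, 10, 43))  # (0, 10)
print(page_bounds(5, 10, 43))  # (40, 3)
```

Only the rows inside the computed bounds need to be fetched for a given page, which is why a large table does not have to be loaded in full.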

Need Optimization & Performance related to Django ORM Query

I am writing an API in Django 1.4.5 which returns JSON data to a third-party application.
This is my current method of retrieving the data, but it is slow because I also need related data to be available in the JSON.
def get_speakers(request):
    speakers = Person.objects.filter(profile__person_type__name='Speaker').select_related('series')
    data_list = []
    for speaker in speakers:
        # build one dictionary per speaker, including the related series
        data_list.append({
            'first_name': speaker.first_name,
            'last_name': speaker.last_name,
            'series_name': speaker.series.name,
            'series_id': speaker.series.id,
        })
    return JSONResponse(data_list)
To optimize this I tried the following.
def get_speakers(request):
    speakers = Person.objects.filter(profile__person_type__name='Speaker').select_related('series')
    data_dict = serializers.serialize("python", speakers)
    return JSONResponse(data_dict)
But it returns only the foreign key for related data in the JSON, which is useless because I can't get the related data from it.
I also tried raw SQL, but then the data comes back as tuples and we need it in dictionary format.
I need help to achieve this.
Thanks in advance.
If I were you, I would use the following.
def get_speakers(request):
    speakers = Person.objects.filter(profile__person_type__name='Speaker').values('first_name', 'last_name', 'series__name', 'series__id')
    # evaluate the queryset into a plain list of dictionaries before serializing
    return JSONResponse(list(speakers))
In the Django ORM, field names themselves use only a single underscore, because the double underscore has a special meaning: it follows relations and introduces lookups. In your example series is a foreign key, so when filtering or selecting values you can write 'series__name' to reach the name field of the related record.
Besides the values method there is also a values_list method. Use whichever suits you best.
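On the raw SQL attempt mentioned in the question: rows fetched from a DB-API cursor are tuples, but they can be paired with the column names in `cursor.description` to get dictionaries. The sketch below uses the standard library's `sqlite3` with an in-memory, made-up `speaker` table purely so that it runs on its own; with Django the cursor would presumably come from `django.db.connection` instead.

```python
import sqlite3


def dictfetchall(cursor):
    """Return all rows from a cursor as a list of dictionaries."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Made-up table and data, only for demonstration
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE speaker (first_name TEXT, last_name TEXT, series_id INTEGER)"
)
conn.execute("INSERT INTO speaker VALUES ('Ada', 'Lovelace', 1)")

cursor = conn.execute("SELECT first_name, last_name, series_id FROM speaker")
speakers = dictfetchall(cursor)
print(speakers)
# [{'first_name': 'Ada', 'last_name': 'Lovelace', 'series_id': 1}]
```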

How to serialize and deserialize Django ORM query (not queryset)?

My use case is that I need to store queries in the DB, retrieve them from time to time, and evaluate them. That's needed for a mailing app where every user can subscribe to website content selected by an individually customized query.
The most basic solution is to store raw SQL and use it with RawQuerySet. But I wonder, are there better solutions?
At first glance, it is really dangerous to hand the query-building job to others, since they can do anything (even delete all the data in your database, drop an entire table, etc.).
Even if you let them build only a specific part of the query, it is still open to SQL injection. If you accept all those dangers, you may try the following.
This is an old script I used to let users set a specific part of the query. The basics are string.Template and eval (the evil part).
Define your Model:
class SomeModel(Model):
    usr = ForeignKey(User)
    ct = ForeignKey(ContentType)  # we will choose the related DB table with this
    extra_params = TextField()  # store extra filtering criteria here
Let's execute all queries that belong to a user. Say we have a User query whose extra_params use is_staff and username__icontains:
usr: somebody
ct: User
extra_params: is_staff=$stff_stat, username__icontains='$uname'
$ defines placeholders in extra_params
from string import Template
# filter somebody's queries (assuming 'somebody' is the user's username)
for _qry in SomeModel.objects.filter(usr__username='somebody'):
    cts = Template(_qry.extra_params)  # take extras with Template
    f_cts = cts.substitute(stff_stat=True, uname='Lennon')  # substitute placeholders with real-time filtering values
    # f_cts is now `is_staff=True, username__icontains='Lennon'`
    # Use Template again to place our extras into a Django `filter` query.
    # The related model is selected here with `_qry.ct.model_class()`.
    qry = Template('_qry.ct.model_class().objects.filter($f_cts)')
    exec_qry = qry.substitute(f_cts=f_cts)
    # this is now equivalent to `User.objects.filter(is_staff=True, username__icontains='Lennon')`
    query = eval(exec_qry)  # let's evaluate it!
If you have all the related imports done, then you can use Q or any other query-building option in your extra_params. You can also use other methods to form create or update queries.
You can read more about Template in the Python documentation for the string module. But as I said, it is REALLY DANGEROUS to give such an option to other users.
You may also need to read about the Django contenttypes framework.
Update: As @GillBates mentioned, you can use a dictionary structure to create the query. In this case you will not need Template anymore. You can use JSON for such data transfer (or any other format if you wish). Assuming you use JSON to get the data from an outside source, the following code is a rough draft that uses some variables from the code block above.
input_data = '{"is_staff": true, "username__icontains": "Lennon"}'
import json
_data = json.loads(input_data)
result_set = _qry.ct.model_class().objects.filter(**_data)
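Since the stored parameters come from users, it may be worth checking them before they reach `filter(**...)`. The sketch below is one possible approach and not part of the original script: `ALLOWED_LOOKUPS` and `parse_filter_params` are hypothetical names, and the whitelist contents are made up for illustration.

```python
import json

# Hypothetical whitelist: the only lookups users may store for the User model.
ALLOWED_LOOKUPS = {"is_staff", "username__icontains", "email__iexact"}


def parse_filter_params(raw_json, allowed=ALLOWED_LOOKUPS):
    """Parse stored JSON and reject anything that is not a whitelisted lookup."""
    params = json.loads(raw_json)
    if not isinstance(params, dict):
        raise ValueError("filter parameters must be a JSON object")
    unknown = set(params) - set(allowed)
    if unknown:
        raise ValueError("lookups not allowed: %s" % ", ".join(sorted(unknown)))
    return params


input_data = '{"is_staff": true, "username__icontains": "Lennon"}'
filter_kwargs = parse_filter_params(input_data)
print(filter_kwargs)  # {'is_staff': True, 'username__icontains': 'Lennon'}

# With Django available, the result would then be passed on as:
# result_set = _qry.ct.model_class().objects.filter(**filter_kwargs)
```

Because the values are passed as keyword arguments rather than evaluated as code, this avoids the eval step entirely; the whitelist only limits which fields and lookups a stored query may touch.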
According to your answer,
User passes some content-specific parameters into a form, then the view function that receives the POST constructs the query
one option is to store the parameters (pickled, as JSON, or in a model) and reconstruct the query with regular Django means. This is a somewhat more robust solution, since it can handle some data structure changes.
You could create a new model user_option and store the selections in this table.
From your question, it's hard to determine whether this is a better solution, but it would make your users' choices more explicit in your data structure.
