Update all models at once in Django

I am trying to update the position field of every object, in a specific order, in one go in Django (Python).
This is how I do it now, but the problem is that it issues one query per object:
servers = frontend_models.Server.objects.all().order_by('-vote_count')
i = 1
for server in servers:
    server.last_rank = i
    server.save()
    i += 1
Is there a way to do the update with something like
Model.objects.all().order_by('some_field').update(position=some_number_that_changes_for_each_object)
Thank you!

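You can use the F() expression from django.db.models to update every row in a single query:
from django.db.models import F
Model.objects.all().order_by('some_field').update(position=F('some_field') + 1)
This generates a single SQL UPDATE for all of the rows, so it is efficient for the database too. Note, however, that it sets position to each row's own some_field value plus one; it does not number the rows sequentially in the given order.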
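If what you need is a sequential rank rather than an expression on another column, one option is to number the objects in Python and write them back with bulk_update (available since Django 2.2). This is only a sketch: rerank is a hypothetical helper name, and the default field names are taken from the question.

```python
def rerank(manager, order_field="-vote_count", rank_field="last_rank", batch_size=500):
    """Number objects 1..N in the given order and write the ranks back in batches.

    `manager` is assumed to be a Django model manager or queryset
    (e.g. Server.objects); `rerank` itself is a hypothetical helper.
    """
    # One SELECT: fetch every object in ranking order.
    objs = list(manager.order_by(order_field))
    for position, obj in enumerate(objs, start=1):
        setattr(obj, rank_field, position)
    # bulk_update issues one UPDATE per batch instead of one per object.
    manager.bulk_update(objs, [rank_field], batch_size=batch_size)
    return len(objs)
```

With the question's model this would be called as rerank(frontend_models.Server.objects). It still loads every object into memory, so for very large tables a raw SQL update may be the better fit.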

As far as I know, Django's object-relational mapping system doesn't provide a way to express this update operation. But if you know how to express it in SQL, then you can run it via a custom SQL query:
from django.db import connection
cursor = connection.cursor()
cursor.execute('''UPDATE myapp_server ...''')
Different database engines express this operation in different ways. In MySQL you'd run this query:
SET @rownum = 0;
UPDATE myapp_server A,
       (SELECT id, @rownum := @rownum + 1 AS new_rank
        FROM myapp_server
        ORDER BY vote_count DESC) B
SET A.last_rank = B.new_rank
WHERE A.id = B.id
In PostgreSQL I think you'd use
UPDATE myapp_server A
SET last_rank = B.new_rank
FROM (SELECT id, row_number() OVER (ORDER BY vote_count DESC) AS new_rank
      FROM myapp_server) B
WHERE A.id = B.id
(but that's untested, so beware!).
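For anyone who wants to try the single-statement idea without a database server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption made so it runs anywhere; it needs SQLite 3.25 or newer for window functions). The table and column names follow the question, the sample rows are invented, and the UPDATE uses a correlated subquery rather than the MySQL or PostgreSQL join forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myapp_server "
    "(id INTEGER PRIMARY KEY, vote_count INTEGER, last_rank INTEGER)"
)
# Invented sample data: (id, vote_count).
conn.executemany(
    "INSERT INTO myapp_server (id, vote_count) VALUES (?, ?)",
    [(1, 10), (2, 50), (3, 30)],
)

# Rank every row over the whole table, then copy each row's rank back by id.
conn.execute(
    """
    UPDATE myapp_server
    SET last_rank = (
        SELECT ranked.new_rank
        FROM (
            SELECT id, ROW_NUMBER() OVER (ORDER BY vote_count DESC) AS new_rank
            FROM myapp_server
        ) AS ranked
        WHERE ranked.id = myapp_server.id
    )
    """
)

rows = conn.execute(
    "SELECT id, last_rank FROM myapp_server ORDER BY id"
).fetchall()
print(rows)
```

The server with the most votes ends up with last_rank 1, and the whole table is updated by one statement.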

Related

count subquery in sqlalchemy

I'm having some trouble translating a subquery into SQLAlchemy. I have two tables that both have a store_id column that is a foreign key (but it isn't a direct many-to-many relationship), and I need to return the id, store_id and name from table 1 along with the number of records from table 2 that have the same store_id. I know the SQL that I would use to return those records; I'm just not sure how to do it using SQLAlchemy.
SELECT
    table_1.id,
    table_1.store_id,
    table_1.name,
    (
        SELECT
            count(table_2.id)
        FROM
            table_2
        WHERE
            table_1.store_id = table_2.store_id
    ) AS store_count FROM table_1;

This post actually answered my question; I must have missed it when I was searching initially:
Generate sql with subquery as a column in select statement using SQLAlchemy
My solution is below.
store_count = session.query(func.count(Table2.id)).filter(Table2.store_id == Table1.store_id)
session.query(Table1.id, Table1.name, Table1.store_id, store_count.label("store_count"))

Update Django model based on the row number of rows produced by a subquery on the same model

I have a PostgreSQL UPDATE query which updates a field (global_ranking) of every row in a table, based on the ROW_NUMBER() of each row in that same table sorted by another field (rating). Additionally, the update is partitioned, so that the ranking of each row is relative only to those rows which belong to the same language.
In short, I'm updating the ranking of each player in a game, based on their current rating.
The PostgreSQL query looks like this:
UPDATE stats_userstats
SET
global_ranking = sub.row_number
FROM (
SELECT id, ROW_NUMBER() OVER (
PARTITION BY language
ORDER BY rating DESC
) AS row_number
FROM stats_userstats
) sub
WHERE stats_userstats.id = sub.id;
I'm also using Django, and it'd be fun to learn how to express this query using the Django ORM, if possible.
At first, it seemed like Django had everything necessary to express the query, including the ability to use PostgreSQL's ROW_NUMBER() window function, but my best attempt sets every row's ranking to 1:
from django.db.models import F, OuterRef, Subquery
from django.db.models.expressions import Window
from django.db.models.functions import RowNumber
UserStats.objects.update(
    global_ranking=Subquery(
        UserStats.objects.filter(
            id=OuterRef('id')
        ).annotate(
            row_number=Window(
                expression=RowNumber(),
                partition_by=[F('language')],
                order_by=F('rating').desc()
            )
        ).values('row_number')
    )
)
I used from django.db import connection; print(connection.queries) to see the query produced by that Django ORM statement, and got this vaguely similar SQL statement:
UPDATE "stats_userstats"
SET "global_ranking" = (
    SELECT ROW_NUMBER() OVER (
        PARTITION BY U0."language"
        ORDER BY U0."rating" DESC
    ) AS "row_number"
    FROM "stats_userstats" U0
    WHERE U0."id" = "stats_userstats"."id"
)
It looks like what I need to do is move the subquery from the SET portion of the query to the FROM, but it's unclear to me how to restructure the Django ORM statement to achieve that.
Any help is greatly appreciated. Thank you!

The Subquery filters the queryset by the provided OuterRef, so the window is computed over a single row. You always get 1 because each user is, in fact, first in a ranking in which only they are considered.
A "correct" query would be:
UserStats.objects.alias(
    row_number=Window(
        expression=RowNumber(),
        partition_by=[F('language')],
        order_by=F('rating').desc()
    )
).update(global_ranking=F('row_number'))
But Django will not allow that:
django.core.exceptions.FieldError: Window expressions are not allowed in this query
Related Django ticket: https://code.djangoproject.com/ticket/25643
I think you might comment there with your use case.
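To see in isolation why filtering inside the subquery always yields 1, here is a small sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption so that it runs anywhere; it needs SQLite 3.25 or newer). The table and column names follow the question; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stats_userstats "
    "(id INTEGER PRIMARY KEY, language TEXT, rating INTEGER)"
)
# Invented sample data: two English players and one French player.
conn.executemany(
    "INSERT INTO stats_userstats VALUES (?, ?, ?)",
    [(1, "en", 1500), (2, "en", 1700), (3, "fr", 1600)],
)

# Filter first, rank second: the window only ever sees the one matching row.
filtered_first = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (PARTITION BY language ORDER BY rating DESC)
    FROM stats_userstats
    WHERE id = ?
    """,
    (1,),
).fetchone()[0]

# Rank first, filter second: the window sees the whole partition.
ranked_first = conn.execute(
    """
    SELECT sub.rn
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY language ORDER BY rating DESC) AS rn
        FROM stats_userstats
    ) AS sub
    WHERE sub.id = ?
    """,
    (1,),
).fetchone()[0]

print(filtered_first, ranked_first)
```

Player 1 is second among the English players, which only the second query reports. The ORM statement in the question generates the first shape, while the hand-written PostgreSQL UPDATE uses the second.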

jaydebeapi Getting column alias names

Is there a way to return the aliased column names from a sql query returned from JayDeBeApi?
For example, I have the following query:
sql = """ SELECT visitorid AS id_alias FROM table LIMIT 1 """
I then run the following (connect_to_vdm() establishes a connection to my DB):
curs = connect_to_vdm().cursor()
curs.execute(sql)
vals = curs.fetchall()
I normally retrieve column names like so:
desc = curs.description
column_names = [col[0] for col in desc]
This returns the original column name "visitorid" and not the alias specified in the query "id_alias".
I know I could swap the names for the value in Python, but hoping to be able to have this done within the query since it is already defined in the Select statement. This behaves as expected in a SQL client, but I cannot seem to get the Aliases to return when using python/JayDeBeApi. Is there a way to do this using JayDeBeApi?
EDIT:
I have discovered that structuring my query with a CTE seems to help fix the problem, but still wondering if there is a more straightforward solution out there. Here is how I rewrote the same query:
sql = """ WITH cte (id_alias) AS (SELECT visitorid AS id_alias FROM table LIMIT 1) SELECT id_alias from cte"""

I was able to fix this using a CTE (Common Table Expression)
sql = """ WITH cte (id_alias) AS (SELECT visitorid AS id_alias FROM table LIMIT 1) SELECT id_alias from cte"""

Hat tip to pybokeh on Github, but this worked for me.
According to IBM (here and here), the behavior of JDBC drivers changed at some point. Bizarrely, the column aliases display just fine when using a tool like DBVisualizer, but not by querying through jaydebeapi.
To fix, add the following to the end of your DB URL:
:useJDBC4ColumnNameAndLabelSemantics=false;
Example:
jdbc:db2://[DBSERVER]:[PORT]/[DBNAME]:useJDBC4ColumnNameAndLabelSemantics=false;

sqlalchemy orm join error [duplicate]

Using sqlalchemy I would like to do something like:
q = session.query(a, b.id, func.count(a.id))
q = q.outerjoin(b, b.id == a.b_id)
q = q.group_by(b.id)
However, in most SQL implementations it is not possible to select fields that are not in the GROUP BY clause.
Can I tell SQLAlchemy to select from table a without selecting any field directly from a? In this case I could just change the join order, but I have some complex queries that aren't so easy to modify.

You can set the FROM clause explicitly with select_from:
session.query(b.id, func.count(a.id)).select_from(a).outerjoin(b, ...)...

Django ORM limiting queryset to only return a subset of data

I have the following query in a Django app. The user field is a foreign key. The results may contain 1000 MyModel objects, but only for a handful of users. I'd like to limit it to 5 MyModel objects returned per user in the user__in= portion of the query. I should end up with at most 5 × (number of users) MyModel objects.
lfs = MyModel.objects.filter(
user__in=[some,users,here,],
active=True,
follow=True,
)
Either through the ORM or SQL (using Postgres) would be acceptable.
Thanks
EDIT 2
Found a simpler way to get this done, which I've added as an answer below.
EDIT
Some of the links mentioned in the comments had some good information, although none really worked with Postgres or the Django ORM. For anyone else looking for this information in the future, my adaptation of the code in those other questions/answers is here.
To implement this in PostgreSQL 9.1, I had to create a couple of functions using PL/Perl (which also required me to install PL/Perl):
CREATE OR REPLACE FUNCTION set_int_var(name text, val bigint) RETURNS bigint AS $$
    if ($_SHARED{$_[0]} = $_[1]) {
        return $_[1];
    } else {
        return $_[1];
    }
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION get_int_var(name text) RETURNS bigint AS $$
    return $_SHARED{$_[0]};
$$ LANGUAGE plperl;
And my final query looks something like the following
SELECT x.id, x.ranking, x.active, x.follow, x.user_id
FROM (
    SELECT tbl.id, tbl.active, tbl.follow, tbl.user_id,
        CASE WHEN get_int_var('user_id') != tbl.user_id
        THEN
            set_int_var('rownum', 1)
        ELSE
            set_int_var('rownum', get_int_var('rownum') + 1)
        END AS ranking,
        set_int_var('user_id', tbl.user_id)
    FROM my_table AS tbl
    WHERE tbl.active = TRUE AND tbl.follow = TRUE
    ORDER BY tbl.user_id
) AS x
WHERE x.ranking <= 5
ORDER BY x.user_id
LIMIT 50
The only downside to this is that if I try to limit the users that it looks for by using user_id IN (), the whole thing breaks and it just returns every row, rather than just 5 per user.

This is what ended up working, and allowed me to only select a handful of users, or all users (by removing the AND mt.user_id IN () line).
SELECT * FROM mytable
WHERE (id, likeable, user_id, follow, active) IN (
    SELECT id, likeable, user_id, follow, active FROM mytable mt
    WHERE mt.user_id = mytable.user_id
    AND mt.user_id IN (1, 2)
    ORDER BY user_id LIMIT 5)
ORDER BY likeable

I think this is what you were looking for (I didn't see it in the other posts):
https://docs.djangoproject.com/en/dev/topics/db/queries/#limiting-querysets
In other examples, they convert the queryset to a list before slicing. If you slice the queryset itself, for example:
lfs = MyModel.objects.filter(
    user__in=[some,users,here,],
    active=True,
    follow=True,
)[:10]
the resulting SQL query has LIMIT 10 in it.
So, the query you are looking for would be something like this:
mymodel_ids = []
for user in users:
    # At most 5 ids for this user; the slice becomes LIMIT 5 in SQL.
    mymodel_5ids_for_user = MyModel.objects.filter(
        user=user,
        active=True,
        follow=True,
    ).values_list('id', flat=True)[:5]
    mymodel_ids.extend(mymodel_5ids_for_user)
# One final query to fetch the selected objects.
lfs = MyModel.objects.filter(id__in=mymodel_ids)
which leaves in lfs the MyModel objects you were looking for (at most 5 entries per user).
I think the number of queries is, at least, one per user plus one to retrieve all the MyModel objects with that filter.
Be careful with the ordering of the per-user query: if you change the order of the "mymodel_5ids_for_user" query, the first 5 elements can change.
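If one query per user is too many, a window function can do the per-user limit in a single query. The sketch below uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption so that it runs anywhere; it needs SQLite 3.25 or newer), with invented sample rows and a limit of 2 instead of 5 to keep the data small. ROW_NUMBER() OVER (PARTITION BY ...) has the same syntax in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "active INTEGER, follow INTEGER)"
)
# Invented sample data: users 1 and 2 have three rows each, user 3 has one.
conn.executemany(
    "INSERT INTO my_table (user_id, active, follow) VALUES (?, 1, 1)",
    [(1,), (1,), (1,), (2,), (2,), (2,), (3,)],
)

PER_USER = 2  # the question wants 5; 2 keeps the sample small

# Number each user's rows separately, then keep only the first PER_USER of each.
rows = conn.execute(
    """
    SELECT id, user_id
    FROM (
        SELECT id, user_id,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id) AS rn
        FROM my_table
        WHERE active = 1 AND follow = 1 AND user_id IN (1, 2)
    ) AS ranked
    WHERE rn <= ?
    ORDER BY user_id, id
    """,
    (PER_USER,),
).fetchall()
print(rows)
```

Removing the user_id IN (...) condition applies the same per-user limit to all users. The ORDER BY inside the OVER clause decides which rows count as the "first" ones for each user.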
