How to compare 2 columns of a table through dynamic querying

I have a dynamic-querying requirement where I would like to compare two columns of a table, say "column_a" and "column_b" (all columns are strings). The actual columns to compare are decided at run time.
I'm using kwargs to create a dictionary, but Django assumes that the RHS is a literal value and not a column of the same table. Using F() is an option, but I can't find any documentation on using F() in kwargs.
If I use kwargs = {'predicted_value': 'actual_value'},
'actual_value' is used as a literal string instead of a column name.
How do I use something like:
kwargs = {'predicted_value': F('actual_value')} and pass it as Model.objects.filter(**kwargs)
Alternatively, is there a way to use F('column') on the LHS?
e.g. Model.objects.filter(F(column1_name) = F(column2_name))

For a non-dynamic filter field, I would use:
from django.db.models import F
Model.objects.filter(some_col=F(kwargs.get('predicted_value')))
But if you need it all to be dynamic, you can try:
kwargs = {'predicted_value':F('actual_value')}
Model.objects.filter(**kwargs)
You can even access related fields:
kwargs = {'fk_field__somefield':F('actual_value')}
Model.objects.filter(**kwargs)

Related

Delete duplicate data from a queryset in Django (code not working) [duplicate]

Suppose we have a model in Django defined as follows:
class Literal(models.Model):
    name = models.CharField(...)
    ...
The name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (maybe not the best solution):
select * from literal where name IN (
    select name from literal group by name having count(name) > 1
);
So, is it possible to select this using the Django ORM? Or is there a better SQL solution?

Try:
from django.db.models import Count
(Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only the name and the count (id__count). However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = (Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
Literal.objects.filter(name__in=[item['name'] for item in dupes])

This was rejected as an edit, so here it is as a better answer:
dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The Django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange, but without it the subquery fails. The extra values() call tricks the ORM into selecting only the name column for the subquery.

Try using aggregation:
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)

In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
                 .annotate(ids=ArrayAgg('id'))
                 .annotate(c=Func('ids', Value(1), function='array_length'))
                 .filter(c__gt=1)
                 .annotate(ids=Func('ids', function='unnest'))
                 .values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1

Ok, so for some reason none of the above worked for me; it always returned <MultilingualQuerySet []>. I use the following solution, which is much easier to understand but not as elegant:
dupes = []
uniques = []
dupes_query = MyModel.objects.values_list('field', flat=True)
# Iterate over every value (not over a set), so that repeats are actually seen
for dupe in dupes_query:
    if dupe not in uniques:
        uniques.append(dupe)
    else:
        dupes.append(dupe)
print(set(dupes))
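
The same in-Python counting idea can be sketched with collections.Counter. This is a minimal sketch rather than a definitive implementation: the helper name find_duplicates and the sample list are made up for illustration, and in the Django case the input would be the flat values_list shown above.

```python
from collections import Counter


def find_duplicates(values):
    """Return the set of values that occur more than once (hypothetical helper)."""
    counts = Counter(values)
    return {value for value, count in counts.items() if count > 1}


# Stand-in for MyModel.objects.values_list('field', flat=True)
names = ["alpha", "beta", "alpha", "gamma", "beta", "alpha"]
duplicate_names = find_duplicates(names)
```

Counting happens in Python, so every value is pulled from the database first; for large tables the annotate-based queries are likely the better fit.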

If you want only the list of duplicated names rather than the objects, you can use the following query:
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat=True)

How to Django ORM update() a value nested inside a JSONField with a numeric value?

I have a Django JSONField on PostgreSQL which contains a dictionary, and I would like to use queryset.update() to bulk update one (eventually, several) keys with numeric (eventually, computed) values. I see there is discussion about adding better support for this and an old extension, but for now it seems I need a DIY approach. Relying on those references, this is what I came up with:
from django.db.models import Func, Value, CharField, FloatField, F, IntegerField
class JSONBSet(Func):
    """
    Update the value of a JSONField for a specific key.
    """
    function = 'JSONB_SET'
    arity = 4
    output_field = CharField()

    def __init__(self, field, path, value, create: bool = True):
        # Build a PostgreSQL path literal such as '{nestedkey}'
        path = Value('{{{0}}}'.format(','.join(path)))
        create = Value(create)
        super().__init__(field, path, value, create)
This seems to work fine for non-computed "numeric string" values like this:
# This example sets the 'nestedkey' to numeric 199.
queryset.update(inputs=JSONBSet('inputs', ['nestedkey'], Value("199"), False))
and for carefully quoted strings:
# This example sets the 'nestedkey' to 'some string'.
queryset.update(inputs=JSONBSet('inputs', ['nestedkey'], Value('"some string"'), False))
But it does not work for a number:
queryset.update(inputs=JSONBSet('inputs', ['nestedkey'], Value(1), False))
{ProgrammingError}function jsonb_set(jsonb, unknown, integer, boolean) does not exist
LINE 1: UPDATE "paiyroll_payitem" SET "inputs" = JSONB_SET("paiyroll...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
As per the HINT, I tried an explicit cast Value(1, IntegerField()). I'm not sure where I am going wrong.
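
The two working calls above both pass JSON text as the value. A minimal sketch, assuming that pattern is acceptable: json.dumps can produce that text so the quoting does not have to be done by hand. The helper name to_jsonb_text is hypothetical, and this does not cover values computed in the database (for example from F() expressions).

```python
import json


def to_jsonb_text(value):
    """Serialize a Python value to JSON text (hypothetical helper name)."""
    return json.dumps(value)


# Matches the hand-written literals from the working examples
number_text = to_jsonb_text(199)            # the text 199
string_text = to_jsonb_text("some string")  # the text "some string", quotes included
```

The result could then be wrapped as Value(to_jsonb_text(1)) in place of Value(1); this has not been run against the query above.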

SQLAlchemy JSON column - how to perform a contains query

I have the following table in MySQL (5.7.12):
class Story(db.Model):
    sections_ids = Column(JSON, nullable=False, default=[])
sections_ids is basically a list of integers [1, 2, ..., n].
I need to get all rows where sections_ids contains X.
I tried the following:
stories = session.query(Story).filter(
    X in Story.sections_ids
).all()
but it throws:
NotImplementedError: Operator 'contains' is not supported on this expression

Use JSON_CONTAINS(json_doc, val[, path]):
from sqlalchemy import func
# JSON_CONTAINS returns 0 or 1, not found or found. Not sure if MySQL
# likes integer values in WHERE, added == 1 just to be safe
session.query(Story).filter(func.json_contains(Story.sections_ids, X) == 1).all()
As you're searching an array at the top level, you do not need to give a path. Alternatively, starting with MySQL 8.0.17 you can use value MEMBER OF(json_array), but using it in SQLAlchemy is a little less ergonomic in my opinion:
from sqlalchemy import literal
# self_group forces generation of the parentheses that the syntax requires
session.query(Story).filter(literal(X).bool_op('MEMBER OF')(Story.sections_ids.self_group())).all()

For anyone who gets here but is using PostgreSQL instead:
your fields should be of the type sqlalchemy.dialects.postgresql.JSONB (and not sqlalchemy_utils.JSONType).
Then you can use the Comparator object associated with the field, with its contains (and other) operators.
Example:
Query(MyModel).filter(MyModel.managers.comparator.contains(["user#gmail.com"]))
(Note that the contained part must be a JSON fragment, not just a string.)

Django Q set too many values to unpack

I am new to Django and trying to filter on multiple fields that contain some text.
columns = ['ticketId', 'checkSum']
q_objects = [Q(fieldname +'__contains', myString) for fieldname in columns]
objects = objects.filter(reduce(operator.or_, q_objects))
I get
Exception Type: ValueError
Exception Value: too many values to unpack
on the last line, the "filter" call. Any ideas?

Try this:
Q(**{fieldname + '__contains': myString})
This is equivalent to providing a keyword argument, as you normally would when instantiating a Q object. For example:
Q(fieldname__contains=myString, another_fieldname__contains=myOtherString)
The Q object essentially needs pairs of values to work. Looking at the code it seems you can also use tuples of length two, like this (I haven't tested, though):
Q(("fieldname__contains", myString), ("another_fieldname__contains", myOtherString))
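
A plain-Python sketch of why a bare string fails where a pair works. It assumes, without checking Django's source here, that each positional argument is eventually unpacked into a (lookup, value) pair; unpack_lookup is a hypothetical stand-in for that step, not a Django function.

```python
def unpack_lookup(child):
    # Hypothetical stand-in: split one positional argument into lookup and value
    lookup, value = child
    return lookup, value


# A two-item tuple unpacks cleanly
pair_result = unpack_lookup(("ticketId__contains", "abc"))

# A bare string is unpacked character by character, which is too many values
try:
    unpack_lookup("ticketId__contains")
    string_error = None
except ValueError as exc:
    string_error = str(exc)
```

Under that assumption, Q(fieldname + '__contains', myString) passes two bare strings rather than one pair, which would explain the ValueError in the question.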

What is the model you are querying? It looks like you left that out.
The last line,
objects = objects.filter(reduce(operator.or_, q_objects))
should be something like
objects = MyModel.objects.filter(...)

Passing a string as a keyed argument to a function

I am trying to pass an argument to a method with a key to get a Django queryset. The key will depend on whatever the user passes in.
Here's an example:
The initial value for filters will be id=1 (a string). I split on commas in case the user passes in additional filters, such as title=blahblahblah:
filter_split = filters.split(",")
itemFilter = Items.objects # from Django
for f in filter_split:
    itemFilter = itemFilter.filter(f)
I have also tried splitting each string into two separate values (key and value) and passing them as such:
itemFilter = itemFilter.filter(key = value)
With no luck.
How can I pass programmatic arguments to a method in Python? Or is there another way to programmatically filter the queryset with Django?

You can pass programmatic arguments with *list and **dict in the argument list.
a = [2,3,4]
function(1, *a) # equal to function(1,2,3,4)
b = {'x':42, 'y':None}
function(1, **b) # equal to function(1, x=42, y=None)
In your case just create a dictionary, assign the key-value pairs from your user input and call itemFilter.filter(**your_dict).

It's not possible to pass a string e.g. "id=1" to filter.
You can create a dictionary dynamically, then pass it to filter using ** unpacking.
key = "id"
value = 1
kwargs = {key: value}
MyModel.objects.filter(**kwargs)
Your other approach, MyModel.objects.filter(key=value), doesn't work because it doesn't use the variable key; it tries to filter on a field literally named 'key'.
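
Putting the pieces together for the comma-separated string from the question, here is a minimal sketch of turning "id=1,title=blahblahblah" into a dictionary that can be unpacked with **. The function name parse_filters and its error handling are illustrative assumptions, not part of Django.

```python
def parse_filters(filters):
    """Turn 'id=1,title=foo' into {'id': '1', 'title': 'foo'} (hypothetical helper)."""
    kwargs = {}
    for item in filters.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty pieces such as a trailing comma
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        kwargs[key.strip()] = value.strip()
    return kwargs


filter_kwargs = parse_filters("id=1,title=blahblahblah")
```

The result would then be passed as Items.objects.filter(**filter_kwargs). All values stay strings here, and because the keys come from user input it is probably wise to check them against a list of allowed field names before filtering.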
