Dynamic OR filtering with SQLAlchemy - python

I have a list of SQLAlchemy model attribute names. For example:
my_list = ['firstName', 'lastName']
I also have a Person SQLAlchemy model with firstName and lastName attributes.
I want to search my database for people with a query like the following:
session.query(Person).filter(Person.lastName.like(query+'%') | Person.firstName.like(query+'%')).all()
The tricky part is that I want to generate the above filter dynamically from my_list. For example, if emailAddress is added to the list, I want the query to also search the object's emailAddress property.

With SQLAlchemy this is pretty easy if you know about reduce; use getattr to get the dynamically named column from the Person class.
from functools import reduce # python 3
from operator import or_
columns = ['firstName', 'lastName']
# make a list of `Person.c.like(query+'%')`
likes = [getattr(Person, c).like(query+'%') for c in columns]
# join them with | using reduce; does the same as likes[0]|likes[1]|....
final_filter = reduce(or_, likes)
session.query(Person).filter(final_filter).all()
Though SQLAlchemy's own or_ (from sqlalchemy import or_, not operator.or_, which takes exactly two arguments) accepts any number of clauses to OR together, so you can use argument unpacking too:
final_filter = or_(*likes)
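Putting the pieces together, here is a minimal self-contained sketch of the dynamic filter against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer; the Person columns, the sample rows, and the search_people helper are illustrative names, not part of the original question.

```python
from sqlalchemy import Column, Integer, String, create_engine, or_
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Person(Base):
    __tablename__ = "person"
    id = Column(Integer, primary_key=True)
    firstName = Column(String)
    lastName = Column(String)
    emailAddress = Column(String)


def search_people(session, columns, query):
    """Return people where any of the named columns starts with `query`."""
    # Look up each column on the model by name and build one LIKE clause per column.
    likes = [getattr(Person, name).like(query + "%") for name in columns]
    # sqlalchemy.or_ accepts any number of clauses.
    return session.query(Person).filter(or_(*likes)).order_by(Person.id).all()


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Person(firstName="Ann", lastName="Smith", emailAddress="as@example.com"),
        Person(firstName="Bob", lastName="Anders", emailAddress="bob@example.com"),
        Person(firstName="Cy", lastName="Jones", emailAddress="Andy@example.com"),
    ])
    session.commit()
    by_name = [p.firstName for p in search_people(session, ["firstName", "lastName"], "An")]
    by_name_or_email = [
        p.firstName
        for p in search_people(session, ["firstName", "lastName", "emailAddress"], "An")
    ]

print(by_name)
print(by_name_or_email)
```

Adding "emailAddress" to the list is the only change needed to widen the search, which is the behaviour the question asks for.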

Related

How to query a table that has ENUM column and keep the ENUM type?

I'm using SQLAlchemy ORM.
I have a table in SQL DB with an id column, and a column called b, which is type enum, and can take values ('example_1', 'example_2').
In Python, I have an Enum class like this:
class BTypes(enum.Enum):
    EXAMPLE_1 = 'example_1'
    EXAMPLE_2 = 'example_2'
For querying the table, I have an ORM like this:
class Example(Base):
    __tablename__ = "example"
    id = Column(Integer, primary_key=True)
    b = Column(Enum(BTypes).values_callable)
When I do session.query(Example).all(), the objects that I get back have str type for the b attribute. In other words:
data = session.query(Example).all()
print(data[0].b)
# Outputs
# example_1
I want the b attribute of the Example object to be an enum member, not a str. What is the best way to achieve this?
Base.metadata.create_all(create_engine("sqlite://")) with:
b = Column(Enum(BTypes).values_callable)
gives me:
sqlalchemy.exc.CompileError: (in table 'example', column 'b'): Can't generate DDL for NullType(); did you forget to specify a type on this Column?
About NullType
Since Enum(BTypes).values_callable is None, SQLAlchemy defaults to NullType.
From https://docs.sqlalchemy.org/en/14/core/type_api.html#sqlalchemy.types.NullType:
The NullType can be used within SQL expression invocation without issue, it just has no behavior either at the expression construction level or at the bind-parameter/result processing level.
In other words, when we query, its value is simply assigned as-is from the database.
How to use the Enum.values_callable parameter
From https://docs.sqlalchemy.org/en/14/core/type_basics.html#sqlalchemy.types.Enum:
In order to persist the values and not the names, the Enum.values_callable parameter may be used. The value of this parameter is a user-supplied callable, which is intended to be used with a PEP-435-compliant enumerated class and returns a list of string values to be persisted. For a simple enumeration that uses string values, a callable such as lambda x: [e.value for e in x] is sufficient.
That would be:
b = Column(Enum(BTypes, values_callable=lambda x: [e.value for e in x]))
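As a quick check of that column definition, the following sketch round-trips one row through an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer; the variable names loaded and stored are illustrative.

```python
import enum

from sqlalchemy import Column, Enum, Integer, create_engine, text
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class BTypes(enum.Enum):
    EXAMPLE_1 = "example_1"
    EXAMPLE_2 = "example_2"


class Example(Base):
    __tablename__ = "example"
    id = Column(Integer, primary_key=True)
    # Persist the enum values ('example_1') instead of the names ('EXAMPLE_1').
    b = Column(Enum(BTypes, values_callable=lambda cls: [e.value for e in cls]))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Example(b=BTypes.EXAMPLE_1))
    session.commit()
    # Through the ORM the attribute comes back as an enum member.
    loaded = session.query(Example).first().b
    # A raw SQL query shows what is actually stored in the column.
    stored = session.execute(text("SELECT b FROM example")).scalar()

print(repr(loaded))
print(repr(stored))
```

The ORM attribute is a BTypes member while the database column holds the plain string value.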
Modify your model as below so that queries return b as an Enum:
class Example(Base):
    __tablename__ = "example"
    id = Column(Integer, primary_key=True)
    b = Column(Enum(BTypes))
Note that values_callable typically returns a list of string values.
Have a look at the documentation for more information:
values_callable –
A callable which will be passed the PEP-435 compliant enumerated type, which should then return a list of string values to be persisted. This allows for alternate usages such as using the string value of an enum to be persisted to the database instead of its name.

Delete duplicate data from a queryset in Django (code not working) [duplicate]

Suppose we have a model in Django defined as follows:
class Literal(models.Model):
    name = models.CharField(...)
    ...
The name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate value of the name field.
I know how to do it using plain SQL (maybe not the best solution):
select * from literal where name IN (
    select name from literal group by name having count(name) > 1
);
So, is it possible to select this using the Django ORM? Or is there a better SQL solution?
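For reference, the plain SQL above can be tried out with the standard library's sqlite3 module. This is only a sketch: the literal table and its sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE literal (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO literal (name) VALUES (?)",
    [("a",), ("b",), ("a",), ("c",), ("b",), ("a",)],
)

# Keep every row whose name occurs more than once in the table.
rows = conn.execute(
    """
    SELECT id, name FROM literal
    WHERE name IN (
        SELECT name FROM literal GROUP BY name HAVING COUNT(name) > 1
    )
    ORDER BY id
    """
).fetchall()

print(rows)
```

Only the row with the unique name "c" is left out of the result.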
Try:
from django.db.models import Count
(Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet with only name and count. However, you can then use this to construct a regular QuerySet by feeding it back into another query:
dupes = (Literal.objects.values('name')
    .annotate(Count('id'))
    .order_by()
    .filter(id__count__gt=1))
Literal.objects.filter(name__in=[item['name'] for item in dupes])
This was rejected as an edit. So here it is as a better answer
dups = (
    Literal.objects.values('name')
    .annotate(count=Count('id'))
    .values('name')
    .order_by()
    .filter(count__gt=1)
)
This will return a ValuesQuerySet with all of the duplicate names. However, you can then use this to construct a regular QuerySet by feeding it back into another query. The django ORM is smart enough to combine these into a single query:
Literal.objects.filter(name__in=dups)
The extra call to .values('name') after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.
Try using aggregation:
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
In case you use PostgreSQL, you can do something like this:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
    .annotate(ids=ArrayAgg('id'))
    .annotate(c=Func('ids', Value(1), function='array_length'))
    .filter(c__gt=1)
    .annotate(ids=Func('ids', function='unnest'))
    .values_list('ids', flat=True))
It results in this rather simple SQL query:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
OK, so for some reason none of the above worked for me; it always returned <MultilingualQuerySet []>. I use the following solution, which is much easier to understand but not as elegant:
dupes = []
uniques = []
dupes_query = MyModel.objects.values_list('field', flat=True)
# iterate over the raw values, not a set of them, otherwise no value is ever seen twice
for value in dupes_query:
    if value not in uniques:
        uniques.append(value)
    else:
        dupes.append(value)
print(set(dupes))
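The same in-Python idea can be written more compactly with collections.Counter. In this sketch a plain list stands in for the values that values_list('field', flat=True) would return; find_duplicates is a hypothetical helper name.

```python
from collections import Counter


def find_duplicates(values):
    """Return the set of values that occur more than once."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n > 1}


# Stand-in for: MyModel.objects.values_list('field', flat=True)
names = ["a", "b", "a", "c", "b", "a"]
dupes = find_duplicates(names)
print(dupes)
```

Counter counts every value in a single pass, which avoids the repeated list membership tests of the loop above.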
If you only want the list of names rather than the objects, you can use the following query:
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat=True)

Django __in but return the first matching element

I have this model
from django.db import models
class TranslatedString(models.Model):
    lang = models.CharField()
    key = models.CharField()
    value = models.CharField()
I have these instances of this model:
a = TranslatedString(lang="en_US", key="my-string", value="hello world")
b = TranslatedString(lang="en_AU", key="my-string", value="g'day world")
c = TranslatedString(lang="ja_JP", key="my-string", value="こんにちは世界")
And I have this list of languages a user wants
preferred_langs = ["en_CA", "en_US", "en_AU", "fr_CA"]
which is ordered by preference. I would like to return the value that matches the first item in that list. Even though both a and b would match a query like
TranslatedString.objects.filter(key="my-string", lang__in=preferred_langs).first()
I want it to be ordered by the list, so that I always get a.
I can make a query for each element in preferred_langs and return as soon as I find a matching value, but is there a better option? I'd like to do it in one query.
You can build one When object per preferred language, mapping each language to its index in the list, and pass them all to a Case expression. Annotating the queryset with that Case lets you order the filtered result by preference:
from django.db.models import Case, Value, When
TranslatedString.objects.filter(key="my-string", lang__in=preferred_langs).annotate(
    preference=Case(*(When(lang=lang, then=Value(i)) for i, lang in enumerate(preferred_langs)))
).order_by('preference').first()
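To see the idea at the SQL level, here is a sketch using the standard library's sqlite3 module. It orders by a CASE expression with one branch per preferred language; the SQL that Django actually generates for the annotation may differ in detail, and the table name is an assumption.

```python
import sqlite3

preferred_langs = ["en_CA", "en_US", "en_AU", "fr_CA"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE translated_string (lang TEXT, key TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO translated_string VALUES (?, ?, ?)",
    [
        ("en_AU", "my-string", "g'day world"),
        ("en_US", "my-string", "hello world"),
        ("ja_JP", "my-string", "こんにちは世界"),
    ],
)

# One "WHEN <lang> THEN <index>" branch per preferred language.
whens = " ".join(f"WHEN ? THEN {i}" for i in range(len(preferred_langs)))
placeholders = ", ".join("?" for _ in preferred_langs)
sql = (
    "SELECT lang, value FROM translated_string "
    f"WHERE key = ? AND lang IN ({placeholders}) "
    f"ORDER BY CASE lang {whens} END LIMIT 1"
)
# Parameters are bound in order of appearance: key, IN list, then CASE branches.
params = ["my-string", *preferred_langs, *preferred_langs]
best = conn.execute(sql, params).fetchone()
print(best)
```

Even though the en_AU row was inserted first, the en_US row wins because it has the lower index in preferred_langs.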
If you don't mind retrieving all preferred translations from the database, this may be accomplished tersely by sorting the models in Python:
preferred_langs = ["en_CA", "en_US", "en_AU", "fr_CA"]
strings = list(TranslatedString.objects.filter(key="my-string", lang__in=preferred_langs))
strings.sort(key=lambda s: preferred_langs.index(s.lang))
first_choice = strings[0]
print(first_choice.lang) # outputs "en_US"
This will perform a single (but potentially large) query. However, if the sequence of preferred languages is fairly short, the sorting should occur in negligible time.
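A variation on the in-Python approach, sketched with plain objects instead of model instances: precompute each language's rank once and take the minimum rather than sorting the whole list. FakeString is a hypothetical stand-in for TranslatedString.

```python
from dataclasses import dataclass


@dataclass
class FakeString:
    """Stand-in for a TranslatedString row fetched from the database."""
    lang: str
    value: str


preferred_langs = ["en_CA", "en_US", "en_AU", "fr_CA"]
# Map each language to its position so lookups do not rescan the list.
rank = {lang: i for i, lang in enumerate(preferred_langs)}

# Stand-in for the filtered queryset (already restricted to preferred_langs).
candidates = [
    FakeString("en_AU", "g'day world"),
    FakeString("en_US", "hello world"),
]

# default=None covers the case where no translation matched at all.
first_choice = min(candidates, key=lambda s: rank[s.lang], default=None)
print(first_choice.lang)
```

Using min with default=None also avoids the IndexError that strings[0] would raise when the query returns nothing.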

Django one to one relation queryset

I have following two models
class A(models.Model):
    name = models.CharField()
    age = models.SmallIntegerField()
class B(models.Model):
    a = models.OneToOneField(A)
    salary = models.IntegerField()
Now I have records in both of them. I want to query model A with a known id and get both the A and B records.
The SQL query is:
SELECT A.id, A.name, A.age, B.salary
FROM A INNER JOIN B ON A.id = B.a_id
WHERE A.id=1
Please provide me django query (by using orm). I want to achieve this with one queryset.
q = B.objects.filter(a__id=id).values('salary','a__id','a__name','a__age')
this will return a ValuesQuerySet
values
values(*fields) Returns a ValuesQuerySet — a QuerySet subclass that
returns dictionaries when used as an iterable, rather than
model-instance objects.
Each of those dictionaries represents an object, with the keys
corresponding to the attribute names of model objects.
You can actually print q.query to get the sql query behind the QuerySet, which in this case is exactly as you requested.
Please try this:
result = B.objects.filter(a__id=1).values('a__id', 'a__name', 'a__age', 'salary')
The result is a <class 'django.db.models.query.ValuesQuerySet'>, which is essentially a list of dictionaries with key as the field name and value as the actual value. If you want only the values, do this:
result = B.objects.filter(a__id=1).values_list('a__id', 'a__name', 'a__age', 'salary')
The result is a <class 'django.db.models.query.ValuesListQuerySet'>, and it's essentially a list of tuples.
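The join from the question and the two result shapes can be illustrated without Django using the standard library's sqlite3 module. This is only a sketch: the lowercase table names a and b and the sample rows are assumptions, and the dictionaries and tuples merely mimic what values() and values_list() return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be read by column name or by position
conn.executescript(
    """
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE b (
        id INTEGER PRIMARY KEY,
        a_id INTEGER UNIQUE REFERENCES a(id),
        salary INTEGER
    );
    INSERT INTO a VALUES (1, 'Ann', 30), (2, 'Bob', 40);
    INSERT INTO b VALUES (10, 1, 5000), (11, 2, 6000);
    """
)

rows = conn.execute(
    "SELECT a.id, a.name, a.age, b.salary "
    "FROM a INNER JOIN b ON a.id = b.a_id WHERE a.id = ?",
    (1,),
).fetchall()

as_dicts = [dict(row) for row in rows]    # shape comparable to values()
as_tuples = [tuple(row) for row in rows]  # shape comparable to values_list()
print(as_dicts)
print(as_tuples)
```

Note that the filter is on a.id, not on b's own primary key, which differ in this sample data (1 versus 10).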

SQLAlchemy/Elixir - querying to check entity's membership in a many-to-many relationship list

I am trying to construct a SQLAlchemy query to get the list of names of all professors who are assistant professors at MIT. Note that there can be multiple assistant professors associated with a certain course.
What I'm trying to do is roughly equivalent to:
uni_mit = University.get_by(name='MIT')
s = select([Professor.name],
           and_(Professor.in_(Course.assistants),
                Course.university == uni_mit))
session.execute(s)
This won't work, because in_ is only defined for an entity's fields, not for the whole entity. I can't use Professor.id.in_ either, as Course.assistants is a list of Professors, not a list of their ids. I also tried contains, but it didn't work either.
My Elixir model is:
class Course(Entity):
    id = Field(Integer, primary_key=True)
    assistants = ManyToMany('Professor', inverse='courses_assisted', ondelete='cascade')
    university = ManyToOne('University')
    ..
class Professor(Entity):
    id = Field(Integer, primary_key=True)
    name = Field(String(50), required=True)
    courses_assisted = ManyToMany('Course', inverse='assistants', ondelete='cascade')
    ..
This would be trivial if I could access the intermediate many-to-many entity (the condition would be and_(interm_table.prof_id == Professor.id, interm_table.course == Course.id)), but SQLAlchemy apparently hides this table from me.
I'm using Elixir 0.7 and SQLAlchemy 0.6.
Btw: This question is different from Sqlalchemy+elixir: How query with a ManyToMany relationship? in that I need to check the professors against all courses which satisfy a condition, not a single, static one.
You can find the intermediate table where Elixir has hidden it away, but note that it uses fully qualified column names (such as __package_path_with_underscores__course_id). To avoid this, define your ManyToMany using e.g.
class Course(Entity):
    ...
    assistants = ManyToMany('Professor', inverse='courses_assisted',
                            local_colname='course_id', remote_colname='prof_id',
                            ondelete='cascade')
and then you can access the intermediate table using
rel = Course._descriptor.find_relationship('assistants')
assert rel
table = rel.table
and can access the columns using table.c.prof_id, etc.
Update: Of course you can do this at a higher level, but not in a single query, because SQLAlchemy doesn't yet support in_ for relationships. For example, with two queries:
>>> mit_courses = set(Course.query.join(
...     University).filter(University.name == 'MIT'))
>>> [p.name for p in Professor.query if set(
...     p.courses_assisted).intersection(mit_courses)]
Or, alternatively:
>>> import itertools
>>> plist = [c.assistants for c in Course.query.join(
...     University).filter(University.name == 'MIT')]
>>> [p.name for p in set(itertools.chain(*plist))]
The first step creates a list of lists of assistants. The second step flattens the list of lists and removes duplicates through making a set.
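For readers on plain SQLAlchemy rather than Elixir, a single-query alternative can be sketched with the relationship comparators any() and has(). This assumes SQLAlchemy 1.4 or newer with declarative models and an explicit association table; the table name course_assistants, the university_id column, and the sample data are illustrative, and this has not been checked against Elixir 0.7 or SQLAlchemy 0.6.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Explicit many-to-many association table (hypothetical name).
course_assistants = Table(
    "course_assistants",
    Base.metadata,
    Column("course_id", ForeignKey("course.id"), primary_key=True),
    Column("prof_id", ForeignKey("professor.id"), primary_key=True),
)


class University(Base):
    __tablename__ = "university"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


class Professor(Base):
    __tablename__ = "professor"
    id = Column(Integer, primary_key=True)
    name = Column(String(50), nullable=False)
    courses_assisted = relationship(
        "Course", secondary=course_assistants, back_populates="assistants"
    )


class Course(Base):
    __tablename__ = "course"
    id = Column(Integer, primary_key=True)
    university_id = Column(ForeignKey("university.id"))
    university = relationship("University")
    assistants = relationship(
        "Professor", secondary=course_assistants, back_populates="courses_assisted"
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    mit, cmu = University(name="MIT"), University(name="CMU")
    ada, bob, cy = Professor(name="Ada"), Professor(name="Bob"), Professor(name="Cy")
    session.add_all([
        Course(university=mit, assistants=[ada, bob]),
        Course(university=mit, assistants=[ada]),
        Course(university=cmu, assistants=[cy]),
    ])
    session.commit()
    # Professors with at least one assisted course whose university is MIT.
    at_mit = Course.university.has(University.name == "MIT")
    query = (
        session.query(Professor.name)
        .filter(Professor.courses_assisted.any(at_mit))
        .order_by(Professor.name)
    )
    names = [name for (name,) in query]

print(names)
```

Because any() compiles to an EXISTS subquery, a professor who assists several MIT courses is returned only once, so no set-based deduplication is needed.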
