Python SQLAlchemy: how to query by excluding selected columns

I basically just need to know how to query by excluding selected columns. Is this possible?
Example: I have a table which has id, name, age, address, location, birth, sex... etc.
Instead of listing out the columns to retrieve, I'd like to just exclude some columns in the query (exclude age, for example).
Sample code:
db.session.query(User.username).filter_by(username = request.form['username'], password = request.form['password']).first()
The last thing I want to do is list all the attributes in the query() method, since this gets pretty long when you have lots of attributes; I just want to exclude a few columns instead.

Not sure why you're not just fetching the model. When doing that, you can defer loading of certain columns so that they are only queried on access.
db.session.query(User).options(db.defer('location')).filter_by(...).first()
In this example, accessing User.location the first time on an instance will issue another query to get the data.
See the documentation on column deferral: http://sqlalchemy.readthedocs.org/en/rel_0_9/orm/mapper_config.html?highlight=defer#column-deferral-api
Note that unless you're loading huge amounts of data, you won't see any speedup with this. It might actually make things slower since another query will be issued later. I have queries that load thousands of rows with eager-loaded relationships in less than 200ms, so this might be a case of premature optimization.
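If it helps, here is a minimal sketch of deferring more than one column and of opting back in with undefer (assuming the same Flask-SQLAlchemy setup; db.defer and db.undefer proxy the sqlalchemy.orm functions):
# Defer the wide or rarely used columns; each is loaded lazily on first access.
user = db.session.query(User).\
    options(db.defer('location'), db.defer('address')).\
    filter_by(username=request.form['username']).\
    first()

# Opt back in for a query where the column is needed up front.
user = db.session.query(User).options(db.undefer('location')).first()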

We can use the Inspection API to get the model's columns, and then create a list of columns that we want.
import sqlalchemy as sa

# Assumes a mapped User model and a configured Session factory.
exclude = {'age', 'registration_date'}
insp = sa.inspect(User)
include = [c for c in insp.columns if c.name not in exclude]

# Traditional ORM style
with Session() as s:
    q = s.query(*include)
    for row in q:
        print(row.id, row.name)
    print()

# 1.4 style
with Session() as s:
    q = sa.select(*include)
    for row in s.execute(q):
        print(row.id, row.name)
    print()
inspect returns the mapper for the model class; to work with non-column attributes like relationships use one of the mapper's other attributes, such as all_orm_descriptors.
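As a rough sketch of what the mapper exposes (continuing from the insp object above):
print(list(insp.columns.keys()))              # plain columns only
print(list(insp.relationships.keys()))        # relationship attributes
print(list(insp.all_orm_descriptors.keys()))  # columns, relationships, hybrids, ...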

If you're using an object serializer like marshmallow, it is easier to omit the unwanted fields during serialization.
https://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.EXCLUDE
The list of fields to omit can be built dynamically and excluded conditionally. Example:
ModelSchema(exclude=('field1', 'field2')).jsonify(records)
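For instance, a hedged sketch (UserSchema and the admin check are placeholders, not from the question):
# Build the exclude list at request time and dump without those fields.
fields_to_hide = () if current_user_is_admin else ('age', 'registration_date')
schema = UserSchema(exclude=fields_to_hide)
payload = schema.dump(records, many=True)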

I am not aware of a method that does that directly, but you can always get the column keys, exclude your columns, and then pass the resulting list back to the query. You don't need to know what is in the list while doing that.
q = db.session.query(blah blah...)
exclude = ['age']
targ_cols = [x for x in q.first().keys() if x not in exclude]
q.with_entities(*targ_cols).all()

Related

How to implement a specific SQL statement as a SQLAlchemy ORM query

I was following a tutorial to build my first Flask API (https://medium.com/#dushan14/create-a-web-application-with-python-flask-postgresql-and-deploy-on-heroku-243d548335cc). I did it, but now I want to write more customized queries with SQLAlchemy and PostgreSQL. My question is how I could do something like this:
query = text("""SELECT enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
FROM "Ventas"."Enc_Ventas" AS enc
INNER JOIN "General"."Persona" AS persona ON enc."PersonaId" = persona."PersonaId"
INNER JOIN "Ventas"."Metodo_Entrega" AS metodo ON enc."MetodoEntregaId" = metodo."MetodoEntregaId"
INNER JOIN "General"."Metodo_Pago" AS metodo_e ON enc."MetodoPagoId" = metodo_e."MetodoPagoId"
INNER JOIN "General"."Estatus" AS estado ON enc. """)
but with SQLAlchemy in order to use the models that I created previously. Thanks in advance for any answer!!
Edit:
The columns that I wish to see at the final result are: enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
I really wish I could share more info but sadly I can't at the moment.
Doing this from the ORM layer, you would reference model names (I match the names of your query above, but I'm sure some of the model/table names are off - now adjusted slightly).
Now revised to include only the specific columns you want to see (note that I ignore your SQL aliases; the ORM layer handles the actual query construction):
selection = session.query(Enc_Ventas, Persona.Persona_Nombre, Persona.Persona_Apellido,
                          Metodo_Entrega.MetEnt_Nombre, Metodo_Pago.MetPag_Descripcion).\
    join(Persona, Enc_Ventas.PersonaId == Persona.PersonaId).\
    join(Metodo_Entrega, Enc_Ventas.MetodoEntregaId == Metodo_Entrega.MetodoEntregaId).\
    join(Metodo_Pago, Enc_Ventas.MetodoPagoId == Metodo_Pago.MetodoPagoId).\
    join(Estatus).all()
You would reference the selection collection by iterating through its rows of tuples. A more robust and stable solution would be to transform each output row into a dict.
Otherwise, because whole models are included, each returned row can be accessed with dot notation using the model and column names from the query(), as shown below.
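For example, a rough sketch of both access styles on the query above (result rows are KeyedTuple/Row objects, so _asdict() is available):
for row in selection:
    venta = row.Enc_Ventas        # whole mapped model, dot notation
    nombre = row.Persona_Nombre   # individual labeled column
    row_dict = row._asdict()      # plain dict of the row, if preferred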
If you need further access to the columns of related tables, use the ORM technique .options(joinedload(...)), which brings in those additional columns in a single database query; you then reach them through the relationship name, also with dot notation.
You also need to define SQLAlchemy relationships in your models for this to work, as well as the underlying SQL foreign keys.
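A minimal sketch of what that could look like (the column and relationship names here are assumptions, not taken from your schema):
from sqlalchemy import Column, Integer, ForeignKey
from sqlalchemy.orm import relationship, joinedload

class Enc_Ventas(Base):  # Base is your declarative base
    __tablename__ = 'Enc_Ventas'
    __table_args__ = {'schema': 'Ventas'}
    EncVentasId = Column(Integer, primary_key=True)
    PersonaId = Column(Integer, ForeignKey('General.Persona.PersonaId'))
    persona = relationship('Persona')  # needs the foreign key above

# One query; the related Persona is loaded eagerly and reached by dot notation.
ventas = session.query(Enc_Ventas).options(joinedload(Enc_Ventas.persona)).all()
print(ventas[0].persona.Persona_Nombre)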
Much more detail and/or a more specific question is needed to help further, imo.

SQLAlchemy - Search entire DB model columns for a pattern

I have a database from which I need to return any value that matches the query or looks similar.
Using flask-sqlalchemy I can filter manually, but I'm having trouble getting a list of objects using list comprehensions or any other more pythonic method.
I have already tried creating a dict of all the model columns (of which I have a list) with the search value, and passing it to the filter_by query:
column_dict_and_value = {column_name:'value' for column_name in columns}
a = model.query.filter_by(**column_dict_and_value).all()
...but it doesn't even return an existing object, although
a = model.query.filter_by(column_name='value').all()
actually returns the object.
I tried filter() with .like('%') and it sort of works:
a = model.query.filter(model.column_name.like('valu%')).all()
returns a list of all the objects that match that pattern, but only for that one column, and I want to iterate over all columns. Basically a full-blown search, since I'm not sure what the user is going to look for, and I want to show the object if the query exists in any column, or a list of objects if the query is incomplete, like the * in any search.
I tried to use list comprehensions to iterate through the model attributes but I'm not allowed to do that apparently:
a = [model.query.filter(model.column.like('valu%')) for column in columns]
...this complains that column is not an attribute of the model, which makes sense due to the syntax, but I had to try anyway, innit?
I'm a newbie regarding databases and class objects, so please be gentle. I tried to look for similar answers but I can't find one that suits me. filter_by(one_column, second_column, etc.) doesn't look very pythonic to me, and if for any reason I change the model I need to change the query too, whereas building a dict comprehension from the model seems more foolproof.
SOLUTION
Based on calestini's proposed answer I wrote this code, which seems to do the trick. It also cleans the resulting list, because when .all() finds nothing for a column (since it looks in every field) an empty result would still be appended, and I'd rather keep the list clean.
Note: all my fields are text. Please check the third proposed solution from calestini if yours differ; I have not tested it, though.
columns = [
    "foo",
    "bar",
    "foobar",
]

def list_everything(search):
    d = {column: search for column in columns}
    raw = [
        model.query.filter(getattr(model, col).ilike(f"{val}%")).all()
        for col, val in d.items()
    ]
    return [item for item in raw if item]
I'll keep optimizing and will update this code if I come up with a better solution. Thanks a lot.
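For what it's worth, a small usage sketch (assuming the model's primary key is id; note that list_everything returns one list of matches per column, so you may want to flatten and deduplicate):
groups = list_everything("foo")
flat = list({obj.id: obj for group in groups for obj in group}.values())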
Probably the reason why you are not getting any result from
a = model.query.filter_by(**column_dict_and_value).all()
is because filter_by is testing for direct equality. In your case you are looking for a pattern, so you will need to loop and use filter as opposed to filter_by.
Assuming all your columns are String or Text types, you can try the following:
a = model.query
for col, val in column_dict_and_value.items():
    a = a.filter(getattr(model, col).ilike(f'{val}%'))  # ilike for case-insensitive matching
a = a.all()
Or vs And
The issue can also be that in your case you could be testing for intersection but expecting union. In other words, you are returning something only if the pattern matches in all columns. If instead you want a result in case any column matches, then you need to tweak the code a bit:
from sqlalchemy import or_

condition = or_(*[getattr(model, col).ilike(f'{val}%') for col, val in column_dict_and_value.items()])
a = model.query.filter(condition).all()
Now, if you actually have other data types among the columns, then you can first check each column's type and apply the same logic only where it makes sense:
for col, val in column_dict_and_value.items():
    ## check if the column's Python-side type is str
    if class_.__table__.c[col].type.python_type is str:
        ...
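Putting those pieces together, a rough sketch (the helper name is mine, and it assumes a Flask-SQLAlchemy model whose text columns should be searched):
from sqlalchemy import or_

def search_text_columns(model, term):
    def is_text(col):
        try:
            return col.type.python_type is str
        except NotImplementedError:  # some column types do not define python_type
            return False

    text_cols = [c for c in model.__table__.columns if is_text(c)]
    condition = or_(*[col.ilike(f"{term}%") for col in text_cols])
    return model.query.filter(condition).all()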

Django - calling attribute from queryset with a string

I'm trying to loop over different query sets while not repeating myself too much and have encountered a problem using the queryset class.
This is not necessarily completely a Django-problem.
What I'm trying to do is use my key list, which corresponds to a Django model's column names, to create a list of the data from those columns. What I want to do is something like this:
if needthisdata == 1:
    needdata = ['column1', 'column2', 'column3']
else:
    needdata = ['column1', 'column4', 'column7']

entry = djangomodel.get.all().filter(identifier='id')

dictitems = []
for n in range(0, len(needdata)):
    if n == 0:
        dictitems = [entry.needdata[n]]
    else:
        dictitems.append(entry.needdata[n])
Which of course doesn't work, since the queryset doesn't have a needdata attribute. Is there some way to access an attribute of a class with a string in this way?
A valid Django statement to obtain a single entry
First of all, there are some semantic problems here:
identifier should probably be id or pk;
you use .get.all() immediately on the model instead of first obtaining a manager (probably .objects); and
here you use .filter(..) on the queryset to filter on an identifier, but usually this should be a .get(..), since a filter can return zero, one, or more results in an iterable.
entry = djangomodel.objects.get(id=some_id)
So now we obtain a single entry, but that of course does not yet give us the columns.
If all elements are real Django columns
In case the columns are real Django fields (so no @propertys, etc.), then we can use values_list and call the list(..) constructor on it:
dictitems = list(djangomodel.objects.values_list(*needdata).get(id=some_id))
In case some elements are @propertys
In case not all those fields are real Django fields, then we can use attrgetter instead:
from operator import attrgetter
dictitems = list(attrgetter(*needdata)(djangomodel.objects.get(id=some_id)))

How to filter on calculated column of a query and meanwhile preserve mapped entities

I have a query which selects an entity A and some calculated fields
q = session.query(Recipe, func.avg(Recipe.somefield)).join(.....)
I then use what I select in a way which assumes I can access the result rows by the "Recipe" name:
for entry in q.all():
    recipe = entry.Recipe  # access the KeyedTuple by the Recipe attribute
    ...
Now I need to wrap my query in an additional select, say to filter by calculated field AVG:
q = q.subquery()
q = session.query(q).filter(q.c.avg_1 > 1)
And now I cannot access entry.Recipe anymore!
Is there a way to make SQLAlchemy adapt a query to an enclosing one, like aliased(adapt_on_names=True) or select_from_entity()?
I tried using those but was given an error
As Michael Bayer mentioned in a relevant Google Group thread, such adaptation is already done via the Query.from_self() method. My problem was that in this case I didn't know how to refer to the column I want to filter on.
This is due to the fact that it is calculated, i.e. there is no table column to refer to!
I could resort to a literal (.filter('avg_1 > 10')), but I'd prefer to stay in the more ORM-like style.
So, this is what I came up with - an explicit column expression
row_number_column = func.row_number().over(
    partition_by=Recipe.id
).label('row_number')

query = query.add_column(row_number_column)

query = query.from_self().filter(row_number_column == 1)
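Applying the same pattern to the averaged column from the original query could look roughly like this (a sketch; the grouping and threshold are assumptions):
from sqlalchemy import func

avg_column = func.avg(Recipe.somefield).label('avg_somefield')

q = session.query(Recipe, avg_column).group_by(Recipe.id)

# from_self() wraps the query in a subquery and adapts both Recipe and the
# labeled column to it, so entry.Recipe still works after the extra filter.
q = q.from_self().filter(avg_column > 1)

for entry in q:
    recipe = entry.Recipe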

Django models - how to filter out duplicate values by PK after the fact?

I build a list of Django model objects by making several queries. Then I want to remove any duplicates (all of these objects are of the same type, with an auto-increment int PK), but I can't use set() because they aren't hashable.
Is there a quick and easy way to do this? I'm considering using a dict instead of a list with the id as the key.
In general it's better to combine all your queries into a single query if possible, i.e.
q = Model.objects.filter(Q(field1=f1)|Q(field2=f2))
instead of
q1 = Models.object.filter(field1=f1)
q2 = Models.object.filter(field2=f2)
If the first query returns duplicated models, then use distinct():
q = Model.objects.filter(Q(field1=f1)|Q(field2=f2)).distinct()
If your query really is impossible to execute with a single command, then you'll have to resort to using a dict or other technique recommended in the other answers. It might be helpful if you posted the exact query on SO and we could see if it would be possible to combine into a single query. In my experience, most queries can be done with a single queryset.
Is there a quick and easy way to do this? I'm considering using a dict instead of a list with the id as the key.
That's exactly what I would do if you were locked into your current structure of making several queries. Then a simple dictionary.values() call will give your list back.
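A hedged sketch of that approach (qs1 and qs2 stand in for whatever querysets you already have):
combined = list(qs1) + list(qs2)
unique = {obj.pk: obj for obj in combined}  # one entry per primary key
deduped = list(unique.values())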
If you have a little more flexibility, why not use Q objects? Instead of actually making the queries, store each query in a Q object and use a bitwise or ("|") to execute a single query. This will achieve your goal and save database hits.
Django Q objects
You can use a set if you add the __hash__ function to your model definition so that it returns the id (assuming this doesn't interfere with other hash behaviour you may have in your app):
class MyModel(models.Model):
    def __hash__(self):
        return self.pk
If the order doesn't matter, use a dict.
Remove "duplicates" depends on how you define "duplicated".
If you want EVERY column (except the PK) to match, that's a pain in the neck -- it's a lot of comparing.
If, on the other hand, you have some "natural key" column (or short set of columns), then you can easily query and remove these.
master = MyModel.objects.get(id=theMasterKey)
dups = MyModel.objects.filter(fld1=master.fld1, fld2=master.fld2).exclude(pk=master.pk)
dups.delete()  # keep the master row, delete the rest
If you can identify some shorter set of key fields for duplicate identification, this works pretty well.
Edit
If the model objects haven't been saved to the database yet, you can make a dictionary on a tuple of these keys.
unique = {}
...
key = (anObject.fld1, anObject.fld2)
if key not in unique:
    unique[key] = anObject
I use this one:
dict(zip(map(lambda x: x.pk, items), items)).values()
