How to implement a specific SQL statement as a SQLAlchemy ORM query - python

I was following a tutorial to make my first Flask API (https://medium.com/#dushan14/create-a-web-application-with-python-flask-postgresql-and-deploy-on-heroku-243d548335cc) I did it but now I want to do queries more custom with SQLAlchemy and PostgreSQL. My question is how I could do something like this:
query = text("""SELECT enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
FROM "Ventas"."Enc_Ventas" AS enc
INNER JOIN "General"."Persona" AS persona ON enc."PersonaId" = persona."PersonaId"
INNER JOIN "Ventas"."Metodo_Entrega" AS metodo ON enc."MetodoEntregaId" = metodo."MetodoEntregaId"
INNER JOIN "General"."Metodo_Pago" AS metodo_e ON enc."MetodoPagoId" = metodo_e."MetodoPagoId"
INNER JOIN "General"."Estatus" AS estado ON enc. """)
but with SQLAlchemy in order to use the models that I created previously. Thanks in advance for any answer!!
Edit:
The columns that I wish to see at the final result are: enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
I really wish I could share more info but sadly I can't at the moment.

Doing this from the ORM layer, you would reference model names (I match the names of your query above, but I'm sure some of the model/table names are off - now adjusted slightly).
Now revised to include the specific columns you only want to see (note that I ignore your SQL aliases, ORM layer handles the actual query construction):
selection = session.query(Enc_Ventas, Persona.Persona_Nombre, Persona.Persona_Apellido, Metodo_Entrega.MetEnt_Nombre, Metodo_Pago.MetPag_Descripcion).\
join(Persona, Enc_Ventas.PersonaId == Persona.PersonaId).
join(Metodo_Entrega, Enc_Ventas.MetodoEntregaId == Metodo_Entrega.MetodoEntregaId).\
join(Metodo_Pago, Enc_Ventas.MetodoPagoId == Metodo_Pago.MetodoPagoId).\
join(Estatus).all()
Referencing the selection collection would be by iteration through the rows of tuples. A more robust and stable solution would be to transform each output row into a dict.
Otherwise, by including whole models, the collection of rows returned can be individually accessed by referencing as dot notation the model names in the query().
If you need further access to the columns in the related tables, use the ORM technique of .options(joinedload(myTable)), which in a single database query will bring in those additional columns, using the relationship name, also as dot notation.
You also need to define sqlalchemy relationships within your models for this to work, as well as defining the underlying SQL foreign keys.
Much more detail and/or a more specific question is needed to help further, imo.

Related

Find parent with certain combination of child rows - SQLite with Python

There are several parts to this question. I am working with sqlite3 in Python 2.7, but I am less concerned with the exact syntax, and more with the methods I need to use. I think the best way to ask this question is to describe my current database design, and what I am trying to accomplish. I am new to databases in general, so I apologize if I don't always use correct nomenclature.
I am modeling refrigeration systems (using Modelica--not really important to know), and I am using the database to manage input data, results data, and models used for that data.
My top parent table is Model, which contains the columns:
id, name, version, date_created
My child table under Model is called Design. It is used to create a unique id for each combination of design input parameters and the model used. the columns it contains are:
id, model_id, date_created
I then have two child tables under Design, one called Input, and the other called Result. We can just look at Input for now, since one example should be enough. The columns for input are:
id, value, design_id, parameter_id, component_id
parameter_id and component_id are foreign keys to their own tables.The Parameter table has the following columns:
id, name, units
Some example rows for Parameter under name are: length, width, speed, temperature, pressure (there are many dozens more). The Component table has the following columns:
id, name
Some example rows for Component under name are: compressor, heat_exchanger, valve.
Ultimately, in my program I want to search the database for a specific design. I want to be able to search a specific design to be able to grab specific results for that design, or to know whether or not a model simulation with that design has already been run previously, to avoid re-running the same data point.
I also want to be able to grab all the parameters for a given design, and insert it into a class I have created in Python, which is then used to provide inputs to my models. In case it helps for solving the problem, the classes I have created are based on the components. So, for example, I have a compressor class, with attributes like compressor.speed, compressor.stroke, compressor.piston_size. Each of these attributes should have their own row in the Parameter table.
So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values? Just as a side note, my friend helped me design this database. He knows databases, but not my application super well. It is possible that I designed it poorly for what I want to accomplish.
Here is a simple picture trying to map a certain combination of parameters with certain values to a design_id, where I have taken out component_id for simplicity:
Picture of simplified tables
Simply join the necessary tables. Your schema properly reflects normalization (separating tables into logical groupings) and can scale for one-to-many relationships. Specifically, to answer your question --So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values?-- consider below approaches:
Inner Join with Where Clause
For handful of parameters, use an inner join with a WHERE...IN() clause. Below returns design fields joined by input and parameters tables, filtered for specific parameter names where you can have Python pass as parameterized values even iteratively in a loop:
SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN parameters p ON p.id = i.parameter_id
WHERE p.name IN ('param1', 'param2', 'param3', 'param4', 'param5', ...)
Inner Join with Temp Table
Should values be over 100+ in a long list, consider a temp table that filters parameters table to specific parameter values:
# CREATE EMPTY TABLE (SAME STRUCTURE AS parameters)
sql = "CREATE TABLE tempparams AS SELECT id, name, units FROM parameters WHERE 0;"
cur.execute(sql)
db.commit()
# ITERATIVELY APPEND TO TEMP
for i in paramslist: # LIST OF 100+ ITEMS
sql = "INSERT INTO tempparams (id, name, units) \
SELECT p.id, p.name, p.units \
FROM parameters p \
WHERE p.name = ?;"
cur.execute(sql, i) # CURSOR OBJECT COMMAND PASSING PARAM
db.commit() # DB OBJECT COMMIT ACTION
Then, join main design and input tables with new temp table holding specific parameters:
SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN tempparams t ON t.id = i.parameter_id
Same process can work with components table as well.
*Moved picture to question section

How to filter on calculated column of a query and meanwhile preserve mapped entities

I have a query which selects an entity A and some calculated fields
q = session.query(Recipe,func.avg(Recipe.somefield).join(.....)
I then use what I select in a way which assumes I can subscript result with "Recipe" string:
for entry in q.all():
recipe=entry.Recipe # Access KeyedTuple by Recipe attribute
...
Now I need to wrap my query in an additional select, say to filter by calculated field AVG:
q=q.subquery();
q=session.query(q).filter(q.c.avg_1 > 1)
And now I cannot access entry.Recipe anymore!
Is there a way to make SQLAlchemy adapt a query to an enclosing one, like aliased(adapt_on_names=True) orselect_from_entity()`?
I tried using those but was given an error
As Michael Bayer mentioned in a relevant Google Group thread, such adaptation is already done via Query.from_self() method. My problem was that in this case I didn't know how to refer a column which I want to filter on
This is due to the fact, that it is calculated i.e. there is no table to refer to!
I might resort to using literals(.filter('avg_1>10')), but 'd prefer to stay in the more ORM-style
So, this is what I came up with - an explicit column expression
row_number_column = func.row_number().over(
partition_by=Recipe.id
).label('row_number')
query = query.add_column(
row_number_column
)
query = query.from_self().filter(row_number_column == 1)

Django model search concatenated string

I am trying to use a Django model to for a record but then return a concatenated field of two different tables joined by a foreign key.
I can do it in SQL like this:
SELECT
location.location_geoname_id as id,
CONCAT_WS(', ', location.location_name, region.region_name, country.country_name) AS 'text'
FROM
geonames_location as location
JOIN
geonames_region as region
ON
location.region_geoname_id = region.region_geoname_id
JOIN
geonames_country as country
ON
region.country_geoname_id = country.country_geoname_id
WHERE
location.location_name like 'location'
ORDER BY
location.location_name, region.region_name, country.country_name
LIMIT 10;
Is there a cleaner way to do this using Django models? Or do I need to just use SQL for this one?
Thank you
Do you really need the SQL to return the concatenated field? Why not query the models in the usual way (with select_related()) and then concatenate in Python? Or if you're worried about querying more columns than you need, use values_list:
locations = Location.objects.values_list(
'location_name', 'region__region_name', 'country__country_name')
location_texts = [','.join(l) for l in locations]
You can also write raw query for this in your code like that and later on you can concatenate.
Example:
org = Organization.objects.raw('SELECT organization_id, name FROM organization where is_active=1 ORDER BY name')
Keep one thing in a raw query you have to always fetch primary key of table, it's mandatory. Here organization_id is a primary key of contact_organization table.
And it's depend on you which one is useful and simple(raw query or model query).

Django: Selecting distinct values on maximum foreign key value

I have the following models which I'm testing with SQLite3 and MySQL:
# (various model fields extraneous to discussion removed...)
class Run(models.Model):
runNumber = models.IntegerField()
class Snapshot(models.Model):
t = models.DateTimeField()
class SnapshotRun(models.Model):
snapshot = models.ForeignKey(Snapshot)
run = models.ForeignKey(Run)
# other fields which make it possible to have multiple distinct Run objects per Snapshot
I want a query which will give me a set of runNumbers & snapshot IDs for which the Snapshot.id is below some specified value. Naively I would expect this to work:
print SnapshotRun.objects.filter(snapshot__id__lte=ss_id)\
.order_by("run__runNumber", "-snapshot__id")\
.distinct("run__runNumber", "snapshot__id")\
.values("run__runNumber", "snapshot__id")
But this blows up with
NotImplementedError: DISTINCT ON fields is not supported by this database backend
for both database backends. Postgres is unfortunately not an option.
Time to fall back to raw SQL?
Update:
Since Django's ORM won't help me out of this one (thanks #jknupp) I did manage to get the following raw SQL to work:
cursor.execute("""
SELECT r.runNumber, ssr1.snapshot_id
FROM livedata_run AS r
JOIN livedata_snapshotrun AS ssr1
ON ssr1.id =
(
SELECT id
FROM livedata_snapshotrun AS ssr2
WHERE ssr2.run_id = r.id
AND ssr2.snapshot_id <= %s
ORDER BY snapshot_id DESC
LIMIT 1
);
""", max_ss_id)
Here livedata is the Django app these tables live in.
The note in the Django documentation is pretty clear:
Note:
Any fields used in an order_by() call are included in the SQL SELECT columns. This can sometimes lead to unexpected results when used in conjunction with distinct(). If order by fields from a related model, those fields will be added to the selected columns and they may make otherwise duplicate rows appear to be distinct. Since the extra columns don’t appear in the returned results (they are only there to support ordering), it sometimes looks like non-distinct results are being returned.
Similarly, if you use a values() query to restrict the columns selected, the columns used in any order_by() (or default model ordering) will still be involved and may affect uniqueness of the results.
The moral here is that if you are using distinct() be careful about ordering by related models. Similarly, when using distinct() and values() together, be careful when ordering by fields not in the values() call.
Also, below that:
This ability to specify field names (with distinct) is only available in PostgreSQL.

How to filter records before executing annotation on those records in django models?

In documentation is example:
Book.objects.annotate(num_authors=Count('authors')).filter(num_authors__gt=1)
How can I filter authors, before executing annotation on authors?
For example I want Count only those authors that have name "John".
I don't believe you can make this selective count with the Django database-abstraction API without including some SQL. You make additions to a QuerySet's SQL using the extra method.
Assuming that the example in the documention is part an app called "inventory" and using syntax that works with postgresql (you didn't specify and it's what I'm more familiar with), the following should do what you're asking for:
Book.objects.extra(
select={"john_count":
"""SELECT COUNT(*) from "inventory_book_authors"
INNER JOIN "inventory_author" ON ("inventory_book_authors"."author_id"="inventory_author"."id")
WHERE "inventory_author"."name"=%s
AND "inventory_book_authors"."book_id"="inventory_book"."id"
"""},
select_params=('John',)
)

Categories