I have a function prepared in the database, which I want to call using Gino. Its return type is one of the tables created using declarative. What I try to do is:
select(MyModel).select_from(func.my_function)
The problem is that SQLAlchemy automatically detects the table in my select and implicitly adds it to select_from. The resulting SQL contains both my function and the table name in the FROM clause, so the result is a Cartesian product of the function result and the whole table (not what I want, really).
My question is: can I somehow specify that I want to select all the columns of a model without having the corresponding class in the FROM?
You have to specify the columns (as a list) if you don't want SA to automatically add MyModel to the FROM clause.
You have to either do this:
select([your_model_table.c.column1, your_model_table.c.column2]).select_from(func.my_function)
Or, if you want all columns:
select(your_model_table.columns).select_from(func.my_function)
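If your model is declarative, the underlying Core table is available as the model's __table__ attribute, so the same thing can be written against the model class. A minimal sketch (MyModel and my_function are the names from the question; note that func.my_function must be called to produce an actual function clause):

from sqlalchemy import func, select

# MyModel.__table__ is the Core Table behind the declarative model;
# selecting its columns keeps the table itself out of the FROM clause.
query = select(list(MyModel.__table__.columns)).select_from(func.my_function())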
I have a table called products
which has the following columns:
id, product_id, data, activity_id
What I am essentially trying to do is copy a bulk of existing products, update their activity_id, and create new entries in the products table.
Example:
I already have 70 existing entries in products with activity_id 2
Now I want to create another 70 entries with the same data, except for an updated activity_id.
I could have thousands of existing entries that I'd like to copy, updating the copied entries' activity_id to a new id.
products = self.session.query(model.Products).filter(filter1, filter2).all()
This returns all the existing products for a filter.
Then I iterate through products and simply clone the existing products, just updating the activity_id field.
for product in products:
    product.activity_id = new_id

self.uow.skus.bulk_save_objects(simulation_skus)
self.uow.flush()
self.uow.commit()
What is the best/fastest way to do these bulk entries so that it saves time? Performance is OK as of now, but is there a better solution?
You don't need to load these objects locally; all you really want to do is have the database create these rows.
You essentially want to run a query that creates the rows from the existing rows:
INSERT INTO product (product_id, data, activity_id)
SELECT product_id, data, 2 -- the new activity_id value
FROM product
WHERE activity_id = old_id
The above query would run entirely on the database server; this is far preferable to loading your query results into Python objects and then sending all that data back to the server to populate INSERT statements for each new row.
Queries like that are something you could do with SQLAlchemy core, the half of the API that deals with generating SQL statements. However, you can use a query built from a declarative ORM model as a starting point. You'd need to:

1. Access the Table instance for the model, as that lets you create an INSERT statement via the Table.insert() method. (You could also get the same object from the models.Product query; more on that later.)
2. Access the statement that would normally fetch the data for your Python instances for your filtered models.Product query; you can do so via the Query.statement property.
3. Update the statement to replace the included activity_id column with your new value, and remove the primary key (I'm assuming that you have an auto-incrementing primary key column).
4. Apply that updated statement to the Insert object for the table via Insert.from_select().
5. Execute the generated INSERT INTO ... FROM ... query.
Step 1 can be achieved by using the SQLAlchemy introspection API; the inspect() function, applied to a model class, gives you a Mapper instance, which in turn has a Mapper.local_table attribute.
Steps 2 and 3 require a little juggling with the Select.with_only_columns() method to produce a new SELECT statement with the column swapped out. You can't easily remove a column from a select statement, but you can loop over the existing columns in the query to 'copy' them across to the new SELECT, making the replacement along the way.
Step 4 is then straightforward: Insert.from_select() needs the columns that are inserted and the SELECT query. We have both, since the SELECT object we built gives us its columns too.
Here is the code for generating your INSERT; the **replace keyword arguments are the columns you want to replace when inserting:
from sqlalchemy import inspect, literal
from sqlalchemy.sql import ClauseElement

def insert_from_query(model, query, **replace):
    # The SQLAlchemy core definition of the table
    table = inspect(model).local_table
    # and the underlying core select statement to source new rows from
    select = query.statement

    # validate assumptions: make sure the query produces rows from the above table
    assert table in select.froms, f"{query!r} must produce rows from {model!r}"
    assert all(c.name in select.columns for c in table.columns), f"{query!r} must include all {model!r} columns"

    # updated select, replacing the indicated columns
    as_clause = lambda v: literal(v) if not isinstance(v, ClauseElement) else v
    replacements = {name: as_clause(value).label(name) for name, value in replace.items()}
    from_select = select.with_only_columns([
        replacements.get(c.name, c)
        for c in table.columns
        if not c.primary_key
    ])

    return table.insert().from_select(from_select.columns, from_select)
I included a few assertions about the model and query relationship, and the code accepts arbitrary column clauses as replacements, not just literal values. You could use func.max(models.Product.activity_id) + 1 as a replacement value (wrapped as a subselect), for example.
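For instance, a sketch of such a clause replacement wrapped as a scalar subselect (hedged: this uses the names from above and SQLAlchemy 1.x's as_scalar()):

from sqlalchemy import func, select as sa_select

# take the current maximum activity_id plus one as the inserted value
next_id = sa_select([func.max(models.Product.activity_id) + 1]).as_scalar()
insert_stmt = insert_from_query(models.Product, products, activity_id=next_id)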
The above function executes steps 1-4, producing the desired INSERT SQL statement when printed (I created a products model and query that I thought might be representative):
>>> print(insert_from_query(models.Product, products, activity_id=2))
INSERT INTO products (product_id, data, activity_id) SELECT products.product_id, products.data, :param_1 AS activity_id
FROM products
WHERE products.activity_id != :activity_id_1
All you have to do is execute it:
insert_stmt = insert_from_query(models.Product, products, activity_id=2)
self.session.execute(insert_stmt)
I was following a tutorial to build my first Flask API (https://medium.com/@dushan14/create-a-web-application-with-python-flask-postgresql-and-deploy-on-heroku-243d548335cc). I finished it, but now I want to write more customized queries with SQLAlchemy and PostgreSQL. My question is how I could do something like this:
query = text("""SELECT enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
FROM "Ventas"."Enc_Ventas" AS enc
INNER JOIN "General"."Persona" AS persona ON enc."PersonaId" = persona."PersonaId"
INNER JOIN "Ventas"."Metodo_Entrega" AS metodo ON enc."MetodoEntregaId" = metodo."MetodoEntregaId"
INNER JOIN "General"."Metodo_Pago" AS metodo_e ON enc."MetodoPagoId" = metodo_e."MetodoPagoId"
INNER JOIN "General"."Estatus" AS estado ON enc. """)
but with SQLAlchemy, in order to use the models that I created previously. Thanks in advance for any answer!
Edit:
The columns that I wish to see at the final result are: enc.*, persona."Persona_Nombre", persona."Persona_Apellido", metodo."MetEnt_Nombre", metodo_e."MetPag_Descripcion"
I really wish I could share more info but sadly I can't at the moment.
Doing this from the ORM layer, you would reference the model names (I match the names in your query above, but I'm sure some of the model/table names are off; now adjusted slightly).
Now revised to include only the specific columns you want to see (note that I ignore your SQL aliases; the ORM layer handles the actual query construction):
selection = session.query(Enc_Ventas, Persona.Persona_Nombre, Persona.Persona_Apellido, Metodo_Entrega.MetEnt_Nombre, Metodo_Pago.MetPag_Descripcion).\
    join(Persona, Enc_Ventas.PersonaId == Persona.PersonaId).\
    join(Metodo_Entrega, Enc_Ventas.MetodoEntregaId == Metodo_Entrega.MetodoEntregaId).\
    join(Metodo_Pago, Enc_Ventas.MetodoPagoId == Metodo_Pago.MetodoPagoId).\
    join(Estatus).all()
You access the selection collection by iterating through its rows of tuples. A more robust and stable solution would be to transform each output row into a dict.
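A minimal sketch of that transformation (assuming the selection query above; in SQLAlchemy 1.x each row of a multi-entity query is a named tuple exposing _asdict()):

# turn each row tuple into a plain dict keyed by entity/column name
results = [row._asdict() for row in selection]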
Otherwise, by including whole models, the rows returned can be accessed individually by referencing the model names used in query() with dot notation.
If you need further access to the columns in the related tables, use the ORM technique of .options(joinedload(myTable)), which brings in those additional columns in a single database query; you then reach them through the relationship name, also with dot notation.
You also need to define SQLAlchemy relationships within your models for this to work, as well as the underlying SQL foreign keys.
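A minimal sketch of what one such relationship could look like (hedged: Base is your declarative base, and the column names and types here are assumptions based on your query, not your actual schema):

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import relationship

class Enc_Ventas(Base):
    __tablename__ = "Enc_Ventas"
    EncVentasId = Column(Integer, primary_key=True)  # hypothetical primary key
    PersonaId = Column(Integer, ForeignKey("Persona.PersonaId"))

    # enables joinedload() and dot-notation access such as enc.persona
    persona = relationship("Persona")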
Much more detail and/or a more specific question is needed to help further, imo.
I'm struggling to find a clean way (without raw SQL) to set the column order in alembic. For example, I would like to put a new column called 'name' after the 'id' column, something like this:
from alembic import op
import sqlalchemy as sa
...
op.add_column(
    'people',
    sa.Column(
        'name',
        sa.String(),
        nullable=False
    ),
    after='id'
)
But of course, alembic does not have an 'after' parameter, so this code fails, and I have not found an equivalent in the docs. I'm only able to append the column at the end of the table.
Can anybody suggest how to achieve in alembic/sqlalchemy what I want? Is it possible without raw SQL?
Alembic version 1.4.0 or higher supports this feature; use the batch operation to achieve it:
from alembic import op
import sqlalchemy as sa

with op.batch_alter_table("people") as batch_op:
    batch_op.add_column(
        sa.Column("name", sa.String(50)),
        insert_after="id"
    )
Reference: https://alembic.sqlalchemy.org/en/latest/ops.html?highlight=insert_after#alembic.operations.BatchOperations.add_column
Slice them into separate nested lists, ending at the column before the inserted column (so in this case, create one nested list that ends at the "id" column).
Let's call that variable id_col.
Now, you can use either the "extend" built-in or "append" to add the "name" column right after the id column.
You can also add the nested lists together to end up with a final single data structure that has what you're looking for (see the sketch below).
The emphasis here is that you do not have to be constrained by exclusively using your SQL table format. You can convert it into a nested list and then, once combined using list methods, convert it back into your desired SQL table.
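A minimal sketch of that list manipulation (purely illustrative; the header row and values are made up):

header = ["id", "email"]          # pretend this is the table's column row

cut = header.index("id") + 1      # slice up to and including the "id" column
id_col = header[:cut]             # ['id']
rest = header[cut:]               # ['email']

id_col.append("name")             # add the new column right after "id"
id_col.extend(rest)               # ['id', 'name', 'email']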
There are several parts to this question. I am working with sqlite3 in Python 2.7, but I am less concerned with the exact syntax, and more with the methods I need to use. I think the best way to ask this question is to describe my current database design, and what I am trying to accomplish. I am new to databases in general, so I apologize if I don't always use correct nomenclature.
I am modeling refrigeration systems (using Modelica--not really important to know), and I am using the database to manage input data, results data, and models used for that data.
My top parent table is Model, which contains the columns:
id, name, version, date_created
My child table under Model is called Design. It is used to create a unique id for each combination of design input parameters and the model used. The columns it contains are:
id, model_id, date_created
I then have two child tables under Design, one called Input, and the other called Result. We can just look at Input for now, since one example should be enough. The columns for input are:
id, value, design_id, parameter_id, component_id
parameter_id and component_id are foreign keys to their own tables. The Parameter table has the following columns:
id, name, units
Some example rows for Parameter under name are: length, width, speed, temperature, pressure (there are many dozens more). The Component table has the following columns:
id, name
Some example rows for Component under name are: compressor, heat_exchanger, valve.
Ultimately, in my program I want to search the database for a specific design, either to grab specific results for that design or to know whether a model simulation with that design has already been run previously, to avoid re-running the same data point.
I also want to be able to grab all the parameters for a given design, and insert it into a class I have created in Python, which is then used to provide inputs to my models. In case it helps for solving the problem, the classes I have created are based on the components. So, for example, I have a compressor class, with attributes like compressor.speed, compressor.stroke, compressor.piston_size. Each of these attributes should have their own row in the Parameter table.
So, how would I query this database efficiently to find if there is a design that matches a long list (let's assume 100+) of parameters with specific values? Just as a side note, my friend helped me design this database. He knows databases, but not my application super well. It is possible that I designed it poorly for what I want to accomplish.
Here is a simple picture trying to map a certain combination of parameters with certain values to a design_id, where I have taken out component_id for simplicity:
[Picture of simplified tables]
Simply join the necessary tables. Your schema properly reflects normalization (separating tables into logical groupings) and can scale for one-to-many relationships. Specifically, to answer your question ("how would I query this database efficiently to find if there is a design that matches a long list, let's assume 100+, of parameters with specific values?"), consider the approaches below:
Inner Join with Where Clause
For a handful of parameters, use an inner join with a WHERE...IN() clause. The query below returns design fields joined to the input and parameters tables, filtered on specific parameter names, which you can have Python pass as parameterized values, even iteratively in a loop (a sketch of that follows the query):
SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN parameters p ON p.id = i.parameter_id
WHERE p.name IN ('param1', 'param2', 'param3', 'param4', 'param5', ...)
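A minimal sketch of the parameterized call from Python (assuming the sqlite3 cursor cur from your setup; the parameter names are illustrative):

# build one ? placeholder per parameter name
params = ['length', 'width', 'speed']
placeholders = ', '.join('?' for _ in params)

sql = ("SELECT d.id, d.model_id, d.date_created "
       "FROM design d "
       "INNER JOIN input i ON d.id = i.design_id "
       "INNER JOIN parameters p ON p.id = i.parameter_id "
       "WHERE p.name IN ({})").format(placeholders)

cur.execute(sql, params)
rows = cur.fetchall()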
Inner Join with Temp Table
Should the list run long, to 100+ values, consider a temp table that filters the parameters table down to the specific parameter values:
# CREATE EMPTY TABLE (SAME STRUCTURE AS parameters)
sql = "CREATE TABLE tempparams AS SELECT id, name, units FROM parameters WHERE 0;"
cur.execute(sql)
db.commit()

# ITERATIVELY APPEND TO TEMP
for i in paramslist:  # LIST OF 100+ ITEMS
    sql = "INSERT INTO tempparams (id, name, units) \
           SELECT p.id, p.name, p.units \
           FROM parameters p \
           WHERE p.name = ?;"
    cur.execute(sql, (i,))  # CURSOR COMMAND, PASSING PARAM AS A ONE-ITEM TUPLE
    db.commit()  # DB OBJECT COMMIT ACTION
Then, join main design and input tables with new temp table holding specific parameters:
SELECT d.id, d.model_id, d.date_created
FROM design d
INNER JOIN input i ON d.id = i.design_id
INNER JOIN tempparams t ON t.id = i.parameter_id
The same process can work with the components table as well.
I am building a sqlite browser in Python/sqlalchemy.
Here is my requirement.
I want to do insert operation on the table.
I need to pass a table name to a function, and it should return all the columns along with their respective types.
Can anyone tell me how to do this in SQLAlchemy?
You can access all columns of a Table like this:
my_table.c
This returns a type that behaves similarly to a dictionary, i.e. it has a values() method and so on:
columns = [(item.name, item.type) for item in my_table.c.values()]
You can play around with that to see what you can get from it. Using the declarative extension, you can access the table through the class's __table__ attribute. Furthermore, you might find the Runtime Inspection API helpful.
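Since you only have a table name at hand, one way to get from the name to a Table object is reflection. A minimal sketch (the database URL and table name are placeholders):

from sqlalchemy import MetaData, create_engine

def get_columns(engine, table_name):
    # reflect just the named table from the live database
    metadata = MetaData()
    metadata.reflect(bind=engine, only=[table_name])
    table = metadata.tables[table_name]
    return [(column.name, column.type) for column in table.columns]

engine = create_engine("sqlite:///example.db")
print(get_columns(engine, "my_table"))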