I connected a PostgreSQL database to Apache Superset and am playing around with its SQL editor. I'm running into a problem where I cannot do a left join between two tables on a shared id.
SELECT id, profile_name FROM "ProductionRun"
LEFT JOIN "StatsAssociation" ON "ProductionRun".id = "StatsAssociation".production_run_id;
Is my syntax above correct? The tables must be referenced with double quotes because they were created with case-sensitive names. The query returns only the id and profile_name columns of the ProductionRun table, without any columns from the StatsAssociation table.
I created the tables using SQLAlchemy, and here are the table schemas:
ProductionRun
class ProductionRun(Base):
    __tablename__ = 'ProductionRun'
    id = Column(Integer, primary_key=True, autoincrement=True)
    profile_name = Column(String, nullable=False)
StatsAssociation
class StatsAssociation(Base):
    __tablename__ = 'StatsAssociation'
    production_run_id = Column(Integer, ForeignKey('ProductionRun.id'), primary_key=True)
    stats_package_id = Column(Integer, ForeignKey('StatsPackage.id'), unique=True, nullable=False)
    stats_package = relationship('StatsPackage', back_populates='stats_association', cascade='all,delete')
    production_run = relationship('ProductionRun', back_populates='stats_association')
When I view the tables, they both exist and StatsAssociation has production_run_id column which shares the same ids as ProductionRun.
This was originally posted as a comment.
You're not selecting any column from the "StatsAssociation" table, so it is expected that none of its columns show up. To get columns in the output of a SELECT query, you need to list them; the only exception I can currently think of is using "TableName".* or * in the SELECT.
For example, and just to start you off:
SELECT id, profile_name, production_run_id
FROM ...
where ... is the rest of your query.
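To make the point above concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained), and the sample rows are invented. It shows that a column from "StatsAssociation" appears once it is listed in the SELECT, and that a run with no association still appears, with NULL (None in Python) in the joined column.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "ProductionRun" (
        id INTEGER PRIMARY KEY,
        profile_name TEXT NOT NULL
    );
    CREATE TABLE "StatsAssociation" (
        production_run_id INTEGER PRIMARY KEY REFERENCES "ProductionRun"(id),
        stats_package_id INTEGER NOT NULL UNIQUE
    );
    -- Invented sample data: run 2 has no association row.
    INSERT INTO "ProductionRun" VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO "StatsAssociation" VALUES (1, 10);
""")

# Columns from the joined table are returned only because they are listed.
rows = conn.execute("""
    SELECT "ProductionRun".id,
           "ProductionRun".profile_name,
           "StatsAssociation".stats_package_id
    FROM "ProductionRun"
    LEFT JOIN "StatsAssociation"
           ON "ProductionRun".id = "StatsAssociation".production_run_id
    ORDER BY "ProductionRun".id
""").fetchall()

for row in rows:
    print(row)
```

Qualifying each column with its table name is optional here, but it avoids ambiguity if both tables ever share a column name.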
Related
I have the following tables defined (very simplified version):
class Orders(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    order_id = db.Column(db.Integer, nullable=False)
    date_created = db.Column(db.DateTime, nullable=False)

class ProductOrders(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    order_id = db.Column(db.Integer, nullable=False)
    product_id = db.Column(db.Integer, nullable=False)
    base_price = db.Column(db.Float, nullable=False)
I am using BigCommerce API and have multiple order_ids in both tables. The order_id is not unique globally but is unique per store. I am trying to work out how to link the two tables. I do have a Store table (shown below) that holds the store.id for each store, but I just cannot work out how to join the Orders and ProductOrders tables together so I can access both tables data where the store.id is the same. I just want to query, for example, a set of Orders.order_id or Orders.date_created and get ProductOrders.base_price as well.
class Store(db.Model):
    id = db.Column(db.Integer, primary_key=True)
Any ideas?
Assuming id in both tables is the store_id and order_id is unique per store, you will have to join on both columns, combining the conditions with AND.
For example: (in SQL)
SELECT * FROM Orders JOIN ProductOrders ON Orders.id = ProductOrders.id AND Orders.order_id = ProductOrders.order_id;
Answer is based on what I have understood from your question, sorry if that's not your required answer.
Edit:
In sqlalchemy it would be something like below:
from sqlalchemy import and_
session.query(Orders, ProductOrders).filter(and_(Orders.id == ProductOrders.id, Orders.order_id == ProductOrders.order_id)).all()
References:
https://www.tutorialspoint.com/sqlalchemy/sqlalchemy_orm_working_with_joins.htm
Using OR in SQLAlchemy
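The models in the question have no store column, so the sketch below adds a hypothetical store_id to both tables (an assumption, not part of the original schema) and joins on the pair (store_id, order_id). It uses the standard-library sqlite3 module with invented rows and snake_case table names chosen for the illustration. It also shows what goes wrong when joining on order_id alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- store_id is a hypothetical column added for this illustration.
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        store_id INTEGER NOT NULL,
        order_id INTEGER NOT NULL,
        date_created TEXT NOT NULL
    );
    CREATE TABLE product_orders (
        id INTEGER PRIMARY KEY,
        store_id INTEGER NOT NULL,
        order_id INTEGER NOT NULL,
        product_id INTEGER NOT NULL,
        base_price REAL NOT NULL
    );
    -- order_id 100 exists in two different stores.
    INSERT INTO orders VALUES (1, 1, 100, '2020-01-01'), (2, 2, 100, '2020-02-01');
    INSERT INTO product_orders VALUES (1, 1, 100, 7, 9.5), (2, 2, 100, 8, 20.0);
""")

# Join on both columns so each order only matches products from its own store.
rows = conn.execute("""
    SELECT o.store_id, o.order_id, o.date_created, p.base_price
    FROM orders AS o
    JOIN product_orders AS p
      ON o.store_id = p.store_id AND o.order_id = p.order_id
    ORDER BY o.store_id
""").fetchall()

# Joining on order_id alone pairs rows across stores as well.
naive_count = conn.execute("""
    SELECT COUNT(*)
    FROM orders AS o
    JOIN product_orders AS p ON o.order_id = p.order_id
""").fetchone()[0]

print(rows)
print(naive_count)
```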
I have used the following documentation as a guide and tried to implement an upsert mechanism for my Games table. I want to be able to dynamically update all columns of the selected table at once (without having to specify each column individually). I have tried different approaches, but none has produced a proper SQL query that can be executed. What did I misunderstand, and what are the errors in the code?
https://docs.sqlalchemy.org/en/12/dialects/mysql.html?highlight=on_duplicate_key_update#insert-on-duplicate-key-update-upsert
https://github.com/sqlalchemy/sqlalchemy/issues/4483
class Game(CustomBase, Base):
    __tablename__ = 'games'
    game_id = Column('id', Integer, primary_key=True)
    date_time = Column(DateTime, nullable=True)
    hall_id = Column(Integer, ForeignKey(SportPlace.id), nullable=False)
    team_id_home = Column(Integer, ForeignKey(Team.team_id))
    team_id_away = Column(Integer, ForeignKey(Team.team_id))
    score_home = Column(Integer, nullable=True)
    score_away = Column(Integer, nullable=True)
    ...

def put_games(games):  # games is a/must be a list of type Game
    insert_stmt = insert(Game).values(games)
    #insert_stmt = insert(Game).values(id=Game.game_id, data=games)
    on_upset_stmt = insert_stmt.on_duplicate_key_update(**games)
    print(on_upset_stmt)
    ...
I regularly load original data from an external API (incl. ID) and want to update my database with it, i.e. update the existing entries (with the same ID) with the new data and add missing ones without completely reloading the database.
Updates
The actual code results in
TypeError: on_duplicate_key_update() argument after ** must be a mapping, not list
With the commented-out line (#insert_stmt = ...) used in place of the first insert_stmt, the error message is
sqlalchemy.exc.CompileError: Unconsumed column names: data
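The TypeError arises because ** unpacking needs a mapping of column name to value, while games is a list. The underlying idea, building the update clause from a list of column names instead of spelling each column out, can be sketched independently of SQLAlchemy. The example below is only an analogue, not the MySQL dialect API. It uses the standard-library sqlite3 module, whose ON CONFLICT ... DO UPDATE clause (SQLite 3.24 or later) plays the role of MySQL's ON DUPLICATE KEY UPDATE, and a cut-down games table with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (id INTEGER PRIMARY KEY, score_home INTEGER, score_away INTEGER)"
)
conn.execute("INSERT INTO games VALUES (1, 0, 0)")

# Cut-down column list; the real table has more columns.
columns = ["id", "score_home", "score_away"]

# Build "col = excluded.col" for every non-key column, so no column
# has to be written out by hand. "excluded" refers to the incoming row.
assignments = ", ".join(f"{c} = excluded.{c}" for c in columns if c != "id")
upsert_sql = (
    f"INSERT INTO games ({', '.join(columns)}) "
    f"VALUES ({', '.join('?' for _ in columns)}) "
    f"ON CONFLICT(id) DO UPDATE SET {assignments}"
)

incoming = [(1, 3, 2), (2, 1, 1)]  # id 1 already exists, id 2 is new
conn.executemany(upsert_sql, incoming)

result = conn.execute(
    "SELECT id, score_home, score_away FROM games ORDER BY id"
).fetchall()
print(result)
```

The existing row with id 1 is updated in place and the row with id 2 is inserted, without reloading the table.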
I have defined my models as:
class Row(Base):
    __tablename__ = "row"
    id = Column(Integer, primary_key=True)
    key = Column(String(32))
    value = Column(String(32))
    status = Column(Boolean, default=True)
    parent_id = Column(Integer, ForeignKey("table.id"))

class Table(Base):
    __tablename__ = "table"
    id = Column(Integer, primary_key=True)
    name = Column(String(32), nullable=False, unique=True)
    rows = relationship("Row", cascade="all, delete-orphan")
To read a table from the db I can simply query Table, and it loads all the rows owned by the table. But filtering the rows by status == True does not work. I know this is not a valid query, but I want to do something like:
session.query(Table).filter(Table.name == name, Table.row.status == True).one()
As I was not able to make the above query work, I came up with a new solution to query table first without loading any rows, then use the Id to query Rows with filters and then assign the results to the Table object:
table_res = session.query(Table).options(noload(Table.rows)).filter(Table.name == 'test').one()
rows_res = session.query(Row).filter(Row.parent_id == table_res.id, Row.status == True).all()
table_res.rows = rows_res
But I believe there has to be a better way to do this in one shot. Suggestions?
You could try this SQLAlchemy query:
from sqlalchemy.orm import contains_eager
result = session.query(Table)\
    .options(contains_eager(Table.rows))\
    .join(Row)\
    .filter(Table.name == 'abc', Row.status == True).one()
print(result)
print(result.rows)
Which leads to this SQL:
SELECT "row".id AS row_id,
"row"."key" AS row_key,
"row".value AS row_value,
"row".status AS row_status,
"row".parent_id AS row_parent_id,
"table".id AS table_id,
"table".name AS table_name
FROM "table" JOIN "row" ON "table".id = "row".parent_id
WHERE "table".name = ?
AND "row".status = 1
The query joins Row and uses the contains_eager option so that Table.rows is populated from the joined, filtered rows in the same query. Otherwise the rows would be fetched on demand in a second query, which would not apply the status filter (you could specify eager loading in the relationship as well, but this is one method of solving it).
I used SQLAlchemy to create a SQLite database which stores bibliographic data of some documents, and I want to query the number of authors of each document.
I know how to do this in raw SQL, but how can I achieve the same result using SQLAlchemy? Is it possible without using a join?
Here are the classes that I have defined:
class WosDocument(Base):
    __tablename__ = 'wos_document'
    document_id = Column(Integer, primary_key=True)
    unique_id = Column(String, unique=True)
    ......
    authors = relationship('WosAuthor', back_populates='document')

class WosAuthor(Base):
    __tablename__ = 'wos_author'
    author_id = Column(Integer, primary_key=True, autoincrement=True)
    document_unique_id = Column(String, ForeignKey('wos_document.unique_id'))
    document = relationship('WosDocument', back_populates='authors')
    last_name = Column(String)
    first_name = Column(String)
And my goal is to get the same result as this SQL query does:
SELECT a.unique_id, COUNT(*)
FROM wos_document AS a
LEFT JOIN wos_author AS b
ON a.unique_id = b.document_unique_id
GROUP BY a.unique_id
I tried the code below:
session.query(WosDocument.unique_id, len(WosDocument.authors)).all()
session.query(WosDocument.unique_id, func.count(WosDocument.authors)).all()
The first line raised an error. The second line doesn't give me the desired result: it returns only one row, and I don't recognize what it is:
[('000275510800023', 40685268)]
Since WosDocument Object has a one-to-many relationship authors, I supposed that I can query the author number of each document without using join explicitly, but I can't find out how to do this with SQLAlchemy.
Can you help me? Thanks!
If the relationship is defined correctly in your model, the query would look like this (ParentTable and ChildTable are placeholders for your own models):
db.session.query(ParentTable.pk, func.count('*').label("count")).join(ChildTable).group_by(ParentTable.pk).all()
The documentation for join() is at
https://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.join
If you don't call join() explicitly, you would need to deal with something like parent.relations as a field.
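One detail is worth checking in the target SQL from the question. With a LEFT JOIN, COUNT(*) counts the NULL-extended row of a document that has no authors as 1, whereas counting a column of the joined table gives 0 for it. The sketch below demonstrates the difference at the SQL level with the standard-library sqlite3 module. The sample rows are invented and the schema is reduced to the columns shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wos_document (
        document_id INTEGER PRIMARY KEY,
        unique_id TEXT UNIQUE
    );
    CREATE TABLE wos_author (
        author_id INTEGER PRIMARY KEY AUTOINCREMENT,
        document_unique_id TEXT REFERENCES wos_document(unique_id),
        last_name TEXT,
        first_name TEXT
    );
    -- Invented data: doc-a has two authors, doc-b has none.
    INSERT INTO wos_document (unique_id) VALUES ('doc-a'), ('doc-b');
    INSERT INTO wos_author (document_unique_id, last_name, first_name)
    VALUES ('doc-a', 'Smith', 'Ann'), ('doc-a', 'Jones', 'Bob');
""")

query = """
    SELECT a.unique_id, {count_expr}
    FROM wos_document AS a
    LEFT JOIN wos_author AS b ON a.unique_id = b.document_unique_id
    GROUP BY a.unique_id
    ORDER BY a.unique_id
"""

# COUNT(*) counts joined rows, including the NULL-extended one for doc-b.
count_star = conn.execute(query.format(count_expr="COUNT(*)")).fetchall()
# COUNT(b.author_id) counts only rows where an author actually matched.
count_col = conn.execute(query.format(count_expr="COUNT(b.author_id)")).fetchall()

print(count_star)
print(count_col)
```

An inner join, as produced by a plain join() call, would instead drop documents without authors from the result entirely.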
I want to delete some elements in tables that have a polymorphic relationship in sqlalchemy. Here's the model:
class Employee(Base):
    __tablename__ = 'employee'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    type = Column(String(50))
    __mapper_args__ = {
        'polymorphic_identity': 'employee',
        'polymorphic_on': type
    }

class Engineer(Employee):
    __tablename__ = 'engineer'
    id = Column(Integer, ForeignKey('employee.id'), primary_key=True)
    engineer_name = Column(String(30))
    __mapper_args__ = {
        'polymorphic_identity': 'engineer',
    }
And here's how I delete it:
e = Engineer();
e.name = "John";
e.engineer_name = "Doe";
DBSession.add(e);
q = session.query(Engineer).filter(Employee.name == "John")
q.delete(False)
I get the following error. Is that a bug, or am I doing it the wrong way?
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: employee.name [SQL: u'DELETE FROM engineer WHERE employee.name = ?'] [parameters: ('John',)]
I'm expecting SQLAlchemy to delete the entries in both the engineer and employee tables.
First you should define the on delete behaviour of this relationship:
id = Column(Integer, ForeignKey('employee.id', ondelete='CASCADE'), primary_key=True)
Then, using the ORM, you can delete all engineers with name "John" through a loop:
eng_list = session.query(Engineer).filter(Employee.name == "John").all()
for eng in eng_list:
    session.delete(eng)
session.commit()
This will delete the records from both the Employee and Engineer tables.
update: comment on error message:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such
column: employee.name [SQL: u'DELETE FROM engineer WHERE employee.name
= ?'] [parameters: ('John',)]
Your attempt tries to Delete from Engineer with a join to Employee (to access the field Employee.name). But this join is missing from the query sqlalchemy is emitting to the backend.
I don't think SQLite supports deleting with joins. Perhaps you can try to run session.query(Engineer).filter(Employee.name == "John").delete() against a different backend, and sqlalchemy may be able to emit the proper SQL statement. I haven't tried it though.
update 2: On backends that enforce foreign key constraints (and where the ondelete constraint has been set to cascade), it should be sufficient to delete the row in the parent table, and the linked rows in the child table will automatically be deleted.
I tried this example with both MySQL & Postgresql backends, and the following query deleted the row from both tables (employee & engineer):
session.query(Employee).filter(Employee.name=='John').delete()
For some reason, on Sqlite, this only deletes the record from employee.
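A likely explanation for the SQLite behaviour, offered as an assumption rather than a confirmed diagnosis of the setup above: SQLite enforces foreign keys, including ON DELETE CASCADE, only when PRAGMA foreign_keys is ON for the connection, and standard builds leave it off. The sketch below uses the standard-library sqlite3 module with an invented row. The delete_john helper is hypothetical and exists only for this illustration.

```python
import sqlite3

def delete_john(enforce_foreign_keys):
    """Delete employee 'John' and return how many engineer rows remain."""
    conn = sqlite3.connect(":memory:")
    # The pragma is per connection; set it explicitly either way so the
    # comparison does not depend on how SQLite was compiled.
    conn.execute(
        "PRAGMA foreign_keys = " + ("ON" if enforce_foreign_keys else "OFF")
    )
    conn.executescript("""
        CREATE TABLE employee (
            id INTEGER PRIMARY KEY,
            name VARCHAR(50),
            type VARCHAR(50)
        );
        CREATE TABLE engineer (
            id INTEGER PRIMARY KEY REFERENCES employee(id) ON DELETE CASCADE,
            engineer_name VARCHAR(30)
        );
        INSERT INTO employee VALUES (1, 'John', 'engineer');
        INSERT INTO engineer VALUES (1, 'Doe');
    """)
    # Delete only the parent row; the cascade should remove the child.
    conn.execute("DELETE FROM employee WHERE name = ?", ("John",))
    remaining = conn.execute("SELECT COUNT(*) FROM engineer").fetchone()[0]
    conn.close()
    return remaining

print(delete_john(enforce_foreign_keys=False))
print(delete_john(enforce_foreign_keys=True))
```

With SQLAlchemy, the pragma would have to be issued on every new connection, for example from a connection event listener.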
Because doing the joined DELETE is not supported directly, I found an easy workaround is to use your normal joined query to select the ids to delete, then pass those ids to a separate DELETE query.
One minor annoyance: the query returns the ids as a list of one-element tuples of integers, so you will likely run into an error, as I did, if you pass that result directly to the DELETE query. Extracting each id from its tuple and converting it to a string fixes that.
So all together:
ids_to_delete = session.query(Engineer.id). \
    filter(Employee.name == "John"). \
    all()
# Convert the resulting int tuples to simple strings:
id_strings = [str(id_[0]) for id_ in ids_to_delete]
session.query(Engineer). \
    filter(Engineer.id.in_(id_strings)). \
    delete(synchronize_session=False)
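The same two-step idea can be sketched at the SQL level with the standard-library sqlite3 module. This is an illustration under assumptions, with invented rows, not the ORM code above. It also differs from the snippet above in two ways: the ids are unpacked from their tuples into plain integers, which is enough at the DB-API level, and the parent employee rows are deleted too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name VARCHAR(50),
        type VARCHAR(50)
    );
    CREATE TABLE engineer (
        id INTEGER PRIMARY KEY REFERENCES employee(id),
        engineer_name VARCHAR(30)
    );
    INSERT INTO employee VALUES (1, 'John', 'engineer'), (2, 'Mary', 'engineer');
    INSERT INTO engineer VALUES (1, 'Doe'), (2, 'Roe');
""")

# Step 1: the joined SELECT returns one-element tuples such as (1,).
id_rows = conn.execute("""
    SELECT engineer.id
    FROM engineer
    JOIN employee ON employee.id = engineer.id
    WHERE employee.name = ?
""", ("John",)).fetchall()
ids = [row[0] for row in id_rows]  # unpack the tuples into plain ints

# Step 2: delete by id, child rows first and then the parent rows.
if ids:
    placeholders = ", ".join("?" for _ in ids)
    conn.execute(f"DELETE FROM engineer WHERE id IN ({placeholders})", ids)
    conn.execute(f"DELETE FROM employee WHERE id IN ({placeholders})", ids)

remaining_employees = conn.execute(
    "SELECT name FROM employee ORDER BY id"
).fetchall()
remaining_engineers = conn.execute(
    "SELECT id, engineer_name FROM engineer ORDER BY id"
).fetchall()
print(remaining_employees, remaining_engineers)
```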