Join 3 tables in postgresql on different objects - python

I have a dataframe with 3 tables; let's call them People, Event, Outcome. The setup for these tables would look like this:
Name has: ID, Name, Age; Outcome has: ID, EventID, EventTime and OutcomeID; Event has: EventID, EventState, EventDate, EventTemp.
I need to run a query that pulls in all the Events that "Sally" competed in and outputs the EventName, Event Month (extracted from the EventDate), EventTemp, and EventTime. The issue I'm running into is that I need to join Event and Outcome on the EventID and then People and Outcome on the ID.
Here is what I last tried (which isn't working):
SELECT eventname, eventstate, EXTRACT(MONTH FROM eventdate), eventtemp
FROM event E JOIN outcome O ON E.eventid = O.eventid
FROM name N JOIN outcome O ON N.id = O.id
WHERE name = "Sally";
This is not outputting anything because it throws an error. I am new to postgresql. Can someone help?

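There can only be one FROM clause, although it can contain multiple JOINs. String literals also need single quotes; PostgreSQL reads "Sally" in double quotes as an identifier (such as a column name). I'm assuming that the "name" field is inside the "name" table: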
SELECT E.eventname, E.eventstate, EXTRACT(MONTH FROM E.eventdate), E.eventtemp
FROM event E JOIN outcome O ON E.eventid = O.eventid
JOIN name N ON N.id = O.id
WHERE N.name = 'Sally';
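To see the single-FROM, chained-JOIN shape run end to end, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The sample rows are invented for illustration, and because SQLite has no EXTRACT, strftime('%m', ...) takes the place of EXTRACT(MONTH FROM ...). It also selects O.eventtime, which the question asked for.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE event (eventid INTEGER PRIMARY KEY, eventstate TEXT,
                        eventdate TEXT, eventtemp REAL);
    CREATE TABLE outcome (id INTEGER, eventid INTEGER, eventtime REAL,
                          outcomeid INTEGER);
    -- Hypothetical sample data.
    INSERT INTO name VALUES (1, 'Sally', 30), (2, 'Bob', 41);
    INSERT INTO event VALUES (10, 'TX', '2021-03-14', 71.5),
                             (11, 'CA', '2021-07-04', 88.0);
    INSERT INTO outcome VALUES (1, 10, 52.3, 100), (2, 11, 49.9, 101);
""")

# One FROM clause, two JOINs chained onto it.
query = """
    SELECT E.eventstate,
           CAST(strftime('%m', E.eventdate) AS INTEGER) AS event_month,
           E.eventtemp,
           O.eventtime
    FROM event E
    JOIN outcome O ON E.eventid = O.eventid
    JOIN name N ON N.id = O.id
    WHERE N.name = ?
"""
rows = conn.execute(query, ("Sally",)).fetchall()
print(rows)
```

Passing the name as a bound parameter rather than pasting it into the SQL string also sidesteps the quoting problem entirely.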

Related

Filter rows with remaining only the latest record in SQLAlchemy

I have been trying to write the SQLAlchemy code that should function as the following SQL query.
SELECT * FROM events AS ev
INNER JOIN event_types AS et1 on ev.event_type_id = et1.id
INNER JOIN (
SELECT event_type, MAX(created_at) AS LatestCreatedAt
FROM event_types et GROUP BY event_type
) AS et2
ON
et2.event_type = et1.event_type
AND
et2.LatestCreatedAt = et1.created_at
What I'm trying to do is to
Get all columns from the events table
Inner join the event_type table (et1) on the event table
Group by the event_type with only the rows that have the latest record (i.e. Filter out old event types by looking at created_at if duplicated)
Inner join the grouped event_type (et2) on the event_type table (et1)
What I wrote for the SQLAlchemy version of the above is
from sqlalchemy import func
subquery = session.query(EventTypeTable.event_type,
func.max(EventTypeTable.created_at).group_by(EventTypeTable.event_type)).all()
events = (session.query(EventTable)
.join(EventTypeTable)
.join(subquery)
.all())
However, I get the following error.
Neither 'max' object nor 'Comparator' object has an attribute 'group_by'
It seems to complain that I cannot use group_by with the max function. Is there any other way to get the query results while keeping only the latest record by the created_at column in the event_type table in SQLAlchemy?
Any help or comments are appreciated. Thank you!
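One possible approach, offered as a sketch rather than a verified answer: in the attempt above, .group_by() is chained onto func.max(...) instead of onto the query, and .all() executes the query and returns a list, which cannot be joined against. Building the grouped query and ending it with .subquery() gives something joinable. The model definitions, sample rows, and in-memory SQLite engine below are assumptions made so the example is self-contained; it also assumes SQLAlchemy 1.4 or later.

```python
from datetime import datetime

from sqlalchemy import (Column, DateTime, ForeignKey, Integer, String, and_,
                        create_engine, func)
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class EventTypeTable(Base):  # hypothetical model; columns inferred from the SQL
    __tablename__ = "event_types"
    id = Column(Integer, primary_key=True)
    event_type = Column(String)
    created_at = Column(DateTime)


class EventTable(Base):  # hypothetical model
    __tablename__ = "events"
    id = Column(Integer, primary_key=True)
    event_type_id = Column(Integer, ForeignKey("event_types.id"))


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        EventTypeTable(id=1, event_type="login", created_at=datetime(2021, 1, 1)),
        EventTypeTable(id=2, event_type="login", created_at=datetime(2021, 6, 1)),
        EventTable(id=1, event_type_id=1),  # points at the outdated type row
        EventTable(id=2, event_type_id=2),  # points at the latest type row
    ])
    session.commit()

    # et2 in the SQL: latest created_at per event_type, kept as a subquery.
    latest = (
        session.query(
            EventTypeTable.event_type,
            func.max(EventTypeTable.created_at).label("latest_created_at"),
        )
        .group_by(EventTypeTable.event_type)
        .subquery()
    )

    events = (
        session.query(EventTable)
        .join(EventTypeTable, EventTable.event_type_id == EventTypeTable.id)
        .join(latest, and_(
            latest.c.event_type == EventTypeTable.event_type,
            latest.c.latest_created_at == EventTypeTable.created_at,
        ))
        .all()
    )
    event_ids = [event.id for event in events]

print(event_ids)
```

Only the event that references the most recent "login" row survives the second join.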

How to know which records cause an issue when I run a SQL MERGE statement in Python

I was running the code below in Python. It merges from one table into another, but sometimes it gives me errors due to duplicates. How do I know which records have been merged and which have not, so that I can trace the records and fix them? Or at least, how can I make my code log a helpful error message so that I can trace it?
# Exact match client on NAME/DOB (not yet using name_dob_v)
sql = """
merge into nf.es es using (
select id, name_last, name_first, dob
from fd.emp
where name_last is not null and name_first is not null and dob is not null
) es6
on (upper(es.patient_last_name) = upper(es6.name_last) and upper(es.patient_first_name) = upper(es6.name_first)
and es.patient_dob = es6.dob)
when matched then update set
es.client_id = es6.id
, es.client_id_comment = '2 exact name/exact dob match'
where
es.client_id is null -- exclude those already matched
and es.patient_last_name is not null and es.patient_first_name is not null and es.patient_dob is not null
and es.is_lock = 'Locked' and es.is_active = 'Yes' and es.patient_last_name NOT IN ('DOE','UNKNOWN','DELETE', 'CANCEL','CANCELLED','CXL','REFUSED')
"""
log.info(sql)
curs.execute(sql)
msg = "nf.es rows updated with es6 client_id due to exact name/dob match: %d" % curs.rowcount
log.info(msg)
emailer.append(msg)
You can't know, merge won't tell you. You have to actually find them and take appropriate action.
Maybe it'll help if you select distinct values:
merge into nf.es es using (
select DISTINCT --> this
id, name_last, name_first, dob
from fd.emp
...
If it still doesn't work, join the table being merged to the one in the USING clause on the same columns the merge matches on, and see which rows are duplicated. Something like this:
SELECT *
FROM (SELECT d.id,
             d.name_last,
             d.name_first,
             d.dob
      FROM fd.emp d
      JOIN nf.es e
        ON UPPER (e.patient_last_name) = UPPER (d.name_last)
       AND UPPER (e.patient_first_name) = UPPER (d.name_first)
       AND e.patient_dob = d.dob
      WHERE d.name_last IS NOT NULL
        AND d.name_first IS NOT NULL
        AND d.dob IS NOT NULL)
GROUP BY id,
         name_last,
         name_first,
         dob
HAVING COUNT (*) > 1;
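A complementary check, sketched here under assumptions, is to group the source rows by the match key alone (without id) before running the merge and to log every key that appears more than once, since each such key offers the merge several candidate ids for one target row. The example uses Python's built-in sqlite3 with a made-up emp table in place of fd.emp; GROUP_CONCAT is SQLite's aggregate, and on Oracle LISTAGG would be the likely equivalent.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("merge_check")  # hypothetical logger name

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (id INTEGER, name_last TEXT, name_first TEXT, dob TEXT);
    INSERT INTO emp VALUES
        (1, 'Smith', 'Ann', '1980-01-01'),
        (2, 'SMITH', 'ann', '1980-01-01'),  -- same key as id 1 after UPPER()
        (3, 'Jones', 'Bob', '1975-05-05');
""")

# Match keys that occur on more than one source row.
dup_sql = """
    SELECT UPPER(name_last), UPPER(name_first), dob,
           COUNT(*), GROUP_CONCAT(id)
    FROM emp
    WHERE name_last IS NOT NULL AND name_first IS NOT NULL AND dob IS NOT NULL
    GROUP BY UPPER(name_last), UPPER(name_first), dob
    HAVING COUNT(*) > 1
"""
duplicates = conn.execute(dup_sql).fetchall()
for last, first, dob, count, ids in duplicates:
    log.warning("ambiguous source key %s/%s/%s: %d rows (ids %s)",
                last, first, dob, count, ids)
```

Running this before curs.execute(sql) puts the offending ids in the log, so the failing records can be traced even though the merge itself does not report them.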

How to have the possibility to call name of columns in db.session.query with 2 tables in Flask Python?

I am developing a web application with Flask, Python, SQLAlchemy, and Mysql.
I have 2 tables:
TaskUser:
- id
- id_task (foreign key of id column of table Task)
- message
Task
- id
- id_type_task
I need to extract all the tasksusers (from TaskUser) where the id_task is in a specific list of Task ids.
For example, all the taskusers where id_task is in (1,2,3,4,5)
Once I get the result, I do some stuff and use some conditions.
When I make this request :
all_tasksuser=TaskUser.query.filter(TaskUser.id_task==Task.id) \
    .filter(TaskUser.id_task.in_(list_all_tasks),Task.id_type_task).all()
for item in all_tasksuser:
    item.message="something"
    if item.id_type_task == 2:
        #do some stuff
    if item.id_task == 7 or item.id_task == 7:
        #do some stuff
I get this output error:
if item.id_type_task == 2:
AttributeError: 'TaskUser' object has no attribute 'id_type_task'
This is expected, as my SQLAlchemy request only queries one table, so I can't access the columns of table Task.
BUT I CAN call the columns of TaskUser by their names (see item.id_task).
So I change my SQLAlchemy to this:
all_tasksuser=db_mysql.session.query(TaskUser,Task.id,Task.id_type_task).filter(TaskUser.id_task==Task.id) \
.filter(TaskUser.id_task.in_(list_all_tasks),Task.id_type_task).all()
This time I include the table Task in my query, BUT I CAN'T call the columns by their names; I have to use the [index] of the columns.
I get this kind of error message:
AttributeError: 'result' object has no attribute 'message'
The problem is I have many more columns (around 40) on both tables. It is too complicated to handle data with index numbers of columns.
I need to have a list of rows with data from 2 different tables and I need to be able to call the data by column name in a loop.
Do you think it is possible?
The key point leading to the confusion is that when you query a mapped class like TaskUser, SQLAlchemy returns instances of that class. For example:
q1 = TaskUser.query.filter(...).all() # returns a list of [TaskUser]
q2 = db_mysql.session.query(TaskUser).filter(...).all() # ditto
However, if you specify only specific columns, you will receive just a (special) list of tuples:
q3 = db_mysql.session.query(TaskUser.col1, TaskUser.col2, ...)...
If you switch your mindset to completely use the ORM paradigm, you will work mostly with objects. In your specific example, the workflow could be similar to below, assuming you have relationships defined on your models:
# model
class Task(...):
    id = ...
    id_type_task = ...
class TaskUser(...):
    id = ...
    id_task = Column(ForeignKey(Task.id))
    message = ...
    task = relationship(Task, backref="task_users")
# query
all_tasksuser = TaskUser.query ...
# work
for item in all_tasksuser:
    item.message = "something"
    if item.task.id_type_task == 2:  # <- CHANGED to navigate the relationship ...
        #do some stuff
    if item.id_task == 7 or item.id_task == 7:  # unchanged: id_task is a TaskUser column
        #do some stuff
The first error message arises because a query and filter without a join (or any other way of including the second table) cannot give you columns from both tables. You need to either put both tables into the session query or join the two tables in order to gather column values from both, so this code:
all_tasksuser=TaskUser.query.filter(TaskUser.id_task==Task.id) \
.filter(TaskUser.id_task.in_(list_all_tasks),Task.id_type_task).all()
needs to look more like this:
all_tasksuser=TaskUser.query.join(Task) \
.filter(TaskUser.id_task.in_(list_all_tasks),Task.id_type_task).all()
or like this:
all_tasksuser=session.query(TaskUser, Task).filter(TaskUser.id_task==Task.id) \
.filter(TaskUser.id_task.in_(list_all_tasks),Task.id_type_task).all()
Another thing is that the data will be structured differently. In the first example each result is a TaskUser, so you reach the Task columns through a relationship (assuming one named task is defined on TaskUser):
for taskuser in all_tasksuser:
    task = taskuser.task
    # to reference id_type_task: task.id_type_task
and in the second example each result is a tuple, so the for loop should look like this:
for taskuser, task in all_tasksuser:
    # to reference id_type_task: task.id_type_task
NOTE: I haven't checked all these examples, so there may be errors, but the concepts are there. For more info, please refer to this page:
https://www.tutorialspoint.com/sqlalchemy/sqlalchemy_orm_working_with_joins.htm
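As a runnable illustration of the two-entity query, here is a minimal sketch in which each result unpacks into a (TaskUser, Task) pair, so columns from both tables stay reachable by name however many there are. The model definitions, sample rows, and in-memory SQLite engine are assumptions for the demo (the question uses Flask-SQLAlchemy and MySQL), and it assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Task(Base):  # hypothetical model based on the question's columns
    __tablename__ = "task"
    id = Column(Integer, primary_key=True)
    id_type_task = Column(Integer)


class TaskUser(Base):  # hypothetical model based on the question's columns
    __tablename__ = "task_user"
    id = Column(Integer, primary_key=True)
    id_task = Column(Integer, ForeignKey("task.id"))
    message = Column(String)


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

list_all_tasks = [1, 2, 3, 4, 5]
seen = []

with Session(engine) as session:
    session.add_all([
        Task(id=1, id_type_task=2),
        Task(id=7, id_type_task=3),
        TaskUser(id=1, id_task=1, message="old"),
        TaskUser(id=2, id_task=7, message="old"),
    ])
    session.commit()

    rows = (
        session.query(TaskUser, Task)
        .join(Task, TaskUser.id_task == Task.id)
        .filter(TaskUser.id_task.in_(list_all_tasks))
        .all()
    )
    # Each row unpacks into two mapped objects with named attributes.
    for taskuser, task in rows:
        taskuser.message = "something"
        if task.id_type_task == 2:
            seen.append((taskuser.id, task.id_type_task, taskuser.message))

print(seen)
```

The changes to message would still need a session.commit() to be written back to the database.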

SQLAlchemy Joining 2 tables using Junction table

I am learning SQL-Python using SQLAlchemy and will appreciate much help on this.
I have 3 tables,
Table 1 (Actors) : nconst (primary key), names
Table 2 (Movies) : tconst (primary key) , titles
Table 3 (Junction table) : nconst (from Actors table) , tconst(from Movies table)
I am trying to obtain 10 rows of actors that acted in particular movies. Hence I am trying to do an inner join of Actors on Junction table (using nconst) and then another inner join onto Movies table.
In SQL, this means
FROM principals INNER JOIN actors
ON principals.nconst=actors.nconst INNER JOIN
movies ON principals.tconst=movies.tconst
In SQLAlchemy, my current code is:
mt = list(session.query(Movies, Principals, Actors).select_from(
join(Movies, Principals, Movies.tconst == Principals.tconst)
.join(Actors, Principals, Actors.nconst == Principals.nconst
).with_entities(
Movies.title, # Select clause
))
Alternatively, I am trying
from sqlalchemy.orm import join
mv = list(session.query(Actors).select_from(
join(Movies, Principals, Actors, Movies.tconst == Principals.tconst,
Actors.nconst == Principals.nconst) # Join clause
).with_entities(
Actors.name, # Select clause
Movies.title,
))
mv
The error I am getting is an AttributeError: "type object 'Actors' has no attribute '_from_objects'".
Appreciate much help on this. Thank you very much.
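A possible way to write this, given as a sketch under assumptions rather than a confirmed fix: sqlalchemy.orm.join() takes a left side, a right side, and an optional ON clause, so passing three entities to one call likely makes the third one be read as the ON clause. Chaining one Query.join() call per table avoids that. The model definitions, sample rows, and in-memory SQLite engine below are invented for the demo, and it assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, ForeignKey, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Actors(Base):  # hypothetical model
    __tablename__ = "actors"
    nconst = Column(String, primary_key=True)
    name = Column(String)


class Movies(Base):  # hypothetical model
    __tablename__ = "movies"
    tconst = Column(String, primary_key=True)
    title = Column(String)


class Principals(Base):  # junction table
    __tablename__ = "principals"
    nconst = Column(String, ForeignKey("actors.nconst"), primary_key=True)
    tconst = Column(String, ForeignKey("movies.tconst"), primary_key=True)


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Actors(nconst="nm1", name="Actor One"),
        Actors(nconst="nm2", name="Actor Two"),
        Movies(tconst="tt1", title="Movie A"),
        Principals(nconst="nm1", tconst="tt1"),
        Principals(nconst="nm2", tconst="tt1"),
    ])
    session.commit()

    # FROM principals JOIN actors ... JOIN movies ..., one join() per table.
    pairs = (
        session.query(Actors.name, Movies.title)
        .select_from(Principals)
        .join(Actors, Principals.nconst == Actors.nconst)
        .join(Movies, Principals.tconst == Movies.tconst)
        .order_by(Actors.name)
        .limit(10)
        .all()
    )
    result = [tuple(row) for row in pairs]

print(result)
```

Selecting Actors.name and Movies.title directly in session.query() replaces the with_entities() step, and a .filter() on Movies.title could narrow the result to particular movies.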

If a condition is not fulfilled return data for that user, else another query

I have a table with these data:
ID, Name, LastName, Date, Type
I have to query the table for the user with ID 1.
Get the row, if the Type of that user is not 2, then return that user, else return all users that have the same LastName and Date.
What would be the most efficient way to do this ?
What I had done is :
query1 = "SELECT * FROM clients WHERE ID = 1"
query2 = "SELECT * FROM clients WHERE LastName = %s AND Date = %s"
And I execute the first query
cursor.execute(query1)
rows = cursor.fetchall()
for row in rows:
    if row['Type'] == 2:
        cursor.execute(query2, (row['LastName'], row['Date']))
        results = cursor.fetchall()  # save results
    else:
        results = rows
Is there a more efficient way of doing this using Joins?
For example, if I only have a left join, how would I also check whether the type of the user is 2?
And if there are multiple rows to be returned, how do I assign them to an array of objects in Python?
Just do two queries to avoid loops here:
q1 = """
SELECT c.* FROM clients c where c.ID = 1
"""
q2 = """
SELECT b.* FROM clients b
JOIN (SELECT c.* FROM
clients c
WHERE c.ID = 1
AND
c.Type = 2) a
ON
a.LastName = b.LastName
AND
a.Date = b.Date
"""
Then you can just execute both queries and you'll have all the desired results without the need for loops. Your loop executes n queries, where n is the number of rows that match, as opposed to grabbing everything with one join in one pass. Without more specifics on the desired data structure of the final results (it seems you only care about saving them), this should give you what you want.
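The two-query idea can be sketched end to end with Python's built-in sqlite3 as a stand-in database. The sample rows and the fetch_clients helper are hypothetical, and the ? placeholders are sqlite3's style (the driver in the question uses %s). Each row is turned into a dict, which is one simple way to get a list of objects keyed by column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows allow access by column name
conn.executescript("""
    CREATE TABLE clients (ID INTEGER PRIMARY KEY, Name TEXT, LastName TEXT,
                          Date TEXT, Type INTEGER);
    -- Hypothetical sample data.
    INSERT INTO clients VALUES
        (1, 'Ann', 'Smith', '2021-01-01', 2),
        (2, 'Bob', 'Smith', '2021-01-01', 1),
        (3, 'Cat', 'Jones', '2021-01-01', 1);
""")

q1 = "SELECT c.* FROM clients c WHERE c.ID = ?"
q2 = """
    SELECT b.* FROM clients b
    JOIN (SELECT c.* FROM clients c WHERE c.ID = ? AND c.Type = 2) a
      ON a.LastName = b.LastName AND a.Date = b.Date
    ORDER BY b.ID
"""


def fetch_clients(conn, client_id):
    """Return the matching group if the client is Type 2, else the client alone."""
    related = conn.execute(q2, (client_id,)).fetchall()
    # q2 is empty unless the client exists and has Type 2; fall back to q1.
    rows = related or conn.execute(q1, (client_id,)).fetchall()
    return [dict(row) for row in rows]


results = fetch_clients(conn, 1)
print([r["ID"] for r in results])
```

At most two statements run per lookup, regardless of how many rows share the LastName and Date.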
