SQLAlchemy; prevent automatic selection when ordering - python

Using the SQLAlchemy ORM with PostgreSQL (v9.5), how can I prevent the column I sort by from being selected automatically? The sort column should not appear in the SELECT list.
Hopefully the sample code below makes this more clear.
Example code
A table with an integer 'id', an integer 'object_id' and a string 'text':
id | object_id | text
---------------------
1 | 1 | house
2 | 2 | tree
3 | 1 | dog
The following query should return the distinct object_id as its own id with the most recent text:
query = session.query(
    MyTable.object_id.label('id'),
    MyTable.text
).\
    distinct(MyTable.object_id).\
    order_by(MyTable.object_id, MyTable.id.desc())
So far so good; but when I compile the query:
print(query.statement.compile(dialect=postgresql.dialect()))
The mytable.id and mytable.object_id are selected as well, so the column id is specified twice:
SELECT DISTINCT ON (mytable.object_id) mytable.object_id AS id,
mytable.text,
mytable.object_id,
mytable.id
FROM mytable
ORDER BY mytable.object_id,
mytable.id DESC

You could try applying distinct() to the column expression instead of to the query:
query = session.query(
    MyTable.object_id.distinct().label('id'),
    MyTable.text
).order_by(MyTable.object_id, MyTable.id.desc())

Related

How to create a filter in SQLite database across multiple tables?

I am looking for a way to create a number of filters across a few tables in my SQL database. The 2 tables I require the data from are Order and OrderDetails.
The Order table is like this:
------------------------------------
| OrderID | CustomerID | OrderDate |
------------------------------------
The OrderDetails table is like this:
----------------------------------
| OrderID | ProductID | Quantity |
----------------------------------
I want to make it so that it counts the number of instances a particular OrderID pops up in a single day. For example, it will choose an OrderID in Order and then match it to the OrderIDs in OrderDetails, counting the number of times it pops up in OrderDetails.
-----------------------------------------------------------
| OrderID | CustomerID | OrderDate | ProductID | Quantity |
-----------------------------------------------------------
The code I used is below here:
# Execute SQL Query (number of orders made on a particular day entered by a user)
cursor.execute("""
SELECT 'order.*', count('orderdetails.orderid') as 'NumberOfOrders'
from 'order'
left join 'order'
on ('order.orderid' = 'orderdetais.orderid')
group by
'order.orderid'
""")
print(cursor.fetchall())
Also, the current output that I get is this when I should get 3:
[('order.*', 830)]
Your immediate problem is that you are misusing single quotes. If you need to quote an identifier (table name, column name and the like), you should use double quotes in SQLite (this is actually the SQL standard). An expression such as order.* should not be quoted as a whole at all. You are also joining the order table to itself, while you probably want to join orderdetails instead.
You seem to want:
select
    o.orderID,
    o.customerID,
    o.orderDate,
    count(od.orderid) number_of_orders
from "order" o
left join orderdetails od on od.orderid = o.orderid
group by o.orderID, o.customerID, o.orderDate
order is a language keyword, so I quoted it; that table would be better named orders, to avoid the conflict. Other identifiers do not need to be quoted here.
Since all you want from orderdetails is the count, you could also use a subquery instead of aggregation:
select
    o.*,
    (select count(*) from orderdetails od where od.orderid = o.orderid) number_of_orders
from "order" o
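To check the join-and-count approach from Python, here is a minimal sketch using the standard sqlite3 module. The sample rows (customer codes, dates, products) are made up for illustration; only the table and column names come from the question. It counts od.orderid rather than *, so an order with no detail rows reports 0 instead of 1.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the question.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE "order" (orderid INTEGER PRIMARY KEY, customerid TEXT, orderdate TEXT);
    CREATE TABLE orderdetails (orderid INTEGER, productid INTEGER, quantity INTEGER);
    INSERT INTO "order" VALUES (1, 'ALFKI', '2020-01-01'), (2, 'BONAP', '2020-01-01');
    INSERT INTO orderdetails VALUES (1, 10, 5), (1, 11, 2), (1, 12, 1);
""")

# "order" is a keyword, so it is double-quoted; nothing else needs quoting.
rows = con.execute("""
    SELECT o.orderid, o.customerid, o.orderdate,
           COUNT(od.orderid) AS number_of_orders
    FROM "order" AS o
    LEFT JOIN orderdetails AS od ON od.orderid = o.orderid
    GROUP BY o.orderid, o.customerid, o.orderdate
    ORDER BY o.orderid
""").fetchall()

for row in rows:
    print(row)
```

Order 1 has three detail rows and order 2 has none, so the counts come out as 3 and 0.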

Create new SQLite table combining column from other tables with sqlite3 and python

I am trying to create a new table that combines columns from two different tables.
Let's imagine then that I have a database named db.db that includes two tables named table1 and table2.
table1 looks like this:
id | item | price
-------------
1 | book | 20
2 | copy | 30
3 | pen | 10
and table2 looks like this (note that it contains duplicated rows):
id | item | color
-------------
1 | book | blue
2 | copy | red
3 | pen | red
1 | book | blue
2 | copy | red
3 | pen | red
Now I'm trying to create a new table named new_table that combines the price and color columns, matched on the same id and without duplicates. My code is the following (it obviously does not work because of my poor SQL skills):
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE new_table (id varchar, item integer, price integer, color integer)")
cur.execute("ATTACH DATABASE 'db.db' AS other;")
cur.execute("INSERT INTO new_table (id, item, price) SELECT * FROM other.table1")
cur.execute("UPDATE new_table SET color = (SELECT color FROM other.table2 WHERE distinct(id))")
con.commit()
I know there are multiple errors in the last line of code but I can't get my head around it. What would be your approach to this problem? Thanks!
Something like
CREATE TABLE new_table(id INTEGER, item TEXT, price INTEGER, color TEXT);
INSERT INTO new_table(id, item, price, color)
SELECT DISTINCT t1.id, t1.item, t1.price, t2.color
FROM table1 AS t1
JOIN table2 AS t2 ON t1.id = t2.id;
Note the fixed column types; yours were all sorts of strange. item and color as integers?
If each id value is unique in the new table (only one row will ever have an id of 1, only one will have an id of 2, and so on), that column should probably be an INTEGER PRIMARY KEY, too.
EDIT: Also, since you're creating this table in an in-memory database from tables from an attached file-based database... maybe you want a temporary table instead? Or a view might be more appropriate? Not sure what your goal is.
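For reference, here is a rough sketch of the INSERT ... SELECT DISTINCT approach run through Python's sqlite3 module. To keep it self-contained it creates table1 and table2 in the same in-memory database (an assumption; in the question they live in the attached db.db, so they would be referenced as other.table1 and other.table2).

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Sample data copied from the question; table2 contains every row twice.
    CREATE TABLE table1 (id INTEGER, item TEXT, price INTEGER);
    CREATE TABLE table2 (id INTEGER, item TEXT, color TEXT);
    INSERT INTO table1 VALUES (1, 'book', 20), (2, 'copy', 30), (3, 'pen', 10);
    INSERT INTO table2 VALUES
        (1, 'book', 'blue'), (2, 'copy', 'red'), (3, 'pen', 'red'),
        (1, 'book', 'blue'), (2, 'copy', 'red'), (3, 'pen', 'red');

    CREATE TABLE new_table (id INTEGER PRIMARY KEY, item TEXT, price INTEGER, color TEXT);

    -- DISTINCT collapses the duplicate rows produced by joining against table2.
    INSERT INTO new_table (id, item, price, color)
    SELECT DISTINCT t1.id, t1.item, t1.price, t2.color
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.id = t2.id;
""")

new_rows = con.execute(
    "SELECT id, item, price, color FROM new_table ORDER BY id"
).fetchall()
print(new_rows)
```

Note that the PRIMARY KEY only works here because each id maps to a single color; if one id had two different colors in table2, the insert would fail with a uniqueness error.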

Using func rank in SQLAlchemy to rank rows in a table

I have a table defined like so:
   Column    |  Type   | Modifiers | Storage | Stats target | Description
-------------+---------+-----------+---------+--------------+-------------
 id          | uuid    | not null  | plain   |              |
 user_id     | uuid    |           | plain   |              |
 area_id     | integer |           | plain   |              |
 vote_amount | integer |           | plain   |              |
I want to generate a rank 'column' when I query this table. The rank would be ordered by the vote_amount column. I have attempted to write a query to do this; it looks like this:
subq_rank = db.session.query(user_stories).add_columns(db.func.rank.over(partition_by=user_stories.user_id, order_by=user_stories.vote_amount).label('rank')).subquery('slr')
data = db.session.query(user_stories).select_entity_from(subq_rank).filter(user_stories.area_id == id).group_by(-subq_rank.c.rank).limit(50).all()
Hopefully my attempt will give you an idea of what I am trying to achieve.
Thanks.
If you need these columns in every query, I would do it in the database instead. I would create a view that contains the rank column, and then query that view from the code to get the data directly:
CREATE VIEW [ranking_user_stories] AS
SELECT TOP 50 * FROM
(SELECT *, rank() over (partition by user_stories.user_id order by user_stories.vote_amount ASC) AS ranking
FROM user_stories
WHERE user_stories.area_id = id) uS
ORDER BY vote_amount ASC
It's the same logic as your code, but in SQL. If you are using MySQL or PostgreSQL, change TOP 50 to LIMIT 50 (placed at the end of the query). I don't see the point of the final GROUP BY on the rank, but if you need it:
CREATE VIEW [ranking_user_stories] AS
SELECT TOP 50 MAX(id) AS id, user_id, area_id, MAX(vote_amount) AS vote_amount, ranking FROM
(SELECT *, rank() over (partition by user_stories.user_id order by user_stories.vote_amount ASC) AS ranking
FROM user_stories
WHERE user_stories.area_id = id) uS
GROUP BY user_id, area_id, ranking
ORDER BY MAX(vote_amount) ASC
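If you want to see what rank() OVER (PARTITION BY ... ORDER BY ...) produces before wiring it into a view or into SQLAlchemy, here is a small sketch using plain SQL through Python's sqlite3 module (not SQLAlchemy; window functions need SQLite 3.25 or newer). The rows and the short string ids are invented for illustration, and the area filter is passed as a bound parameter instead of being baked into a view.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical rows; short strings stand in for the uuid columns.
    CREATE TABLE user_stories (id TEXT, user_id TEXT, area_id INTEGER, vote_amount INTEGER);
    INSERT INTO user_stories VALUES
        ('s1', 'u1', 7, 10),
        ('s2', 'u1', 7, 30),
        ('s3', 'u2', 7, 20),
        ('s4', 'u2', 8, 99);
""")

# The rank restarts at 1 for every user_id and follows ascending vote_amount.
ranked = con.execute("""
    SELECT id, user_id, vote_amount,
           rank() OVER (PARTITION BY user_id ORDER BY vote_amount) AS ranking
    FROM user_stories
    WHERE area_id = ?
    ORDER BY user_id, ranking
    LIMIT 50
""", (7,)).fetchall()

for row in ranked:
    print(row)
```

The WHERE clause is applied before the window function, so the story in area 8 never takes part in the ranking.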

Compare the results of 2 queries that have "ORDER BY" in PostgreSQL to check for match and mismatch

I am writing a small app that marks students' PostgreSQL queries against the teacher's queries. For ordinary queries I can easily use EXCEPT and UNION to find the mismatches. But how can I check the ones that need sorting?
What if the answer matches all rows, but only some of them are in the right order? How can I find the number of correctly ordered rows and mark the case properly?
My program is written in Python with the Psycopg2 library.
You can compare both queries joined by row_number(). Example:
create table example (id int, str text);
insert into example values (1, 'alfa'), (2, 'beta');
with teacher as ( -- teacher's query
    select * from example order by id
),
student as ( -- student's query
    select * from example order by id desc
),
teacher_rn as (
    select row_number() over () rn, *
    from teacher
),
student_rn as (
    select row_number() over () rn, *
    from student
)
select t.*, s.*
from teacher_rn t
join student_rn s
on t.rn = s.rn
where t <> s;
rn | id | str | rn | id | str
----+----+------+----+----+------
1 | 1 | alfa | 1 | 2 | beta
2 | 2 | beta | 2 | 1 | alfa
(2 rows)
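Since the program is in Python anyway, the same position-by-position comparison can also be done on the client after fetching both result sets. This is only a sketch: compare_ordered is a hypothetical helper name, and the two lists stand in for what cursor.fetchall() would return for the teacher's and the student's query.

```python
from typing import Sequence


def compare_ordered(teacher_rows: Sequence[tuple], student_rows: Sequence[tuple]):
    """Pair rows by position and return (matching_count, mismatches).

    Each mismatch is (row_number, teacher_row, student_row), numbered from 1
    like row_number(). If one result is longer, the extra rows are reported
    with None for the missing side.
    """
    mismatches = []
    matching = 0
    longest = max(len(teacher_rows), len(student_rows))
    for i in range(longest):
        t = teacher_rows[i] if i < len(teacher_rows) else None
        s = student_rows[i] if i < len(student_rows) else None
        if t == s:
            matching += 1
        else:
            mismatches.append((i + 1, t, s))
    return matching, mismatches


# Stand-ins for cursor.fetchall() results of the two queries.
teacher = [(1, "alfa"), (2, "beta")]
student = [(2, "beta"), (1, "alfa")]

matching, mismatches = compare_ordered(teacher, student)
print(matching, mismatches)
```

Unlike the inner join on rn, this version also reports rows that exist in only one of the two results, which may or may not be what you want for marking.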

Having trouble with a PostgreSQL query

I've got the following two tables:
User
userid | email | phone
1 | some@email.com | 555-555-5555
2 | some@otheremail.com | 555-444-3333
3 | one@moreemail.com | 333-444-1111
4 | last@one.com | 123-333-2123
UserTag
id | user_id | tag
1 | 1 | tag1
2 | 1 | tag2
3 | 1 | cool_tag
4 | 1 | some_tag
5 | 2 | new_tag
6 | 2 | foo
6 | 4 | tag1
I want to run a query in SQLAlchemy that joins those two tables and returns all users who do NOT have the tags "tag1" or "tag2". In this case, the query should return the users with userid 2 and 3. Any help would be greatly appreciated.
I need the opposite of this query:
users.join(UserTag, User.userid == UserTag.user_id).filter(
    or_(
        UserTag.tag.like('tag1'),
        UserTag.tag.like('tag2')
    )
)
I have been going at this for hours but always end up with the wrong users or sometimes all of them. An SQL query which achieves this would also be helpful. I'll try to convert that to SQLAlchemy.
Not sure how this would look in SQLAlchemy, but hopefully an explanation of why the query is the way it is will help you get there.
This is an outer join: you want all the records from one table (User) even if there are no matching records in the other table (UserTag). If we put User first, it is a left join. Beyond that, you want only the records that have no match in UserTag for a specific filter.
SELECT u.userid, u.email, u.phone
FROM "user" u LEFT JOIN usertag ut
ON ut.user_id = u.userid
AND ut.tag IN ('tag1', 'tag2')
WHERE ut.user_id IS NULL;
The SQL would go like this:
select u.* from "user" u where not exists (select 1 from usertag ut where ut.user_id = u.userid and ut.tag in ('tag1', 'tag2'));
I have not used SQLAlchemy, so you will need to convert it to an equivalent SQLAlchemy query.
Hope it helps.
Thanks.
Assuming your model defines a relationship as below:
class User(Base):
    ...
class UserTag(Base):
    ...
    user = relationship("User", backref="tags")
the query follows:
tags = ['tag1', 'tag2']
qry = session.query(User).filter(~User.tags.any(UserTag.tag.in_(tags))).order_by(User.userid)
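To sanity-check the SQL side against the question's sample data, here is a small sketch using Python's sqlite3 module instead of PostgreSQL (an assumption made only so that it runs anywhere). It runs the LEFT JOIN ... IS NULL anti-join and, for comparison, an equivalent NOT EXISTS form. The table name "user" is double-quoted because user is a reserved word in PostgreSQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Sample data from the question.
    CREATE TABLE "user" (userid INTEGER PRIMARY KEY, email TEXT, phone TEXT);
    CREATE TABLE usertag (id INTEGER, user_id INTEGER, tag TEXT);
    INSERT INTO "user" VALUES
        (1, 'some@email.com', '555-555-5555'),
        (2, 'some@otheremail.com', '555-444-3333'),
        (3, 'one@moreemail.com', '333-444-1111'),
        (4, 'last@one.com', '123-333-2123');
    INSERT INTO usertag VALUES
        (1, 1, 'tag1'), (2, 1, 'tag2'), (3, 1, 'cool_tag'), (4, 1, 'some_tag'),
        (5, 2, 'new_tag'), (6, 2, 'foo'), (6, 4, 'tag1');
""")

# Anti-join: keep users for whom no tag1/tag2 row could be joined.
anti_join = [r[0] for r in con.execute("""
    SELECT u.userid
    FROM "user" AS u
    LEFT JOIN usertag AS ut
           ON ut.user_id = u.userid AND ut.tag IN ('tag1', 'tag2')
    WHERE ut.user_id IS NULL
    ORDER BY u.userid
""")]

# Same result expressed with NOT EXISTS.
not_exists = [r[0] for r in con.execute("""
    SELECT u.userid
    FROM "user" AS u
    WHERE NOT EXISTS (
        SELECT 1 FROM usertag AS ut
        WHERE ut.user_id = u.userid AND ut.tag IN ('tag1', 'tag2')
    )
    ORDER BY u.userid
""")]

print(anti_join, not_exists)
```

Both queries return users 2 and 3. User 3 has no tags at all, which is why an inner join on the tag table cannot produce the expected result.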
