I've got the following two tables:
User
userid | email | phone
1 | some#email.com | 555-555-5555
2 | some#otheremail.com | 555-444-3333
3 | one#moreemail.com | 333-444-1111
4 | last#one.com | 123-333-2123
UserTag
id | user_id | tag
1 | 1 | tag1
2 | 1 | tag2
3 | 1 | cool_tag
4 | 1 | some_tag
5 | 2 | new_tag
6 | 2 | foo
7 | 4 | tag1
I want to run a query in SQLAlchemy that joins those two tables and returns all users who do NOT have the tags "tag1" or "tag2". In this case, the query should return the users with userid 2 and 3. Any help would be greatly appreciated.
I need the opposite of this query:
users.join(UserTag, User.userid == UserTag.user_id)
    .filter(
        or_(
            UserTag.tag.like('tag1'),
            UserTag.tag.like('tag2')
        )
    )
I have been going at this for hours but always end up with the wrong users or sometimes all of them. An SQL query which achieves this would also be helpful. I'll try to convert that to SQLAlchemy.
I'm not sure how this would look in SQLAlchemy, but hopefully an explanation of why the query is written this way will help you get there.
This is an outer join: you want all the records from one table (User) even if there are no matching records in the other table (UserTag). With User listed first, that is a left join. Beyond that, you want only the records that have no match in UserTag for a specific filter, so the tag filter goes in the ON clause and the WHERE clause keeps the rows where the joined side is NULL.
SELECT user.userid, email, phone
FROM user LEFT JOIN usertag
  ON usertag.user_id = user.userid
 AND usertag.tag IN ('tag1', 'tag2')
WHERE usertag.user_id IS NULL;
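The left-join approach can be checked quickly with Python's built-in sqlite3 module. This is only a sketch against an in-memory copy of the sample data; the email and phone values are left out, and the final usertag row is given id 7 so that the ids are unique (both are assumptions made for the demo).

```python
import sqlite3

# In-memory copy of the two tables from the question (emails/phones omitted).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userid INTEGER PRIMARY KEY, email TEXT, phone TEXT);
    CREATE TABLE usertag (id INTEGER PRIMARY KEY, user_id INTEGER, tag TEXT);
    INSERT INTO user (userid) VALUES (1), (2), (3), (4);
    INSERT INTO usertag (id, user_id, tag) VALUES
        (1, 1, 'tag1'), (2, 1, 'tag2'), (3, 1, 'cool_tag'), (4, 1, 'some_tag'),
        (5, 2, 'new_tag'), (6, 2, 'foo'), (7, 4, 'tag1');
""")

# The tag filter lives in the ON clause, so users without a matching tag
# still produce a row whose usertag columns are NULL; WHERE keeps only those.
anti_join_sql = """
    SELECT user.userid
    FROM user LEFT JOIN usertag
      ON usertag.user_id = user.userid
     AND usertag.tag IN ('tag1', 'tag2')
    WHERE usertag.user_id IS NULL
    ORDER BY user.userid
"""
anti_join_ids = [row[0] for row in conn.execute(anti_join_sql)]
print(anti_join_ids)  # [2, 3]
```

User 3, who has no tags at all, is kept because the outer join still yields a row for them.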
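The SQL would go like this. The tagged users have to be excluded through a subquery; an inner join with NOT IN on the tag would still return user 1 (who also has other tags) and would drop user 3 (who has no tags at all):
select u.* from user u where u.userid not in (select ut.user_id from usertag ut where ut.tag in ('tag1', 'tag2'));
I have not used SQLAlchemy, so you will need to convert it to the equivalent SQLAlchemy query.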
Hope it helps.
Thanks.
Assuming your model defines a relationship as below:
class User(Base):
    ...

class UserTag(Base):
    ...
    user = relationship("User", backref="tags")
the query is as follows:
tags = ['tag1', 'tag2']
qry = session.query(User).filter(~User.tags.any(UserTag.tag.in_(tags))).order_by(User.userid)
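A negated any() on a relationship is rendered as a NOT EXISTS correlated subquery. As a rough illustration of that SQL shape, here is a sketch using the standard-library sqlite3 module; the helper name users_without_tags and the trimmed-down schema are assumptions made for the demo, not part of the models above.

```python
import sqlite3

def users_without_tags(conn, tags):
    """Return the ids of users that have none of the given tags."""
    placeholders = ", ".join("?" for _ in tags)
    sql = f"""
        SELECT u.userid
        FROM user AS u
        WHERE NOT EXISTS (
            SELECT 1
            FROM usertag AS ut
            WHERE ut.user_id = u.userid
              AND ut.tag IN ({placeholders})
        )
        ORDER BY u.userid
    """
    return [row[0] for row in conn.execute(sql, list(tags))]

# Hypothetical in-memory copy of the sample data (ids only).
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
    CREATE TABLE user (userid INTEGER PRIMARY KEY);
    CREATE TABLE usertag (id INTEGER PRIMARY KEY, user_id INTEGER, tag TEXT);
    INSERT INTO user (userid) VALUES (1), (2), (3), (4);
    INSERT INTO usertag (id, user_id, tag) VALUES
        (1, 1, 'tag1'), (2, 1, 'tag2'), (3, 1, 'cool_tag'), (4, 1, 'some_tag'),
        (5, 2, 'new_tag'), (6, 2, 'foo'), (7, 4, 'tag1');
""")
untagged_ids = users_without_tags(demo_conn, ["tag1", "tag2"])
print(untagged_ids)  # [2, 3]
```

The tag values are passed as bound parameters; only the placeholder list is built with string formatting.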
I am new to Python and am currently trying to create a web form to edit customer data. The user selects a customer and gets all DSL products linked to that customer. What I am now trying to do is get the maximum downstream possible for a customer. So when the customer has DSL1, DSL3 and DSL4, their MaxDownstream is 550. Sorry for my poor English skills.
Here is the structure of my tables:
Customer_has_product:
Customer_idCustomer | Product_idProduct
----------------------------
1 | 1
1 | 3
1 | 4
2 | 5
3 | 3
Customer:
idCustomer | MaxDownstream
----------------------------
1 |
2 |
3 |
Product:
idProduct | Name | downstream
-------------------------------------------------
1 | DSL1 | 50
2 | DSL2 | 100
3 | DSL3 | 550
4 | DSL4 | 400
5 | DSL5 | 1000
And the code I've got so far:
db_session = Session(db_engine)
customer_object = db_session.query(Customer).filter_by(
    idCustomer=productform.Customer.data.idCustomer
).first()
productlist = request.form.getlist("DSLPRODUCTS_PRIVATE")
oldproducts = db_session.query(Customer_has_product.Product_idProduct).filter_by(
    Customer_idCustomer=customer_object.idCustomer)
id_list_delete = list(set([r for r, in oldproducts]) - set(productlist))
for delid in id_list_delete:
    db_session.query(Customer_has_product).filter_by(
        Customer_idCustomer=customer_object.idCustomer,
        Product_idProduct=delid).delete()
db_session.commit()
for product in productlist:
    if db_session.query(Customer_has_product).filter_by(
        Customer_idCustomer=customer_object.idCustomer,
        Product_idProduct=product
    ).first() is not None:
        continue
    else:
        product_link_to_add = Customer_has_product(
            Customer_idCustomer=productform.Customer.data.idCustomer,
            Product_idProduct=product
        )
        db_session.add(product_link_to_add)
db_session.commit()
What you want to do is JOIN the tables onto each other. All relational database engines support joins, as does SQLAlchemy.
So how do you do that in SQLAlchemy?
You have two options, really. One is to use the Query builder of SQLAlchemy's ORM; the other is to use SQLAlchemy Core (upon which the ORM is built) directly. I prefer the latter because it maps more directly to SELECT statements, but I'm going to show both.
Using SQLAlchemy Core
How to do a join in Core is covered in the SQLAlchemy Core documentation. The first argument is the table to JOIN to; the second argument is the JOIN condition.
from sqlalchemy import select, func

query = select(
    [
        Customer.idCustomer,
        func.max(Product.downstream),
    ]
).select_from(
    Customer.__table__
    .join(Customer_has_product.__table__,
          Customer_has_product.Customer_idCustomer ==
          Customer.idCustomer)
    .join(Product.__table__,
          Product.idProduct == Customer_has_product.Product_idProduct)
).group_by(
    Customer.idCustomer
)
# Now we can execute the built query on the database.
result = db_session.execute(query).fetchall()
print(result)  # Should now give you the correct result.
Using SQLAlchemy ORM
To simplify this, it's best to declare some relationships on your models. The join method is covered in the SQLAlchemy ORM documentation. The first argument to join is the model to join onto and the second argument is the JOIN condition again.
Without the relationships you'll have to do it like this.
result = (db_session
    .query(Customer.idCustomer, func.max(Product.downstream))
    .join(Customer_has_product,
          Customer_has_product.Customer_idCustomer ==
          Customer.idCustomer)
    .join(Product,
          Product.idProduct == Customer_has_product.Product_idProduct)
    .group_by(Customer.idCustomer)
).all()
print(result)
This should be enough to get the idea on how to do this.
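To see what the join-and-group query is expected to return for the sample data, here is a small sketch using the standard-library sqlite3 module instead of the asker's database. The in-memory schema mirrors the three tables from the question; the final UPDATE that stores the value in Customer.MaxDownstream is an assumption about what the asker ultimately wants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (idCustomer INTEGER PRIMARY KEY, MaxDownstream INTEGER);
    CREATE TABLE Product (idProduct INTEGER PRIMARY KEY, Name TEXT, downstream INTEGER);
    CREATE TABLE Customer_has_product (
        Customer_idCustomer INTEGER, Product_idProduct INTEGER);
    INSERT INTO Customer (idCustomer) VALUES (1), (2), (3);
    INSERT INTO Product VALUES
        (1, 'DSL1', 50), (2, 'DSL2', 100), (3, 'DSL3', 550),
        (4, 'DSL4', 400), (5, 'DSL5', 1000);
    INSERT INTO Customer_has_product VALUES (1, 1), (1, 3), (1, 4), (2, 5), (3, 3);
""")

# Join customer -> link table -> product, then take the max per customer.
max_sql = """
    SELECT c.idCustomer, MAX(p.downstream)
    FROM Customer AS c
    JOIN Customer_has_product AS chp ON chp.Customer_idCustomer = c.idCustomer
    JOIN Product AS p ON p.idProduct = chp.Product_idProduct
    GROUP BY c.idCustomer
    ORDER BY c.idCustomer
"""
max_per_customer = conn.execute(max_sql).fetchall()
print(max_per_customer)  # [(1, 550), (2, 1000), (3, 550)]

# Optionally persist the value with a correlated subquery.
conn.execute("""
    UPDATE Customer SET MaxDownstream = (
        SELECT MAX(p.downstream)
        FROM Customer_has_product AS chp
        JOIN Product AS p ON p.idProduct = chp.Product_idProduct
        WHERE chp.Customer_idCustomer = Customer.idCustomer
    )
""")
stored = conn.execute(
    "SELECT idCustomer, MaxDownstream FROM Customer ORDER BY idCustomer"
).fetchall()
```

Customer 1 holds DSL1, DSL3 and DSL4, so the maximum is the 550 of DSL3.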
I have a table that holds some data about users. It has two fields, like and smile. I need to get data from the table, grouped by user_id, that shows whether each user has any likes or smiles. The query that I would write in SQL looks like:
select sum(smile) > 0 as has_smile,
       sum(like) > 0 as has_like,
       user_id
from ratings
group by user_id;
This would provide output like:
| has_smile | has_like | user_id |
+-----------+----------+---------+
| 1 | 0 | 1 |
| 1 | 1 | 2 |
Is there any chance this query can be translated to SQLAlchemy (Flask-SQLAlchemy to be precise)? I know there is db.func.sum, but I don't know how to add a comparison to it or how to give the result a label. What I have so far is:
cls.query.with_entities(cls.user_id).group_by(cls.user_id).\
    add_columns(db.func.sum(cls.smile).label("has_smile"),
                db.func.sum(cls.like).label("has_like")).all()
but that returns the exact number of smiles/likes instead of just 1/0 depending on whether there is a smile/like or not.
Thanks to operator overloading you'd do comparison the way you're used to doing in Python in general:
db.func.sum(cls.smile) > 0
which produces an SQL expression object that you can then give a label to:
(db.func.sum(cls.smile) > 0).label('has_smile')
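The SQL that such a labeled comparison should end up producing can be tried out with the standard-library sqlite3 module. This is only a sketch: the ratings rows are made up to reproduce the output shown in the question, and the like column is quoted because LIKE is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ratings (user_id INTEGER, smile INTEGER, "like" INTEGER);
    -- Hypothetical rows: user 1 only smiles, user 2 smiles and likes.
    INSERT INTO ratings VALUES (1, 1, 0), (1, 1, 0), (2, 1, 1), (2, 0, 1);
""")

flags_sql = """
    SELECT SUM(smile) > 0 AS has_smile,
           SUM("like") > 0 AS has_like,
           user_id
    FROM ratings
    GROUP BY user_id
    ORDER BY user_id
"""
flags = conn.execute(flags_sql).fetchall()
print(flags)  # [(1, 0, 1), (1, 1, 2)]
```

The comparison is applied to the aggregate, so each group collapses to a 1/0 flag rather than a count.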
Can the following MySQL query be done with a single SQLAlchemy session.query or do I have to run a second session.query ? If so, how so?
Select *, (select c from table2 where id = table1.id) as d from table1 where foo = x
What you want is SQLAlchemy's subquery object. Essentially, you write a query as normal, but instead of ending the query with .all() or .first() (as you would normally do to return some kind of result directly), you end your query with .subquery() to return a subquery object. The subquery object basically generates the subquery SQL embedded within an alias, but doesn't run it. You can then use it in your primary query, and SQLAlchemy will issue the necessary SQL to perform the query and subquery in a single operation.
Let's say we had the following student_scores table:
+------------+-------+-----+
| name | score | age |
+------------+-------+-----+
| Xu Feng | 95 | 25 |
| John Smith | 88 | 26 |
| Sarah Taft | 89 | 25 |
| Ahmed Zaki | 86 | 26 |
+------------+-------+-----+
(Ignore the horrible database design)
In this example, we want to get a result set containing all the students and their scores, joined to the average score by age. In raw SQL we would do something like this:
SELECT ss.name, ss.age, ss.score, sub.average
FROM student_scores AS "ss"
JOIN ( SELECT age, AVG(score) AS "average"
FROM student_scores
GROUP BY age) AS "sub"
ON ss.age = sub.age
ORDER BY ss.score DESC
The result should be something like this:
+------------+-------+-----+---------+
| name | score | age | average |
+------------+-------+-----+---------+
| Xu Feng | 95 | 25 | 92 |
| Sarah Taft | 89 | 25 | 92 |
| John Smith | 88 | 26 | 87 |
| Ahmed Zaki | 86 | 26 | 87 |
+------------+-------+-----+---------+
In SQLAlchemy, we can first define the subquery on its own:
from sqlalchemy.sql import func

avg_scores = (
    session.query(
        func.avg(StudentScores.score).label('average'),
        StudentScores.age
    )
    .group_by(StudentScores.age)
    .subquery()
)
Now our subquery is defined, but no statements have actually been sent to the database. Nevertheless we can treat our subquery object almost as though it were just another table, and write our main query:
results = (
    session.query(StudentScores, avg_scores)
    .join(avg_scores, StudentScores.age == avg_scores.c.age)
    .order_by(StudentScores.score.desc()).all()
)
Only now is any SQL issued to the database, and we get the same results as the raw subquery example.
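For comparison, the raw SQL version of the subquery join can be run against an in-memory SQLite database with the standard-library sqlite3 module. This is a sketch for checking the expected rows, not a replacement for the SQLAlchemy query; note that SQLite's AVG returns floats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_scores (name TEXT, score INTEGER, age INTEGER);
    INSERT INTO student_scores VALUES
        ('Xu Feng', 95, 25), ('John Smith', 88, 26),
        ('Sarah Taft', 89, 25), ('Ahmed Zaki', 86, 26);
""")

# The derived table "sub" plays the role of the SQLAlchemy subquery object.
subquery_sql = """
    SELECT ss.name, ss.age, ss.score, sub.average
    FROM student_scores AS ss
    JOIN (SELECT age, AVG(score) AS average
          FROM student_scores
          GROUP BY age) AS sub
      ON ss.age = sub.age
    ORDER BY ss.score DESC
"""
rows = conn.execute(subquery_sql).fetchall()
for row in rows:
    print(row)
```

Each student row carries the average of its own age group, and the whole thing is sent to the database as a single statement.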
Having said that, the example you provided is actually pretty trivial and shouldn't require a subquery at all. Depending on how your relationships are defined, SQLAlchemy can eagerly load related objects, so that the object returned by:
results = session.query(Table1).filter(Table1.foo == 'x').all()
will have access to the child (or parent) record(s) from Table2, even though we didn't ask for it here - because the relationship defined directly in the models is handling that for us. Check out "Relationship Loading Techniques" in the SQLAlchemy docs for more information on how this works.
I have a table defined like so:
Column | Type | Modifiers | Storage | Stats target | Description
-------------+---------+-----------+---------+--------------+-------------
id | uuid | not null | plain | |
user_id | uuid | | plain | |
area_id | integer | | plain | |
vote_amount | integer | | plain | |
I want to be able to generate a rank 'column' when I query this database. This rank column would be ordered by the vote_amount column. I have attempted to create a query to do this, it looks like so:
subq_rank = db.session.query(user_stories).add_columns(db.func.rank().over(partition_by=user_stories.user_id, order_by=user_stories.vote_amount).label('rank')).subquery('slr')
data = db.session.query(user_stories).select_entity_from(subq_rank).filter(user_stories.area_id == id).group_by(-subq_rank.c.rank).limit(50).all()
Hopefully my attempt will give you an idea of what I am trying to achieve.
Thanks.
Well, if you need these columns in every query, I would do it in the database. I would create a view that contains the rank column and query that view from the code to get the data directly:
CREATE VIEW [ranking_user_stories] AS
SELECT TOP 50 * FROM
(SELECT *, rank() over (partition by user_stories.user_id order by user_stories.vote_amount ASC) AS ranking
FROM user_stories
WHERE user_stories.area_id = id) uS
ORDER BY vote_amount ASC
It's the same logic as your code, but in SQL. If you are using MySQL or PostgreSQL, drop TOP 50 and put LIMIT 50 at the end of the query instead. I don't see the point of the final GROUP BY on the ranking, but if you need it:
CREATE VIEW [ranking_user_stories] AS
SELECT TOP 50 MAX(id) AS id, user_id, area_id, MAX(vote_amount) AS vote_amount, ranking FROM
(SELECT *, rank() over (partition by user_stories.user_id order by user_stories.vote_amount ASC) AS ranking
FROM user_stories
WHERE user_stories.area_id = id) uS
GROUP BY user_id, area_id, ranking
ORDER BY MAX(vote_amount) ASC
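A window-function ranking can be tried out locally with the standard-library sqlite3 module, provided the bundled SQLite is 3.25 or newer. The sketch below is built on assumptions: the rows are invented, text ids stand in for the uuid columns, and the ranking runs over all stories in one area with the highest vote_amount first. Adding PARTITION BY user_id inside OVER would rank per user instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_stories (
        id TEXT PRIMARY KEY, user_id TEXT, area_id INTEGER, vote_amount INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO user_stories VALUES
        ('a', 'u1', 1, 10), ('b', 'u2', 1, 30), ('c', 'u1', 1, 20),
        ('d', 'u3', 2, 99), ('e', 'u2', 1, 20);
""")

rank_sql = """
    SELECT id, vote_amount,
           RANK() OVER (ORDER BY vote_amount DESC) AS ranking
    FROM user_stories
    WHERE area_id = ?
    ORDER BY ranking, id
    LIMIT 50
"""
ranked = conn.execute(rank_sql, (1,)).fetchall()
print(ranked)  # [('b', 30, 1), ('c', 20, 2), ('e', 20, 2), ('a', 10, 4)]
```

Ties share a rank and the next rank is skipped, which is how RANK() differs from ROW_NUMBER().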
Using the SQLAlchemy ORM with PostgreSQL (v9.5), how can I prevent a column from being selected automatically when it is only used for sorting? The sorted column should not appear in the SELECT list.
Hopefully the sample code below makes this more clear.
Example code
A table with an integer 'id', an integer 'object_id' and a string 'text':
id | object_id | text
---------------------
1 | 1 | house
2 | 2 | tree
3 | 1 | dog
The following query should return the distinct object_id as its own id with the most recent text:
query = session.query(
    MyTable.object_id.label('id'),
    MyTable.text
).\
    distinct(MyTable.object_id).\
    order_by(MyTable.object_id, MyTable.id.desc())
So far so good; but when I compile the query:
print(query.statement.compile(dialect=postgresql.dialect()))
The mytable.id and mytable.object_id are selected as well, so the column id is specified twice:
SELECT DISTINCT ON (mytable.object_id) mytable.object_id AS id,
mytable.text,
mytable.object_id,
mytable.id
FROM mytable
ORDER BY mytable.object_id,
mytable.id DESC
You can try this; it should work:
query = session.query(MyTable.object_id.distinct().label('id'), MyTable.text).order_by(MyTable.object_id, MyTable.id.desc())
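To check which rows any candidate query ought to return, the intended result (the most recent text per object_id, with object_id exposed as id) can be reproduced with the standard-library sqlite3 module. SQLite has no DISTINCT ON, so this sketch swaps in a different technique, a MAX(id) subquery, and assumes that a higher id means a more recent row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, object_id INTEGER, text TEXT);
    INSERT INTO mytable VALUES (1, 1, 'house'), (2, 2, 'tree'), (3, 1, 'dog');
""")

# Pick the row with the highest id for every object_id; only the two
# wanted columns appear in the SELECT list.
latest_sql = """
    SELECT object_id AS id, text
    FROM mytable
    WHERE id IN (SELECT MAX(id) FROM mytable GROUP BY object_id)
    ORDER BY object_id
"""
latest = conn.execute(latest_sql).fetchall()
print(latest)  # [(1, 'dog'), (2, 'tree')]
```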