I'm using SQLAlchemy with MySQL and have a table with two foreign keys:
class CampaignCreativeLink(db.Model):
    __tablename__ = 'campaign_creative_links'
    campaign_id = db.Column(db.Integer, db.ForeignKey('campaigns.id'),
                            primary_key=True)
    creative_id = db.Column(db.Integer, db.ForeignKey('creatives.id'),
                            primary_key=True)
Then I insert 3 rows into the table (simplified here from a loop) like this:
session.add(CampaignCreativeLink(campaign_id=8, creative_id=3))
session.add(CampaignCreativeLink(campaign_id=8, creative_id=2))
session.add(CampaignCreativeLink(campaign_id=8, creative_id=1))
session.commit()
But when I checked the table, the rows were in reverse order:
8 1
8 2
8 3
And a query returns them in reverse order too. What causes this, and how can I keep the order the same as when the rows were added?
A table is a set of rows and is therefore not guaranteed to have any order unless you specify ORDER BY.
In MySQL (InnoDB), the primary key acts as the clustered index. This means the rows are physically stored in the order given by the primary key, in this case (campaign_id, creative_id), regardless of insertion order. That is usually the order rows come back in if you don't specify an ORDER BY.
If you need your rows returned in a certain order, specify ORDER BY when you query.
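If the insertion order itself matters, one common pattern (a sketch, assuming you are free to alter the model; the position column is hypothetical) is to record the order explicitly and sort on it:
# Hypothetical `position` column recording insertion order.
class CampaignCreativeLink(db.Model):
    __tablename__ = 'campaign_creative_links'
    campaign_id = db.Column(db.Integer, db.ForeignKey('campaigns.id'),
                            primary_key=True)
    creative_id = db.Column(db.Integer, db.ForeignKey('creatives.id'),
                            primary_key=True)
    position = db.Column(db.Integer, nullable=False)

# An explicit ORDER BY returns the rows in insertion order.
links = (CampaignCreativeLink.query
         .filter_by(campaign_id=8)
         .order_by(CampaignCreativeLink.position)
         .all())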
Related
I have a DB query that matches the desired rows. Let's say (for simplicity):
select * from stats where id in (1, 2);
Now I want to extract several frequency statistics (count of distinct values) for multiple columns, across these matching rows:
-- `stats.status` is one such column
select status, count(*) from stats where id in (1, 2) group by 1 order by 2 desc;
-- `stats.category` is another column
select category, count(*) from stats where id in (1, 2) group by 1 order by 2 desc;
-- etc.
Is there a way to re-use the same underlying query in SQLAlchemy? Raw SQL works too.
Or even better, return all the histograms at once, in a single command?
I'm mostly interested in performance: I don't want Postgres to run the same row-matching once per column, over and over. The only thing that changes is which column is used for the histogram grouping; otherwise it's the same set of rows.
I don't want Postgres to run the same row-matching many times
That's one of the motivations behind the GROUPING SETS functionality. Try this model:
SELECT category, status, count(*)
FROM stats
WHERE id IN (1, 2)
GROUP BY GROUPING SETS ((category), (status));
User Abelisto's comment and the other answer both have the correct SQL required to generate the histograms for multiple fields in a single query.
The only edit I would suggest is to add an ORDER BY clause, since it seems from the OP's attempts that the more frequent labels are desired at the top of the result. You might find that sorting the results in Python rather than in the database is simpler; in that case, disregard the complexity the ORDER BY clause brings.
Thus, the modified query would be:
SELECT category, status, count(*)
FROM stats
WHERE id IN (1, 2)
GROUP BY GROUPING SETS (
    (category), (status)
)
ORDER BY
    GROUPING(category, status), 3 DESC
It is also possible to express the same query using sqlalchemy.
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Stats(Base):
    __tablename__ = 'stats'
    id = Column(Integer, primary_key=True)
    category = Column(Text)
    status = Column(Text)

stmt = select(
    [Stats.category, Stats.status, func.count(1)]
).where(
    Stats.id.in_([1, 2])
).group_by(
    func.grouping_sets(tuple_(Stats.category),
                       tuple_(Stats.status))
).order_by(
    func.grouping(Stats.category, Stats.status),
    func.count(1).desc()
)
Inspecting the compiled statement, we see that it generates the desired query (extra newlines added to the output for legibility):
print(stmt.compile(compile_kwargs={'literal_binds': True}))
# outputs:
SELECT stats.category, stats.status, count(1) AS count_1
FROM stats
WHERE stats.id IN (1, 2)
GROUP BY GROUPING SETS((stats.category), (stats.status))
ORDER BY grouping(stats.category, stats.status), count(1) DESC
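To split the combined result back into the two histograms, check which grouping column is non-NULL in each row. A minimal sketch, assuming engine is an Engine connected to your Postgres database:
# `engine` is assumed. GROUPING SETS leaves the column outside the
# current grouping set as NULL, which tells us where each row belongs.
category_hist, status_hist = {}, {}
with engine.connect() as conn:
    for category, status, count in conn.execute(stmt):
        if category is not None:
            category_hist[category] = count
        else:
            status_hist[status] = count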
I need to replace the default integer id in my model with a UUID. The problem is that it's being used in another model (as a foreign key).
Any idea how to perform this operation without losing data?
class A(Base):
    __tablename__ = 'a'
    b_id = Column(
        GUID(), ForeignKey('b.id'), nullable=False,
        server_default=text("uuid_generate_v4()")
    )

class B(Base):
    __tablename__ = 'b'
    id = Column(
        GUID(), primary_key=True,
        server_default=text("uuid_generate_v4()")
    )
Unfortunately it doesn't work, and I'm afraid I'll break the relation.
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) default for column "id" cannot be cast automatically to type uuid
The Alembic migration I've tried looks similar to:
op.execute('ALTER TABLE a ALTER COLUMN b_id SET DATA TYPE UUID USING (uuid_generate_v4())')
Add an id_tmp column to b with autogenerated UUID values, and a b_id_tmp column to a. Update a, joining b on the foreign key, to fill a.b_id_tmp with the corresponding UUIDs. Then drop a.b_id and b.id, rename the added columns, and re-establish the primary key and foreign key.
CREATE TABLE a(id int PRIMARY KEY, b_id int);
CREATE TABLE b(id int PRIMARY KEY);
ALTER TABLE a ADD CONSTRAINT a_b_id_fkey FOREIGN KEY(b_id) REFERENCES b(id);
INSERT INTO b VALUES (1), (2), (3);
INSERT INTO a VALUES (1, 1), (2, 2), (3, 2);
ALTER TABLE b ADD COLUMN id_tmp UUID NOT NULL DEFAULT uuid_generate_v1mc();
ALTER TABLE a ADD COLUMN b_id_tmp UUID;
UPDATE a SET b_id_tmp = b.id_tmp FROM b WHERE b.id = a.b_id;
ALTER TABLE a DROP COLUMN b_id;
ALTER TABLE a RENAME COLUMN b_id_tmp TO b_id;
ALTER TABLE b DROP COLUMN id;
ALTER TABLE b RENAME COLUMN id_tmp TO id;
ALTER TABLE b ADD PRIMARY KEY (id);
ALTER TABLE a ADD CONSTRAINT b_id_fkey FOREIGN KEY(b_id) REFERENCES b(id);
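Since the question uses Alembic, the same sequence can live in a migration. A sketch, assuming the uuid-ossp extension is installed and the tables are named as above:
from alembic import op

def upgrade():
    # Mirrors the SQL above, step by step.
    op.execute('ALTER TABLE b ADD COLUMN id_tmp UUID NOT NULL DEFAULT uuid_generate_v1mc()')
    op.execute('ALTER TABLE a ADD COLUMN b_id_tmp UUID')
    op.execute('UPDATE a SET b_id_tmp = b.id_tmp FROM b WHERE b.id = a.b_id')
    op.execute('ALTER TABLE a DROP COLUMN b_id')
    op.execute('ALTER TABLE a RENAME COLUMN b_id_tmp TO b_id')
    op.execute('ALTER TABLE b DROP COLUMN id')
    op.execute('ALTER TABLE b RENAME COLUMN id_tmp TO id')
    op.execute('ALTER TABLE b ADD PRIMARY KEY (id)')
    op.execute('ALTER TABLE a ADD CONSTRAINT b_id_fkey FOREIGN KEY (b_id) REFERENCES b(id)')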
Just as an aside, v1 UUIDs are more efficient to index than v4 because they embed a timestamp and so are roughly sequential, which you'll notice if you generate several in a row. That's a minor savings unless you need the higher randomness of v4 for external security reasons.
I am working with Python, PostgreSQL, SQLAlchemy and alembic.
I have to design a database, but I am kind of stuck because my design needs a column that will store a list of IDs which are basically foreign keys. I am not sure how to do that, or whether I should be doing it at all.
Example: I have a discount table which contains all the available discount codes. I have a column discount_applies where I want to store a list of all products to which the discount applies (I cannot edit the products table). Basically the column will contain a list of UUIDs of products on which the discount can be applied.
Don't store a list of IDs in a single column; model the many-to-many relation with an association table instead:
class Product(Base):
    .....

class Discount(Base):
    .....

class ProductDiscount(Base):
    __tablename__ = 'discount_applies'
    # Composite primary key: each (product, discount) pair appears at most once.
    product_id = Column(String(32), ForeignKey('product.id'),
                        primary_key=True, nullable=False)
    # If Discount's primary key is an Integer, change String to Integer.
    discount_id = Column(String(32), ForeignKey('discount.id'),
                         primary_key=True, nullable=False)
    product = relationship(Product)
    discount = relationship(Discount)
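As a usage sketch (the session and the IDs are placeholders), applying a discount to a product is then one row in the association table, and the products a discount applies to can be read back through it:
# Hypothetical `session`, `product_uuid` and `discount_id` for illustration.
session.add(ProductDiscount(product_id=product_uuid, discount_id=discount_id))
session.commit()

applies_to = [
    link.product
    for link in session.query(ProductDiscount).filter_by(discount_id=discount_id)
]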
SQLAlchemy supports creating partial indexes in PostgreSQL.
Is it possible to create a partial unique index through SQLAlchemy?
Imagine a table/model as so:
class ScheduledPayment(Base):
    invoice_id = Column(Integer)
    is_canceled = Column(Boolean, default=False)
I'd like a unique index where there can be only one "active" ScheduledPayment for a given invoice.
I can create this manually in postgres:
CREATE UNIQUE INDEX only_one_active_invoice ON scheduled_payment
    (invoice_id, is_canceled) WHERE NOT is_canceled;
I'm wondering how I can add that to my SQLAlchemy model using SQLAlchemy 0.9.
class ScheduledPayment(Base):
    __tablename__ = 'scheduled_payment'
    id = Column(Integer, primary_key=True)
    invoice_id = Column(Integer)
    is_canceled = Column(Boolean, default=False)

    __table_args__ = (
        Index('only_one_active_invoice', invoice_id, is_canceled,
              unique=True,
              postgresql_where=(~is_canceled)),
    )
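Emitting the DDL then produces the partial index from the question. A quick sketch, with a placeholder connection URL:
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:pass@localhost/mydb')  # placeholder URL
Base.metadata.create_all(engine)
# Among the emitted DDL:
# CREATE UNIQUE INDEX only_one_active_invoice
# ON scheduled_payment (invoice_id, is_canceled)
# WHERE NOT is_canceled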
In case someone stops by looking to set up a partial unique constraint with a column that can optionally be NULL, here's how:
__table_args__ = (
    db.Index(
        'uk_providers_name_category',
        'name', 'category',
        unique=True,
        postgresql_where=(user_id.is_(None))),
    db.Index(
        'uk_providers_name_category_user_id',
        'name', 'category', 'user_id',
        unique=True,
        postgresql_where=(user_id.isnot(None))),
)
where user_id is a column that can be NULL and I want a unique constraint enforced across all three columns (name, category, user_id) with NULL just being one of the allowed values for user_id.
To add to the answer by sas: postgresql_where does not seem to be able to accept multiple boolean clauses. So in the situation where you have TWO nullable columns (let's assume an additional price column), it is not possible to create four partial indices for all combinations of NULL/NOT NULL.
One workaround is to use default values that would never be 'valid' (e.g. -1 for price, or '' for a Text column). These compare as ordinary values, so no more than one row would be allowed to hold these defaults.
Obviously, you will also need to insert this default value in all existing rows of data (if applicable).
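A minimal sketch of that workaround (the model and column names are assumptions for illustration; Base is the usual declarative_base()):
from sqlalchemy import Column, Integer, Text, UniqueConstraint, text

class Provider(Base):
    __tablename__ = 'providers'
    id = Column(Integer, primary_key=True)
    name = Column(Text, nullable=False)
    category = Column(Text, nullable=False)
    # Sentinel defaults stand in for NULL, so one ordinary unique
    # constraint covers every combination.
    user_id = Column(Integer, nullable=False, server_default=text('-1'))
    price = Column(Integer, nullable=False, server_default=text('-1'))
    __table_args__ = (
        UniqueConstraint('name', 'category', 'user_id', 'price'),
    )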
I have a composite PK in table Strings (integer id, varchar(2) lang)
I want to create a FK to ONLY the id half of the PK from other tables. This means I'd potentially have many rows in the Strings table (translations) matching the FK. I just need to store the id and have referential integrity maintained by the DB.
Is this possible? If so, how?
This is from Wikipedia:
The columns in the referencing table must be the primary key or other candidate key in the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table.
Let's say you have this:
id | var
1 | 10
1 | 11
2 | 10
The foreign key must reference exactly one row in the referenced table, which is why it usually references the primary key.
In your case you need to make another table, Table1(id), that stores the ids, and make that column unique or the primary key. The id column in your current table is not unique, so you can't reference it directly. So: create Table1 with id as its primary key, and make the id in your current table a foreign key to Table1. Other tables can then create foreign keys to Table1.id, and the composite primary key on your current table stays as it is.
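A sketch of that layout in SQLAlchemy (table and column names are assumptions for illustration):
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class StringId(Base):
    __tablename__ = 'string_ids'  # holds each id exactly once
    id = Column(Integer, primary_key=True)

class Strings(Base):
    __tablename__ = 'strings'
    # Composite PK (id, lang); id alone is not unique here ...
    id = Column(Integer, ForeignKey('string_ids.id'), primary_key=True)
    lang = Column(String(2), primary_key=True)

class Label(Base):
    __tablename__ = 'labels'
    id = Column(Integer, primary_key=True)
    # ... so other tables reference string_ids.id instead.
    string_id = Column(Integer, ForeignKey('string_ids.id'), nullable=False)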