About unique=True and (unique=True, index=True) in sqlalchemy - python

When I create tables using Flask-SQLAlchemy like this:
class Te(Model):
    __tablename__ = 'tt'
    id = Column(db.Integer(), primary_key=True)
    t1 = Column(db.String(80), unique=True)
    t3 = Column(db.String(80), unique=True, index=True)
then in Sequel Pro I see the generated table definition:
CREATE TABLE `tt` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`t1` varchar(80) DEFAULT NULL,
`t3` varchar(80) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `t1` (`t1`),
UNIQUE KEY `ix_tt_t3` (`t3`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Does this mean t1 is exactly the same as t3 in MySQL? So when you define unique=True, is it not required to define index=True as well?
Thanks.

I think you are confused about the purpose of an index in SQLAlchemy. In SQL databases, indexes are used to speed up query performance.
See the SQLAlchemy documentation on defining constraints and indexes.
You can tell the index key was used because the generated SQL is:
UNIQUE KEY `ix_tt_t3` (`t3`)
SQLAlchemy names indexes with the pattern ix_%(column_0_label)s, and that matches the generated SQL.
So whether or not you use index is only a matter of performance, while the unique key means the column's values cannot be repeated anywhere in that column of the 'tt' table.
Hope this helps,

It is not required. A unique constraint is more often than not implemented using a unique index, but you need not care about that detail, if all you want is uniqueness.
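A quick check with the stdlib sqlite3 module (SQLite rather than MySQL, but the behavior carries over) illustrates that point: uniqueness is enforced by the constraint alone, with no index=True anywhere:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tt (id INTEGER PRIMARY KEY, t1 TEXT UNIQUE)")
conn.execute("INSERT INTO tt (t1) VALUES ('a')")

try:
    # Second 'a' violates the UNIQUE constraint on t1
    conn.execute("INSERT INTO tt (t1) VALUES ('a')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(duplicate_rejected)  # the constraint is enforced without any explicit index
```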

Related

How do I get a SQLite query to return the id of a table when I have two tables that have an attribute named id?

I can't seem to find anything on how to access the id attribute from the table I want. I have 4 tables that I have joined: users, workouts, exercises, and sets. They all have primary keys with the attribute name id.
My query:
query = """SELECT users.firstName, workouts.dateandtime, workouts.id, sets.*, exercises.name
           FROM users
           JOIN workouts ON users.id = workouts.userID
           JOIN sets ON workouts.id = sets.workoutID
           JOIN exercises ON sets.exerciseID = exercises.id
           WHERE users.id = ? ORDER BY sets.id DESC"""
I'm only grabbing workouts.id and sets.id, because users.id is already known when the user logs in and exercises.id is shared amongst all users, so neither is important in this step.
Trying to access the sets.id like this does not work:
posts_unsorted = cur.execute(query, (userID,)).fetchall()
for e in posts_unsorted:
    print(e['id'])       # prints workouts.id, presumably because it's the first id in the query
    print(e['sets.id'])  # error, because the key 'sets.id' does not exist
Is there a way to name the sets.id when making the query so that I can actually use it? Should I be setting up my database differently to grab the sets.id? I don't know what direction I should be going.
The post How do you avoid column name conflicts? shows that you can give your tables aliases. This makes it easier to refer to your tables in queries, and it also lets you control what each column in the result set is named.
If you have two tables that both have an attribute called id, you will need to give them aliases to be able to access both attributes.
An example:
.schema sets
CREATE TABLE "sets"(
    id INTEGER NOT NULL,
    interval INTEGER NOT NULL,
    workoutID INTEGER NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (workoutID) REFERENCES workouts(id)
);
.schema workouts
CREATE TABLE "workouts"(
    id INTEGER NOT NULL,
    date SMALLDATETIME NOT NULL,
    PRIMARY KEY (id)
);
Fill the database:
INSERT INTO workouts (date) VALUES ('2022-03-14'), ('2022-02-13');
INSERT INTO sets (interval, workoutID) VALUES (5, 1), (4, 1), (3, 2), (2, 2);
Both tables have a primary key labeled id. If you must access both ids you will need to add an alias in your query.
database = sqlite3.connect("name.db")
database.row_factory = sqlite3.Row
cur = database.cursor()
query = """SELECT sets.id AS s_id, workouts.date AS w_date, workouts.id AS w_id
           FROM sets JOIN workouts ON sets.workoutID = workouts.id"""
posts = cur.execute(query).fetchall()
This will return rows that behave like named tuples, making it easy to retrieve the data you want. The data will look like this:
[{'s_id':1, 'w_date':'2022-03-14', 'w_id':1},
{'s_id':2, 'w_date':'2022-03-14', 'w_id':1},
{'s_id':3, 'w_date':'2022-02-13', 'w_id':2},
{'s_id':4, 'w_date':'2022-02-13', 'w_id':2}]
With this set of data you will be able to access everything by name instead of index.
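Putting it all together, here is a self-contained sketch with in-memory SQLite and the made-up data from above, showing both id columns addressable by their aliases:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become addressable by column name
cur = conn.cursor()

cur.executescript("""
CREATE TABLE workouts (
    id INTEGER PRIMARY KEY,
    date TEXT NOT NULL
);
CREATE TABLE sets (
    id INTEGER PRIMARY KEY,
    interval INTEGER NOT NULL,
    workoutID INTEGER NOT NULL,
    FOREIGN KEY (workoutID) REFERENCES workouts(id)
);
INSERT INTO workouts (date) VALUES ('2022-03-14'), ('2022-02-13');
INSERT INTO sets (interval, workoutID) VALUES (5, 1), (4, 1), (3, 2), (2, 2);
""")

rows = cur.execute("""
    SELECT sets.id AS s_id, workouts.date AS w_date, workouts.id AS w_id
    FROM sets JOIN workouts ON sets.workoutID = workouts.id
    ORDER BY s_id
""").fetchall()

# Each alias is now a distinct key on the row object
for r in rows:
    print(r["s_id"], r["w_date"], r["w_id"])
```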

SQLAlchemy: partial unique constraint where a field has a certain value

In my flask project I need a table with a unique constraint on one column, but only for rows where another column has a certain value. So I am trying to do something like this:
if premiumuser_id = "a value I don't know in advance" then track_id=unique
This is similar to Creating partial unique index with sqlalchemy on Postgres, but I use sqlite (where partial indexes should also be possible: https://docs.sqlalchemy.org/en/13/dialects/sqlite.html?highlight=partial%20indexes#partial-indexes) and the condition is different.
So far my code looks like that:
class Queue(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    track_id = db.Column(db.Integer)
    premiumuser_id = db.Column(
        db.Integer, db.ForeignKey("premium_user.id"), nullable=False
    )
    __table_args__ = db.Index(
        "idx_partially_unique_track",
        "track_id",
        unique=True,
        sqlite_where="and here I'm lost",
    )
All the examples I've found operate with boolean or fixed values. What should the syntax of sqlite_where look like for the condition premiumuser_id = "a value I don't know in advance"?
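For reference, the DDL such an Index needs to emit is an ordinary SQLite partial unique index. One caveat: SQLite requires the WHERE clause of a partial index to be a constant expression fixed at schema time, so a value known only at runtime cannot appear in it. A minimal stdlib sqlite3 sketch of the target DDL, with the made-up value 42 standing in for the unknown one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE queue (
    id INTEGER PRIMARY KEY,
    track_id INTEGER,
    premiumuser_id INTEGER NOT NULL
);
-- track_id must be unique, but only among rows where premiumuser_id = 42
CREATE UNIQUE INDEX idx_partially_unique_track
    ON queue (track_id) WHERE premiumuser_id = 42;
""")

conn.execute("INSERT INTO queue (track_id, premiumuser_id) VALUES (1, 42)")
conn.execute("INSERT INTO queue (track_id, premiumuser_id) VALUES (1, 7)")  # fine: outside the partial index
try:
    conn.execute("INSERT INTO queue (track_id, premiumuser_id) VALUES (1, 42)")
    conflict = False
except sqlite3.IntegrityError:
    conflict = True
print(conflict)  # the duplicate inside the indexed subset is rejected
```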

Python doesn't select joined Tables from SQlite database while just running a query works fine

I am using Python 3.6.3 and SQLite 3.14.2. I have two tables, one of which has a foreign key pointing to the other one. When I run the query with the join in SQLite browser, it works fine and returns the results I need. But when I try to execute the query in Python, it always returns an empty list, no matter how simple I make the join. Can anyone help me? Thank you in advance.
query = '''SELECT f.ID, f.FoodItemName, f.WaterPerKilo, r.AmountInKilo
           FROM FoodItems AS f INNER JOIN RecipeItems AS r ON f.ID = r.FoodItemID
           WHERE r.RecipeID = {:d}'''.format(db_rec[0])
print(query)
db_fooditems = cur.execute(query).fetchall()  # this returns []
The Tables are as follows:
CREATE TABLE "FoodItems" (
`ID` INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`FoodItemName` TEXT NOT NULL,
`WaterPerKilo` REAL NOT NULL)
CREATE TABLE "RecipeItems" (
`ID` INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
`RecipeID` INTEGER NOT NULL,
`FoodItemID` INTEGER NOT NULL,
`AmountInKilo` REAL NOT NULL)
with some random data.
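For what it's worth, the join itself appears fine: a minimal in-memory repro with made-up data (and a bound parameter in place of .format()) does return rows, which hints the empty result comes from elsewhere, e.g. the script connecting to a different database file than expected:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE FoodItems (
    ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    FoodItemName TEXT NOT NULL,
    WaterPerKilo REAL NOT NULL
);
CREATE TABLE RecipeItems (
    ID INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
    RecipeID INTEGER NOT NULL,
    FoodItemID INTEGER NOT NULL,
    AmountInKilo REAL NOT NULL
);
INSERT INTO FoodItems (FoodItemName, WaterPerKilo) VALUES ('Rice', 0.1);
INSERT INTO RecipeItems (RecipeID, FoodItemID, AmountInKilo) VALUES (7, 1, 2.5);
""")

# Same join as in the question, but with a bound parameter instead of .format()
query = """SELECT f.ID, f.FoodItemName, f.WaterPerKilo, r.AmountInKilo
           FROM FoodItems AS f INNER JOIN RecipeItems AS r ON f.ID = r.FoodItemID
           WHERE r.RecipeID = ?"""
rows = cur.execute(query, (7,)).fetchall()
print(rows)  # non-empty: the join works when the data is really there
```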

create table without partitions

I am trying to create a copy of a table through a python script that has all the qualities of the original except for the partitions. I want to do this multiple times in my script (through a for loop) because I want to mysqldump daily files of old data from that table, so I'm trying to use something like:
CREATE TABLE temp_utilization LIKE utilization WITHOUT PARTITIONING;
Here is the original table:
CREATE TABLE `utilization` (
`wrep_time` timestamp NULL DEFAULT NULL,
`end_time` timestamp NULL DEFAULT NULL,
`location` varchar(64) NOT NULL,
`sub_location` varchar(64) NOT NULL,
`model_id` varchar(255) DEFAULT NULL,
`offline` int(11) DEFAULT NULL,
`disabled` int(11) NOT NULL DEFAULT '0',
`total` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`location`,`sub_location`,`wrep_time`),
KEY `key_location` (`location`),
KEY `key_sub_location` (`sub_location`),
KEY `end_time` (`end_time`),
KEY `wrep_time` (`wrep_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (UNIX_TIMESTAMP(wrep_time))
(PARTITION p0 VALUES LESS THAN (1391990400) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (1392076800) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (1392163200) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (1392249600) ENGINE = InnoDB,
PARTITION p492 VALUES LESS THAN (1434499200) ENGINE = InnoDB,
PARTITION p493 VALUES LESS THAN (1434585600) ENGINE = InnoDB,
PARTITION p494 VALUES LESS THAN (1434672000) ENGINE = InnoDB,
PARTITION p495 VALUES LESS THAN (1434758400) ENGINE = InnoDB,
PARTITION p496 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
I would like to create a temp table which contains a create table like this:
CREATE TABLE `temp_utilization` (
`wrep_time` timestamp NULL DEFAULT NULL,
`end_time` timestamp NULL DEFAULT NULL,
`location` varchar(64) NOT NULL,
`sub_location` varchar(64) NOT NULL,
`model_id` varchar(255) DEFAULT NULL,
`offline` int(11) DEFAULT NULL,
`disabled` int(11) NOT NULL DEFAULT '0',
`total` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`location`,`sub_location`,`wrep_time`),
KEY `key_location` (`location`),
KEY `key_sub_location` (`sub_location`),
KEY `end_time` (`end_time`),
KEY `wrep_time` (`wrep_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> alter table utilization remove partitioning;
Query OK, 0 rows affected (0.40 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show create table utilization\G
*************************** 1. row ***************************
Table: utilization
Create Table: CREATE TABLE `utilization` (
`wrep_time` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`end_time` timestamp NULL DEFAULT NULL,
`location` varchar(64) NOT NULL,
`sub_location` varchar(64) NOT NULL,
`model_id` varchar(255) DEFAULT NULL,
`offline` int(11) DEFAULT NULL,
`disabled` int(11) NOT NULL DEFAULT '0',
`total` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`location`,`sub_location`,`wrep_time`),
KEY `key_location` (`location`),
KEY `key_sub_location` (`sub_location`),
KEY `end_time` (`end_time`),
KEY `wrep_time` (`wrep_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
So, for your example:
CREATE TABLE temp_utilization LIKE utilization;
ALTER TABLE temp_utilization REMOVE PARTITIONING;
Then, during your loop, you can CREATE TABLE t1 LIKE temp_utilization, or however you wish to name the tables.
No, it does not appear that you can create a table like another partitioned table, minus the partitions, in one command as you suggested above.
The partitioning is part of the table definition and is stored in the metadata. You can check that by executing show create table yourtablename;
If you just want to create the table over and over again in a loop, without the partitions and the data, I see three options (added one because of Cez):
Hard-code the table definition in your script.
Create the table in the DB without the partitions, so you have one temp table already created, and use that as your template to loop through.
Run two separate commands from your script in the loop: a CREATE TABLE ... LIKE and then an ALTER TABLE to remove the partitions.
You can choose whichever option best suits your environment.
You can reference your options when creating a table at dev.mysql.
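The two-command option can be sketched in Python. The helper below only builds the SQL statement pairs; the table names and date range are made up, and actually executing them against MySQL (through whatever client library you use) is left out:

```python
from datetime import date, timedelta

def copy_without_partitions_sql(source, dest):
    """Statements to clone a table's definition and strip its partitioning."""
    return [
        f"CREATE TABLE `{dest}` LIKE `{source}`",
        f"ALTER TABLE `{dest}` REMOVE PARTITIONING",
    ]

# e.g. one copy per day of old data, to be mysqldump'ed and dropped afterwards
start = date(2015, 6, 1)
statements = []
for offset in range(3):
    day = start + timedelta(days=offset)
    dest = f"temp_utilization_{day:%Y%m%d}"
    statements.extend(copy_without_partitions_sql("utilization", dest))

for stmt in statements:
    print(stmt)  # feed each of these to cursor.execute() in your loop
```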

How do I speed up (or break up) this MySQL query?

I'm building a video recommendation site (think Pandora for music videos) in python and MySQL. I have three tables in my db:
video - a table of the videos. Data doesn't change. Columns are:
CREATE TABLE `video` (
id int(11) NOT NULL AUTO_INCREMENT,
website_id smallint(3) unsigned DEFAULT '0',
rating_global varchar(128) DEFAULT '0',
title varchar(256) DEFAULT NULL,
thumb_url text,
PRIMARY KEY (`id`),
KEY `websites` (`website_id`),
KEY `id` (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=49362 DEFAULT CHARSET=utf8
video_tag - a table of the tags (attributes) associated with each video. Doesn't change.
CREATE TABLE `video_tag` (
id int(7) NOT NULL AUTO_INCREMENT,
video_id mediumint(7) unsigned DEFAULT '0',
tag_id mediumint(7) unsigned DEFAULT '0',
PRIMARY KEY (`id`),
KEY `video_id` (`video_id`),
KEY `tag_id` (`tag_id`)
) ENGINE=InnoDB AUTO_INCREMENT=562456 DEFAULT CHARSET=utf8
user_rating - a table of good or bad ratings that the user has given each tag. Data always changing.
CREATE TABLE `user_rating` (
id int(11) NOT NULL AUTO_INCREMENT,
user_id smallint(3) unsigned DEFAULT '0',
tag_id int(5) unsigned DEFAULT '0',
tag_rating float(10,5) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `video` (`tag_id`),
KEY `user_id` (`user_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=447 DEFAULT CHARSET=utf8
Based on the user's preferences, I want to score each unwatched video, and try and predict what they will like best. This has resulted in the following massive query, which takes about 2 seconds to complete for 50,000 videos:
SELECT video_tag.video_id,
(sum(user_rating.tag_rating) * video.rating_global) as score
FROM video_tag
JOIN user_rating ON user_rating.tag_id = video_tag.tag_id
JOIN video ON video.id = video_tag.video_id
WHERE user_rating.user_id = 1 AND video.website_id = 2
AND rating_global > 0 AND video_id NOT IN (1,2,3) GROUP BY video_id
ORDER BY score DESC LIMIT 20
I desperately need to make this more efficient, so I'm just looking for advice as to what the best direction is. Some ideas I've considered:
a) Rework my db table structure (not sure how)
b) Offload more of the grouping and aggregation into Python (haven't figured out a way to join three tables that is actually faster)
c) Store the non-changing tables in memory to try and speed computation time (earlier tinkering hasn't yielded any gains yet..)
How would you recommend making this more efficient?
Thank you!!
--
Per request in the comments, EXPLAIN SELECT.. shows:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE user_rating ref video,user_id user_id 3 const 88 Using where; Using temporary; Using filesort
1 SIMPLE video_tag ref video_id,tag_id tag_id 4 db.user_rating.tag_id 92 Using where
1 SIMPLE video eq_ref PRIMARY,websites,id PRIMARY 4 db.video_tag.video_id 1 Using where
Change the field type of rating_global to a numeric type (either float or integer); there is no need for it to be varchar. Personally, I would change all rating fields to integer; I see no need for them to be float.
Drop the KEY on id; the PRIMARY KEY is already indexed.
video.id,rating_global,website_id
Watch the integer lengths for your references (e.g. video_id -> video.id); you may run out of numbers. These sizes should be the same.
I suggest the following 2-step solution to replace your query:
CREATE TEMPORARY TABLE rating_stats ENGINE=MEMORY
SELECT video_id, SUM(tag_rating) AS tag_rating_sum
FROM user_rating ur JOIN video_tag vt ON vt.tag_id = ur.tag_id AND ur.user_id=1
GROUP BY video_id ORDER BY NULL
SELECT v.id, tag_rating_sum*rating_global AS score FROM video v
JOIN rating_stats rs ON rs.video_id = v.id
WHERE v.website_id=2 AND v.rating_global > 0 AND v.id NOT IN (1,2,3)
ORDER BY score DESC LIMIT 20
For the latter query to perform really fast, you could incorporate the website_id and rating_global fields into the PRIMARY KEY of the video table (perhaps only website_id is enough, though).
You can also keep these statistics in another table and precalculate them dynamically, based on user login/action frequency. I am guessing you can show cached data instead of live results; there shouldn't be much difference.