I need to return a list from a table, pairing the rows exactly once. So if I were to have:
ID NAME
01 Jones
02 Clay
03 Mark
04 Nancy
05 Larry
I need to pair them like this:
ID1 NAME1 ID2 NAME2
01 Jones 03 Mark
05 Larry 02 Clay
I can't just pair them in descending order of ID, because the pairing is based on another table that shows how many times each player has won. The problem at the moment is that I get every player paired with every other player. How can I enforce that once a player has been mentioned in a pair, whether as player 1 or player 2, they can never be paired again?
As an add-on to this question, how can I afterwards do something like this:
if someone exists without a pair (I record in another table that a person was pairless at one point), then redo the query until the person left without a pair is someone who hasn't been pairless before. I know I can easily check results like this in code, but I must do it all in PostgreSQL (I am using Python, if that has anything to do with it).
CREATE TABLE tournaments ( id BIGSERIAL PRIMARY KEY, name VARCHAR(250) );
CREATE TABLE players ( id BIGSERIAL PRIMARY KEY, name VARCHAR(250) );
CREATE TABLE participates ( id BIGSERIAL PRIMARY KEY, t_id INT NOT NULL, p_id INT NOT NULL, CONSTRAINT fk_tournament FOREIGN KEY (t_id) REFERENCES tournaments (id), CONSTRAINT fk_player FOREIGN KEY (p_id) REFERENCES players (id) );
CREATE TABLE matches ( id BIGSERIAL PRIMARY KEY, round INT NOT NULL, p_one_id INT NOT NULL, p_two_id INT, CONSTRAINT fk_p_one_id FOREIGN KEY (p_one_id) REFERENCES players (id), CONSTRAINT fk_p_two_id FOREIGN KEY (p_two_id) REFERENCES players (id) );
CREATE TABLE wins ( id BIGSERIAL PRIMARY KEY, m_id INT NOT NULL, p_id INT NOT NULL, CONSTRAINT fk_m_id FOREIGN KEY (m_id) REFERENCES matches (id), CONSTRAINT fk_p_id FOREIGN KEY (p_id) REFERENCES players (id) );
Regarding the 'checking if one was already paired': I would check whether they have a wins row with m_id = 0.
I do use psql.
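A minimal sketch of one way to do it in plain SQL with a window function (PostgreSQL; ranking by win count from the wins table is an assumption, so adapt the ORDER BY to your actual pairing rule):

WITH ranked AS (
    SELECT p.id, p.name,
           ROW_NUMBER() OVER (ORDER BY COUNT(w.id) DESC, p.id) AS rn
    FROM players p
    LEFT JOIN wins w ON w.p_id = p.id
    GROUP BY p.id, p.name
)
SELECT a.id AS id1, a.name AS name1,
       b.id AS id2, b.name AS name2   -- NULLs here mark the odd player out
FROM ranked a
LEFT JOIN ranked b ON b.rn = a.rn + 1
WHERE a.rn % 2 = 1;

Each player receives exactly one row number, and a result row is emitted only for odd row numbers, so every player appears in at most one pair. With an odd number of players the final row carries NULLs in id2/name2; that is the person you could then record in your 'was pairless' table.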
I'm trying to make a database of recipes. In the table "recipes", in a column "ingredients", I would like to have a list of ingredient IDs, e.g. [2,5,7]. Can I make something like this, or should I be looking for another solution?
import sqlite3
conn = sqlite3.connect('recipes.db')
c = conn.cursor()
c.execute('''CREATE TABLE recipes(ID INT, name TEXT, ingredients INT)''')
c.execute('''CREATE TABLE ingredients(ID INT, nazwa TEXT, kcal REAL)''')
Another idea is to make another table (the list of ingredients) with 15 columns holding ingredient numbers.
c.execute('''CREATE TABLE The_list_of_ingredients(ID INT, ingredient1 INT, ingredient2 INT, ...)''')
Can I connect every ingredient1, ingredient2, ... column with its respective ingredient ID?
You're likely looking for a many-to-many relation between recipes and their ingredients.
CREATE TABLE recipes(ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE ingredients(ID INTEGER PRIMARY KEY, name TEXT, kcal REAL);
CREATE TABLE recipe_ingredients(
ID INTEGER PRIMARY KEY AUTOINCREMENT,
recipe_id INTEGER,
ingredient_id INTEGER,
quantity REAL,
FOREIGN KEY(recipe_id) REFERENCES recipes(ID),
FOREIGN KEY(ingredient_id) REFERENCES ingredients(ID)
);
This way your data might look something like this:
ingredients
id | name  | kcal
---+-------+-----
 1 | egg   | 155
 2 | cream | 196

recipes
  id | name
-----+---------
1000 | omelette

recipe_ingredients
recipe_id | ingredient_id | quantity
----------+---------------+---------
     1000 |             1 |      100
     1000 |             2 |       50
(assuming kcal is kcal per 100g, and quantity is in grams and a rather creamy omelette)
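To read a recipe back out, you join along the two foreign keys; a sketch against the schema above (the per-100 g kcal math mirrors the assumption in the example):

SELECT r.name AS recipe,
       i.name AS ingredient,
       ri.quantity,
       i.kcal * ri.quantity / 100.0 AS kcal  -- kcal stored per 100 g, quantity in grams
FROM recipes r
JOIN recipe_ingredients ri ON ri.recipe_id = r.ID
JOIN ingredients i ON i.ID = ri.ingredient_id
WHERE r.ID = 1000;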
You can try to store the IDs as a string:
json.dumps(list_of_ingredients_ids)
But the best solution is probably a many-to-many relation.
I am working with Python and SQLite. I am constantly getting this message
"near ")": syntax error".
I tried adding a semicolon to all the queries, but I still get this error message.
tables.append("""
CREATE TABLE IF NOT EXISTS payment (
p_id integer PRIMARY KEY,
o_id integer NON NULL,
FOREIGN KEY(o_id) REFERENCES orders(o_id),
);"""
)
You have a comma before the final closing ). Simply remove it. (You probably also want NOT NULL rather than NON NULL; SQLite quietly absorbs NON into the column's type name, so no not-null constraint is actually applied.)
i.e. use :-
tables.append("""
CREATE TABLE IF NOT EXISTS payment (
p_id integer PRIMARY KEY,
o_id integer NON NULL,
FOREIGN KEY(o_id) REFERENCES orders(o_id)
);"""
)
Remove the comma at the end of FOREIGN KEY(o_id) REFERENCES orders(o_id).
The working code will be:
tables.append("""
CREATE TABLE IF NOT EXISTS payment (
p_id integer PRIMARY KEY,
o_id integer NON NULL,
FOREIGN KEY(o_id) REFERENCES orders(o_id)
);"""
)
Try this:
tables = []
tables.append("""
CREATE TABLE IF NOT EXISTS payment (
    p_id integer PRIMARY KEY,
    o_id integer NON NULL,
    FOREIGN KEY(o_id) REFERENCES orders(o_id)
);
""")
print(tables)
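Whichever variant you use, note that SQLite only enforces foreign keys when the pragma is enabled, and the referenced orders table must already exist; a minimal sketch (this orders definition is an assumption, yours may differ):

PRAGMA foreign_keys = ON;  -- SQLite ignores FOREIGN KEY constraints unless this is set per connection

CREATE TABLE IF NOT EXISTS orders (
    o_id integer PRIMARY KEY
);

CREATE TABLE IF NOT EXISTS payment (
    p_id integer PRIMARY KEY,
    o_id integer NOT NULL,
    FOREIGN KEY(o_id) REFERENCES orders(o_id)
);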
I have three main tables to keep track of products, locations, and the logistics between them, which involve moving products to and from various locations. I have made another table, balance, to keep a final balance of the quantity of each product at each location.
Here are the schemas:
products(prod_id INTEGER PRIMARY KEY AUTOINCREMENT,
prod_name TEXT UNIQUE NOT NULL,
prod_quantity INTEGER NOT NULL,
unallocated_quantity INTEGER)
Initially, when products are added, prod_quantity and unallocated_quantity hold the same value. unallocated_quantity is then reduced each time a quantity of the respective product is allocated.
location(loc_id INTEGER PRIMARY KEY AUTOINCREMENT,
loc_name TEXT UNIQUE NOT NULL)
logistics(trans_id INTEGER PRIMARY KEY AUTOINCREMENT,
prod_id INTEGER NOT NULL,
from_loc_id INTEGER NULL,
to_loc_id INTEGER NOT NULL,
prod_quantity INTEGER NOT NULL,
trans_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY(prod_id) REFERENCES products(prod_id),
FOREIGN KEY(from_loc_id) REFERENCES location(loc_id),
FOREIGN KEY(to_loc_id) REFERENCES location(loc_id))
balance(prod_id INTEGER NOT NULL,
loc_id INTEGER NOT NULL,
quantity INTEGER NOT NULL,
FOREIGN KEY(prod_id) REFERENCES products(prod_id),
FOREIGN KEY(loc_id) REFERENCES location(loc_id))
At each entry made in logistics, I want a trigger to update the values in balance, thereby keeping a summary of all the transactions (moving products between locations).
I thought of a trigger that, for each insert on logistics, checks whether a row with the same prod_id and loc_id already exists in the balance table and, if it does, updates it appropriately. However, I don't have the experience in SQLite to implement this idea.
I believe that your TRIGGER would be along the lines of either :-
CREATE TRIGGER IF NOT EXISTS logistics_added AFTER INSERT ON logistics
BEGIN
UPDATE balance SET quantity = quantity - new.prod_quantity
    WHERE prod_id = new.prod_id AND loc_id = new.from_loc_id;
UPDATE balance SET quantity = quantity + new.prod_quantity
    WHERE prod_id = new.prod_id AND loc_id = new.to_loc_id;
END;
or :-
CREATE TRIGGER IF NOT EXISTS logistics_added AFTER INSERT ON logistics
BEGIN
INSERT OR REPLACE INTO balance VALUES (
    new.prod_id, new.from_loc_id,
    COALESCE((SELECT quantity FROM balance WHERE prod_id = new.prod_id AND loc_id = new.from_loc_id), 0) - new.prod_quantity
);
INSERT OR REPLACE INTO balance VALUES (
    new.prod_id, new.to_loc_id,
    COALESCE((SELECT quantity FROM balance WHERE prod_id = new.prod_id AND loc_id = new.to_loc_id), 0) + new.prod_quantity
);
END;
Note that the second relies upon adding a UNIQUE constraint to the balance table, either PRIMARY KEY (prod_id, loc_id) or alternatively UNIQUE (prod_id, loc_id). The UNIQUE constraint would probably be required/wanted anyway.
The subtle difference is that the second would INSERT a balance row if an appropriate one didn't exist (the COALESCE turns the missing-row subselect from NULL into 0), while the first would simply do nothing in that case. (If from_loc_id can be NULL, as the logistics schema allows, you would also want to guard the from-side statement in either trigger.)
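For reference, a sketch of the balance definition the second trigger assumes (the composite PRIMARY KEY is what gives INSERT OR REPLACE its conflict target):

CREATE TABLE balance(
    prod_id INTEGER NOT NULL,
    loc_id INTEGER NOT NULL,
    quantity INTEGER NOT NULL,
    PRIMARY KEY (prod_id, loc_id),
    FOREIGN KEY(prod_id) REFERENCES products(prod_id),
    FOREIGN KEY(loc_id) REFERENCES location(loc_id)
);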
I am trying to create a copy of a table through a Python script that has all the qualities of the original except for the partitions. I want to do this multiple times in my script (in a for loop), because I want to mysqldump daily files of old data from that table, so I'm trying to use something like:
CREATE TABLE temp_utilization LIKE utilization WITHOUT PARTITIONING;
Here is the original table:
CREATE TABLE `utilization` (
`wrep_time` timestamp NULL DEFAULT NULL,
`end_time` timestamp NULL DEFAULT NULL,
`location` varchar(64) NOT NULL,
`sub_location` varchar(64) NOT NULL,
`model_id` varchar(255) DEFAULT NULL,
`offline` int(11) DEFAULT NULL,
`disabled` int(11) NOT NULL DEFAULT '0',
`total` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`location`,`sub_location`,`wrep_time`),
KEY `key_location` (`location`),
KEY `key_sub_location` (`sub_location`),
KEY `end_time` (`end_time`),
KEY `wrep_time` (`wrep_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (UNIX_TIMESTAMP(wrep_time))
(PARTITION p0 VALUES LESS THAN (1391990400) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (1392076800) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (1392163200) ENGINE = InnoDB,
PARTITION p3 VALUES LESS THAN (1392249600) ENGINE = InnoDB,
PARTITION p492 VALUES LESS THAN (1434499200) ENGINE = InnoDB,
PARTITION p493 VALUES LESS THAN (1434585600) ENGINE = InnoDB,
PARTITION p494 VALUES LESS THAN (1434672000) ENGINE = InnoDB,
PARTITION p495 VALUES LESS THAN (1434758400) ENGINE = InnoDB,
PARTITION p496 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
I would like the temp table's definition to come out like this:
CREATE TABLE `temp_utilization` (
`wrep_time` timestamp NULL DEFAULT NULL,
`end_time` timestamp NULL DEFAULT NULL,
`location` varchar(64) NOT NULL,
`sub_location` varchar(64) NOT NULL,
`model_id` varchar(255) DEFAULT NULL,
`offline` int(11) DEFAULT NULL,
`disabled` int(11) NOT NULL DEFAULT '0',
`total` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`location`,`sub_location`,`wrep_time`),
KEY `key_location` (`location`),
KEY `key_sub_location` (`sub_location`),
KEY `end_time` (`end_time`),
KEY `wrep_time` (`wrep_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> alter table utilization remove partitioning;
Query OK, 0 rows affected (0.40 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> show create table utilization\G
*************************** 1. row ***************************
Table: utilization
Create Table: CREATE TABLE `utilization` (
`wrep_time` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`end_time` timestamp NULL DEFAULT NULL,
`location` varchar(64) NOT NULL,
`sub_location` varchar(64) NOT NULL,
`model_id` varchar(255) DEFAULT NULL,
`offline` int(11) DEFAULT NULL,
`disabled` int(11) NOT NULL DEFAULT '0',
`total` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`location`,`sub_location`,`wrep_time`),
KEY `key_location` (`location`),
KEY `key_sub_location` (`sub_location`),
KEY `end_time` (`end_time`),
KEY `wrep_time` (`wrep_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
So, for your example:
CREATE TABLE temp_utilization LIKE utilization;
ALTER TABLE temp_utilization REMOVE PARTITIONING;
Then, during your loop, you can CREATE TABLE t1 LIKE temp_utilization, or however you wish to name the tables.
No, it does not appear that you can create a table like another table, minus the partitions, in one command as you suggested above, if the source table is already partitioned.
The partitioning is part of the table definition and is stored in the metadata. You can check that by executing show create table yourtablename;
If you just want to create the table over and over again in a loop, without the partitions and the data, I see three options (added one because of Cez):
have the table definitions hard coded in your script
create the table in the DB without the partitions. So you have one temp table already created and use that as your template to loop through.
run two separate commands from your script in the loop: a CREATE TABLE ... LIKE and then an ALTER TABLE to remove the partitions (see the sketch below).
You can choose whichever option best suits your environment.
You can review your options for creating a table in the MySQL documentation at dev.mysql.com.
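As a sketch of the second and third options together (run the two commands once to build an unpartitioned template, then clone it in the loop; the day-stamped table names are illustrative, not from the question):

-- one-time setup: an unpartitioned template cloned from the original
CREATE TABLE temp_utilization LIKE utilization;
ALTER TABLE temp_utilization REMOVE PARTITIONING;

-- inside the loop: one clone per daily dump target
CREATE TABLE utilization_20150618 LIKE temp_utilization;
CREATE TABLE utilization_20150619 LIKE temp_utilization;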
I'm building a video recommendation site (think Pandora for music videos) in Python and MySQL. I have three tables in my db:
video - a table of the videos. Data doesn't change. Columns are:
CREATE TABLE `video` (
id int(11) NOT NULL AUTO_INCREMENT,
website_id smallint(3) unsigned DEFAULT '0',
rating_global varchar(128) DEFAULT '0',
title varchar(256) DEFAULT NULL,
thumb_url text,
PRIMARY KEY (`id`),
KEY `websites` (`website_id`),
KEY `id` (`id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=49362 DEFAULT CHARSET=utf8
video_tag - a table of the tags (attributes) associated with each video. Doesn't change.
CREATE TABLE `video_tag` (
id int(7) NOT NULL AUTO_INCREMENT,
video_id mediumint(7) unsigned DEFAULT '0',
tag_id mediumint(7) unsigned DEFAULT '0',
PRIMARY KEY (`id`),
KEY `video_id` (`video_id`),
KEY `tag_id` (`tag_id`)
) ENGINE=InnoDB AUTO_INCREMENT=562456 DEFAULT CHARSET=utf8
user_rating - a table of good or bad ratings that the user has given each tag. Data always changing.
CREATE TABLE `user_rating` (
id int(11) NOT NULL AUTO_INCREMENT,
user_id smallint(3) unsigned DEFAULT '0',
tag_id int(5) unsigned DEFAULT '0',
tag_rating float(10,5) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `video` (`tag_id`),
KEY `user_id` (`user_id`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=447 DEFAULT CHARSET=utf8
Based on the user's preferences, I want to score each unwatched video and try to predict what they will like best. This has resulted in the following massive query, which takes about 2 seconds to complete for 50,000 videos:
SELECT video_tag.video_id,
(sum(user_rating.tag_rating) * video.rating_global) as score
FROM video_tag
JOIN user_rating ON user_rating.tag_id = video_tag.tag_id
JOIN video ON video.id = video_tag.video_id
WHERE user_rating.user_id = 1 AND video.website_id = 2
AND rating_global > 0 AND video_id NOT IN (1,2,3) GROUP BY video_id
ORDER BY score DESC LIMIT 20
I desperately need to make this more efficient, so I'm just looking for advice as to what the best direction is. Some ideas I've considered:
a) Rework my db table structure (not sure how)
b) Offload more of the grouping and aggregation into Python (haven't figured out a way to join three tables that is actually faster)
c) Store the non-changing tables in memory to try and speed computation time (earlier tinkering hasn't yielded any gains yet..)
How would you recommend making this more efficient?
Thank you!!
--
Per request in the comments, EXPLAIN SELECT.. shows:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE user_rating ref video,user_id user_id 3 const 88 Using where; Using temporary; Using filesort
1 SIMPLE video_tag ref video_id,tag_id tag_id 4 db.user_rating.tag_id 92 Using where
1 SIMPLE video eq_ref PRIMARY,websites,id PRIMARY 4 db.video_tag.video_id 1 Using where
Change the field type of rating_global to a numeric type (either float or integer); there is no need for it to be varchar. Personally I would change all rating fields to integer, as I see no need for them to be float.
Drop the KEY on id; the PRIMARY KEY is already indexed. Consider instead a composite index covering video.id, rating_global and website_id.
Watch the integer lengths of your references (e.g. video_id -> video.id); you may run out of numbers. These sizes should be the same.
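A sketch of those two schema changes (assuming integer ratings suit your data):

ALTER TABLE video
    MODIFY rating_global int(11) NOT NULL DEFAULT 0,  -- numeric instead of varchar(128)
    DROP KEY `id`;                                    -- redundant: id is already the PRIMARY KEY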
I suggest the following 2-step solution to replace your query:
CREATE TEMPORARY TABLE rating_stats ENGINE=MEMORY
SELECT vt.video_id, SUM(ur.tag_rating) AS tag_rating_sum
FROM user_rating ur
JOIN video_tag vt ON vt.tag_id = ur.tag_id AND ur.user_id = 1
GROUP BY vt.video_id ORDER BY NULL;

SELECT v.id, rs.tag_rating_sum * v.rating_global AS score
FROM video v
JOIN rating_stats rs ON rs.video_id = v.id
WHERE v.website_id = 2 AND v.rating_global > 0 AND v.id NOT IN (1,2,3)
ORDER BY score DESC LIMIT 20;
For the latter query to perform really fast, you could add website_id and rating_global to the PRIMARY KEY of the video table (perhaps website_id alone is enough).
You can also keep these statistics in a permanent table and precalculate them dynamically based on user login/action frequency. I am guessing you can show the cached data instead of live results; there shouldn't be much difference.
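A possible shape for that precalculated table (all names here are illustrative, not from the question):

CREATE TABLE user_video_score (
    user_id smallint unsigned NOT NULL,
    video_id int unsigned NOT NULL,
    tag_rating_sum float NOT NULL DEFAULT 0,
    updated_at timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, video_id)
) ENGINE=InnoDB;

Refresh it for a user on login (or on a schedule) with the same aggregation as the temporary-table step above; the per-request query then becomes a cheap indexed join.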