How do I create an auto-updating SQLite table? - python

I am using a Python script to create and maintain an SQLite database of the anime shows I watch, to help me keep better track of them and of the episodes I have and still need to get.
I use the script to create a table for each series, e.g. Bleach, Black Lagoon, ... and in each of these tables the following information is stored:
Series table:
Season_Num # Unique season number
I_Have_Season # yes or no to say I have a directory for that season
Season_Episodes # Amount of episodes according to the TVDB that are in that season
Episodes_I_Have # The number of episodes I have for that season
The same table is created for every series I have, with a row for each season in that series; that seems to work fine.
Now what I'm trying to do is create a summary table which takes the information from the tables for each series and creates just one table with all the information I need. It has the following columns:
Summary table:
Series # Unique Series name
Alt_Name # Alternate name (The series name in english)
Special_Eps # The amount of Special episodes (Season 0 in the series table)
Special_Eps_Me # The number of Special Episodes I have
Tot_Ses # The total count of the Seasons (excluding season 0)
Tot_Ses_Me # The total count of Seasons that have yes in I_Have_Season column
Tot_Episodes # Total Episodes excluding season 0 episodes
Tot_Eps_Me # Total Episodes I have excluding season 0 episodes
I think what I want to do can be done using triggers, but I am unsure how to implement them so that the summary table automatically updates if, for example, a new season is added to a series table or the values of a series table are changed.
UPDATE:
After some more thought and research, Fabian's idea of a view instead of a table sounds like it could be what I want, but if possible I would like to keep each series separate in its own table for updating, instead of having just one table with every series and every season mixed in.
UPDATE 2:
I have gone ahead and put in the triggers for INSERT, UPDATE and DELETE. I added them in the initial create loop of my script, using variables for the table names (roughly as in the sketch below), and the summary table appears to be updating fine (after fixing how some of its values were calculated). I will test it further and hopefully it will keep working. Now I just need to get my script to add and delete tables when I add a new series or delete one.
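For reference, the trigger-per-series loop described in this update might look roughly like the sketch below. This is an illustration only: the database file name, series list, summary table name, and trigger bodies are assumptions based on the tables described above, not the actual script.

import sqlite3

conn = sqlite3.connect("anime.db")  # database file name assumed
series_tables = ["Bleach", "Black_Lagoon"]  # one table per series, as described

for name in series_tables:
    # Table names cannot be bound as SQL parameters, so they are interpolated;
    # they come from a fixed list here, not from user input.
    conn.executescript(f"""
        CREATE TRIGGER IF NOT EXISTS {name}_ins AFTER INSERT ON {name}
        BEGIN
            UPDATE Summary SET
                Special_Eps    = (SELECT IFNULL(SUM(Season_Episodes), 0)
                                  FROM {name} WHERE Season_Num = 0),
                Special_Eps_Me = (SELECT IFNULL(SUM(Episodes_I_Have), 0)
                                  FROM {name} WHERE Season_Num = 0),
                Tot_Ses        = (SELECT COUNT(*) FROM {name} WHERE Season_Num > 0),
                Tot_Ses_Me     = (SELECT COUNT(*) FROM {name}
                                  WHERE Season_Num > 0 AND I_Have_Season = 'yes'),
                Tot_Episodes   = (SELECT IFNULL(SUM(Season_Episodes), 0)
                                  FROM {name} WHERE Season_Num > 0),
                Tot_Eps_Me     = (SELECT IFNULL(SUM(Episodes_I_Have), 0)
                                  FROM {name} WHERE Season_Num > 0)
            WHERE Series = '{name}';
        END;
    """)
    # The AFTER UPDATE and AFTER DELETE triggers differ only in the trigger
    # name and event, so the same template can be reused for them.
conn.commit()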

This could be achieved using triggers. But this kind of thing is usually better done declaratively, using a view:
For example,
create table series (
    series_name,
    alt_name,
    special_eps,
    special_eps_me,
    primary key (series_name)
);
create table seasons (
    series_name,
    season_num,
    i_have_season,
    episodes,
    episodes_i_have,
    primary key (series_name, season_num),
    foreign key (series_name) references series (series_name),
    check (i_have_season in ('F','T'))
);
create view everything_with_counts as
select series_name,
       alt_name,
       special_eps,
       special_eps_me,
       (select count(*) from seasons
        where seasons.series_name = series.series_name) as tot_ses,
       (select count(*) from seasons
        where seasons.series_name = series.series_name
          and i_have_season = 'T') as tot_ses_me,
       (select sum(episodes) from seasons
        where seasons.series_name = series.series_name) as tot_episodes,
       (select sum(episodes_i_have) from seasons
        where seasons.series_name = series.series_name
          and i_have_season = 'T') as tot_episodes_me
from series;
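Because the view is computed at query time, the summary can never go stale. Reading it from Python is then an ordinary SELECT; a minimal sketch, assuming the schema above lives in a file called anime.db:

import sqlite3

conn = sqlite3.connect("anime.db")  # database file name assumed
for row in conn.execute("SELECT * FROM everything_with_counts"):
    print(row)  # one freshly computed summary row per series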
EDIT
Since you want to stick to the trigger design: Assuming you have your series tables like this:
create table series_a (
    season_num,
    i_have_season,
    episodes,
    episodes_i_have
);
create table series_b (
    season_num,
    i_have_season,
    episodes,
    episodes_i_have
);
and so on, and your summary table like this:
create table summary (
    series_name,
    alt_name,
    special_eps,
    special_eps_me,
    tot_ses,
    tot_ses_me,
    tot_episodes,
    tot_episodes_me,
    primary key (series_name)
);
You have to create three triggers (insert, update, delete) for each series table, e.g.:
create trigger series_a_ins after insert on series_a
begin
    update summary set
        tot_ses = (select count(*) from series_a),
        tot_ses_me = (select count(*) from series_a where i_have_season = 'T'),
        tot_episodes = (select sum(episodes) from series_a),
        tot_episodes_me = (select sum(episodes_i_have) from series_a where i_have_season = 'T')
    where series_name = 'a';
end;
/* create trigger series_a_upd after update on series_a ... */
/* create trigger series_a_del after delete on series_a ... */
With this version, you have to add the summary entry for each series manually; afterwards, the counters are updated automatically whenever you modify the series_... tables.
You could also use INSERT OR REPLACE (see the documentation) to create summary entries on demand, for example:
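Here is a minimal sketch of seeding the summary row for series_a from Python (the database file name is an assumption); the triggers then keep its counters current:

import sqlite3

conn = sqlite3.connect("anime.db")  # database file name assumed
# Seed (or reset) the summary row; the counter columns start at zero and are
# maintained by the triggers from then on.
conn.execute(
    """INSERT OR REPLACE INTO summary
       (series_name, alt_name, special_eps, special_eps_me,
        tot_ses, tot_ses_me, tot_episodes, tot_episodes_me)
       VALUES (?, ?, 0, 0, 0, 0, 0, 0)""",
    ("a", "Series A"),
)
conn.commit()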

Related

How to join these 2 tables by date with ORM

I have two querysets:
from django.db.models import Case, Min, Value, When

A = Bids.objects.filter(*args, **kwargs).annotate(
    highest_priority=Case(*[
        When(data_source=data_source, then=Value(i))
        for i, data_source in enumerate(data_source_order_list)
    ])
).order_by("date", "highest_priority")
B = A.values("date").annotate(Min("highest_priority")).order_by("date")
The first query gives me all objects in the selected time range, with the proper data sources and values. Through highest_priority I set which item should be selected. All items carry additional data.
The second query gives me grouped-by-date information about the items on each date. In the second query I do not have important values like price etc., so I assume I have to join the two and filter where a.highest_priority = b.highest_priority, because then I would get a queryset of objects with only one item per date.
I have tried using distinct, but it does not work with .first()/.last(). Annotating gives me dicts from the group-by, and grouping by only date cuts out a lot of important data, but I have to group by date only...
How do I join them? When they are joined I can easily filter highest_priority against highest_priority and get my data with only one database hit. I want to use the ORM; I could just use distinct and put the results in a list, but I do not want to hammer the database with multiple queries per date.
See if this suggestion works:
SELECT *,
       (to_char(a.date, 'YYYYMMDD')::integer) * a.highest_priority AS prioritycalc
FROM a
JOIN b
  ON (to_char(a.date, 'YYYYMMDD')::integer) * a.highest_priority
   = (to_char(b.date, 'YYYYMMDD')::integer) * b.highest_priority
ORDER BY prioritycalc DESC;
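If you would rather stay entirely inside the ORM, a common alternative for this kind of one-row-per-group problem is a correlated Subquery rather than a join. A hedged sketch, assuming the Bids model and the queryset A from the question:

from django.db.models import OuterRef, Subquery

# For each date, find the pk of the row with the lowest highest_priority,
# then keep only those rows: one full Bids object per date, in one query.
best_pk_per_date = (
    A.filter(date=OuterRef("date"))
     .order_by("highest_priority")
     .values("pk")[:1]
)
result = A.filter(pk=Subquery(best_pk_per_date)).order_by("date")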

How do I nest these queries in one Replace Into query?

I have three queries and another table called output_table. This code works, but it needs to be executed as a single REPLACE INTO query. I know this involves nesting and subqueries, but I have no idea if it is possible, since my key is the set of DISTINCT target_currency values (the coins) from output_table.
How do I rewrite 2 and 3 so they execute inside query 1, i.e. in the REPLACE INTO query instead of the individual UPDATE ones?
1. conn3.cursor().execute(
"""REPLACE INTO coin_best_returns(coin) SELECT DISTINCT target_currency FROM output_table"""
)
2. conn3.cursor().execute(
"""UPDATE coin_best_returns SET
highest_price = (SELECT MAX(ask_price_usd) FROM output_table WHERE coin_best_returns.coin = output_table.target_currency),
lowest_price = (SELECT MIN(bid_price_usd) FROM output_table WHERE coin_best_returns.coin = output_table.target_currency)"""
)
3. conn3.cursor().execute(
"""UPDATE coin_best_returns SET
highest_market = (SELECT exchange FROM output_table WHERE coin_best_returns.highest_price = output_table.ask_price_usd),
lowest_market = (SELECT exchange FROM output_table WHERE coin_best_returns.lowest_price = output_table.bid_price_usd)"""
)
You can do it with the help of some window functions, a subquery, and an inner join. The version below is pretty lengthy, but it is less complicated than it may appear. It uses window functions in a subquery to compute the needed per-currency statistics, and factors this out into a common table expression to facilitate joining it to itself. (Note that window functions require SQLite 3.25 or later.)
Other than the inline comments, the main reason for the complication is original query number 3. Queries (1) and (2) could easily be combined as a single, simple, aggregate query, but the third query is not as easily addressed. To keep the exchange data associated with the corresponding ask and bid prices, this query uses window functions instead of aggregate queries. This also provides a vehicle different from DISTINCT for obtaining one result per currency.
Here's the bare query:
WITH output_stats AS (
    -- The ask and bid information for every row of output_table, every row
    -- augmented by the needed maximum ask and minimum bid statistics
    SELECT
        target_currency AS tc,
        ask_price_usd AS ask,
        bid_price_usd AS bid,
        exchange AS market,
        MAX(ask_price_usd) OVER (PARTITION BY target_currency) AS high,
        ROW_NUMBER() OVER (
            PARTITION BY target_currency, ask_price_usd ORDER BY exchange DESC)
            AS ask_rank,
        MIN(bid_price_usd) OVER (PARTITION BY target_currency) AS low,
        ROW_NUMBER() OVER (
            PARTITION BY target_currency, bid_price_usd ORDER BY exchange ASC)
            AS bid_rank
    FROM output_table
)
REPLACE INTO coin_best_returns(
    -- you must, of course, include all the columns you want to fill in the
    -- upsert column list
    coin,
    highest_price,
    lowest_price,
    highest_market,
    lowest_market)
SELECT
    -- ... and select a value for each column
    asks.tc,
    asks.ask,
    bids.bid,
    asks.market,
    bids.market
FROM output_stats asks
JOIN output_stats bids
    ON asks.tc = bids.tc
WHERE
    -- These conditions choose exactly one asks row and one bids row
    -- for each currency
    asks.ask = asks.high
    AND asks.ask_rank = 1
    AND bids.bid = bids.low
    AND bids.bid_rank = 1;
Note well that unlike the original query 3, this will consider only exchange values associated with the target currency for setting the highest_market and lowest_market columns in the destination table. I'm supposing that that's what you really want, but if not, then a different strategy will be needed.

Pagination for SQLite

Hey friends, I am working on an application similar to ServiceNow. I have requests coming in from users and have to work on them. I am using python-flask and SQLite for this.
I am new to flask and this is my first project. Please correct me if I am wrong.
result = cur.execute("SELECT * from orders")
orders = result.fetchmany(5)
I am trying to use orders = result.paginate(...), but it seems there's some problem.
Also, I am not sure how to display the db data across different pages.
I want the first 10 records on the 1st page, the next 10 on the 2nd page, and so on.
Could you please help me?
I've never used Flask, but assuming that you can issue a paginate/page throw, a query that introduces a value 0-9 would allow a conditional page throw.
For example, assuming an orders table that has three columns, orderdate, ordertype and orderdesc, and that the required ordering is by those columns (see notes), the following would introduce a column that runs from 0 to 9 and thus allow checking for a page throw:
SELECT *,
       (SELECT count(*)
        FROM orders
        WHERE orderdate||ordertype||orderdesc < o.orderdate||o.ordertype||o.orderdesc
       ) % 10 AS orderby
FROM orders AS o
ORDER BY orderdate||ordertype||orderdesc;
Note that the above relies upon the sort order and the WHERE clause producing the same result; a more complex WHERE clause may be needed. The above is intended as an in-principle example.
Example Usage
Consider the following example of the above in use. It generates 100 rows of orders with randomly generated orderdates and ordertypes within specific ranges, and then extracts the data according to the above query. The results of the underlying data and of the extracted data are shown in the results section.
/* Create test environment */
DROP TABLE IF EXISTS orders;
/* Generate and load some random orders */
CREATE TABLE IF NOT EXISTS orders (orderdate TEXT, ordertype TEXT, orderdesc TEXT);
WITH RECURSIVE cte1(od, ot, counter) AS (
    SELECT
        datetime('now', '+'||(abs(random()) % 10)||' days'),
        (abs(random()) % 26),
        1
    UNION ALL
    SELECT
        datetime('now', '+'||(abs(random()) % 10)||' days'),
        (abs(random()) % 26),
        counter + 1
    FROM cte1
    LIMIT 100
)
INSERT INTO orders SELECT * FROM cte1;
/* Display the resultant data */
SELECT rowid, * FROM orders;
/* Display data with generated page-throw indicator */
SELECT rowid, *,
       (SELECT count(*)
        FROM orders
        WHERE orderdate||ordertype||orderdesc < o.orderdate||o.ordertype||o.orderdesc
       ) % 10 AS orderby
FROM orders AS o
ORDER BY orderdate||ordertype||orderdesc;
/* Clean up */
DROP TABLE IF EXISTS orders;
Results (partial)
The core data is unsorted, so it appears in rowid order (rowid is included for comparison purposes). The extracted data carries the page-throw indicator column.
Obviously you would likely not throw a page for the first row.
As concatenation of the three columns has been used for convenience, the results may be a little confusing: e.g. 2 would appear to be greater than 11, and so on.
The rowid indicates the original position, and so demonstrates that the data has been sorted.
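As an aside, the paginate(...) from the question is a Flask-SQLAlchemy query method; it does not exist on plain sqlite3 cursors. With plain sqlite3, the usual alternative is LIMIT/OFFSET driven by a page parameter in the URL. A minimal sketch (the route, database file, and ordering column are assumptions):

import sqlite3
from flask import Flask, request

app = Flask(__name__)
PER_PAGE = 10

@app.route("/orders")
def list_orders():
    # /orders?page=2 -> rows 11-20; page falls back to 1 on bad/missing input
    page = max(request.args.get("page", 1, type=int), 1)
    conn = sqlite3.connect("app.db")  # database file name assumed
    rows = conn.execute(
        "SELECT * FROM orders ORDER BY rowid LIMIT ? OFFSET ?",
        (PER_PAGE, (page - 1) * PER_PAGE),
    ).fetchall()
    conn.close()
    return {"page": page, "orders": rows}  # render a template in practice

The ORDER BY is essential: without a stable sort order, LIMIT/OFFSET can return overlapping or missing rows from page to page.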

INSERT OR REPLACE adding a whole new row SQLite

I'm creating a little stock tracking project using Flask. I have generally used an ORM, but I was trying vanilla SQLite on this one.
I have three tables, users, stocks, and a join table, stocks_users, which has columns for user_id, symbol, and quantity.
db.execute(f"""
INSERT OR REPLACE INTO user_stocks(user_id, stock_id, quantity)
VALUES (
(SELECT 1 FROM users WHERE id = '{user_id}'),
(SELECT 1 FROM stocks WHERE symbol = '{symbol}'),
{shares}
)
""")
I am running this to insert a row if one does not exist; but if a row already has the matching user_id and symbol (the primary keys of the users and stocks tables respectively), I would just like to update the quantity.
Any direction on this would be much appreciated.
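Worth noting: since version 3.24, SQLite has an ON CONFLICT ("upsert") clause for exactly this; unlike INSERT OR REPLACE, it updates the existing row in place rather than deleting and re-inserting it. A hedged sketch, reusing db, user_id, symbol and shares from the question, and assuming user_stocks has a primary key or unique constraint on (user_id, stock_id); it also uses bound parameters instead of f-strings, which invite SQL injection:

db.execute(
    """
    INSERT INTO user_stocks (user_id, stock_id, quantity)
    VALUES (?, (SELECT id FROM stocks WHERE symbol = ?), ?)
    ON CONFLICT (user_id, stock_id)
    DO UPDATE SET quantity = excluded.quantity
    """,
    (user_id, symbol, shares),
)
# Use quantity = quantity + excluded.quantity instead if shares should accumulate.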

Selecting a date between two dates while also accounting for separate time field

I have a date and a time field in Postgresql. I am reading it in python and need to sort out things on certain days past certain times.
The steps would basically be like this:
Select * from x where date > monthdayyear
In that subset, select only those that are > time given for that date
AND date2 must be < monthdayyear2 AND time2 must be less than time2 given on that date
I know there are definitely some Python ways I could do this, by iterating through the results and so on. I'm wondering if there is a better way than brute-forcing it? I would rather not run multiple queries or have to sift through a lot of extra results in the fetchall() if possible.
If I've understood your design, this is really a schema design issue. Instead of:
CREATE TABLE sometable (
    date1 date,
    time1 time,
    date2 date,
    time2 time
);
you generally want:
CREATE TABLE sometable (
    timestamp1 timestamp with time zone,
    timestamp2 timestamp with time zone
);
if you want the timestamp converted automatically to UTC and back to the client's TimeZone, or timestamp without time zone if you want to store the raw timestamp without timezone conversion.
If an inclusive test is OK, you can write:
SELECT ...
FROM sometable
WHERE '2012-01-01 11:15 +0800' BETWEEN timestamp1 AND timestamp2;
If you cannot amend your schema, your best bet is something like this:
SELECT ...
FROM sometable
WHERE '2012-01-01 11:15 +0800' BETWEEN (date1 + time1) AND (date2 + time2);
This may have some unexpected quirks when it comes to clients in multiple time zones; you may land up needing to look at the AT TIME ZONE operator.
If you need an exclusive test on one side and/or the other, you can't use BETWEEN, since it is an a <= x <= b operator. Instead write:
SELECT ...
FROM sometable
WHERE '2012-01-01 11:15 +0800' > (date1 + time1)
AND '2012-01-01 11:15 +0800' < (date2 + time2);
Automating the schema change
Automating a schema change is possible.
You want to query INFORMATION_SCHEMA or pg_catalog.pg_class and pg_catalog.pg_attribute for tables that have pairs of date and time columns, then generate sets of ALTER TABLE commands to unify them.
Determining what a "pair" is is quite application specific; if you've used a consistent naming scheme it should be easy to do with LIKE or ~ operators and/or regexp_matches. You want to produce a set of (tablename, datecolumnname, timecolumnname) tuples.
Once you have that, you can for each (tablename, datecolumnname, timecolumnname) tuple produce the following ALTER TABLE statements, which must be run in a transaction to be safe, and should be tested before use on any data you care about, and where the entries in [brackets] are substitutions:
BEGIN;
ALTER TABLE [tablename] ADD COLUMN [timestampcolumnname] TIMESTAMP WITH TIME ZONE;
--
-- WARNING: This part can lose data; if one of the columns is null and the other one isn't
-- the result is null. You should've had a CHECK constraint preventing that, but probably
-- didn't. You might need to special case that; the `coalesce` and `nullif` functions and
-- the `CASE` clause might be useful if so.
--
UPDATE [tablename] SET [timestampcolumnname] = ([datecolumnname] + [timecolumnname]);
ALTER TABLE [tablename] DROP COLUMN [datecolumnname];
ALTER TABLE [tablename] DROP COLUMN [timecolumnname];
-- Finally, if the originals were NOT NULL:
ALTER TABLE [tablename] ALTER COLUMN [timestampcolumnname] SET NOT NULL;
then check the results and COMMIT if happy. Be aware that an exclusive lock is taken on the table from the first ALTER so nothing else can use the table until you COMMIT or ROLLBACK.
If you're on a vaguely modern PostgreSQL you can generate the SQL with the format function; on older versions you can use string concatenation (||) and the quote_literal function. Example:
Given the sample data:
CREATE TABLE sometable(date1 date not null, time1 time not null, date2 date not null, time2 time not null);
INSERT INTO sometable(date1,time1,date2,time2) VALUES
('2012-01-01','11:15','2012-02-03','04:00');
CREATE TABLE othertable(somedate date, sometime time);
INSERT INTO othertable(somedate, sometime) VALUES
(NULL, NULL),
(NULL, '11:15'),
('2012-03-08',NULL),
('2014-09-18','23:12');
Here's a query that generates the ALTER TABLE statements. Note that it relies on the naming convention that matching column pairs always have a common name once any date or time word is removed from the column name. You could instead use adjacency, by testing for c1.attnum + 1 = c2.attnum.
BEGIN;
WITH
-- Create a set of each date/time column along with its table name, oids, and not-null flag
cols AS (
    SELECT attrelid, relname, attname, typname, atttypid, attnotnull
    FROM pg_attribute
    INNER JOIN pg_class ON pg_attribute.attrelid = pg_class.oid
    INNER JOIN pg_type ON pg_attribute.atttypid = pg_type.oid
    WHERE NOT attisdropped AND atttypid IN ('date'::regtype, 'time'::regtype)
),
-- Self-join the time and date column set, filtering the left side for only dates and
-- the right side for only times, producing two distinct sets. Then filter for entries
-- where the names are the same after replacing any appearance of the word `date` or
-- `time`.
tableinfo (tablename, datecolumnname, timecolumnname, nonnull, hastimezone) AS (
    SELECT
        c1.relname, c1.attname, c2.attname,
        c1.attnotnull AND c2.attnotnull AS nonnull,
        't'::boolean AS withtimezone
    FROM cols c1
    INNER JOIN cols c2 ON (
        c1.atttypid = 'date'::regtype
        AND c2.atttypid = 'time'::regtype
        AND c1.attrelid = c2.attrelid
        -- Match column pairs; I used name matching, you might use adjacency:
        AND replace(c1.attname,'date','') = replace(c2.attname,'time','')
    )
)
-- Finally, format the results into a series of ALTER TABLE statements.
SELECT format($$
ALTER TABLE %1$I ADD COLUMN %4$I TIMESTAMP %5$s;
UPDATE %1$I SET %4$I = (%2$I + %3$I);
ALTER TABLE %1$I DROP COLUMN %2$I;
ALTER TABLE %1$I DROP COLUMN %3$I;
$$ ||
    -- Append a clause to make the column NOT NULL now that it's populated, only
    -- if the original date or time were NOT NULL:
    CASE
        WHEN nonnull
        THEN ' ALTER TABLE %1$I ALTER COLUMN %4$I SET NOT NULL;'
        ELSE ''
    END,
    -- Now the format arguments
    tablename,                            -- 1
    datecolumnname,                       -- 2
    timecolumnname,                       -- 3
    -- You'd use a better column name generator than this simple example:
    datecolumnname||'_'||timecolumnname,  -- 4
    CASE
        WHEN hastimezone THEN 'WITH TIME ZONE'
        ELSE 'WITHOUT TIME ZONE'
    END                                   -- 5
)
FROM tableinfo;
You can read the results and send them as SQL commands in a second session, or if you want to get fancy you can write a fairly simple PL/PgSQL function that LOOPs over the results and EXECUTEs each one. The query produces output like:
ALTER TABLE sometable ADD COLUMN date1_time1 TIMESTAMP WITH TIME ZONE;
UPDATE sometable SET date1_time1 = (date1 + time1);
ALTER TABLE sometable DROP COLUMN date1;
ALTER TABLE sometable DROP COLUMN time1;
ALTER TABLE sometable ALTER COLUMN date1_time1 SET NOT NULL;
ALTER TABLE sometable ADD COLUMN date2_time2 TIMESTAMP WITH TIME ZONE;
UPDATE sometable SET date2_time2 = (date2 + time2);
ALTER TABLE sometable DROP COLUMN date2;
ALTER TABLE sometable DROP COLUMN time2;
ALTER TABLE sometable ALTER COLUMN date2_time2 SET NOT NULL;
ALTER TABLE othertable ADD COLUMN somedate_sometime TIMESTAMP WITHOUT TIME ZONE;
UPDATE othertable SET somedate_sometime = (somedate + sometime);
ALTER TABLE othertable DROP COLUMN somedate;
ALTER TABLE othertable DROP COLUMN sometime;
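Driving that from Python rather than a PL/PgSQL function might look like the following sketch; the connection string is an assumption, and GENERATOR_SQL stands for the full text of the generator query above:

import psycopg2

GENERATOR_SQL = """ ... """  # paste the WITH cols AS (...) SELECT format(...) query here

conn = psycopg2.connect("dbname=mydb")  # connection details assumed
with conn:  # one transaction: commits on success, rolls back on any error
    with conn.cursor() as cur:
        cur.execute(GENERATOR_SQL)
        statements = [row[0] for row in cur.fetchall()]
        for ddl in statements:
            cur.execute(ddl)  # each row holds one table's ALTER/UPDATE batch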
I don't know if there's any useful way to work out on a per-column basis whether you want WITH TIME ZONE or WITHOUT TIME ZONE. It's likely you'll land up just doing it hardcoded, in which case you can just remove that column. I put it in there in case there's a good way to figure it out in your application.
If you have cases where the time can be null but the date non-null or vice versa, you will need to wrap the date and time in an expression that decide what result to return when null. The nullif and coalesce functions are useful for this, as is CASE. Remember that adding a null and a non-null value produces a null result so you may not need to do anything special.
If you use schemas you may need to further refine the query to use %I substitution of schema name prefixes to disambiguate. If you don't use schemas (if you don't know what one is, you don't) then this doesn't matter.
Consider adding CHECK constraints enforcing that the first timestamp is less than or equal to the second where that makes sense in your application, once you've done this; a sketch follows below. Also look at exclusion constraints in the documentation.
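For instance, a hedged sketch of adding such a constraint to the migrated sometable from the earlier examples (connection details and column names assumed):

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # connection details assumed
with conn, conn.cursor() as cur:
    # Enforce that each range runs forwards once the columns are unified.
    cur.execute("""
        ALTER TABLE sometable
        ADD CONSTRAINT sometable_range_order_chk
        CHECK (date1_time1 <= date2_time2)
    """)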
