MYSQL Datetime, remove seconds - python

I've been attempting to do this for quite some time.
I have a program which periodically writes rows to a table.
(table1)
ID Date Text Number
The Date column is in format yyyy-mm-dd hh:mm:ss ("2013-08-03 06:26:27")
The script which reads the data matches it to another set of data with the date in the same format except that the seconds are exactly 0.
"2013-08-03 06:26:00"
I need to change the Date column in table1 so that the seconds part is exactly zero. Currently the seconds are arbitrary values.
I have changed the script so that it now writes rows to the MySQL table with the seconds set to 0. However, I have a lot of existing data that I cannot lose, and its seconds are not 0.

This is just a matter of updating the corresponding column.
Depending on ... hum ... your mood (?) you might try:
update tbl set datetime_column = substr(datetime_column, 1, 16);
Or
update tbl set datetime_column = date_format(datetime_column, '%Y-%m-%d %H:%i:00');
Or
update tbl set datetime_column = datetime_column - interval second(datetime_column) second;
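If you would rather apply the same truncation in the Python script that writes the rows, a minimal sketch could look like this. It assumes the value is available as a datetime object; the helper name drop_seconds is made up for illustration.

```python
import datetime as dt


def drop_seconds(value):
    """Return the same datetime with seconds and microseconds set to zero."""
    return value.replace(second=0, microsecond=0)


stamp = dt.datetime(2013, 8, 3, 6, 26, 27)
# Format the truncated value the way the Date column stores it
print(drop_seconds(stamp).strftime("%Y-%m-%d %H:%M:%S"))  # 2013-08-03 06:26:00
```

This only covers newly written rows; the existing rows still need one of the UPDATE statements.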

MySQL? try this:
SELECT DATE_SUB(datetime_column,
INTERVAL EXTRACT(SECOND_MICROSECOND FROM datetime_column)
SECOND_MICROSECOND) as no_second_datetime
FROM table_name;

Related

Python/SQLite: Error inserting datetime.time variable into column of type Time

I am having trouble passing a datetime.time variable into a SQLite database, I have some very basic code here to show what exactly the variable is.
import datetime as dt
time = dt.datetime.now().time()
time = time.strftime('%H:%M')
time = dt.datetime.strptime(time, '%H:%M').time()
print(time)
print(type(time))
time = dt.datetime.now().time() gets the current time in type datetime.time.
Output:
17:34:48.286215
<class 'datetime.time'>
time = time.strftime('%H:%M') is then retrieving just the hour and minute but is of type str
Output:
17:35
<class 'str'>
I then convert it back to a datetime.time with time = dt.datetime.strptime(time, '%H:%M').time() which gives the output:
17:32:00
<class 'datetime.time'>
The column of type Time accepts the format HH:MM, as shown in the documentation (SQLite3 DateTime Documentation), so I am not sure why I am getting this error:
sqlite3.InterfaceError: Error binding parameter 11 - probably unsupported type.
From this INSERT statement:
cursor.execute("INSERT INTO booked_tickets VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", (booking_ref, ticket_date, film, showing, ticket_type, num_tickets, cus_name, cus_phone, cus_email, ticket_price, booking_date, booking_time, ))
EDIT: As requested, here is a snippet of code to recreate the table with the broken columns:
import datetime as dt
import sqlite3
connection = sqlite3.connect("your_database.db")
cursor = connection.cursor()
# Get the current time
time = dt.datetime.now().time()
# Format the time as a string using the '%H:%M' format
time_str = time.strftime('%H:%M')
# Parse the string back to a time object using the '%H:%M' format
time = dt.datetime.strptime(time_str, '%H:%M').time()
# Create the table
cursor.execute("CREATE TABLE test (example_time Time)")
# Insert the time into the example_time column
cursor.execute("INSERT INTO test VALUES (?)", (time, ))
connection.commit()
connection.close()
There is no Date or Time data type in SQLite.
The documentation from the link that you have in your question clearly states that in SQLite you can store datetime in 3 ways: text in ISO-8601 format, integer unix epochs and float julian days.
If you choose the first way, you should pass strings:
booking_date = dt.datetime.now().date().strftime('%Y-%m-%d')
booking_time = dt.datetime.now().time().strftime('%H:%M:00')
sql = "INSERT INTO booked_tickets VALUES (?,?,?,?,?,?,?,?,?,?,?,?)"
cursor.execute(sql, (booking_ref, ticket_date, film, showing, ticket_type, num_tickets, cus_name, cus_phone, cus_email, ticket_price, booking_date, booking_time))
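As a rough, self-contained check of the string approach, the sketch below uses an in-memory database and a one-column table named test (as in the question's snippet, but declared as TEXT). The fixed timestamp is only there to make the result reproducible.

```python
import datetime as dt
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE test (example_time TEXT)")

# Fixed example value instead of datetime.now(), so the result is predictable
now = dt.datetime(2023, 1, 5, 17, 35, 48)
time_str = now.time().strftime("%H:%M:00")

# A plain str binds without any "unsupported type" error
cursor.execute("INSERT INTO test VALUES (?)", (time_str,))
connection.commit()

stored = cursor.execute("SELECT example_time FROM test").fetchone()[0]
# Convert the stored text back to a datetime.time when reading
restored = dt.time.fromisoformat(stored)
print(stored, type(restored))
connection.close()
```

Reading the value back with time.fromisoformat() is one option; keeping it as text and letting SQLite's date and time functions work on it is another.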
But, you could also let SQLite get the current date and/or time.
Assuming that in the columns booking_date and booking_time you want the current date and time, you can define these columns as:
booking_date TEXT NOT NULL DEFAULT CURRENT_DATE,
booking_time TEXT NOT NULL DEFAULT CURRENT_TIME
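and then you don't need to pass anything for them in the INSERT statement. Because the table still has twelve columns, name the ten columns you do supply (the column names below are assumed to match the variable names):
sql = "INSERT INTO booked_tickets (booking_ref, ticket_date, film, showing, ticket_type, num_tickets, cus_name, cus_phone, cus_email, ticket_price) VALUES (?,?,?,?,?,?,?,?,?,?)"
cursor.execute(sql, (booking_ref, ticket_date, film, showing, ticket_type, num_tickets, cus_name, cus_phone, cus_email, ticket_price,))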
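To see the defaults in action, here is a small sketch against an in-memory database. The table booked_tickets_demo and its three columns are a cut-down stand-in invented for this example, not the real booked_tickets schema.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Hypothetical reduced table: one regular column plus the two defaulted ones
cursor.execute(
    """
    CREATE TABLE booked_tickets_demo (
        booking_ref TEXT,
        booking_date TEXT NOT NULL DEFAULT CURRENT_DATE,
        booking_time TEXT NOT NULL DEFAULT CURRENT_TIME
    )
    """
)

# Only booking_ref is supplied; SQLite fills in the date and time itself
cursor.execute(
    "INSERT INTO booked_tickets_demo (booking_ref) VALUES (?)", ("REF001",)
)
connection.commit()

booking_date, booking_time = cursor.execute(
    "SELECT booking_date, booking_time FROM booked_tickets_demo"
).fetchone()
print(booking_date, booking_time)
connection.close()
```

Keep in mind that CURRENT_DATE and CURRENT_TIME are in UTC and that CURRENT_TIME includes seconds, so this fits only if neither matters for your bookings.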
Check out the SQLite datatypes documentation:
2.2. Date and Time Datatype
SQLite does not have a storage class set aside for storing dates
and/or times. Instead, the built-in Date And Time Functions of SQLite
are capable of storing dates and times as TEXT, REAL, or INTEGER
values:
TEXT as ISO8601 strings ("YYYY-MM-DD HH:MM:SS.SSS").
REAL as Julian day numbers, the number of days since noon in Greenwich on November 24, 4714 B.C. according to the proleptic
Gregorian calendar.
INTEGER as Unix Time, the number of seconds since 1970-01-01 00:00:00 UTC.
Applications can choose to store dates and times in any of these
formats and freely convert between formats using the built-in date and
time functions.
Store the dates as TEXT datatypes.
The documentation you refer to mostly discusses how to format column values that represent dates and times. That is, it discusses what you can do with dates and times that already exist in your database.
It does, however, give just enough information to help you here I think. It says:
Date and time values can be stored as
text in a subset of the ISO-8601 format,
numbers representing the Julian day, or
numbers representing the number of seconds since (or before) 1970-01-01 00:00:00 UTC (the unix timestamp).
So you want to define and supply your dates and times as either full ISO-8601 date strings or as numbers. When defining a table, you indicate which of these formats you wish to use by declaring the column type as TEXT, REAL or INTEGER respectively.
Here's some documentation that discusses how to store dates and times in one of these formats: https://www.sqlitetutorial.net/sqlite-date/

Delete datetime from SQL database based on hour

I'm a python dev, I'm handling an SQL database through sqlite3 and I need to perform a certain SQL query to delete data.
I have tables which contain datetime objects as keys.
I want to keep only one row per hour (the last record for that specific time) and delete the rest.
I also need this to only happen on data older than 1 week.
Here's my attempt:
import sqlite3
c= db.cursor()
c.execute('''DELETE FROM TICKER_AAPL WHERE time < 2022-07-11 AND time NOT IN
( SELECT * FROM
(SELECT min(time) FROM TICKER_AAPL GROUP BY hour(time)) AS temp_tab);''')
Here's a screenshot of the table itself:
First change the format of your dates from yyyyMMdd ... to yyyy-MM-dd ..., because this is the only valid text date format for SQLite.
Then use the function strftime() in your query to get the hour of each value in the column time:
DELETE FROM TICKER_AAPL
WHERE time < date(CURRENT_DATE, '-7 day')
AND time NOT IN (SELECT MAX(time) FROM TICKER_AAPL GROUP BY strftime('%Y-%m-%d %H', time));
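A quick way to try the statement from Python is sketched below with an in-memory database and made-up rows. To keep the result reproducible, the cutoff is passed as a parameter ('2022-07-11', the date from the question) instead of being computed from CURRENT_DATE; the price column is also just an assumption.

```python
import sqlite3

db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("CREATE TABLE TICKER_AAPL (time TEXT PRIMARY KEY, price REAL)")

# Made-up sample rows: two hours on an old day, one hour on a recent day
rows = [
    ("2022-07-01 09:05:00", 1.0),
    ("2022-07-01 09:40:00", 2.0),
    ("2022-07-01 10:15:00", 3.0),
    ("2022-07-15 09:05:00", 4.0),
    ("2022-07-15 09:40:00", 5.0),
]
c.executemany("INSERT INTO TICKER_AAPL VALUES (?, ?)", rows)

cutoff = "2022-07-11"  # stands in for date(CURRENT_DATE, '-7 day')
c.execute(
    """
    DELETE FROM TICKER_AAPL
    WHERE time < ?
      AND time NOT IN (
          SELECT MAX(time) FROM TICKER_AAPL
          GROUP BY strftime('%Y-%m-%d %H', time)
      )
    """,
    (cutoff,),
)
db.commit()

remaining = [r[0] for r in c.execute("SELECT time FROM TICKER_AAPL ORDER BY time")]
print(remaining)
db.close()
```

Only the 09:05 row from 2022-07-01 is deleted: it is older than the cutoff and is not the last record of its hour. The rows from 2022-07-15 are newer than the cutoff and stay untouched.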

How to Loop Through Dates in Python to pass into PostgreSQL query

I have 2 date variables which I pass into a SQL query via Python. It looks something like this:
start = '2019-10-01'
finish = '2019-12-22'
code_block = '''select sum(revenue) from table
where date between '{start}' and '{finish}'
'''.format(start = start, finish = finish)
That gets me the data I want for the current quarter, however I want to be able to loop through this same query for the previous 5 quarters. Can someone help me figure out a way so that this runs for the current quarter, then updates both start and finish to previous quarter, runs the query, and then keeps going until 5 quarters ago?
Consider adding a year and quarter grouping to the aggregate SQL query and avoiding the Python loop altogether. Use a date difference of 15 months (i.e., 5 quarters), and even NOW() for the end date. Also, use parameterization (supported in pandas) rather than string formatting for dynamic querying.
code_block = '''select concat(date_part('year', date)::text,
'Q', date_part('quarter', date)::text) as yyyyqq,
sum(revenue) as sum_revenue
from table
where date between (%s::date - INTERVAL '15 MONTHS') and NOW()
group by date_part('year', date),
date_part('quarter', date)
'''
df = pd.read_sql(code_block, myconn, params=[start])
If you still need separate quarterly data frames use groupby to build a dictionary of data frames for the 5 quarters.
# DICTIONARY OF QUARTERLY DATA FRAMES
df_dict = {i:g for i,g in df.groupby('yyyyqq')}
df_dict['2019Q4'].head()
df_dict['2019Q3'].tail()
df_dict['2019Q2'].describe()
...
Just define a list with start dates and a list with finish dates and loop through them with:
for date_start, date_finish in zip(start_list, finish_list):
    start = date_start
    finish = date_finish
    # here you insert the query
Hope this is what you are looking for =)
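If you don't want to type those lists by hand, the quarter boundaries can be generated. The following is only a sketch with the standard library; the helper names quarter_bounds and last_quarters are invented here, and it uses full calendar quarters (so the current quarter ends on 2019-12-31 rather than on the question's 2019-12-22).

```python
import datetime as dt


def quarter_bounds(year, quarter):
    """Return (first_day, last_day) of the given calendar quarter."""
    first_month = 3 * (quarter - 1) + 1
    start = dt.date(year, first_month, 1)
    if quarter == 4:
        next_start = dt.date(year + 1, 1, 1)
    else:
        next_start = dt.date(year, first_month + 3, 1)
    # The last day is the day before the next quarter starts
    return start, next_start - dt.timedelta(days=1)


def last_quarters(year, quarter, count):
    """Return bounds for `count` quarters, newest first, starting at (year, quarter)."""
    result = []
    for _ in range(count):
        result.append(quarter_bounds(year, quarter))
        quarter -= 1
        if quarter == 0:
            year, quarter = year - 1, 4
    return result


# Current quarter (2019 Q4) plus the previous 5 quarters
ranges = last_quarters(2019, 4, 6)
for start, finish in ranges:
    # here you would run the query with start.isoformat() and finish.isoformat()
    print(start.isoformat(), finish.isoformat())
```

Each pair can then be passed to the query as parameters instead of being formatted into the SQL string.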

How to add date and HHMM time together into month/date/year hh:mm format and index

I'm working with a CSV file of flight records. My overall goal is to make plots of flight delays over a few selected days. I am trying to index these flights by the day and the scheduled departure time. I have a flight date in a month/day/year format and a departure time formatted as hhmm. Is there a way to reformat that departure time column to an hh:mm format in 24-hour time? Would I then simply add the columns together and index by them?
I've tried adding the columns together without reformatting the time and I'm not sure matplotlib recognizes this time format for my plots.
data = pd.read_csv("groundhog_query.csv",parse_dates=[['Flight_Date', 'Scheduled_Dep_Time']])
data.index = data['Flight_Date_Scheduled_Dep_Time']
data
The CSV file looks like this:
'''
Year,Flight_Date,Day_Of_Year,Unique_Carrier_ID,Airline_ID,Tail_Number,Flight_Number,Origin_Airport_ID,Origin_Market_ID,Origin_Airport_Code,Origin_State,Destination_Airport_ID,Destination_Market_ID,Destination_Airport_Code,Dest_State,Scheduled_Dep_Time,Actual_Dep_Time,Dep_Delay,Pos_Dep_Delay,Scheduled_Arr_Time,Actual_Arr_Time,Arr_Delay,Pos_Arr_Delay,Combined_Arr_Delay,Can_Status,Can_Reason,Div_Status,Scheduled_Elapsed_Time,Actual_Elapsed_Time,Carrier_Delay,Weather_Delay,Natl_Airspace_System_Delay,Security_Delay,Late_Aircraft_Delay,Div_Airport_Landings,Div_Landing_Status,Div_Elapsed_Time,Div_Arrival_Delay,Div_Airport_1_ID,Div_1_Tail_Num,Div_Airport_2_ID,Div_2_Tail_Num,Div_Airport_3_ID,Div_3_Tail_Num,Div_Airport_4_ID,Div_4_Tail_Num,Div_Airport_5_ID,Div_5_Tail_Num
2011,2011-01-24,24,MQ,20398,N717MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,1622.0,-8.0,0.0,1735,1722.0,-13.0,0.0,-13.0,0,,0,65,60.0,,,,,,0,,,,,,,,,,,,,
2011,2011-01-25,25,MQ,20398,N736MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,1624.0,-6.0,0.0,1735,1724.0,-11.0,0.0,-11.0,0,,0,65,60.0,,,,,,0,,,,,,,,,,,,,
2011,2011-01-26,26,MQ,20398,N737MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,,,,1735,,,,,1,B,0,65,,,,,,,0,,,,,,,,,,,,,
2011,2011-01-27,27,MQ,20398,N721MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,1832.0,122.0,122.0,1735,1936.0,121.0,121.0,121.0,0,,0,65,64.0,121.0,0.0,0.0,0.
'''
My current results are in a month/day/year hhmm format.
Use the following steps:
1. Read CSV without parsing dates.
2. Merge the 'Flight_Date' and 'Scheduled_Dep_Time' columns. Make sure that 'Scheduled_Dep_Time' is converted to a string first (hence .astype(str)), since it is parsed as int by default, and zero-pad it to four digits (.str.zfill(4)) so that a time such as 530 becomes 0530.
3. Convert the string to datetime using the matching format ('%Y-%m-%d %H%M'; the times in the file have no colon)
4. Set this newly produced column as index
d = pd.read_csv("groundhog_query.csv")
d['Flight_Date_Scheduled_Dep_Time_string'] = d.Flight_Date.str.cat(' ' + d.Scheduled_Dep_Time.astype(str).str.zfill(4))
d['Flight_Date_Scheduled_Dep_Time'] = pd.to_datetime(d.Flight_Date_Scheduled_Dep_Time_string, format='%Y-%m-%d %H%M')
d = d.set_index('Flight_Date_Scheduled_Dep_Time')
The reference for % directives is here:
https://docs.python.org/3.7/library/datetime.html#strftime-and-strptime-behavior
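To check the directives on a single value without pandas, a small standard-library sketch is shown below. The helper name combine is made up; it pads the hhmm value to four digits first, because an unpadded value such as 130 could otherwise be read as hour 13, minute 0.

```python
import datetime as dt


def combine(flight_date, dep_time):
    """Combine a 'YYYY-MM-DD' date string and an hhmm value into a datetime."""
    hhmm = str(dep_time).zfill(4)  # 530 -> '0530'
    return dt.datetime.strptime(f"{flight_date} {hhmm}", "%Y-%m-%d %H%M")


print(combine("2011-01-24", 1630))
print(combine("2011-01-24", 530))
```

The resulting datetime values are what matplotlib expects on a time axis, so plotting against an index built this way should work without further reformatting.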

Python Count number of records within a given date range

We have a backend table that stores transaction details, including a seconds-since-epoch timestamp. I am creating a UI that collects from and to dates and displays the number of transactions that occurred between them.
Assuming that the date range is 07/01/2012 - 07/30/2012, I cannot work out the logic to increment a counter for the records that fall within the period. I should hit the DB only once, as querying for each day would perform poorly.
I am stuck at a logic:
Convert 07/01/2012 & 07/30/2012 to seconds since epoch.
Get the records for start date - end date [as converted to seconds since epoch]
For each record get the month / date
-- now how will we add counters for each date in between 07/01/2012 - 07/30/2012
MySQL has the function FROM_UNIXTIME which will convert your seconds since epoch into datetime and you can then extract the DATE part of it (YYYY-MM-DD format) and group according to it.
SELECT DATE(FROM_UNIXTIME(timestamp_column)), COUNT(*)
FROM table_name
GROUP BY DATE(FROM_UNIXTIME(timestamp_column))
This will return something like
2012-07-01 2
2012-07-03 4
…
(no entries for days without transactions)
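If the UI needs a row for every day in the range, including days with zero transactions, the gaps can be filled on the Python side after a single query. The sketch below is one possible way; the function name daily_counts and the sample timestamps are invented, and it groups by UTC day, which may differ from what FROM_UNIXTIME returns if the MySQL session uses another time zone.

```python
import datetime as dt
from collections import Counter


def daily_counts(epoch_seconds, start, end):
    """Count records per UTC calendar day, including days with no records."""
    seen = Counter(
        dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc).date()
        for ts in epoch_seconds
    )
    total_days = (end - start).days + 1
    days = (start + dt.timedelta(days=offset) for offset in range(total_days))
    return {day: seen.get(day, 0) for day in days}


def to_epoch(year, month, day, hour):
    """Build a sample seconds-since-epoch value for a UTC date and hour."""
    moment = dt.datetime(year, month, day, hour, tzinfo=dt.timezone.utc)
    return int(moment.timestamp())


# Sample data: two records on July 1 and one on July 3
records = [to_epoch(2012, 7, 1, 9), to_epoch(2012, 7, 1, 17), to_epoch(2012, 7, 3, 12)]
counts = daily_counts(records, dt.date(2012, 7, 1), dt.date(2012, 7, 30))
print(counts[dt.date(2012, 7, 1)], counts[dt.date(2012, 7, 2)], counts[dt.date(2012, 7, 3)])
```

The same function also works on the (date, count) rows of the GROUP BY query if you seed the Counter from them instead of from raw timestamps.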
