I need to calculate a deadline (datetime) by adding N (int) intervals to a start date. The interval is a relativedelta, because it can be expressed in months or years as well as in days or seconds. The simple way is to multiply the interval by N and add the result to start_date (datetime). However, I also need to compute the deadlines step by step (the 5th deadline, the 6th, and so on), so there I just add the interval to start_date N times.
In some cases, these two methods give different results.
Assume start_date = datetime(year=2019, month=1, day=1), interval = relativedelta(months=1, days=2), and N = 16.
From one point of view, both methods are correct. With multiplication, interval*16 = relativedelta(years=+1, months=+4, days=+32), so start_date + 16*interval = 2019-01-01 + 1 year + 4 months + 32 days = 2020-05-01 + 32 days = 2020-06-02 (because May has 31 days).
At the same time, adding the intervals one by one gives 2020-05-01 for the 15th deadline, and then 2020-05-01 + 1 month + 2 days = 2020-06-03.
The problem is related to "month-days overflow", but I can't figure out how to handle it. Should I always use the sum instead of the multiplication? That isn't calc-safe (imagine the 9999999th deadline with an interval of 1 day and 1 second).
Steps to reproduce:
from datetime import datetime
from dateutil.relativedelta import relativedelta

def test_relative_sum_mult_with_date():
    start = datetime(year=2019, month=1, day=1)
    interval = relativedelta(months=1, days=2)
    check_up_to = 100
    for i in range(check_up_to):
        multiplied = start + i*interval
        summed = start
        for j in range(i):
            summed += interval
        print('i=%s, i*interval=%s, diff(multiplied-summed)=%s, multiplied=%s, summed=%s' %
              (i, i*interval, multiplied-summed, multiplied, summed))
        assert multiplied == summed
Trace:
i=0, i*interval=relativedelta(), diff(multiplied-summed)=0:00:00, multiplied=2019-01-01 00:00:00, summed=2019-01-01 00:00:00
i=1, i*interval=relativedelta(months=+1, days=+2), diff(multiplied-summed)=0:00:00, multiplied=2019-02-03 00:00:00, summed=2019-02-03 00:00:00
i=2, i*interval=relativedelta(months=+2, days=+4), diff(multiplied-summed)=0:00:00, multiplied=2019-03-05 00:00:00, summed=2019-03-05 00:00:00
i=3, i*interval=relativedelta(months=+3, days=+6), diff(multiplied-summed)=0:00:00, multiplied=2019-04-07 00:00:00, summed=2019-04-07 00:00:00
i=4, i*interval=relativedelta(months=+4, days=+8), diff(multiplied-summed)=0:00:00, multiplied=2019-05-09 00:00:00, summed=2019-05-09 00:00:00
i=5, i*interval=relativedelta(months=+5, days=+10), diff(multiplied-summed)=0:00:00, multiplied=2019-06-11 00:00:00, summed=2019-06-11 00:00:00
i=6, i*interval=relativedelta(months=+6, days=+12), diff(multiplied-summed)=0:00:00, multiplied=2019-07-13 00:00:00, summed=2019-07-13 00:00:00
i=7, i*interval=relativedelta(months=+7, days=+14), diff(multiplied-summed)=0:00:00, multiplied=2019-08-15 00:00:00, summed=2019-08-15 00:00:00
i=8, i*interval=relativedelta(months=+8, days=+16), diff(multiplied-summed)=0:00:00, multiplied=2019-09-17 00:00:00, summed=2019-09-17 00:00:00
i=9, i*interval=relativedelta(months=+9, days=+18), diff(multiplied-summed)=0:00:00, multiplied=2019-10-19 00:00:00, summed=2019-10-19 00:00:00
i=10, i*interval=relativedelta(months=+10, days=+20), diff(multiplied-summed)=0:00:00, multiplied=2019-11-21 00:00:00, summed=2019-11-21 00:00:00
i=11, i*interval=relativedelta(months=+11, days=+22), diff(multiplied-summed)=0:00:00, multiplied=2019-12-23 00:00:00, summed=2019-12-23 00:00:00
i=12, i*interval=relativedelta(years=+1, days=+24), diff(multiplied-summed)=0:00:00, multiplied=2020-01-25 00:00:00, summed=2020-01-25 00:00:00
i=13, i*interval=relativedelta(years=+1, months=+1, days=+26), diff(multiplied-summed)=0:00:00, multiplied=2020-02-27 00:00:00, summed=2020-02-27 00:00:00
i=14, i*interval=relativedelta(years=+1, months=+2, days=+28), diff(multiplied-summed)=0:00:00, multiplied=2020-03-29 00:00:00, summed=2020-03-29 00:00:00
i=15, i*interval=relativedelta(years=+1, months=+3, days=+30), diff(multiplied-summed)=0:00:00, multiplied=2020-05-01 00:00:00, summed=2020-05-01 00:00:00
i=16, i*interval=relativedelta(years=+1, months=+4, days=+32), diff(multiplied-summed)=-1 day, 0:00:00, multiplied=2020-06-02 00:00:00, summed=2020-06-03 00:00:00
datetime.datetime(2020, 6, 2, 0, 0, 0) != datetime.datetime(2020, 6, 3, 0, 0, 0)
Expected :datetime.datetime(2020, 6, 3, 0, 0, 0)
Actual :datetime.datetime(2020, 6, 2, 0, 0, 0)
Versions:
Python 3.6
python-dateutil==2.8.0
Let me put your example more simply:
start = datetime(year=2018, month=3, day=29)
interval = relativedelta(months=1, days=2)
d1 = start + interval * 2 # 2018-06-02
d2 = start + interval + interval # 2018-06-03
print(d1, d2)
So I don't think it's a library bug: relativedelta applies the months first and the days afterwards, so if you follow both calculations by hand you'll see that each result makes sense.
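To make the order of operations visible without the library, here is a minimal sketch using only the standard library. add_months_then_days is a hypothetical helper (it is not part of dateutil) that mimics the months-first, days-second order: it shifts by whole months, clamps the day to the length of the target month, and only then adds the days.

```python
import calendar
from datetime import datetime, timedelta

def add_months_then_days(dt, months, days):
    """Hypothetical helper: shift by whole months (clamping the day), then add days."""
    total = dt.month - 1 + months
    year = dt.year + total // 12
    month = total % 12 + 1
    # Clamp the day so that e.g. Jan 31 + 1 month lands on the last day of February.
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day) + timedelta(days=days)

start = datetime(2018, 3, 29)

# One big step: +2 months, then +4 days (what interval * 2 amounts to).
multiplied = add_months_then_days(start, 2, 4)

# Two small steps: (+1 month, +2 days) applied twice.
stepwise = add_months_then_days(add_months_then_days(start, 1, 2), 1, 2)

print(multiplied)  # 2018-06-02 00:00:00
print(stepwise)    # 2018-06-03 00:00:00
```

In the stepwise version the first step already spills over into May (April 29 + 2 days = May 1), so the second "+1 month" starts from a different month than in the one-step version. Which of the two definitions of "Nth deadline" you want is a decision about your requirements, not something the library can settle for you.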
I have the following dataset:
date event next_event duration_Minutes
2021-09-09 22:30:00 1 2021-09-09 23:00:00 30
2021-09-09 23:00:00 2 2021-09-09 23:10:00 10
2021-09-09 23:10:00 1 2021-09-09 23:50:00 40
2021-09-09 23:50:00 4 2021-09-10 00:50:00 60
2021-09-10 00:50:00 4 2021-09-12 00:50:00 2880
The main problem is that I would like to split multi-day events into separate events, as follows: I would like one event lasting from 2021-09-09 23:50:00 until 2021-09-10 00:00:00, then another from 2021-09-10 00:00:00 to 2021-09-10 00:50:00, and so on. This would be useful because afterwards I need to group the events by day and calculate the duration of each event per day, so I would like to handle the situations in which the day changes between events.
I would like to obtain something like this:
date event next_event duration_Minutes
2021-09-09 22:30:00 1 2021-09-09 23:00:00 30
2021-09-09 23:00:00 2 2021-09-09 23:10:00 10
2021-09-09 23:10:00 1 2021-09-09 23:50:00 40
2021-09-09 23:50:00 4 2021-09-10 00:00:00 10
2021-09-10 00:00:00 4 2021-09-10 00:50:00 50
2021-09-10 00:50:00 4 2021-09-11 00:00:00 1390
2021-09-11 00:00:00 4 2021-09-12 00:00:00 1440
2021-09-12 00:00:00 4 2021-09-12 00:50:00 50
It should be able to handle situations in which we don't have an event for an entire day or more like in the example.
My current solution for now is:
first_record_hour_ts = df.index.floor('H')[0]
last_record_hour_ts = df.index.floor('H')[-1]
# Create a series from the first to the last date containing Nan
df_to_join = pd.Series(np.nan, index=pd.date_range(first_record_hour_ts, last_record_hour_ts, freq='H'))
df_to_join = pd.DataFrame(df_to_join)
# Concatenate with current status dataframe
df = pd.concat([df, df_to_join[~df_to_join.index.isin(df.index)]]).sort_index()
# Forward fill the NaNs
df.fillna(method='ffill', inplace=True)
df['next_event'] = df.index.shift(-1)
# Calculate the delta between the 2 status
df['duration'] = df['next_event'] - df.index
# Convert into minutes
df['duration_Minutes'] = df['duration'].apply(lambda x: x.total_seconds() // 60)
This doesn't solve the problem exactly, but I think it may achieve my goal, which is to be able to group by event and by day at the end.
Ok, the code below looks a bit long -- and there's certainly a better/more efficient/shorter way of doing this. But I think it's pretty reasonably simple to follow along.
split_datetime_span_by_day below takes two dates: start_date and end_date. In your case, it would be date and next_event from your source data.
The function then checks whether that time period (start -> end) spans over midnight. If it doesn't, it returns the start date, the end date, and the time period in seconds. If it does span over midnight, it creates a new segment (start -> midnight), and then calls itself again (i.e. recurses), and the process continues until the time period does not span over midnight.
Just a note: the returned segment list is made up of tuples of (start, end, nmb_seconds). I'm returning the number of seconds, not the number of minutes as in your question, because I didn't know how you wanted to round the seconds (up, down, etc.). That's left as an exercise for the reader :-)
from datetime import datetime, timedelta

def split_datetime_span_by_day(start_date, end_date, split_segments=None):
    assert start_date < end_date  # sanity check
    # when is the next midnight after start_date?
    # adapted from https://ispycode.com/Blog/python/2016-07/Get-Midnight-Today
    start_next_midnight = datetime.combine(start_date, datetime.min.time()) + timedelta(days=1)
    if split_segments is None:
        split_segments = []
    if end_date < start_next_midnight:
        # end date is before next midnight, no split necessary
        return split_segments + [(
            start_date,
            end_date,
            (end_date - start_date).total_seconds()
        )]
    # otherwise, split at next midnight...
    split_segments += [(
        start_date,
        start_next_midnight,
        (start_next_midnight - start_date).total_seconds()
    )]
    if (end_date - start_next_midnight).total_seconds() > 0:
        # ...and recurse to get next segment
        return split_datetime_span_by_day(
            start_date=start_next_midnight,
            end_date=end_date,
            split_segments=split_segments
        )
    else:
        # case where start_next_midnight == end_date i.e. end_date is midnight
        # don't split & create a 0 second segment
        return split_segments
# test case:
start_date = datetime.strptime('2021-09-12 00:00:00', '%Y-%m-%d %H:%M:%S')
end_date = datetime.strptime('2021-09-14 01:00:00', '%Y-%m-%d %H:%M:%S')
print(split_datetime_span_by_day(start_date=start_date, end_date=end_date))
# returned values:
# [
# (datetime.datetime(2021, 9, 12, 0, 0), datetime.datetime(2021, 9, 13, 0, 0), 86400.0),
# (datetime.datetime(2021, 9, 13, 0, 0), datetime.datetime(2021, 9, 14, 0, 0), 86400.0),
# (datetime.datetime(2021, 9, 14, 0, 0), datetime.datetime(2021, 9, 14, 1, 0), 3600.0)
# ]
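If recursion feels heavy for very long spans, the same idea can be written as a loop. This is only a sketch: split_by_day_minutes is a hypothetical name, it assumes naive datetimes (no time zones or DST), and it rounds durations down to whole minutes, which is an assumption about the rounding you want.

```python
from datetime import datetime, timedelta

def split_by_day_minutes(start, end):
    """Hypothetical helper: split [start, end) at every midnight.

    Returns a list of (segment_start, segment_end, whole_minutes) tuples.
    """
    segments = []
    while start < end:
        # First midnight strictly after `start`.
        next_midnight = datetime.combine(start.date(), datetime.min.time()) + timedelta(days=1)
        stop = min(end, next_midnight)
        minutes = int((stop - start).total_seconds() // 60)
        segments.append((start, stop, minutes))
        start = stop
    return segments

# The multi-day row from the question: 2021-09-10 00:50 -> 2021-09-12 00:50
rows = split_by_day_minutes(datetime(2021, 9, 10, 0, 50), datetime(2021, 9, 12, 0, 50))
for seg_start, seg_end, minutes in rows:
    print(seg_start, seg_end, minutes)
# 2021-09-10 00:50:00 2021-09-11 00:00:00 1390
# 2021-09-11 00:00:00 2021-09-12 00:00:00 1440
# 2021-09-12 00:00:00 2021-09-12 00:50:00 50
```

Because the loop stops as soon as start reaches end, a span that ends exactly at midnight does not produce a zero-length segment.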
This is my dataframe.
Start_hour End_date
23:58:00 00:26:00
23:56:00 00:01:00
23:18:00 23:36:00
How can I get in a new column the difference (in minutes) between these two columns?
>>> from datetime import datetime
>>>
>>> before = datetime.now()
>>> print('wait for more than 1 minute')
wait for more than 1 minute
>>> after = datetime.now()
>>> td = after - before
>>>
>>> td
datetime.timedelta(seconds=98, microseconds=389121)
>>> td.total_seconds()
98.389121
>>> td.total_seconds() / 60
1.6398186833333335
Then you can round it or use it as-is.
You can do something like this:
import pandas as pd
df = pd.DataFrame({
'Start_hour': ['23:58:00', '23:56:00', '23:18:00'],
'End_date': ['00:26:00', '00:01:00', '23:36:00']}
)
df['Start_hour'] = pd.to_datetime(df['Start_hour'])
df['End_date'] = pd.to_datetime(df['End_date'])
df['diff'] = df.apply(
lambda row: (row['End_date']-row['Start_hour']).seconds / 60,
axis=1
)
print(df)
Start_hour End_date diff
0 2021-03-29 23:58:00 2021-03-29 00:26:00 28.0
1 2021-03-29 23:56:00 2021-03-29 00:01:00 5.0
2 2021-03-29 23:18:00 2021-03-29 23:36:00 18.0
You can also convert the dates back to strings if you like:
df['Start_hour'] = df['Start_hour'].apply(lambda x: x.strftime('%H:%M:%S'))
df['End_date'] = df['End_date'].apply(lambda x: x.strftime('%H:%M:%S'))
print(df)
Output:
Start_hour End_date diff
0 23:58:00 00:26:00 28.0
1 23:56:00 00:01:00 5.0
2 23:18:00 23:36:00 18.0
Short answer:
df['interval'] = df['End_date'] - df['Start_hour']
df.loc[df['End_date'] < df['Start_hour'], 'interval'] += timedelta(hours=24)
Why so:
You are probably trying to handle the case where Start_hour and End_date sometimes belong to different days, which is why you can't simply subtract one from the other.
If your time window never exceeds 24 hours, you can use some modular arithmetic to deal with the 23:59:59 - 00:00:00 border:
if End_date < Start_hour, End_date always belongs to the next day
this implies that if End_date - Start_hour < 0, we should add 24 hours to End_date to find the actual difference
The final formula is:
if rec['Start_hour'] < rec['End_date']:
    offset = timedelta(0)
else:
    offset = timedelta(hours=24)
rec['delta'] = offset + rec['End_date'] - rec['Start_hour']
To do the same with a pandas.DataFrame we need to change the code accordingly, and
that's how we get the snippet from the beginning of the answer.
from datetime import datetime, timedelta
import pandas as pd
df = pd.DataFrame([
    {'Start_hour': datetime(1, 1, 1, 23, 58, 0), 'End_date': datetime(1, 1, 1, 0, 26, 0)},
    {'Start_hour': datetime(1, 1, 1, 23, 58, 0), 'End_date': datetime(1, 1, 1, 23, 59, 0)},
])
# ...
df['interval'] = df['End_date'] - df['Start_hour']
df.loc[df['End_date'] < df['Start_hour'], 'interval'] += timedelta(hours=24)
> df
Start_hour End_date interval
0 0001-01-01 23:58:00 0001-01-01 00:26:00 0 days 00:28:00
1 0001-01-01 23:58:00 0001-01-01 23:59:00 0 days 00:01:00
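The "add 24 hours when the difference is negative" rule can also be written as a literal modulo, since Python's timedelta supports the % operator. The sketch below uses only the standard library; minutes_between is a hypothetical helper, and it relies on the same assumption as above, namely that the window is always shorter than 24 hours.

```python
from datetime import datetime, timedelta

ONE_DAY = timedelta(days=1)

def minutes_between(start_hms, end_hms):
    """Hypothetical helper: minutes from start to end, assuming a window under 24 hours."""
    start = datetime.strptime(start_hms, '%H:%M:%S')
    end = datetime.strptime(end_hms, '%H:%M:%S')
    # A negative difference means `end` falls on the next day; the modulo wraps it
    # into the range [0, 24h).
    return ((end - start) % ONE_DAY).total_seconds() / 60

for start_hms, end_hms in [('23:58:00', '00:26:00'),
                           ('23:56:00', '00:01:00'),
                           ('23:18:00', '23:36:00')]:
    print(start_hms, end_hms, minutes_between(start_hms, end_hms))
# 23:58:00 00:26:00 28.0
# 23:56:00 00:01:00 5.0
# 23:18:00 23:36:00 18.0
```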
I have a series of hourly data, and a Python list of dates that I'm interested in examining:
>>> hourly
KWH_DTTM
2015-06-20 15:00:00 2138.4
2015-06-20 16:00:00 4284.0
2015-06-20 17:00:00 4168.8
...
2017-06-21 21:00:00 2743.2
2017-06-21 22:00:00 2757.6
2017-06-21 23:00:00 2635.2
Freq: H, Name: KWH, Length: 17577, dtype: float64
>>> days
[datetime.date(2017, 5, 5), datetime.date(2017, 5, 8), datetime.date(2017, 5, 9), datetime.date(2017, 6, 2)]
I am trying to figure out how to select all entries from hourly that land on a day in days (days is about 50 entries long, and dates can be arbitrary). days is currently a list of Python date objects, but I don't care if they're strings, etc.
If I index hourly with days, I get a series that has been resampled to daily intervals:
>>> hourly[days]
KWH_DTTM
2017-05-05 2628.0
2017-05-08 2628.0
2017-05-09 2548.8
2017-06-02 2512.8
Name: KWH, Length: 30, dtype: float64
If I index with a single day, rendered to a string, I get the desired output for that day:
>>> hourly['2017-5-5']
KWH_DTTM
2017-05-05 00:00:00 2505.6
2017-05-05 01:00:00 2563.2
2017-05-05 02:00:00 2505.6
...
2017-05-05 21:00:00 2268.0
2017-05-05 22:00:00 2232.0
2017-05-05 23:00:00 2088.0
Freq: H, Name: KWH, Length: 24, dtype: float64
Is there a way to do this besides looping over my list of days and concatenating the results?
Consider building a boolean series with Series.apply(): pass every value of the DatetimeIndex and check whether its date equals any element of days. Then use this boolean series to filter the hourly series.
import datetime
import numpy as np
import pandas as pd

# DATA EXAMPLE
np.random.seed(45)
# pd.date_range replaces the DatetimeIndex(start=..., periods=...) constructor,
# which newer pandas versions no longer accept
hourly = pd.Series(index=pd.date_range(start='2016-09-05 00:00:00',
                                       periods=17577, freq='H'),
                   data=np.random.randn(17577),
                   name='KWH_DTTM')
days = [datetime.date(2017, 5, 5), datetime.date(2017, 5, 8),
        datetime.date(2017, 5, 9), datetime.date(2017, 6, 2)]
# BOOLEAN SERIES
bools = pd.Series(hourly.index.values).apply(
    lambda x: any(x.date() == d for d in days))
bools.index = hourly.index
# FILTER ORIGINAL SERIES
newhourly = hourly[bools]
print(newhourly.head(10))
# 2017-05-05 00:00:00 -0.238799
# 2017-05-05 01:00:00 -0.263365
# 2017-05-05 02:00:00 -0.249632
# 2017-05-05 03:00:00 0.131630
# 2017-05-05 04:00:00 -1.279383
# 2017-05-05 05:00:00 0.411316
# 2017-05-05 06:00:00 -2.059022
# 2017-05-05 07:00:00 -1.008058
# 2017-05-05 08:00:00 -0.365651
# 2017-05-05 09:00:00 1.515522
# Name: KWH_DTTM, dtype: float64
print(newhourly.tail(10))
# 2017-06-02 14:00:00 0.329567
# 2017-06-02 15:00:00 -0.618604
# 2017-06-02 16:00:00 0.848719
# 2017-06-02 17:00:00 -1.152657
# 2017-06-02 18:00:00 0.269618
# 2017-06-02 19:00:00 -1.806861
# 2017-06-02 20:00:00 -0.188643
# 2017-06-02 21:00:00 0.515790
# 2017-06-02 22:00:00 0.384695
# 2017-06-02 23:00:00 1.115494
# Name: KWH_DTTM, dtype: float64
You could convert hourly to a DataFrame, and then use .isin():
df = hourly.reset_index(name='KWH').rename(columns={'index':'hours'})
df = df[df.hours.apply(lambda x: datetime.date(x.year, x.month, x.day)).isin(dates)]
Here's the complete code with random data:
import pandas as pd
import datetime
import random
random_data = [random.randint(1000,2000) for x in range(1,1000)]
hours = [datetime.datetime(random.randint(2014,2016),random.randint(1,12),random.randint(1,28),random.randint(1,23),0) for x in range(1,1000)]
hourly = pd.Series(data=random_data, index=hours)
dates = [datetime.date(random.randint(2014,2016),random.randint(1,12),random.randint(1,28)) for x in range(1,10)]
df = hourly.reset_index(name='KWH').rename(columns={'index':'hours'})
df = df[df.hours.apply(lambda x: datetime.date(x.year, x.month, x.day)).isin(dates)]
I have the following model:
class Deal(models.Model):
    start_date = models.DateTimeField()
    end_date = models.DateTimeField()
I want to iterate through a given year
year = '2010'
For each month in year I want to execute a query to see if the month is between start_date and end_date.
How can I iterate through a given year? Use the month to do a query?
SELECT * FROM deals WHERE month BETWEEN start_date AND end_date
The outcome will tell me if I had a deal in January 2010 and/or in February 2010, etc.
How can I iterate through a given year?
You could use python-dateutil's rrule. Install with command pip install python-dateutil.
Example usage:
In [1]: from datetime import datetime
In [2]: from dateutil import rrule
In [3]: list(rrule.rrule(rrule.MONTHLY, dtstart=datetime(2010, 1, 1, 0, 1), count=12))
Out[3]:
[datetime.datetime(2010, 1, 1, 0, 1),
datetime.datetime(2010, 2, 1, 0, 1),
datetime.datetime(2010, 3, 1, 0, 1),
datetime.datetime(2010, 4, 1, 0, 1),
datetime.datetime(2010, 5, 1, 0, 1),
datetime.datetime(2010, 6, 1, 0, 1),
datetime.datetime(2010, 7, 1, 0, 1),
datetime.datetime(2010, 8, 1, 0, 1),
datetime.datetime(2010, 9, 1, 0, 1),
datetime.datetime(2010, 10, 1, 0, 1),
datetime.datetime(2010, 11, 1, 0, 1),
datetime.datetime(2010, 12, 1, 0, 1)]
Use the month to do a query?
You could iterate over months like this:
In [1]: from dateutil import rrule
In [2]: from datetime import datetime
In [3]: months = list(rrule.rrule(rrule.MONTHLY, dtstart=datetime(2010, 1, 1, 0, 1), count=13))
In [4]: i = 0
In [5]: while i < len(months) - 1:
   ...:     print("start_date", months[i], "end_date", months[i+1])
   ...:     i += 1
...:
start_date 2010-01-01 00:01:00 end_date 2010-02-01 00:01:00
start_date 2010-02-01 00:01:00 end_date 2010-03-01 00:01:00
start_date 2010-03-01 00:01:00 end_date 2010-04-01 00:01:00
start_date 2010-04-01 00:01:00 end_date 2010-05-01 00:01:00
start_date 2010-05-01 00:01:00 end_date 2010-06-01 00:01:00
start_date 2010-06-01 00:01:00 end_date 2010-07-01 00:01:00
start_date 2010-07-01 00:01:00 end_date 2010-08-01 00:01:00
start_date 2010-08-01 00:01:00 end_date 2010-09-01 00:01:00
start_date 2010-09-01 00:01:00 end_date 2010-10-01 00:01:00
start_date 2010-10-01 00:01:00 end_date 2010-11-01 00:01:00
start_date 2010-11-01 00:01:00 end_date 2010-12-01 00:01:00
start_date 2010-12-01 00:01:00 end_date 2011-01-01 00:01:00
Replace the "print" statement with a query. Feel free to adapt it to your needs.
There is probably a better way but that could do the job.
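If you would rather not add a dependency, the month boundaries can also be built with the standard library alone. This is a rough sketch: month_ranges and months_with_deal are hypothetical names, and it interprets "a deal in that month" as "the deal's period overlaps the month", which is an assumption about what you want.

```python
from datetime import datetime

def month_ranges(year):
    """Yield (month_start, next_month_start) pairs for each month of `year`."""
    for month in range(1, 13):
        start = datetime(year, month, 1)
        end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
        yield start, end

def months_with_deal(year, deal_start, deal_end):
    """Hypothetical helper: month numbers whose range overlaps [deal_start, deal_end]."""
    return [start.month for start, end in month_ranges(year)
            # Two ranges overlap when each one starts before the other one ends.
            if deal_start < end and deal_end >= start]

print(months_with_deal(2010, datetime(2010, 1, 20), datetime(2010, 3, 5)))
# [1, 2, 3]
```

With the Deal model from the question, the same overlap test could presumably be expressed per month as Deal.objects.filter(start_date__lt=end, end_date__gte=start), using the start and end values yielded by month_ranges.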
Basically, I want my script to pause between 4 and 5 AM. The only way to do this I've come up with so far is this:
seconds_into_day = time.time() % (60*60*24)
if 60*60*4 < seconds_into_day < 60*60*5:
sleep(time_left_till_5am)
Any "proper" way to do this? Aka some built-in function/lib for calculating time; rather than just using seconds all the time?
You want datetime
The datetime module supplies classes for manipulating dates and times in both simple and complex ways
If you read the hour attribute of datetime.now() you'll get the current hour:
datetimenow = datetime.now()
if datetimenow.hour in range(4, 5):
    sleep(time_left_till_5am)
You can calculate time_left_till_5am by multiplying 59 - datetimenow.minute by 60 and adding 60 - datetimenow.second.
Python has a built-in datetime library: http://docs.python.org/library/datetime.html
This should probably get you what you're after:
import datetime as dt
from time import sleep
now = dt.datetime.now()
if now.hour >= 4 and now.hour < 5:
    # 59 (not 60) full minutes remain, plus the rest of the current minute
    sleep((59 - now.minute)*60 + (60 - now.second))
OK, the above works, but here's the purer, less error-prone solution (and what I was originally thinking of but suddenly forgot how to do):
import datetime as dt
from time import sleep
now = dt.datetime.now()
pause = dt.datetime(now.year, now.month, now.day, 4)
start = dt.datetime(now.year, now.month, now.day, 5)
if now >= pause and now < start:
    sleep((start - now).seconds)
That's where my original "timedelta" comment came from -- what you get from subtracting two datetime objects is a timedelta object (which in this case we pull the 'seconds' attribute from).
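For what it's worth, the same calculation can be wrapped in a small function that takes the current time as an argument, which makes it easy to check without waiting until 4 AM. This is only a sketch: seconds_until_window_end is a hypothetical name, and it uses total_seconds() rather than the seconds attribute so that fractions of a second are kept.

```python
import datetime as dt

def seconds_until_window_end(now, pause_hour=4, resume_hour=5):
    """Hypothetical helper: seconds left until resume_hour if `now` is inside
    the pause window on the same day, otherwise 0."""
    pause = now.replace(hour=pause_hour, minute=0, second=0, microsecond=0)
    resume = now.replace(hour=resume_hour, minute=0, second=0, microsecond=0)
    if pause <= now < resume:
        return (resume - now).total_seconds()
    return 0.0

print(seconds_until_window_end(dt.datetime(2010, 7, 5, 4, 30, 15)))  # 1785.0
print(seconds_until_window_end(dt.datetime(2010, 7, 5, 5, 0, 0)))    # 0.0
```

In a script you would call sleep(seconds_until_window_end(dt.datetime.now())). Note that this version assumes the window does not cross midnight.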
The following code covers the more general case where a script needs to pause during any fixed window of less than 24 hours duration. Example: must sleep between 11:00 PM and 01:00 AM.
import datetime as dt
def sleep_duration(sleep_from, sleep_to, now=None):
    # sleep_* are datetime.time objects
    # now is a datetime.datetime object
    if now is None:
        now = dt.datetime.now()
    duration = 0
    lo = dt.datetime.combine(now, sleep_from)
    hi = dt.datetime.combine(now, sleep_to)
    if lo <= now < hi:
        duration = (hi - now).seconds
    elif hi < lo:
        if now >= lo:
            duration = (hi + dt.timedelta(hours=24) - now).seconds
        elif now < hi:
            duration = (hi - now).seconds
    return duration
tests = [
(4, 5, 3, 30),
(4, 5, 4, 0),
(4, 5, 4, 30),
(4, 5, 5, 0),
(4, 5, 5, 30),
(23, 1, 0, 0),
(23, 1, 0, 30),
(23, 1, 0, 59),
(23, 1, 1, 0),
(23, 1, 1, 30),
(23, 1, 22, 30),
(23, 1, 22, 59),
(23, 1, 23, 0),
(23, 1, 23, 1),
(23, 1, 23, 59),
]
for hfrom, hto, hnow, mnow in tests:
    sfrom = dt.time(hfrom)
    sto = dt.time(hto)
    dnow = dt.datetime(2010, 7, 5, hnow, mnow)
    print(sfrom, sto, dnow, sleep_duration(sfrom, sto, dnow))
and here's the output:
04:00:00 05:00:00 2010-07-05 03:30:00 0
04:00:00 05:00:00 2010-07-05 04:00:00 3600
04:00:00 05:00:00 2010-07-05 04:30:00 1800
04:00:00 05:00:00 2010-07-05 05:00:00 0
04:00:00 05:00:00 2010-07-05 05:30:00 0
23:00:00 01:00:00 2010-07-05 00:00:00 3600
23:00:00 01:00:00 2010-07-05 00:30:00 1800
23:00:00 01:00:00 2010-07-05 00:59:00 60
23:00:00 01:00:00 2010-07-05 01:00:00 0
23:00:00 01:00:00 2010-07-05 01:30:00 0
23:00:00 01:00:00 2010-07-05 22:30:00 0
23:00:00 01:00:00 2010-07-05 22:59:00 0
23:00:00 01:00:00 2010-07-05 23:00:00 7200
23:00:00 01:00:00 2010-07-05 23:01:00 7140
23:00:00 01:00:00 2010-07-05 23:59:00 3660
When dealing with dates and times in Python I still prefer mxDateTime over Python's datetime module. Although the built-in one has improved greatly over the years, it's still rather awkward and lacking in comparison. mxDateTime is free to download and use, and it makes life much easier when dealing with datetime math.
import mx.DateTime as dt
from time import sleep
now = dt.now()
if 4 <= now.hour < 5:
    stop = dt.RelativeDateTime(hour=5, minute=0, second=0)
    secs_remaining = ((now + stop) - now).seconds
    sleep(secs_remaining)