Aware DateTime object not adapting to change in DST - python

I wrote a generator in Python that yields a new day of data at each call from a pandas DataFrame. My DataFrame is indexed by Unix timestamp. My first attempt worked as follows (df is the DataFrame, tz is a pytz.timezone, Europe/Amsterdam in my case):
def interval_generator(df, tz):
    today = datetime.datetime.fromtimestamp(df.index.min(), tz)
    last_day = datetime.datetime.fromtimestamp(df.index.max(), tz)
    while today <= last_day:
        tomorrow = today + datetime.timedelta(days=1)
        yield df.loc[tz.localize(today).timestamp():tz.localize(tomorrow).timestamp() - 1]
        today = tomorrow
However when running my code I noticed that the DateTime object has the weird behaviour of really sticking with the timezone it was initially attached to (especially the incremented hour). Example of the (in my eyes) weird behaviour:
import datetime
import pytz
tz = pytz.timezone('Europe/Amsterdam')
# This is the day daylight saving time ends in the Netherlands in 2015.
t1 = datetime.datetime(2015, 10, 25, 0, 0)
t2 = t1 + datetime.timedelta(days=1)
t1_localized = tz.localize(t1)
t2_localized = tz.localize(t2)
t2_loc_incremented = t1_localized + datetime.timedelta(days=1)
When printing the output of these final three variables you get:
>>> t1_localized
datetime.datetime(2015, 10, 25, 0, 0, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>)
>>> t2_localized
datetime.datetime(2015, 10, 26, 0, 0, tzinfo=<DstTzInfo 'Europe/Amsterdam' CET+1:00:00 STD>)
>>> t2_loc_incremented
datetime.datetime(2015, 10, 26, 0, 0, tzinfo=<DstTzInfo 'Europe/Amsterdam' CEST+2:00:00 DST>)
More importantly for my code, the timestamp for both versions of t2 is different:
>>> t2_localized.timestamp()
1445814000.0
>>> t2_loc_incremented.timestamp()
1445810400.0
I solved this in my generator function with the following workaround:
def interval_generator(df, tz):
    today = datetime.datetime.fromtimestamp(df.index.min(), tz=tz).strftime('%Y-%m-%d')
    today = datetime.datetime.strptime(today, '%Y-%m-%d')
    last_day = datetime.datetime.fromtimestamp(df.index.max(), tz=tz).strftime('%Y-%m-%d')
    last_day = datetime.datetime.strptime(last_day, '%Y-%m-%d')
    while today <= last_day:
        tomorrow = today + datetime.timedelta(days=1)
        yield df.loc[tz.localize(today).timestamp():tz.localize(tomorrow).timestamp() - 1]
        today = tomorrow
This basically gets me the desired functionality, but I can't help wondering whether there is a better way to deal with this issue. Is there a good alternative for my problem? Is this considered a bug in the datetime module? (I am using Python 3.4.) I tried googling, but could not find anything.
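One possible alternative, sketched under assumptions: iterate over plain calendar dates and attach the zone to each local midnight separately, so the UTC offset is looked up per day. The sketch uses the standard-library zoneinfo module (Python 3.9+, so not available on the Python 3.4 mentioned above), and day_bounds is a hypothetical helper name. With pytz, tz.localize(...) would be needed in place of passing tzinfo= directly.

```python
import datetime
from zoneinfo import ZoneInfo  # standard library, Python 3.9+ (assumption)


def day_bounds(first_ts, last_ts, tz):
    """Yield (start, end) Unix timestamps for each local calendar day in tz."""
    day = datetime.datetime.fromtimestamp(first_ts, tz).date()
    last_day = datetime.datetime.fromtimestamp(last_ts, tz).date()
    midnight = datetime.time(0, 0)
    while day <= last_day:
        next_day = day + datetime.timedelta(days=1)
        # Build each midnight from a naive date, so the UTC offset is
        # resolved for that specific day rather than carried over.
        start = datetime.datetime.combine(day, midnight, tzinfo=tz)
        end = datetime.datetime.combine(next_day, midnight, tzinfo=tz)
        yield start.timestamp(), end.timestamp()
        day = next_day


tz = ZoneInfo("Europe/Amsterdam")
# 1445724000 is 2015-10-25 00:00 CEST, 1445814000 is 2015-10-26 00:00 CET.
bounds = list(day_bounds(1445724000, 1445814000, tz))
for start, end in bounds:
    print(start, end, (end - start) / 3600)
```

Inside the generator from the question, each pair could be used as df.loc[start:end - 1]. The first day printed spans 25 hours because the clocks go back on 2015-10-25.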

Related

Converting and Formatting Time Zones

I am dealing with my worst nightmare - timezones and DST. I already read a lot of posts on stackoverflow but I still cannot figure it out. Here is the problem:
I am making an API request for what needs to be one UTC day of data, but the system I work with needs the request in US/Pacific time. The documentation says:
Time zone is supported for report range filtering, but all responses are returned in US Pacific time zone please adjust Daylight Savings Time end accordingly.
API calls after DST starts should have -07:00 appended and after DST ends should have -08:00 appended
2018 Daylight Savings Time starts on Sunday March 11th 2018 at 2 AM.
API calls prior to March 11th should have the following syntax: &start=2017-03-10T00:00:00-08:00&end=2017-03-10T23:59:59-08:00
API call for the actual day of Daylight Savings should have the following syntax: &start=2018-03-11T00:00:00-08:00&end=2017-03-11T23:59:59-07:00
Apart from the confusing 2017 and 2018 mixture, there is no actual parameter to specify the time zone you need but you have to adjust the data that is in the following format : 2018-03-11T00:00:00-08:00.
To me it looks like an ISO format, but I spent quite some time trying to get yyyy-MM-dd'T'HH:mm:ssXXX rather than yyyy-MM-dd'T'HH:mm:ss.SSSXXX and couldn't make this work. So I created the following workaround:
def dst_calc(single_date):
    zone = pytz.timezone("US/Pacific")
    day = single_date.strftime("%Y-%m-%d")
    tdelta_1 = datetime.strptime('2:00:00', '%H:%M:%S') - datetime.strptime('1:00:00', '%H:%M:%S')
    tdelta_0 = datetime.strptime('1:00:00', '%H:%M:%S') - datetime.strptime('1:00:00', '%H:%M:%S')
    logger.info('check for DST')
    if zone.localize(datetime(single_date.year, single_date.month, single_date.day)).dst() == tdelta_1:
        logger.info('summertime')
        start = single_date.strftime("%Y-%m-%d") + "T00:00:00-07:00"
        end = single_date.strftime("%Y-%m-%d") + "T23:59:59-07:00"
    elif zone.localize(datetime(single_date.year, single_date.month, single_date.day) + timedelta(days=1)).dst() == tdelta_1:
        logger.info('beginning of summertime')
        start = single_date.strftime("%Y-%m-%d") + "T00:00:00-08:00"
        end = single_date.strftime("%Y-%m-%d") + "T23:59:59-07:00"
    elif zone.localize(datetime(single_date.year, single_date.month, single_date.day)).dst() == tdelta_0:
        logger.info('wintertime')
        start = single_date.strftime("%Y-%m-%d") + "T00:00:00-08:00"
        end = single_date.strftime("%Y-%m-%d") + "T23:59:59-08:00"
Obviously this only covers the US/Pacific timezone, and to get the UTC day I need to subtract the 8h difference from the start and end timestamps, i.e. have T16:00:00-08:00, but I am wondering if there is a better way / package / formatter that can do this in a more logic-proof way.
You can use datetime's astimezone method to determine the correct hours.
import datetime, pytz
now = datetime.datetime.now() # datetime.datetime(2019, 2, 12, 17, 0, 0, 0)
now.astimezone(pytz.utc)
# datetime.datetime(2019, 2, 12, 16, 0, 0, 0, tzinfo=<UTC>)
now.astimezone(pytz.timezone('US/Pacific'))
# datetime.datetime(2019, 2, 12, 8, 0, 0, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>)
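The offset strings from the question can also be produced without hand-written DST branches. This is a minimal sketch, assuming Python 3.9+ with the standard-library zoneinfo module; day_range is a hypothetical helper, and America/Los_Angeles is used as the canonical name for US/Pacific. isoformat() on an aware datetime appends the UTC offset that applies at that moment.

```python
import datetime
from zoneinfo import ZoneInfo  # standard library, Python 3.9+ (assumption)

PACIFIC = ZoneInfo("America/Los_Angeles")  # canonical name for US/Pacific


def day_range(day, tz=PACIFIC):
    """Return ISO 8601 start/end strings for one local calendar day."""
    start = datetime.datetime.combine(day, datetime.time(0, 0, 0), tzinfo=tz)
    end = datetime.datetime.combine(day, datetime.time(23, 59, 59), tzinfo=tz)
    # The offset is looked up separately for each of the two instants.
    return start.isoformat(), end.isoformat()


start, end = day_range(datetime.date(2018, 3, 11))
print(start)  # 2018-03-11T00:00:00-08:00
print(end)    # 2018-03-11T23:59:59-07:00
```

On ordinary days both strings carry the same offset; only on the two transition days do they differ.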

Python How do I calibrate to a specific time?

There is a web series starting on 2017-01-11 19:00 Warsaw time. I want to make a list of time zones for major cities to help people figure out when to tune in. How can I tell Python that the date variable is related to the time in Warsaw?
import datetime
from pytz import timezone
from pytz import common_timezones
# warsaw time
s = '2017-01-11 19:00:00.801000'
format = '%Y-%m-%d %H:%M:%S.%f'
date = datetime.datetime.strptime(s, format)
fmt = "%Y-%m-%d %H:%M:%S %Z%z"
warsaw_time = date
print(warsaw_time.strftime(fmt))
for zone in common_timezones:
print( zone + str(warsaw_time.astimezone(timezone(zone))) )
If I understand correctly, you are trying to set the date to Warsaw's local time (CET). Which you can do like this:
>>> warsaw = pytz.timezone("CET")
>>> dt = datetime.datetime(2017, 1, 11, 19, 0, 0, 0, warsaw)
>>> dt
datetime.datetime(2017, 1, 11, 19, 0, tzinfo=<DstTzInfo 'CET' CET+1:00:00 STD>)
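To get from there to the list of city times, the aware datetime can be converted with astimezone. The sketch below is an illustration only: it assumes Python 3.9+ and the standard-library zoneinfo module, where passing the zone to the constructor is safe, and the three cities are an arbitrary selection. With pytz and a city zone such as Europe/Warsaw, warsaw.localize(naive_dt) should be used instead of the tzinfo argument, since the constructor would otherwise pick up the zone's historical local mean time offset.

```python
import datetime
from zoneinfo import ZoneInfo  # standard library, Python 3.9+ (assumption)

# The start time from the question, interpreted as Warsaw wall-clock time.
start = datetime.datetime(2017, 1, 11, 19, 0, tzinfo=ZoneInfo("Europe/Warsaw"))

# Illustrative selection of cities, not taken from the question.
cities = ["Europe/London", "America/New_York", "Asia/Tokyo"]

# astimezone() keeps the instant and changes only the local representation.
schedule = {name: start.astimezone(ZoneInfo(name)) for name in cities}

fmt = "%Y-%m-%d %H:%M %Z%z"
print("Europe/Warsaw", start.strftime(fmt))
for name, local in schedule.items():
    print(name, local.strftime(fmt))
```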

Nondeterministic behavior in python's relativedelta

I am trying to get a datetime seven days prior to another date.
So I am doing in the console:
import datetime
from dateutil.relativedelta import relativedelta
dt = datetime.date(2014, 10, 18)
dt_minus_one_week = datetime.date(2014, 10, 18) - relativedelta(days=7)
The result is, as expected, datetime.date(2014, 10, 11). However, I am running a webservice (using eve, but I think that this is unimportant) application for a long time, and then when I invoke the method to get the one week older date, I get datetime.date(2014, 10, 10). The code is exactly the same as above.
If I restart the app, the date is what I expected it to be. Why is this happening? Is relativedelta nondeterministic? Is there any way to "reset" it so I can get the right value again?
From the description of your functions in the comments, you have stepped on a common python "landmine".
def get_d_minus_one_pacific_local_date():
    return datetime.datetime.now(
        pytz.timezone('US/Pacific')).date() - relativedelta(days=1)
def get_relative_date(init=get_d_minus_one_pacific_local_date(), *args, **kwargs):
    return init + datetime.timedelta(*args, **kwargs)
# ...
get_relative_date(days=-7)
When you set the default value of init in get_relative_date definition, it will not be recalculated again. So when the next day comes, it will use the value obtained at the time of function definition.
See: https://stackoverflow.com/a/530768/632706
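A common way around this is to default to None and compute the value inside the function body. The following is a simplified sketch: it uses datetime.date.today() in place of the Pacific-time helper above, and the keyword-only signature is an assumption for illustration.

```python
import datetime


def get_relative_date(init=None, **kwargs):
    """Return init shifted by a timedelta built from kwargs."""
    if init is None:
        # Evaluated on every call, not once at function definition time.
        init = datetime.date.today()
    return init + datetime.timedelta(**kwargs)


# An explicit starting date behaves as before.
print(get_relative_date(datetime.date(2014, 10, 18), days=-7))  # 2014-10-11
# Without one, the current date is looked up at call time.
print(get_relative_date(days=-7))
```

A long-running process then picks up the new date after midnight without a restart.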
If you are only dealing with days, I would just use the datetime module.
import datetime
old_date = datetime.date(2014, 10, 18)
new_date = old_date - datetime.timedelta(days=7)
The output would be datetime.date(2014, 10, 11). I have used timedelta a bit and haven't had a problem with inaccurate dates.
Suppose the web server is set up in the US/Hawaii timezone and the current
localtime is 11PM on 2014-10-17. Then
In [57]: datetime.datetime(2014, 10, 17, 23, 0, 0, tzinfo=pytz.timezone('US/Pacific')).date()
Out[57]: datetime.date(2014, 10, 17)
However, the current time in US/Pacific is
In [44]: now = datetime.datetime(2014, 10, 17, 23, 0, 0)
In [45]: hawaii = pytz.timezone('US/Hawaii')
In [46]: pacific = pytz.timezone('US/Pacific')
In [47]: pacific.normalize(hawaii.localize(now).astimezone(pacific)).date()
Out[47]: datetime.date(2014, 10, 18)
This would cause the symptom you are seeing.
In short, you almost never want to build a timezone-aware datetime by directly
supplying it to tzinfo:
datetime.datetime.now(pytz.timezone('US/Pacific')).date()
If you are using pytz, use the pytz timezone's localize method:
tzone.localize(naive_date)
By the way,
datetime.datetime.now(pytz.timezone('US/Pacific')).date()
is always equivalent to
datetime.datetime.now().date()
or
datetime.date.today()
datetime.datetime.now(pytz.timezone('US/Pacific')) is the same as
datetime.datetime.now() with the tzinfo set to pytz.timezone('US/Pacific'), but
if you then call the date method, then the tzinfo does not matter, since all you get back is the year, month and date.

How to find next day's Unix timestamp for same hour, including DST, in Python?

In Python, I can find the Unix time stamp of a local time, knowing the time zone, like this (using pytz):
>>> import datetime as DT
>>> import pytz
>>> mtl = pytz.timezone('America/Montreal')
>>> naive_time3 = DT.datetime.strptime('2013/11/03', '%Y/%m/%d')
>>> naive_time3
datetime.datetime(2013, 11, 3, 0, 0)
>>> localized_time3 = mtl.localize(naive_time3)
>>> localized_time3
datetime.datetime(2013, 11, 3, 0, 0, tzinfo=<DstTzInfo 'America/Montreal' EDT-1 day, 20:00:00 DST>)
>>> localized_time3.timestamp()
1383451200.0
So far, so good. naive_time3 is not aware of the time zone, whereas localized_time3 knows it's midnight on 2013/11/03 in Montréal, so the (UTC) Unix time stamp is good. This time zone is also my local time zone and this time stamp seems right:
$ date -d @1383451200
Sun Nov 3 00:00:00 EDT 2013
Now, clocks were adjusted one hour backward November 3rd at 2:00 here in Montréal, so we gained an extra hour that day. This means that there were, here, 25 hours between 2013/11/03 and 2013/11/04. This shows it:
>>> naive_time4 = DT.datetime.strptime('2013/11/04', '%Y/%m/%d')
>>> localized_time4 = mtl.localize(naive_time4)
>>> localized_time4
datetime.datetime(2013, 11, 4, 0, 0, tzinfo=<DstTzInfo 'America/Montreal' EST-1 day, 19:00:00 STD>)
>>> (localized_time4.timestamp() - localized_time3.timestamp()) / 3600
25.0
Now, I'm looking for an easy way to get the localized_time4 object from localized_time3, knowing I want to get the next localized day at the same hour (here, midnight). I tried timedelta, but I believe it's not aware of time zones or DST:
>>> localized_time4td = localized_time3 + DT.timedelta(1)
>>> localized_time4td
datetime.datetime(2013, 11, 4, 0, 0, tzinfo=<DstTzInfo 'America/Montreal' EDT-1 day, 20:00:00 DST>)
>>> (localized_time4td.timestamp() - localized_time3.timestamp()) / 3600
24.0
My purpose is to get information about log entries that are stored with their Unix timestamp for each local day. Of course, if I use localized_time3.timestamp() and add 24 * 3600 here (which will be the same as localized_time4td.timestamp()), I will miss all log entries that happened between localized_time4td.timestamp() and localized_time4td.timestamp() + 3600.
In other words, the function or method I'm looking for should know when to add 25 hours, 24 hours or 23 hours sometimes to a Unix time stamp, depending on when DST shifts happen.
Without using a new package:
def add_day(x):
    d = x.date()+DT.timedelta(1)
    return mtl.localize(x.replace(year=d.year, month=d.month, day=d.day, tzinfo=None))
Full script:
import datetime as DT
import pytz
import calendar
mtl = pytz.timezone('America/Montreal')
naive_time3 = DT.datetime.strptime('2013/11/03', '%Y/%m/%d')
print repr(naive_time3)
#datetime.datetime(2013, 11, 3, 0, 0)
localized_time3 = mtl.localize(naive_time3)
print repr(localized_time3)
#datetime.datetime(2013, 11, 3, 0, 0, tzinfo=<DstTzInfo 'America/Montreal' EDT-1 day, 20:00:00 DST>)
print calendar.timegm(localized_time3.utctimetuple())
#1383451200
def add_day(x):
    d = x.date()+DT.timedelta(1)
    return mtl.localize(x.replace(year=d.year, month=d.month, day=d.day, tzinfo=None))
print repr(add_day(localized_time3))
#datetime.datetime(2013, 11, 4, 0, 0, tzinfo=<DstTzInfo 'America/Montreal' EST-1 day, 19:00:00 STD>)
(calendar is for Python2.)
I gradually provide several solutions with the most robust solution at the very end of this answer that tries to handle the following issues:
utc offset due to DST
past dates when the local timezone might have had different utc offset due to reason unrelated to DST. dateutil and stdlib solutions fail here on some systems, notably Windows
ambiguous times during DST (don't know whether Arrow provides interface to handle it)
non-existent times during DST (the same)
To find POSIX timestamp for tomorrow's midnight (or other fixed hour) in a given timezone, you could use code from How do I get the UTC time of “midnight” for a given timezone?:
from datetime import datetime, time, timedelta
import pytz
DAY = timedelta(1)
tz = pytz.timezone('America/Montreal')
tomorrow = datetime(2013, 11, 3).date() + DAY
midnight = tz.localize(datetime.combine(tomorrow, time(0, 0)), is_dst=None)
timestamp = (midnight - datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds()
dt.date() method returns the same naive date for both naive and timezone-aware dt objects.
The explicit formula for timestamp is used to support Python version before Python 3.3. Otherwise .timestamp() method could be used in Python 3.3+.
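As a quick check of that formula, here is a minimal sketch using only the standard library. to_timestamp is a hypothetical helper name, and datetime.timezone.utc stands in for pytz.utc.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp(aware_dt):
    """Seconds since the Unix epoch for a timezone-aware datetime."""
    # Subtracting two aware datetimes with different tzinfo objects
    # compares them as instants in UTC.
    return (aware_dt - EPOCH).total_seconds()


# Midnight on 2013-11-04 in Montréal (EST, UTC-5) is 05:00 UTC.
dt = datetime(2013, 11, 4, 5, 0, tzinfo=timezone.utc)
print(to_timestamp(dt))   # 1383541200.0
print(dt.timestamp())     # same value on Python 3.3+
```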
To avoid ambiguity in parsing input dates during DST transitions that are unavoidable for .localize() method unless you know is_dst parameter, you could use Unix timestamps stored with the dates:
from datetime import datetime, time, timedelta
import pytz
DAY = timedelta(1)
tz = pytz.timezone('America/Montreal')
local_dt = datetime.fromtimestamp(timestamp_from_the_log, tz)
tomorrow = local_dt.date() + DAY
midnight = tz.localize(datetime.combine(tomorrow, time(0, 0)), is_dst=None)
timestamp = (midnight - datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds()
To support other fixed hours (not only midnight):
tomorrow = local_dt.replace(tzinfo=None) + DAY # tomorrow, same time
dt_plus_day = tz.localize(tomorrow, is_dst=None)
timestamp = dt_plus_day.timestamp() # use the explicit formula before Python 3.3
is_dst=None raises an exception if the result date is ambiguous or non-existent. To avoid exception, you could choose the time that is closest to the previous date from yesterday (same DST state i.e., is_dst=local_dt.dst()):
from datetime import datetime, time, timedelta
import pytz
DAY = timedelta(1)
tz = pytz.timezone('America/Montreal')
local_dt = datetime.fromtimestamp(timestamp_from_the_log, tz)
tomorrow = local_dt.replace(tzinfo=None) + DAY
dt_plus_day = tz.localize(tomorrow, is_dst=local_dt.dst())
dt_plus_day = tz.normalize(dt_plus_day) # to detect non-existent times
timestamp = (dt_plus_day - datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds()
.localize() respects given time even if it is non-existent, therefore .normalize() is required to fix the time. You could raise an exception here if normalize() method changes its input (non-existent time detected in this case) for consistency with other code examples.
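For comparison, roughly the same "same wall-clock time tomorrow" logic can be sketched with the standard-library zoneinfo module (Python 3.9+), which is a different technique from the pytz calls above. same_time_next_day is a hypothetical helper; America/Toronto is used on the assumption that it follows the same rules as America/Montreal for these dates. Non-existent times are detected by a round trip through UTC, and an ambiguous result silently takes the first occurrence (fold=0).

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library, Python 3.9+ (assumption)


def same_time_next_day(ts, tz):
    """Return the Unix timestamp of the same local wall time one day later."""
    local = datetime.fromtimestamp(ts, tz)
    # With zoneinfo, adding a timedelta is wall-clock arithmetic; the UTC
    # offset is recomputed when the result is converted or compared.
    candidate = local + timedelta(days=1)
    round_trip = candidate.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) != candidate.replace(tzinfo=None):
        # The wall time falls into a DST gap and does not exist.
        raise ValueError("non-existent local time: %s" % candidate)
    return candidate.timestamp()


tz = ZoneInfo("America/Toronto")
start = 1383451200  # 2013-11-03 00:00 EDT, from the question
next_midnight = same_time_next_day(start, tz)
print((next_midnight - start) / 3600)  # 25.0
```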
(Thanks to @rdodev for pointing me to Arrow).
Using Arrow, this operation becomes easy:
>>> import arrow
>>> import datetime as DT
>>> lt3 = arrow.get(DT.datetime(2013, 11, 3), 'America/Montreal')
>>> lt3
<Arrow [2013-11-03T00:00:00-04:00]>
>>> lt4 = arrow.get(DT.datetime(2013, 11, 4), 'America/Montreal')
>>> lt4
<Arrow [2013-11-04T00:00:00-05:00]>
>>> lt4.timestamp - (lt3.replace(days=1).timestamp)
0
>>> (lt3.replace(days=1).timestamp - lt3.timestamp) / 3600
25.0
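With Arrow's replace method, as used in the Arrow version shown here, singular unit names replace that property while plural names add to it. So lt3.replace(days=1) is November 4th, 2013 while lt3.replace(day=1) is November 1st, 2013. Newer Arrow releases moved the additive form to shift(), e.g. lt3.shift(days=1).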
Here is an alternative based on dateutil:
>>> # In Spain we changed DST 10/26/2013
>>> import datetime
>>> import dateutil.tz
>>> # tzlocal gets the timezone of the computer
>>> dt1 = datetime.datetime(2013, 10, 26, 14, 00).replace(tzinfo=dateutil.tz.tzlocal())
>>> print dt1
2013-10-26 14:00:00+02:00
>>> dt2 = dt1 + datetime.timedelta(1)
>>> print dt2
2013-10-27 14:00:00+01:00
>>> # see if we have 25 hours of difference
>>> import time
>>> (time.mktime(dt2.timetuple()) - time.mktime(dt1.timetuple())) / 3600.0
25.0
>>> (float(dt2.strftime('%s')) - float(dt1.strftime('%s'))) / 3600 # the same
25.0

Cleanest and most Pythonic way to get tomorrow's date?

What is the cleanest and most Pythonic way to get tomorrow's date? There must be a better way than to add one to the day, handle days at the end of the month, etc.
datetime.date.today() + datetime.timedelta(days=1) should do the trick
timedelta can handle adding days, seconds, microseconds, milliseconds, minutes, hours, or weeks.
>>> import datetime
>>> today = datetime.date.today()
>>> today
datetime.date(2009, 10, 1)
>>> today + datetime.timedelta(days=1)
datetime.date(2009, 10, 2)
>>> datetime.date(2009,10,31) + datetime.timedelta(hours=24)
datetime.date(2009, 11, 1)
As asked in a comment, leap days pose no problem:
>>> datetime.date(2004, 2, 28) + datetime.timedelta(days=1)
datetime.date(2004, 2, 29)
>>> datetime.date(2004, 2, 28) + datetime.timedelta(days=2)
datetime.date(2004, 3, 1)
>>> datetime.date(2005, 2, 28) + datetime.timedelta(days=1)
datetime.date(2005, 3, 1)
No handling of leap seconds tho:
>>> from datetime import datetime, timedelta
>>> dt = datetime(2008,12,31,23,59,59)
>>> str(dt)
'2008-12-31 23:59:59'
>>> # leap second was added at the end of 2008,
>>> # adding one second should create a datetime
>>> # of '2008-12-31 23:59:60'
>>> str(dt+timedelta(0,1))
'2009-01-01 00:00:00'
>>> str(dt+timedelta(0,2))
'2009-01-01 00:00:01'
darn.
EDIT - @Mark: The docs say "yes", but the code says "not so much":
>>> time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")
(2008, 12, 31, 23, 59, 60, 2, 366, -1)
>>> time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S"))
1230789600.0
>>> time.gmtime(time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")))
(2009, 1, 1, 6, 0, 0, 3, 1, 0)
>>> time.localtime(time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S")))
(2009, 1, 1, 0, 0, 0, 3, 1, 0)
I would think that gmtime or localtime would take the value returned by mktime and give me back the original tuple, with 60 as the number of seconds. And this test shows that these leap seconds can just fade away...
>>> a = time.mktime(time.strptime("2008-12-31 23:59:60","%Y-%m-%d %H:%M:%S"))
>>> b = time.mktime(time.strptime("2009-01-01 00:00:00","%Y-%m-%d %H:%M:%S"))
>>> a,b
(1230789600.0, 1230789600.0)
>>> b-a
0.0
Even the basic time module can handle this:
import time
time.localtime(time.time() + 24*3600)
For people who are dealing with server timestamps:
To get yesterday's timestamp:
yesterdaytimestamp = datetime.datetime.today() + datetime.timedelta(days=-1)
To get today's timestamp:
currenttimestamp = datetime.datetime.now().timestamp()
To get tomorrow's timestamp:
tomorrowtimestamp = datetime.datetime.today() + datetime.timedelta(days=1)
To print:
print('\n Yesterday TimeStamp is : ', yesterdaytimestamp.timestamp(),
      '\n Today TimeStamp is :', currenttimestamp,
      '\n Tomorrow TimeStamp is: ', tomorrowtimestamp.timestamp())
The output:
Yesterday TimeStamp is : 1632842904.110993
Today TimeStamp is : 1632929304.111022
Tomorrow TimeStamp is : 1633015704.11103
There's nothing at all wrong with using today() as shown in the selected answer if that is the extent of your needs.
datetime.date.today() + datetime.timedelta(days=1)
Alternatively, if you or someone else working with your code might need more precision in handling tomorrow's date, consider using datetime.now() instead of today(). This will certainly allow for simpler, more readable code:
datetime.datetime.now() + datetime.timedelta(days=1)
This returns something like:
datetime.datetime(2022, 2, 17, 19, 50, 19, 984925)
The advantage is that you can now work with datetime attributes in a concise, human readable way:
class datetime.datetime
A combination of a date and a time. Attributes: year, month, day, hour, minute, second, microsecond, and tzinfo.
Examples
You can easily convert this to a date object with date():
import datetime
tomorrow = datetime.datetime.now() + datetime.timedelta(days=1)
print(f"Tomorrow's date is {tomorrow.date()}")
tomorrow.date() is easy to use and it is very clear to anyone reading your code that it is returning the date for tomorrow. The output for the above looks like so:
Tomorrow's date is 2022-02-17
If later in your code you only need the date number for the day, you can now use tomorrow.day:
print(f"Tomorrow is the {tomorrow.day}rd")
Which will return something like:
Tomorrow is the 17rd
That's a silly example, but you can see how having access to these attributes can be useful and keep your code readable as well. It can be easily understood that tomorrow.day returns the day number.
Need to work with the exact time tomorrow's date begins? You can now replace the hours, minutes, seconds, and microseconds:
# Replace the time components (hour, minute, second, microsecond) with 0.
midnight = tomorrow.replace(
    hour=0,
    minute=0,
    second=0,
    microsecond=0)
# Print midnight as the beginning of tomorrow's date.
print(f"{midnight}")
Reading the above code, it should be apparent which attributes of tomorrow are being replaced. When midnight is printed, it will output:
2022-02-17 00:00:00
Need to know the time left until tomorrow's date? Now something like that is possible, simple, and readable:
print(f"{midnight - datetime.datetime.now()}")
The output is the time remaining, to the microsecond, until tomorrow's date begins:
3:14:28.158331
There are many ways people might wish to handle tomorrow's date. By ensuring these attributes are available from the beginning, you can write more readable code and avoid unnecessary work later.
If you only want to calculate the timestamp (this gives the next midnight in UTC, not in local time):
import time
tomorrow = (int(time.time() / 86400) + 1) * 86400
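To see what that arithmetic does, here is a small sketch with a fixed input instead of time.time(). next_utc_midnight is a hypothetical helper name, and the sample value is the "today" timestamp from the output shown earlier. Unix time counts 86400 seconds per day with no leap seconds, so flooring to a multiple of 86400 lands on a UTC midnight.

```python
import datetime

SECONDS_PER_DAY = 86400


def next_utc_midnight(ts):
    """Return the Unix timestamp of the first UTC midnight after ts."""
    # Whole days since the epoch, plus one, converted back to seconds.
    return (int(ts // SECONDS_PER_DAY) + 1) * SECONDS_PER_DAY


ts = 1632929304  # 2021-09-29, mid-afternoon UTC
nxt = next_utc_midnight(ts)
print(nxt)  # 1632960000
print(datetime.datetime.fromtimestamp(nxt, datetime.timezone.utc))
# 2021-09-30 00:00:00+00:00
```

For a local-time midnight the zone's UTC offset on that day has to be taken into account, which this arithmetic does not do.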
