How do I combine a timezone aware date and time in Python?

I have a date and a time that I'm attempting to combine in Python. The time is timezone aware.
However, when I try and combine them, I get the wrong time.
import pytz
from datetime import datetime, time, date
NYC_TIME = pytz.timezone('America/New_York')
start_date = date(2012, 7, 7)
start_time = time(hour=0, tzinfo=NYC_TIME)
combined = datetime.combine(start_date, start_time)
print(combined)
print(NYC_TIME.normalize(combined))
This prints 2012-07-07 00:00:00-05:00, which normalizes to 2012-07-07 01:00:00-04:00. Why is this happening? How can I avoid it?

A time without a date attached must assume it's not in the Daylight Saving period. Once you attach a date to it, that assumption can be corrected. The zone offset changes, and the time changes as well to keep it at the same UTC equivalent.
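One way to avoid the problem is to keep the time naive and attach the zone only once the date is known. The sketch below is an assumption rather than the original poster's code: it uses the standard-library zoneinfo module (Python 3.9+) instead of pytz, and it needs the system time zone database to be available. With pytz, the documented equivalent would be to call NYC_TIME.localize() on the naive combined datetime.

```python
from datetime import date, time, datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9 (not used in the question)

nyc = ZoneInfo("America/New_York")
start_date = date(2012, 7, 7)
start_time = time(hour=0)  # deliberately naive: no tzinfo yet

# The zone is attached together with the date, so the DST rule for
# July 2012 can be looked up correctly.
combined = datetime.combine(start_date, start_time, tzinfo=nyc)
print(combined)  # 2012-07-07 00:00:00-04:00
```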

Related

Converting a datetime to local time based on timezone in Python3

I have a question related to dates and time in Python.
Problem:
date = datetime.datetime.strptime(str(row[1]), "%Y-%m-%d %H:%M:%S")
localtime = date.astimezone(pytz.timezone("Europe/Brussels"))
formattedDate = localtime.strftime("%Y-%m-%d")
In the code above, str(row[1]) gives back a UTC datetime coming from a mysql database: 2022-02-28 23:00:00
I parse this as a datetime and change the timezone to Europe/Brussels.
I then format it back to a string.
Expected result:
I'd like to return the date in local time. Europe/Brussels adds one hour so I would expect that strftime returns 2022-03-01, but it keeps returning 2022-02-28.
Can somebody help?

date is a naïve datetime, without a timezone, because the string you parsed contained no timezone information. Calling astimezone on a naïve datetime makes Python (3.6+) assume the value is in the system's local timezone, not UTC. It can't convert from UTC, because it doesn't know that is what the value represents.
This also already contains the answer: make the datetime aware that it's in UTC first, before trying to convert it to a different timezone:
date = datetime.datetime.strptime(...).replace(tzinfo=datetime.timezone.utc)

Ah, I see!
I ended up doing this, basically the same as you mentioned:
import datetime
from dateutil import tz
from_zone = tz.gettz('UTC')
to_zone = tz.gettz('Europe/Brussels')
utcdate = datetime.datetime.strptime(str(row[1]), "%Y-%m-%d %H:%M:%S")
utcdate = utcdate.replace(tzinfo=from_zone)
localdate = utcdate.astimezone(to_zone)
formattedLocalDate = localdate.strftime("%Y%m%d")
The naïve datetime becomes UTC-aware through utcdate.replace(tzinfo=from_zone).
Thanks for helping!
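The same steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not the poster's code: the raw string is the sample value from the question, and a fixed UTC+1 offset stands in for Europe/Brussels in winter. A real zone database (dateutil, pytz or zoneinfo) is still needed to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

raw = "2022-02-28 23:00:00"  # sample value from the question, stored as UTC

# Assumption: a fixed UTC+1 offset stands in for Europe/Brussels in winter.
brussels_winter = timezone(timedelta(hours=1))

naive = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")  # no tzinfo yet
utc_dt = naive.replace(tzinfo=timezone.utc)          # label the value as UTC
local_dt = utc_dt.astimezone(brussels_winter)        # now a real conversion happens

formatted = local_dt.strftime("%Y-%m-%d")
print(formatted)  # 2022-03-01
```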

Convert UTC+0 to other time zone Python

I have a time string obtained from an API; it's in UTC+0. I would like to convert it to another time zone.
I have tried the code below, but it doesn't work. Could you please give me some ideas? Many thanks.
utc0time='2021-04-17T15:50:14.614646+00:00'
dt = datetime.strptime('utc0time', '%Y-%m-%dT%H:%M:%S%z'). #it results as an error, not match the format
time.mktime(dt.timetuple())
calendar.timegm(dt.timetuple())

You could use timedelta from the datetime module to add or subtract the number of hours needed to get the time in the other timezone.
Here is an example where you can use timedelta:
https://www.geeksforgeeks.org/python-datetime-timedelta-function/

Thanks for the comments, they gave me the idea. Because I only need to convert from one time zone to one other, I don't need general multi-timezone support, so I didn't use pytz this time. I used a silly method: I changed the string to a timestamp first, then used timedelta to adjust the hours. Below is my final code.
utc0time='2021-04-17T15:50:14.614646+00:00'
utc0time = utc0time[:-13]
timestamp = time.mktime(time.strptime(utc0time, '%Y-%m-%dT%H:%M:%S'))
datatimeformat = datetime.fromtimestamp(timestamp)
utc8time = datatimeformat + timedelta(hours = 8)
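For comparison, here is a minimal sketch that keeps the offset instead of slicing it off the string. It assumes Python 3.7 or later, where datetime.fromisoformat accepts this exact format, and it uses a fixed UTC+8 offset as the target, matching the 8 hours added above. The variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

utc0time = '2021-04-17T15:50:14.614646+00:00'

# fromisoformat keeps the +00:00 offset, so the result is timezone-aware.
aware_utc = datetime.fromisoformat(utc0time)

# Assumption: the target zone is a fixed UTC+8 offset with no DST.
utc8 = timezone(timedelta(hours=8))
utc8time = aware_utc.astimezone(utc8)

print(utc8time.isoformat())  # 2021-04-17T23:50:14.614646+08:00
```

Because both objects are aware, they still compare equal as points in time; only the displayed wall-clock value differs.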

Test if datetime was converted to UTC correctly

Trying to write a test to see if my datetime conversions are working appropriately and getting some unexpected results.
import pytz
from datetime import datetime
def format_datetime(dt):
    if not dt.tzinfo:
        raise pytz.UnknownTimeZoneError('timezone not set')
    time = dt.astimezone(pytz.utc).strftime('%Y-%m-%dT%H:%M:%S')
    millis = dt.microsecond / 1000
    string = '{}{}'.format(time, '.%03dZ' % millis)
    return string
dt = datetime(2019, 3, 20, 1, 1, 1, 1)
# test 1
utc_dt = dt.replace(tzinfo=pytz.utc)
pdt_dt = dt.replace(tzinfo=pytz.timezone('America/Los_Angeles'))
print(format_datetime(utc_dt)) # 2019-03-20T01:01:01.000Z
print(format_datetime(pdt_dt)) # 2019-03-20T08:54:01.000Z
# test 2
utc_dt2 = dt.replace(tzinfo=pytz.utc)
pdt_dt2 = utc_dt2.astimezone(pytz.timezone('America/Los_Angeles'))
print(format_datetime(utc_dt2)) # 2019-03-20T01:01:01.000Z
print(format_datetime(pdt_dt2)) # 2019-03-20T01:01:01.000Z
I don't understand why, in the first test print(format_datetime(pdt_dt)) changes the minutes value, but in the second test the minutes aren't changed. (I understand why the hours are different between the two examples).

You can't just assign a pytz timezone to a datetime, you must use localize or astimezone:
utc_dt = pytz.utc.localize(dt)
pdt_dt = utc_dt.astimezone(pytz.timezone('America/Los_Angeles'))
This is because timezones are subject to change, and the pytz zone objects contain the entire history and need to be configured for the correct time period. A simple replace doesn't allow for this. Some of those old historic periods will have an odd number of minutes offset.

Python time library: how do I preserve dst with strptime and strftime

I need to store a timestamp in a readable format, and then later on I need to convert it to epoch for comparison purposes.
I tried doing this:
import time
format = '%Y %m %d %H:%M:%S +0000'
timestamp1 = time.strftime(format,time.gmtime()) # '2016 03 25 04:06:22 +0000'
t1 = time.strptime(timestamp1, format) # time.struct_time(..., tm_isdst=-1)
time.sleep(1)
epoch_now = time.mktime(time.gmtime())
epoch_t1 = time.mktime(t1)
print("Delta: %s" % (epoch_now - epoch_t1))
Running this, instead of getting Delta of 1 sec, I get 3601 (1 hr 1 sec), CONSISTENTLY.
Investigating further, it seems that when I just do time.gmtime(), the struct has tm_isdst=0, whereas the converted struct t1 from timestamp1 string has tm_isdst=-1.
How can I ensure the isdst is preserved to 0. I think that's probably the issue here.
Or is there a better way to record time in human readable format (UTC), and yet be able to convert back to epoch properly for time diff calculation?
UPDATES:
After doing more research last night, I switched to using datetime because it preserves more information in the datetime object, and this is confirmed by albertoql answer below.
Here's what I have now:
from datetime import datetime
format = '%Y-%m-%d %H:%M:%S.%f +0000' # +0000 is optional; only for user to see it's UTC
d1 = datetime.utcnow()
timestamp1 = d1.strftime(format)
d1a = datetime.strptime(timestamp1, format)
time.sleep(1)
d2 = datetime.utcnow()
print("Delta: %s" % (d2 - d1a).seconds)
I chose not to add tz to keep it simple/shorter; I can still strptime that way.

Below, first an explanation about the problem, then two possible solutions, one using time, another using datetime.
Problem explanation
The problem lies in the observation that the OP made in the question: tm_isdst=-1. tm_isdst is a flag that determines whether daylight saving time is in effect or not (for more details, see https://docs.python.org/2/library/time.html#time.struct_time).
Specifically, given the format of the time string from the OP (that complies with RFC 2822 Internet email standard), time.strptime does not store the information about the timezone, namely +0000. Thus, when the struct_time is created again from the information in the string, tm_isdst=-1, namely unknown. The guess on how to fill in that information when making the calculation is based on the local system. For example, if the system is set to a North American timezone where daylight saving time is in effect, tm_isdst is treated as set.
Solution with time
If you want to use only the time package, the easiest way to parse the information directly is to specify that the time is in UTC, by adding %Z to the format. Note that time does not provide a way to store the timezone information in struct_time. As a result, it does not print the actual time zone associated with the time saved in the variable; the time zone is retrieved from the system. Therefore, it is not possible to use the same format directly for time.strftime. The part of the code for writing and reading the string would look like:
format = '%Y %m %d %H:%M:%S UTC'
format2 = '%Y %m %d %H:%M:%S %Z'
timestamp1 = time.strftime(format, time.gmtime())
t1 = time.strptime(timestamp1, format2)
Solution with datetime
Another solution involves the use of the datetime and dateutil packages, which directly support timezones, and the code could be (assuming that preserving the timezone information is a requirement):
from datetime import datetime
from dateutil import tz, parser
import time
time_format = '%Y %m %d %H:%M:%S %z'
utc_zone = tz.gettz('UTC')
utc_time1 = datetime.utcnow()
utc_time1 = utc_time1.replace(tzinfo=utc_zone)
utc_time1_string = utc_time1.strftime(time_format)
utc_time1 = parser.parse(utc_time1_string)
time.sleep(1)
utc_time2 = datetime.utcnow()
utc_time2 = utc_time2.replace(tzinfo=utc_zone)
print("Delta: %s" % (utc_time2 - utc_time1).total_seconds())
There are some aspects to pay attention to:
After the call to utcnow, the timezone is not set, as it returns a naive UTC datetime. If the information about UTC is not needed, both lines where the timezone is set can be deleted, and the result would be the same, as there is no guess about DST.
datetime.strptime in Python 2 cannot be used here because it does not parse %z correctly (Python 3.2 and later do support %z in strptime). If the string contains timezone information, parser should be used.
It is possible to subtract two datetime instances directly and transform the resulting delta into seconds.
If it is necessary to get the time in seconds since the epoch, an explicit computation should be made, as there was no direct function for that in datetime at the time of the answer. The epoch must be timezone-aware as well, because an aware and a naive datetime cannot be subtracted. Below is the code, for example for utc_time2:
epoch_time = datetime(1970, 1, 1, tzinfo=utc_zone)
epoch2 = (utc_time2 - epoch_time).total_seconds()
The result is precise up to datetime.resolution, namely the smallest possible difference between two non-equal datetime objects.
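If staying with the time module is preferred, a further option is calendar.timegm, which interprets a struct_time as UTC and so does not depend on tm_isdst. The following is a minimal sketch, assuming the OP's original format string and a fixed sample epoch value chosen for illustration:

```python
import calendar
import time

fmt = '%Y %m %d %H:%M:%S +0000'
epoch_in = 1458878782  # sample value: 2016-03-25 04:06:22 UTC

# Write the readable UTC string, then read it back.
stamp = time.strftime(fmt, time.gmtime(epoch_in))
parsed = time.strptime(stamp, fmt)

# timegm treats the struct as UTC, unlike mktime, which assumes local time
# and has to guess about DST.
epoch_out = calendar.timegm(parsed)

print(stamp)                  # 2016 03 25 04:06:22 +0000
print(epoch_out - epoch_in)   # 0
```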

get UTC timestamp in python with datetime

Is there a way to get the UTC timestamp by specifying the date? What I would expect:
datetime(2008, 1, 1, 0, 0, 0, 0)
should result in
1199145600
Creating a naive datetime object means that there is no time zone information. If I look at the documentation for datetime.utcfromtimestamp, creating a UTC timestamp means leaving out the time zone information. So I would guess, that creating a naive datetime object (like I did) would result in a UTC timestamp. However:
then = datetime(2008, 1, 1, 0, 0, 0, 0)
datetime.utcfromtimestamp(float(then.strftime('%s')))
results in
2007-12-31 23:00:00
Is there still any hidden time zone information in the datetime object? What am I doing wrong?

Naïve datetime versus aware datetime
Default datetime objects are said to be "naïve": they keep time information without the time zone information. Think about a naïve datetime as a relative number (i.e. +4) without a clear origin (in fact your origin will be common throughout your system boundary).
In contrast, think about aware datetimes as absolute numbers (i.e. 8) with a common origin for the whole world.
Without timezone information you cannot convert the "naive" datetime to any non-naive time representation (where does +4 point if we don't know where to start from?). This is why you can't have a datetime.datetime.toutctimestamp() method. (cf: http://bugs.python.org/issue1457227)
To check if your datetime dt is naïve, check dt.tzinfo; if it is None, then it's naïve:
datetime.now() ## DANGER: returns naïve datetime pointing on local time
datetime(1970, 1, 1) ## returns naïve datetime pointing on user given time
I have naïve datetimes, what can I do ?
You must make an assumption depending on your particular context:
The question you must ask yourself is: was your datetime on UTC ? or was it local time ?
If you were using UTC (you are out of trouble):
import calendar
def dt2ts(dt):
    """Converts a datetime object to UTC timestamp
    naive datetime will be considered UTC.
    """
    return calendar.timegm(dt.utctimetuple())
If you were NOT using UTC, welcome to hell.
You have to make your datetime non-naïve prior to using the former function, by giving it back its intended timezone.
You'll need the name of the timezone and the information about whether DST was in effect when producing the target naïve datetime (this last piece of information about DST is required for corner cases):
import pytz ## pip install pytz
mytz = pytz.timezone('Europe/Amsterdam') ## Set your timezone
dt = mytz.normalize(mytz.localize(dt, is_dst=True)) ## Set is_dst accordingly
Consequences of not providing is_dst:
Not using is_dst will generate an incorrect time (and UTC timestamp) if the target datetime was produced while a backward DST shift was put in place (for instance changing DST time by removing one hour).
Providing an incorrect is_dst will of course generate an incorrect time (and UTC timestamp) only on DST overlaps or holes. And when you also provide an incorrect time, occurring in "holes" (time that never existed due to forward-shifting DST), is_dst will give an interpretation of how to consider this bogus time, and this is the only case where .normalize(..) will actually do something here, as it'll then translate it into an actual valid time (changing the datetime AND the DST object if required). Note that .normalize() is not required for having a correct UTC timestamp at the end, but is probably recommended if you dislike the idea of having bogus times in your variables, especially if you re-use this variable elsewhere.
and AVOID USING THE FOLLOWING: (cf: Datetime Timezone conversion using pytz)
dt = dt.replace(tzinfo=pytz.timezone('Europe/Amsterdam')) ## BAD !!
Why? Because .replace() blindly replaces the tzinfo without taking into account the target time and will choose a bad DST object, whereas .localize() uses the target time and your is_dst hint to select the right DST object.
OLD incorrect answer (thanks @J.F.Sebastien for bringing this up):
Hopefully, it is quite easy to guess the timezone (your local origin) when you create your naive datetime object, as it is related to the system configuration, which you would hopefully NOT change between the creation of the naive datetime object and the moment when you want to get the UTC timestamp. This trick can be used to give an imperfect answer.
By using time.mktime we can create an utc_mktime:
import time
def utc_mktime(utc_tuple):
    """Returns number of seconds elapsed since epoch
    Note that no timezone are taken into consideration.
    utc tuple must be: (year, month, day, hour, minute, second)
    """
    if len(utc_tuple) == 6:
        utc_tuple += (0, 0, 0)
    return time.mktime(utc_tuple) - time.mktime((1970, 1, 1, 0, 0, 0, 0, 0, 0))
def datetime_to_timestamp(dt):
    """Converts a datetime object to UTC timestamp"""
    return int(utc_mktime(dt.timetuple()))
You must make sure that the conversion runs in the same timezone as the one in which your datetime object was created.
This last solution is incorrect because it assumes that the UTC offset now is the same as the UTC offset at the EPOCH, which is not the case for a lot of timezones (at specific moments of the year, because of Daylight Saving Time (DST) offsets).

Another possibility is:
d = datetime.datetime.utcnow()
epoch = datetime.datetime(1970,1,1)
t = (d - epoch).total_seconds()
This works as both "d" and "epoch" are naive datetimes, making the "-" operator valid, and returning an interval. total_seconds() turns the interval into seconds. Note that total_seconds() returns a float, even when d.microsecond == 0.

Also note the calendar.timegm() function as described by this blog entry:
import calendar
calendar.timegm(utc_timetuple)
The output should agree with the solution of vaab.

A simple solution without using external modules:
from datetime import datetime, timezone
dt = datetime(2008, 1, 1, 0, 0, 0, 0)
int(dt.replace(tzinfo=timezone.utc).timestamp())

If input datetime object is in UTC:
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0)
>>> timestamp = (dt - datetime(1970, 1, 1)).total_seconds()
>>> timestamp
1199145600.0
Note: it returns float i.e., microseconds are represented as fractions of a second.
If input date object is in UTC:
>>> from datetime import date
>>> utc_date = date(2008, 1, 1)
>>> timestamp = (utc_date.toordinal() - date(1970, 1, 1).toordinal()) * 24*60*60
>>> timestamp
1199145600
See more details at Converting datetime.date to UTC timestamp in Python.

I feel like the main answer is still not so clear, and it's worth taking the time to understand time and timezones.
The most important thing to understand when dealing with time is that time is relative!
2017-08-30 13:23:00: (a naive datetime), represents a local time somewhere in the world, but note that 2017-08-30 13:23:00 in London is NOT THE SAME TIME as 2017-08-30 13:23:00 in San Francisco.
Because the same time string can be interpreted as different points-in-time depending on where you are in the world, there is a need for an absolute notion of time.
A UTC timestamp is a number in seconds (or milliseconds) from Epoch (defined as 1 January 1970 00:00:00 at GMT timezone +00:00 offset).
Epoch is anchored on the GMT timezone and therefore is an absolute point in time. A UTC timestamp being an offset from an absolute time therefore defines an absolute point in time.
This makes it possible to order events in time.
Without timezone information, time is relative, and cannot be converted to an absolute notion of time without providing some indication of what timezone the naive datetime should be anchored to.
What are the types of time used in computer systems?
naive datetime: usually for display, in local time (i.e. in the browser) where the OS can provide timezone information to the program.
UTC timestamps: A UTC timestamp is an absolute point in time, as mentioned above. It is anchored to UTC, so it can be converted to a datetime in any timezone, but it does not itself contain timezone information. What does that mean? It means that 1504119325 corresponds to 2017-08-30T18:55:25Z, or 2017-08-30T17:55:25-0100, or also 2017-08-30T10:55:25-0800. It doesn't tell you where the recorded datetime is from. It's usually used on the server side to record events (logs, etc.) or to convert a timezone-aware datetime to an absolute point in time and compute time differences.
ISO-8601 datetime string: The ISO-8601 is a standardized format to record datetime with timezone. (It's in fact several formats, read on here: https://en.wikipedia.org/wiki/ISO_8601) It is used to communicate timezone aware datetime information in a serializable manner between systems.
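The relationship between a single timestamp and its many wall-clock renderings can be sketched with the standard library. The fixed offsets below are illustrative stand-ins rather than named time zones, and the variable names are assumptions made for this example:

```python
from datetime import datetime, timedelta, timezone

ts = 1504119325  # seconds since the epoch

# One absolute point in time, rendered at three different UTC offsets.
as_utc = datetime.fromtimestamp(ts, tz=timezone.utc)
as_minus1 = as_utc.astimezone(timezone(timedelta(hours=-1)))
as_minus8 = as_utc.astimezone(timezone(timedelta(hours=-8)))

for rendering in (as_utc, as_minus1, as_minus8):
    # Each ISO-8601 string differs, but the timestamp is identical.
    print(rendering.isoformat(), int(rendering.timestamp()))
```

All three aware datetimes compare equal, because equality between aware datetimes is based on the point in time and not on the displayed offset.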
When to use which? or rather when do you need to care about timezones?
If you need in any way to care about time-of-day, you need timezone information. A calendar or alarm needs time-of-day to set a meeting at the correct time of the day for any user in the world. If this data is saved on a server, the server needs to know what timezone the datetime corresponds to.
To compute time differences between events coming from different places in the world, a UTC timestamp is enough, but you lose the ability to analyze at what time of day events occurred (e.g. for web analytics, you may want to know when users come to your site in their local time: do you see more users in the morning or the evening? You can't figure that out without time-of-day information).
Timezone offset in a date string:
Another important point is that the timezone offset in a date string is not fixed. Just because 2017-08-30T10:55:24-0800 says the offset is -0800, or 8 hours back, doesn't mean that it will always be!
In the summer the same place may well be on daylight saving time, and the offset would be -0700.
What that means is that a timezone offset (+0100) is not the same as a timezone name (Europe/Paris) or even a timezone designation (CET).
America/Los_Angeles timezone is a place in the world, but it turns into PST (Pacific Standard Time) timezone offset notation in the winter, and PDT (Pacific Daylight Time) in the summer.
So, on top of getting the timezone offset from the datestring, you should also get the timezone name to be accurate.
Most packages will be able to convert numeric offsets from daylight saving time to standard time on their own, but that is not necessarily trivial with just offset. For example WAT timezone designation in West Africa, is UTC+0100 just like CET timezone in France, but France observes daylight saving time, while West Africa does not (because they're close to the equator)
So, in short, it's complicated. VERY complicated, and that's why you should not do this yourself, but trust a package that does it for you, and KEEP IT UP TO DATE!
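To make the difference between a zone name and an offset concrete, here is a small sketch. It assumes Python 3.9+ with the standard-library zoneinfo module and an available system time zone database, neither of which is mentioned in the answer above; the two dates are arbitrary examples.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9

la = ZoneInfo("America/Los_Angeles")

# Same zone name, two different dates: the offset and designation change.
winter = datetime(2017, 1, 15, 12, 0, tzinfo=la)
summer = datetime(2017, 8, 30, 12, 0, tzinfo=la)

print(winter.tzname(), winter.utcoffset())
print(summer.tzname(), summer.utcoffset())
```

The zone name carries the DST rules, which is why storing only a numeric offset loses information.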

There is indeed a problem with using utcfromtimestamp and specifying time zones. A nice example/explanation is available on the following question:
How to specify time zone (UTC) when converting to Unix time? (Python)

The accepted answer does not seem to work for me. My solution:
import time
from datetime import datetime
utc_0 = int(time.mktime(datetime(1970, 1, 1).timetuple()))
def datetime2ts(dt):
    """Converts a datetime object to UTC timestamp"""
    return int(time.mktime(dt.utctimetuple())) - utc_0

Simplest way:
>>> from datetime import datetime
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0)
>>> dt.strftime("%s")
'1199163600'
Edit: @Daniel is correct, this would convert it to the machine's timezone. Here is a revised answer:
>>> from datetime import datetime, timezone
>>> epoch = datetime(1970, 1, 1, 0, 0, 0, 0, timezone.utc)
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0, timezone.utc)
>>> int((dt-epoch).total_seconds())
1199145600
In fact, it's not even necessary to specify timezone.utc, because the time difference is the same so long as both datetimes have the same timezone (or no timezone).
>>> from datetime import datetime
>>> epoch = datetime(1970, 1, 1, 0, 0, 0, 0)
>>> dt = datetime(2008, 1, 1, 0, 0, 0, 0)
>>> int((dt-epoch).total_seconds())
1199145600
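The condition that both datetimes be of the same kind matters: Python refuses to subtract a naive datetime from an aware one. A minimal sketch of both cases, with variable names chosen only for illustration:

```python
from datetime import datetime, timezone

naive_epoch = datetime(1970, 1, 1)
aware_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
naive_dt = datetime(2008, 1, 1)
aware_dt = datetime(2008, 1, 1, tzinfo=timezone.utc)

# Naive minus naive and aware minus aware give the same difference.
naive_seconds = int((naive_dt - naive_epoch).total_seconds())
aware_seconds = int((aware_dt - aware_epoch).total_seconds())

# Mixing the two kinds raises TypeError instead of guessing.
try:
    aware_dt - naive_epoch
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(naive_seconds, aware_seconds, mixing_allowed)  # 1199145600 1199145600 False
```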

I think the correct way to phrase your question is
Is there a way to get the timestamp by specifying the date in UTC?, because timestamp is just a number which is absolute, not relative. The relative (or timezone aware) piece is the date.
I find pandas very convenient for timestamps, so:
import pandas as pd
from datetime import datetime
dt1 = datetime(2008, 1, 1, 0, 0, 0, 0)
ts1 = pd.Timestamp(dt1, tz='utc').timestamp()
# make sure you get back dt1
datetime.utcfromtimestamp(ts1)
