Convert an RFC 3339 time to a standard Python timestamp - python

Is there an easy way to convert an RFC 3339 time into a regular Python timestamp?
I've got a script which is reading an ATOM feed and I'd like to be able to compare the timestamp of an item in the ATOM feed to the modification time of a file.
I notice from the ATOM spec, that ATOM dates include a time zone offset (Z<a number>) but, in my case, there's nothing after the Z so I guess we can assume GMT.
I suppose I could parse the time with a regex of some sort but I was hoping Python had a built-in way of doing it that I just haven't been able to find.

You don't include an example, but if you don't have a Z-offset or timezone, and assuming you don't want durations but just the basic time, then maybe this will suit you:
>>> import datetime as dt
>>> dt.datetime.strptime('1985-04-12T23:20:50.52', '%Y-%m-%dT%H:%M:%S.%f')
datetime.datetime(1985, 4, 12, 23, 20, 50, 520000)
The strptime() function was added to the datetime module in Python 2.5 so some people don't yet know it's there.
Edit: The time.strptime() function has existed for a while though, and works about the same to give you a struct_time value:
>>> import time
>>> ts = time.strptime('1985-04-12T23:20:50.52', '%Y-%m-%dT%H:%M:%S.%f')
>>> ts
time.struct_time(tm_year=1985, tm_mon=4, tm_mday=12, tm_hour=23, tm_min=20, tm_sec=50, tm_wday=4, tm_yday=102, tm_isdst=-1)
>>> time.mktime(ts)
482210450.0
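One caveat with the snippet above: time.mktime() interprets the struct_time as local time, so the float it returns depends on the machine's time zone. If the feed timestamps are UTC (a bare Z), calendar.timegm() is the UTC counterpart. Here is a minimal sketch, assuming the input always ends in Z; the helper name rfc3339_utc_to_epoch is made up for illustration:

```python
import calendar
import time


def rfc3339_utc_to_epoch(stamp):
    """Convert 'YYYY-MM-DDTHH:MM:SS[.fff]Z' to POSIX seconds (hypothetical helper)."""
    if not stamp.endswith("Z"):
        raise ValueError("this sketch only handles UTC timestamps ending in 'Z'")
    whole, _, frac = stamp[:-1].partition(".")
    # timegm() treats the struct_time as UTC, unlike mktime() (local time).
    seconds = calendar.timegm(time.strptime(whole, "%Y-%m-%dT%H:%M:%S"))
    # Re-attach the fractional part, if there was one.
    return seconds + (float("0." + frac) if frac else 0.0)
```

The result can be compared directly with os.path.getmtime(), which also returns seconds since the epoch.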

I struggled with RFC3339 datetime format a lot, but I found a suitable solution to convert date_string <=> datetime_object in both directions.
You need two different external modules, because (unfortunately) each of them only does the conversion in one direction:
first install:
sudo pip install rfc3339
sudo pip install iso8601
then include:
import datetime # for general datetime object handling
import rfc3339 # for date object -> date string
import iso8601 # for date string -> date object
So that I don't have to remember which module handles which direction, I wrote two simple helper functions:
def get_date_object(date_string):
    return iso8601.parse_date(date_string)
def get_date_string(date_object):
    return rfc3339.rfc3339(date_object)
which you can use in your code like this:
input_string = '1989-01-01T00:18:07-05:00'
test_date = get_date_object(input_string)
# >>> datetime.datetime(1989, 1, 1, 0, 18, 7, tzinfo=<FixedOffset '-05:00' datetime.timedelta(-1, 68400)>)
test_string = get_date_string(test_date)
# >>> '1989-01-01T00:18:07-05:00'
test_string == input_string # >>> True
Heureka! Now you can easily (haha) use your date strings and date objects in a usable format.
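For comparison, on Python 3.7 or later the same round trip can be sketched with the standard library alone, at least for strings with a numeric offset like the one above. This is an alternative to the two packages, not what they do internally. Note that isoformat() writes UTC as +00:00 rather than Z, and fromisoformat() only accepts a trailing Z from Python 3.11 on.

```python
from datetime import datetime, timedelta

input_string = "1989-01-01T00:18:07-05:00"

# String -> aware datetime (the offset is kept as a fixed-offset tzinfo).
parsed = datetime.fromisoformat(input_string)

# Aware datetime -> string, in the same "+hh:mm" offset notation.
round_tripped = parsed.isoformat()

print(parsed.utcoffset() == timedelta(hours=-5))
print(round_tripped == input_string)
```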

No builtin, afaik.
feed.date.rfc3339
This is a Python library module with functions for converting timestamp strings in RFC 3339 format to Python time float values, and vice versa. RFC 3339 is the timestamp format used by the Atom feed syndication format.
It is BSD-licensed.
http://home.blarg.net/~steveha/pyfeed.html
(Edited so it's clear I didn't write it. :-)

The new datetime.fromisoformat(date_string) method which was added in Python 3.7 will parse most RFC 3339 timestamps, including those with time zone offsets. It's not a full implementation, so be sure to test your use case.
>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('2011-11-04T00:05:23')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283+00:00')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('2011-11-04T00:05:23+04:00')
datetime.datetime(2011, 11, 4, 0, 5, 23,
tzinfo=datetime.timezone(datetime.timedelta(seconds=14400)))
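To tie this back to the original question (comparing a feed item's timestamp with a file's modification time), here is a rough sketch built on fromisoformat(). The function name feed_item_is_newer is hypothetical, and treating a missing offset as UTC is an assumption based on the question, not something RFC 3339 prescribes.

```python
import os
from datetime import datetime, timezone


def feed_item_is_newer(item_updated, path):
    """Return True if the feed timestamp is later than the file's mtime.

    Hypothetical helper; assumes a bare trailing 'Z' means UTC.
    """
    if item_updated.endswith("Z"):
        # fromisoformat() only accepts 'Z' from Python 3.11 on.
        item_updated = item_updated[:-1] + "+00:00"
    item_dt = datetime.fromisoformat(item_updated)
    if item_dt.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        item_dt = item_dt.replace(tzinfo=timezone.utc)
    # Both sides are now seconds since the epoch.
    return item_dt.timestamp() > os.path.getmtime(path)
```

Before Python 3.11, fromisoformat() is picky about the number of fractional digits, so test it against real feed data first.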

If you're using Django, you could use Django's function parse_datetime:
>>> from django.utils.dateparse import parse_datetime
>>> parse_datetime("2016-07-19T07:30:36+05:00")
datetime.datetime(2016, 7, 19, 7, 30, 36, tzinfo=<django.utils.timezone.FixedOffset object at 0x101c0c1d0>)

http://pypi.python.org/pypi/iso8601/ seems to be able to parse iso 8601, which RFC 3339 is a subset of, maybe this could be useful, but again, not built-in.

The simplest solution for me has been the dateutil library (the third-party python-dateutil package; it is not part of the standard library).
from dateutil.parser import parse
dt = "2020-11-23T11:08:23.022277705Z"
print(parse(dt))
Output:
2020-11-23 11:08:23.022277+00:00
If you don't need the timezone element, simply set tzinfo to None:
print(parse(dt).replace(tzinfo=None))
The output is a nice and clean datetime object:
2020-11-23 11:08:23.022277

http://bugs.python.org/issue15873 (duplicate of http://bugs.python.org/issue5207 )
Looks like there isn't a built-in as of yet.

feedparser.py provides robust/extensible way to parse various date formats that may be encountered in real-world atom/rss feeds:
>>> from feedparser import _parse_date as parse_date
>>> parse_date('1985-04-12T23:20:50.52Z')
time.struct_time(tm_year=1985, tm_mon=4, tm_mday=12, tm_hour=23, tm_min=20,
tm_sec=50, tm_wday=4, tm_yday=102, tm_isdst=1)

try this, it works fine for me
datetime_obj = datetime.strptime("2014-01-01T00:00:00Z", '%Y-%m-%dT%H:%M:%SZ')
or
datetime_obj = datetime.strptime("Mon, 01 Jun 2015 16:41:40 GMT", '%a, %d %b %Y %H:%M:%S GMT')

Came across the awesome dateutil.parser module in another question, and tried it on my RFC3339 problem, and it appears to handle everything I throw at it with more sanity than any of the other responses in this question.

Using Python 3, you can use RegEx to break the RFC 3339 timestamp into its components.
Then, directly create the datetime object, no additional modules needed:
import re
import datetime
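def parse_rfc3339(dt):
    broken = re.search(r'([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})(\.([0-9]+))?(Z|([+-])([0-9]{2}):([0-9]{2}))', dt)
    # Pad or truncate the fraction to six digits so ".52" becomes 520000 microseconds
    fraction = (broken.group(8) or "0").ljust(6, "0")[:6]
    # Apply the sign to the whole offset, not only to the hours
    sign = -1 if broken.group(10) == "-" else 1
    offset = sign * datetime.timedelta(
        hours = int(broken.group(11) or "0"),
        minutes = int(broken.group(12) or "0"))
    return datetime.datetime(
        year = int(broken.group(1)),
        month = int(broken.group(2)),
        day = int(broken.group(3)),
        hour = int(broken.group(4)),
        minute = int(broken.group(5)),
        second = int(broken.group(6)),
        microsecond = int(fraction),
        tzinfo = datetime.timezone(offset))
This example treats a Z time zone or missing fractional seconds as "0", but it needs additional error checking (for instance, re.search returns None when the string does not match).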
Cheers, Alex
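One detail that is easy to get wrong in any hand-rolled parser is the fractional part: ".52" means 520000 microseconds, not 52, and fractions longer than six digits do not fit into a datetime at all. A small sketch of that conversion on its own (the helper name is made up, and truncating rather than rounding is a design choice of this sketch):

```python
def fraction_to_microseconds(digits):
    """Map the digits after the decimal point to microseconds.

    Hypothetical helper: pads short fractions with zeros and truncates
    (does not round) anything finer than a microsecond.
    """
    return int(digits.ljust(6, "0")[:6]) if digits else 0


print(fraction_to_microseconds("52"))
print(fraction_to_microseconds("553472514"))
```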

You could use a Google API Core package. They have a really straightforward Datetime to RFC 3339 conversion function. You can find more info in their docs.
Its usage is as simple as:
from google.api_core.datetime_helpers import to_rfc3339
rfc3339_str = to_rfc3339(datetime.now())
They even have a function that works the other way around from_rfc3339 and from_rfc3339_nanos.

rfc3339 library: http://henry.precheur.org/python/rfc3339

I have been doing a deep dive into datetimes and RFC 3339, recently came across the arrow library, and have just used it to solve my problem:
import arrow
date_string = "2015-11-24 00:00:00+00:00"
my_datetime = arrow.get(date_string).datetime

Related

strptime time data does not match format [duplicate]

I need to parse RFC 3339 strings like "2008-09-03T20:56:35.450686Z" into Python's datetime type.
I have found strptime in the Python standard library, but it is not very convenient.
What is the best way to do this?
isoparse function from python-dateutil
The python-dateutil package has dateutil.parser.isoparse to parse not only RFC 3339 datetime strings like the one in the question, but also other ISO 8601 date and time strings that don't comply with RFC 3339 (such as ones with no UTC offset, or ones that represent only a date).
>>> import dateutil.parser
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686Z') # RFC 3339 format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
>>> dateutil.parser.isoparse('2008-09-03T20:56:35.450686') # ISO 8601 extended format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903T205635.450686') # ISO 8601 basic format
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
>>> dateutil.parser.isoparse('20080903') # ISO 8601 basic format, date only
datetime.datetime(2008, 9, 3, 0, 0)
The python-dateutil package also has dateutil.parser.parse. Compared with isoparse, it is presumably less strict, but both of them are quite forgiving and will attempt to interpret the string that you pass in. If you want to eliminate the possibility of any misreads, you need to use something stricter than either of these functions.
Comparison with Python 3.7+’s built-in datetime.datetime.fromisoformat
dateutil.parser.isoparse is a full ISO-8601 format parser, but in Python ≤ 3.10 fromisoformat is deliberately not. In Python 3.11, fromisoformat supports almost all strings in valid ISO 8601. See fromisoformat's docs for this cautionary caveat. (See this answer).
The datetime standard library has, since Python 3.7, a function for inverting datetime.isoformat().
classmethod datetime.fromisoformat(date_string):
Return a datetime corresponding to a date_string in any valid ISO 8601 format, with the following exceptions:
Time zone offsets may have fractional seconds.
The T separator may be replaced by any single unicode character.
Ordinal dates are not currently supported.
Fractional hours and minutes are not supported.
Examples:
>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('20111104')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('2011-11-04T00:05:23')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-11-04T00:05:23Z')
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('20111104T000523')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-W01-2T00:05:23.283')
datetime.datetime(2011, 1, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283+00:00')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('2011-11-04T00:05:23+04:00')
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone(datetime.timedelta(seconds=14400)))
New in version 3.7.
Changed in version 3.11: Previously, this method only supported formats that could be emitted by date.isoformat() or datetime.isoformat().
Be sure to read the caution from the docs if you haven't upgraded to Python 3.11 yet!
Note in Python 2.6+ and Py3K, the %f character catches microseconds.
>>> datetime.datetime.strptime("2008-09-03T20:56:35.450686Z", "%Y-%m-%dT%H:%M:%S.%fZ")
See issue here
As of Python 3.7, you can basically (caveats below) get away with using datetime.datetime.strptime to parse RFC 3339 datetimes, like this:
from datetime import datetime
def parse_rfc3339(datetime_str: str) -> datetime:
    try:
        return datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S.%f%z")
    except ValueError:
        # Perhaps the datetime has a whole number of seconds with no decimal
        # point. In that case, this will work:
        return datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S%z")
It's a little awkward, since we need to try two different format strings in order to support both datetimes with a fractional number of seconds (like 2022-01-01T12:12:12.123Z) and those without (like 2022-01-01T12:12:12Z), both of which are valid under RFC 3339. But as long as we do that single fiddly bit of logic, this works.
Some caveats to note about this approach:
It technically doesn't fully support RFC 3339, since RFC 3339 bizarrely lets you use a space instead of a T to separate the date from the time, even though RFC 3339 purports to be a profile of ISO 8601 and ISO 8601 does not allow this. If you want to support this silly quirk of RFC 3339, you could add datetime_str = datetime_str.replace(' ', 'T') to the start of the function.
My implementation above is slightly more permissive than a strict RFC 3339 parser should be, since it will allow timezone offsets like +0500 without a colon, which RFC 3339 does not support. If you don't merely want to parse known-to-be-RFC-3339 datetimes but also want to rigorously validate that the datetime you're getting is RFC 3339, use another approach or add in your own logic to validate the timezone offset format.
This function definitely doesn't support all of ISO 8601, which includes a much wider array of formats than RFC 3339. (e.g. 2009-W01-1 is a valid ISO 8601 date.)
It does not work in Python 3.6 or earlier, since in those old versions the %z specifier only matches timezones offsets like +0500 or -0430 or +0000, not RFC 3339 timezone offsets like +05:00 or -04:30 or Z.
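If the permissiveness described in the caveats above matters, one option is to check the overall shape with a regular expression before handing the string to strptime. The following is only a sketch: the pattern is my own reading of the RFC 3339 grammar, the name parse_rfc3339_strict is hypothetical, and lower-case t/z (which the RFC also permits) are deliberately not handled. It requires Python 3.7+ for the same %z reasons as above, and fractions longer than six digits are still rejected by %f.

```python
import re
from datetime import datetime

# Assumed shape: date, 'T' or a space, time, optional fraction, then either
# 'Z' or a numeric offset that must contain a colon.
_RFC3339_SHAPE = re.compile(
    r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})"
)


def parse_rfc3339_strict(text):
    """Parse text, rejecting offsets such as +0500 that lack a colon."""
    if not _RFC3339_SHAPE.fullmatch(text):
        raise ValueError("not an RFC 3339 date-time: %r" % (text,))
    # Normalise the optional space separator so one format string suffices.
    text = text.replace(" ", "T")
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if "." in text else "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(text, fmt)
```

Choosing the format string from the presence of a decimal point also avoids the try/except fallback.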
Try the iso8601 module; it does exactly this.
There are several other options mentioned on the WorkingWithTime page on the python.org wiki.
Starting from Python 3.7, strptime supports colon delimiters in UTC offsets (source). So you can then use:
import datetime
def parse_date_string(date_string: str) -> datetime.datetime:
    try:
        return datetime.datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S.%f%z')
    except ValueError:
        return datetime.datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S%z')
EDIT:
As pointed out by Martijn, if you created the datetime object using isoformat(), you can simply use datetime.fromisoformat().
EDIT 2:
As pointed out by Mark Amery, I added a try..except block to account for missing fractional seconds.
Python >= 3.11
fromisoformat now parses Z directly:
from datetime import datetime
s = "2008-09-03T20:56:35.450686Z"
datetime.fromisoformat(s)
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=datetime.timezone.utc)
Python 3.7 to 3.10
A simple option from one of the comments: replace 'Z' with '+00:00' - and use fromisoformat:
from datetime import datetime
s = "2008-09-03T20:56:35.450686Z"
datetime.fromisoformat(s.replace('Z', '+00:00'))
# datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=datetime.timezone.utc)
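A small variation, in case it is useful: str.replace() substitutes every Z in the string, whereas only a trailing one is a UTC designator. A well-formed timestamp should not contain another Z, so this is purely defensive. The wrapper name fromisoformat_z is made up for this sketch.

```python
from datetime import datetime


def fromisoformat_z(text):
    """fromisoformat() wrapper that maps only a trailing 'Z' to '+00:00'."""
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


print(fromisoformat_z("2008-09-03T20:56:35.450686Z").isoformat())
```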
Why prefer fromisoformat?
Although strptime's %z can parse the 'Z' character to UTC, fromisoformat is faster by ~ x40 (see also: A faster strptime):
%timeit datetime.fromisoformat(s.replace('Z', '+00:00'))
388 ns ± 48.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit dateutil.parser.isoparse(s)
11 µs ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%f%z')
15.8 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit dateutil.parser.parse(s)
87.8 µs ± 8.54 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
(Python 3.9.12 x64 on Windows 10)
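The %timeit lines above are IPython magics. To get comparable numbers from a plain Python script, something like the following sketch with the timeit module should do. It only covers the two standard-library candidates, and the absolute figures will differ from machine to machine.

```python
import timeit
from datetime import datetime

s = "2008-09-03T20:56:35.450686Z"
candidates = {
    "fromisoformat": lambda: datetime.fromisoformat(s.replace("Z", "+00:00")),
    "strptime": lambda: datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f%z"),
}

results = {}
for name, func in candidates.items():
    # Best of 3 repeats of 10,000 calls each, reported per call in microseconds.
    best = min(timeit.repeat(func, number=10_000, repeat=3))
    results[name] = best / 10_000 * 1e6
    print("%-14s %.2f µs per call" % (name, results[name]))
```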
What is the exact error you get? Is it like the following?
>>> datetime.datetime.strptime("2008-08-12T12:20:30.656234Z", "%Y-%m-%dT%H:%M:%S.Z")
ValueError: time data did not match format: data=2008-08-12T12:20:30.656234Z fmt=%Y-%m-%dT%H:%M:%S.Z
If yes, you can split your input string on ".", and then add the microseconds to the datetime you got.
Try this:
>>> def gt(dt_str):
...     dt, _, us = dt_str.partition(".")
...     dt = datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
...     us = int(us.rstrip("Z"), 10)
...     return dt + datetime.timedelta(microseconds=us)
>>> gt("2008-08-12T12:20:30.656234Z")
datetime.datetime(2008, 8, 12, 12, 20, 30, 656234)
import re
import datetime
s = "2008-09-03T20:56:35.450686Z"
d = datetime.datetime(*map(int, re.split(r'[^\d]', s)[:-1]))
These days, Arrow can also be used as a third-party solution:
>>> import arrow
>>> date = arrow.get("2008-09-03T20:56:35.450686Z")
>>> date.datetime
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
Just use the python-dateutil module:
>>> import dateutil.parser as dp
>>> t = '1984-06-02T19:05:00.000Z'
>>> parsed_t = dp.parse(t)
>>> parsed_t
datetime.datetime(1984, 6, 2, 19, 5, tzinfo=tzutc())
Documentation
I have found ciso8601 to be the fastest way to parse ISO 8601 timestamps.
It also has full support for RFC 3339, and a dedicated function for strict parsing RFC 3339 timestamps.
Example usage:
>>> import ciso8601
>>> ciso8601.parse_datetime('2014-01-09T21')
datetime.datetime(2014, 1, 9, 21, 0)
>>> ciso8601.parse_datetime('2014-01-09T21:48:00.921000+05:30')
datetime.datetime(2014, 1, 9, 21, 48, 0, 921000, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))
>>> ciso8601.parse_rfc3339('2014-01-09T21:48:00.921000+05:30')
datetime.datetime(2014, 1, 9, 21, 48, 0, 921000, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))
The GitHub Repo README shows their speedup versus all of the other libraries listed in the other answers.
My personal project involved a lot of ISO 8601 parsing. It was nice to be able to just switch the call and go faster. :)
Edit: I have since become a maintainer of ciso8601. It's now faster than ever!
If you are working with Django, it provides the dateparse module that accepts a bunch of formats similar to ISO format, including the time zone.
If you are not using Django and you don't want to use one of the other libraries mentioned here, you could probably adapt the Django source code for dateparse to your project.
If you don't want to use dateutil, you can try this function:
def from_utc(utcTime, fmt="%Y-%m-%dT%H:%M:%S.%fZ"):
    """
    Convert a UTC time string to a (naive) datetime.datetime
    """
    return datetime.datetime.strptime(utcTime, fmt)
Test:
from_utc("2007-03-04T21:08:12.123Z")
Result:
datetime.datetime(2007, 3, 4, 21, 8, 12, 123000)
I've coded up a parser for the ISO 8601 standard and put it on GitHub: https://github.com/boxed/iso8601. This implementation supports everything in the specification except for durations, intervals, periodic intervals, and dates outside the supported date range of Python's datetime module.
Tests are included! :P
This works for stdlib on Python 3.2 onwards (assuming all the timestamps are UTC):
from datetime import datetime, timezone, timedelta
datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
tzinfo=timezone(timedelta(0)))
For example,
>>> datetime.utcnow().replace(tzinfo=timezone(timedelta(0)))
... datetime.datetime(2015, 3, 11, 6, 2, 47, 879129, tzinfo=datetime.timezone.utc)
I'm the author of iso8601utils. It can be found on GitHub or on PyPI. Here's how you can parse your example:
>>> from iso8601utils import parsers
>>> parsers.datetime('2008-09-03T20:56:35.450686Z')
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
One straightforward way to convert an ISO 8601-like date string to a UNIX timestamp or datetime.datetime object in all supported Python versions without installing third-party modules is to use the date parser of SQLite.
#!/usr/bin/env python
from __future__ import with_statement, division, print_function
import sqlite3
import datetime
testtimes = [
"2016-08-25T16:01:26.123456Z",
"2016-08-25T16:01:29",
]
db = sqlite3.connect(":memory:")
c = db.cursor()
for timestring in testtimes:
    c.execute("SELECT strftime('%s', ?)", (timestring,))
    converted = c.fetchone()[0]
    print("%s is %s after epoch" % (timestring, converted))
    dt = datetime.datetime.fromtimestamp(int(converted))
    print("datetime is %s" % dt)
Output:
2016-08-25T16:01:26.123456Z is 1472140886 after epoch
datetime is 2016-08-25 12:01:26
2016-08-25T16:01:29 is 1472140889 after epoch
datetime is 2016-08-25 12:01:29
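Note that the output above shows 12:01 rather than 16:01 because datetime.fromtimestamp() without a tz argument converts to the machine's local time zone. A sketch of the same idea that returns an aware UTC datetime instead (Python 3 only; the helper name sqlite_parse_utc is hypothetical, and fractional seconds are dropped because '%s' yields whole seconds):

```python
import sqlite3
from datetime import datetime, timezone


def sqlite_parse_utc(timestring):
    """Parse via SQLite and return an aware UTC datetime (whole seconds only)."""
    db = sqlite3.connect(":memory:")
    try:
        row = db.execute("SELECT strftime('%s', ?)", (timestring,)).fetchone()
    finally:
        db.close()
    if row[0] is None:
        # SQLite returns NULL for strings it cannot interpret as a date.
        raise ValueError("SQLite could not parse %r" % (timestring,))
    return datetime.fromtimestamp(int(row[0]), tz=timezone.utc)
```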
Another way is to use a specialized ISO 8601 parser: the isoparse function of the dateutil parser:
from dateutil import parser
date = parser.isoparse("2008-09-03T20:56:35.450686+01:00")
print(date)
Output:
2008-09-03 20:56:35.450686+01:00
This function is also mentioned in the documentation for the standard Python function datetime.fromisoformat:
A more full-featured ISO 8601 parser, dateutil.parser.isoparse is
available in the third-party package dateutil.
Django's parse_datetime() function supports dates with UTC offsets:
parse_datetime('2016-08-09T15:12:03.65478Z') =
datetime.datetime(2016, 8, 9, 15, 12, 3, 654780, tzinfo=<UTC>)
So it could be used for parsing ISO 8601 dates in fields within entire project:
from django.utils import formats
from django.forms.fields import DateTimeField
from django.utils.dateparse import parse_datetime
class DateTimeFieldFixed(DateTimeField):
    def strptime(self, value, format):
        if format == 'iso-8601':
            return parse_datetime(value)
        return super().strptime(value, format)
DateTimeField.strptime = DateTimeFieldFixed.strptime
formats.ISO_INPUT_FORMATS['DATETIME_INPUT_FORMATS'].insert(0, 'iso-8601')
If pandas is used anyway, I can recommend Timestamp from pandas. There you can simply write:
import pandas as pd
ts_1 = pd.Timestamp('2020-02-18T04:27:58.000Z')
ts_2 = pd.Timestamp('2020-02-18T04:27:58.000')
Rant: It is just unbelievable that we still need to worry about things like date string parsing in 2021.
ISO 8601 allows many variations of optional colons and dashes, basically CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. If you want to use strptime, you need to strip out those variations first.
The goal is to generate a utc datetime object.
If you just want a basic case that works for UTC with the Z suffix like 2016-06-29T19:36:29.3453Z:
datetime.datetime.strptime(timestamp.replace(':', '').replace('-', ''), "%Y%m%dT%H%M%S.%fZ")
If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00 use the following. These will convert all variations into something without variable delimiters like 20080903T205635.450686+0500 making it more consistent/easier to parse.
import re
# this regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z" )
If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z') then you need to manually offset the time from Z (UTC). Note %z may not work on your system in python versions < 3 as it depended on the c library support which varies across system/python build type (i.e. Jython, Cython, etc.).
import re
import datetime
# this regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
# split on the offset to remove it. use a capture group to keep the delimiter
split_timestamp = re.split(r"([+-])", conformed_timestamp)
# drop a trailing Z so that exactly one is re-appended below
main_timestamp = split_timestamp[0].rstrip("Z")
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None
# parse the date and time without the offset
output_datetime = datetime.datetime.strptime(main_timestamp + "Z", "%Y%m%dT%H%M%S.%fZ")
if offset:
    # create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign+offset[:-2]), minutes=int(sign+offset[-2:]))
    # local time = UTC + offset, so subtract the offset to get UTC
    output_datetime = output_datetime - offset_delta
Nowadays there's Maya: Datetimes for Humans™, from the author of the popular Requests: HTTP for Humans™ package:
>>> import maya
>>> str = '2008-09-03T20:56:35.450686Z'
>>> maya.MayaDT.from_rfc3339(str).datetime()
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=<UTC>)
The python-dateutil will throw an exception if parsing invalid date strings, so you may want to catch the exception.
from dateutil import parser
ds = '2012-60-31'
try:
    dt = parser.parse(ds)
except ValueError:
    print('"%s" is an invalid date' % ds)
For something that works with the 2.X standard library try:
calendar.timegm(time.strptime(date.split(".")[0]+"UTC", "%Y-%m-%dT%H:%M:%S%Z"))
calendar.timegm is the missing gm version of time.mktime.
Thanks to Mark Amery's great answer, I devised a function that tries to account for the ISO datetime formats I needed:
import re
from datetime import datetime, timedelta, tzinfo
class FixedOffset(tzinfo):
    """Fixed offset in minutes: `time = utc_time + utc_offset`."""
    def __init__(self, offset):
        self.__offset = timedelta(minutes=offset)
        hours, minutes = divmod(offset, 60)
        #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
        # that have the opposite sign in the name;
        # the corresponding numeric value is not used e.g., no minutes
        self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
    def utcoffset(self, dt=None):
        return self.__offset
    def tzname(self, dt=None):
        return self.__name
    def dst(self, dt=None):
        return timedelta(0)
    def __repr__(self):
        return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)
    def __getinitargs__(self):
        return (self.__offset.total_seconds()/60,)
def parse_isoformat_datetime(isodatetime):
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S.%f')
    except ValueError:
        pass
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S')
    except ValueError:
        pass
    pat = r'(.*?[+-]\d{2}):(\d{2})'
    temp = re.sub(pat, r'\1\2', isodatetime)
    naive_date_str = temp[:-5]
    offset_str = temp[-5:]
    naive_dt = datetime.strptime(naive_date_str, '%Y-%m-%dT%H:%M:%S.%f')
    offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
    if offset_str[0] == "-":
        offset = -offset
    return naive_dt.replace(tzinfo=FixedOffset(offset))
datetime.fromisoformat() is improved in Python 3.11 to parse most ISO 8601 formats
datetime.fromisoformat() can now be used to parse most ISO 8601 formats, barring only those that support fractional hours and minutes. Previously, this method only supported formats that could be emitted by datetime.isoformat().
>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04T00:05:23Z')
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('20111104T000523')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-W01-2T00:05:23.283')
datetime.datetime(2011, 1, 4, 0, 5, 23, 283000)
Initially I tried with:
from operator import neg, pos
from time import strptime, mktime
from datetime import datetime, tzinfo, timedelta
class MyUTCOffsetTimezone(tzinfo):
    @staticmethod
    def with_offset(offset_no_signal, signal):  # type: (str, str) -> MyUTCOffsetTimezone
        return MyUTCOffsetTimezone((pos if signal == '+' else neg)(
            (datetime.strptime(offset_no_signal, '%H:%M') - datetime(1900, 1, 1))
            .total_seconds()))
    def __init__(self, offset, name=None):
        self.offset = timedelta(seconds=offset)
        self.name = name or self.__class__.__name__
    def utcoffset(self, dt):
        return self.offset
    def tzname(self, dt):
        return self.name
    def dst(self, dt):
        return timedelta(0)
def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        dt, sign, offset = strptime(dt[:-6], fmt), dt[-6], dt[-5:]
        return datetime.fromtimestamp(mktime(dt),
                                      tz=MyUTCOffsetTimezone.with_offset(offset, sign))
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)
But that didn't work on negative timezones. This however I got working fine, in Python 3.7.3:
from datetime import datetime
def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        return datetime.strptime(dt, fmt + '%z')
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)
Some tests; note that the output only differs in the precision of the microseconds. I got 6 digits of precision on my machine, but YMMV:
for dt_in, dt_out in (
        ('2019-03-11T08:00:00.000Z', '2019-03-11T08:00:00'),
        ('2019-03-11T08:00:00.000+11:00', '2019-03-11T08:00:00+11:00'),
        ('2019-03-11T08:00:00.000-11:00', '2019-03-11T08:00:00-11:00')
):
    isoformat = to_datetime_tz(dt_in).isoformat()
    assert isoformat == dt_out, '{} != {}'.format(isoformat, dt_out)
def parseISO8601DateTime(datetimeStr):
    import time
    from datetime import datetime, timedelta
    def log_date_string(when):
        gmt = time.gmtime(when)
        if time.daylight and gmt[8]:
            tz = time.altzone
        else:
            tz = time.timezone
        if tz > 0:
            neg = 1
        else:
            neg = 0
            tz = -tz
        h, rem = divmod(tz, 3600)
        m, rem = divmod(rem, 60)
        if neg:
            offset = '-%02d%02d' % (h, m)
        else:
            offset = '+%02d%02d' % (h, m)
        return time.strftime('%d/%b/%Y:%H:%M:%S ', gmt) + offset
    dt = datetime.strptime(datetimeStr, '%Y-%m-%dT%H:%M:%S.%fZ')
    timestamp = dt.timestamp()
    return dt + timedelta(hours=dt.hour-time.gmtime(timestamp).tm_hour)
Note that if the string does not end with Z, we could parse it using %z instead.

How to parse datetime that ends with `Z`?

I have the following datetime string s:
2017-10-18T04:46:53.553472514Z
I parse it like this:
t = datetime.strptime(s, '%Y-%m-%dT%H:%M:%SZ')
How do I fix ValueError: time data '2017-10-18T04:46:53.553472514Z' does not match format '%Y-%m-%dT%H:%M:%SZ'?
In theory,
t = datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%fZ')
would be the correct format string as you have fractions of second as well. BUT they would then need to be microseconds. Yours are probably nanoseconds, as %f only takes maximum of 6 digits.
So you need to do something like this:
t = datetime.datetime.strptime(s.split(".")[0], '%Y-%m-%dT%H:%M:%S')
t = t + datetime.timedelta(microseconds=int(s.split(".")[1][:-1])/1000)
print (t)
This works but it converts nanoseconds to microseconds. If this is not ok, then you need to do something else.
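An alternative sketch that trims the fraction to six digits first and then lets strptime handle everything, including the trailing Z. The function name is made up; the extra digits are truncated rather than rounded; and the code relies on %z accepting Z and +hh:mm, which needs Python 3.7+.

```python
import re
from datetime import datetime


def parse_nanosecond_timestamp(s):
    """Parse an RFC 3339 string whose fraction may exceed six digits.

    Hypothetical helper; digits beyond microseconds are truncated.
    """
    # Keep the first six fractional digits and drop any that follow.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", s)
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z" if "." in trimmed else "%Y-%m-%dT%H:%M:%S%z"
    return datetime.strptime(trimmed, fmt)
```

Unlike the split-based version, the result is timezone-aware (UTC for a Z suffix).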
I think you should use dateutil.parser module
In [23]: s = "2017-10-18T04:46:53.553472514Z"
In [24]: import dateutil.parser as p
In [25]: p.parse(s)
Out[25]: datetime.datetime(2017, 10, 18, 4, 46, 53, 553472, tzinfo=tzutc())
As @FObersteiner pointed out, %z parses Z as well as the ±HHMM[SS[.ffffff]] style of timezone encoding.
This information is hidden in the technical detail #6. :-/
Previous incorrect answer:
To answer the original question: there's no way to parse "Z" as the timezone using only the standard library.
Here's some discussion on that: https://bugs.python.org/issue35829
I like the mx.DateTime module.
import mx.DateTime as dt
dt.DateTimeFrom('2017-10-18T04:46:53.553472514Z')
datetime is another Python built-in module that is completely superseded by a better extra module: arrow.
You can just do:
import arrow
dt = arrow.get('2017-10-18T04:46:53.553472514Z').datetime
Since your date string is in standardized ISO format, it will be parsed without even giving a format string. The parsed datetime will be:
datetime.datetime(2017, 10, 18, 4, 46, 53, 553472, tzinfo=tzutc())
Or keep the Arrow object in case you want to keep the extra precision.

Python: Issue converting ISO 8601 to datetime object [duplicate]

from datetime import datetime
def parse_rfc3339(datetime_str: str) -> datetime:
    try:
        return datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S.%f%z")
    except ValueError:
        # Perhaps the datetime has a whole number of seconds with no decimal
        # point. In that case, this will work:
        return datetime.strptime(datetime_str, "%Y-%m-%dT%H:%M:%S%z")
It's a little awkward, since we need to try two different format strings in order to support both datetimes with a fractional number of seconds (like 2022-01-01T12:12:12.123Z) and those without (like 2022-01-01T12:12:12Z), both of which are valid under RFC 3339. But as long as we do that single fiddly bit of logic, this works.
Some caveats to note about this approach:
It technically doesn't fully support RFC 3339, since RFC 3339 bizarrely lets you use a space instead of a T to separate the date from the time, even though RFC 3339 purports to be a profile of ISO 8601 and ISO 8601 does not allow this. If you want to support this silly quirk of RFC 3339, you could add datetime_str = datetime_str.replace(' ', 'T') to the start of the function.
My implementation above is slightly more permissive than a strict RFC 3339 parser should be, since it will allow timezone offsets like +0500 without a colon, which RFC 3339 does not support. If you don't merely want to parse known-to-be-RFC-3339 datetimes but also want to rigorously validate that the datetime you're getting is RFC 3339, use another approach or add in your own logic to validate the timezone offset format.
This function definitely doesn't support all of ISO 8601, which includes a much wider array of formats than RFC 3339. (e.g. 2009-W01-1 is a valid ISO 8601 date.)
It does not work in Python 3.6 or earlier, since in those old versions the %z specifier only matches timezone offsets like +0500 or -0430 or +0000, not RFC 3339 timezone offsets like +05:00 or -04:30 or Z.
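The first two caveats could be handled along the following lines. This is only a minimal sketch, assuming Python 3.7+; the name parse_rfc3339_checked and the offset regular expression are illustrative and not taken from any library. It accepts a space separator and rejects offsets that lack a colon:

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical check: the string must end in "Z" or in "+HH:MM" / "-HH:MM".
_OFFSET_RE = re.compile(r"(?:Z|[+-]\d{2}:\d{2})$")


def parse_rfc3339_checked(datetime_str: str) -> datetime:
    if not _OFFSET_RE.search(datetime_str):
        raise ValueError("missing or malformed UTC offset: %r" % datetime_str)
    # RFC 3339 allows a space between date and time; normalise it to "T".
    normalized = datetime_str.replace(" ", "T", 1)
    for fmt in ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z"):
        try:
            return datetime.strptime(normalized, fmt)
        except ValueError:
            continue
    raise ValueError("not an RFC 3339 datetime: %r" % datetime_str)


print(parse_rfc3339_checked("2022-01-01 12:12:12+05:00"))
```

Note that %f accepts at most six fractional digits, so timestamps with nanosecond precision would still need extra handling.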
Try the iso8601 module; it does exactly this.
There are several other options mentioned on the WorkingWithTime page on the python.org wiki.
Starting from Python 3.7, strptime supports colon delimiters in UTC offsets (source). So you can then use:
import datetime
def parse_date_string(date_string: str) -> datetime.datetime:
    try:
        return datetime.datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S.%f%z')
    except ValueError:
        return datetime.datetime.strptime(date_string, '%Y-%m-%dT%H:%M:%S%z')
EDIT:
As pointed out by Martijn, if you created the datetime object using isoformat(), you can simply use datetime.fromisoformat().
EDIT 2:
As pointed out by Mark Amery, I added a try..except block to account for missing fractional seconds.
Python >= 3.11
fromisoformat now parses Z directly:
from datetime import datetime
s = "2008-09-03T20:56:35.450686Z"
datetime.fromisoformat(s)
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=datetime.timezone.utc)
Python 3.7 to 3.10
A simple option from one of the comments: replace 'Z' with '+00:00' - and use fromisoformat:
from datetime import datetime
s = "2008-09-03T20:56:35.450686Z"
datetime.fromisoformat(s.replace('Z', '+00:00'))
# datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=datetime.timezone.utc)
Why prefer fromisoformat?
Although strptime's %z can parse the 'Z' character to UTC, fromisoformat is faster by ~ x40 (see also: A faster strptime):
%timeit datetime.fromisoformat(s.replace('Z', '+00:00'))
388 ns ± 48.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit dateutil.parser.isoparse(s)
11 µs ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%f%z')
15.8 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit dateutil.parser.parse(s)
87.8 µs ± 8.54 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
(Python 3.9.12 x64 on Windows 10)
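Outside IPython, a rough version of this comparison can be reproduced with the standard timeit module. The sketch below assumes Python 3.7+; the helper names and the iteration count are arbitrary, and absolute numbers will differ from machine to machine:

```python
import timeit
from datetime import datetime

s = "2008-09-03T20:56:35.450686Z"


def via_fromisoformat(text):
    # Replacing "Z" keeps this working on Python 3.7 to 3.10 as well.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def via_strptime(text):
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")


n = 10_000
t_iso = timeit.timeit(lambda: via_fromisoformat(s), number=n)
t_strp = timeit.timeit(lambda: via_strptime(s), number=n)
print("fromisoformat: %.2f µs per call" % (t_iso / n * 1e6))
print("strptime:      %.2f µs per call" % (t_strp / n * 1e6))
```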
What is the exact error you get? Is it like the following?
>>> datetime.datetime.strptime("2008-08-12T12:20:30.656234Z", "%Y-%m-%dT%H:%M:%S.Z")
ValueError: time data did not match format: data=2008-08-12T12:20:30.656234Z fmt=%Y-%m-%dT%H:%M:%S.Z
If yes, you can split your input string on ".", and then add the microseconds to the datetime you got.
Try this:
>>> def gt(dt_str):
...     dt, _, us = dt_str.partition(".")
...     dt = datetime.datetime.strptime(dt, "%Y-%m-%dT%H:%M:%S")
...     us = int(us.rstrip("Z"), 10)
...     return dt + datetime.timedelta(microseconds=us)
...
>>> gt("2008-08-12T12:20:30.656234Z")
datetime.datetime(2008, 8, 12, 12, 20, 30, 656234)
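One thing to watch for: gt treats the digits after the dot as a count of microseconds, so it is only right when the fraction has exactly six digits (".5" would become 5 microseconds, not half a second). A possible variant that pads or truncates the fraction is sketched below; the name parse_utc_fraction is made up for this example:

```python
from datetime import datetime


def parse_utc_fraction(dt_str):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fraction]Z' into a naive datetime."""
    head, _, fraction = dt_str.rstrip("Z").partition(".")
    parsed = datetime.strptime(head, "%Y-%m-%dT%H:%M:%S")
    # Pad or truncate the fraction to exactly six digits (microseconds).
    microseconds = int((fraction + "000000")[:6]) if fraction else 0
    return parsed.replace(microsecond=microseconds)


print(parse_utc_fraction("2008-08-12T12:20:30.5Z"))
```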
import re
import datetime
s = "2008-09-03T20:56:35.450686Z"
d = datetime.datetime(*map(int, re.split(r'[^\d]', s)[:-1]))
These days, Arrow can also be used as a third-party solution:
>>> import arrow
>>> date = arrow.get("2008-09-03T20:56:35.450686Z")
>>> date.datetime
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=tzutc())
Just use the python-dateutil module:
>>> import dateutil.parser as dp
>>> t = '1984-06-02T19:05:00.000Z'
>>> parsed_t = dp.parse(t)
>>> parsed_t
datetime.datetime(1984, 6, 2, 19, 5, tzinfo=tzutc())
Documentation
I have found ciso8601 to be the fastest way to parse ISO 8601 timestamps.
It also has full support for RFC 3339, and a dedicated function for strict parsing RFC 3339 timestamps.
Example usage:
>>> import ciso8601
>>> ciso8601.parse_datetime('2014-01-09T21')
datetime.datetime(2014, 1, 9, 21, 0)
>>> ciso8601.parse_datetime('2014-01-09T21:48:00.921000+05:30')
datetime.datetime(2014, 1, 9, 21, 48, 0, 921000, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))
>>> ciso8601.parse_rfc3339('2014-01-09T21:48:00.921000+05:30')
datetime.datetime(2014, 1, 9, 21, 48, 0, 921000, tzinfo=datetime.timezone(datetime.timedelta(seconds=19800)))
The GitHub Repo README shows their speedup versus all of the other libraries listed in the other answers.
My personal project involved a lot of ISO 8601 parsing. It was nice to be able to just switch the call and go faster. :)
Edit: I have since become a maintainer of ciso8601. It's now faster than ever!
If you are working with Django, it provides the dateparse module that accepts a bunch of formats similar to ISO format, including the time zone.
If you are not using Django and you don't want to use one of the other libraries mentioned here, you could probably adapt the Django source code for dateparse to your project.
If you don't want to use dateutil, you can try this function:
import datetime
def from_utc(utcTime, fmt="%Y-%m-%dT%H:%M:%S.%fZ"):
    """
    Convert a UTC time string to a naive datetime.datetime
    """
    return datetime.datetime.strptime(utcTime, fmt)
Test:
from_utc("2007-03-04T21:08:12.123Z")
Result:
datetime.datetime(2007, 3, 4, 21, 8, 12, 123000)
I've coded up a parser for the ISO 8601 standard and put it on GitHub: https://github.com/boxed/iso8601. This implementation supports everything in the specification except for durations, intervals, periodic intervals, and dates outside the supported date range of Python's datetime module.
Tests are included! :P
This works for stdlib on Python 3.2 onwards (assuming all the timestamps are UTC):
from datetime import datetime, timezone, timedelta
datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
    tzinfo=timezone(timedelta(0)))
For example,
>>> datetime.utcnow().replace(tzinfo=timezone(timedelta(0)))
datetime.datetime(2015, 3, 11, 6, 2, 47, 879129, tzinfo=datetime.timezone.utc)
I'm the author of iso8601utils. It can be found on GitHub or on PyPI. Here's how you can parse your example:
>>> from iso8601utils import parsers
>>> parsers.datetime('2008-09-03T20:56:35.450686Z')
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686)
One straightforward way to convert an ISO 8601-like date string to a UNIX timestamp or datetime.datetime object in all supported Python versions without installing third-party modules is to use the date parser of SQLite.
#!/usr/bin/env python
from __future__ import with_statement, division, print_function
import sqlite3
import datetime
testtimes = [
"2016-08-25T16:01:26.123456Z",
"2016-08-25T16:01:29",
]
db = sqlite3.connect(":memory:")
c = db.cursor()
for timestring in testtimes:
    c.execute("SELECT strftime('%s', ?)", (timestring,))
    converted = c.fetchone()[0]
    print("%s is %s after epoch" % (timestring, converted))
    dt = datetime.datetime.fromtimestamp(int(converted))
    print("datetime is %s" % dt)
Output:
2016-08-25T16:01:26.123456Z is 1472140886 after epoch
datetime is 2016-08-25 12:01:26
2016-08-25T16:01:29 is 1472140889 after epoch
datetime is 2016-08-25 12:01:29
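Note that the datetimes printed above are in the local time zone of the machine that ran the script (four hours behind UTC here), because fromtimestamp() without a tz argument converts to local time. If a UTC result is wanted, something like the following sketch (Python 3 only, not part of the original script) may be closer to the intent:

```python
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
row = db.execute(
    "SELECT strftime('%s', ?)", ("2016-08-25T16:01:26.123456Z",)
).fetchone()
db.close()

epoch_seconds = int(row[0])
# Passing tz makes the result timezone-aware and independent of the local zone.
dt_utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
print(dt_utc.isoformat())
```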
Another way is to use a specialized ISO 8601 parser, the isoparse function of the dateutil parser:
from dateutil import parser
date = parser.isoparse("2008-09-03T20:56:35.450686+01:00")
print(date)
Output:
2008-09-03 20:56:35.450686+01:00
This function is also mentioned in the documentation for the standard Python function datetime.fromisoformat:
A more full-featured ISO 8601 parser, dateutil.parser.isoparse is
available in the third-party package dateutil.
Django's parse_datetime() function supports dates with UTC offsets:
parse_datetime('2016-08-09T15:12:03.65478Z') =
datetime.datetime(2016, 8, 9, 15, 12, 3, 654780, tzinfo=<UTC>)
So it could be used for parsing ISO 8601 dates in fields across an entire project:
from django.utils import formats
from django.forms.fields import DateTimeField
from django.utils.dateparse import parse_datetime
class DateTimeFieldFixed(DateTimeField):
    def strptime(self, value, format):
        if format == 'iso-8601':
            return parse_datetime(value)
        return super().strptime(value, format)
DateTimeField.strptime = DateTimeFieldFixed.strptime
formats.ISO_INPUT_FORMATS['DATETIME_INPUT_FORMATS'].insert(0, 'iso-8601')
If pandas is used anyway, I can recommend Timestamp from pandas. It parses the string directly:
import pandas as pd
ts_1 = pd.Timestamp('2020-02-18T04:27:58.000Z')
ts_2 = pd.Timestamp('2020-02-18T04:27:58.000')
Rant: It is just unbelievable that we still need to worry about things like date string parsing in 2021.
ISO 8601 allows many variations in whether the optional colons and dashes are present, basically CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. If you want to use strptime, you need to strip out those variations first.
The goal is to generate a utc datetime object.
If you just want a basic case that works for UTC with the Z suffix like 2016-06-29T19:36:29.3453Z:
datetime.datetime.strptime(timestamp.replace(':', '').replace('-', ''), "%Y%m%dT%H%M%S.%fZ")
If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00 use the following. These will convert all variations into something without variable delimiters like 20080903T205635.450686+0500 making it more consistent/easier to parse.
import re
# this regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z" )
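Since the stated goal is a UTC datetime, the aware result still has to be converted. A small sketch of that last step, assuming a Python 3 build where %z works in strptime (the function name to_utc is only for illustration):

```python
import re
from datetime import datetime, timezone


def to_utc(timestamp):
    # Same normalisation as above: drop colons and date dashes,
    # but keep the sign of the UTC offset.
    conformed = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", "", timestamp)
    aware = datetime.strptime(conformed, "%Y%m%dT%H%M%S.%f%z")
    # astimezone() shifts the wall-clock time so that it is expressed in UTC.
    return aware.astimezone(timezone.utc)


print(to_utc("2008-09-03T20:56:35.450686+05:00"))
```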
If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z') then you need to manually offset the time from Z (UTC). Note %z may not work on your system in python versions < 3 as it depended on the c library support which varies across system/python build type (i.e. Jython, Cython, etc.).
import re
import datetime
# this regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
# split on the offset to remove it. use a capture group to keep the delimiter
split_timestamp = re.split(r"([+-])", conformed_timestamp)
main_timestamp = split_timestamp[0]
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None
# generate the datetime object without the offset at UTC time
# (rstrip avoids a doubled "Z" when the input already ends with one)
output_datetime = datetime.datetime.strptime(main_timestamp.rstrip("Z") + "Z", "%Y%m%dT%H%M%S.%fZ")
if offset:
    # create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign + offset[:-2]), minutes=int(sign + offset[-2:]))
    # the offset is local time minus UTC, so subtract it to get UTC
    output_datetime = output_datetime - offset_delta
Nowadays there's Maya: Datetimes for Humans™, from the author of the popular Requests: HTTP for Humans™ package:
>>> import maya
>>> str = '2008-09-03T20:56:35.450686Z'
>>> maya.MayaDT.from_rfc3339(str).datetime()
datetime.datetime(2008, 9, 3, 20, 56, 35, 450686, tzinfo=<UTC>)
The python-dateutil will throw an exception if parsing invalid date strings, so you may want to catch the exception.
from dateutil import parser
ds = '2012-60-31'
try:
    dt = parser.parse(ds)
except ValueError:
    print('"%s" is an invalid date' % ds)
For something that works with the 2.X standard library try:
calendar.timegm(time.strptime(date.split(".")[0]+"UTC", "%Y-%m-%dT%H:%M:%S%Z"))
calendar.timegm is the missing gm version of time.mktime.
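To see the difference in practice, here is a short sketch written for Python 3 (so that the result can be cross-checked against an aware datetime); the sample string is the one from the question:

```python
import calendar
import time
from datetime import datetime, timezone

date = "2008-09-03T20:56:35.450686Z"
# Drop the fractional part, then parse the remainder into a struct_time.
parsed = time.strptime(date.split(".")[0], "%Y-%m-%dT%H:%M:%S")
# timegm() interprets the struct_time as UTC; mktime() would assume local time.
epoch_seconds = calendar.timegm(parsed)
# Cross-check using an aware datetime for the same instant.
expected = int(datetime(2008, 9, 3, 20, 56, 35, tzinfo=timezone.utc).timestamp())
print(epoch_seconds, expected)
```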
Thanks to Mark Amery's great answer, I devised a function to account for all possible ISO formats of datetime:
import re
from datetime import datetime, timedelta, tzinfo
class FixedOffset(tzinfo):
    """Fixed offset in minutes: `time = utc_time + utc_offset`."""
    def __init__(self, offset):
        self.__offset = timedelta(minutes=offset)
        hours, minutes = divmod(offset, 60)
        #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
        # that have the opposite sign in the name;
        # the corresponding numeric value is not used e.g., no minutes
        self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
    def utcoffset(self, dt=None):
        return self.__offset
    def tzname(self, dt=None):
        return self.__name
    def dst(self, dt=None):
        return timedelta(0)
    def __repr__(self):
        return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)
    def __getinitargs__(self):
        return (self.__offset.total_seconds()/60,)
def parse_isoformat_datetime(isodatetime):
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S.%f')
    except ValueError:
        pass
    try:
        return datetime.strptime(isodatetime, '%Y-%m-%dT%H:%M:%S')
    except ValueError:
        pass
    pat = r'(.*?[+-]\d{2}):(\d{2})'
    temp = re.sub(pat, r'\1\2', isodatetime)
    naive_date_str = temp[:-5]
    offset_str = temp[-5:]
    naive_dt = datetime.strptime(naive_date_str, '%Y-%m-%dT%H:%M:%S.%f')
    offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
    if offset_str[0] == "-":
        offset = -offset
    return naive_dt.replace(tzinfo=FixedOffset(offset))
datetime.fromisoformat() is improved in Python 3.11 to parse most ISO 8601 formats
datetime.fromisoformat() can now be used to parse most ISO 8601 formats, barring only those that support fractional hours and minutes. Previously, this method only supported formats that could be emitted by datetime.isoformat().
>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04T00:05:23Z')
datetime.datetime(2011, 11, 4, 0, 5, 23, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('20111104T000523')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-W01-2T00:05:23.283')
datetime.datetime(2011, 1, 4, 0, 5, 23, 283000)
Initially I tried with:
from operator import neg, pos
from time import strptime, mktime
from datetime import datetime, tzinfo, timedelta
class MyUTCOffsetTimezone(tzinfo):
    @staticmethod
    def with_offset(offset_no_signal, signal):  # type: (str, str) -> MyUTCOffsetTimezone
        return MyUTCOffsetTimezone((pos if signal == '+' else neg)(
            (datetime.strptime(offset_no_signal, '%H:%M') - datetime(1900, 1, 1))
            .total_seconds()))
    def __init__(self, offset, name=None):
        self.offset = timedelta(seconds=offset)
        self.name = name or self.__class__.__name__
    def utcoffset(self, dt):
        return self.offset
    def tzname(self, dt):
        return self.name
    def dst(self, dt):
        return timedelta(0)
def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        dt, sign, offset = strptime(dt[:-6], fmt), dt[-6], dt[-5:]
        return datetime.fromtimestamp(mktime(dt),
                                      tz=MyUTCOffsetTimezone.with_offset(offset, sign))
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)
But that didn't work on negative timezones. This however I got working fine, in Python 3.7.3:
from datetime import datetime
def to_datetime_tz(dt):  # type: (str) -> datetime
    fmt = '%Y-%m-%dT%H:%M:%S.%f'
    if dt[-6] in frozenset(('+', '-')):
        return datetime.strptime(dt, fmt + '%z')
    elif dt[-1] == 'Z':
        return datetime.strptime(dt, fmt + 'Z')
    return datetime.strptime(dt, fmt)
Some tests. Note that the output differs from the input only in the precision of the fractional seconds. I got 6 digits of precision on my machine, but YMMV:
for dt_in, dt_out in (
    ('2019-03-11T08:00:00.000Z', '2019-03-11T08:00:00'),
    ('2019-03-11T08:00:00.000+11:00', '2019-03-11T08:00:00+11:00'),
    ('2019-03-11T08:00:00.000-11:00', '2019-03-11T08:00:00-11:00')
):
    isoformat = to_datetime_tz(dt_in).isoformat()
    assert isoformat == dt_out, '{} != {}'.format(isoformat, dt_out)
def parseISO8601DateTime(datetimeStr):
    import time
    from datetime import datetime, timedelta
    def log_date_string(when):
        gmt = time.gmtime(when)
        if time.daylight and gmt[8]:
            tz = time.altzone
        else:
            tz = time.timezone
        if tz > 0:
            neg = 1
        else:
            neg = 0
            tz = -tz
        h, rem = divmod(tz, 3600)
        m, rem = divmod(rem, 60)
        if neg:
            offset = '-%02d%02d' % (h, m)
        else:
            offset = '+%02d%02d' % (h, m)
        return time.strftime('%d/%b/%Y:%H:%M:%S ', gmt) + offset
    dt = datetime.strptime(datetimeStr, '%Y-%m-%dT%H:%M:%S.%fZ')
    timestamp = dt.timestamp()
    return dt + timedelta(hours=dt.hour-time.gmtime(timestamp).tm_hour)
Note that we should check whether the string ends with Z; if it doesn't, we could parse it using %z.

Is there a wildcard format directive for strptime?

I'm using strptime like this:
import time
time.strptime("+10:00","+%H:%M")
but "+10:00" could also be "-10:00" (timezone offset from UTC) which would break the above command. I could use
time.strptime("+10:00"[1:],"%H:%M")
but ideally I'd find it more readable to use a wildcard in front of the format code.
Does such a wildcard operator exist for Python's strptime / strftime?
There is no wildcard operator. The list of format directives supported by strptime is in the docs.
What you're looking for is the %z format directive, which supports a representation of the timezone of the form +HHMM or -HHMM. While it has been supported by datetime.strftime for some time, it is only supported in strptime starting in Python 3.2.
On Python 2, the best way to handle this is probably to use datetime.datetime.strptime, manually handle the negative offset, and get a datetime.timedelta:
import datetime
tz = "+10:00"
def tz_to_timedelta(tz):
    min = datetime.datetime.strptime('', '')
    try:
        return -(datetime.datetime.strptime(tz,"-%H:%M") - min)
    except ValueError:
        return datetime.datetime.strptime(tz,"+%H:%M") - min
print tz_to_timedelta(tz)
In Python 3.2, remove the : and use %z:
import time
tz = "+10:00"
tz_toconvert = tz[:3] + tz[4:]
tz_struct_time = time.strptime(tz_toconvert, "%z")
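On Python 3.7 and later, %z also accepts the colon form, so the slicing step should no longer be needed. A minimal sketch, with offset_to_timedelta being a made-up helper name, that turns either spelling into a timedelta:

```python
from datetime import datetime, timedelta


def offset_to_timedelta(tz):
    # %z accepts "+HHMM" and, on Python 3.7+, "+HH:MM" as well.
    # Parsing only an offset yields an aware datetime whose utcoffset() we return.
    return datetime.strptime(tz, "%z").utcoffset()


print(offset_to_timedelta("+10:00"))
print(offset_to_timedelta("-10:00"))
```

Both the positive and the negative sign are handled by the same directive, which is the "wildcard" behaviour the question was after.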
We developed datetime-glob to parse date/times from a list of files generated by a consistent date/time formatting. From the module's documentation:
>>> import datetime_glob
>>> matcher = datetime_glob.Matcher(
...     pattern='/some/path/*%Y-%m-%dT%H-%M-%SZ.jpg')
>>> match = matcher.match(path='/some/path/some-text2016-07-03T21-22-23Z.jpg')
>>> match
datetime_glob.Match(year = 2016, month = 7, day = 3,
hour = 21, minute = 22, second = 23, microsecond = None)
>>> match.as_datetime()
datetime.datetime(2016, 7, 3, 21, 22, 23)

What's the best way to make a time from "Today" or "Yesterday" and a time in Python?

Python has pretty good date parsing but is the only way to recognize a datetime such as "Today 3:20 PM" or "Yesterday 11:06 AM" by creating a new date today and doing subtractions?
A library that I like a lot, and I'm seeing more and more people use, is python-dateutil but unfortunately neither it nor the other traditional big datetime parser, mxDateTime from Egenix can parse the word "tomorrow" in spite of both libraries having very strong "fuzzy" parsers.
The only library I've seen that can do this is magicdate. Examples:
>>> import magicdate
>>> magicdate.magicdate('today')
datetime.date(2009, 2, 15)
>>> magicdate.magicdate('tomorrow')
datetime.date(2009, 2, 16)
>>> magicdate.magicdate('yesterday')
datetime.date(2009, 2, 14)
Unfortunately this only returns datetime.date objects, and so won't include time parts and can't handle your example of "Today 3:20 PM".
So, you need mxDateTime for that. Examples:
>>> import mx.DateTime
>>> mx.DateTime.Parser.DateTimeFromString("Today 3:20 PM")
<mx.DateTime.DateTime object for '2009-02-15 15:20:00.00' at 28faa28>
>>> mx.DateTime.Parser.DateTimeFromString("Tomorrow 5:50 PM")
<mx.DateTime.DateTime object for '2009-02-15 17:50:00.00' at 2a86088>
EDIT: mxDateTime.Parser is only parsing the time in these examples and ignoring the words "today" and "tomorrow". So for this particular case you need to use a combination of magicdate to get the date and mxDateTime to get the time. My recommendation is to just use python-dateutil or mxDateTime and only accept the string formats they can parse.
EDIT 2: As noted in the comments, it looks like python-dateutil can now handle fuzzy parsing. I've also since discovered the parsedatetime module, which was developed for use in Chandler, and it works with the queries in this question:
>>> import parsedatetime.parsedatetime as pdt
>>> import parsedatetime.parsedatetime_consts as pdc
>>> c=pdc.Constants()
>>> p=pdt.Calendar(c)
>>> p.parse('Today 3:20 PM')
((2010, 3, 12, 15, 20, 0, 4, 71, -1), 3)
>>> p.parse('Yesterday 11:06 AM')
((2010, 3, 11, 11, 6, 0, 3, 70, -1), 3)
and for reference here is the current time:
>>> import datetime
>>> datetime.datetime.now()
datetime.datetime(2010, 3, 12, 15, 23, 35, 951652)
I am not yet completely up to speed on Python, but your question interested me, so I dug around a bit.
Date subtraction using timedelta is by far the most common solution I found.
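For reference, a minimal sketch of that timedelta approach using only the standard library. The function name parse_relative and the small table of day words are assumptions made for this example, and the %p directive depends on the current locale:

```python
from datetime import datetime, timedelta


def parse_relative(text, now=None):
    """Parse strings such as 'Today 3:20 PM' or 'Yesterday 11:06 AM'."""
    now = now or datetime.now()
    day_word, _, clock = text.partition(" ")
    day_offsets = {"today": 0, "yesterday": -1, "tomorrow": 1}
    # Shift today's date by the number of days implied by the first word.
    base_date = now.date() + timedelta(days=day_offsets[day_word.lower()])
    # %I and %p parse a 12-hour clock with an AM/PM marker.
    clock_time = datetime.strptime(clock, "%I:%M %p").time()
    return datetime.combine(base_date, clock_time)


print(parse_relative("Yesterday 11:06 AM"))
```

An unknown first word raises KeyError here; a real implementation would probably want a clearer error.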
Since your question asks if that's the only way to do it, I checked out the strftime format codes to see if you could define your own. Unfortunately not. From Python's strftime documentation:
... The full set of format codes supported varies across platforms, because Python calls the platform C library’s strftime() function, and platform variations are common.
The following is a list of all the format codes that the C standard (1989 version) requires ...
Anyhow, this isn't a definitive answer, but maybe it'll save others time barking up the wrong tree.
