python: datetime.strptime gives "does not match format" error [duplicate] - python

I am trying to convert timestamps of the format "2012-07-24T23:14:29-07:00"
to datetime objects in Python using the strptime method. The problem is the time offset at the end (-07:00). Without the offset I can successfully do
time_str = "2012-07-24T23:14:29"
time_obj=datetime.datetime.strptime(time_str,'%Y-%m-%dT%H:%M:%S')
But with the offset I tried
time_str = "2012-07-24T23:14:29-07:00"
time_obj=datetime.datetime.strptime(time_str,'%Y-%m-%dT%H:%M:%S-%z')
But it raises a ValueError saying "z" is a bad directive.
Any ideas for a workaround?

The Python 2 strptime() function indeed does not support the %z format for timezones (because the underlying time.strptime() function doesn't support it). You have two options:
Ignore the timezone when parsing with strptime:
time_obj = datetime.datetime.strptime(time_str[:19], '%Y-%m-%dT%H:%M:%S')
Use the dateutil module; its parse function does deal with timezones:
from dateutil.parser import parse
time_obj = parse(time_str)
Quick demo on the command prompt:
>>> from dateutil.parser import parse
>>> parse("2012-07-24T23:14:29-07:00")
datetime.datetime(2012, 7, 24, 23, 14, 29, tzinfo=tzoffset(None, -25200))
You could also upgrade to Python 3.2 or newer, where timezone support has been improved to the point that %z would work, provided you remove the last : from the input, and the - from before the %z:
>>> import datetime
>>> time_str = "2012-07-24T23:14:29-07:00"
>>> datetime.datetime.strptime(time_str, '%Y-%m-%dT%H:%M:%S%z')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python3.4/_strptime.py", line 500, in _strptime_datetime
tt, fraction = _strptime(data_string, format)
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python3.4/_strptime.py", line 337, in _strptime
(data_string, format))
ValueError: time data '2012-07-24T23:14:29-07:00' does not match format '%Y-%m-%dT%H:%M:%S%z'
>>> ''.join(time_str.rsplit(':', 1))
'2012-07-24T23:14:29-0700'
>>> datetime.datetime.strptime(''.join(time_str.rsplit(':', 1)), '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2012, 7, 24, 23, 14, 29, tzinfo=datetime.timezone(datetime.timedelta(-1, 61200)))

In Python 3.7+:
from datetime import datetime
time_str = "2012-07-24T23:14:29-07:00"
dt_aware = datetime.fromisoformat(time_str)
print(dt_aware.isoformat('T'))
# -> 2012-07-24T23:14:29-07:00
In Python 3.2+:
from datetime import datetime
time_str = "2012-07-24T23:14:29-0700"
dt_aware = datetime.strptime(time_str, '%Y-%m-%dT%H:%M:%S%z')
print(dt_aware.isoformat('T'))
# -> 2012-07-24T23:14:29-07:00
Note: Before Python 3.7, %z did not accept a : in the -0700 part (both forms are allowed by RFC 3339). See the Python issue "datetime: add ability to parse RFC 3339 dates and times".
On older Python versions such as Python 2.7, you could parse the utc offset manually:
from datetime import datetime
time_str = "2012-07-24T23:14:29-0700"
# split the utc offset part
naive_time_str, offset_str = time_str[:-5], time_str[-5:]
# parse the naive date/time part
naive_dt = datetime.strptime(naive_time_str, '%Y-%m-%dT%H:%M:%S')
# parse the utc offset
offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
if offset_str[0] == "-":
    offset = -offset
dt = naive_dt.replace(tzinfo=FixedOffset(offset))
print(dt.isoformat('T'))
where FixedOffset is a tzinfo subclass that represents a fixed offset from UTC, given in minutes.
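Since the link to the FixedOffset definition did not survive, here is a minimal sketch of what such a class could look like. It is an assumption modelled on the usual tzinfo subclass pattern, not necessarily the class the answer linked to; the constructor argument is the offset in minutes east of UTC, matching how it is called above.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """Fixed offset from UTC, in minutes east of UTC (illustrative sketch)."""

    def __init__(self, offset_minutes):
        self._offset = timedelta(minutes=offset_minutes)

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A fixed offset never observes daylight saving time.
        return timedelta(0)

    def tzname(self, dt):
        # No name is known for a bare numeric offset.
        return None


naive_dt = datetime(2012, 7, 24, 23, 14, 29)
dt = naive_dt.replace(tzinfo=FixedOffset(-7 * 60))
print(dt.isoformat('T'))
# -> 2012-07-24T23:14:29-07:00
```

On Python 3.2+ the built-in datetime.timezone does the same job, so a hand-written class like this is mainly useful on Python 2.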

ValueError: 'z' is a bad directive in format...
(note: I have to stick to python 2.7 in my case)
I have had a similar problem parsing commit dates from the output of git log --date=iso8601 which actually isn't the ISO8601 format (hence the addition of --date=iso8601-strict in a later version).
Since I am using django I can leverage the utilities there.
https://github.com/django/django/blob/master/django/utils/dateparse.py
>>> from django.utils.dateparse import parse_datetime
>>> parse_datetime('2013-07-23T15:10:59.342107+01:00')
datetime.datetime(2013, 7, 23, 15, 10, 59, 342107, tzinfo=+0100)
Instead of strptime you could use your own regular expression.
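To illustrate the regular-expression route, here is a rough sketch that should also run on Python 2.7. The pattern and the parse_with_offset helper are made up for this example, and it only handles the exact shape used in the question (offset written as -07:00 or -0700).

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern: a naive timestamp followed by +HH:MM, -HH:MM, +HHMM or -HHMM.
ISO_RE = re.compile(
    r'^(?P<naive>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})'
    r'(?P<sign>[+-])(?P<hh>\d{2}):?(?P<mm>\d{2})$'
)


def parse_with_offset(time_str):
    """Return (naive local datetime, UTC offset as a timedelta)."""
    match = ISO_RE.match(time_str)
    if match is None:
        raise ValueError('unsupported timestamp: %r' % time_str)
    naive = datetime.strptime(match.group('naive'), '%Y-%m-%dT%H:%M:%S')
    offset = timedelta(hours=int(match.group('hh')),
                       minutes=int(match.group('mm')))
    if match.group('sign') == '-':
        offset = -offset
    return naive, offset


naive, offset = parse_with_offset("2012-07-24T23:14:29-07:00")
# Subtracting the offset from the local time gives the time in UTC.
utc_dt = naive - offset
print(utc_dt.isoformat())
# -> 2012-07-25T06:14:29
```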

With python 3.5.2
To convert 26 Sep 2000 05:11:00 -0700
from datetime import datetime
dt_obj = datetime.strptime("26 Sep 2000 05:11:00 -0700", '%d %b %Y %H:%M:%S %z')
To convert 2012-07-24T23:14:29 -0700
dt_obj = datetime.strptime('2012-07-24T23:14:29 -0700', '%Y-%m-%dT%H:%M:%S %z')
Python 3.5.2 doesn't support the -07:00 form of the offset; the ':' has to be removed first.

Related

Date with a time zone specified as a string is parsed as naive

I'm curious why the timezone in this example, GMT, is not parsed as a valid one:
>>> from datetime import datetime
>>> import pytz
>>> b = 'Mon, 3 Oct 2016 21:24:17 GMT'
>>> fmt = '%a, %d %b %Y %H:%M:%S %Z'
>>> datetime.strptime(b, fmt).astimezone(pytz.utc)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: astimezone() cannot be applied to a naive datetime
Doing the same with a -0700 instead of GMT and %z instead of %Z in the format works just fine.
What's the proper way to parse dates ending in string time zones if not this?
Use the .replace() method on the datetime object to set the time zone info.
>>> datetime.strptime(b, fmt).replace(tzinfo=pytz.utc)
datetime.datetime(2016, 10, 3, 21, 24, 17, tzinfo=<UTC>)
As you mentioned, .astimezone() works when the format string uses %z instead of %Z. Although the two directives differ only in case, they represent entirely different things.
As per the strftime directive documentation:
%z : UTC offset in the form +HHMM or -HHMM (empty string if the object is naive).
%Z : Time zone name (empty string if the object is naive).
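A small sketch of the difference, assuming Python 3 (it uses datetime.timezone instead of pytz) and the default C locale for the day and month names. The second input string, with +0000 in place of GMT, is made up for the comparison.

```python
from datetime import datetime, timedelta, timezone

fmt_name = '%a, %d %b %Y %H:%M:%S %Z'    # time zone *name*
fmt_offset = '%a, %d %b %Y %H:%M:%S %z'  # numeric UTC *offset*

by_name = datetime.strptime('Mon, 3 Oct 2016 21:24:17 GMT', fmt_name)
by_offset = datetime.strptime('Mon, 3 Oct 2016 21:24:17 +0000', fmt_offset)

# %Z accepts the name but the result carries no tzinfo: it is naive.
print(by_name.tzinfo)
# -> None

# %z produces an aware datetime with the parsed offset attached.
print(by_offset.utcoffset())
# -> 0:00:00

# If you know the name means UTC, attach the zone explicitly.
aware = by_name.replace(tzinfo=timezone.utc)
print(aware.isoformat())
# -> 2016-10-03T21:24:17+00:00
```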

I want to parse string to timestamp-with-timezone in python

I am trying to parse a string into a timestamp with a time zone.
Here is an example:
"2016-02-18 16:13:07+09"
How can I parse a string in this format into a timestamp in Python?
Is the UTC offset format in your string +09 or +0900?
If the offset in your string is +0900, you can use the code below. If your UTC offset is only +09, as in your question, you can pad the string with 00 to make the code below work.
Code:
import datetime
time="2016-02-18 16:13:07+0900"
new_time=datetime.datetime.strptime(time,"%Y-%m-%d %H:%M:%S%z")
print(new_time)
new_time_python=datetime.datetime.strftime(new_time,"%m-%d-%y")
print(new_time_python)
Output
2016-02-18 16:13:07+09:00
02-18-16
dateutil might be a suitable library for your purposes:
from dateutil.parser import parser
p = parser()
d = p.parse('2016-02-18 16:13:07+09'.decode('utf-8')) # must be unicode string
print(repr(d))
# -> datetime.datetime(2016, 2, 18, 16, 13, 7, tzinfo=tzoffset(None, 32400))
If the UTC offset may be specified both as +HH and +HHMM format then you could use str.ljust() method to normalize the input time string. Then you could use .strptime() to parse it:
#!/usr/bin/env python3
from datetime import datetime
time_string = "2016-02-18 16:13:07+09"
dt = datetime.strptime(time_string.ljust(24, "0"), "%Y-%m-%d %H:%M:%S%z")
# -> datetime.datetime(2016, 2, 18, 16, 13, 7,
# tzinfo=datetime.timezone(datetime.timedelta(0, 32400)))
If your Python version doesn't support %z, see How to parse dates with -0400 timezone string in python?

Have a correct datetime with correct timezone

I am using feedparser in order to get RSS data.
Here is my code :
>>> import datetime
>>> import time
>>> import feedparser
>>> d=feedparser.parse("http://.../rss.xml")
>>> datetimee_rss = d.entries[0].published_parsed
>>> datetimee_rss
time.struct_time(tm_year=2015, tm_mon=5, tm_mday=8, tm_hour=16, tm_min=57, tm_sec=39, tm_wday=4, tm_yday=128, tm_isdst=0)
>>> datetime.datetime.fromtimestamp(time.mktime(datetimee_rss))
datetime.datetime(2015, 5, 8, 17, 57, 39)
In my timezone (FR), the actual date is May, 8th, 2015 18:57.
In the RSS XML, the value is <pubDate>Fri, 08 May 2015 18:57:39 +0200</pubDate>
When I parse it into datetime, I got 2015, 5, 8, 17, 57, 39.
How to have 2015, 5, 8, 18, 57, 39 without dirty hack, but simply by configuring the correct timezone ?
EDIT:
By doing :
>>> from pytz import timezone
>>> datetime.datetime.fromtimestamp(time.mktime(datetimee_rss),tz=timezone('Europe/Paris'))
datetime.datetime(2015, 5, 8, 17, 57, 39, tzinfo=<DstTzInfo 'Europe/Paris' CEST+2:00:00 DST>)
I got something nicer; however, it doesn't work with the rest of the script, where I get plenty of "TypeError: can't compare offset-naive and offset-aware datetimes" errors.
feedparser does provide the original datetime string (just remove the _parsed suffix from the attribute name), so if you know the format of the string, you can parse it into a tz-aware datetime object yourself.
For example, with your code, you can get the tz-aware object as such:
datetime.datetime.strptime(d.entries[0].published, '%a, %d %b %Y %H:%M:%S %z')
for more reference on strptime(), see https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
EDIT: Since Python 2.x doesn't support %z directive, use python-dateutil instead
pip install python-dateutil
then
from dateutil import parser
datetime_rss = parser.parse(d.entries[0].published)
documentation at https://dateutil.readthedocs.org/en/latest/
feedparser returns the time in the UTC timezone. It is incorrect to apply time.mktime() to it (unless your local timezone is UTC, which it isn't). You should use calendar.timegm() instead:
import calendar
from datetime import datetime
utc_tuple = d.entries[0].published_parsed
posix_timestamp = calendar.timegm(utc_tuple)
local_time_as_naive_datetime_object = datetime.fromtimestamp(posix_timestamp) # assume non-"right" timezone
RSS feeds may use many different dates formats; I would leave the date parsing to feedparser module.
If you want to get the local time as an aware datetime object:
from tzlocal import get_localzone # $ pip install tzlocal
local_timezone = get_localzone()
local_time = datetime.fromtimestamp(posix_timestamp, local_timezone) # assume non-"right" timezone
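The calendar.timegm() step can be checked without a feed or tzlocal. The sketch below assumes Python 3, rebuilds the struct_time from the question by hand as a stand-in for d.entries[0].published_parsed, and uses a fixed +02:00 offset only because that is what the pubDate in the question shows.

```python
import calendar
import time
from datetime import datetime, timedelta, timezone

# Stand-in for d.entries[0].published_parsed: 16:57:39 UTC as a struct_time.
utc_tuple = time.struct_time((2015, 5, 8, 16, 57, 39, 4, 128, 0))

# timegm() interprets the tuple as UTC; mktime() would treat it as local time.
posix_timestamp = calendar.timegm(utc_tuple)

utc_dt = datetime.fromtimestamp(posix_timestamp, timezone.utc)
print(utc_dt.isoformat())
# -> 2015-05-08T16:57:39+00:00

# Converting to the +02:00 offset from the feed gives the expected wall time.
paris_dt = utc_dt.astimezone(timezone(timedelta(hours=2)))
print(paris_dt.isoformat())
# -> 2015-05-08T18:57:39+02:00
```

For real use, a named zone (tzlocal, pytz or similar) is a better choice than a hard-coded offset, since the offset changes with daylight saving time.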
Try this:
>>> import os
>>> os.environ['TZ'] = 'Europe/Paris'
>>> time.tzset()
>>> time.tzname
('CET', 'CEST')

Converting a String into a datetime object in python

I have a string field like this..
2011-09-04 23:44:30.801000
and now I need to convert it to a datetime object in python so that I can calculate the difference between two datetime objects.
You should use datetime.datetime.strptime(), which converts a string and date format into a datetime.datetime object.
The format fields (e.g., %Y denotes four-digit year) are specified in the Python documentation.
>>> import datetime
>>> s = '2011-09-04 23:44:30.801000'
>>> format = '%Y-%m-%d %H:%M:%S.%f'
>>> date=datetime.datetime.strptime(s, format)
>>> date
datetime.datetime(2011, 9, 4, 23, 44, 30, 801000)
An alternative to datetime.datetime.strptime would be the python-dateutil library. dateutil will allow you to do the same thing without the explicit formatting step:
>>> from dateutil import parser
>>> date_obj = parser.parse('2011-09-04 23:44:30.801000')
>>> date_obj
datetime.datetime(2011, 9, 4, 23, 44, 30, 801000)
It's not a standard library module, but it is very handy for parsing date and time strings, especially if you don't have control over the format they come in.
One caveat if you install this library: version 1.5 is for Python 2 and version 2.0 is for Python 3. easy_install and pip default to installing the 2.0 version, so you have to explicitly indicate python-dateutil==1.5 if you are using Python 2. (Later 2.x releases of python-dateutil support both Python 2 and Python 3, so this only matters for those old versions.)
Use datetime.datetime.strptime.
from datetime import datetime
# date string to datetime object
date_str = "2008-11-10 17:53:59"
dt_obj = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S")
print(repr(dt_obj))
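Since the question was about the difference between two datetime objects, here is a short sketch of that last step. The second timestamp is made up for illustration.

```python
from datetime import datetime

fmt = '%Y-%m-%d %H:%M:%S.%f'
start = datetime.strptime('2011-09-04 23:44:30.801000', fmt)
end = datetime.strptime('2011-09-05 00:14:31.301000', fmt)  # hypothetical second value

# Subtracting two datetime objects yields a timedelta.
delta = end - start
print(delta)
# -> 0:30:00.500000
print(delta.total_seconds())
# -> 1800.5
```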
