Python: datetime format - python

I have the following datetime string:
Mon Oct 27 23:00:03 +0000 2014
I would like to convert this string to a form where I could compare the datetimes. So, the first thing I tried is converting this to a datetime in Python.
I am having trouble with the correct formatting. I have followed the documentation, but it does not work.
I have tried the following:
str = 'Mon Oct 27 23:00:03 +0000 2014'
datetime.strptime(str, '%a %b %d %X %Z %Y')
How can I get this to work?

If you want to convert it to a datetime object, you can use the python-dateutil library.
For example:
In [6]: dateutil.parser.parse('Mon Oct 27 23:00:03 +0000 2014')
Out[6]: datetime.datetime(2014, 10, 27, 23, 0, 3, tzinfo=tzutc())

In Python 3.2+:
>>> from datetime import datetime
>>> timestr = 'Mon Oct 27 23:00:03 +0000 2014'
>>> datetime.strptime(timestr, '%a %b %d %X %z %Y')
datetime.datetime(2014, 10, 27, 23, 0, 3, tzinfo=datetime.timezone.utc)
Note the lower case %z.
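Since the question is really about comparing datetimes, here is a minimal sketch of that step, assuming Python 3.2+ and an English/C locale for %a and %b. It spells the time out as %H:%M:%S instead of %X, because %X depends on the locale. The name FORMAT and the second timestamp are invented for illustration.

```python
from datetime import datetime

# Hypothetical constant name; the layout matches the string in the question.
FORMAT = '%a %b %d %H:%M:%S %z %Y'

earlier = datetime.strptime('Mon Oct 27 23:00:03 +0000 2014', FORMAT)
# Invented second value, only here to have something to compare against.
later = datetime.strptime('Tue Oct 28 01:30:00 +0000 2014', FORMAT)

# Both values are timezone-aware, so they compare as instants in time.
print(earlier < later)   # True
print(later - earlier)   # 2:29:57
```

Subtracting two such values gives a timedelta, so the same parsed objects work for both ordering and measuring the gap.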

Here's a stdlib-only version that works on Python 2 and 3:
#!/usr/bin/env python
from datetime import datetime
from email.utils import parsedate_tz, mktime_tz
timestamp = mktime_tz(parsedate_tz('Mon Oct 27 23:00:03 +0000 2014'))
utc_dt = datetime.utcfromtimestamp(timestamp)
# -> datetime.datetime(2014, 10, 27, 23, 0, 3)
where utc_dt is a datetime object that represents time in UTC timezone (regardless of the input timezone).
Note: it doesn't support the time that represents a leap second (though datetime object can't represent it anyway):
>>> datetime.utcfromtimestamp(mktime_tz(parsedate_tz('Tue June 30 23:59:60 +0000 2015')))
datetime.datetime(2015, 7, 1, 0, 0)
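If Python 2 support is not needed, the same email.utils module can return a timezone-aware result directly. The following is a sketch, assuming Python 3.3+ where email.utils.parsedate_to_datetime is available; it also shows datetime.fromtimestamp with an explicit UTC timezone as an aware counterpart to utcfromtimestamp.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime, parsedate_tz, mktime_tz

text = 'Mon Oct 27 23:00:03 +0000 2014'

# Returns an aware datetime carrying the offset found in the string.
aware_dt = parsedate_to_datetime(text)

# Same instant via the POSIX timestamp, but kept timezone-aware.
utc_dt = datetime.fromtimestamp(mktime_tz(parsedate_tz(text)), timezone.utc)

print(aware_dt == utc_dt)  # True
```

Keeping the tzinfo attached avoids accidentally mixing this value with naive local times later on.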

Your problem lies with the %Z directive: it matches a timezone name, whereas a UTC offset such as +0000 needs a lowercase %z. However,
strptime() only supports %z in Python 3.2+.
If you are stuck with an older version of Python, you could strip the UTC offset from the string, parse the rest, and then apply the offset yourself.
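One way that manual approach might look is sketched below. The helper name parse_to_utc is hypothetical, and the code assumes the offset is always the fifth whitespace-separated field and has the form +HHMM or -HHMM.

```python
from datetime import datetime, timedelta

def parse_to_utc(text):
    """Parse e.g. 'Mon Oct 27 23:00:03 +0000 2014' into a naive UTC datetime."""
    parts = text.split()
    offset = parts.pop(4)  # e.g. '+0000'; assumed to be the fifth field
    naive = datetime.strptime(' '.join(parts), '%a %b %d %H:%M:%S %Y')
    sign = -1 if offset[0] == '-' else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    # Local time minus the UTC offset gives UTC.
    return naive - sign * delta

print(parse_to_utc('Tue May 08 15:14:45 +0800 2012'))  # 2012-05-08 07:14:45
```

The result is naive but normalized to UTC, so values parsed this way can be compared with each other directly.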

Related

Python : Converting string to datetime [duplicate]

I was trying to convert a string to a datetime object.
The string I got from a news feed is in the following format:
"Thu, 16 Oct 2014 01:16:17 EDT"
I tried using datetime.strptime() to convert it.
i.e.,
datetime.strptime('Thu, 16 Oct 2014 01:16:17 EDT','%a, %d %b %Y %H:%M:%S %Z')
And got the following error:
Traceback (most recent call last):
File "", line 1, in
datetime.strptime('Thu, 16 Oct 2014 01:16:17 EDT','%a, %d %b %Y %H:%M:%S %Z')
File "C:\Anaconda\lib\_strptime.py", line 325, in _strptime
(data_string, format))
ValueError: time data 'Thu, 16 Oct 2014 01:16:17 EDT' does not match format '%a, %d %b %Y %H:%M:%S %Z'
However, if I tried the string without "EDT", it worked.
i.e.,
datetime.strptime('Thu, 16 Oct 2014 01:16:17','%a, %d %b %Y %H:%M:%S')
Does anyone know how to parse that "EDT" part?
To parse the date in RFC 2822 format, you could use email package:
from datetime import datetime, timedelta
from email.utils import parsedate_tz, mktime_tz
timestamp = mktime_tz(parsedate_tz("Thu, 16 Oct 2014 01:16:17 EDT"))
# -> 1413436577
utc_dt = datetime(1970, 1, 1) + timedelta(seconds=timestamp)
# -> datetime.datetime(2014, 10, 16, 5, 16, 17)
Note: parsedate_tz() assumes that EDT corresponds to -0400 UTC offset but it might be incorrect in Australia where EDT is +1100 (AEDT is used by pytz in this case) i.e., a timezone abbreviation may be ambiguous. See Parsing date/time string with timezone abbreviated name in Python?
Related Python bug: %Z in strptime doesn't match EST and others.
If your computer uses POSIX timestamps (likely), and you are sure the input date is within an acceptable range for your system (not too far into the future/past), and you don't need to preserve the microsecond precision then you could use datetime.utcfromtimestamp:
from datetime import datetime
from email.utils import parsedate_tz, mktime_tz
timestamp = mktime_tz(parsedate_tz("Thu, 16 Oct 2014 01:16:17 EDT"))
# -> 1413436577
utc_dt = datetime.utcfromtimestamp(timestamp)
# -> datetime.datetime(2014, 10, 16, 5, 16, 17)
The email.utils.parsedate_tz() solution is good for 3-letter timezones but it does not work for 4 letters such as AEDT or CEST. If you need a mix, the answer under Parsing date/time string with timezone abbreviated name in Python? works for both with the most commonly used time zones.
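Because abbreviations are ambiguous, one option is to make the interpretation explicit with your own lookup table. This is only a sketch for Python 3: the table TZ_OFFSETS and the helper parse_abbrev are hypothetical, and the offsets listed are assumptions that you would adjust to the meaning your data source intends.

```python
from datetime import datetime, timedelta, timezone

# Assumed meanings of each abbreviation, in hours from UTC. Edit to suit your data.
TZ_OFFSETS = {'EST': -5, 'EDT': -4, 'CEST': 2, 'AEDT': 11}

def parse_abbrev(text):
    """Parse 'Thu, 16 Oct 2014 01:16:17 EDT' using the table above."""
    body, abbrev = text.rsplit(None, 1)
    naive = datetime.strptime(body, '%a, %d %b %Y %H:%M:%S')
    tz = timezone(timedelta(hours=TZ_OFFSETS[abbrev]), abbrev)  # KeyError if unknown
    return naive.replace(tzinfo=tz)

dt = parse_abbrev('Thu, 16 Oct 2014 01:16:17 EDT')
print(dt.astimezone(timezone.utc))  # 2014-10-16 05:16:17+00:00
```

An unknown abbreviation raises KeyError rather than being silently guessed, which seems the safer default when the same letters can mean different offsets.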

Calculating days using string dates in Python

I have dates in the current string format: 'Tue Feb 19 00:09:28 +1100 2013'
I'm trying to figure out how many days have passed between the date in the string and the present date.
I've been able to convert the string into a date.
import time
day = time.strptime('Tue Feb 19 00:09:28 +1100 2013', '%a %b %d %H:%M:%S +1100 %Y')
Use the datetime module instead:
import datetime
day = datetime.datetime.strptime('Tue Feb 19 00:09:28 +1100 2013', '%a %b %d %H:%M:%S +1100 %Y')
delta = datetime.datetime.now() - day  # now minus then, so elapsed days are positive
print(delta.days)
Subtracting two datetime.datetime values returns a datetime.timedelta object, which has a days attribute.
Your strings do contain a timezone offset, and you hardcoded it to match; if the value varies you'll have to use a parser that can handle the offset. The python-dateutil package includes both an excellent parser and the timezone support to handle this:
>>> from dateutil import parser
>>> parser.parse('Tue Feb 19 00:09:28 +1100 2013')
datetime.datetime(2013, 2, 19, 0, 9, 28, tzinfo=tzoffset(None, 39600))
Note that because this result includes the timezone, you now need to use timezone-aware datetime objects when using date arithmetic:
>>> from dateutil import tz
>>> import datetime
>>> utcnow = datetime.datetime.now(tz.tzutc())
>>> then = parser.parse('Tue Feb 19 00:09:28 +1100 2013')
>>> utcnow - then
datetime.timedelta(31, 12087, 617740)
>>> (utcnow - then).days
31
I created a utcnow variable in the above example based on the UTC timezone before calculating how long ago the parsed date was.
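For comparison, here is a sketch of the same calculation without dateutil, assuming Python 3.2+ where strptime understands %z. The helper name days_since and its optional now parameter are made up for this example; the parameter exists only so the result can be checked against a fixed moment.

```python
from datetime import datetime, timezone

def days_since(text, now=None):
    """Whole days elapsed between the timestamp in `text` and `now`."""
    then = datetime.strptime(text, '%a %b %d %H:%M:%S %z %Y')
    if now is None:
        now = datetime.now(timezone.utc)  # aware "now", so the subtraction is valid
    return (now - then).days

fixed_now = datetime(2013, 3, 21, 16, 30, tzinfo=timezone.utc)
print(days_since('Tue Feb 19 00:09:28 +1100 2013', fixed_now))  # 31
```

Because both operands are aware, the +1100 offset is taken into account automatically.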

Comparing dates in Python - how to handle time zone modifiers

I am doing a date comparison in Python:
Assume I have a date like this: 'Fri Aug 17 12:34:00 2012 +0000'
I am parsing it in the following manner:
dt=datetime.strptime('Fri Aug 17 12:34:00 2012 +0000', '%a %b %d %H:%M:%S %Y +0000')
I could not find on the documentation page how to handle the remaining +0000?
I want to have a more generic solution than a hardcoded value.
Perhaps this is quite easy, any hint?
The default datetime module does not handle timezones very well; beyond your current machine timezone and UTC, they are basically not supported.
You'll have to use an external library for that or handle the timezone offset manually.
External library options:
dateutil.parser can handle just about any date and/or time format you care to throw at it:
from dateutil import parser
dt = parser.parse(s)
The iso8601 library handles only ISO 8601 formats, which include timezone offsets of the same form:
import iso8601
datetimetext, tz = s.rsplit(None, 1) # only grab the timezone portion.
timezone = iso8601.iso8601.parse_timezone('{}:{}'.format(tz[:3], tz[3:]))
dt = datetime.strptime(datetimetext, '%a %b %d %H:%M:%S %Y').replace(tzinfo=timezone)
Demonstration of each approach:
>>> import datetime
>>> s = 'Fri Aug 17 12:34:00 2012 +0000'
>>> import iso8601
>>> datetimetext, tz = s.rsplit(None, 1)  # only grab the timezone portion.
>>> timezone = iso8601.iso8601.parse_timezone('{}:{}'.format(tz[:3], tz[3:]))
>>> datetime.datetime.strptime(datetimetext, '%a %b %d %H:%M:%S %Y').replace(tzinfo=timezone)
datetime.datetime(2012, 8, 17, 12, 34, tzinfo=<FixedOffset '+00:00'>)
>>> from dateutil import parser
>>> parser.parse(s)
datetime.datetime(2012, 8, 17, 12, 34, tzinfo=tzutc())
You might also want to give Delorean a look. It is a wrapper around both pytz and dateutil that provides timezone manipulation as well as easy time zone shifts.
Here is how I would solve your question with Delorean:
>>> from delorean import parse
>>> parse("2011/01/01 00:00:00 -0700")
Delorean(datetime=2011-01-01 07:00:00+00:00, timezone=UTC)
From there you can get the datetime by simply reading the .datetime attribute. If you want to do some time shifts, simply use the Delorean object and call .shift("UTC") etc.
Use the dateutil.parser:
>>> import dateutil.parser
>>> dateutil.parser.parse('Fri Aug 17 12:34:00 2012 +0000')
datetime.datetime(2012, 8, 17, 12, 34, tzinfo=tzutc())
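On Python 3.2+ the trailing offset in this layout can also be handled by the standard library, which gives a generic alternative to hardcoding +0000. The sketch below assumes an English/C locale; the name FORMAT and the second timestamp are invented to show that values with different offsets compare as instants.

```python
from datetime import datetime

# Hypothetical constant name; %z at the end consumes offsets such as +0000 or +0200.
FORMAT = '%a %b %d %H:%M:%S %Y %z'

a = datetime.strptime('Fri Aug 17 12:34:00 2012 +0000', FORMAT)
# Invented second value with a different offset.
b = datetime.strptime('Fri Aug 17 14:00:00 2012 +0200', FORMAT)

# 14:00 at +0200 is 12:00 UTC, which is earlier than 12:34 UTC.
print(b < a)  # True
```

The wall-clock time of b looks later, but the comparison uses the underlying instant, which is usually what a date comparison should do.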

Parsing the string to Dates in python [duplicate]

I'm very new to Python. I have the following two strings :
Mon, 29 Oct 2012 13:07:07 GMT
2012-10-29 12:57:08
I am looking for a date parsing library in Python that can parse the above two strings into something like this:
2012-10-29 12:57:08
That way I can compare them. Note that the comparison should behave like an integer comparison: just as 1 is less than 2, 2012-10-29 12:57:08 is less than 2012-10-29 13:57:08.
Is there an easy way to do this in Python?
Thanks in advance.
Use the dateutil module for general date parsing:
>>> from dateutil.parser import parse
>>> parse('Mon, 29 Oct 2012 13:07:07 GMT')
datetime.datetime(2012, 10, 29, 13, 7, 7, tzinfo=tzutc())
>>> parse('2012-10-29 12:57:08')
datetime.datetime(2012, 10, 29, 12, 57, 8)
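datetime.datetime objects can be compared with the usual operators (<, <=, ==, and so on). Note that the first result above is timezone-aware and the second is naive; Python refuses to order an aware value against a naive one, so make both aware (or both naive) before comparing.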
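One caveat with the two strings from the question: the first parses to an aware datetime and the second to a naive one, and ordering those against each other raises TypeError. The sketch below reproduces that with plain datetime values in Python 3 and then assumes, purely for illustration, that the string without a zone is also in UTC.

```python
from datetime import datetime, timezone

aware = datetime(2012, 10, 29, 13, 7, 7, tzinfo=timezone.utc)  # from '... GMT'
naive = datetime(2012, 10, 29, 12, 57, 8)                      # no zone in the string

error = None
try:
    naive < aware
except TypeError as exc:
    error = str(exc)  # aware and naive values cannot be ordered
print(error is not None)  # True

# Assumption: the zone-less timestamp is in UTC as well.
fixed = naive.replace(tzinfo=timezone.utc)
print(fixed < aware)  # True
```

Whether UTC is the right assumption depends on where the second string comes from; replace() only labels the value, it does not convert it.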
If you know the exact format of each date string to be parsed, you can also do the parsing more explicitly using the datetime.datetime.strptime() method:
>>> import datetime
>>> datetime.datetime.strptime('Mon, 29 Oct 2012 13:07:07 GMT', '%a, %d %b %Y %H:%M:%S %Z')
datetime.datetime(2012, 10, 29, 13, 7, 7)
Note, however, that this method ignores the timezone: %Z matches the name GMT, but the result is a naive datetime.
Yes, time.strptime can convert the text to date representations. From there you can use strftime to print it how you like.
>>> a = '2012-10-29 12:57:08'
>>> time.strptime(a, '%Y-%m-%d %H:%M:%S')
time.struct_time(tm_year=2012, tm_mon=10, tm_mday=29, tm_hour=12, tm_min=57, tm_sec=8, tm_wday=0, tm_yday=303, tm_isdst=-1)
>>> b = 'Mon, 29 Oct 2012 13:07:07 GMT'
>>> time.strptime(b, '%a, %d %b %Y %H:%M:%S %Z')
time.struct_time(tm_year=2012, tm_mon=10, tm_mday=29, tm_hour=13, tm_min=7, tm_sec=7, tm_wday=0, tm_yday=303, tm_isdst=0)
The datetime module in python, with its strptime function (string parse), can do that.
For example, you can use the function like this:
somestring = '2004-03-13T03:00:00Z'
result = datetime.datetime.strptime(somestring, '%Y-%m-%dT%H:%M:%SZ')
Docs here.
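In that format string the trailing Z is matched as a literal character, so the result is naive. If the Z is meant as UTC, which is the usual reading of such timestamps but still an assumption about the data, a Python 3 sketch could attach the timezone explicitly:

```python
from datetime import datetime, timezone

somestring = '2004-03-13T03:00:00Z'
# The 'Z' in the format is a literal; replace() then labels the value as UTC.
result = datetime.strptime(somestring, '%Y-%m-%dT%H:%M:%SZ').replace(tzinfo=timezone.utc)

print(result.isoformat())  # 2004-03-13T03:00:00+00:00
```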

Parsing time string in Python

I have a date-time string that I don't know how to parse in Python.
The string is like this:
Tue May 08 15:14:45 +0800 2012
I tried
datetime.strptime("Tue May 08 15:14:45 +0800 2012","%a %b %d %H:%M:%S %z %Y")
but Python raises
'z' is a bad directive in format '%a %b %d %H:%M:%S %z %Y'
According to Python doc:
%z UTC offset in the form +HHMM or -HHMM (empty string if the object is naive).
What is the right format to parse this time string?
datetime.datetime.strptime has problems with timezone parsing. Have a look at the dateutil package:
>>> from dateutil import parser
>>> parser.parse("Tue May 08 15:14:45 +0800 2012")
datetime.datetime(2012, 5, 8, 15, 14, 45, tzinfo=tzoffset(None, 28800))
Your best bet is to have a look at strptime()
Something along the lines of
>>> from datetime import datetime
>>> date_str = 'Tue May 08 15:14:45 +0800 2012'
>>> date = datetime.strptime(date_str, '%a %b %d %H:%M:%S +0800 %Y')
>>> date
datetime.datetime(2012, 5, 8, 15, 14, 45)
I'm not sure how to handle the +0800 timezone, unfortunately; maybe someone else can help out with that.
The formatting strings can be found at http://docs.python.org/library/time.html#time.strftime and are the same for formatting the string for printing.
Hope that helps
Mark
PS: your best bet for timezones is installing pytz from PyPI ( http://pytz.sourceforge.net/ ).
Note that pytz supplies timezone definitions rather than a date parser, so the string still has to be parsed separately. The standard lib is a little thin on the ground with timezone functionality.
Here's a stdlib solution that supports a variable utc offset in the input time string:
>>> from email.utils import parsedate_tz, mktime_tz
>>> from datetime import datetime, timedelta
>>> timestamp = mktime_tz(parsedate_tz('Tue May 08 15:14:45 +0800 2012'))
>>> utc_time = datetime(1970, 1, 1) + timedelta(seconds=timestamp)
>>> utc_time
datetime.datetime(2012, 5, 8, 7, 14, 45)
This has been discussed many times on SO. In short, "%z" is not supported because the platform does not support it.
My solution is a different one: just skip the time zone:
datetime.datetime.strptime(re.sub(r"[+-]([0-9])+", "", "Tue May 08 15:14:45 +0800 2012"),"%a %b %d %H:%M:%S %Y")
In [117]: datetime.datetime.strptime?
Type: builtin_function_or_method
Base Class: <type 'builtin_function_or_method'>
String Form: <built-in method strptime of type object at 0x9a2520>
Namespace: Interactive
Docstring:
string, format -> new datetime parsed from a string (like time.strptime()).
