Python convert string to datetime for comparison to datetime object - python

I have a string lfile with a datetime in it (type(lfile) gives <type 'str'>) and a Python datetime object wfile. Here is the code:
import os, datetime
lfile = '2005-08-22_11:05:45.000000000'
time_w = os.path.getmtime('{}\\{}.py' .format('C:\Temp_Readouts\RtFyar','TempReads.csv'))
wfile = datetime.datetime.fromtimestamp(time_w)
wfile contains this 2006-11-30 19:08:06.531328 and repr(wfile) gives:
datetime.datetime(2006, 11, 30, 19, 8, 6, 531328)
Problem:
I need to:
1. convert lfile into a Python datetime object
2. compare lfile to wfile and determine which datetime is more recent
For 1.:
I am only able to get a partial solution using strptime as per here. Here is what I tried:
lfile = datetime.datetime.strptime(linx_file_dtime, '%Y-%m-%d_%H:%M:%S')
The output is:
`ValueError: unconverted data remains: .000`
Question 1
It seems that strptime() cannot handle nanoseconds. How do I tell strptime() to ignore the last 3 zeros?
For 2.:
When I use type(wfile) I get <type 'datetime.datetime'>. If both wfile and lfile are Python datetime objects (i.e. if step 1. is successful), then would this work?:
if wtime < ltime:
    print 'Linux file created after Windows file'
else:
    print 'Windows file created after Linux file'
Question 2
Or is there some other way in which Python can compare datetime objects to determine which of the two occurred after the other?

Question 1
Python's datetime handles microseconds, not nanoseconds. You can strip the last three characters of the string to reduce the fraction to microseconds, and then add .%f to the end of the format:
lfile = datetime.datetime.strptime(linx_file_dtime[:-3], '%Y-%m-%d_%H:%M:%S.%f')
Question 2
Yes, comparison works:
if wtime < ltime:
...
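Putting both steps together, a minimal Python 3 sketch might look like the following. The helper name parse_ns_timestamp and the hard-coded stand-in for wfile are illustrative assumptions, not part of the original question. It truncates the fraction to six digits instead of assuming there are always exactly nine.

```python
from datetime import datetime

def parse_ns_timestamp(text):
    """Parse 'YYYY-MM-DD_HH:MM:SS.fffffffff', truncating to microseconds."""
    # Hypothetical helper: keep at most six fractional digits for %f.
    head, _, frac = text.partition('.')
    frac = frac[:6] or '0'
    return datetime.strptime(head + '.' + frac, '%Y-%m-%d_%H:%M:%S.%f')

lfile = parse_ns_timestamp('2005-08-22_11:05:45.000000000')
# Stand-in for datetime.fromtimestamp(os.path.getmtime(...)) from the question.
wfile = datetime(2006, 11, 30, 19, 8, 6, 531328)

# datetime objects support <, >, == directly.
newer = 'Windows' if wfile > lfile else 'Linux'
print(newer, 'file is more recent')
```

Both values here are naive datetimes; comparing a naive datetime with a timezone-aware one raises TypeError, so keep them consistent.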

That's right, strptime() does not handle nanoseconds. The accepted answer in the question that you linked to offers an option: strip off the last 3 digits and then parse with .%f appended to the format string.
Another option is to use dateutil.parser.parse():
>>> from dateutil.parser import parse
>>> parse('2005-08-22_11:05:45.123456789', fuzzy=True)
datetime.datetime(2005, 8, 22, 11, 5, 45, 123456)
fuzzy=True is required to overlook the unsupported underscore between date and time components. Because datetime objects do not support nanoseconds, the last 3 digits vanish, leaving microsecond accuracy.

Related

How to convert time string with very precise time measurements to date objects using strptime()? [duplicate]

I am able to parse strings containing date/time with time.strptime
>>> import time
>>> time.strptime('30/03/09 16:31:32', '%d/%m/%y %H:%M:%S')
(2009, 3, 30, 16, 31, 32, 0, 89, -1)
How can I parse a time string that contains milliseconds?
>>> time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/_strptime.py", line 333, in strptime
data_string[found.end():])
ValueError: unconverted data remains: .123
Python 2.6 added a new strftime/strptime directive, %f. The docs are a bit misleading as they only mention microseconds, but %f actually parses any decimal fraction of seconds with up to 6 digits, meaning it also works for milliseconds or even centiseconds or deciseconds.
time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')
However, time.struct_time doesn't actually store milliseconds/microseconds. You're better off using datetime, like this:
>>> from datetime import datetime
>>> a = datetime.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')
>>> a.microsecond
123000
As you can see, .123 is correctly interpreted as 123 000 microseconds.
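To illustrate the point about %f accepting one to six digits, here is a small sketch (Python 3 syntax; the loop and variable names are only for illustration). It also shows that a seventh digit is not consumed by %f and leads to a ValueError.

```python
from datetime import datetime

fmt = '%d/%m/%y %H:%M:%S.%f'

# One to six fractional digits are all accepted and scaled to microseconds.
for frac in ('1', '12', '123', '123456'):
    parsed = datetime.strptime('30/03/09 16:31:32.' + frac, fmt)
    print(frac, '->', parsed.microsecond)

# Seven or more digits leave unconverted data behind and raise ValueError.
try:
    datetime.strptime('30/03/09 16:31:32.1234567', fmt)
    too_long_rejected = False
except ValueError as exc:
    too_long_rejected = True
    print('rejected:', exc)
```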
I know this is an older question but I'm still using Python 2.4.3 and I needed to find a better way of converting the string of data to a datetime.
The solution if datetime doesn't support %f and without needing a try/except is:
(dt, mSecs) = row[5].strip().split(".")
dt = datetime.datetime(*time.strptime(dt, "%Y-%m-%d %H:%M:%S")[0:6])
mSeconds = datetime.timedelta(microseconds = int(mSecs))
fullDateTime = dt + mSeconds
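This works for the input string "2010-10-06 09:42:52.266000". Note that it relies on the fractional part having exactly six digits, because int(mSecs) is used directly as a microsecond count.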
To give the code that nstehr's answer refers to (from its source):
def timeparse(t, format):
    """Parse a time string that might contain fractions of a second.

    Fractional seconds are supported using a fragile, miserable hack.
    Given a time string like '02:03:04.234234' and a format string of
    '%H:%M:%S', time.strptime() will raise a ValueError with this
    message: 'unconverted data remains: .234234'. If %S is in the
    format string and the ValueError matches as above, a datetime
    object will be created from the part that matches and the
    microseconds in the time string.
    """
    try:
        return datetime.datetime(*time.strptime(t, format)[0:6]).time()
    except ValueError, msg:
        if "%S" in format:
            msg = str(msg)
            mat = re.match(r"unconverted data remains:"
                           " \.([0-9]{1,6})$", msg)
            if mat is not None:
                # fractional seconds are present - this is the style
                # used by datetime's isoformat() method
                frac = "." + mat.group(1)
                t = t[:-len(frac)]
                t = datetime.datetime(*time.strptime(t, format)[0:6])
                microsecond = int(float(frac)*1e6)
                return t.replace(microsecond=microsecond)
            else:
                mat = re.match(r"unconverted data remains:"
                               " \,([0-9]{3,3})$", msg)
                if mat is not None:
                    # fractional seconds are present - this is the style
                    # used by the logging module
                    frac = "." + mat.group(1)
                    t = t[:-len(frac)]
                    t = datetime.datetime(*time.strptime(t, format)[0:6])
                    microsecond = int(float(frac)*1e6)
                    return t.replace(microsecond=microsecond)
        raise
DNS's answer above is actually incorrect. The OP is asking about milliseconds but the answer is for microseconds. Unfortunately, Python doesn't have a directive for milliseconds, just microseconds (see doc), but you can work around it by appending three zeros to the end of the string and parsing it as microseconds, something like:
datetime.strptime(time_str + '000', '%d/%m/%y %H:%M:%S.%f')
where time_str is formatted like 30/03/09 16:31:32.123.
Hope this helps.
My first thought was to try passing it '30/03/09 16:31:32.123' (with a period instead of a colon between the seconds and the milliseconds.) But that didn't work. A quick glance at the docs indicates that fractional seconds are ignored in any case...
Ah, version differences. This was reported as a bug and now in 2.6+ you can use "%S.%f" to parse it.
from python mailing lists: parsing millisecond thread. There is a function posted there that seems to get the job done, although as mentioned in the author's comments it is kind of a hack. It uses regular expressions to handle the exception that gets raised, and then does some calculations.
You could also try doing the regular expressions and calculations up front, before passing it to strptime.
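A minimal sketch of that "up front" idea, in Python 3 syntax: split the fraction off with a regular expression first, then give strptime only the part it understands. The helper name parse_with_fraction is hypothetical, and the sketch assumes the format ends in %S, so that any trailing ".digits" or ",digits" really is the fraction.

```python
import re
from datetime import datetime

# Matches a trailing '.digits' or ',digits' (the latter is the logging style).
_FRACTION = re.compile(r'^(?P<head>.*?)[.,](?P<frac>\d+)$')

def parse_with_fraction(text, fmt):
    """Hypothetical helper: fmt must describe the text without its fraction."""
    match = _FRACTION.match(text)
    if match is None:
        # No fraction present, parse as-is.
        return datetime.strptime(text, fmt)
    base = datetime.strptime(match.group('head'), fmt)
    # Pad or truncate to six digits so the value is in microseconds.
    micro = int(match.group('frac')[:6].ljust(6, '0'))
    return base.replace(microsecond=micro)

print(parse_with_fraction('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S'))
```

Because no exception message is inspected, this avoids depending on the exact wording of the ValueError.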
For Python 2 I did this:
print ( time.strftime("%H:%M:%S", time.localtime(time.time())) + "." + str(time.time()).split(".",1)[1])
It prints the time as "%H:%M:%S", then splits time.time() into two substrings (before and after the .), xxxxxxx.xx. Since .xx holds my fractional seconds, I append the second substring to my "%H:%M:%S".
hope that makes sense :)
Example output:
13:31:21.72
Blink 01
13:31:21.81
END OF BLINK 01
13:31:26.3
Blink 01
13:31:26.39
END OF BLINK 01
13:31:34.65
Starting Lane 01

How to parse datetime that ends with `Z`?

I have the following datetime string s:
2017-10-18T04:46:53.553472514Z
I parse it like this:
t = datetime.strptime(s, '%Y-%m-%dT%H:%M:%SZ')
How do I fix ValueError: time data '2017-10-18T04:46:53.553472514Z' does not match format '%Y-%m-%dT%H:%M:%SZ'?
In theory,
t = datetime.strptime(s, '%Y-%m-%dT%H:%M:%S.%fZ')
would be the correct format string, since you have fractional seconds as well. But %f accepts at most 6 digits (microseconds), and your value has 9, so it is probably in nanoseconds.
So you need to do something like this:
t = datetime.datetime.strptime(s.split(".")[0], '%Y-%m-%dT%H:%M:%S')
t = t + datetime.timedelta(microseconds=int(s.split(".")[1][:-1])/1000)
print (t)
This works but it converts nanoseconds to microseconds. If this is not ok, then you need to do something else.
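If the nanosecond digits must not be lost, one option is to carry them alongside the datetime, since datetime itself cannot store them. The following is only a sketch: the function name parse_utc_nanos and the choice to return a tuple are assumptions, and it expects a trailing Z and a fractional part as in the question's input.

```python
from datetime import datetime, timezone

def parse_utc_nanos(text):
    """Return (aware datetime truncated to microseconds, nanosecond count)."""
    # Assumes input shaped like '2017-10-18T04:46:53.553472514Z'.
    head, frac = text.rstrip('Z').split('.')
    # Normalise the fraction to exactly nine digits (nanoseconds).
    nanos = int(frac[:9].ljust(9, '0'))
    moment = datetime.strptime(head, '%Y-%m-%dT%H:%M:%S').replace(
        microsecond=nanos // 1000, tzinfo=timezone.utc)
    return moment, nanos

moment, nanos = parse_utc_nanos('2017-10-18T04:46:53.553472514Z')
print(moment.isoformat(), nanos)
```

Two timestamps that fall in the same microsecond can then be ordered by comparing the (moment, nanos) tuples.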
I think you should use dateutil.parser module
In [23]: s = "2017-10-18T04:46:53.553472514Z"
In [24]: import dateutil.parser as p
In [25]: p.parse(s)
Out[25]: datetime.datetime(2017, 10, 18, 4, 46, 53, 553472, tzinfo=tzutc())
As @FObersteiner pointed out, %z parses Z as well as ±HHMM[SS[.ffffff]] style of timezone encoding.
This information is hidden in the technical detail #6. :-/
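As a sketch of what that looks like in practice (assuming Python 3.7 or later, where strptime's %z accepts a literal Z): the fraction still has to be cut down to six digits for %f, so the nanosecond digits are dropped before parsing. The variable names are illustrative.

```python
from datetime import datetime, timedelta

s = '2017-10-18T04:46:53.553472514Z'

# %f accepts at most six digits, so drop the extra digits but keep the 'Z'.
head, frac = s[:-1].split('.')
trimmed = head + '.' + frac[:6] + 'Z'

# %z turns the trailing 'Z' into a UTC tzinfo on the result.
t = datetime.strptime(trimmed, '%Y-%m-%dT%H:%M:%S.%f%z')
print(t)
```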
Previous incorrect answer:
To answer the original question: there's no way to parse "Z" as the timezone using only the standard library.
Here's some discussion on that: https://bugs.python.org/issue35829
I like the mx.DateTime module.
import mx.DateTime as dt
dt.DateTimeFrom('2017-10-18T04:46:53.553472514Z')
datetime is another Python built-in module that is completely superseded by a better extra module: arrow.
You can just do:
import arrow
dt = arrow.get('2017-10-18T04:46:53.553472514Z').datetime
Since your date string is in standardized ISO format, it will be parsed without even giving a format string. The parsed dt will be:
datetime.datetime(2017, 10, 18, 4, 46, 53, 553472, tzinfo=tzutc())
Or keep the Arrow object in case you want to keep the extra precision

Is there a wildcard format directive for strptime?

I'm using strptime like this:
import time
time.strptime("+10:00","+%H:%M")
but "+10:00" could also be "-10:00" (timezone offset from UTC) which would break the above command. I could use
time.strptime("+10:00"[1:],"%H:%M")
but ideally I'd find it more readable to use a wildcard in front of the format code.
Does such a wildcard operator exist for Python's strptime / strftime?
There is no wildcard operator. The list of format directives supported by strptime is in the docs.
What you're looking for is the %z format directive, which supports a representation of the timezone of the form +HHMM or -HHMM. While it has been supported by datetime.strftime for some time, it is only supported in strptime starting in Python 3.2.
On Python 2, the best way to handle this is probably to use datetime.datetime.strptime, manually handle the negative offset, and get a datetime.timedelta:
import datetime
tz = "+10:00"
def tz_to_timedelta(tz):
    min = datetime.datetime.strptime('', '')
    try:
        return -(datetime.datetime.strptime(tz,"-%H:%M") - min)
    except ValueError:
        return datetime.datetime.strptime(tz,"+%H:%M") - min
print tz_to_timedelta(tz)
In Python 3.2, remove the : and use %z:
import time
tz = "+10:00"
tz_toconvert = tz[:3] + tz[4:]
tz_struct_time = time.strptime(tz_toconvert, "%z")
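If the goal is only to turn an offset string into a timedelta, a regular expression can stand in for the missing wildcard without involving strptime at all. This is a sketch in Python 3 syntax; the function name offset_to_timedelta is hypothetical and it accepts only the +HH:MM, -HH:MM, +HHMM and -HHMM shapes.

```python
import re
from datetime import timedelta

def offset_to_timedelta(text):
    """Convert '+HH:MM' or '-HH:MM' (colon optional) to a timedelta."""
    match = re.fullmatch(r'([+-])(\d{2}):?(\d{2})', text)
    if match is None:
        raise ValueError('not a UTC offset: %r' % (text,))
    # Apply the sign to the whole offset, hours and minutes alike.
    sign = -1 if match.group(1) == '-' else 1
    return sign * timedelta(hours=int(match.group(2)),
                            minutes=int(match.group(3)))

print(offset_to_timedelta('+10:00'))
print(offset_to_timedelta('-10:00'))
```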
We developed datetime-glob to parse date/times from a list of files generated by a consistent date/time formatting. From the module's documentation:
>>> import datetime_glob
>>> matcher = datetime_glob.Matcher(
pattern='/some/path/*%Y-%m-%dT%H-%M-%SZ.jpg')
>>> matcher.match(path='/some/path/some-text2016-07-03T21-22-23Z.jpg')
datetime_glob.Match(year = 2016, month = 7, day = 3,
hour = 21, minute = 22, second = 23, microsecond = None)
>>> match.as_datetime()
datetime.datetime(2016, 7, 3, 21, 22, 23)

string to datetime with fractional seconds, on Google App Engine

I need to convert a string to a datetime object, along with the fractional seconds. I'm running into various problems.
Normally, I would do:
>>> datetime.datetime.strptime(val, "%Y-%m-%dT%H:%M:%S.%f")
But errors and old docs showed me that python2.5's strptime does not have %f...
Investigating further, it seems that the App Engine's data store does not like fractional seconds. Upon editing a datastore entity, trying to add .5 to the datetime field gave me the following error:
ValueError: unconverted data remains: .5
I doubt that fractional seconds are not supported... so this is just on the datastore viewer, right?
Has anyone circumvented this issue? I want to use the native datetime objects... I'd rather not store UNIX timestamps...
Thanks!
EDIT: Thanks to Jacob Oscarson for the .replace(...) tip!
One thing to keep in mind is to check the length of nofrag before feeding it in. Different sources use different precision for seconds.
Here's a quick function for those looking for something similar:
def strptime(val):
    if '.' not in val:
        return datetime.datetime.strptime(val, "%Y-%m-%dT%H:%M:%S")
    nofrag, frag = val.split(".")
    date = datetime.datetime.strptime(nofrag, "%Y-%m-%dT%H:%M:%S")
    frag = frag[:6] # truncate to microseconds
    frag += (6 - len(frag)) * '0' # add 0s
    return date.replace(microsecond=int(frag))
Parsing
Without %f support in datetime.datetime.strptime() you can still get the value into a datetime.datetime object easily enough (randomly picking a value for your val here) by using datetime.datetime.replace(). Tested on 2.5.5:
>>> val = '2010-08-06T10:00:14.143896'
>>> nofrag, frag = val.split('.')
>>> nofrag_dt = datetime.datetime.strptime(nofrag, "%Y-%m-%dT%H:%M:%S")
>>> dt = nofrag_dt.replace(microsecond=int(frag))
>>> dt
datetime.datetime(2010, 8, 6, 10, 0, 14, 143896)
Now you have your datetime.datetime object.
Storing
Reading further into http://code.google.com/appengine/docs/python/datastore/typesandpropertyclasses.html#datetime
I can see no mention that fractions aren't supported, so yes, it's probably only the datastore viewer. The docs point directly to Python 2.5.2's module docs for datetime, and it does support fractions, just not the %f parsing directive for strptime. Querying for fractions might be trickier, though.
All ancient history by now, but in these modern times you can also conveniently use dateutil
from dateutil import parser as DUp
funky_time_str = "1/1/2011 12:51:00.0123 AM"
foo = DUp.parse(funky_time_str)
print foo.timetuple()
# time.struct_time(tm_year=2011, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=51, tm_sec=0, tm_wday=5, tm_yday=1, tm_isdst=-1)
print foo.microsecond
# 12300
print foo
# 2011-01-01 00:51:00.012300
dateutil supports a surprising variety of possible input formats, which it parses without pattern strings.

Convert an RFC 3339 time to a standard Python timestamp

Is there an easy way to convert an RFC 3339 time into a regular Python timestamp?
I've got a script which is reading an ATOM feed and I'd like to be able to compare the timestamp of an item in the ATOM feed to the modification time of a file.
I notice from the ATOM spec, that ATOM dates include a time zone offset (Z<a number>) but, in my case, there's nothing after the Z so I guess we can assume GMT.
I suppose I could parse the time with a regex of some sort but I was hoping Python had a built-in way of doing it that I just haven't been able to find.
You don't include an example, but if you don't have a Z-offset or timezone, and assuming you don't want durations but just the basic time, then maybe this will suit you:
import datetime as dt
>>> dt.datetime.strptime('1985-04-12T23:20:50.52', '%Y-%m-%dT%H:%M:%S.%f')
datetime.datetime(1985, 4, 12, 23, 20, 50, 520000)
The strptime() function was added to the datetime module in Python 2.5 so some people don't yet know it's there.
Edit: The time.strptime() function has existed for a while though, and works about the same to give you a struct_time value:
>>> ts = time.strptime('1985-04-12T23:20:50.52', '%Y-%m-%dT%H:%M:%S.%f')
>>> ts
time.struct_time(tm_year=1985, tm_mon=4, tm_mday=12, tm_hour=23, tm_min=20, tm_sec=50, tm_wday=4, tm_yday=102, tm_isdst=-1)
>>> time.mktime(ts)
482210450.0
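One caveat: time.mktime() interprets the struct_time as local time, so the number above depends on the machine's timezone. If the feed time is in UTC (a bare Z), calendar.timegm() is the UTC counterpart, and its result can be compared with os.path.getmtime(). A small sketch in Python 3 syntax, with the fraction left out for simplicity and variable names chosen for illustration:

```python
import calendar
import time

ts = time.strptime('1985-04-12T23:20:50', '%Y-%m-%dT%H:%M:%S')

# timegm() treats the struct_time as UTC; mktime() would use local time.
feed_epoch = calendar.timegm(ts)
print(feed_epoch)

# feed_epoch is seconds since the epoch, the same unit that
# os.path.getmtime(path) returns, so the two can be compared directly.
```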
I struggled with RFC3339 datetime format a lot, but I found a suitable solution to convert date_string <=> datetime_object in both directions.
You need two different external modules, because each of them only handles the conversion in one direction (unfortunately):
first install:
sudo pip install rfc3339
sudo pip install iso8601
then include:
import datetime # for general datetime object handling
import rfc3339 # for date object -> date string
import iso8601 # for date string -> date object
So that I don't have to remember which module handles which direction, I wrote two simple helper functions:
def get_date_object(date_string):
    return iso8601.parse_date(date_string)
def get_date_string(date_object):
    return rfc3339.rfc3339(date_object)
which inside your code you can easily use like this:
input_string = '1989-01-01T00:18:07-05:00'
test_date = get_date_object(input_string)
# >>> datetime.datetime(1989, 1, 1, 0, 18, 7, tzinfo=<FixedOffset '-05:00' datetime.timedelta(-1, 68400)>)
test_string = get_date_string(test_date)
# >>> '1989-01-01T00:18:07-05:00'
test_string == input_string # >>> True
Heureka! Now you can easily (haha) use your date strings and date objects in a usable format.
No builtin, afaik.
feed.date.rfc3339
This is a Python library module with functions for converting timestamp strings in RFC 3339 format to Python time float values, and vice versa. RFC 3339 is the timestamp format used by the Atom feed syndication format.
It is BSD-licensed.
http://home.blarg.net/~steveha/pyfeed.html
(Edited so it's clear I didn't write it. :-)
The new datetime.fromisoformat(date_string) method which was added in Python 3.7 will parse most RFC 3339 timestamps, including those with time zone offsets. It's not a full implementation, so be sure to test your use case.
>>> from datetime import datetime
>>> datetime.fromisoformat('2011-11-04')
datetime.datetime(2011, 11, 4, 0, 0)
>>> datetime.fromisoformat('2011-11-04T00:05:23')
datetime.datetime(2011, 11, 4, 0, 5, 23)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000)
>>> datetime.fromisoformat('2011-11-04 00:05:23.283+00:00')
datetime.datetime(2011, 11, 4, 0, 5, 23, 283000, tzinfo=datetime.timezone.utc)
>>> datetime.fromisoformat('2011-11-04T00:05:23+04:00')
datetime.datetime(2011, 11, 4, 0, 5, 23,
tzinfo=datetime.timezone(datetime.timedelta(seconds=14400)))
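One gap worth testing for: as far as I know, fromisoformat() only started accepting a trailing Z in Python 3.11. On 3.7 through 3.10, a common workaround is to rewrite the Z as an explicit +00:00 offset first. The helper name parse_iso_z below is hypothetical, and this is a sketch, not a full RFC 3339 parser.

```python
from datetime import datetime, timezone

def parse_iso_z(text):
    """Hypothetical helper: accept a trailing 'Z' on older Python versions."""
    # Rewrite 'Z' as an explicit UTC offset that fromisoformat() understands.
    if text.endswith('Z'):
        text = text[:-1] + '+00:00'
    return datetime.fromisoformat(text)

print(parse_iso_z('2011-11-04T00:05:23Z'))
```

On those older versions the fractional part, if present, also needs to be exactly three or six digits, so nanosecond input still has to be trimmed beforehand.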
If you're using Django, you could use Django's function parse_datetime:
>>> from django.utils.dateparse import parse_datetime
>>> parse_datetime("2016-07-19T07:30:36+05:00")
datetime.datetime(2016, 7, 19, 7, 30, 36, tzinfo=<django.utils.timezone.FixedOffset object at 0x101c0c1d0>)
http://pypi.python.org/pypi/iso8601/ seems to be able to parse iso 8601, which RFC 3339 is a subset of, maybe this could be useful, but again, not built-in.
The simplest solution for me has been the third-party dateutil library (it is not part of the Python standard library).
from dateutil.parser import parse
dt = "2020-11-23T11:08:23.022277705Z"
print(parse(dt))
Output:
2020-11-23 11:08:23.022277+00:00
If you don't need the timezone element, simply set the timezone info to None:
print(parse(dt).replace(tzinfo=None))
The output is a nice and clean datetime object:
2020-11-23 11:08:23.022277
http://bugs.python.org/issue15873 (duplicate of http://bugs.python.org/issue5207 )
Looks like there isn't a built-in as of yet.
feedparser.py provides robust/extensible way to parse various date formats that may be encountered in real-world atom/rss feeds:
>>> from feedparser import _parse_date as parse_date
>>> parse_date('1985-04-12T23:20:50.52Z')
time.struct_time(tm_year=1985, tm_mon=4, tm_mday=12, tm_hour=23, tm_min=20,
tm_sec=50, tm_wday=4, tm_yday=102, tm_isdst=1)
try this, it works fine for me
datetime_obj = datetime.strptime("2014-01-01T00:00:00Z", '%Y-%m-%dT%H:%M:%SZ')
or
datetime_obj = datetime.strptime("Mon, 01 Jun 2015 16:41:40 GMT", '%a, %d %b %Y %H:%M:%S GMT')
Came across the awesome dateutil.parser module in another question, and tried it on my RFC3339 problem, and it appears to handle everything I throw at it with more sanity than any of the other responses in this question.
Using Python 3, you can use RegEx to break the RFC 3339 timestamp into its components.
Then, directly create the datetime object, no additional modules needed:
import re
import datetime
def parse_rfc3339(dt):
    broken = re.search(r'([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})(\.([0-9]+))?(Z|([+-][0-9]{2}):([0-9]{2}))', dt)
    # Pad or truncate the fraction to six digits so ".52" becomes 520000 microseconds.
    fraction = (broken.group(8) or "0")[:6].ljust(6, "0")
    # Carry the sign of the hour offset over to the minutes.
    sign = -1 if (broken.group(10) or "").startswith("-") else 1
    return datetime.datetime(
        year=int(broken.group(1)),
        month=int(broken.group(2)),
        day=int(broken.group(3)),
        hour=int(broken.group(4)),
        minute=int(broken.group(5)),
        second=int(broken.group(6)),
        microsecond=int(fraction),
        tzinfo=datetime.timezone(datetime.timedelta(
            hours=int(broken.group(10) or "0"),
            minutes=sign * int(broken.group(11) or "0"))))
This example treats a missing timezone or missing fractional seconds as "0", but might need additional error checking.
Cheers, Alex
You could use a Google API Core package. They have a really straightforward Datetime to RFC 3339 conversion function. You can find more info in their docs.
Its usage is as simple as:
from google.api_core.datetime_helpers import to_rfc3339
rfc3339_str = to_rfc3339(datetime.now())
They even have functions that work the other way around: from_rfc3339 and from_rfc3339_nanos.
rfc3339 library: http://henry.precheur.org/python/rfc3339
I have been doing a deep dive into datetimes and RFC 3339, recently came across the arrow library, and have just used it to solve my problem:
import arrow
date_string = "2015-11-24 00:00:00+00:00"
my_datetime = arrow.get(date_string).datetime
