I use the dateutil.parser.parse function to parse a date entered by a user. Normally hours and minutes are separated by a colon, but sometimes a user enters something like 6.30pm, which is parsed as 18:00, so the minutes are silently dropped.
>>> dateutil.parser.parse('6.30pm')
datetime.datetime(2019, 5, 14, 18, 0)
Is there a way to specify the dot as a legal separator, or to raise a ValueError if the user uses the wrong separator? I want to show at least an error message to the user rather than silently process a wrongly recognized date.
What about a little substitution before parsing, something like:
import dateutil.parser
import re
def parse(timestr):
    # Turn a trailing "H.MM" into "H:MM" before handing the string to dateutil.
    timestr = re.sub(r"(\d{1,2})\.(\d{2})(\D*)$", r"\1:\2\3", timestr)
    return dateutil.parser.parse(timestr)
print(parse('6.30pm')) # >> 2019-05-14 18:30:00
print(parse('12:06.30')) # >> 2019-05-14 12:06:30
print(parse('2018-01-01 12:06:05.123')) # >> 2018-01-01 12:06:05.123000
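The question also asked for a way to raise a ValueError instead of fixing the input. A minimal sketch of that variant follows, reusing the same pattern as the substitution above. The function name normalize_time_separator and its strict flag are made up for this illustration and are not part of dateutil. It inherits the limits of that pattern: a string such as '12.05.19' would also match.

```python
import re

# Same pattern as in the substitution above: one or two digits, a dot,
# two digits, then only non-digits up to the end of the string.
_DOT_TIME = re.compile(r"(\d{1,2})\.(\d{2})(\D*)$")


def normalize_time_separator(timestr, strict=False):
    """Return timestr with a trailing 'H.MM' rewritten as 'H:MM'.

    With strict=True, raise ValueError instead of rewriting.
    (Hypothetical helper, not a dateutil API.)
    """
    if _DOT_TIME.search(timestr) is None:
        return timestr
    if strict:
        raise ValueError("use ':' between hours and minutes: %r" % timestr)
    return _DOT_TIME.sub(r"\1:\2\3", timestr)


print(normalize_time_separator("6.30pm"))  # 6:30pm
print(normalize_time_separator("18:30"))   # 18:30
```

The returned string can then be passed to dateutil.parser.parse; with strict=True the caller catches the ValueError and shows the message to the user.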
Related
I'm trying to find dates in a string. This is what I'm doing.
from dateutil.parser import parse

def _is_date(string, fuzzy=False):
    try:
        return parse(string, fuzzy=fuzzy)
    except ValueError:
        return False
It works on some strings:
>>> _is_date('delivered 22-jun-2022', fuzzy=True)
2022-06-22 00:00:00
>>> _is_date('04 sep, lets meet', fuzzy=True)
2022-09-04 00:00:00
However, it returns incorrect results for others:
>>> _is_date('Ive 4 kids', fuzzy=True)
2022-09-04 00:00:00
>>> _is_date('samsung galaxy m32 (black,', fuzzy=True)
2022-09-23 00:00:32
>>> _is_date('4gb ram..', fuzzy=True)
2022-09-04 00:00:00
How can I fix this? Or is there another approach that would work for this problem?
The fuzzy flag isn't meant to be used the way you're using it. It is meant for processing strings along the lines of "Today is 9/23/22"; for this example, parse will ignore the "Today is " and parse the date/time portion.
Via experimentation, I found that when called with fuzzy=True, parse will try to interpret any character that is a digit as part of a date. Looking at the examples you expected to yield False:
'Ive 4 kids' returns a date/time of 2022-09-04 00:00:00 - the 4 was taken to be the 4th of the current month
'samsung galaxy m32 (black,' gives 2022-09-23 00:00:32 - the 32 became the number of seconds after midnight today
'4gb ram..' - again, the 4 was taken to be the 4th day of the current month
It seems you won't be able to use fuzzy the way you're hoping; somehow you'll have to clean up the strings before you pass them to parse, probably rejecting those that don't have legitimate dates before calling parse.
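One way to do that rejection step is sketched below with the standard library only. The helper looks_like_date and its patterns are assumptions made for illustration, not part of dateutil; the heuristic only checks for a numeric date or a day next to an English month name, so it can still be fooled (for example by '4 marbles').

```python
import re

# Month names or abbreviations, e.g. "jun", "June", "sep", "Sept".
_MONTH = r"(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*"

_DATE_HINT = re.compile(
    "|".join([
        r"\b\d{1,4}[-/.]\d{1,2}[-/.]\d{1,4}\b",   # 22-06-2022, 9/23/22
        r"\b\d{1,2}[\s-]*" + _MONTH + r"\b",      # 22-jun, 04 sep
        r"\b" + _MONTH + r"[\s-]*\d{1,2}\b",      # sep 4, June-22
    ]),
    re.IGNORECASE,
)


def looks_like_date(text):
    """Return True if text contains something shaped like a date.

    Hypothetical pre-filter: call parse(text, fuzzy=True) only when
    this returns True, and treat everything else as "no date".
    """
    return _DATE_HINT.search(text) is not None


for sample in ("delivered 22-jun-2022", "04 sep, lets meet",
               "Ive 4 kids", "samsung galaxy m32 (black,", "4gb ram.."):
    print(repr(sample), looks_like_date(sample))
```

The first two samples pass the filter and the last three are rejected, which matches the results the question expected.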
You might find it instructive to experiment with fuzzy_with_tokens=True instead of fuzzy. With fuzzy_with_tokens set to True, you will receive a two item tuple with a datetime object holding the resulting date in the first item and the ignored text in the second. Also, this might be a useful resource for you: https://dateutil.readthedocs.io/en/stable/parser.html
I am able to parse strings containing date/time with time.strptime
>>> import time
>>> time.strptime('30/03/09 16:31:32', '%d/%m/%y %H:%M:%S')
(2009, 3, 30, 16, 31, 32, 0, 89, -1)
How can I parse a time string that contains milliseconds?
>>> time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/_strptime.py", line 333, in strptime
data_string[found.end():])
ValueError: unconverted data remains: .123
Python 2.6 added a new strftime/strptime directive, %f. The docs are a bit misleading as they only mention microseconds, but %f actually parses any decimal fraction of seconds with up to 6 digits, meaning it also works for milliseconds or even centiseconds or deciseconds.
time.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')
However, time.struct_time doesn't actually store milliseconds/microseconds. You're better off using datetime, like this:
>>> from datetime import datetime
>>> a = datetime.strptime('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S.%f')
>>> a.microsecond
123000
As you can see, .123 is correctly interpreted as 123 000 microseconds.
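To illustrate the point about %f accepting one to six digits, here is a small sketch using only the standard library. The variable names are arbitrary; the millisecond value is derived by integer division of the microsecond field.

```python
from datetime import datetime

FMT = '%d/%m/%y %H:%M:%S.%f'

# %f pads the fraction with zeros on the right, so 1 to 6 digits all work.
for text in ('30/03/09 16:31:32.1',
             '30/03/09 16:31:32.123',
             '30/03/09 16:31:32.123456'):
    parsed = datetime.strptime(text, FMT)
    milliseconds = parsed.microsecond // 1000
    print(text, '->', parsed.microsecond, 'us,', milliseconds, 'ms')

# More than six fractional digits do not fit %f and raise ValueError.
try:
    datetime.strptime('30/03/09 16:31:32.1234567', FMT)
except ValueError as exc:
    print('rejected:', exc)
```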
I know this is an older question but I'm still using Python 2.4.3 and I needed to find a better way of converting the string of data to a datetime.
If datetime doesn't support %f, a solution that doesn't need a try/except is:
(dt, mSecs) = row[5].strip().split(".")
dt = datetime.datetime(*time.strptime(dt, "%Y-%m-%d %H:%M:%S")[0:6])
mSeconds = datetime.timedelta(microseconds = int(mSecs))
fullDateTime = dt + mSeconds
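This works for the input string "2010-10-06 09:42:52.266000". Note that int(mSecs) is interpreted as microseconds, so the fractional part must have exactly six digits.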
To give the code that nstehr's answer refers to (from its source):
import datetime
import re
import time

def timeparse(t, format):
    """Parse a time string that might contain fractions of a second.

    Fractional seconds are supported using a fragile, miserable hack.
    Given a time string like '02:03:04.234234' and a format string of
    '%H:%M:%S', time.strptime() will raise a ValueError with this
    message: 'unconverted data remains: .234234'.  If %S is in the
    format string and the ValueError matches as above, a datetime
    object will be created from the part that matches and the
    microseconds in the time string.
    """
    try:
        return datetime.datetime(*time.strptime(t, format)[0:6]).time()
    except ValueError as msg:
        if "%S" in format:
            msg = str(msg)
            mat = re.match(r"unconverted data remains:"
                           r" \.([0-9]{1,6})$", msg)
            if mat is not None:
                # fractional seconds are present - this is the style
                # used by datetime's isoformat() method
                frac = "." + mat.group(1)
                t = t[:-len(frac)]
                t = datetime.datetime(*time.strptime(t, format)[0:6])
                microsecond = int(float(frac)*1e6)
                return t.replace(microsecond=microsecond)
            else:
                mat = re.match(r"unconverted data remains:"
                               r" \,([0-9]{3,3})$", msg)
                if mat is not None:
                    # fractional seconds are present - this is the style
                    # used by the logging module
                    frac = "." + mat.group(1)
                    t = t[:-len(frac)]
                    t = datetime.datetime(*time.strptime(t, format)[0:6])
                    microsecond = int(float(frac)*1e6)
                    return t.replace(microsecond=microsecond)

        raise
DNS's answer above is actually incorrect. The OP is asking about milliseconds, but the answer is for microseconds. Unfortunately, Python doesn't have a directive for milliseconds, just microseconds (see doc), but you can work around it by appending three zeros to the end of the string and parsing it as microseconds, something like:
datetime.strptime(time_str + '000', '%d/%m/%y %H:%M:%S.%f')
where time_str is formatted like 30/03/09 16:31:32.123.
Hope this helps.
My first thought was to try passing it '30/03/09 16:31:32.123' (with a period instead of a colon between the seconds and the milliseconds.) But that didn't work. A quick glance at the docs indicates that fractional seconds are ignored in any case...
Ah, version differences. This was reported as a bug and now in 2.6+ you can use "%S.%f" to parse it.
From the Python mailing lists: parsing millisecond thread. There is a function posted there that seems to get the job done, although, as mentioned in the author's comments, it is kind of a hack. It uses regular expressions to handle the exception that gets raised, and then does some calculations.
You could also try doing the regular expressions and calculations up front, before passing the string to strptime.
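A minimal sketch of that "up front" variant is shown below. The helper name parse_with_fraction and its regular expression are assumptions for illustration; it expects the fraction to follow a ':SS' seconds field and accepts both '.' (isoformat style) and ',' (logging style) as the separator.

```python
import re
from datetime import datetime

# Everything up to ":SS", then "." or ",", then 1 to 6 fraction digits.
_FRACTION = re.compile(r"^(.*:\d{2})[.,](\d{1,6})$")


def parse_with_fraction(text, fmt):
    """Parse text with fmt, where fmt describes everything except
    an optional trailing fraction of a second (hypothetical helper)."""
    match = _FRACTION.match(text)
    if match is None:
        return datetime.strptime(text, fmt)
    head, digits = match.groups()
    # Pad on the right so ".5" means 500000 microseconds, not 5.
    microsecond = int(digits.ljust(6, "0"))
    return datetime.strptime(head, fmt).replace(microsecond=microsecond)


print(parse_with_fraction('30/03/09 16:31:32.123', '%d/%m/%y %H:%M:%S'))
print(parse_with_fraction('2009-03-30 16:31:32,5', '%Y-%m-%d %H:%M:%S'))
```

Because the fraction is split off before strptime runs, no exception message has to be inspected.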
For Python 2 I did this:
print ( time.strftime("%H:%M:%S", time.localtime(time.time())) + "." + str(time.time()).split(".",1)[1])
It prints the time as "%H:%M:%S", then splits str(time.time()) at the dot into two substrings (xxxxxxx.xx). The part after the dot is the fractional seconds, so I append it to the "%H:%M:%S" string.
Hope that makes sense :)
Example output:
13:31:21.72
Blink 01
13:31:21.81
END OF BLINK 01
13:31:26.3
Blink 01
13:31:26.39
END OF BLINK 01
13:31:34.65
Starting Lane 01
I have a string lfile with a datetime in it (type(lfile) gives <type 'str'>) and a Python datetime object wfile. Here is the code:
import os, datetime
lfile = '2005-08-22_11:05:45.000000000'
time_w = os.path.getmtime('{}\\{}.py'.format(r'C:\Temp_Readouts\RtFyar', 'TempReads.csv'))
wfile = datetime.datetime.fromtimestamp(time_w)
wfile contains this 2006-11-30 19:08:06.531328 and repr(wfile) gives:
datetime.datetime(2006, 11, 30, 19, 8, 6, 531328)
Problem:
I need to:
1. convert lfile into a Python datetime object
2. compare lfile to wfile and determine which datetime is more recent
For 1.:
I am only able to get a partial solution using strptime as per here. Here is what I tried:
lfile = datetime.datetime.strptime(linx_file_dtime, '%Y-%m-%d_%H:%M:%S')
The output is:
ValueError: unconverted data remains: .000
Question 1
It seems that strptime() cannot handle nanoseconds. How do I tell strptime() to ignore the last 3 zeros?
For 2.:
When I use type(wfile) I get <type 'datetime.datetime'>. If both wfile and lfile are Python datetime objects (i.e. if step 1. is successful), then would this work?:
if wtime < ltime:
    print 'Linux file created after Windows file'
else:
    print 'Windows file created after Linux file'
Question 2
Or is there some other way in which Python can compare datetime objects to determine which of the two occurred after the other?
Question 1
Python handles microseconds, not nanoseconds. You can strip the last three characters of the time string to reduce it to microsecond precision and then add .%f to the end of the format:
lfile = datetime.datetime.strptime(linx_file_dtime[:-3], '%Y-%m-%d_%H:%M:%S.%f')
Question 2
Yes, comparison works:
if wtime < ltime:
    ...
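Both steps can be put together in a short sketch. The helper name parse_nanosecond_stamp is made up for illustration; instead of assuming exactly nine fractional digits, it keeps at most the first six. The wtime value is the wfile timestamp quoted in the question.

```python
from datetime import datetime


def parse_nanosecond_stamp(stamp, fmt="%Y-%m-%d_%H:%M:%S"):
    """Parse a stamp like '2005-08-22_11:05:45.000000000', dropping
    any digits beyond microsecond precision (hypothetical helper)."""
    head, dot, fraction = stamp.partition(".")
    if not dot:
        return datetime.strptime(stamp, fmt)
    return datetime.strptime(head + "." + fraction[:6], fmt + ".%f")


ltime = parse_nanosecond_stamp("2005-08-22_11:05:45.000000000")
wtime = datetime(2006, 11, 30, 19, 8, 6, 531328)  # wfile from the question

# datetime objects support <, >, == directly; max() picks the later one.
if wtime > ltime:
    print("Windows file is more recent")
else:
    print("Linux file is more recent")
```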
That's right, strptime() does not handle nanoseconds. The accepted answer in the question that you linked to offers an option: strip off the last 3 digits and then parse with .%f appended to the format string.
Another option is to use dateutil.parser.parse():
>>> from dateutil.parser import parse
>>> parse('2005-08-22_11:05:45.123456789', fuzzy=True)
datetime.datetime(2005, 8, 22, 11, 5, 45, 123456)
fuzzy=True is required to overlook the unsupported underscore between date and time components. Because datetime objects do not support nanoseconds, the last 3 digits vanish, leaving microsecond accuracy.
Using the pysnmp framework, I get some values by doing an SNMP walk. Unfortunately, for the OID
1.3.6.1.21.69.1.5.8.1.2 (DOCS-CABLE-DEVICE-MIB)
I get a weird result which I can't print correctly here, since it contains ASCII control characters like BEL and ACK.
When doing a repr I get:
OctetString('\x07\xd8\t\x17\x03\x184\x00')
But the output should look like:
2008-9-23,3:24:52.0
The format is called "DateAndTime". How can I translate the OctetString output to a human-readable date/time?
You can find the format specification here.
A date-time specification.
field  octets  contents                  range
-----  ------  --------                  -----
  1      1-2   year*                     0..65536
  2       3    month                     1..12
  3       4    day                       1..31
  4       5    hour                      0..23
  5       6    minutes                   0..59
  6       7    seconds                   0..60
               (use 60 for leap-second)
  7       8    deci-seconds              0..9
  8       9    direction from UTC        '+' / '-'
  9      10    hours from UTC*           0..13
 10      11    minutes from UTC          0..59

* Notes:
- the value of year is in network-byte order
- daylight saving time in New Zealand is +13

For example, Tuesday May 26, 1992 at 1:30:15 PM EDT would be displayed as:

    1992-5-26,13:30:15.0,-4:0

Note that if only local time is known, then timezone
information (fields 8-10) is not present.
In order to decode your sample data you can use this quick-and-dirty one-liner:
>>> import struct, datetime
>>> s = b'\x07\xd8\t\x17\x03\x184\x00'
>>> datetime.datetime(*struct.unpack('>HBBBBBB', s))
datetime.datetime(2008, 9, 23, 3, 24, 52)
The example above is far from perfect: it does not account for size (this object has variable size) and is missing timezone information. Also note that field 7 is deci-seconds (0..9), while the seventh argument of datetime.datetime is microseconds (0 <= x < 1000000); a correct implementation is left as an exercise for the reader.
[update]
8 years later, let's try to fix this answer (am I lazy or what?):
import struct, pytz, datetime
def decode_snmp_date(octetstr: bytes) -> datetime.datetime:
    size = len(octetstr)
    if size == 8:
        (year, month, day, hour, minutes,
         seconds, deci_seconds,
         ) = struct.unpack('>HBBBBBB', octetstr)
        return datetime.datetime(
            year, month, day, hour, minutes, seconds,
            deci_seconds * 100_000, tzinfo=pytz.utc)
    elif size == 11:
        (year, month, day, hour, minutes,
         seconds, deci_seconds, direction,
         hours_from_utc, minutes_from_utc,
         ) = struct.unpack('>HBBBBBBcBB', octetstr)
        offset = datetime.timedelta(
            hours=hours_from_utc, minutes=minutes_from_utc)
        if direction == b'-':
            offset = -offset
        # The fields hold local time; subtract the offset to get UTC.
        return datetime.datetime(
            year, month, day, hour, minutes, seconds,
            deci_seconds * 100_000, tzinfo=pytz.utc) - offset
    raise ValueError("The provided OCTETSTR is not a valid SNMP date")
I'm not sure I got the timezone offset right but I don't have sample data to test, feel free to amend the answer or ping me in the comments.
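For comparison, here is a sketch that avoids pytz and keeps the offset on the result through the standard library's datetime.timezone. The function name decode_date_and_time is made up for this example. It assumes, following the specification quoted above, that the date fields are local time and that the 8-octet form carries no zone, so that case is returned as a naive datetime. The 11-octet test value is built from the specification's own 1992 example.

```python
import struct
from datetime import datetime, timedelta, timezone


def decode_date_and_time(octets: bytes) -> datetime:
    """Decode an SNMP DateAndTime value of 8 or 11 octets (sketch)."""
    if len(octets) == 8:
        (year, month, day, hour, minute,
         second, deci) = struct.unpack(">HBBBBBB", octets)
        tz = None  # only local time is known: return a naive datetime
    elif len(octets) == 11:
        (year, month, day, hour, minute, second, deci,
         direction, off_hours, off_minutes) = struct.unpack(
            ">HBBBBBBcBB", octets)
        offset = timedelta(hours=off_hours, minutes=off_minutes)
        if direction == b"-":
            offset = -offset
        tz = timezone(offset)
    else:
        raise ValueError("DateAndTime must be 8 or 11 octets long")
    # Note: a leap second (seconds == 60) is rejected by datetime.
    return datetime(year, month, day, hour, minute, second,
                    deci * 100_000, tzinfo=tz)


sample = b"\x07\xd8\t\x17\x03\x184\x00"          # value from the question
print(decode_date_and_time(sample))               # 2008-09-23 03:24:52

# 1992-5-26,13:30:15.0,-4:0 from the specification text
rfc_example = struct.pack(">HBBBBBBcBB", 1992, 5, 26, 13, 30, 15, 0, b"-", 4, 0)
print(decode_date_and_time(rfc_example).astimezone(timezone.utc))
```

Keeping the offset in tzinfo means the local wall-clock fields are preserved, and astimezone(timezone.utc) converts to UTC when needed.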
@Paulo Scardine: This was the best answer I found online when working to resolve a very similar problem. It still took me a little while to resolve my issue even with this answer, so I wanted to post a follow-up answer that may add more clarity (specifically on the issue of the date having different length options).
The following piece of code connects to a server and grabs the system time and then outputs it as a string to illustrate the method.
import netsnmp
import struct
oid = netsnmp.Varbind('hrSystemDate.0')
resp = netsnmp.snmpget(oid, Version=1, DestHost='<ip>', Community='public')
oct = str(resp[0])
# hrSystemDate can be either 8 or 11 units in length.
oct_len = len(oct)
fmt_mapping = dict({8:'>HBBBBBB', 11:'>HBBBBBBcBB'})
if oct_len == 8 or oct_len == 11:
    t = struct.unpack(fmt_mapping[oct_len], oct)
    print 'date tuple: %s' % (repr(t))
else:
    print 'invalid date format'
I hope this helps other people who are having similar issues trying to work with this type of data.
Shameless plug here: The Pycopia SNMP and SMI modules correctly handle this object, and others as well.
Pycopia is installed from source, and don't forget the mibs file if you try it.
I need to convert a string to a datetime object, along with the fractional seconds. I'm running into various problems.
Normally, I would do:
>>> datetime.datetime.strptime(val, "%Y-%m-%dT%H:%M:%S.%f")
But errors and old docs showed me that python2.5's strptime does not have %f...
Investigating further, it seems that the App Engine's data store does not like fractional seconds. Upon editing a datastore entity, trying to add .5 to the datetime field gave me the following error:
ValueError: unconverted data remains: .5
I doubt that fractional seconds are unsupported... so this is just a limitation of the datastore viewer, right?
Has anyone circumvented this issue? I want to use the native datetime objects... I'd rather not store UNIX timestamps...
Thanks!
EDIT: Thanks to Jacob Oscarson for the .replace(...) tip!
One thing to keep in mind is to check the length of frag before feeding it in. Different sources use different precision for seconds.
Here's a quick function for those looking for something similar:
def strptime(val):
    if '.' not in val:
        return datetime.datetime.strptime(val, "%Y-%m-%dT%H:%M:%S")
    nofrag, frag = val.split(".")
    date = datetime.datetime.strptime(nofrag, "%Y-%m-%dT%H:%M:%S")
    frag = frag[:6]  # truncate to microseconds
    frag += (6 - len(frag)) * '0'  # add 0s
    return date.replace(microsecond=int(frag))
Parsing
Without %f support in datetime.datetime.strptime(), you can still get the value into a datetime.datetime object fairly easily (randomly picking a value for your val here) using datetime.datetime.replace(); tested on 2.5.5:
>>> val = '2010-08-06T10:00:14.143896'
>>> nofrag, frag = val.split('.')
>>> nofrag_dt = datetime.datetime.strptime(nofrag, "%Y-%m-%dT%H:%M:%S")
>>> dt = nofrag_dt.replace(microsecond=int(frag))
>>> dt
datetime.datetime(2010, 8, 6, 10, 0, 14, 143896)
Now you have your datetime.datetime object.
Storing
Reading further into http://code.google.com/appengine/docs/python/datastore/typesandpropertyclasses.html#datetime
I can see no mention that fractions aren't supported, so yes, it's probably only the datastore viewer. The docs point directly to Python 2.5.2's module docs for datetime, which does support fractions, just not the %f parsing directive for strptime. Querying for fractions might be trickier, though.
All ancient history by now, but in these modern times you can also conveniently use dateutil
from dateutil import parser as DUp
funky_time_str = "1/1/2011 12:51:00.0123 AM"
foo = DUp.parse(funky_time_str)
print(foo.timetuple())
# time.struct_time(tm_year=2011, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=51, tm_sec=0, tm_wday=5, tm_yday=1, tm_isdst=-1)
print(foo.microsecond)
# 12300
print(foo)
# 2011-01-01 00:51:00.012300
dateutil supports a surprising variety of possible input formats, which it parses without pattern strings.