Parsing hh:mm in Python


Sometimes I get a string like "02:40" indicating 2 hours and 40 minutes. I'd like to parse that string into the number of minutes (160 in this case) using Python.
Sure, I can parse the string and multiply the hours by 60, but is there something in the standard lib that does this?

Personally, I think simply parsing the string is far easier to read:
>>> s = '02:40'
>>> int(s[:-3]) * 60 + int(s[-2:])
160
Note that using negative indexing means it will handle strings without the leading zero on the hour:
>>> s = '2:40'
>>> int(s[:-3]) * 60 + int(s[-2:])
160
You could also use the split() function:
>>> hours, minutes = s.split(':')
>>> int(hours) * 60 + int(minutes)
160
Or use the map() function to convert the pieces to integers:
>>> hours, minutes = map(int, s.split(':'))
>>> hours * 60 + minutes
160
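For reuse, the split-based variant above can be wrapped in a small function. This is only a minimal sketch, not part of the original answer: the name hhmm_to_minutes and the range check on the minutes are assumptions made for illustration.

```python
def hhmm_to_minutes(text):
    """Convert an 'H:MM' or 'HH:MM' duration string to total minutes."""
    hours, minutes = map(int, text.split(":"))
    # Assumption: minutes outside 0-59 indicate malformed input.
    if not 0 <= minutes < 60:
        raise ValueError("minutes out of range in %r" % text)
    return hours * 60 + minutes


print(hhmm_to_minutes("02:40"))  # 160
print(hhmm_to_minutes("2:40"))   # 160
```

Because the hours are not range-checked, durations longer than a day (for example "100:05") are accepted as well.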
Speed
The timeit module indicates that the slicing approach is also faster than the other methods proposed here:
>>> import timeit
>>> parsetime = timeit.timeit("mins = int(s[:-3]) * 60 + int(s[-2:])", "s='02:40'", number=100000) / 100000
>>> parsetime
9.018449783325196e-06
The split() method is a bit slower:
>>> splittime = timeit.timeit("hours,minutes = s.split(':'); mins=int(hours)*60 + int(minutes)", "s='02:40'", number=100000)/100000
>>> splittime
1.1217889785766602e-05
>>> splittime/parsetime
1.2438822697120402
And using map() a bit slower again:
>>> splitmaptime = timeit.timeit("hours,minutes = map(int, s.split(':')); mins=hours*60 + minutes", "s='02:40'", number=100000)/100000
>>> splitmaptime
1.3971350193023682e-05
>>> splitmaptime/parsetime
1.5491964282881776
John Machin's map and sum is about 2.4 times slower:
>>> summaptime = timeit.timeit('mins=sum(map(lambda x, y: x * y, map(int, "2:40".split(":")), [60, 1]))', "s='02:40'", number=100000) / 100000
>>> summaptime
2.1276121139526366e-05
>>> summaptime/parsetime
2.43
Chrono Kitsune's strptime()-based answer is ten times slower:
>>> strp = timeit.timeit("t=time.strptime(s, '%H:%M');mins=t.tm_hour * 60 + t.tm_min", "import time; s='02:40'", number=100000)/100000
>>> strp
9.0362770557403569e-05
>>> strp/parsetime
10.019767557444432

Other than the following, string parsing (or, if you want to be even slower for something so simple, the re module) is the only way I can think of if you rely on the standard library. timedelta doesn't seem to suit the task.
>>> import time
>>> x = "02:40"
>>> t = time.strptime(x, "%H:%M")
>>> minutes = t.tm_hour * 60 + t.tm_min
>>> minutes
160

See http://webcache.googleusercontent.com/search?q=cache:EAuL4vECPBEJ:docs.python.org/library/datetime.html+python+datetime&hl=en&client=firefox-a&gl=us&strip=1 since the main Python site is having problems.
The function you want is datetime.strptime or time.strptime, which create either a datetime or time object from a string with a time and another string describing the format.
If you want to not have to describe the format, use dateutil, http://labix.org/python-dateutil.
>>> from dateutil.parser import parse
>>> d = parse('2009/05/13 19:19:30 -0400')
>>> d
datetime.datetime(2009, 5, 13, 19, 19, 30, tzinfo=tzoffset(None, -14400))
See How to parse dates with -0400 timezone string in python?

>>> sum(map(lambda x, y: x * y, map(int, "2:40".split(":")), [60, 1]))
160

I'm sure you can represent the given time as a timedelta object. From there, I am sure there is an easy way to express the timedelta in minutes.
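A rough sketch of that idea follows. timedelta has no string parser of its own, so the split is still needed; the variable names are illustrative only.

```python
from datetime import timedelta

text = "02:40"
hours, minutes = map(int, text.split(":"))
duration = timedelta(hours=hours, minutes=minutes)

# timedelta has no "minutes" attribute; floor-dividing by a
# one-minute timedelta yields the whole number of minutes.
total_minutes = duration // timedelta(minutes=1)
print(total_minutes)  # 160
```

This is arguably more ceremony than the plain arithmetic, but it pays off if the value is later added to or subtracted from other durations.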

There is:
from time import strptime
from calendar import timegm
T = '02:40'
t = timegm(strptime('19700101'+T,'%Y%m%d%H:%M'))
print t  # 9600 seconds; divide by 60 to get minutes
But is this really better than plain arithmetic?
An exotic solution that doesn't need any imports:
T = '02:40'
exec('x = %s' % T.replace(':','*60+'))
print x
edit: corrected second solution to obtain minutes, not seconds
Simplest solution
T = '02:40'
print int(T[0:2])*60 + int(T[3:])

Related

How to extract time and store in an Array/Structure

I want to measure the time difference between two time readings in a Python script.
time_stamp_1 = datetime.datetime.now()
# some task
time_stamp_2 = datetime.datetime.now()
time_stamp = time_stamp_2 - time_stamp_1
time_stamp_sec = ??
I got the result as 0:00:04.052000.
What exactly is this time format I am getting?
How to extract the time in seconds from time_stamp?
How to store time_stamp in an array like variable (as I want to use a loop and store the time_stamp for each loop)
How to store time_stamp_sec in an array like variable

1) It's a datetime.timedelta object which is being printed nicely:
>>> import datetime
>>> datetime.timedelta(days=3)
datetime.timedelta(3)
>>> print(datetime.timedelta(days=3))
3 days, 0:00:00
>>> print(datetime.timedelta(hours=3))
3:00:00
2) You can use total_seconds():
>>> t = datetime.timedelta(hours=3)
>>> t.total_seconds()
10800.0
This gives you a float as the datetime.timedelta may represent an amount of time which isn't an exact number of seconds.
3) You can append it to a list at each step in the loop:
>>> d = []
>>> d.append(t)
>>> d
[datetime.timedelta(0, 10800)]
4) You can do the same with the result of total_seconds():
>>> e = []
>>> e.append(t.total_seconds())
>>> e
[10800.0]
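Putting points 3 and 4 together for the loop described in the question might look like the sketch below. The time.sleep call is just a stand-in for "some task", and the list names are made up for the example.

```python
import datetime
import time

durations = []      # timedelta objects, one per iteration
durations_sec = []  # the same durations as floats, in seconds

for _ in range(3):
    start = datetime.datetime.now()
    time.sleep(0.01)  # stand-in for "some task"
    elapsed = datetime.datetime.now() - start
    durations.append(elapsed)
    durations_sec.append(elapsed.total_seconds())

print(durations_sec)
```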

Most efficient way to convert string-time to time

I have to run about a million operations to do:
"Runtime": "01:12:00" --> datetime.time(1,12)
What would be the most performant way to do this? Right now I'm just doing a split on the colons, and then calling datetime.time(...):
s = '01:12:00'
h,m,s = [int(i) for i in s.split(':')]
st = datetime.time(hour=h, minute=m, second=s)

Using the timeit module you could test different implementations yourself:
import datetime
import re
PAT = re.compile(r'(\d{2}):(\d{2}):(\d{2})')
TSTR = "01:12:00"
def fun1():
    dt = datetime.datetime.strptime(TSTR, "%H:%M:%S")
    return dt
def fun2():
    h,m,s = [int(i) for i in TSTR.split(':')]
    dt = datetime.time(hour=h, minute=m, second=s)
    return dt
def fun3():
    mat = PAT.match(TSTR)
    dt = datetime.time(hour=int(mat.group(1)), minute=int(mat.group(2)), second=int(mat.group(3)))
    return dt
def fun4():
    h,m,s = int(TSTR[0:2]), int(TSTR[3:5]), int(TSTR[6:8])
    dt = datetime.time(hour=h, minute=m, second=s)
    return dt
if __name__ == "__main__":
    import timeit
    # Use the default repeat arguments: repeat=3, number=1000000
    print(min(timeit.repeat("fun1()", setup="from __main__ import fun1"))) # 15.5739
    print(min(timeit.repeat("fun2()", setup="from __main__ import fun2"))) # 3.4544
    print(min(timeit.repeat("fun3()", setup="from __main__ import fun3"))) # 4.1829
    print(min(timeit.repeat("fun4()", setup="from __main__ import fun4"))) # 2.8675
The fastest approach is in fun4. Your split method is next, followed closely (surprisingly, imo) by the regex approach, and straggling far behind is the strptime method.

In [48]: s = '"Runtime": "01:12:00"'
In [49]: dt.strptime(s, '"Runtime": "%H:%M:%S"')
Out[49]: datetime.datetime(1900, 1, 1, 1, 12)

>>> import time
>>> a='01:12:00'
>>> b=time.strptime(a,'%H:%M:%S') # use %I instead of %H if you use 12-hour clock
>>> b
time.struct_time(tm_year=1900, tm_mon=1, tm_mday=1, tm_hour=1, tm_min=12, tm_sec=0, tm_wday=0, tm_yday=1, tm_isdst=-1)
Then use b.tm_hour, b.tm_min and b.tm_sec to get hours, minutes and seconds.
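Since the question asks for a datetime.time, those three fields can be passed straight to its constructor. A minimal sketch, with runtime as an illustrative name:

```python
import datetime
import time

b = time.strptime("01:12:00", "%H:%M:%S")
# struct_time exposes its fields as attributes; hand them to datetime.time.
runtime = datetime.time(b.tm_hour, b.tm_min, b.tm_sec)
print(runtime)  # 01:12:00
```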

I analyzed the performance of the regex method, the string.split to array method, and OP's method
It appears that split to array is faster than regex by about 38% and faster than OP's method by about 15%.
import time
import re
import datetime
timestring = "01:12:00"
# regex method
beforeMillis = int(round(time.time() * 1000))
for i in range(10000):
    result = re.search(r"(\d{2}):(\d{2}):(\d{2})", timestring).groups()
    theTime = datetime.time(int(result[0]), int(result[1]), int(result[2]))
afterMillis = int(round(time.time() * 1000))
print("Using Regex: " + str(afterMillis - beforeMillis) + "ms")
# STRING.split method, stored temporarily in array
beforeMillis = int(round(time.time() * 1000))
for i in range(10000):
    result = timestring.split(":")
    theTime = datetime.time(int(result[0]), int(result[1]), int(result[2]))
afterMillis = int(round(time.time() * 1000))
print("Using Split: " + str(afterMillis - beforeMillis) + "ms")
# STRING.split method, stored temporarily in three variables
beforeMillis = int(round(time.time() * 1000))
for i in range(10000):
    h,m,s = [int(i) for i in timestring.split(':')]
    theTime = datetime.time(hour=h, minute=m, second=s)
afterMillis = int(round(time.time() * 1000))
print("Using Split with 3 Variables: " + str(afterMillis - beforeMillis) + "ms")
Output:
$ python test.py
Using Regex: 52ms
Using Split: 34ms
Using Split with 3 Variables: 44ms
I don't think you'll find a much faster method than storing the split string in an array.
Storing the array temporarily is (a bit) faster than in three variables for a good reason: No further memory has to be used, and the compiler can probably optimize this easier.
All other answers (except for the one recommending regex) also fail to use datetime.time.
I recommend that you don't use the built-in time object for this purpose as it represents a unix time (seconds since Jan 1 1970), not a time of day.

Print Difference Between Time in Ms

I am reading a log file in my python script, and I have got a list of tuples of startTimes and endTimes -
('[19:49:40:680]', '[19:49:49:128]')
('[11:29:10:837]', '[11:29:15:698]')
('[11:30:18:291]', '[11:30:21:025]')
('[11:37:44:293]', '[11:38:02:008]')
('[11:39:14:897]', '[11:39:21:572]')
('[11:42:19:968]', '[11:42:22:036]')
('[11:43:18:887]', '[11:43:19:633]')
('[11:44:26:533]', '[11:49:29:274]')
('[11:55:03:974]', '[11:55:06:372]')
('[11:56:14:096]', '[11:56:14:493]')
('[11:57:08:372]', '[11:57:08:767]')
('[11:59:26:201]', '[11:59:27:438]')
How can I take a difference of the times in milliseconds?

>>> import datetime
>>> a = ('[19:49:40:680]', '[19:49:49:128]')
>>> start = datetime.datetime.strptime(a[0][:-1]+"000", "[%H:%M:%S:%f")
>>> end = datetime.datetime.strptime(a[1][:-1]+"000", "[%H:%M:%S:%f")
>>> delta = end-start
>>> ms = delta.seconds*1000 + delta.microseconds/1000
>>> ms
8448.0
This even works if the clock wraps around at midnight:
>>> a = ('[23:59:59:000]','[00:00:01:000]')
>>> # <snip> see above
>>> ms = delta.seconds*1000 + delta.microseconds/1000
>>> ms
2000.0

You can try the datetime module. (http://docs.python.org/library/datetime.html)
First parse the times with strptime. (http://docs.python.org/library/datetime.html#strftime-strptime-behavior)
Then subtract them, which should give you a timedelta object (http://docs.python.org/library/datetime.html#datetime.timedelta) from which you can get your milliseconds.
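The steps described above could be sketched as follows for Python 3. The function name elapsed_ms is hypothetical, and treating a negative difference as a wrap past midnight is an assumption about the log format, not something the question states.

```python
from datetime import datetime, timedelta


def elapsed_ms(start_text, end_text):
    """Milliseconds between two '[HH:MM:SS:mmm]' log timestamps."""
    # %f accepts one to six digits and zero-pads on the right,
    # so '680' is read as 680000 microseconds.
    fmt = "[%H:%M:%S:%f]"
    start = datetime.strptime(start_text, fmt)
    end = datetime.strptime(end_text, fmt)
    delta = end - start
    # Assumption: an end time earlier than the start means the clock
    # wrapped around midnight once.
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta / timedelta(milliseconds=1)


print(elapsed_ms("[19:49:40:680]", "[19:49:49:128]"))  # 8448.0
```

Dividing by a one-millisecond timedelta avoids combining the seconds and microseconds fields by hand.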

I thought it would be fun to see if this could be done in a oneliner. And yes, it can (split out for a faint attempt at readability):
interval = ('[19:49:40:680]', '[19:49:49:128]')
import datetime
(lambda td:
(td.microseconds + (td.seconds + td.days * 24 * 3600) * 10**6) / 10**3)\
(reduce(
lambda a, b: b - a,
[datetime.datetime.strptime(t[1:-1] + '000', '%H:%M:%S:%f')
for t in interval]))
This is Python 2.6. In 2.7 it can be shortened using timedelta.total_seconds(). In Python 3, the reduce() function must be imported from somewhere.

Adding up time durations in Python

I would like to add up a series of splits in Python. The times begin as strings like "00:08:30.291". I can't seem to find the right way to use the Python objects or API to make this convenient/elegant. It seems that the time object doesn't use microseconds, so I'm using datetime's strptime to parse the strings, successfully. But then datetimes don't seem to add, and I really prefer not to overflow into days (i.e. 23 + 2 hours = 25 hours). I can use datetime.time but they don't add either. Timedeltas would seem appropriate but seem a little awkward to convert from/to other things. Perhaps I am missing something obvious here. I would like to be able to:
for timestring in times:
    t = datetime.strptime("%H:%M:%S.%f", timestring).time
    total_duration = total_duration + t
print total_duration.strftime("%H:%M:%S.%f")

What you're working with is time differences, which is why datetime.timedelta is the appropriate type here:
>>> import datetime
>>> d1 = datetime.datetime.strptime("00:08:30.291", "%H:%M:%S.%f")
>>> d1
datetime.datetime(1900, 1, 1, 0, 8, 30, 291000)
>>> d2 = datetime.datetime.strptime("00:02:30.291", "%H:%M:%S.%f")
>>> d2
datetime.datetime(1900, 1, 1, 0, 2, 30, 291000)
>>> dt1 = datetime.timedelta(minutes=d1.minute, seconds=d1.second, microseconds=d1.microsecond)
>>> dt2 = datetime.timedelta(minutes=d2.minute, seconds=d2.second, microseconds=d2.microsecond)
>>> fin = dt1 + dt2
>>> fin
datetime.timedelta(0, 660, 582000)
>>> str(fin)
'0:11:00.582000'
Also, please don't use names such as sum for your variables; you'd be shadowing a built-in.
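To add up a whole series of splits without the total rolling over into days, the same timedelta idea can be wrapped in two small helpers. This is only a sketch for Python 3.6+. The names parse_duration and format_duration are made up for the example, and it skips strptime so that hour values above 23 are accepted.

```python
from datetime import timedelta


def parse_duration(text):
    """Parse 'HH:MM:SS.fff' into a timedelta (hours may exceed 23)."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=float(seconds))


def format_duration(delta):
    """Format a non-negative timedelta as 'HH:MM:SS.mmm' without a days part."""
    total_ms = delta // timedelta(milliseconds=1)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


splits = ["00:08:30.291", "00:02:30.291"]
# sum() needs a timedelta start value, otherwise it starts from int 0.
total = sum((parse_duration(s) for s in splits), timedelta())
print(format_duration(total))  # 00:11:00.582
```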

import numpy as np
# read file with one duration per line
with open('clean_times.txt', 'r') as f:
    x = f.read()
# Convert string to list of '00:02:12.31'
# I had to drop last item (empty string)
tmp = x.split('\n')[:-1]
# get list of ['00', 02, '12.31']
tmp = [i.split(':') for i in tmp.copy()]
# create numpy array with floats
# (np.float was removed in NumPy 1.24; the built-in float is equivalent)
np_tmp = np.array(tmp, dtype=float)
# sum via columns and divide
# hours/24 minutes/60 milliseconds/1000
# X will be a float array [days, hours, seconds]
# Something like `array([ 0. , 15.68333333, 7.4189 ])`
X = np_tmp.sum(axis=0) / np.array([24, 60, 1000])
I was happy with this, but if you need a formatted string like '15:41:07.518' as output, continue reading:
# X will be a float array [hours, hours, seconds]
X = np_tmp.sum(axis=0) / np.array([1, 60, 1000])
# ugly part
# Hours are integer parts
H = int(X[0]) + int(X[1])
# Minutes are hour fractional part and integer minutes part
tmp_M = (X[0] % 1 + X[1] % 1) * 60
M = int(tmp_M)
# Seconds are minutes fractional part and integer seconds part
tmp_S = tmp_M % 1 * 60 + X[2]
S = int(tmp_S)
# Milliseconds are seconds fractional part
MS = int(tmp_S % 1 * 1000)
# merge string for output
# Something like '15:41:07.518'
result = f'{H:02}:{M:02}:{S:02}.{MS:03}'

Convert seconds to hh:mm:ss in Python [duplicate]

How do I convert an int (number of seconds) to the formats mm:ss or hh:mm:ss?
I need to do this with Python code (and if possible in a Django template).

I can't believe any of the many answers gives what I'd consider the "one obvious way to do it" (and I'm not even Dutch...!-) -- up to just below 24 hours' worth of seconds (86399 seconds, specifically):
>>> import time
>>> time.strftime('%H:%M:%S', time.gmtime(12345))
'03:25:45'
Doing it in a Django template's more finicky, since the time filter supports a funky time-formatting syntax (inspired, I believe, from PHP), and also needs the datetime module, and a timezone implementation such as pytz, to prep the data. For example:
>>> from django import template as tt
>>> import pytz
>>> import datetime
>>> tt.Template('{{ x|time:"H:i:s" }}').render(
... tt.Context({'x': datetime.datetime.fromtimestamp(12345, pytz.utc)}))
u'03:25:45'
Depending on your exact needs, it might be more convenient to define a custom filter for this formatting task in your app.

>>> import datetime
>>> a = datetime.timedelta(seconds=65)
>>> a
datetime.timedelta(0, 65)
>>> str(a)
'0:01:05'

Read up on the datetime module.
SilentGhost's answer has the details my answer leaves out and is reposted here:
>>> import datetime
>>> a = datetime.timedelta(seconds=65)
>>> a
datetime.timedelta(0, 65)
>>> str(a)
'0:01:05'

Code that does what was requested, with examples, and showing how cases he didn't specify are handled:
def format_seconds_to_hhmmss(seconds):
    hours = seconds // (60*60)
    seconds %= (60*60)
    minutes = seconds // 60
    seconds %= 60
    return "%02i:%02i:%02i" % (hours, minutes, seconds)
def format_seconds_to_mmss(seconds):
    minutes = seconds // 60
    seconds %= 60
    return "%02i:%02i" % (minutes, seconds)
minutes = 60
hours = 60*60
assert format_seconds_to_mmss(7*minutes + 30) == "07:30"
assert format_seconds_to_mmss(15*minutes + 30) == "15:30"
assert format_seconds_to_mmss(1000*minutes + 30) == "1000:30"
assert format_seconds_to_hhmmss(2*hours + 15*minutes + 30) == "02:15:30"
assert format_seconds_to_hhmmss(11*hours + 15*minutes + 30) == "11:15:30"
assert format_seconds_to_hhmmss(99*hours + 15*minutes + 30) == "99:15:30"
assert format_seconds_to_hhmmss(500*hours + 15*minutes + 30) == "500:15:30"
You can--and probably should--store this as a timedelta rather than an int, but that's a separate issue and timedelta doesn't actually make this particular task any easier.

You can calculate the number of minutes and hours from the number of seconds by simple division:
seconds = 12345
minutes = seconds // 60
hours = minutes // 60
print "%02d:%02d:%02d" % (hours, minutes % 60, seconds % 60)
print "%02d:%02d" % (minutes, seconds % 60)
Here // is Python's integer division.

If you use divmod, you are immune to different flavors of integer division:
# show time strings for 3800 seconds
# easy way to get mm:ss
print("%02d:%02d" % divmod(3800, 60))
# easy way to get hh:mm:ss
from functools import reduce
print("%02d:%02d:%02d" %
      reduce(lambda ll,b : divmod(ll[0],b) + ll[1:],
             [(3800,),60,60]))
# function to convert floating point number of seconds to
# hh:mm:ss.sss
def secondsToStr(t):
    return "%02d:%02d:%02d.%03d" % \
        reduce(lambda ll,b : divmod(ll[0],b) + ll[1:],
               [(round(t*1000),),1000,60,60])
print(secondsToStr(3800.123))
Prints:
63:20
01:03:20
01:03:20.123
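The reduce-based form is compact but takes a moment to decode. The same divmod technique written out step by step might look like this; the function name seconds_to_hhmmss is illustrative and not part of the answer above.

```python
def seconds_to_hhmmss(total_seconds):
    """Format a whole number of seconds as hh:mm:ss using two divmod calls."""
    minutes, seconds = divmod(total_seconds, 60)  # split off the seconds
    hours, minutes = divmod(minutes, 60)          # split off the minutes
    return "%02d:%02d:%02d" % (hours, minutes, seconds)


print(seconds_to_hhmmss(3800))   # 01:03:20
print(seconds_to_hhmmss(12345))  # 03:25:45
```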

Just be careful when dividing by 60 in Python 2: division between integers returns an integer, so
12/60 = 0 unless you use from __future__ import division.
The following is copy and pasted from Python 2.6.2:
IDLE 2.6.2
>>> 12/60
0
>>> from __future__ import division
>>> 12/60
0.20000000000000001

Not being a Python person, but the easiest without any libraries is just:
total = 3800
seconds = total % 60
total = total - seconds
hours = total // 3600
total = total - (hours * 3600)
mins = total // 60

If you need to do this a lot, you can precalculate all possible strings for number of seconds in a day:
try:
    from itertools import product
except ImportError:
    def product(*seqs):
        if len(seqs) == 1:
            for p in seqs[0]:
                yield p,
        else:
            for s in seqs[0]:
                for p in product(*seqs[1:]):
                    yield (s,) + p
hhmmss = []
for (h, m, s) in product(range(24), range(60), range(60)):
    hhmmss.append("%02d:%02d:%02d" % (h, m, s))
Now conversion of seconds to format string is a fast indexed lookup:
print hhmmss[12345]
prints
'03:25:45'
EDIT:
Updated for 2020, removing the Py2 compatibility ugliness and using f-strings!
import sys
from itertools import product
hhmmss = [f"{h:02d}:{m:02d}:{s:02d}"
for h, m, s in product(range(24), range(60), range(60))]
# we can still just index into the list, but define as a function
# for common API with code below
seconds_to_str = hhmmss.__getitem__
print(seconds_to_str(12345))
How much memory does this take? sys.getsizeof of a list won't do, since it will just give us the size of the list and its str refs, but not include the memory of the strs themselves:
# how big is a list of 24*60*60 8-character strs?
list_size = sys.getsizeof(hhmmss) + sum(sys.getsizeof(s) for s in hhmmss)
print("{:,}".format(list_size))
prints:
5,657,616
What if we just had one big str? Every value is exactly 8 characters long, so we can slice into this str and get the correct str for second X of the day:
hhmmss_str = ''.join([f"{h:02d}:{m:02d}:{s:02d}"
for h, m, s in product(range(24),
range(60),
range(60))])
def seconds_to_str(n):
    loc = n * 8
    return hhmmss_str[loc: loc+8]
print(seconds_to_str(12345))
Did that save any space?
# how big is a str of 24*60*60*8 characters?
str_size = sys.getsizeof(hhmmss_str)
print("{:,}".format(str_size))
prints:
691,249
Reduced to about this much:
print(str_size / list_size)
prints:
0.12218026108523448
On the performance side, this looks like a classic memory vs. CPU tradeoff:
import timeit
print("\nindex into pre-calculated list")
print(timeit.timeit("hhmmss[6]", '''from itertools import product; hhmmss = [f"{h:02d}:{m:02d}:{s:02d}"
for h, m, s in product(range(24),
range(60),
range(60))]'''))
print("\nget slice from pre-calculated str")
print(timeit.timeit("hhmmss_str[6*8:7*8]", '''from itertools import product; hhmmss_str=''.join([f"{h:02d}:{m:02d}:{s:02d}"
for h, m, s in product(range(24),
range(60),
range(60))])'''))
print("\nuse datetime.timedelta from stdlib")
print(timeit.timeit("timedelta(seconds=6)", "from datetime import timedelta"))
print("\ninline compute of h, m, s using divmod")
print(timeit.timeit("n=6;m,s=divmod(n,60);h,m=divmod(m,60);f'{h:02d}:{m:02d}:{s:02d}'"))
On my machine I get:
index into pre-calculated list
0.0434853
get slice from pre-calculated str
0.1085147
use datetime.timedelta from stdlib
0.7625738
inline compute of h, m, s using divmod
2.0477764

Besides the fact that Python has built in support for dates and times (see bigmattyh's response), finding minutes or hours from seconds is easy:
minutes = seconds / 60
hours = minutes / 60
Now, when you want to display minutes or seconds, take them modulo 60 so that they are never larger than 59.
