Fast convert datetime string to seconds (Python3) - python

I'm trying to convert a huge number of time-series records to integer seconds like this:
seconds_time = int(time.mktime(time.strptime(parts[0], '%Y%m%d %H%M%S')))
Unfortunately, this line is the code's bottleneck (it increases the running time by a factor of about 20). Any suggestions to improve it?
Thanks in advance

Actually there's a way to drastically reduce parsing time.
import time
start = time.time()
nb_loops = 1000000
time_string = "20170101 201456"
for i in range(nb_loops):
    seconds_time = int(time.mktime(time.strptime(time_string, '%Y%m%d %H%M%S')))
print(time.time()-start)
That first loop runs in about 12 seconds. Not very good, I admit.
But since your format is simple, why not use integer conversion with slicing in a list comprehension, append values for the three remaining struct_time fields (weekday, day of year and the DST flag), and pass the result to mktime?
start = time.time()
for i in range(nb_loops):
    # The trailing 0, 0, -1 fill tm_wday, tm_yday and tm_isdst; -1 lets mktime decide about DST, as strptime does
    seconds_time = time.mktime(tuple([int(time_string[s:e]) for s,e in ((0,4),(4,6),(6,8),(9,11),(11,13),(13,15))]+[0,0,-1]))
print(time.time()-start)
That runs in about 3 seconds (it avoids parsing the '%Y%m%d %H%M%S' format string, which seems to take a while).
Using compiled regular expressions is slightly faster:
import re
r = re.compile("(....)(..)(..) (..)(..)(..)")
start = time.time()
for i in range(nb_loops):
    # (0, 0, -1) fills tm_wday, tm_yday and tm_isdst; -1 lets mktime decide about DST
    seconds_time = time.mktime(tuple(map(int,r.match(time_string).groups()))+(0,0,-1))
print(time.time()-start)
Results (in seconds):
basic 14.41410493850708
string slicing 3.1356000900268555
regex 2.8703999519348145
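Before swapping a fast parser into a real pipeline, it is worth checking that it returns the same value as the strptime version. The following is a minimal sketch of such a check; the function names to_seconds_slicing and to_seconds_strptime are made up for illustration, and the input is assumed to always have the fixed 'YYYYMMDD HHMMSS' layout.

```python
import time

FMT = "%Y%m%d %H%M%S"

def to_seconds_slicing(s):
    """Parse 'YYYYMMDD HHMMSS' by slicing; assumes the layout is always exactly this."""
    fields = (
        int(s[0:4]), int(s[4:6]), int(s[6:8]),       # year, month, day
        int(s[9:11]), int(s[11:13]), int(s[13:15]),  # hour, minute, second
        0, 0, -1,  # tm_wday and tm_yday are ignored by mktime; tm_isdst=-1 means "unknown"
    )
    return int(time.mktime(fields))

def to_seconds_strptime(s):
    """Reference implementation from the question."""
    return int(time.mktime(time.strptime(s, FMT)))

sample = "20170101 201456"
print(to_seconds_slicing(sample) == to_seconds_strptime(sample))
```

Both functions interpret the string as local time, so the absolute number depends on the machine's time zone; only the equality between the two is meaningful here.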

Related

Python convert string to datetime but formatting is not very predictable

I'm extracting the execution time of a Linux process using subprocess and ps. I'd like to put it in a datetime object, to perform datetime arithmetic. However, I'm a little concerned about the output ps returns for the execution time:
1-01:12:23 // 1 day, 1 hour, 12 minutes, 23 seconds
05:39:03 // 5 hours, 39 minutes, 3 seconds
15:06 // 15 minutes, 6 seconds
Notice there is no zero padding before the day. And it doesn't include months or years, whereas technically something could run for that long.
Consequently, I'm unsure what format string to use to convert it to a timedelta, because I don't want it to break if one process has run for months and another has only run for hours.
UPDATE
Mozway has given a very smart answer. However, I'm taking a step back and wondering if I can get the execution time another way. I'm currently using ps to get the time, but it means I also have the pid. Is there something else I can do with the pid, to get the execution time in a simpler format?
(Can only use official Python libraries)
UPDATE2
It's actually colons between the hours, mins and seconds.
You should use a timedelta.
Here is a suggestion on how to convert from your string:
import datetime
s = '1-01-12-23'
out = datetime.timedelta(**dict(zip(['days', 'hours', 'minutes', 'seconds'],
                                    map(int, s.split('-')))))
Output:
datetime.timedelta(days=1, seconds=4343)
If you can have more or fewer units, and assuming the smallest units are always present, you can take advantage of the fact that zip stops at the end of the shortest iterable. Just reverse the inputs:
s = '12-23'
units = ['days', 'hours', 'minutes', 'seconds']
out = datetime.timedelta(**dict(zip(reversed(units),
                                    map(int, reversed(s.split('-'))))))
Output:
datetime.timedelta(seconds=743)
As a function, using re.split to handle the 1-01:23:45 format:
import datetime
import re
def to_timedelta(s):
    units = ['days', 'hours', 'minutes', 'seconds']
    return datetime.timedelta(**dict(zip(reversed(units),
                                         map(int, reversed(re.split('[-:]', s))))))
to_timedelta('1-01:12:23')
# datetime.timedelta(days=1, seconds=4343)
to_timedelta('05:39:03')
# datetime.timedelta(seconds=20343)
to_timedelta('15:06')
# datetime.timedelta(seconds=906)
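On the follow-up question about getting the elapsed time in a simpler format: one possibility is to ask ps for the elapsed time in plain seconds. The sketch below assumes a procps-ng ps (the usual one on Linux) that supports the etimes output field; other ps implementations may not have it. The helper names parse_etimes and elapsed_for_pid are made up for illustration.

```python
import datetime
import subprocess

def parse_etimes(output):
    """Convert the text printed by `ps -o etimes=` (a number of seconds) to a timedelta."""
    return datetime.timedelta(seconds=int(output.strip()))

def elapsed_for_pid(pid):
    """Return how long the process `pid` has been running.

    Assumption: `ps` accepts `-o etimes=` (elapsed seconds, no header),
    which procps-ng provides; this is not guaranteed on every system.
    """
    result = subprocess.run(
        ["ps", "-o", "etimes=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    )
    return parse_etimes(result.stdout)

# Parsing alone, using a sample of what ps would print:
print(parse_etimes("   4343\n"))
```

With this approach there are no day, hour or minute fields to split, so the format string question goes away.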

How could I implement “HH:MM:SS” format

I am asked to do this:
Write a program that adds one second to a clock time, given its hours, minutes and seconds.
Input consists of three natural numbers h, m and s that represent a clock time, that is, such that h<24, m<60 and s<60.
This is the code I came up with:
from easyinput import read
h = read(int)
m = read(int)
s = read(int)
seconds = (s+1)%60
minutes = (m + (s+1)//60)%60
hours = (h + (m + (s+1)//60)//60)%24  # %24 wraps 23 59 59 around to 0 0 0
print(hours, minutes, seconds)
It works well: if I enter
13 59 59
it returns
14 0 0
I am sure it could be improved, but that's not the problem right now.
The problem is that I need the output format to be like this:
11:33:16
It should be “HH:MM:SS”, and I don't know how to do it.
Could anyone help me? Thanks :)
Use an f-string with format modifiers. 02d says "an int with field width 2 padded with 0."
print(f"{hours:02d}:{minutes:02d}:{seconds:02d}")
>>> hours = 13
>>> minutes = 3
>>> seconds = 5
>>> print(f"{hours:02d}:{minutes:02d}:{seconds:02d}")
13:03:05
>>>
Note that the d in the format specifiers is unnecessary. You could write:
print(f"{hours:02}:{minutes:02}:{seconds:02}")
Documentation on f-strings.
Usually, you don't want to do date and time calculations yourself, so a better approach is to use the standard library module that handles dates and times out of the box:
from datetime import datetime, timedelta
from easyinput import read
h, m, s = read(int), read(int), read(int)
time = datetime.now().replace(hour=h, minute=m, second=s)
time += timedelta(seconds=1)
print(time.strftime("%H:%M:%S"))
Another option is to spell out the right alignment together with the zero padding:
print(f'{hours:>02}:{minutes:>02}:{seconds:>02}')
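For comparison, here is a minimal sketch that combines the arithmetic and the formatting in one function without easyinput. The function name add_one_second is made up for illustration; it assumes h, m and s are already valid clock values and wraps around at midnight.

```python
def add_one_second(h, m, s):
    """Return the clock time one second after h:m:s, formatted as HH:MM:SS."""
    # Work in seconds since midnight; % 86400 wraps 23:59:59 around to 00:00:00.
    total = (h * 3600 + m * 60 + s + 1) % 86400
    h, rest = divmod(total, 3600)
    m, s = divmod(rest, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

print(add_one_second(13, 59, 59))  # 14:00:00
print(add_one_second(23, 59, 59))  # 00:00:00
```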

Python Datetime: Converting seconds to hours/minutes/seconds - but converting the day option into hours also?

Pretty basic question. I'm using a function that parses seconds from JSON and uses datetime in Python to output to hours/minutes/seconds, like so:
str(datetime.timedelta(seconds=seconds here))
This outputs something like so:
Timestamp: 23:54:02.513000
Timestamp: 1 day, 0:01:07.827000
It works perfectly, but I don't want datetime to print "1 day", I want hours only. So for example the second above should be something like 24:01:07.827000
I tried using my own custom function to convert the seconds, but I feel there must be an easier way.
According to the Python docs (https://docs.python.org/3/library/datetime.html#timedelta-objects):
Only days, seconds and microseconds are stored internally.
So you have to compute the hours and minutes yourself from days and seconds. The code below uses f-strings.
import datetime
t = datetime.timedelta(seconds=60*60*24 + 11569) # A random number for testing
print(t) # 1 day, 3:12:49
print(f'{t.days * 24 + t.seconds // 3600:02}:{(t.seconds % 3600) // 60:02}:{t.seconds % 60:02}') # 27:12:49
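The one-liner above drops the fractional seconds that appear in the question's sample output. A small sketch that keeps them could look like this; the name format_hours is made up for illustration, and the timedelta is assumed to be non-negative.

```python
from datetime import timedelta

def format_hours(td):
    """Format a non-negative timedelta as H:MM:SS.ffffff with days folded into hours."""
    total_seconds = td.days * 86400 + td.seconds
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{td.microseconds:06d}"

# 1 day, 0:01:07.827000 from the question
print(format_hours(timedelta(days=1, seconds=67, milliseconds=827)))  # 24:01:07.827000
```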

How can I format timedelta microseconds to 2 decimal digits?

Running this
import time
import datetime
timenow = time.time()
timedifference = time.time() - timenow
timedifference = datetime.timedelta( seconds=timedifference )
print( "%s" % timedifference )
I got this:
0:00:00.000004
How can I format the timedelta object so that the microseconds are trimmed to 2 decimal digits?
0:00:00.00
Related questions:
Timedelta in hours,minutes,seconds,microseconds format
Formatting microseconds to two decimal places (in fact converting microseconds into tens of microseconds)
Convert the timedifference to a string with str(), split it at the decimal point with .split('.'), and keep the portion before the decimal point with [0]. Note that this drops the fractional seconds entirely rather than keeping two digits.
Your example, with the only difference on the last line:
import time
import datetime
timenow = time.time()
timedifference = time.time() - timenow
timedifference = datetime.timedelta( seconds=timedifference )
print( "%s" % str(timedifference).split('.')[0] )
generates:
0:00:00
Another solution is to split off the fractional part numerically and format it separately:
>>> from datetime import timedelta
>>> seconds = 123.995
>>> isec, fsec = divmod(round(seconds*100), 100)
>>> "{}.{:02.0f}".format(timedelta(seconds=isec), fsec)
'0:02:04.00'
As you can see, this takes care of the rounding. It is also easy to adjust the output precision by changing 100 above to another power of 10 (and adjusting the format string):
def format_td(seconds, digits=2):
    isec, fsec = divmod(round(seconds*10**digits), 10**digits)
    return ("{}.{:0%d.0f}" % digits).format(timedelta(seconds=isec), fsec)
You'll have to format it yourself. A timedelta object stores days, seconds and microseconds, so you'll have to do the math to convert to days/hours/min/sec/microsec and then format the result with str.format. For the microseconds, you'll want (microsec + 5000) // 10000 to get the top two digits (the +5000 is for rounding); if that gives 100, carry one into the seconds.
A bit late, but here's a 2021 answer with f-strings (modified from @Seb's original answer):
def format_td(seconds, digits=3):
    isec, fsec = divmod(round(seconds*10**digits), 10**digits)
    return f'{timedelta(seconds=isec)}.{fsec:0{digits}.0f}'
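The functions above take a float number of seconds, while the question asks about formatting a timedelta object. A variant that works directly on a timedelta, using integer arithmetic only, might look like the sketch below. The name format_timedelta is made up for illustration; it assumes a non-negative timedelta and 1 to 6 digits, and rounds halves up.

```python
from datetime import timedelta

def format_timedelta(td, digits=2):
    """Format a non-negative timedelta with `digits` (1 to 6) fractional digits."""
    unit = 10 ** (6 - digits)                       # microseconds per displayed unit
    total_us = td // timedelta(microseconds=1)      # whole timedelta as an int of microseconds
    units = (total_us + unit // 2) // unit          # round half up to the displayed precision
    whole, frac = divmod(units, 10 ** digits)       # the carry into the seconds happens here
    return f"{timedelta(seconds=whole)}.{frac:0{digits}d}"

print(format_timedelta(timedelta(microseconds=4)))                 # 0:00:00.00
print(format_timedelta(timedelta(seconds=123, microseconds=995000)))  # 0:02:04.00
```

Because the rounding is done on the total number of microseconds, a value such as 0.995 s rounds up into the next second instead of producing a fractional part of 100.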

datetime: Round/trim number of digits in microseconds

Currently I am logging stuff and I am using my own formatter with a custom formatTime():
def formatTime(self, _record, _datefmt):
    t = datetime.datetime.now()
    return t.strftime('%Y-%m-%d %H:%M:%S.%f')
My issue is that the microseconds, %f, are six digits. Is there any way to output fewer digits, like the first three digits of the microseconds?
The simplest way would be to use slicing to just chop off the last three digits of the microseconds:
def format_time():
    t = datetime.datetime.now()
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    return s[:-3]
I strongly recommend just chopping. I once wrote some logging code that rounded the timestamps rather than chopping, and I found it actually kind of confusing when the rounding changed the last digit. There was timed code that stopped running at a certain timestamp yet there were log events with that timestamp due to the rounding. Simpler and more predictable to just chop.
If you want to actually round the number rather than just chopping, it's a little more work but not horrible:
def format_time():
    t = datetime.datetime.now()
    s = t.strftime('%Y-%m-%d %H:%M:%S.%f')
    head = s[:-7] # everything up to the '.'
    tail = s[-7:] # the '.' and the 6 digits after it
    f = float(tail)
    temp = "{:.03f}".format(f) # for Python 2.x: temp = "%.3f" % f
    # Drop the leading digit. It is '0' unless the fraction is .9995 or more,
    # which formats as '1.000'; in that case the carry into the seconds is lost.
    new_tail = temp[1:]
    return head + new_tail
Obviously you can simplify the above with fewer variables; I just wanted it to be very easy to follow.
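Since the question is about a logging formatter, it may be worth noting that the standard logging module exposes the millisecond part of the record time as the msecs attribute, so a format string can produce three digits without overriding formatTime. The sketch below shows this under that approach; the logger name ms_demo and the helper make_logger are made up for illustration.

```python
import io
import logging

def make_logger(stream):
    """Build a logger whose timestamps end in three millisecond digits."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        # asctime is rendered with datefmt; msecs is the millisecond part of the record time
        fmt="%(asctime)s.%(msecs)03d %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    logger = logging.getLogger("ms_demo")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger

buffer = io.StringIO()
make_logger(buffer).info("hello")
print(buffer.getvalue(), end="")
```

This uses the time stored on the log record rather than calling datetime.now() at format time, and the milliseconds are truncated rather than rounded.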
As of Python 3.6 the language has this feature built in (the extra digits are truncated, not rounded):
def format_time():
    t = datetime.datetime.now()
    s = t.isoformat(timespec='milliseconds')
    return s
This function should always return a timestamp that looks exactly like this (with or without the UTC offset, depending on whether the input datetime object is timezone-aware):
2016-08-05T18:18:54.776+0000
It takes a datetime object as input (which you can produce with datetime.datetime.now()). To get the UTC offset shown in the example output, import pytz and pass datetime.datetime.now(pytz.utc).
import pytz, datetime
def time_format(dt):
    return "%s:%06.3f%s" % (
        dt.strftime('%Y-%m-%dT%H:%M'),
        dt.second + dt.microsecond / 1e6,  # %06.3f zero-pads the seconds and keeps trailing zeros
        dt.strftime('%z')
    )
time_format(datetime.datetime.now(pytz.utc))
I noticed that some of the other methods above would omit the trailing zero if there was one (e.g. 0.870 became 0.87) and this was causing problems for the parser I was feeding these timestamps into. This method does not have that problem.
An easy solution that should work in all cases:
def format_time():
    t = datetime.datetime.now()
    if t.microsecond % 1000 >= 500: # check if there will be rounding up
        t = t + datetime.timedelta(milliseconds=1) # manually round up
    return t.strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]
Basically you do manual rounding on the date object itself first, then you can safely trim the microseconds.
Edit: As some pointed out in the comments below, the rounding of this solution (and the one above) introduces problems when the microsecond value reaches 999500, as 999.5 is rounded to 1000 (overflow).
Short of reimplementing strftime to support the format we want (the potential overflow caused by the rounding would need to be propagated up to seconds, then minutes, etc.), it is much simpler to just truncate to the first 3 digits as outlined in the accepted answer, or using something like:
'{:03}'.format(int(999999/1000))
-- Original answer preserved below --
In my case, I was trying to format a datestamp with milliseconds formatted as 'ddd'. The solution I ended up using to get milliseconds was to use the microsecond attribute of the datetime object, divide it by 1000.0, pad it with zeros if necessary, and round it with format. It looks like this:
'{:03.0f}'.format(datetime.now().microsecond / 1000.0)
# Produces: '033', '499', etc.
You can subtract the microseconds from the current datetime:
d = datetime.datetime.now()
current_time = d - datetime.timedelta(microseconds=d.microsecond)
This will turn 2021-05-14 16:11:21.916229 into 2021-05-14 16:11:21.
This method allows flexible precision; if you specify a precision greater than 6, it simply uses the whole microsecond value.
def formatTime(self, _record, _datefmt, precision=3):
    dt = datetime.datetime.now()
    us = "%06d" % dt.microsecond  # zero-pad so that e.g. 1234 microseconds becomes '001234'
    f = us[:precision]  # slicing past the end just returns all six digits
    return "%04d-%02d-%02d %02d:%02d:%02d.%s" % (dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second, f)
This method implements rounding to 3 decimal places:
import datetime
from decimal import Decimal, ROUND_HALF_UP
def formatTime(self, _record, _datefmt, precision='0.001'):
    dt = datetime.datetime.now()
    # %06d keeps the leading zeros of the microseconds (1234 microseconds is 0.001234 s)
    seconds = Decimal("%d.%06d" % (dt.second, dt.microsecond))
    return "%d-%d-%d %d:%d:%s" % (dt.year, dt.month, dt.day, dt.hour, dt.minute,
                                  seconds.quantize(Decimal(precision), rounding=ROUND_HALF_UP))
I avoided using the strftime method purposely because I would prefer not to modify a fully serialized datetime object without revalidating it. This way also shows the date internals in case you want to modify it further.
In the rounding example, note that the precision is string-based for the Decimal module.
Here is my solution using regexp:
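import re
# Capture the 6 digits after the dot in a group.
regexp = re.compile(r'\.(\d{6})')
def to_splunk_iso(dt):
    """Converts the datetime object to Splunk isoformat string."""
    # timespec='microseconds' forces the fractional part even when it is zero.
    iso = dt.isoformat(timespec='microseconds')
    microseconds = regexp.search(iso).group(1)
    # Round to milliseconds, zero-padded to 3 digits and capped at 999.
    millis = min(round(int(microseconds) / 1000), 999)
    return regexp.sub('.%03d' % millis, iso)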
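Fixing the proposed solution based on Pablojim's comments:
from datetime import datetime
dt = datetime.now()
# Round the microseconds to the nearest millisecond (-3 is the number of zeros to round to);
# cap at 999000 so the value stays valid for replace()
dt_round_microsec = min(round(dt.microsecond, -3), 999000)
dt = dt.replace(microsecond=dt_round_microsec)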
If you also want the day of the week (e.g. 'Sunday') in the result, slicing with [:-3] will not work. In that case you can use:
dt = datetime.datetime.now()
print("{}.{:03d} {}".format(dt.strftime('%Y-%m-%d %I:%M:%S'), dt.microsecond//1000, dt.strftime("%A")))
#Output: 2019-05-05 03:11:22.211 Sunday
%H - for 24 Hour format
%I - for 12 Hour format
Adding my two cents here, as this method lets you write your microsecond format as you would a float in C-style formatting. It takes advantage of the fact that both use %f.
import datetime
import re
def format_datetime(date, format):
    """Format a ``datetime`` object with microsecond precision.
    Pass your microsecond as you would format a c-string float.
    e.g "%.3f"
    Args:
        date (datetime.datetime): Your input ``datetime`` obj.
        format (str): Your strftime format string.
    Returns:
        str: Your formatted datetime string.
    """
    # We need to check if format contains "%.xf" (x = a number)
    float_format = r"(%\.\d+f)"
    has_float_format = re.search(float_format, format)
    if has_float_format:
        # Express the microseconds as a fraction of a second, e.g. 424242 -> 0.424242
        fraction = date.microsecond / 1e6
        ms_str = has_float_format.group(1) % fraction
        # ms_str looks like "0.424"; drop the leading "0." before substituting
        format = re.sub(float_format, ms_str[2:], format)
    return date.strftime(format)
print(format_datetime(datetime.datetime.now(), "%H:%M:%S.%.3f"))
# 17:58:54.424
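Several answers above run into the overflow case where rounding 999500 microseconds or more would yield 1000 milliseconds. One way to sidestep it, sketched below, is to do the rounding with datetime arithmetic (which carries into seconds, minutes and so on) and only then truncate. The names round_to_ms and format_ms are made up for illustration, and halves are rounded up.

```python
from datetime import datetime, timedelta

def round_to_ms(dt):
    """Round a datetime to the nearest millisecond, letting the carry propagate."""
    dt = dt + timedelta(microseconds=500)  # adding half a millisecond turns truncation into rounding
    return dt.replace(microsecond=dt.microsecond // 1000 * 1000)

def format_ms(dt, rounding=False):
    """Format with three fractional digits; truncates unless rounding=True."""
    if rounding:
        dt = round_to_ms(dt)
    # timespec='milliseconds' (Python 3.6+) truncates any remaining digits
    return dt.isoformat(sep=" ", timespec="milliseconds")

edge = datetime(2021, 12, 31, 23, 59, 59, 999500)
print(format_ms(edge))                  # 2021-12-31 23:59:59.999
print(format_ms(edge, rounding=True))   # 2022-01-01 00:00:00.000
```

As the first answer points out, truncation is usually the more predictable choice for log timestamps, so rounding is opt-in here.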
