Parsing human readable relative times - python

I would like to parse human terms like 3 days ago in python 2.7 to get a timedelta equivalent.
For example:
>>> relativetimeparer.parser('3 days ago')
datetime.timedelta(3)
I have tried the dateparser module.
>>> import dateparser
>>> dateparser.parse('3 days ago')
datetime.datetime(2016, 8, 20, 2, 57, 23, 372538)
>>> datetime.now() - dateparser.parse('3 days ago')
datetime.timedelta(3, 35999, 999232)
It parses relative time directly to datetime without the option of returning a timedelta. It also seems to think that 3 days ago is actually 3 days and 10 hours ago. So it seems to be invoking my timezone offset from Greenwich too (+10 hours).
Is there a better module for parsing human readable relative times?

You could specify the RELATIVE_BASE setting:
>>> import datetime
>>> import dateparser
>>> now = datetime.datetime.now()
>>> res = dateparser.parse('3 days ago', settings={'RELATIVE_BASE': now})
>>> now - res
datetime.timedelta(3)
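If all you need is a timedelta and the inputs are simple English phrases, a small regular expression may be enough and avoids the timezone question entirely. This is only a minimal sketch using the standard library: parse_relative is a hypothetical helper (not part of dateparser), and it assumes inputs of the form "<number> <unit> ago" with units from seconds to weeks.

```python
import re
from datetime import timedelta

# Matches e.g. "3 days ago" or "1 hour ago"; the unit list is an assumption.
_PATTERN = re.compile(
    r"^\s*(\d+)\s+(second|minute|hour|day|week)s?\s+ago\s*$",
    re.IGNORECASE,
)


def parse_relative(text):
    """Return a timedelta for strings like '3 days ago' (hypothetical helper)."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("unsupported relative time: %r" % (text,))
    amount = int(match.group(1))
    unit = match.group(2).lower()
    # timedelta takes plural keyword names: days=, hours=, weeks=, ...
    return timedelta(**{unit + "s": amount})


print(repr(parse_relative("3 days ago")))
```

Months and years are deliberately left out, since a timedelta cannot represent them exactly.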

I would just like to mention the interesting Arrow library, which has a built-in dehumanize() method. The dehumanize() method takes a human-readable string and uses it to shift a moment into the past or into the future. The above could then be accomplished as simply as:
>>> import arrow
>>> now = arrow.now()
>>> now
<Arrow [2021-10-15T14:43:20.506200+02:00]>
>>> now.dehumanize('3 days ago')
<Arrow [2021-10-12T14:43:20.506200+02:00]>
>>> now.dehumanize('3 days ago') - now
datetime.timedelta(days=-3)

Related

Python difference in datetime now against datetime now

I have run into an issue, or quite possibly a feature; I'm not sure which. I am using Python's datetime library to get a difference in time, as in the snippet below.
>>> import datetime
>>> datetime.datetime.now() - datetime.datetime.now()
datetime.timedelta(-1, 86399, 999958)
>>> tnow = datetime.datetime.now()
>>> datetime.datetime.now() - tnow
datetime.timedelta(0, 4, 327859)
I would like to understand why datetime.datetime.now() - datetime.datetime.now() produces -1 days, 86399 seconds, whereas assigning the current time to a variable and then computing the difference gives the expected 0 days, 4 seconds.
The results are a bit confusing; it would be helpful if someone could explain what is going on behind the scenes.
Note: I'm using Python 2.7
As per the documentation of timedelta object
If the normalized value of days lies outside the indicated range,
OverflowError is raised.
Note that normalization of negative values may be surprising at first.
For example:
>>> from datetime import timedelta
>>> d = timedelta(microseconds=-1)
>>> (d.days, d.seconds, d.microseconds)
(-1, 86399, 999999)
This is valid for python 2.7 and 3 both.
Why this is happening is simple:
a , b = datetime.datetime.now(), datetime.datetime.now()
# here a was taken first, so a <= b.
# That is because the two calls execute one after the other, not at the same instant.
a - b
# datetime.timedelta(-1, 86399, 999973)
b - a
# datetime.timedelta(0, 0, 27)
To get the proper time difference:
(tnow - datetime.datetime.now()).total_seconds()
# output: -1.751166
This Answer gives more information on how to use time delta safely (handle negative values) Link
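To make the normalization less surprising, here is a small sketch showing that the (days, seconds, microseconds) fields of a negative timedelta still add up to the original value, plus one possible way to print it with an explicit sign. format_signed is a hypothetical helper name, not something from the datetime module.

```python
from datetime import timedelta


def format_signed(delta):
    """Render a timedelta with an explicit sign (hypothetical helper)."""
    if delta < timedelta(0):
        # Negate first so str() sees a positive, intuitively normalized value.
        return "-" + str(-delta)
    return str(delta)


d = timedelta(microseconds=-1)

# -1 day + 86399 s + 999999 us is exactly -1 microsecond.
rebuilt = timedelta(days=d.days, seconds=d.seconds, microseconds=d.microseconds)
print(rebuilt == d)       # True
print(format_signed(d))   # -0:00:00.000001
```

total_seconds() and abs() are the other two tools that sidestep the normalized fields.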
You are encountering a "corner case" situation.
Every datetime.datetime.now() produces a datetime.datetime object ([Python]: https://docs.python.org/3/library/datetime.html#datetime-objects), which is the current date & time at the moment the call was made
You have 2 such calls (even if they are on the same line). Since the CPU speeds are very high nowadays, every such call takes a very small amount of time (much less than microseconds, I presume)
But, when the 1st call is at the very end of a (microsecond?) period, and the 2nd one is at the beginning of the next one, you'd get this behavior:
>>> import datetime
>>> now0 = datetime.datetime.now()
>>> now0
datetime.datetime(2018, 2, 20, 12, 23, 23, 1000)
>>> delta = datetime.timedelta(microseconds=1)
>>> now1 = now0 + delta
>>> now0 - now1
datetime.timedelta(-1, 86399, 999999)
Explanation:
Let now0 be the result of the 1st call made to datetime.datetime.now()
Let's say that the 2nd datetime.datetime.now() call happens one microsecond later (I am reproducing the behavior using the delta object, as the times involved here are far too small for me to be able to run the line at the exact moment when this behavior occurs). That result is placed into now1
When subtracting them you get a negative value (in my case, -delta), since now0 happened earlier than now1 (check [Python]: timedelta Objects for more details)

Subtract seconds from datetime in python

I have an int variable that holds a number of seconds (let's call that amount X). I need to get the current date and time (as a datetime) minus X seconds.
Example
If X is 65 and the current date is 2014-06-03 15:45:00, then I need to get the result 2014-06-03 15:43:55.
Environment
I'm doing this on Python 3.3.3 and I know I could probably use the datetime module but I haven't had any success so far.
Using the datetime module indeed:
import datetime
X = 65
result = datetime.datetime.now() - datetime.timedelta(seconds=X)
You should read the documentation of this package to learn how to use it!
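Using a fixed timestamp instead of now() makes the arithmetic easy to check. A minimal sketch; the variable name current is just for illustration:

```python
from datetime import datetime, timedelta

# Fixed "current" time so the result is reproducible.
current = datetime(2014, 6, 3, 15, 45, 0)
X = 65

result = current - timedelta(seconds=X)
print(result)  # 2014-06-03 15:43:55
```

65 seconds is one minute and five seconds, so the clock moves back from 15:45:00 to 15:43:55.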
Consider using dateutil.relativedelta, instead of datetime.timedelta.
>>> from datetime import datetime
>>> from dateutil.relativedelta import relativedelta
>>> now = datetime.now()
>>> now
datetime.datetime(2014, 6, 3, 22, 55, 9, 680637)
>>> now - relativedelta(seconds=15)
datetime.datetime(2014, 6, 3, 22, 54, 54, 680637)
In this case of a 15 seconds delta there is no advantage over using a stdlib timedelta, but relativedelta supports larger units such as months or years, and it may handle the general case with more correctness (consider for example special handling required for leap years and periods with daylight-savings transitions).
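To illustrate why larger units need special handling, here is a standard-library sketch of month arithmetic. add_months is a hypothetical helper, and clamping to the last day of the target month is an assumption about the desired behaviour (similar in spirit to what relativedelta does), not the only reasonable choice.

```python
import calendar
from datetime import datetime


def add_months(dt, months):
    """Shift dt by a number of months, clamping the day (hypothetical helper)."""
    # Work with a zero-based month count so divmod handles year rollover,
    # including negative shifts.
    month_index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    # A timedelta cannot express this: months have 28 to 31 days.
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))


print(add_months(datetime(2014, 1, 31), 1))   # 2014-02-28 00:00:00
print(add_months(datetime(2016, 1, 31), 1))   # 2016-02-29 00:00:00
```

For seconds, minutes, hours and days, a plain timedelta remains the simpler tool.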
To expand on @julienc's answer
(in case it is helpful to someone):
If you allow X to be positive or negative, and change the subtraction to an addition, you get a more intuitive time-adjusting function (so you don't have to subtract negatives to get positives), like so:
import datetime

def adjustTimeBySeconds(time, delta):
    # A negative delta moves the time back; a positive one moves it forward.
    return time + datetime.timedelta(seconds=delta)

time = datetime.datetime.now()
X = -65
print(adjustTimeBySeconds(time, X))
X = 65
print(adjustTimeBySeconds(time, X))

How to format a timedelta like "1 hour ago" in python so that it is translatable

I want to format a python timedelta object as "x minutes/hours/weeks/month/years ago".
I know there are some similar questions, like:
How to display "x days ago" type time using Humanize in Django template?
From: "1 hour ago", To: timedelta + accuracy
User-friendly time format in Python?
However, I did not find an answer for my case, because
I do not use django
I need to format a timedelta to a string, not the other way around
The solution should work in as many languages as possible, not just English
Here is my current code (excerpt, sorry):
delta = babel.dates.format_timedelta(now - dt, format=format,
                                     locale=locale)
if now > dt:
    return _(u"%(timedelta)s ago") % {'timedelta': delta}
else:
    return _(u"in %(timedelta)s") % {'timedelta': delta}
For the babel function, see http://babel.pocoo.org/docs/dates/#time-delta-formatting
Now this works fine in English. In German, however, it fails:
The above code would translate "2 years ago" to "vor 2 Jahre" instead of "vor 2 Jahren".
I would also be happy with a solution that does not use the "... ago" phrasing. As long as it is similar and translatable I can accept it.
According to the Babel documentation, you should add the add_direction=True parameter to your format_timedelta call, so that Babel itself produces the localized "in ..." or "... ago" phrase.
Here's a quick example that should put you on the right track:
>>> import datetime
>>> delta = datetime.timedelta(days=2)
>>> delta.days
2
>>> print(delta)
2 days, 0:00:00
You should build the formatting in whatever way makes sense for you; perhaps someone else can point you to an out-of-the-box answer here.
>>> '{0} ago'.format(delta)
'2 days, 0:00:00 ago'
A function for the timedelta object
def total_hours(a_timedelta):
    return a_timedelta.total_seconds()/60.0/60.0
Usage:
>>> total_hours(delta)
48.0
>>> '{0} hours ago'.format(total_hours(delta))
'48.0 hours ago'
>>> then = datetime.datetime.now()
>>> diff = datetime.datetime.now() - then
>>> diff
datetime.timedelta(0, 12, 967773)
>>> '{0} hours ago'.format(total_hours(diff))
'0.00360215916667 hours ago'
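If you do roll your own formatting and still want it to be translatable, one common approach is to pass whole phrases through ngettext, so that translators can adjust word order and case (for example "vor 2 Jahren") rather than having a translated unit glued onto "ago". This is a rough sketch only: it assumes Python 3, format_ago is a hypothetical helper, the unit thresholds are arbitrary, and NullTranslations simply returns the English strings where a real application would load a message catalog.

```python
import gettext
from datetime import timedelta

# Placeholder catalog: returns the English msgids unchanged.
# A real application would use gettext.translation(...) here.
translations = gettext.NullTranslations()


def format_ago(delta, t=translations):
    """Format a timedelta as 'N units ago' (hypothetical helper)."""
    seconds = int(abs(delta).total_seconds())
    if seconds >= 86400:
        n = seconds // 86400
        return t.ngettext("%d day ago", "%d days ago", n) % n
    if seconds >= 3600:
        n = seconds // 3600
        return t.ngettext("%d hour ago", "%d hours ago", n) % n
    n = seconds // 60
    return t.ngettext("%d minute ago", "%d minutes ago", n) % n


print(format_ago(timedelta(days=2)))   # 2 days ago
print(format_ago(timedelta(hours=1)))  # 1 hour ago
```

Because each msgid is a complete phrase, the plural form and the grammar around it are both up to the translation catalog.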
You may want to use the arrow package, especially its humanize function:
import arrow
a = arrow.now()
print(a.humanize())
Output: "just now"
For your needs, suppose you have a date like:
myDate = "2019-02-10 08:00:00"
a = arrow.get(myDate, "YYYY-MM-DD HH:mm:ss")
print(a.humanize())
Output (when run about an hour after that time): "an hour ago"
You can find the doc here : https://arrow.readthedocs.io/en/latest/

Parse time string in python [duplicate]

Possible Duplicate:
How to construct a timedelta object from a simple string
I have a string that is in the format hours:minutes:seconds but it is not a time of day but a duration. For example, 100:00:00 means 100 hours.
I am trying to find the time that is offset from the current time by the amount of time specified in the string. I could use regular expressions to manually pull apart the time string and convert it to seconds and add it to the floating point returned by time.time(), but is there a time function to do this?
The time.strptime() function formatting seems to work on time of day/date strings and not arbitrary strings.
import datetime
dur_str = "100:00:00"
h, m, s = map(int, dur_str.split(':'))
dur = datetime.timedelta(hours=h, minutes=m, seconds=s)
This doesn't use re, but sometimes it takes more work to understand a regex than to write the Python.
>>> import datetime
>>> time_str = "100:00:00"
>>> hours, minutes, seconds = [int(i) for i in time_str.split(":")]
>>> time_in_seconds = hours * 60 * 60 + minutes * 60 + seconds
>>> time_in_seconds
360000
>>> now = datetime.datetime.now()
>>> now
datetime.datetime(2012, 10, 2, 10, 24, 6, 639000)
>>> new_time = now + datetime.timedelta(seconds=time_in_seconds)
>>> new_time
datetime.datetime(2012, 10, 6, 14, 24, 6, 639000)
As nneonneo pointed out datetime.timedelta() accepts the hours, minutes, and seconds as arguments. So you can even do something silly like this (not recommended):
>>> datetime.timedelta(**{k:v for k,v in zip(["hours", "minutes", "seconds"], [int(i) for i in "100:00:00".split(":")])})
datetime.timedelta(4, 14400)
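The approaches above can be wrapped into a small function with a bit of validation. This is just a sketch: parse_duration is a hypothetical name, and rejecting minutes or seconds outside 0 to 59 is an assumption about what the input should look like.

```python
from datetime import datetime, timedelta


def parse_duration(text):
    """Parse 'hours:minutes:seconds' into a timedelta (hypothetical helper)."""
    parts = text.split(":")
    if len(parts) != 3:
        raise ValueError("expected hours:minutes:seconds, got %r" % (text,))
    hours, minutes, seconds = (int(part) for part in parts)
    # Hours may exceed 23 because this is a duration, not a time of day.
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in 0..59: %r" % (text,))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


start = datetime(2012, 10, 2, 10, 24, 6)
print(start + parse_duration("100:00:00"))  # 2012-10-06 14:24:06
```

In real use, datetime.now() would take the place of the fixed start value.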

Is there any way to use a strftime-like function for dates before 1900 in Python?

I didn't realize this, but apparently Python's strftime function doesn't support dates before 1900:
>>> from datetime import datetime
>>> d = datetime(1899, 1, 1)
>>> d.strftime('%Y-%m-%d')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: year=1899 is before 1900; the datetime strftime() methods require year >= 1900
I'm sure I could hack together something myself to do this, but I figure the strftime function is there for a reason (and there also is a reason why it can't support pre-1900 dates). I need to be able to support dates before 1900. I'd just use str, but there's too much variation. In other words, it may or may not have microseconds or it may or may not have a timezone. Is there any solution to this?
If it makes a difference, I'm doing this so that I can write the data to a text file and load it into a database using Oracle SQL*Loader.
I essentially ended up doing Alex Martelli's answer. Here's a more complete implementation:
>>> from datetime import datetime
>>> d = datetime.now()
>>> d = d.replace(microsecond=0, tzinfo=None)
>>> str(d)
'2009-10-29 11:27:27'
The only difference is that str(d) is equivalent to d.isoformat(' ').
isoformat works on datetime instances w/o limitation of range:
>>> import datetime
>>> x=datetime.datetime(1865, 7, 2, 9, 30, 21)
>>> x.isoformat()
'1865-07-02T09:30:21'
If you need a different-format string it's not too hard to slice, dice and remix pieces of the string you get from isoformat, which is very consistent (YYYY-MM-DDTHH:MM:SS.mmmmmm, with the dot and following microseconds omitted if microseconds are zero).
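As a rough illustration of the slicing idea, the sketch below rearranges the isoformat() output into a day/month/year layout. iso_to_custom is a hypothetical helper and the target layout is just an example.

```python
from datetime import datetime


def iso_to_custom(dt):
    """Build 'DD/MM/YYYY HH:MM:SS' from isoformat() (hypothetical helper)."""
    date_part, time_part = dt.isoformat().split("T")
    year, month, day = date_part.split("-")
    # The first 8 characters are HH:MM:SS; anything after that is the
    # optional microseconds and UTC offset, which this layout drops.
    return "%s/%s/%s %s" % (day, month, year, time_part[:8])


print(iso_to_custom(datetime(1865, 7, 2, 9, 30, 21)))  # 02/07/1865 09:30:21
```

Since no strftime call is involved, the year 1900 limit never comes into play.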
The documentation seems pretty clear about this:
The exact range of years for which strftime() works also varies across platforms. Regardless of platform, years before 1900 cannot be used.
So there isn't going to be a solution that uses strftime(). Luckily, it's pretty straightforward to do this "by hand":
>>> "%04d-%02d-%02d %02d:%02d" % (d.year,d.month,d.day,d.hour,d.minute)
'1899-01-01 00:00'
mxDateTime can handle arbitrary dates. In Python 2, datetime.strftime() relies on the platform's C library strftime(), which is why its range is limited.
In [5]: mx.DateTime.DateTime(1899)
Out[5]: <mx.DateTime.DateTime object for '1899-01-01 00:00:00.00' at 154a960>
In [6]: DateTime.DateTime(1899).Format('%Y-%m-%d')
Out[6]: 1899-01-01
This is from the matplotlib source. Could provide a good starting point for rolling your own.
def strftime(self, dt, fmt):
    fmt = self.illegal_s.sub(r"\1", fmt)
    fmt = fmt.replace("%s", "s")
    if dt.year > 1900:
        return cbook.unicode_safe(dt.strftime(fmt))
    year = dt.year
    # For every non-leap year century, advance by
    # 6 years to get into the 28-year repeat cycle
    delta = 2000 - year
    off = 6*(delta // 100 + delta // 400)
    year = year + off
    # Move to around the year 2000
    year = year + ((2000 - year)//28)*28
    timetuple = dt.timetuple()
    s1 = time.strftime(fmt, (year,) + timetuple[1:])
    sites1 = self._findall(s1, str(year))
    s2 = time.strftime(fmt, (year+28,) + timetuple[1:])
    sites2 = self._findall(s2, str(year+28))
    sites = []
    for site in sites1:
        if site in sites2:
            sites.append(site)
    s = s1
    syear = "%4d" % (dt.year,)
    for site in sites:
        s = s[:site] + syear + s[site+4:]
    return cbook.unicode_safe(s)
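The trick in that code relies on the calendar repeating every 28 years as long as no non-leap century year (such as 1900) falls in between, which is why it first shifts the year to around 2000. A quick way to convince yourself of the cycle, using only the standard library (same_calendar is a hypothetical helper):

```python
import calendar
from datetime import date


def same_calendar(year_a, year_b):
    """True if both years start on the same weekday and agree on leap status."""
    return (date(year_a, 1, 1).weekday() == date(year_b, 1, 1).weekday()
            and calendar.isleap(year_a) == calendar.isleap(year_b))


# 28 years contain 7 leap days here: 28 * 365 + 7 = 10227 days = 1461 weeks.
print(same_calendar(2000, 2028))  # True
print(same_calendar(2000, 2001))  # False
```

Formatting a stand-in year with an identical calendar and then substituting the real year digits back in is what lets weekday and month names come out right.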
This is the "feature" of the ctime library (UTF).
You may also run into problems with dates after 2038.
