Comparing dates and finding the closest date to the current date - python

I'm looking to compare a list of dates with today's date and would like to return the closest one. I've had various ideas, but they all seem very convoluted: they involve scoring each date by how many days it differs from today and taking the smallest difference. I have no clue how to do this simply, so any pointers would be appreciated.
import datetime
import re
date_list = ['2019-02-10', '2018-01-13', '2019-02-8',]
now = datetime.date.today()
for date_ in date_list:
    # \d{1,2} so that single-digit days such as '2019-02-8' also match
    match = re.match(r'.*(\d{4})-(\d{1,2})-(\d{1,2}).*', date_)
    if match:
        year = match.group(1)
        month = match.group(2)
        day = match.group(3)
        delta = now - datetime.date(int(year), int(month), int(day))
        print(delta)
EDIT: While I was waiting, I solved this using the code below:
import datetime
import re
date_list = ['2019-02-10', '2018-01-13', '2019-02-8',]
now = datetime.date.today()
dates_range = []
for date_ in date_list:
    match = re.match(r'.*(\d{4})-(\d{1,2})-(\d{1,2}).*', date_)
    if match:
        year = match.group(1)
        month = match.group(2)
        day = match.group(3)
        delta = now - datetime.date(int(year), int(month), int(day))
        # absolute value, so that future dates are compared by distance too
        dates_range.append(abs(delta.days))
days = min(dates_range)

Convert each string into a datetime.date object, then subtract and take the smallest absolute difference:
import datetime
date_list = ['2019-02-10', '2018-01-13', '2019-02-8',]
now = datetime.date.today()
date_list_converted = [datetime.datetime.strptime(each_date, "%Y-%m-%d").date() for each_date in date_list]
differences = [abs(now - each_date) for each_date in date_list_converted]
minimum = min(differences)
closest_date = date_list[differences.index(minimum)]

This converts the strings to datetime.date objects, subtracts each one from the current date, and returns the date with the lowest absolute difference:
import datetime
import re
date_list = ['2019-02-10', '2018-01-13', '2019-02-8',]
numPattern = re.compile("[0-9]+")
def getclosest(dates):
    now = datetime.date.today()
    diffs = []
    # iterate over the argument, not the global list
    for date_str in dates:
        year, month, day = [int(i) for i in numPattern.findall(date_str)]
        currcheck = datetime.date(year, month, day)
        diffs.append(abs(now - currcheck))
    return dates[diffs.index(min(diffs))]
It's by no means the most efficient, but it's semi-elegant and works.

Using built-ins
Python's built-in datetime module can do what you want.
Let's first take your list of dates and convert it into a list of datetime objects:
from datetime import datetime
date_list = ['2019-02-10', '2018-01-13', '2019-02-8']
datetime_list = [datetime.strptime(date, "%Y-%m-%d") for date in date_list]
Once we have this we can find the difference between those dates and today's date.
today = datetime.today()
date_diffs = [abs(date - today) for date in datetime_list]
Excellent, date_diffs is now a list of datetime.timedelta objects. All that is left is to find the minimum and find which date this represents.
To find the minimum difference it is simple enough to use min(date_diffs), however, we then want to use this minimum to extract the corresponding closest date. This can be achieved as:
closest_date = date_list[date_diffs.index(min(date_diffs))]
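The index lookup above can also be folded into a single min() call with a key function. This is a minimal sketch, assuming the dates are strings in year-month-day form; the helper name closest_date and its reference parameter are hypothetical, added so that the result does not depend on the day the code is run:

```python
from datetime import date, datetime

def closest_date(date_strings, reference=None):
    """Return the string whose date is nearest to reference (default: today)."""
    if reference is None:
        reference = date.today()

    def distance(text):
        parsed = datetime.strptime(text, "%Y-%m-%d").date()
        return abs(parsed - reference)

    # min() with a key compares the distances but returns the original string
    return min(date_strings, key=distance)

date_list = ['2019-02-10', '2018-01-13', '2019-02-8']
print(closest_date(date_list, reference=date(2019, 2, 7)))  # 2019-02-8
```

If two dates are equally close, min() returns whichever comes first in the list.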
With pandas
If performance is an issue, it may be worth investigating a pandas implementation. Using pandas we can convert your dates to a pandas DatetimeIndex:
from datetime import datetime
import pandas as pd
date_list = ['2019-02-10', '2018-01-13', '2019-02-8']
date_df = pd.to_datetime(date_list)
Finally, as in the method using built-ins, we find the differences between the dates and today and use them to extract the closest date.
today = datetime.today()
date_diffs = abs(today - date_df)
closest_date = date_list[date_diffs.argmin()]
The advantage of this method is that we've removed the explicit for loops, so I'd expect it to be more efficient for large numbers of dates.

One fast and simple way is to use the bisect algorithm, especially if your date_list is large:
import datetime
from bisect import bisect_left
FMT = '%Y-%m-%d'
date_list = ['2019-02-10', '2018-01-13', '2019-02-8', '2019-02-12']
# Parse before sorting: sorted as plain strings, '2019-02-8' would land after '2019-02-12'
days = sorted(datetime.datetime.strptime(d, FMT) for d in date_list)
def closest_day_to_now(days):
    """
    Return the closest day from an ordered list of datetimes
    """
    now = datetime.datetime.now()
    index = bisect_left(days, now)
    # bisect_left returns the insertion point, so the candidates are
    # the neighbours on either side of it
    if index == 0:
        return days[0]
    if index == len(days):
        return days[-1]
    before, after = days[index - 1], days[index]
    return after if after - now < now - before else before
print(closest_day_to_now(days).strftime(FMT))

Related

How can I restructure a string of numbers and convert them to integers?

I have a list that contains dates as a string:
date_list = [['4272', '07/18/2022'], ['4271', '07/18/2022'], ['4254', '06/23/2022'], ['4222', '05/09/2022'], ['4174', '03/09/2022'], ['3946', '06/07/2021'], ['3918', '05/03/2021'], ['3914', '08/19/2021'], ['3907', '08/19/2021'], ['3888', '07/05/2022'], ['3784', '12/21/2020'], ['3651', '05/07/2020'], ['3644', '04/20/2020'], ['3615', '02/06/2020'], ['3140', '09/24/2018'], ['3125', '03/03/2022']]
The plan is to use datetime.date() to compare these dates with today's date and return the output. For each item in the list, the first value is a unique ID number and the second value is the date. I'm not sure how to pull the date out of each item and reformat it so it can be passed to datetime.date(). Using the first item in my list as an example, the string '07/18/2022' needs to reach the function as integers, i.e. valid_date = datetime.date(2022, 7, 18), so it can run. I have to iterate through each date in my list and run it through datetime.date():
today = datetime.date.today()
valid_date = datetime.date(<year>,<month>,<day>)
diff = valid_date - today
You can use strptime from the datetime module. It takes a string and a format, then gives you a datetime object:
from datetime import datetime
date_list = [
["4272", "07/18/2022"],
["4271", "07/18/2022"],
["4254", "06/23/2022"],
["4222", "05/09/2022"],
["4174", "03/09/2022"],
["3946", "06/07/2021"],
["3918", "05/03/2021"],
["3914", "08/19/2021"],
["3907", "08/19/2021"],
["3888", "07/05/2022"],
["3784", "12/21/2020"],
["3651", "05/07/2020"],
["3644", "04/20/2020"],
["3615", "02/06/2020"],
["3140", "09/24/2018"],
["3125", "03/03/2022"],
]
for num, date_string in date_list:
    datetime_object = datetime.strptime(date_string, "%m/%d/%Y")
    today = datetime.today()
    diff = today - datetime_object
    print(diff)
You can use strptime() to parse the date.
parsed = datetime.strptime("07/18/2022", "%m/%d/%Y")
print(type(parsed))
Expected output:
<class 'datetime.datetime'>
Using strptime would be the Pythonic and best solution, as suggested by others. But anyway, this might be what you are asking for:
import datetime
date_list = [['4272', '07/18/2022'], ['4271', '07/18/2022'], ['4254', '06/23/2022'], ['4222', '05/09/2022'], ['4174', '03/09/2022'], ['3946', '06/07/2021'], ['3918', '05/03/2021'], ['3914', '08/19/2021'], ['3907', '08/19/2021'], ['3888', '07/05/2022'], ['3784', '12/21/2020'], ['3651', '05/07/2020'], ['3644', '04/20/2020'], ['3615', '02/06/2020'], ['3140', '09/24/2018'], ['3125', '03/03/2022']]
for ID, date in date_list:
    month, day, year = date.split('/')
    month, day, year = int(month), int(day), int(year)
    today = datetime.date.today()
    valid_date = datetime.date(year, month, day)
    diff = today - valid_date
    print(diff.days)
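Putting the pieces together, the ID and the day difference can be collected in one pass. This is only a sketch: the function name days_since and its today parameter are hypothetical, and a shortened list stands in for the full data.

```python
from datetime import date, datetime

def days_since(rows, today=None):
    """Map each ID to the number of days between its date and today."""
    if today is None:
        today = date.today()
    result = {}
    for item_id, date_string in rows:
        parsed = datetime.strptime(date_string, "%m/%d/%Y").date()
        result[item_id] = (today - parsed).days  # negative for future dates
    return result

sample = [["4272", "07/18/2022"], ["3140", "09/24/2018"]]
print(days_since(sample, today=date(2022, 7, 20)))
```

Passing today explicitly keeps the result reproducible; leaving it out falls back to the current date.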

Getting list of months in between two dates according to specific format

start = "Nov20"
end = "Jan21"
# Expected output:
["Nov20", "Dec20", "Jan21"]
What I've tried so far is the following, but I am looking for a more elegant way.
from calendar import month_abbr
from time import strptime
def get_range(a, b):
    start = strptime(a[:3], '%b').tm_mon
    end = strptime(b[:3], '%b').tm_mon
    dates = []
    for m in month_abbr[start:]:
        dates.append(m+a[-2:])
    for mm in month_abbr[1:end + 1]:
        dates.append(mm+b[-2:])
    print(dates)
get_range('Nov20', 'Jan21')
Note: I don't want to use pandas, as it doesn't make sense to import such a library just to generate dates.
The date range may span different years so one way is to loop from the start date to end date and increment the month by 1 until end date is reached.
Try this:
from datetime import datetime
def get_range(a, b):
    start = datetime.strptime(a, '%b%y')
    end = datetime.strptime(b, '%b%y')
    dates = []
    while start <= end:
        dates.append(start.strftime('%b%y'))
        if start.month == 12:
            start = start.replace(month=1, year=start.year+1)
        else:
            start = start.replace(month=start.month+1)
    return dates
dates = get_range("Nov20", "Jan21")
print(dates)
Output:
['Nov20', 'Dec20', 'Jan21']
You can use timedelta to step one month (31 days) forward, but make sure you stay on the 1st of the month, otherwise the days might add up and eventually skip a month.
from datetime import datetime
from datetime import timedelta
def get_range(a, b):
    start = datetime.strptime(a, '%b%y')
    end = datetime.strptime(b, '%b%y')
    dates = []
    while start <= end:
        dates.append(start.strftime('%b%y'))
        start = (start + timedelta(days=31)).replace(day=1) # go to 1st of next month
    return dates
dates = get_range("Jan20", "Jan21")
print(dates)
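Another way to avoid the year rollover check is to count months as a single integer (year * 12 + month index) and iterate over that range. A minimal sketch, assuming an English or C locale so that %b produces names such as "Nov"; the function name month_range is hypothetical:

```python
from datetime import date, datetime

def month_range(start, end, fmt="%b%y"):
    """List every month from start to end inclusive, formatted with fmt."""
    first = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    # Zero-based month counters: January of year Y is Y * 12
    lo = first.year * 12 + first.month - 1
    hi = last.year * 12 + last.month - 1
    return [date(n // 12, n % 12 + 1, 1).strftime(fmt) for n in range(lo, hi + 1)]

print(month_range("Nov20", "Jan21"))  # ['Nov20', 'Dec20', 'Jan21']
```

Because the counter is a plain integer, a range that spans several years needs no special handling.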

How can I get the first day of the next month in Python?

How can I get the first date of the next month in Python? For example, if it's now 2019-12-31, the first day of the next month is 2020-01-01. If it's now 2019-08-01, the first day of the next month is 2019-09-01.
I came up with this:
import datetime
def first_day_of_next_month(dt):
    '''Get the first day of the next month. Preserves the timezone.
    Args:
        dt (datetime.datetime): The current datetime
    Returns:
        datetime.datetime: The first day of the next month at 00:00:00.
    '''
    if dt.month == 12:
        return datetime.datetime(year=dt.year+1,
                                 month=1,
                                 day=1,
                                 tzinfo=dt.tzinfo)
    else:
        return datetime.datetime(year=dt.year,
                                 month=dt.month+1,
                                 day=1,
                                 tzinfo=dt.tzinfo)
# Example usage (assuming that today is 2021-01-28):
first_day_of_next_month(datetime.datetime.now())
# Returns: datetime.datetime(2021, 2, 1, 0, 0)
Is it correct? Is there a better way?
Here is a 1-line solution using nothing more than the standard datetime library:
(dt.replace(day=1) + datetime.timedelta(days=32)).replace(day=1)
Examples:
>>> dt = datetime.datetime(2016, 2, 29)
>>> print((dt.replace(day=1) + datetime.timedelta(days=32)).replace(day=1))
2016-03-01 00:00:00
>>> dt = datetime.datetime(2019, 12, 31)
>>> print((dt.replace(day=1) + datetime.timedelta(days=32)).replace(day=1))
2020-01-01 00:00:00
>>> dt = datetime.datetime(2019, 12, 1)
>>> print((dt.replace(day=1) + datetime.timedelta(days=32)).replace(day=1))
2020-01-01 00:00:00
Using dateutil you can express it as literally as possible:
import datetime
from dateutil import relativedelta
today = datetime.date.today()
next_month = today + relativedelta.relativedelta(months=1, day=1)
In English: add 1 month(s) to the today's date and set the day (of the month) to 1. Note the usage of singular and plural forms of day(s) and month(s). Singular sets the attribute to a value, plural adds the number of periods.
You can store this relativedelta.relativedelta object in a variable and then pass it around. Other answers involve more programming logic.
EDIT You can do it with the standard datetime library as well, but it's not so beautiful:
next_month = (today.replace(day=1) + datetime.timedelta(days=32)).replace(day=1)
This sets the date to the 1st of the current month, adds 32 days (or any number from 31 to 58, which is guaranteed to land in the next month) and then sets the date to the 1st of that month.
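Which offsets are safe can be checked by brute force. The sketch below compares the "add N days, then snap to the 1st" trick against plain year/month arithmetic for every month in a span of years; the helper names and the chosen year range are assumptions made for illustration:

```python
from datetime import date, timedelta

def next_month_start(d):
    """First day of the month after d, using only year/month arithmetic."""
    return date(d.year + d.month // 12, d.month % 12 + 1, 1)

def offset_always_works(offset, years=range(2019, 2025)):
    """True if first-of-month + offset days lands in the following month every time."""
    for year in years:
        for month in range(1, 13):
            first = date(year, month, 1)
            jumped = (first + timedelta(days=offset)).replace(day=1)
            if jumped != next_month_start(first):
                return False
    return True

safe_offsets = [n for n in range(1, 70) if offset_always_works(n)]
print(safe_offsets[0], safe_offsets[-1])  # 31 58 for the years checked
```

An offset of 30 is too small for 31-day months, and 59 overshoots February when starting from 1 January of a non-leap year.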
You can use calendar to get the number of days in a given month, then add timedelta(days=...), like this:
from datetime import date, timedelta
from calendar import monthrange
days_in_month = lambda dt: monthrange(dt.year, dt.month)[1]
today = date.today()
first_day = today.replace(day=1) + timedelta(days_in_month(today))
print(first_day)
If you're fine with external dependencies, you can use dateutil (which I love...):
from datetime import date
from dateutil.relativedelta import relativedelta
today = date.today()
first_day = today.replace(day=1) + relativedelta(months=1)
print(first_day)
Extract the year and month, add 1 and form a new date using the year, month and day=1:
from datetime import date
now = date(2020,12,18)
y,m = divmod(now.year*12+now.month,12)
nextMonth = date(y,m+1,1)
print(now,nextMonth)
# 2020-12-18 2021-01-01
Your way looks good, yet I would have done it this way:
import datetime
from dateutil import relativedelta
dt = datetime.datetime(year=1998,
                       month=12,
                       day=12)
nextmonth = dt + relativedelta.relativedelta(months=1)
nextmonth = nextmonth.replace(day=1)  # replace() returns a new object, so keep the result
print(nextmonth)
Using only Python standard libraries:
import datetime
today = datetime.date.today()
first_of_next_month = today.replace(
    day=1,
    month=today.month % 12 + 1,
    year=today.year + (today.month // 12)
)
This could be generalized to:
def get_first_of_month(date, month_offset=0):
    # zero based indexing of month to make math work
    month_count = date.month - 1 + month_offset
    return date.replace(
        day=1, month=month_count % 12 + 1, year=date.year + (month_count // 12)
    )
first_of_next_month = get_first_of_month(today, 1)
Other solutions that don't require 3rd party libraries include:
Toby Petty's answer is another good option.
If the exact timedelta is helpful to you,
a slight modification on Adam.Er8's answer might be convenient:
import calendar, datetime
today = datetime.date.today()
time_until_next_month = datetime.timedelta(
calendar.monthrange(today.year, today.month)[1] - today.day + 1
)
first_of_next_month = today + time_until_next_month
With Zope's DateTime library a very simple solution is possible:
from DateTime.DateTime import DateTime
date = DateTime() # today
while date.day() != 1:
    date += 1
print(date)
I see so many wonderful solutions to this problem. I was personally looking for a way to get the first and last day of the previous month when I stumbled on this question.
But here is a solution I like to think is quite simple and elegant:
import datetime
date = datetime.datetime.now().date()
first_of_this_month = date - datetime.timedelta(days = date.day - 1)
some_day_next_month = first_of_this_month + datetime.timedelta(days = 32)
first_day_of_next_month_from_date = some_day_next_month - datetime.timedelta(days = some_day_next_month.day - 1)
Here we first remove the days elapsed in the current month to get back to its first day, add 32 days to land somewhere in the next month, and then remove the days elapsed in that month from the new date.
Try this, for starting day of each month, change MonthEnd(1) to MonthBegin(1):
import pandas as pd
from pandas.tseries.offsets import MonthBegin, MonthEnd
date_list = (pd.date_range('2021-01-01', '2022-01-31',
freq='MS') + MonthEnd(1)).strftime('%Y-%m-%d').tolist()
date_list
Out:
['2021-01-31',
'2021-02-28',
'2021-03-31',
'2021-04-30',
'2021-05-31',
'2021-06-30',
'2021-07-31',
'2021-08-31',
'2021-09-30',
'2021-10-31',
'2021-11-30',
'2021-12-31',
'2022-01-31']
With python-dateutil:
from datetime import date
from dateutil.relativedelta import relativedelta
last day of current month:
date.today() + relativedelta(day=31)
first day of next month:
date.today() + relativedelta(day=31) + relativedelta(days=1)

How to get a specific date from the previous month given the current date in python?

I want to get the 20th of the previous month, given the current date.
I am trying to use time.strftime but am not able to subtract a value inside it.
timestr = time.strftime("%Y-(%m-1)%d")
This does not give me what I want. The expected output is 2019-03-20 if my current date is in April. I'm not sure how to go about it.
I read the posts on SO and most of them address getting the first day / last day of the month. Any help would be appreciated.
from datetime import date, timedelta
today = date.today()
last_day_prev_month = today - timedelta(days=today.day)
twenty_prev_month = last_day_prev_month.replace(day=20)
print(twenty_prev_month) # 2019-03-20
Use datetime.replace
import datetime
current_date = datetime.date.today()
new_date = current_date.replace(
month = current_date.month - 1,
day = 20
)
print(new_date)
#2019-03-20
Edit
That won't work for Jan so this is a workaround:
import datetime
current_date = datetime.date(2019, 2, 17)
month = current_date.month - 1
year = current_date.year
if not month:
month, year = 12, year - 1
new_date = datetime.date(year=year, month=month, day=20)
I imagine it is down to the way the format string is handled. It is my understanding that your code produces literal text such as
2019-(03-1)20 or 2019-(12-1)15, etc.
That is because %Y is not a variable but a directive describing how that part of the date should appear in the string; the other characters (like "-") are copied through as they are and never evaluated.
This is clearly not what you are going for. I would just format the date as normal and then rebuild it with the month one earlier:
import datetime
time = datetime.datetime.today()
print(time)
timestr = time.strftime("%Y-%m-%d")
year, month, day = timestr.split("-")
print("{}-{}-{}".format(year, int(month)-1, day))
This would be easier with timedelta objects, but sadly there isn't one for months, because they are of various lengths.
To be more robust if a new year is involved:
import datetime
time = datetime.datetime.today()
print(time)
timestr = time.strftime("%Y-%m-%d")
year, month, day = timestr.split("-")
if month == "01":  # strftime("%m") zero-pads, so January is "01"
    print("{}-{}-{}".format(int(year) - 1, 12, day)) # use December of last year
else:
    print("{}-{}-{}".format(year, int(month)-1, day))
This will help:
from datetime import date, timedelta
dt = date.today() - timedelta(30)  # timedelta(number of days)
print('Current Date :',date.today())
print(dt)
It is not possible to do math inside a string passed to time.strftime, but you can do something similar to what you're asking very easily using the time module.
In Python 3:
import time
# Last month (note: this yields month 0 when run in January)
t = time.gmtime()
print(f"{t.tm_year}-{t.tm_mon-1}-20")
Or in Python 2:
print("{0}-{1}-{2}".format(t.tm_year, t.tm_mon -1, 20))
If you have fewer constraints, you can just use the datetime module instead.
You could use datetime, dateutil or arrow to find the 20th day of the previous month. See examples below.
Using datetime:
from datetime import date
d = date.today()
month, year = (d.month-1, d.year) if d.month != 1 else (12, d.year-1)
last_month = d.replace(day=20, month=month, year=year)
print(last_month)
Using datetime and timedelta:
from datetime import date
from datetime import timedelta
d = date.today()
last_month = (d - timedelta(days=d.day)).replace(day=20)
print(last_month)
Using datetime and dateutil:
from datetime import date
from dateutil.relativedelta import relativedelta # pip install python-dateutil
d = date.today()
last_month = d.replace(day=20) - relativedelta(months=1)
print(last_month)
Using arrow:
import arrow # pip install arrow
d = arrow.now()
last_month = d.shift(months=-1).replace(day=20).datetime.date()
print(last_month)
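To check the January edge case without waiting for the calendar, the timedelta variant can be wrapped in a function that takes the date as an argument. A small sketch; the function name twentieth_of_previous_month is hypothetical:

```python
from datetime import date, timedelta

def twentieth_of_previous_month(d):
    """Return the 20th of the month before d."""
    # Stepping back one day from the 1st always lands in the previous month
    last_day_of_previous_month = d.replace(day=1) - timedelta(days=1)
    return last_day_of_previous_month.replace(day=20)

for sample in (date(2019, 4, 15), date(2019, 1, 3), date(2019, 3, 31)):
    print(sample, "->", twentieth_of_previous_month(sample))
```

The second sample crosses a year boundary and the third starts from a day that does not exist in the previous month; both are handled because the day is replaced only after moving to the previous month.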

finding last business day of a month in python

I'm trying to find the last business day of the month. I wrote the code below for that and it works fine, but I was wondering if there is a cleaner way of doing it?
from datetime import date,timedelta
import datetime
import calendar
today=datetime.date.today()
last = today.replace(day=calendar.monthrange(today.year,today.month)[1])
if last.weekday()<5:
    print(last)
else:
    print(last-timedelta(days=1+last.weekday()-5))
Thanks in advance!
I use the following:
from pandas.tseries.offsets import BMonthEnd
from datetime import date
d=date.today()
offset = BMonthEnd()
#Last business day of current month
offset.rollforward(d)
#Last business day of previous month
offset.rollback(d)
Let's say you want to get the last business day of every month up to the end of the year after next; the following will work.
import pandas as pd
import datetime
start = datetime.date.today()
end = datetime.date(start.year+2, 12, 31)
# 'BM' is the business-month-end alias; newer pandas releases spell it 'BME'
business_days_rng = pd.date_range(start, end, freq='BM')
For one-liner fans:
import calendar
def last_business_day_in_month(year: int, month: int) -> int:
    return max(calendar.monthcalendar(year, month)[-1][:5])
I use this for the first business day of the month, but it can be used for the last business day of the month as well:
import datetime
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
from dateutil.relativedelta import relativedelta
#Create dates needed to be entered as parameters
today = datetime.date.today()
first = today.replace(day=1)
#End of the Prior Month
eopm = first - datetime.timedelta(days=1)
eopm = eopm.strftime("%Y%m%d")
#Business days of the current month, skipping US federal holidays
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())
focm = first
nxtMo = today + relativedelta(months=+1)
fonm = nxtMo.replace(day=1)
eocm = fonm - datetime.timedelta(days=1)
bdays = pd.date_range(start=focm, end=eocm, freq=us_bd).strftime("%Y%m%d")
#First Business Day of the Month
first_bd = bdays[0]
#Last Business Day of the Month
last_bd = bdays[-1]
I left some code in there that is not needed for the last business day of the current month, but may be useful to someone.
You can use pandas to get business days. See http://pandas.pydata.org/pandas-docs/stable/timeseries.html
You can also look at https://pypi.python.org/pypi/business_calendar/ for simple business day calculations.
With rollforward(d) you will skip to the next month if the date is past the last business day of the current month, so the following might be safer for any day of the month:
from datetime import date
import pandas as pd
d = date(2011, 12, 31) # a Saturday
pd.bdate_range(end=pd.offsets.MonthEnd().rollforward(d), periods=1)
pd.offsets.BMonthEnd().rollforward(d)
I needed something intuitively readable and opted for the following:
from datetime import datetime, timedelta
import pandas as pd
def isMonthLastBusinessDay(date):
    lastDayOfMonth = date + pd.offsets.MonthEnd(0)
    isFriday = date.weekday() == 4
    if (date.weekday() < 5 and lastDayOfMonth == date) or (isFriday and lastDayOfMonth == date+timedelta(days=1)) or (isFriday and lastDayOfMonth == date+timedelta(days=2)):
        return True
    else:
        return False
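For comparison, the last business day itself can be found with the standard library alone by starting at the last calendar day and stepping back over the weekend. A minimal sketch that treats Monday to Friday as business days and ignores public holidays; the function name last_business_day is hypothetical:

```python
import calendar
from datetime import date, timedelta

def last_business_day(year, month):
    """Last Monday-to-Friday date of the month; holidays are not considered."""
    last = date(year, month, calendar.monthrange(year, month)[1])
    # weekday(): Monday is 0, Saturday is 5, Sunday is 6
    while last.weekday() >= 5:
        last -= timedelta(days=1)
    return last

print(last_business_day(2011, 12))  # 2011-12-30
```

The loop runs at most twice, since a month can end on a Sunday at worst.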
