Create Pandas Holiday for SIFMA Good Friday - python

For the SIFMA (bond) market, Good Friday is NOT a holiday if it falls on the first Friday of the month, because the NFP (Non-Farm Payroll) numbers are released that day.
We can use a hack to take the existing holidays and not include them if day of month <= 7, but this is ugly. Is there a way to properly define rules where the NFP Fridays are taken into account?
This works, but is ugly:
from pandas.tseries.holiday import (
    AbstractHolidayCalendar,
    Holiday,
    nearest_workday,
    sunday_to_monday,
    GoodFriday,
)
# NFP is released on the first Friday of every month (day of month <= 7),
# so keep only the Good Fridays that fall after the 7th
sifma_good_friday_holidays = [
    Holiday('Good Friday', year=d.year, month=d.month, day=d.day)
    for d in GoodFriday.dates('1900-01-01', '2200-12-31')
    if d.day > 7
]
Note: you cannot currently create a holiday where you pass in both an offset and an observance (so you can't simply add an observance that filters out the holidays where the day of the month <= 7).
The following does seem to work, but also seems like quite a kludge.
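from datetime import date, datetime
def not_NFP_GoodFriday(dt: datetime):
    # Good Friday of dt's year (GoodFriday comes from pandas.tseries.holiday)
    possible = GoodFriday.dates(date(dt.year, 1, 1), date(dt.year, 12, 31))[0]
    # Day of month <= 7 means the first Friday of the month, i.e. an NFP release day
    if possible.day <= 7:
        return None
    # to_pydatetime() avoids the local-timezone shift of a timestamp round trip
    return possible.to_pydatetime()
sifma_good_friday_holidays = Holiday("NFP Good Friday", month=1, day=1,
                                     observance=not_NFP_GoodFriday)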
Is there no better way than one of these two functions?
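One more variation, offered as a sketch rather than a definitive answer: instead of building `Holiday` rules at all, filter the `DatetimeIndex` that `GoodFriday.dates()` returns and hand the result to `CustomBusinessDay` through its `holidays` argument. The names `sifma_good_fridays`, `closed_days` and `bond_bday` are made up for this example, and the 2020 to 2024 window is arbitrary.

```python
import pandas as pd
from pandas.tseries.holiday import GoodFriday


def sifma_good_fridays(start, end):
    """Return the Good Fridays in [start, end] that fall after the 7th of the month."""
    all_good_fridays = GoodFriday.dates(start, end)
    # A Friday with day-of-month <= 7 is the first Friday of the month (NFP day).
    return all_good_fridays[all_good_fridays.day > 7]


# Hypothetical window; widen it to cover whatever period you need.
closed_days = sifma_good_fridays("2020-01-01", "2024-12-31")

# Business-day offset that treats only the non-NFP Good Fridays as holidays.
bond_bday = pd.offsets.CustomBusinessDay(holidays=list(closed_days))

print(list(closed_days.strftime("%Y-%m-%d")))
print(pd.Timestamp("2022-04-14") + bond_bday)
```

This is still the same "day > 7" filter as in the question, just applied once to an index instead of inside a rule, so it does not give you a reusable `Holiday` object.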

Related

Adding a range of dates as one holiday rule, instead of just a single date, in Pandas.tseries AbstractHolidayCalendar?

I'm working on a Python script to offset a given start date with X number of business days according to a custom holiday calendar. Pandas.tseries seems to be a good choice.
When building my generic holiday calendar, I have come across examples on adding a single date to the holiday rules.
Example:
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday, Easter
from pandas.tseries.offsets import Day
class myCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday('Off-day during Easter', month=1, day=1, offset=[Easter(), Day(-2)]),
        Holiday('Christmas Day', month=12, day=25)
    ]
When using a function like this:
def offset_date(start, offset):
    return start + pd.offsets.CustomBusinessDay(n=offset, calendar=myCalendar())
The dates within the rules will be skipped as expected.
But now I want to add a block of 3 full weeks (21 days) to the rule set, starting from a given date, without writing 21 separate rule lines.
Is it possible to create a one-liner that adds these 21 days to the rule set?
Here is one way to do it with a list comprehension, which keeps it short and readable:
class myCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday("Off-day during Easter", month=1, day=1, offset=[Easter(), Day(-2)]),
        Holiday("Christmas Day", month=12, day=25),
    ] + [Holiday("Ski days", month=2, day=x) for x in range(1, 22)]
Here, a 21-day period off, starting on February 1st, is added to the set of rules.
So that:
print(offset_date(pd.to_datetime("2023-01-31"), 1))
# 2023-02-22 00:00:00 as expected
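If the block is easier to describe as "a start date plus N days", a possible alternative is to generate the dates with `pd.date_range` and pass them to `CustomBusinessDay` via `holidays`. This is only a sketch: the variable names are invented here, and unlike the `Holiday` rules above the dates are tied to one specific year.

```python
import pandas as pd

# 21 consecutive calendar days starting on 2023-02-01 (hypothetical block).
ski_days = pd.date_range("2023-02-01", periods=21, freq="D")

# Offset that skips weekends plus every date in the block.
bday = pd.offsets.CustomBusinessDay(holidays=list(ski_days))

result = pd.Timestamp("2023-01-31") + bday
print(result)
```

Because the block is given as explicit dates, it can cross a month boundary without any extra logic, which the `month=2, day=x` comprehension cannot do.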

How can I find the elapsed business hours between two dates using pandas' CustomBusinessHour objects?

If I want to find the number of hours between two datetime objects, I can do something like this:
from datetime import datetime
today = datetime.today()
day_after_tomorrow = datetime(2022, 9, 24)
diff = (day_after_tomorrow - today).total_seconds() / 3600
print(diff)
which prints something like 37.58784580333333 (hours).
But this is the number of real hours between two dates. I want to know the number of specific business hours between two dates.
I can define two CustomBusinessHour objects with pandas to specify those business hours (which are 8AM to 4:30PM M-F, and 8AM to 12PM on Saturday, excluding US Federal holidays):
from pandas.tseries.offsets import CustomBusinessHour
from pandas.tseries.holiday import USFederalHolidayCalendar
business_hours_mtf = CustomBusinessHour(calendar=USFederalHolidayCalendar(), start='08:00', end='16:30')
business_hours_sat = CustomBusinessHour(calendar=USFederalHolidayCalendar(), start='08:00', end='12:00')
My understanding is that CustomBusinessHour is a type of pandas DateOffset object, so it should behave just like a relativedelta object. So I should be able to use it in the datetime arithmetic somehow, to get the number I want.
And that's as far as I was able to get.
What I think I'm struggling to understand is how relativedeltas work, and how to actually use them in datetime arithmetic.
Is this the right approach? If so, how can I use these CustomBusinessHour objects to get an accurate amount of elapsed business hours between the two dates?
I figured out a solution. It feels ugly and hacky, but it seems to work. Hopefully someone else has a simpler or more elegant solution.
Edit: I cleaned up the documentation a little bit to make it easier to read. Also added a missing kwarg in business_hours_sat. Figuring this out was a headache, so if anyone else has to deal with this problem, hopefully this solution helps.
from datetime import datetime, timedelta
from pandas.tseries.offsets import CustomBusinessHour
from pandas.tseries.holiday import USFederalHolidayCalendar
business_hours_mtf = CustomBusinessHour(calendar=USFederalHolidayCalendar(), start='08:00', end='16:30')
business_hours_sat = CustomBusinessHour(calendar=USFederalHolidayCalendar(), weekmask='Sat', start='08:00', end='12:00')
def get_business_hours_range(earlier_date: datetime, later_date: datetime) -> float:
    """Return the number of business hours between `earlier_date` and `later_date` as a float with two decimal places.
    Algorithm:
        1. Increment `earlier_date` by 1 "business hour" until it's further in the future than `later_date`.
        2. Also increment an `elapsed_business_hours` variable by 1.
        3. Once `earlier_date` is larger (further in the future) than `later_date`...
            a. Roll back `earlier_date` by one business hour.
            b. Get the close of business hour for `earlier_date` ([3a]).
            c. Get the number of minutes between [3b] and [3a] (`minutes_remaining`).
            d. Create a timedelta with `elapsed_business_hours` and `minutes_remaining`
            e. Represent this timedelta as a float with two decimal places.
            f. Return this float.
    """
    # Count how many "business hours" have elapsed between the `earlier_date` and `later_date`.
    elapsed_business_hours = 0.0
    current_day_of_week = 0
    while earlier_date < later_date:
        day_of_week = earlier_date.isoweekday()
        # 6 = Saturday
        if day_of_week == 6:
            # Increment `earlier_date` by one "business hour", as specified by the `business_hours_sat` CBH object.
            earlier_date += business_hours_sat
            # Increment the counter of how many "business hours" have elapsed between these two dates.
            elapsed_business_hours += 1
            # Save the current day of the week in `earlier_date`, in case this is the last iteration of this while loop.
            current_day_of_week = day_of_week
        # 1 = Monday, 2 = Tuesday, ...
        elif day_of_week in (1, 2, 3, 4, 5):
            # Increment `earlier_date` by one "business hour", as specified by the `business_hours_mtf` CBH object.
            earlier_date += business_hours_mtf
            # Increment the counter of how many "business hours" have elapsed between these two dates.
            elapsed_business_hours += 1
            # Save the current day of the week in `earlier_date`, in case this is the last iteration of this while loop.
            current_day_of_week = day_of_week
    # Once we've incremented `earlier_date` to a date further in the future than `later_date`, we know that we've counted
    # all the full (60min) "business hours" between `earlier_date` and `later_date`. (We can only increment by one hour when using
    # CBH, so when we make this final increment, we may be skipping over a few minutes in that last day.)
    #
    # So now we roll `earlier_date` back by 1 business hour, to the last full business hour before `later_date`. Then we get the
    # close of business hour for that day, and subtract `earlier_date` from it. This will give us whatever minutes may be remaining
    # in that day, that weren't accounted for when tallying the number of "business hours".
    #
    # But before we do these things, we need to check what day of the week the last business hour is, so we know which closing time
    # to use.
    if current_day_of_week == 6:
        ed_rolled_back = earlier_date - business_hours_sat
        ed_closing_time = datetime.combine(ed_rolled_back, business_hours_sat.end[0])
    elif current_day_of_week in (1, 2, 3, 4, 5):
        ed_rolled_back = earlier_date - business_hours_mtf
        ed_closing_time = datetime.combine(ed_rolled_back, business_hours_mtf.end[0])
    minutes_remaining = (ed_closing_time - ed_rolled_back).total_seconds() / 60
    if 0 < minutes_remaining < 60:
        delta = timedelta(hours=elapsed_business_hours, minutes=minutes_remaining)
    else:
        delta = timedelta(hours=elapsed_business_hours)
    delta_hours = round(float(delta.total_seconds() / 3600), 2)
    return delta_hours
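For comparison, here is a rough standard-library sketch that does not step hour by hour. It walks the calendar days in the range and adds up how much of each day's opening window overlaps the interval. Everything in it is an assumption made for illustration: the `SCHEDULE` table, the function name `business_hours_between`, and the idea of passing holidays as a plain set of `date` objects (which could be filled from a holiday calendar).

```python
from datetime import date, datetime, time, timedelta

# Hypothetical opening hours keyed by date.weekday() (0 = Monday ... 6 = Sunday).
WEEKDAY_HOURS = (time(8, 0), time(16, 30))
SATURDAY_HOURS = (time(8, 0), time(12, 0))
SCHEDULE = {d: WEEKDAY_HOURS for d in range(5)}
SCHEDULE[5] = SATURDAY_HOURS  # Sunday has no entry, so it counts as closed


def business_hours_between(start, end, holidays=frozenset()):
    """Sum the overlap of [start, end] with each day's opening window, in hours."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        window = SCHEDULE.get(day.weekday())
        if window is not None and day not in holidays:
            opens = datetime.combine(day, window[0])
            closes = datetime.combine(day, window[1])
            # Part of the opening window that lies inside [start, end].
            overlap = min(end, closes) - max(start, opens)
            if overlap > timedelta():
                total += overlap
        day += timedelta(days=1)
    return round(total.total_seconds() / 3600, 2)


friday_afternoon = datetime(2022, 9, 23, 15, 0)
monday_morning = datetime(2022, 9, 26, 9, 0)
print(business_hours_between(friday_afternoon, monday_morning))
```

In this example the Friday contributes 1.5 hours, the Saturday 4 and the Monday 1, and partial hours need no special handling because the overlap is computed as a `timedelta`.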

I'm trying to make a function that filters the last working day of the month, I'm having a problem

This function returns the last working day of the previous month. When that day falls on a Friday, I need the result to move to the next business day, which in this case would be a Monday, but it lands on a Saturday instead. How do I make it land only on weekdays? And how can I take holidays into account?
I already have an existing holiday table.
import calendar
from datetime import date
import pandas as pd
from datetime import timedelta
def BsDay():
    today = date.today()
    last_day = max(calendar.monthcalendar(today.year, today.month)[-1:][0][:5])
    validDay = (today.year, today.month-1,
                max(calendar.monthcalendar(today.year, today.month-1)[-1:][0][:5]))
    if today.month == 1 and today.day <= last_day:
        validDay = (today.year-1, today.month+11,
                    max(calendar.monthcalendar(today.year, today.month-1)[-1:][0][:5]))
    elif today.day <= last_day:
        validDay
    else:
        validDay = (today.year, today.month, max(
            calendar.monthcalendar(today.year, today.month)[-1:][0][:5]))
    return validDay
I tried to do this calculation but it is not working.
BusinessDay = ''.join(map(str, BsDay()))
BusinessDay = BusinessDay[0:4] + '-0' + BusinessDay[4]+'-'+BusinessDay[5:7]
dateD='2022-07-29'
dateD=pd.to_datetime(dateD)
dateF=pd.to_datetime(BusinessDay)
LastDayNotWeekend=dateD+timedelta(days=1)
print('LastDayNotWeekend not Weekend',LastDayNotWeekend)
LastDayInWeekend=dateF+timedelta(days=1)
print('LastDayInWeekend in Weekend->',LastDayInWeekend)
Output of the variable where the last working day of the month does not fall on a weekend.
LastDayNotWeekend not Weekend 2022-07-30 00:00:00
Output of the variable where the last working day of the month falls on a weekend.
LastDayInWeekend in Weekend-> 2022-09-01 00:00:00
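One way to avoid landing on a weekend is to step day by day until a business day is found, instead of adding a fixed `timedelta(days=1)`. The following is a minimal sketch under that assumption, using only the standard library. The helper names are invented for this example, and `holidays` is assumed to be a set of `date` objects built from the existing holiday table.

```python
from datetime import date, timedelta


def is_business_day(day, holidays=frozenset()):
    """Monday to Friday, and not listed as a holiday."""
    return day.weekday() < 5 and day not in holidays


def last_business_day_of_previous_month(today, holidays=frozenset()):
    # The day before the 1st of this month is the last calendar day of the previous month.
    day = today.replace(day=1) - timedelta(days=1)
    while not is_business_day(day, holidays):
        day -= timedelta(days=1)
    return day


def next_business_day(day, holidays=frozenset()):
    # Move forward at least one day, then keep going past weekends and holidays.
    day += timedelta(days=1)
    while not is_business_day(day, holidays):
        day += timedelta(days=1)
    return day


last_bd = last_business_day_of_previous_month(date(2022, 8, 15))
print(last_bd, next_business_day(last_bd))
```

Going through `today.replace(day=1)` also handles January without any special case, since subtracting one day from January 1st gives December 31st of the previous year.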

Get number of days in a specific month that are in a date range

Haven't been able to find an answer to this problem. Basically what I'm trying to do is this:
Take a date range, for example October 10th to November 25th. What is the best algorithm for determining how many of the days in the date range are in October and how many are in November?
Something like this:
def daysInMonthFromDaterange(daterange, month):
    # do stuff
    return days
I know that this is pretty easy to implement, I'm just wondering if there's a very good or efficient algorithm.
Thanks
Borrowing the algorithm from the answer to "How do I divide a date range into months in Python?", this might work. The inputs are given as '%Y-%m-%d' date strings, but that can be changed if preferred:
import datetime
begin = '2018-10-10'
end = '2018-11-25'
dt_start = datetime.datetime.strptime(begin, '%Y-%m-%d')
dt_end = datetime.datetime.strptime(end, '%Y-%m-%d')
one_day = datetime.timedelta(1)
start_dates = [dt_start]
end_dates = []
today = dt_start
while today <= dt_end:
    #print(today)
    tomorrow = today + one_day
    if tomorrow.month != today.month:
        start_dates.append(tomorrow)
        end_dates.append(today)
    today = tomorrow
end_dates.append(dt_end)
out_fmt = '%d %B %Y'
for start, end in zip(start_dates,end_dates):
    diff = (end - start).days
    print('{} to {}: {} days'.format(start.strftime(out_fmt), end.strftime(out_fmt), diff))
result:
10 October 2018 to 31 October 2018: 21 days
01 November 2018 to 25 November 2018: 24 days
The problem as stated may not have a unique answer. For example what should you get from daysInMonthFromDaterange('Feb 15 - Mar 15', 'February')? That will depend on the year!
But if you substitute actual days, I would suggest converting from dates to integer days, using the first of the month to the first of the next month as your definition of a month. This is now reduced to intersecting intervals of integers, which is much easier.
The assumption that the first of the month always happened deals with months of different lengths, variable length months, and even correctly handles the traditional placement of the switch from the Julian calendar to the Gregorian. See cal 1752 for that. (It will not handle that switch for all locations though. Should you be dealing with a library that does Romanian dates in 1919, you could have a problem...)
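The interval idea can be sketched roughly as follows, using `date.toordinal()` as the integer day number. The function name and the choice to treat both ends of the range as inclusive are assumptions of this example. With inclusive ends it reports 22 and 25 days for the October/November case, where subtracting the endpoints gives 21 and 24.

```python
from datetime import date


def days_in_month_from_range(start, end, year, month):
    """Count the days of [start, end] (both inclusive) that fall in the given month."""
    month_start = date(year, month, 1)
    # First day of the following month; December rolls over into January of year + 1.
    next_month_start = date(year + month // 12, month % 12 + 1, 1)
    # Intersect the half-open integer intervals [start, end + 1) and [month_start, next_month_start).
    lo = max(start.toordinal(), month_start.toordinal())
    hi = min(end.toordinal() + 1, next_month_start.toordinal())
    return max(0, hi - lo)


range_start = date(2018, 10, 10)
range_end = date(2018, 11, 25)
print(days_in_month_from_range(range_start, range_end, 2018, 10))
print(days_in_month_from_range(range_start, range_end, 2018, 11))
```

No loop over individual days is needed, so the cost does not depend on the length of the range.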
You can use the datetime module:
from datetime import datetime
start = datetime(2018,10,10)
end = datetime(2018,11,25)
print((end - start).days)
Something like this would work:
from datetime import date
def daysInMonthFromDaterange(date1, date2, month):
    # Days from date1 up to (but not including) date2 that share month's year and month
    return [x for x in range(date1.toordinal(), date2.toordinal())
            if date.fromordinal(x).year == month.year and date.fromordinal(x).month == month.month]
print(len(daysInMonthFromDaterange(date(2018, 10, 10), date(2018, 11, 25), date(2018, 10, 1))))
This just loops through all the days between date1 and date2, and returns it as part of a list if it matches the year and month of the third argument.

I want to extract certain dates from the calendar in Python, please

I want to extract certain dates from the calendar. For example, given that the start date is 02/01/2017 and that day happens to be a Monday, I want to add the date to an empty list. If the day is Monday to Thursday, I want to add 8 days to that date and add the new date to the list. If it is a Friday, add 3 days instead.
Put simply, if the start date is a Monday, the list should then get the Tuesday of the next week, the Wednesday of the week after that, then the Thursday, then the Friday, and so on. Each week the weekday advances by one, except that after a Friday the following Monday is added.
The pandas package isn't necessary, but if you ever need to read a date via input(), it is a good tool for handling formatting differences with just one line of code (2016/12/12 vs. 12-12-16, etc.).
If I understand your question correctly, this should accomplish what you want:
from datetime import timedelta
import pandas as pd
dates = []
# input() is the Python 3 name for Python 2's raw_input()
start_date = pd.to_datetime(input('Enter Start Date: '))
dates.append(start_date.strftime('%x'))
if start_date.weekday() < 4:
    # Monday to Thursday: move to the following weekday of the next week
    add_date = start_date + timedelta(8)
    dates.append(add_date.strftime('%x'))
elif start_date.weekday() == 4:
    # Friday: move to the following Monday
    add_date = start_date + timedelta(3)
    dates.append(add_date.strftime('%x'))
else:
    print('Error: Start Date is on a weekend.')
print(dates)
