How to retrieve previous NYSE trading day in Pandas? - python

I'm trying to get the previous trading day in relation to a given trading day. To start I am simply trying to make a function which returns the previous business day when given a date, like so:
import datetime
import pandas as pd
def get_previous_trading_day(day):
    # Parse the input into a pandas Timestamp, then step back one calendar day
    day = pd.to_datetime(day)
    previous_trading_day = day - datetime.timedelta(days=1)
    return previous_trading_day
But when I call my function and print the current vs previous date, the previous date is not the previous day:
2021-05-01 curr
1885-02-22 00:00:00 prev
How do I make this work?

Your function works as written when I run it: it returns the previous calendar day. If you change the calculation to use pd.tseries.offsets.BDay instead, you get the previous business day rather than the previous calendar day. Note that BDay only skips weekends, so it will not account for bank holidays on which no trading occurs.
def get_previous_trading_day(day):
    day = pd.to_datetime(day)
    previous_trading_day = day - pd.tseries.offsets.BDay(1)
    return previous_trading_day
Calling the function with a Monday, 2022-05-16, returns the preceding Friday:
get_previous_trading_day("2022-05-16")
#Out: Timestamp('2022-05-13 00:00:00')
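BDay only skips weekends. One way to also skip exchange holidays is pandas' CustomBusinessDay with an explicit holiday list. This is a minimal sketch, assuming a hand-maintained list: the single entry (Good Friday 2022) is for illustration only and is not a complete NYSE calendar, and the constant name is hypothetical.

```python
import pandas as pd

# Assumption: a hand-maintained holiday list; this is NOT a full NYSE calendar.
ILLUSTRATIVE_HOLIDAYS = ["2022-04-15"]  # Good Friday 2022


def get_previous_trading_day(day, holidays=ILLUSTRATIVE_HOLIDAYS):
    """Return the previous weekday that is not in `holidays`."""
    day = pd.to_datetime(day).normalize()  # drop any time-of-day component
    # CustomBusinessDay skips weekends and the supplied holidays
    return day - pd.offsets.CustomBusinessDay(n=1, holidays=holidays)


print(get_previous_trading_day("2022-04-18"))  # the Monday after Good Friday
print(get_previous_trading_day("2022-05-16"))
```

For the Monday after Good Friday this steps back past both the weekend and the holiday to Thursday, 2022-04-14.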
As for the date you got back, the input may have been in a format that pd.to_datetime did not parse the way you intended. If you need a specific format, pass the format= keyword argument to pd.to_datetime.
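To illustrate the format= suggestion: an ambiguous string such as "01/05/2021" is read month-first by default, so an explicit format is needed to get 1 May. The input string is hypothetical, since the question does not show the original call.

```python
import pandas as pd

raw = "01/05/2021"  # hypothetical day-first input

# Without a format, pandas reads this month-first (January 5th)
default_parse = pd.to_datetime(raw)

# With an explicit format, the same string is read day-first (May 1st)
explicit_parse = pd.to_datetime(raw, format="%d/%m/%Y")

print(default_parse.date(), explicit_parse.date())
```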

Related

Filter Django query by week of month?

In a Django query, how would you filter by a timestamp's week within a month?
There's a built-in week accessor, but that refers to week-of-the-year, e.g. 1-52. As far as I can tell, there's no other built-in option.
The only way I see to do this is to calculate the start and end date range for the week, and then filter on that using the conventional means.
So I'm using a function like:
from datetime import date
def week_of_month_date(year, month, week):
    """
    Returns the date of the first day in the week of the given date's month,
    where Monday is the first day of the week.
    e.g. week_of_month_date(year=2022, month=8, week=2) -> date(2022, 8, 7)
    """
    assert 1 <= week <= 5
    assert 1 <= month <= 12
    for i in range(1, 32):
        dt = date(year, month, i)
        _week = week_of_month(dt)  # week_of_month is a helper not shown here
        if _week == week:
            return dt
and then to calculate for, say, the 3rd week of July, 2022, I'd do:
start_date = week_of_month_date(2022, 7, 3)
# end_date is the first day of the following week, so it is excluded with __lt
end_date = start_date + timedelta(days=7)
qs = MyModel.objects.filter(created__gte=start_date, created__lt=end_date)
Is there an easier or more efficient way to do this with the Django ORM or SQL?
The easiest way to do this with datetime objects is to subtract the ISO week number of the first day of the month from the ISO week number of the date, then add 1.
You can use the .isocalendar() method to achieve this:
dt.isocalendar()[1] - dt.replace(day=1).isocalendar()[1] + 1
For example, if the date falls in ISO week 46 and the first day of its month falls in ISO week 44, the result is 46 - 44 + 1 = 3. Note that this breaks down in January when the 1st belongs to ISO week 52 or 53 of the previous year.
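The ISO-week subtraction can go negative in January, when the 1st of the month belongs to ISO week 52 or 53 of the previous year. This is a minimal sketch that avoids ISO weeks altogether. It assumes weeks start on Monday and that week 1 is the (possibly partial) week containing the 1st. The name week_of_month is borrowed from the question's helper, whose real definition is not shown.

```python
from datetime import date


def week_of_month(dt):
    """Week of the month (1-based), with weeks starting on Monday."""
    first = dt.replace(day=1)
    # first.weekday() is 0 for Monday; adding it shifts the day number so that
    # the first Monday on or after the 2nd starts week 2
    return (dt.day + first.weekday() - 1) // 7 + 1


print(week_of_month(date(2022, 8, 8)))   # 1 August 2022 was a Monday
print(week_of_month(date(2021, 1, 15)))  # 1 January 2021 was in ISO week 53 of 2020
```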
UPDATE
I misunderstood the question, the answer is clear below. However, you may want to consider revising your function based on my above comments.
Come to think of it, if you have a datetime object, you can get the isocalendar week and filter using that like so:
MyModel.objects.filter(created__week=dt.isocalendar()[1])
dt.isocalendar() returns a tuple of three integers: [0] is the ISO year, [1] is the ISO week (1 to 52 or 53), and [2] is the day of the week (1 to 7).
As per the docs here:
https://docs.djangoproject.com/en/4.1/ref/models/querysets/#week
There is a built-in filter for isoweek out of the box :)
However, filtering by "week of month" is not possible within the realms of "out of the box".
You might consider writing your own query expression that accepts an isocalendar object and converts it. But I think you would be better off converting a datetime object and using the ISO week filter.
There's a neat little blog post here to get you started if you really want to do that:
https://dev.to/idrisrampurawala/writing-custom-django-database-functions-4dmb
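If a plain date-range filter is acceptable, the range for the n-th week of a month can be computed directly rather than by scanning days. This is a sketch under the same Monday-first assumption as the question; week_of_month_range is a hypothetical name, and the range is clipped to the month.

```python
from datetime import date, timedelta


def week_of_month_range(year, month, week):
    """Half-open [start, end) date range of the given week, clipped to the month."""
    first = date(year, month, 1)
    next_month = date(year + month // 12, month % 12 + 1, 1)
    # Monday of the week that contains the 1st (may fall in the previous month)
    week1_monday = first - timedelta(days=first.weekday())
    start = week1_monday + timedelta(weeks=week - 1)
    end = start + timedelta(days=7)
    start, end = max(start, first), min(end, next_month)
    if start >= end:
        raise ValueError(f"{year}-{month:02d} has no week {week}")
    return start, end


start, end = week_of_month_range(2022, 7, 3)
print(start, end)
# With the question's model this would be used as:
# MyModel.objects.filter(created__gte=start, created__lt=end)
```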

How can I get the year, month, and day from a Deephaven DateTime in Python?

I have a Deephaven DateTime in the New York (US-East) timezone and I'd like to get the year, month, and day (of the month) numbers from it as integers in Python.
Deephaven's time module has these utilities. You may have used it to create a Deephaven DateTime in the first place.
from deephaven import time as dhtu
timestamp = dhtu.to_datetime("2022-04-01T12:00:00 NY")
The following three methods will give you what you're looking for:
year - Gets the year
month_of_year - Gets the month
day_of_month - Gets the day of the month
All three methods will give you what you want based on the DateTime itself and your preferred time zone.
tz_ny = dhtu.TimeZone.NY
year = dhtu.year(timestamp, tz_ny)
month = dhtu.month_of_year(timestamp, tz_ny)
day = dhtu.day_of_month(timestamp, tz_ny)

Python/Pandas: Find the Custom Business Quarter End of a datetime which takes holidays into account

I want to find the Business Quarter End of a datetime in python which will take care of holidays as well. These holidays may be passed as list for simplicity. I know BQuarterEnd() from pandas.tseries.offsets. As far as I know, it doesn't take holidays into account.
Example: If 2020-11-20 is passed and 2020-12-31 is a business day but a holiday as well; it should return 2020-12-30.
Thanks.
In Pandas, there is a set of custom business day offsets that let you define your own list of holidays and then calculate the correct date offsets for you, taking that custom holiday list into account.
For example, we have CustomBusinessMonthEnd (better documentation here). Unfortunately, there is no corresponding CustomBusinessQuarterEnd offset for quarter ends.
However, we can work around this as follows:
Define your custom holiday list, e.g. :
holiday_list = ['2020-12-31']
Make use of a combination of QuarterEnd + CustomBusinessMonthEnd to get the required date for Custom Business QuarterEnd skipping the holidays:
import pandas as pd
base_date = pd.to_datetime('2020-11-20') # Base date
custom_business_quarter_end = (base_date
                               + pd.offsets.QuarterEnd(n=0)
                               - pd.offsets.MonthBegin()
                               + pd.offsets.CustomBusinessMonthEnd(holidays=holiday_list))
First, we add QuarterEnd to the base date to get the quarter-end date (without considering holidays). Then, to get the custom business quarter end that skips holidays, we apply CustomBusinessMonthEnd, passing the holiday list so that it adjusts for them.
For QuarterEnd, we pass n=0 to handle the edge case where the base date is already on the quarter-end date; this stops QuarterEnd from rolling that date over to the next quarter end. See the official doc here for how Pandas handles dates that fall on anchor dates (the subsection starting with "For the case when n=0, ...").
We also subtract MonthBegin before adding CustomBusinessMonthEnd. This prevents a date on the month-end anchor from rolling over to the next month. It is needed because n=0 does not prevent rollover for CustomBusinessMonthEnd the way it does for QuarterEnd. Subtracting MonthBegin first gives the start of the quarter-end month, i.e. 2020-12-01, and from there we get the custom business month-end date. This way, the QuarterEnd result, e.g. 2020-12-31, is not rolled over to the next business month end, e.g. 2021-01-29, as it would be if CustomBusinessMonthEnd were applied to it directly.
Result:
print(custom_business_quarter_end)
2020-12-30 00:00:00
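The same three-step combination can be wrapped in a helper so the edge cases are easy to check. This is a sketch of the workaround above rather than a pandas feature; the function name is hypothetical.

```python
import pandas as pd


def custom_business_quarter_end(date, holidays=()):
    """Last business day of the quarter containing `date`, skipping `holidays`."""
    # n=0 keeps a date that is already on a quarter end where it is
    quarter_end = pd.Timestamp(date) + pd.offsets.QuarterEnd(n=0)
    # Step back to the first day of the quarter-end month
    month_start = quarter_end - pd.offsets.MonthBegin()
    # Roll forward to the last custom business day of that month
    return month_start + pd.offsets.CustomBusinessMonthEnd(holidays=list(holidays))


print(custom_business_quarter_end("2020-11-20", ["2020-12-31"]))
print(custom_business_quarter_end("2020-12-31", ["2020-12-31"]))
```

Both calls land on 2020-12-30; note that in the second case the result is earlier than the input date, because the input itself is the holiday.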
You probably need a custom function. Maybe something like:
def custom_quarter_end(date, holidays=()):
    holidays = [pd.Timestamp(h) for h in holidays]
    end = pd.Timestamp(date) + pd.tseries.offsets.BQuarterEnd()
    # Step back one business day at a time while the candidate is a holiday
    while end in holidays:
        end = end - pd.tseries.offsets.BDay()
    return end
>>> custom_quarter_end("2020-11-20", ["2020-12-30", "2020-12-31"])
Timestamp('2020-12-29 00:00:00')

Python - Calendar / Date Library for Arithmetic Date Operations

This is for Python:
I need a library that is able to do arithmetic operations on dates while taking into account the duration of a month and or year.
For example, say I add a value of "1 day" to 3/31/2020; the result should be:
1 + 3/31/2020 = 4/1/2020.
I also would need to be able to convert this to datetime format, and extract day, year and month.
Does a library like this exist?
import datetime
tday = datetime.date.today()  # today's date
print("Today:", tday)
# a one-week duration
oneWeek = datetime.timedelta(days=7)
# 7 days plus 1440 minutes (one more day), i.e. eight days in total
eightDays = datetime.timedelta(days=7, minutes=1440)
print("A week later than today:", tday + oneWeek)  # today + 7 days
And the output to this code snippet is:
Today: 2020-03-25
A week later than today: 2020-04-01
As you can see, it takes month overflow into account and rolls March over into April. The datetime module has a lot more in it; I don't know all of its attributes well and haven't used it for a long time, but you can find good documentation and tutorials on the web.
You can also create a specific date (within some constraints) instead of today's by supplying the day, month, and year. I just don't remember how to do it.
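For the specific-date case, datetime.date takes the year, month, and day in that order. A short sketch using the date from the question, also showing the conversion to a datetime and the extraction of the individual fields:

```python
from datetime import date, datetime, timedelta

d = date(2020, 3, 31)               # year, month, day
next_day = d + timedelta(days=1)    # month overflow is handled automatically

# Convert the date to a datetime at midnight
as_datetime = datetime.combine(next_day, datetime.min.time())

print(next_day)                                      # 2020-04-01
print(next_day.year, next_day.month, next_day.day)   # 2020 4 1
print(as_datetime)                                   # 2020-04-01 00:00:00
```

Note that timedelta has no "months" or "years" unit, since those vary in length; adding a calendar month needs extra logic or a third-party library.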

Python3 Panda's Holiday fails to NOT find dates in arbitrary periods in the past

I made my own definition of the MLK Day Holiday that is based not on when the holiday was first observed, but on when it was first observed by the NYSE. The NYSE first observed MLK Day in January of 1998.
When asking the Holiday for the days in which the holiday occurred between dates, it works fine for the most part, returning an empty set when the MLK date is not in the range requested, and returning the appropriate date when it is. For date ranges that precede the start_date of the holiday, it appropriately returns the empty set, until we hit around 1995, and then it fails. I cannot figure out why it fails then and not in other situations when the empty set is the correct answer.
Note: Still stuck on Pandas 0.22.0. Python3
import pandas as pd
from datetime import datetime
from dateutil.relativedelta import MO
from pandas.tseries.holiday import Holiday
__author__ = 'eb'
mlk_rule = Holiday('MLK Day (NYSE Observed)',
                   start_date=datetime(1998, 1, 1), month=1, day=1,
                   offset=pd.DateOffset(weekday=MO(3)))
start = pd.to_datetime('1999-01-17')
end = pd.to_datetime('1999-05-01')
finish = pd.to_datetime('1980-01-01')
while start > finish:
    print(f"{start} - {end}:")
    try:
        dates = mlk_rule.dates(start, end, return_name=True)
    except Exception as e:
        print("\t****** Fail *******")
        print(f"\t{e}")
        break
    print(f"\t{dates}")
    start = start - pd.DateOffset(years=1)
    end = end - pd.DateOffset(years=1)
When run, this results in:
1999-01-17 00:00:00 - 1999-05-01 00:00:00:
1999-01-18 MLK Day (NYSE Observed)
Freq: 52W-MON, dtype: object
1998-01-17 00:00:00 - 1998-05-01 00:00:00:
1998-01-19 MLK Day (NYSE Observed)
Freq: 52W-MON, dtype: object
1997-01-17 00:00:00 - 1997-05-01 00:00:00:
Series([], dtype: object)
1996-01-17 00:00:00 - 1996-05-01 00:00:00:
Series([], dtype: object)
1995-01-17 00:00:00 - 1995-05-01 00:00:00:
****** Fail *******
Must provide freq argument if no data is supplied
What happens in 1995 that causes it to fail, that does not happen in the same periods in the years before?
ANSWER: Inside the Holiday class, the dates() method gathers the list of valid holidays within a requested date range. To ensure that this works properly, the implementation gathers all holidays from one year before to one year after the requested date range via the internal _reference_dates() method. In this method, if the receiving Holiday instance has an internal start or end date, it uses that date as the beginning or end of the range to be examined rather than the requested range that was passed in, even if the dates in the requested range precede or exceed the start or end date of the rule.
The existing implementation mistakenly assumes it is ok to limit the effective range over which it must accurately identify what holidays are in existence to the range over which holidays exist. As part of a set of rules in a calendar, it is as important for a Holiday to identify where holidays do not exist as where they do. The NULL set response is an important function of the Holiday class.
For example, in a Trading Day Calendar that needs to identify when financial markets are open or closed, the calendar may need to accurately identify which days the market is closed over a 100 year history. The market only closed for MLK day for a small part of that history. A calendar that includes the MLK holiday as constructed above throws an error when asked for the open days or holidays for periods preceding the MLK start_date[1].
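The 1995 cutoff seen in the question is consistent with this description. Below is a minimal sketch of the bounds arithmetic as described above; it is not the pandas source, and reference_bounds and RULE_START are illustrative names.

```python
from datetime import datetime

RULE_START = datetime(1998, 1, 1)  # start_date of the rule in the question


def reference_bounds(requested_end, rule_start=RULE_START, month=1, day=1):
    """Reference range when the rule's start_date overrides the requested start."""
    reference_start = datetime(rule_start.year - 1, month, day)
    reference_end = datetime(requested_end.year + 1, month, day)
    return reference_start, reference_end


for year in (1997, 1996, 1995):
    lo, hi = reference_bounds(datetime(year, 5, 1))
    print(year, lo.date(), hi.date(), "empty" if lo > hi else "non-empty")
```

With the rule starting in 1998, the reference range always begins at 1997-01-01. A request ending in 1996 yields an end of 1997-01-01, which is still a single point, while a request ending in 1995 yields 1996-01-01, which precedes the start. An empty range at that step would explain why the error first appears for 1995.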
To fix this, I re-implemented the _reference_dates() method in a custom sub-class of Holiday to ensure that when the requested date range extends before the start_date or after the end_date of the holiday rule, it builds the reference dates from the actual requested range rather than bounding the request by the internal start and end dates.
Here is the implementation I am using.
class MLKHoliday(Holiday):
    def __init__(self):
        super().__init__('MLK Day (NYSE Observed)',
                         start_date=datetime(1998, 1, 1), month=1, day=1,
                         offset=pd.DateOffset(weekday=MO(3)))
    def _reference_dates(self, start_date, end_date):
        """
        Get reference dates for the holiday.
        Return reference dates for the holiday also returning the year
        prior to the start_date and year following the end_date. This ensures
        that any offsets to be applied will yield the holidays within
        the passed in dates.
        """
        if self.start_date and start_date and start_date >= self.start_date:
            start_date = self.start_date.tz_localize(start_date.tz)
        if self.end_date and end_date and end_date <= self.end_date:
            end_date = self.end_date.tz_localize(end_date.tz)
        year_offset = pd.DateOffset(years=1)
        reference_start_date = pd.Timestamp(
            datetime(start_date.year - 1, self.month, self.day))
        reference_end_date = pd.Timestamp(
            datetime(end_date.year + 1, self.month, self.day))
        # Don't process unnecessary holidays
        # (pandas 0.22 API; newer versions build this range with pd.date_range instead)
        dates = pd.DatetimeIndex(start=reference_start_date,
                                 end=reference_end_date,
                                 freq=year_offset, tz=start_date.tz)
        return dates
Does anyone know if this has been fixed in a more up-to-date version of pandas?
[1] Note: As constructed in the original question, the mlk_rule will not actually fail to provide the NULL set to the dates() call over a range just preceding the start_date but will actually start throwing exceptions a year or so before that. This is because the mistaken assumption about the lack of need for a proper NULL set response is mitigated by the extension of the date range by a year in each direction by _reference_dates().
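Independently of how Holiday behaves in any given pandas version, the requirement itself (an empty result before 1998, the third Monday of January from then on) can be sketched with the standard library alone. This is an alternative illustration, not the pandas mechanism, and the function names are hypothetical.

```python
from datetime import date, timedelta

FIRST_OBSERVED_YEAR = 1998  # per the question: NYSE first observed MLK Day in 1998


def third_monday_of_january(year):
    first = date(year, 1, 1)
    days_to_first_monday = (0 - first.weekday()) % 7  # Monday is weekday 0
    return first + timedelta(days=days_to_first_monday + 14)


def mlk_days(start, end):
    """Observed MLK Days in [start, end]; an empty list is a valid answer."""
    result = []
    for year in range(max(start.year, FIRST_OBSERVED_YEAR), end.year + 1):
        day = third_monday_of_january(year)
        if start <= day <= end:
            result.append(day)
    return result


print(mlk_days(date(1999, 1, 17), date(1999, 5, 1)))
print(mlk_days(date(1995, 1, 17), date(1995, 5, 1)))
```

The first call returns 1999-01-18, matching the output in the question, and the second returns an empty list instead of raising.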
