Date Offset in pandas data range - python

I have the following formula which get me EOM date every 3M starting Feb 90.
dates = pd.date_range(start="1990-02-01", end="2029-09-30", freq="3M")
I am looking to get in a condensed manner the same table but where the dates are offset by x business days.
This mean, if x = 2, 2 business days before the EOM date calculated every 3M starting Feb 90.
Thanks for the help.

from pandas.tseries.offsets import BDay
x = 2
dates = pd.date_range(start="1990-02-01", end="2029-09-30", freq="3M") - BDay(x)
>>> dates
DatetimeIndex(['1990-02-26', '1990-05-29', '1990-08-29', '1990-11-28',
'1991-02-26', '1991-05-29', '1991-08-29', '1991-11-28',
'1992-02-27', '1992-05-28',
...
'2027-05-27', '2027-08-27', '2027-11-26', '2028-02-25',
'2028-05-29', '2028-08-29', '2028-11-28', '2029-02-26',
'2029-05-29', '2029-08-29'],
dtype='datetime64[ns]', length=159, freq=None)
Example
x = 2
dti1 = pd.date_range(start="1990-02-01", end="2029-09-30", freq="3M")
dti2 = pd.date_range(start="1990-02-01", end="2029-09-30", freq="3M") - BDay(x)
df = pd.DataFrame({"dti1": dti1.day_name(), "dti2": dti2.day_name()})
>>> df.head(20)
dti1 dti2
0 Wednesday Monday
1 Thursday Tuesday
2 Friday Wednesday
3 Friday Wednesday
4 Thursday Tuesday
5 Friday Wednesday
6 Saturday Thursday
7 Saturday Thursday
8 Saturday Thursday
9 Sunday Thursday
10 Monday Thursday
11 Monday Thursday
12 Sunday Thursday
13 Monday Thursday
14 Tuesday Friday
15 Tuesday Friday
16 Monday Thursday
17 Tuesday Friday
18 Wednesday Monday
19 Wednesday Monday

Related

How to iterate two data frames in python?

Dataset1:
Date Weekday OpenPrice ClosePrice
_______________________________________________
28/07/2022 Thursday 5678 5674
04/08/2022 Thursday 5274 5674
11/08/2022. Thursday 7650 7652
Dataset2:
Date Weekday Open Price Close Price
______________________________________________
29/07/2022 Friday 4371 4387
05/08/2022 Friday 6785 6790
12/08/2022 Friday 4367 6756
I would like to iterate these two datasets and create a new dataset with shows data as below. This is the difference between Open Price of Week1 (Week n-1) on Friday and Close price of Week2 (Week n) on Thursday.
Week Difference
______________________________
Week2 543 (i.e 5674 - 4371)
Week3 867 (i.e 7652 - 6785)
Here is the real file:
https://github.com/ravindraprasad75/HRBot/blob/master/DatasetforSOF.xlsx
Don't iterate over dataframes. Merge them instead.
Reconstruction of your data (cf. How to make good reproducible pandas examples on how to share dataframes)
from io import StringIO
from datetime import datetime
cols = ['Date', 'Weekday', 'OpenPrice', 'ClosePrice']
data1 = """28/07/2022 Thursday 5674 5678
04/08/2022 Thursday 5274 5278
11/08/2022. Thursday 7652 7687"""
data2 = """29/07/2022 Friday 4371 4387
05/08/2022 Friday 6785 6790
12/08/2022 Friday 4367 6756"""
df1, df2 = (pd.read_csv(StringIO(d),
header = None,
sep="\s+",
names=cols,
parse_dates=["Date"],
dayfirst=True) for d in (data1, data2))
Add Week column
df1['Week'] = df1.Date.dt.isocalendar().week
df2['Week'] = df2.Date.dt.isocalendar().week
Resulting dataframes:
>>> df1
Date Weekday OpenPrice ClosePrice Week
0 2022-07-28 Thursday 5674 5678 30
1 2022-08-04 Thursday 5274 5278 31
2 2022-08-11 Thursday 7652 7687 32
>>> df2
Date Weekday OpenPrice ClosePrice Week
0 2022-07-29 Friday 4371 4387 30
1 2022-08-05 Friday 6785 6790 31
2 2022-08-12 Friday 4367 6756 32
Merge on Week
df3 = df1.merge(df2, on="Week", suffixes=("_Thursday", "_Friday"))
Result:
>>> df3
Date_Thursday Weekday_Thursday OpenPrice_Thursday ClosePrice_Thursday \
0 2022-07-28 Thursday 5674 5678
1 2022-08-04 Thursday 5274 5278
2 2022-08-11 Thursday 7652 7687
Week Date_Friday Weekday_Friday OpenPrice_Friday ClosePrice_Friday
0 30 2022-07-29 Friday 4371 4387
1 31 2022-08-05 Friday 6785 6790
2 32 2022-08-12 Friday 4367 6756
Now you can simply do df3.OpenPrice_Friday - df3.ClosePrice_Thursday, using shift where you need to compare different weeks.

How to split a dataframe by week on a particular starting weekday (e.g, Thursday)?

I'm using Python, and I have a Dataframe in which all dates and weekdays are mentioned.
And I want to divide them into Week (Like - Thursday to Thursday)
Dataframe -
And Now I want to divide this dataframe in this format-
Date Weekday
0 2021-01-07 Thursday
1 2021-01-08 Friday
2 2021-01-09 Saturday
3 2021-01-10 Sunday
4 2021-01-11 Monday
5 2021-01-12 Tuesday
6 2021-01-13 Wednesday
7 2021-01-14 Thursday,
Date Weekday
0 2021-01-14 Thursday
1 2021-01-15 Friday
2 2021-01-16 Saturday
3 2021-01-17 Sunday
4 2021-01-18 Monday
5 2021-01-19 Tuesday
6 2021-01-20 Wednesday
7 2021-01-21 Thursday,
Date Weekday
0 2021-01-21 Thursday
1 2021-01-22 Friday
2 2021-01-23 Saturday
3 2021-01-24 Sunday
4 2021-01-25 Monday
5 2021-01-26 Tuesday
6 2021-01-27 Wednesday
7 2021-01-28 Thursday,
Date Weekday
0 2021-01-28 Thursday
1 2021-01-29 Friday
2 2021-01-30 Saturday.
In this Format but i don't know how can i divide this dataframe.
You can use pandas.to_datetime if the Date is not yet datetime type, then use the dt.week accessor to groupby:
dfs = [g for _,g in df.groupby(pd.to_datetime(df['Date']).dt.week)]
Alternatively, if you have several years, use dt.to_period:
dfs = [g for _,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W'))]
output:
[ Date Weekday
0 2021-01-07 Thursday
1 2021-01-08 Friday
2 2021-01-09 Saturday
3 2021-01-10 Sunday,
Date Weekday
4 2021-01-11 Monday
5 2021-01-12 Tuesday
6 2021-01-13 Wednesday
7 2021-01-14 Thursday
8 2021-01-14 Thursday
9 2021-01-15 Friday
10 2021-01-16 Saturday
11 2021-01-17 Sunday,
Date Weekday
12 2021-01-18 Monday
13 2021-01-19 Tuesday
14 2021-01-20 Wednesday
15 2021-01-21 Thursday
16 2021-01-21 Thursday
17 2021-01-22 Friday
18 2021-01-23 Saturday
19 2021-01-24 Sunday,
Date Weekday
20 2021-01-25 Monday
21 2021-01-26 Tuesday
22 2021-01-27 Wednesday
23 2021-01-28 Thursday
24 2021-01-28 Thursday
25 2021-01-29 Friday
26 2021-01-30 Saturday]
variants
As dictionary:
{k:g for k,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W'))}
reset_index of subgroups:
[g.reset_index() for _,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W'))]
weeks ending on Wednesday/starting on Thursday with anchor offsets:
[g.reset_index() for _,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W-WED'))]

Change Saturdays and Sundays to Fridays

My DataFrame:
start_trade week_day
0 2021-01-16 09:30:00 Saturday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-31 12:35:00 Sunday
There are no trades on the exchange on Saturday and Sunday. Therefore, if my trading signal falls on the weekend, I want to open a trade on Friday 23:50.
Expexted output:
start_trade week_day
0 2021-01-15 23:50:00 Friday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-29 23:50:00 Friday
How to do it?
You can do it playing with to_timedelta to change the date to the Friday of the week and then set the time with Timedelta. Do this only on the rows wanted with the mask
#for week ends dates
mask = df['start_trade'].dt.weekday.isin([5,6])
df.loc[mask, 'start_trade'] = (df['start_trade'].dt.normalize() # to get midnight
- pd.to_timedelta(df['start_trade'].dt.weekday-4, unit='D') # to get the friday date
+ pd.Timedelta(hours=23, minutes=50)) # set 23:50 for time
df.loc[mask, 'week_day'] = 'Friday'
print(df)
start_trade week_day
0 2021-01-15 23:50:00 Friday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-29 23:50:00 Friday
Try:
weekend = df['week_day'].isin(['Saturday', 'Sunday'])
df.loc[weekend, 'week_day'] = 'Friday'
Or np.where along with str.contains, and | operator:
df['week_day'] = np.where(df['week_day'].str.contains(r'Saturday|Sunday'),'Friday',df['week_day'])

How to set the columns in pandas

Here is my dataframe:
Dec-18 Jan-19 Feb-19 Mar-19 Apr-19 May-19
Saturday 2540.0 2441.0 3832.0 4093.0 1455.0 2552.0
Sunday 1313.0 1891.0 2968.0 2260.0 1454.0 1798.0
Monday 1360.0 1558.0 2967.0 2156.0 1564.0 1752.0
Tuesday 1089.0 2105.0 2476.0 1577.0 1744.0 1457.0
Wednesday 1329.0 1658.0 2073.0 2403.0 1231.0 874.0
Thursday 798.0 1195.0 2183.0 1287.0 1460.0 1269.0
I have tried some pandas ops but I am not able to do that.
This is what I want to do:
items
Saturday 2540.0
Sunday 1313.0
Monday 1360.0
Tuesday 1089.0
Wednesday 1329.0
Thursday 798.0
Saturday 2441.0
Sunday 1891.0
Monday 1558.0
Tuesday 2105.0
Wednesday 1658.0
Thursday 1195.0 ............ and so on
I want to set those rows into rows in downside, how to do that?
df.reset_index().melt(id_vars='index').drop('variable',1)
Output:
index value
0 Saturday 2540.0
1 Sunday 1313.0
2 Monday 1360.0
3 Tuesday 1089.0
4 Wednesday 1329.0
5 Thursday 798.0
6 Saturday 2441.0
7 Sunday 1891.0
8 Monday 1558.0
9 Tuesday 2105.0
10 Wednesday 1658.0
11 Thursday 1195.0
12 Saturday 3832.0
13 Sunday 2968.0
14 Monday 2967.0
15 Tuesday 2476.0
16 Wednesday 2073.0
17 Thursday 2183.0
18 Saturday 4093.0
19 Sunday 2260.0
20 Monday 2156.0
21 Tuesday 1577.0
22 Wednesday 2403.0
23 Thursday 1287.0
24 Saturday 1455.0
25 Sunday 1454.0
26 Monday 1564.0
27 Tuesday 1744.0
28 Wednesday 1231.0
29 Thursday 1460.0
30 Saturday 2552.0
31 Sunday 1798.0
32 Monday 1752.0
33 Tuesday 1457.0
34 Wednesday 874.0
35 Thursday 1269.0
Note: just noted a commented suggesting to do the same thing, I will delete my post if requested :)
Create it with numpy by reshaping the data.
import pandas as pd
import numpy as np
pd.DataFrame(df.to_numpy().flatten('F'),
index=np.tile(df.index, df.shape[1]),
columns=['items'])
Output:
items
Saturday 2540.0
Sunday 1313.0
Monday 1360.0
Tuesday 1089.0
Wednesday 1329.0
Thursday 798.0
Saturday 2441.0
...
Sunday 1798.0
Monday 1752.0
Tuesday 1457.0
Wednesday 874.0
Thursday 1269.0
You can do:
df = df.stack().sort_index(level=1).reset_index(level = 1, drop=True).to_frame('items')
It is interesting that this method got overlooked even though it is the fastest:
import time
start = time.time()
df.stack().sort_index(level=1).reset_index(level = 1, drop=True).to_frame('items')
end = time.time()
print("time taken {}".format(end-start))
yields: time taken 0.006181955337524414
while this:
start = time.time()
df.reset_index().melt(id_vars='days').drop('variable',1)
end = time.time()
print("time taken {}".format(end-start))
yields: time taken 0.010072708129882812
Any my output format matches OP's requested exactly.

Find date name in datetime column with user input as weekday name as a 'String'

I have a datetime column called 'Start Time'
I am trying to find entries with a specific weekday name based on user input.
The user input is a String and can be
Sunday, Monday,...Saturday. So if Sunday is input I want to find all entries that have Sunday as the day, regardless of the month or year.
Here is my code:
user_day = input('Input the name of the day.user_day.')
print(df[df['Start Time'].dt.weekday == dt.datetime.strptime(user_day, '%A')])
The output is:
Empty DataFrame
Columns: [Unnamed: 0, Start Time, End Time, Trip Duration, Start Station, End Station, User Type]
Index: []
Use weekday_name or strftime with same format:
print(df[df['Start Time'].dt.weekday_name == user_day])
Or:
print(df[df['Start Time'].dt.strftime('%A') == user_day])
Verify:
df = pd.DataFrame({'Start Time':pd.date_range('2015-01-01 15:02:45', periods=10)})
print (df)
Start Time
0 2015-01-01 15:02:45
1 2015-01-02 15:02:45
2 2015-01-03 15:02:45
3 2015-01-04 15:02:45
4 2015-01-05 15:02:45
5 2015-01-06 15:02:45
6 2015-01-07 15:02:45
7 2015-01-08 15:02:45
8 2015-01-09 15:02:45
9 2015-01-10 15:02:45
user_day = 'Monday'
print(df[df['Start Time'].dt.weekday_name == user_day])
Start Time
4 2015-01-05 15:02:45
print (df['Start Time'].dt.weekday_name)
0 Thursday
1 Friday
2 Saturday
3 Sunday
4 Monday
5 Tuesday
6 Wednesday
7 Thursday
8 Friday
9 Saturday
Name: Start Time, dtype: object
print (df['Start Time'].dt.strftime('%A'))
0 Thursday
1 Friday
2 Saturday
3 Sunday
4 Monday
5 Tuesday
6 Wednesday
7 Thursday
8 Friday
9 Saturday
Name: Start Time, dtype: object
pandas.Series.dt.weekday: The day of the week with Monday=0,
Sunday=6
pandas.Series.dt.weekday_name: The name of day in a week (e.g.,: Friday)
Pay attention to user input. You are loading a string, you don't need to convert such a string to datetime using strptime.
You just need to match the weekday_name of the datetimelike in your DF with user input. It's a string matching.
print(df[df['Start Time'].dt.weekday_name == user_day])

Categories