If a date in a datetime series falls on a weekend (US), I'd like to move that date forward to the following Monday. So far I've come up with this, but it obviously won't work, likely for several reasons, not least because the days parameter of timedelta can't be a series.
df['Open Date'] = np.where(df['Open Date'].dt.weekday > 4, df['Open Date'] + timedelta(days=7-df['Open Date'].dt.weekday), df['Open Date'])
How can I change this to work with a series?
pd.offsets.BusinessDay(0) will shift weekends to the following Monday, leaving weekdays unchanged.
import pandas as pd
df = pd.DataFrame({'date': pd.date_range('2020-12-20', '2020-12-29', freq='D')})
df['date_shift'] = df['date'] + pd.offsets.BusinessDay(0)
date date_shift
0 2020-12-20 2020-12-21 # Sunday -> Monday
1 2020-12-21 2020-12-21 # Monday -> Monday
2 2020-12-22 2020-12-22
3 2020-12-23 2020-12-23
4 2020-12-24 2020-12-24
5 2020-12-25 2020-12-25 # Christmas Holiday Friday Unchanged
6 2020-12-26 2020-12-28 # Saturday -> Monday
7 2020-12-27 2020-12-28 # Sunday -> Monday
8 2020-12-28 2020-12-28
9 2020-12-29 2020-12-29
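If you would rather keep the conditional logic from the question, one option is to build the day offsets as a Series and convert them with pd.to_timedelta, which accepts a Series where timedelta(days=...) does not. This is only a sketch: the 'Open Date' column name comes from the question, while the sample dates and the 'Shifted' column are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: Thursday 2020-12-24 through Tuesday 2020-12-29
df = pd.DataFrame({'Open Date': pd.date_range('2020-12-24', '2020-12-29', freq='D')})

weekday = df['Open Date'].dt.weekday  # Monday=0 ... Sunday=6

# Days to add: 7 - weekday on weekends (Sat -> 2, Sun -> 1), otherwise 0
days_to_monday = (7 - weekday).where(weekday > 4, 0)

# pd.to_timedelta converts the whole Series of day counts at once
df['Shifted'] = df['Open Date'] + pd.to_timedelta(days_to_monday, unit='D')
print(df)
```

Like BusinessDay(0), this only knows about weekends; holidays such as 2020-12-25 are left unchanged.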
Dataset1:
Date Weekday OpenPrice ClosePrice
_______________________________________________
28/07/2022 Thursday 5678 5674
04/08/2022 Thursday 5274 5674
11/08/2022 Thursday 7650 7652
Dataset2:
Date Weekday Open Price Close Price
______________________________________________
29/07/2022 Friday 4371 4387
05/08/2022 Friday 6785 6790
12/08/2022 Friday 4367 6756
I would like to iterate over these two datasets and create a new dataset that shows the data below. This is the difference between the Open Price of Week1 (Week n-1) on Friday and the Close Price of Week2 (Week n) on Thursday.
Week Difference
______________________________
Week2 543 (i.e 5674 - 4371)
Week3 867 (i.e 7652 - 6785)
Here is the real file:
https://github.com/ravindraprasad75/HRBot/blob/master/DatasetforSOF.xlsx
Don't iterate over dataframes. Merge them instead.
Reconstruction of your data (cf. How to make good reproducible pandas examples on how to share dataframes)
from io import StringIO
import pandas as pd
cols = ['Date', 'Weekday', 'OpenPrice', 'ClosePrice']
data1 = """28/07/2022 Thursday 5674 5678
04/08/2022 Thursday 5274 5278
11/08/2022 Thursday 7652 7687"""
data2 = """29/07/2022 Friday 4371 4387
05/08/2022 Friday 6785 6790
12/08/2022 Friday 4367 6756"""
# Parse both whitespace-separated blocks, reading dates as day-first
df1, df2 = (pd.read_csv(StringIO(d),
                        header=None,
                        sep=r"\s+",
                        names=cols,
                        parse_dates=["Date"],
                        dayfirst=True) for d in (data1, data2))
Add Week column
df1['Week'] = df1.Date.dt.isocalendar().week
df2['Week'] = df2.Date.dt.isocalendar().week
Resulting dataframes:
>>> df1
Date Weekday OpenPrice ClosePrice Week
0 2022-07-28 Thursday 5674 5678 30
1 2022-08-04 Thursday 5274 5278 31
2 2022-08-11 Thursday 7652 7687 32
>>> df2
Date Weekday OpenPrice ClosePrice Week
0 2022-07-29 Friday 4371 4387 30
1 2022-08-05 Friday 6785 6790 31
2 2022-08-12 Friday 4367 6756 32
Merge on Week
df3 = df1.merge(df2, on="Week", suffixes=("_Thursday", "_Friday"))
Result:
>>> df3
Date_Thursday Weekday_Thursday OpenPrice_Thursday ClosePrice_Thursday \
0 2022-07-28 Thursday 5674 5678
1 2022-08-04 Thursday 5274 5278
2 2022-08-11 Thursday 7652 7687
Week Date_Friday Weekday_Friday OpenPrice_Friday ClosePrice_Friday
0 30 2022-07-29 Friday 4371 4387
1 31 2022-08-05 Friday 6785 6790
2 32 2022-08-12 Friday 4367 6756
Now you can simply do df3.OpenPrice_Friday - df3.ClosePrice_Thursday, using shift where you need to compare different weeks.
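To make the shift step concrete, here is a minimal sketch of the cross-week difference the question describes (Thursday close of week n minus Friday open of week n-1). It rebuilds the merged frame from the reconstructed data above and assumes the rows are sorted by week with no missing weeks; the PrevFridayOpen and Difference column names are made up for illustration.

```python
import pandas as pd

cols = ["Date", "Weekday", "OpenPrice", "ClosePrice"]
df1 = pd.DataFrame([["2022-07-28", "Thursday", 5674, 5678],
                    ["2022-08-04", "Thursday", 5274, 5278],
                    ["2022-08-11", "Thursday", 7652, 7687]], columns=cols)
df2 = pd.DataFrame([["2022-07-29", "Friday", 4371, 4387],
                    ["2022-08-05", "Friday", 6785, 6790],
                    ["2022-08-12", "Friday", 4367, 6756]], columns=cols)

for d in (df1, df2):
    d["Date"] = pd.to_datetime(d["Date"])
    d["Week"] = d["Date"].dt.isocalendar().week

df3 = df1.merge(df2, on="Week", suffixes=("_Thursday", "_Friday"))

# shift(1) moves each Friday open down one row, i.e. onto the following week
df3["PrevFridayOpen"] = df3["OpenPrice_Friday"].shift(1)
df3["Difference"] = df3["ClosePrice_Thursday"] - df3["PrevFridayOpen"]

print(df3[["Week", "ClosePrice_Thursday", "PrevFridayOpen", "Difference"]])
```

The first week has no previous Friday, so its Difference is NaN. If weeks can be missing from either dataset, a positional shift is no longer safe and the previous week would have to be matched explicitly.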
I have a dataframe as follows:
period
1651622400000.00000
1651536000000.00000
1651449600000.00000
1651363200000.00000
1651276800000.00000
1651190400000.00000
1651104000000.00000
1651017600000.00000
I have converted it into human readable datetime as:
df['period'] = pd.to_datetime(df['period'], unit='ms')
and this outputs:
2022-04-04 00:00:00
2022-04-05 00:00:00
2022-04-06 00:00:00
2022-04-07 00:00:00
2022-04-08 00:00:00
2022-04-09 00:00:00
2022-04-10 00:00:00
2022-04-11 00:00:00
2022-04-12 00:00:00
hours minutes and seconds are turned to 0.
I checked this into https://www.epochconverter.com/ and this gives
GMT: Monday, April 4, 2022 12:00:00 AM
Your time zone: Monday, April 4, 2022 5:45:00 AM GMT+05:45
How do I get h, m, and s as well?
https://www.epochconverter.com/ also shows the time converted to your local timezone, which is where the 5:45 comes from.
If you need timezones in the column, use Series.dt.tz_localize and then Series.dt.tz_convert:
df['period'] = (pd.to_datetime(df['period'], unit='ms')
                  .dt.tz_localize('GMT')
                  .dt.tz_convert('Asia/Kathmandu'))
print (df)
period
0 2022-05-04 05:45:00+05:45
1 2022-05-03 05:45:00+05:45
2 2022-05-02 05:45:00+05:45
3 2022-05-01 05:45:00+05:45
4 2022-04-30 05:45:00+05:45
5 2022-04-29 05:45:00+05:45
6 2022-04-28 05:45:00+05:45
7 2022-04-27 05:45:00+05:45
There is no problem with your code or with pandas, and I don't think the timezone is the issue here either (as the other answer suggests it is). April 4, 2022 12:00:00 AM is exactly the same date and time as 2022-04-04 00:00:00, only written in 12-hour notation with AM. You could specify timezones as jezrael writes, or with utc=True (check the docs), but I guess that's not your problem.
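A small sketch of the utc=True variant, using the first two values from the question. It suggests that the timestamps really are exact midnights in UTC, so the zero hours, minutes and seconds are the data, not a parsing loss; the variable names utc and local are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"period": [1651622400000.0, 1651536000000.0]})

# utc=True returns timezone-aware timestamps in UTC
utc = pd.to_datetime(df["period"], unit="ms", utc=True)

# The same instants viewed in another timezone show a non-zero time of day
local = utc.dt.tz_convert("Asia/Kathmandu")

print(utc)
print(local)
print(utc.dt.hour.tolist(), local.dt.hour.tolist(), local.dt.minute.tolist())
```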
My DataFrame:
start_trade week_day
0 2021-01-16 09:30:00 Saturday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-31 12:35:00 Sunday
There are no trades on the exchange on Saturday and Sunday. Therefore, if my trading signal falls on the weekend, I want to open a trade on Friday 23:50.
Expected output:
start_trade week_day
0 2021-01-15 23:50:00 Friday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-29 23:50:00 Friday
How to do it?
You can do it with to_timedelta to move the date back to the Friday of the same week, then set the time with Timedelta. Apply this only to the weekend rows, selected with a mask:
# select weekend dates (Saturday=5, Sunday=6)
mask = df['start_trade'].dt.weekday.isin([5,6])
df.loc[mask, 'start_trade'] = (df['start_trade'].dt.normalize() # to get midnight
                               - pd.to_timedelta(df['start_trade'].dt.weekday-4, unit='D') # to get the friday date
                               + pd.Timedelta(hours=23, minutes=50)) # set 23:50 for time
df.loc[mask, 'week_day'] = 'Friday'
print(df)
start_trade week_day
0 2021-01-15 23:50:00 Friday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-29 23:50:00 Friday
To relabel the week_day column (this does not change start_trade), try:
weekend = df['week_day'].isin(['Saturday', 'Sunday'])
df.loc[weekend, 'week_day'] = 'Friday'
Or use np.where along with str.contains and the regex | (alternation) operator:
df['week_day'] = np.where(df['week_day'].str.contains(r'Saturday|Sunday'),'Friday',df['week_day'])
I have a pandas series s, and I would like to extract the Monday before the third Friday.
With the help of the answer to the following question, I can resample to the third Friday, but I am still not sure how to get the Monday just before it:
pandas resample to specific weekday in month
from pandas.tseries.offsets import WeekOfMonth
s.resample(rule=WeekOfMonth(week=2,weekday=4)).bfill().asfreq(freq='D').dropna()
Any help is welcome
Many thanks
For each source date, compute your "wanted" date in 3 steps:
Shift back to the first day of the current month.
Shift forward to Friday in third week.
Shift back 4 days (from Friday to Monday).
For a Series containing dates, the code to do it is:
s.dt.to_period('M').dt.to_timestamp() + pd.offsets.WeekOfMonth(week=2, weekday=4)\
- pd.Timedelta('4D')
To test this code I created the source Series as:
s = (pd.date_range('2020-01-01', '2020-12-31', freq='MS') + pd.Timedelta('1D')).to_series()
It contains the second day of each month, both as the index and value.
When you run the above code, you will get:
2020-01-02 2020-01-13
2020-02-02 2020-02-17
2020-03-02 2020-03-16
2020-04-02 2020-04-13
2020-05-02 2020-05-11
2020-06-02 2020-06-15
2020-07-02 2020-07-13
2020-08-02 2020-08-17
2020-09-02 2020-09-14
2020-10-02 2020-10-12
2020-11-02 2020-11-16
2020-12-02 2020-12-14
dtype: datetime64[ns]
The left column contains the original index (source date) and the right
column - the "wanted" date.
Note that the "third Monday" formula (as proposed in one of the comments) is wrong.
E.g. the third Monday in January 2020 is 2020-01-20, whereas the correct date is 2020-01-13.
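The January 2020 case can be checked directly. This is just a sketch comparing the two formulas for a single month; the variable names are made up for illustration.

```python
import pandas as pd

first = pd.Timestamp('2020-01-01')

# Third Friday of the month, then back 4 days to the Monday before it
third_friday = first + pd.offsets.WeekOfMonth(week=2, weekday=4)
monday_before = third_friday - pd.Timedelta('4D')

# Third Monday of the month, for comparison
third_monday = first + pd.offsets.WeekOfMonth(week=2, weekday=0)

print(third_friday.date(), monday_before.date(), third_monday.date())
```

The two dates differ whenever the month starts on a Tuesday through Friday, because the first Friday then comes before the first Monday.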
Edit
If you have a DataFrame, something like:
Date Amount
0 2020-01-02 10
1 2020-01-12 10
2 2020-01-13 2
3 2020-01-20 2
4 2020-02-16 2
5 2020-02-17 12
6 2020-03-15 12
7 2020-03-16 3
8 2020-03-31 3
and you want something like resample but each "period" should start
on a Monday before the third Friday in each month, and e.g. compute
a sum for each period, you can:
Define the following function:
def dateShift(d):
    d += pd.Timedelta(4, 'D')
    d = pd.offsets.WeekOfMonth(week=2, weekday=4).rollback(d)
    return d - pd.Timedelta(4, 'D')
i.e.:
Add 4 days (e.g. move 2020-01-13 (Monday) to 2020-01-17 (Friday)).
Roll back to the most recent third Friday (in the above case the date is already on the offset, so it is not moved).
Subtract 4 days to get back to the Monday.
Run:
df.groupby(df.Date.apply(dateShift))[['Amount']].sum()
The result is:
Amount
Date
2019-12-16 20
2020-01-13 6
2020-02-17 24
2020-03-16 6
E.g. the two values of 10 for 2020-01-02 and 2020-01-12 are assigned
to the period starting on 2019-12-16 (the "wanted" date for December 2019).
Given a df of this kind, where we have DateTime Index:
DateTime A
2007-08-07 18:00:00 1
2007-08-08 00:00:00 2
2007-08-08 06:00:00 3
2007-08-08 12:00:00 4
2007-08-08 18:00:00 5
2007-11-02 18:00:00 6
2007-11-03 00:00:00 7
2007-11-03 06:00:00 8
2007-11-03 12:00:00 9
2007-11-03 18:00:00 10
I would like to subset observations using the attributes of the index, like:
First business day of the month
Last business day of the month
First Friday of the month 'WOM-1FRI'
Third Friday of the month 'WOM-3FRI'
I'm specifically interested to know if this can be done using something like:
df.loc[(df['A'] < 5) & (df.index == 'WOM-3FRI'), 'Signal'] = 1
Thanks
You could try...
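# First observation of each month in the data (the very first row is never selected)
df.loc[df[1:][df.index.month[:-1]!=df.index.month[1:]].index]
# Last observation of each month in the data (the very last row is never selected)
df.loc[df[:-1][df.index.month[:-1]!=df.index.month[1:]].index]
# 1st Friday: a Friday whose day of month is 1-7
fr1 = df.groupby(df.index.year*100+df.index.month).apply(lambda x: x[(x.index.weekday==4) & (x.index.day<=7)])
# 3rd Friday: a Friday whose day of month is 15-21
fr3 = df.groupby(df.index.year*100+df.index.month).apply(lambda x: x[(x.index.weekday==4) & (x.index.day>=15) & (x.index.day<=21)])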
If you want to remove extra-levels in the index of fr1 and fr3:
fr1.index=fr1.index.droplevel(0)
fr3.index=fr3.index.droplevel(0)
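For the df.loc form asked about in the question, one possible approach is to build boolean masks from the index attributes and combine them with the other conditions. The sketch below uses made-up daily data for August 2007 and an arbitrary threshold on A; the mask names are hypothetical, and the business-day masks only account for weekends, not holidays.

```python
import pandas as pd

# Hypothetical sample data: one row per day of August 2007
idx = pd.date_range("2007-08-01", "2007-08-31", freq="D")
df = pd.DataFrame({"A": range(1, 32)}, index=idx)

# Third Friday: a Friday whose day of month falls in 15..21
is_third_friday = pd.Series(
    (df.index.weekday == 4) & (df.index.day >= 15) & (df.index.day <= 21),
    index=df.index,
)

# First and last business day of each row's month (weekends only, no holidays)
month_start = df.index.to_period("M").to_timestamp()
first_bday = month_start + pd.offsets.BusinessDay(0)  # rolls a weekend forward to Monday
last_bday = month_start + pd.offsets.BMonthEnd(1)     # last business day of that month
is_first_bday = df.index.normalize() == first_bday
is_last_bday = df.index.normalize() == last_bday

# Combine an index-based mask with a condition on a column
df["Signal"] = 0
df.loc[(df["A"] < 20) & is_third_friday, "Signal"] = 1

print(df[df["Signal"] == 1])
print(df.index[is_first_bday], df.index[is_last_bday])
```

The index is normalized before the business-day comparison so that intraday timestamps, as in the question's 6-hourly data, would still match their date.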