Change Saturdays and Sundays to Fridays - python

My DataFrame:
start_trade week_day
0 2021-01-16 09:30:00 Saturday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-31 12:35:00 Sunday
There are no trades on the exchange on Saturday and Sunday. Therefore, if my trading signal falls on the weekend, I want to open a trade on Friday 23:50.
Expected output:
start_trade week_day
0 2021-01-15 23:50:00 Friday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-29 23:50:00 Friday
How to do it?

You can do this with to_timedelta: shift each weekend date back to the Friday of the same week, then set the time with Timedelta. Apply the change only to the rows selected by the mask:
# mask for weekend dates
mask = df['start_trade'].dt.weekday.isin([5,6])
df.loc[mask, 'start_trade'] = (df['start_trade'].dt.normalize() # to get midnight
- pd.to_timedelta(df['start_trade'].dt.weekday-4, unit='D') # to get the friday date
+ pd.Timedelta(hours=23, minutes=50)) # set 23:50 for time
df.loc[mask, 'week_day'] = 'Friday'
print(df)
start_trade week_day
0 2021-01-15 23:50:00 Friday
1 2021-01-19 14:30:00 Tuesday
2 2021-01-25 22:00:00 Monday
3 2021-01-29 12:15:00 Friday
4 2021-01-29 23:50:00 Friday

Try:
weekend = df['week_day'].isin(['Saturday', 'Sunday'])
df.loc[weekend, 'week_day'] = 'Friday'

Or use np.where together with str.contains, where | acts as regex alternation:
df['week_day'] = np.where(df['week_day'].str.contains(r'Saturday|Sunday'),'Friday',df['week_day'])
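
Both snippets above only relabel week_day and leave start_trade unchanged. A minimal sketch, assuming the timestamp should also be moved to Friday 23:50 as in the first answer, derives the label from the shifted timestamp with dt.day_name() so the two columns stay in sync. The sample frame is rebuilt from the question's data:

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "start_trade": pd.to_datetime([
        "2021-01-16 09:30:00",
        "2021-01-19 14:30:00",
        "2021-01-25 22:00:00",
        "2021-01-29 12:15:00",
        "2021-01-31 12:35:00",
    ])
})

weekday = df["start_trade"].dt.weekday          # Monday=0 ... Sunday=6
mask = weekday >= 5                             # Saturday or Sunday

# Midnight of the same week's Friday (only meaningful where mask is True)
friday = df["start_trade"].dt.normalize() - pd.to_timedelta(weekday - 4, unit="D")

df.loc[mask, "start_trade"] = friday[mask] + pd.Timedelta(hours=23, minutes=50)

# Recompute the label from the timestamp instead of editing it by hand
df["week_day"] = df["start_trade"].dt.day_name()
print(df)
```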

Related

How to split a dataframe by week on a particular starting weekday (e.g, Thursday)?

I'm using Python, and I have a DataFrame that lists dates together with their weekdays.
I want to divide it into weeks (for example, Thursday to Thursday).
Dataframe -
I want to divide this DataFrame into the following format:
Date Weekday
0 2021-01-07 Thursday
1 2021-01-08 Friday
2 2021-01-09 Saturday
3 2021-01-10 Sunday
4 2021-01-11 Monday
5 2021-01-12 Tuesday
6 2021-01-13 Wednesday
7 2021-01-14 Thursday,
Date Weekday
0 2021-01-14 Thursday
1 2021-01-15 Friday
2 2021-01-16 Saturday
3 2021-01-17 Sunday
4 2021-01-18 Monday
5 2021-01-19 Tuesday
6 2021-01-20 Wednesday
7 2021-01-21 Thursday,
Date Weekday
0 2021-01-21 Thursday
1 2021-01-22 Friday
2 2021-01-23 Saturday
3 2021-01-24 Sunday
4 2021-01-25 Monday
5 2021-01-26 Tuesday
6 2021-01-27 Wednesday
7 2021-01-28 Thursday,
Date Weekday
0 2021-01-28 Thursday
1 2021-01-29 Friday
2 2021-01-30 Saturday.
That is the format I want, but I don't know how to divide the DataFrame this way.
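You can use pandas.to_datetime if the Date is not yet datetime type, then group by the ISO week number. Older pandas versions exposed this as the dt.week accessor, which was deprecated in pandas 1.1 and removed in 2.0; dt.isocalendar().week is the replacement:
dfs = [g for _,g in df.groupby(pd.to_datetime(df['Date']).dt.isocalendar().week)]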
Alternatively, if you have several years, use dt.to_period:
dfs = [g for _,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W'))]
output:
[ Date Weekday
0 2021-01-07 Thursday
1 2021-01-08 Friday
2 2021-01-09 Saturday
3 2021-01-10 Sunday,
Date Weekday
4 2021-01-11 Monday
5 2021-01-12 Tuesday
6 2021-01-13 Wednesday
7 2021-01-14 Thursday
8 2021-01-14 Thursday
9 2021-01-15 Friday
10 2021-01-16 Saturday
11 2021-01-17 Sunday,
Date Weekday
12 2021-01-18 Monday
13 2021-01-19 Tuesday
14 2021-01-20 Wednesday
15 2021-01-21 Thursday
16 2021-01-21 Thursday
17 2021-01-22 Friday
18 2021-01-23 Saturday
19 2021-01-24 Sunday,
Date Weekday
20 2021-01-25 Monday
21 2021-01-26 Tuesday
22 2021-01-27 Wednesday
23 2021-01-28 Thursday
24 2021-01-28 Thursday
25 2021-01-29 Friday
26 2021-01-30 Saturday]
variants
As dictionary:
{k:g for k,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W'))}
reset_index of subgroups:
[g.reset_index() for _,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W'))]
weeks ending on Wednesday/starting on Thursday with anchor offsets:
[g.reset_index() for _,g in df.groupby(pd.to_datetime(df['Date']).dt.to_period('W-WED'))]
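
The W-WED variant can be checked on a small frame. This is a sketch under the assumption that the data is one row per day from 2021-01-07 to 2021-01-30, as in the question. Note that each Thursday lands in exactly one group, so the result does not repeat the boundary Thursday the way the question's layout does:

```python
import pandas as pd

# Assumed sample: one row per calendar day, as in the question
df = pd.DataFrame({"Date": pd.date_range("2021-01-07", "2021-01-30", freq="D")})
df["Weekday"] = df["Date"].dt.day_name()

# 'W-WED' = weekly periods ending on Wednesday, i.e. starting on Thursday
periods = df["Date"].dt.to_period("W-WED")
dfs = [g.reset_index(drop=True) for _, g in df.groupby(periods)]

for g in dfs:
    print(g["Date"].iloc[0].date(), "to", g["Date"].iloc[-1].date(), len(g), "rows")
```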

How can I extract day of week from timestamp in pandas

I have a timestamp column in a dataframe as below, and I want to create another column called day of week from that. How can I do it?
Input:
Pickup date/time
07/05/2018 09:28:00
14/05/2018 17:00:00
15/05/2018 17:00:00
15/05/2018 17:00:00
23/06/2018 17:00:00
29/06/2018 17:00:00
Expected Output:
Pickup date/time Day of Week
07/05/2018 09:28:00 Monday
14/05/2018 17:00:00 Monday
15/05/2018 17:00:00 Tuesday
15/05/2018 17:00:00 Tuesday
23/06/2018 17:00:00 Saturday
29/06/2018 17:00:00 Friday
You can use weekday_name (only on pandas versions before 1.0, where it was removed; see the edit below for newer versions)
df['date/time'] = pd.to_datetime(df['date/time'], format = '%d/%m/%Y %H:%M:%S')
df['Day of Week'] = df['date/time'].dt.weekday_name
You get
date/time Day of Week
0 2018-05-07 09:28:00 Monday
1 2018-05-14 17:00:00 Monday
2 2018-05-15 17:00:00 Tuesday
3 2018-05-15 17:00:00 Tuesday
4 2018-06-23 17:00:00 Saturday
5 2018-06-29 17:00:00 Friday
Edit:
For the newer versions of Pandas, use day_name(),
df['Day of Week'] = df['date/time'].dt.day_name()
pandas>=0.23.0: pandas.Timestamp.day_name() (on a Series, call it through the .dt accessor)
df['Day of Week'] = df['date/time'].dt.day_name()
https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.day_name.html
pandas>=0.18.1,<0.23.0: pandas.Timestamp.weekday_name (a property, not a method)
Deprecated since version 0.23.0 and removed in pandas 1.0
https://pandas-docs.github.io/pandas-docs-travis/reference/api/pandas.Timestamp.weekday_name.html
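
To make the Series versus Timestamp distinction concrete, here is a small sketch that assumes the column is named as in the question ("Pickup date/time") and uses a few of its rows:

```python
import pandas as pd

df = pd.DataFrame({
    "Pickup date/time": [
        "07/05/2018 09:28:00",
        "23/06/2018 17:00:00",
        "29/06/2018 17:00:00",
    ]
})

# Day-first strings need an explicit format, otherwise 07/05 may be read as July 5
parsed = pd.to_datetime(df["Pickup date/time"], format="%d/%m/%Y %H:%M:%S")

# Series: go through the .dt accessor
df["Day of Week"] = parsed.dt.day_name()

# Single Timestamp: call the method directly
single = pd.Timestamp("2018-05-07 09:28:00").day_name()

print(df)
print(single)
```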

Split dataframe to several dataframes

I have the following data:
Date X
...
2014-12-30 23:00:00 2
2014-12-30 23:15:00 0
2014-12-30 23:30:00 1
2014-12-30 23:45:00 1
2014-12-31 00:00:00 22
...
2015-01-01 00:00:00 0
2015-01-02 00:00:00 2
2015-01-03 00:00:00 2
2015-01-04 00:00:00 2
2015-01-04 00:00:00 2
2015-01-05 00:00:00 2
...
I want to split this time series (dataframe) into many time series (dataframe). I would like to have one time series for each Monday, one for all Tuesdays, Wednesdays ... etc.
How can I do that with pandas?
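You can create a dictionary of DataFrames with groupby and the weekday name. The original answer used dt.weekday_name, which was removed in pandas 1.0; dt.day_name() is the current equivalent:
dfs = dict(tuple(df.groupby(df['Date'].dt.day_name())))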
#select by days
print (dfs['Friday'])
Date X
6 2015-01-02 2
print (dfs['Thursday'])
Date X
5 2015-01-01 0
Detail:
print (df['Date'].dt.day_name())
0 Tuesday
1 Tuesday
2 Tuesday
3 Tuesday
4 Wednesday
5 Thursday
6 Friday
7 Saturday
8 Sunday
9 Sunday
10 Monday
Name: Date, dtype: object
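
A self-contained sketch of the same grouping, using the rows visible in the question (the elided "..." rows are left out) and dt.day_name():

```python
import pandas as pd

# Only the rows shown in the question; elided rows are omitted
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2014-12-30 23:00:00", "2014-12-30 23:15:00",
        "2014-12-30 23:30:00", "2014-12-30 23:45:00",
        "2014-12-31 00:00:00", "2015-01-01 00:00:00",
        "2015-01-02 00:00:00", "2015-01-03 00:00:00",
        "2015-01-04 00:00:00", "2015-01-04 00:00:00",
        "2015-01-05 00:00:00",
    ]),
    "X": [2, 0, 1, 1, 22, 0, 2, 2, 2, 2, 2],
})

# One DataFrame per weekday name, keyed by that name
dfs = {name: group for name, group in df.groupby(df["Date"].dt.day_name())}

print(sorted(dfs))
print(dfs["Tuesday"])
```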

Select periods from days of week and time

A B C
0 2001-01-13 10:00:00 Saturday
1 2001-01-14 12:33:00 Sunday
2 2001-01-20 15:10:00 Saturday
3 2001-01-24 13:15:00 Wednesday
4 2001-01-24 16:56:00 Wednesday
5 2001-01-24 19:09:00 Wednesday
6 2001-01-28 19:14:00 Sunday
7 2001-01-29 11:00:00 Monday
8 2001-01-29 23:50:00 Monday
9 2001-01-30 11:50:00 Tuesday
10 2001-01-30 13:00:00 Tuesday
11 2001-02-02 16:14:00 Wednesday
12 2001-02-02 09:25:00 Friday
I want to create a new df containing rows between all periods from Mondays at 12:00:00 to Wednesdays at 17:00:00
The output would be:
A B C
3 2001-01-24 13:15:00 Wednesday
5 2001-01-24 16:56:00 Wednesday
8 2001-01-29 23:50:00 Monday
9 2001-01-30 11:50:00 Tuesday
10 2001-01-30 13:00:00 Tuesday
11 2001-02-02 16:14:00 Wednesday
I tried with
df[(df["B"] >= "12:00:00") & (df["B"] <= "17:00:00")] & df[(df["C"] >= "Monday") & (df["C"] <= "Wednesday")]
But this is not what I want.
Thank you.
You can create three boolean masks and filter with boolean indexing: one for the first day from the start time onward, one for the full days in between, and one for the last day up to the end time:
from datetime import time
#if necessary convert to datetime
df['A'] = pd.to_datetime(df['A'])
#if necessary convert to times
df['B'] = pd.to_datetime(df['B']).dt.time
m1 = (df['B']>=time(12)) & (df['C'] == 'Monday')
m2 = (df['C'] == 'Tuesday')
m3 = (df['B']<=time(17)) & (df['C'] == 'Wednesday')
df = df[m1 | m2 | m3]
print (df)
A B C
3 2001-01-24 13:15:00 Wednesday
4 2001-01-24 16:56:00 Wednesday
8 2001-01-29 23:50:00 Monday
9 2001-01-30 11:50:00 Tuesday
10 2001-01-30 13:00:00 Tuesday
12 2001-02-02 09:25:00 Wednesday
Another solution, using the same times for a Monday-to-Friday range:
from datetime import time
df['A'] = pd.to_datetime(df['A'])
df['B'] = pd.to_datetime(df['B']).dt.time
m1 = (df['B']>=time(12)) & (df['C'] == 'Monday')
m2 = df['C'].isin(['Tuesday', 'Wednesday', 'Thursday'])
m3 = (df['B']<=time(17)) & (df['C'] == 'Friday')
df = df[m1 | m2 | m3]
print (df)
A B C
3 2001-01-24 13:15:00 Wednesday
4 2001-01-24 16:56:00 Wednesday
5 2001-01-24 19:09:00 Wednesday
8 2001-01-29 23:50:00 Monday
9 2001-01-30 11:50:00 Tuesday
10 2001-01-30 13:00:00 Tuesday
11 2001-02-02 16:14:00 Friday
12 2001-02-02 09:25:00 Wednesday
Use the OR (|) operator with equality (==) for the day names instead of & with <= and >=, and combine all conditions into a single boolean mask. Hope it helps. Thanks.
Old: df[(df["B"] >= "12:00:00") & (df["B"] <= "17:00:00")] & df[(df["C"] >= "Monday") & (df["C"] <= "Wednesday")]
New: df[(df["B"] >= "12:00:00") & (df["B"] <= "17:00:00") & ((df["C"] == "Monday") | (df["C"] == "Tuesday") | (df["C"] == "Wednesday"))]
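
A different technique from the answers above, offered as a sketch: instead of comparing day-name strings, compute how far each timestamp is from Monday 00:00 of its own week and keep the rows whose offset lies between Monday 12:00 and Wednesday 17:00. It assumes A holds dates and B holds times as strings; the sample rows are a subset of the question's data:

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["2001-01-24", "2001-01-24", "2001-01-29",
          "2001-01-29", "2001-01-30", "2001-01-28"],
    "B": ["13:15:00", "19:09:00", "11:00:00",
          "23:50:00", "11:50:00", "19:14:00"],
})

# Combine date and time into one timestamp
ts = pd.to_datetime(df["A"] + " " + df["B"])

# Monday 00:00 of the week each timestamp belongs to
week_start = ts.dt.normalize() - pd.to_timedelta(ts.dt.weekday, unit="D")
since_monday = ts - week_start

start = pd.Timedelta(days=0, hours=12)   # Monday 12:00
end = pd.Timedelta(days=2, hours=17)     # Wednesday 17:00

selected = df[(since_monday >= start) & (since_monday <= end)]
print(selected)
```

Because the comparison is done on a single offset, the window boundaries can be changed without adding or removing per-day masks.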

How to select observations of df using datetime index attributes in Pandas?

Given a df of this kind, where we have DateTime Index:
DateTime A
2007-08-07 18:00:00 1
2007-08-08 00:00:00 2
2007-08-08 06:00:00 3
2007-08-08 12:00:00 4
2007-08-08 18:00:00 5
2007-11-02 18:00:00 6
2007-11-03 00:00:00 7
2007-11-03 06:00:00 8
2007-11-03 12:00:00 9
2007-11-03 18:00:00 10
I would like to subset observations using the attributes of the index, like:
First business day of the month
Last business day of the month
First Friday of the month 'WOM-1FRI'
Third Friday of the month 'WOM-3FRI'
I'm specifically interested to know if this can be done using something like:
df.loc[(df['A'] < 5) & (df.index == 'WOM-3FRI'), 'Signal'] = 1
Thanks
You could try...
# FIRST DAY OF MONTH
df.loc[df[1:][df.index.month[:-1]!=df.index.month[1:]].index]
# LAST DAY OF MONTH
df.loc[df[:-1][df.index.month[:-1]!=df.index.month[1:]].index]
# 1st Friday (a Friday whose day of the month is 1 to 7)
fr1 = df.groupby(df.index.year*100+df.index.month).apply(lambda x: x[(x.index.day<=7)&(x.index.weekday==4)])
# 3rd Friday (a Friday whose day of the month is 15 to 21)
fr3 = df.groupby(df.index.year*100+df.index.month).apply(lambda x: x[(x.index.day>=15)&(x.index.day<=21)&(x.index.weekday==4)])
If you want to remove extra-levels in the index of fr1 and fr3:
fr1.index=fr1.index.droplevel(0)
fr3.index=fr3.index.droplevel(0)
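
The Friday selections can also be written without groupby. The sketch below assumes a daily index over the same months as the question (the real data is intraday, hence the normalize() call) and cross-checks the third Fridays against the 'WOM-3FRI' offset alias mentioned in the question:

```python
import pandas as pd

# Assumed sample: one row per day, covering the months in the question
idx = pd.date_range("2007-08-01", "2007-11-30", freq="D")
df = pd.DataFrame({"A": range(len(idx))}, index=idx)

is_friday = df.index.weekday == 4                     # Monday=0 ... Friday=4

# First Friday falls on day 1-7, third Friday on day 15-21
first_fri = df[is_friday & (df.index.day <= 7)]
third_fri = df[is_friday & (df.index.day >= 15) & (df.index.day <= 21)]

# Same third Fridays generated from the week-of-month offset alias
third_fri_dates = pd.date_range(idx.min(), idx.max(), freq="WOM-3FRI")

# Boolean mask usable inside df.loc[...]; normalize() drops intraday times
on_third_fri = df.index.normalize().isin(third_fri_dates)

print(third_fri.index.tolist())
```

A mask such as on_third_fri can then be combined with other conditions, for example df.loc[(df['A'] < 5) & on_third_fri, 'Signal'] = 1.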
