Calculating calendar weeks from fiscal weeks - python

So I am really new to this and struggling with something, which I feel should be quite simple.
I have a Pandas Dataframe containing two columns: Fiscal Week (str) and Amount sold (int).
   Fiscal Week  Amount sold
0  2019031      24
1  2019041      47
2  2019221      34
3  2019231      46
4  2019241      35
My problem is the Fiscal Week column. It contains strings that describe the fiscal year and week. The fiscal year for this purpose starts on October 1st and ends on September 30th. So 2019031 is the Monday (the 1 at the end) of the third week of October 2019, and 2019221 would be the 2nd week of March 2020.
The issue is that I want to turn this data into a time series later. I can't do that with the data in string format; I need it to be in datetime format.
I actually added the 1s at the end of all these strings using
df['Fiscal Week']= df['Fiscal Week'].map('{}1'.format)
so that I can then turn it into a proper date:
df['Fiscal Week'] = pd.to_datetime(df['Fiscal Week'], format="%Y%W%w")
as I couldn't figure out how to do it with just the weeks and no day defined.
This, of course, returns the following:
   Fiscal Week  Amount sold
0  2019-01-21   24
1  2019-01-28   47
2  2019-06-03   34
3  2019-06-10   46
4  2019-06-17   35
As expected, this is clearly not what I need, as according to the definition of the fiscal year week 1 is not January at all but rather October.
Is there some simple solution to get the dates to what they are actually supposed to be?
Ideally I would like the final format to be e.g. 2019-03 for the first entry. So basically exactly like the string but in some kind of date format, that I can then work with later on. Alternatively, calendar weeks would also be fine.

Assuming you have a data frame with fiscal dates of the form 'YYYYWW', where YYYY = the calendar year of the start of the fiscal year and WW = the number of weeks into the year, you can convert to calendar dates as follows:
import pandas as pd

def getCalendarDate(fy_date: str):
    f_year = fy_date[0:4]
    # Take only the two week digits, so an appended day digit is ignored
    f_week = fy_date[4:6]
    # The fiscal year starts on October 1st
    fys = pd.to_datetime(f'{f_year}/10/01', format='%Y/%m/%d')
    return fys + pd.to_timedelta(int(f_week), "W")
You can then use this function to create the column of calendar dates as follows:
df['Calendar Date'] = df['Fiscal Week'].map(getCalendarDate)
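The same conversion can also be done on the whole column at once. The following is only a sketch: the function name fiscal_weeks_to_dates and the sample frame are made up for illustration, and it keeps the convention used above (October 1st plus N weeks). If week 1 should begin on October 1st itself, subtract one week.

```python
import pandas as pd

def fiscal_weeks_to_dates(codes: pd.Series) -> pd.Series:
    """Hypothetical helper: October 1st of the fiscal year plus N weeks."""
    # First four characters are the year in which the fiscal year starts
    year_start = pd.to_datetime(codes.str[:4] + "-10-01", format="%Y-%m-%d")
    # Next two characters are the week number; any trailing day digit is ignored
    weeks = codes.str[4:6].astype(int)
    return year_start + pd.to_timedelta(weeks * 7, unit="D")

# Sample values taken from the question (without the appended day digit)
df = pd.DataFrame({"Fiscal Week": ["201903", "201904", "201922"],
                   "Amount sold": [24, 47, 34]})
df["Calendar Date"] = fiscal_weeks_to_dates(df["Fiscal Week"])
print(df)
```

Because the result is a real datetime column, it can be used directly as a time series index, for example with df.set_index("Calendar Date").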

Related

Trying to sort by date in a grouped dataframe in python

I want to produce a dataframe that splits by day (which is the day date of the month) but then orders them by the date. At the moment the code below splits them into dates e.g. 1 - 11, 2 - 11 but the 30 -10 and 31-10 come after all my November dates.
ResultSet2 = ResultProxy2.fetchall()
df2 = pd.DataFrame(ResultSet2)
resultsrecovery = [group[1] for group in df2.groupby(["day"])]
The current code output :
I basically want the grouped dataframe for the 30-10 and 31st of October to come before all the ones in November

Update Financial Week based on Multiple Parameters using Pandas/Python

I have a table that is updated manually every week using Excel. I would like to automate this process using python/pandas. I want to update the report week number (this number indicates how many times we have reported on that month so far for a given quarter) based on financial week and month. Obviously we are now in September, but I will show you the first week to give you an idea of how it's updated. The first week for 2021 would start on 01/04/2021 (first Monday of the year) and end on 12/27/2021 (last Monday of the year).
This script is to be run weekly, so the next time it is run, 01/04/2021 --> 01/11/2021, 1 week is added and the "Report Week" should increase by 1 too, unless the report week is greater than 13. If "Report Week" is greater than 13, then we stop updating that month and add the next month. So in this case we drop December and start reporting March, and its Report Week becomes 1, as this is the first time we are reporting on it.
Month     Finance Week  Report Week
December  01/04/2021    13
January   01/04/2021    8
February  01/04/2021    4
January   01/11/2021    9
February  01/11/2021    5
March     01/11/2021    1
When January hits Report Week 13, we will stop updating that month and move on to April, giving it a value of 1 for its Report Week, and so on for every month.
I am not sure what is the best way to go about this. I read here https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iterrows.html that one should not update when iterating a df so I'm not sure what to do.

How to convert a date into the nth day of the week in Alteryx or Python?

If I have a column of dates like the one below:
Date
2021-08-01
2021-08-02
2021-08-03
2021-08-01
2021-08-02
What I wish to do is add a new column that tells me which occurrence of its weekday in the year the date is (for example, the number of Mondays so far).
So for the first record, the 1st of August was a Sunday and it was the 31st Sunday of the year, whereas the 12th was a Thursday and was the 32nd Thursday of the year.
Date Number Of WeekDay in Year
2021-08-01 31
2021-08-02 31
2021-08-03 31
2021-08-12 32
... ...
If it makes it easier is there a way to do it using the python tool within Alteryx?
The answer by johnjps111 explains it all, but here's an implementation using Python only (no Alteryx):
import math
from datetime import datetime

def get_weekday_occurrences(date_string):
    year_day = datetime.strptime(date_string, '%Y-%m-%d').timetuple().tm_yday
    return math.ceil(year_day / 7)
Which can be used as follows:
get_weekday_occurrences('2021-08-12')
Note we use datetime.timetuple to get the day of the year (tm_yday).
For Alteryx, try the formula Ceil(DateTimeFormat([date],'%j') / 7). Explanation: regardless of the day of the week, if it's the first day of the year, it's also the first "of that weekday" of the year. At day number 8, it becomes the 2nd "of that weekday" of the year, and so on. Since Alteryx gives you "day of the year" for free using the DateTimeFormat function, it's then a simple division and Ceil() function.
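If the dates already live in a pandas column, the same "day of year divided by 7, rounded up" idea can be applied to the whole column. This is a minimal sketch; the variable names are made up, and the sample dates are the ones from the expected output in the question.

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["2021-08-01", "2021-08-02",
                                  "2021-08-03", "2021-08-12"]))

# Integer ceiling division: (n + 6) // 7 equals ceil(n / 7) for n >= 1
occurrence = (dates.dt.dayofyear + 6) // 7

print(pd.DataFrame({"Date": dates, "Number Of WeekDay in Year": occurrence}))
```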
To get the week in the year from a date string:
from datetime import datetime
a = '2021-08-02'
b = datetime.fromisoformat(a)
print('week of the year:', b.strftime('%W'))
output:
week of the year: 31
For more information, see the documentation of Python's datetime module.

Python - Get policy year from datetime dataframe

I have a dataframe (df) with a column in datetime format YYYY-MM-DD ('date'). I am trying to create a new column that returns the policy year, which always starts on April 1st, so the policy year for January through March will always be the prior calendar year. Some of the dates are rather old, so setting up individual date ranges for the sample below wouldn't be ideal.
The dataframe would look like this
df['date']
2020-12-10
2021-02-10
2019-03-31
and output should look like this
2020
2020
2018
I now know how to get the year using df['date'].dt.year. However, I am having trouble getting the dataframe to convert each year to the respective policy year so that if df['date'].dt.month >= 4 then df['date'].dt.year, else df['date'].dt.year - 1
I am not quite sure how to set this up exactly. I have been trying to avoid setting up multiple columns to do a bool for month >= 4 and then setting up different columns. I've gone so far as to set up this but get ValueError stating the series is too ambiguous
def PolYear(x):
    y = x.dt.month
    if y >= 4:
        x.dt.year
    else:
        x.dt.year - 1

df['Pol_Year'] = PolYear(df['date'])
I wasn't sure if this was the right way to go about it, so I also tried a df.loc format for >= and < 4, but the lengths of the key and value are not equal. I definitely think I'm missing something super simple.
I previously had mentioned 'fiscal year', but this is incorrect.
Quang Hoand had the right idea but used the incorrect frequency in the call to to_period(self, freq). For your purposes you want to use the following code:
df.date.dt.to_period('Q-MAR').dt.qyear
This will give you:
0 2021
1 2021
2 2019
Name: date, dtype: int64
Q-MAR defines a fiscal year that ends in March.
These values are the correct fiscal years (fiscal years are named for the year in which they end, not the year in which they begin). If you want the output to use the year in which they begin, it's simple:
df.date.dt.to_period('Q-MAR').dt.qyear - 1
Giving you
0 2020
1 2020
2 2018
Name: date, dtype: int64
This is qyear:
df.date.dt.to_period('Q').dt.qyear
Output:
0 2020
1 2021
2 2019
Name: date, dtype: int64
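The rule stated in the question (keep the calendar year when the month is April or later, otherwise subtract one) can also be written directly on the column, without any function or row-wise if. This is just a sketch using the three sample dates from the question; the column name Pol_Year is taken from the asker's attempt.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2020-12-10", "2021-02-10", "2019-03-31"])})

# The boolean mask is True for January through March; as an integer it is
# exactly the number of years to subtract from the calendar year.
before_april = (df["date"].dt.month < 4).astype(int)
df["Pol_Year"] = df["date"].dt.year - before_april

print(df)
```

This avoids the ambiguous-truth-value ValueError because the comparison is never used in a plain Python if statement.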

how can I align different-day timeseries in pandas?

I have two time series, df1
day cnt
2020-03-01 135006282
2020-03-02 145184482
2020-03-03 146361872
2020-03-04 147702306
2020-03-05 148242336
and df2:
day cnt
2017-03-01 149104078
2017-03-02 149781629
2017-03-03 151963252
2017-03-04 147384922
2017-03-05 143466746
The problem is that the sensors I'm measuring are sensitive to the day of the week, so on Sunday, for instance, they will produce less cnt. Now I need to compare the time series over 2 different years, 2017 and 2020, but to do that I have to align (March, in this case) to the matching day of the week, and plot them accordingly. How do I "shift" the data to make the series comparable?
The ISO calendar represents a date as a tuple (year, week number, weekday). In pandas these are available through the dt accessor: year, weekday, and the week number via dt.isocalendar().week (the older dt.weekofyear attribute was deprecated and removed in recent pandas versions). So assuming that the day column actually contains Timestamps (convert it first with to_datetime if it does not), you could do:
df1['Y'] = df1.day.dt.year
df1['W'] = df1.day.dt.isocalendar().week
df1['D'] = df1.day.dt.weekday
Then you could align the dataframes on the W and D columns
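As a rough sketch of that alignment step, the two frames can be merged on the week and weekday columns. The helper name add_iso_keys, the suffixes, and the sample cnt values below are made up for illustration; only the day and cnt column names come from the question.

```python
import pandas as pd

# Made-up sample data covering the first two weeks of March in each year
df1 = pd.DataFrame({"day": pd.date_range("2020-03-01", periods=14), "cnt": range(14)})
df2 = pd.DataFrame({"day": pd.date_range("2017-03-01", periods=14), "cnt": range(100, 114)})

def add_iso_keys(df):
    """Hypothetical helper: add ISO week (W) and ISO weekday (D) columns."""
    iso = df["day"].dt.isocalendar()
    return df.assign(W=iso["week"], D=iso["day"])

# An inner merge keeps only the (week, weekday) pairs present in both years
aligned = add_iso_keys(df1).merge(add_iso_keys(df2), on=["W", "D"],
                                  suffixes=("_2020", "_2017"))
print(aligned[["W", "D", "day_2020", "cnt_2020", "day_2017", "cnt_2017"]])
```

Each row of aligned then pairs a 2020 measurement with the 2017 measurement from the same ISO week and the same day of the week.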
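March 2017 started on a Wednesday.
March 2020 started on a Sunday.
So delete the last 3 days of March 2017, and delete the first Sunday, Monday and Tuesday from 2020.
This way you have comparable days.
df1['cnt2020'] = df1['cnt']
df2['cnt2017'] = df2['cnt']
# Drop the first 3 days of 2020 and the last 3 days of 2017, keep only the new column,
# and reset the index so that both series line up on positions 0-27
df1 = df1.iloc[3:, 2].reset_index(drop=True)
df2 = df2.iloc[:-3, 2].reset_index(drop=True)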
Since you don't want to plot the date, but want the months to align, make a new dataframe with both columns and an index column. This way you will have 3 columns: index (0-27), 2017 and 2020. The index will represent the position of each day within the aligned period.
new_df = pd.concat([df1,df2], axis=1)
If you also want to plot the days of the week on the x axis, look up how to get the day of the week from a date, and then change the x tick labels.
Sorry for the written step-by-step; if it all sounds confusing, I can type out the whole code for you later.
