In Impala, I am trying to get all the week start dates and week end dates between 08/01/2021 (Aug 1st 2021) and 12/31/2022 (December 31st 2021).
Can anyone help with this?
Related
How can I use the pandas date_range function to get a frequency of one dekad? A dekad is a 10-day period, and each month has 3 dekads (1st - 10th, 11th - 20th, 21st - 30th).
pd.date_range(start_date, end_date, freq='D')
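As far as I know, pandas has no built-in dekad frequency alias, so one possible workaround is to build a daily range and keep only the days that open a dekad. The sketch below assumes the dekads start on the 1st, 11th and 21st of each month; the helper name `dekad_starts` is made up for illustration.

```python
import pandas as pd

def dekad_starts(start_date, end_date):
    """Return the dates in [start_date, end_date] that open a dekad (1st, 11th, 21st)."""
    days = pd.date_range(start_date, end_date, freq="D")
    # Keep only the days of the month on which a dekad begins.
    return days[days.day.isin([1, 11, 21])]

print(dekad_starts("2021-01-01", "2021-02-28"))
```

With this approach the third dekad simply runs until the next month's 1st, so months with 28, 29 or 31 days need no special handling.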
So I am really new to this and struggling with something, which I feel should be quite simple.
I have a Pandas Dataframe containing two columns: Fiscal Week (str) and Amount sold (int).
   Fiscal Week  Amount sold
0  2019031      24
1  2019041      47
2  2019221      34
3  2019231      46
4  2019241      35
My problem is the fiscal week column. It contains strings which describe the fiscal year and week. The fiscal year for this purpose starts on October 1st and ends on September 30th. So basically, 2019031 is the Monday (the 1 at the end) of the third week of October 2019. And 2019221 would be the 2nd week of March 2020.
The issue is that I want to turn this data into timeseries later. But I can't do that with the data in string format - I need it to be in date time format.
I actually added the 1s at the end of all these strings using
df['Fiscal Week']= df['Fiscal Week'].map('{}1'.format)
so that I can then turn it into a proper date:
df['Fiscal Week'] = pd.to_datetime(df['Fiscal Week'], format="%Y%W%w")
as I couldn't figure out how to do it with just the weeks and no day defined.
This, of course, returns the following:
   Fiscal Week  Amount sold
0  2019-01-21   24
1  2019-01-28   47
2  2019-06-03   34
3  2019-06-10   46
4  2019-06-17   35
As expected, this is clearly not what I need, as according to the definition of the fiscal year week 1 is not January at all but rather October.
Is there some simple solution to get the dates to what they are actually supposed to be?
Ideally I would like the final format to be e.g. 2019-03 for the first entry. So basically exactly like the string but in some kind of date format, that I can then work with later on. Alternatively, calendar weeks would also be fine.
Assuming you have a data frame with fiscal dates of the form 'YYYYWW', where YYYY = the calendar year in which the fiscal year starts and WW = the number of weeks into the fiscal year, you can convert to calendar dates as follows:
import pandas as pd

def getCalendarDate(fy_date: str):
    f_year = fy_date[0:4]  # calendar year in which the fiscal year starts
    f_week = fy_date[4:6]  # two-digit week number; ignores any trailing day digit
    # The fiscal year starts on October 1st
    fys = pd.to_datetime(f'{f_year}/10/01', format='%Y/%m/%d')
    return fys + pd.to_timedelta(int(f_week), "W")
You can then use this function to create the column of calendar dates as follows:
df['Calendar Date'] = df['Fiscal Week'].map(getCalendarDate)
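The same arithmetic can also be applied to the whole column at once. The snippet below is only a sketch under the same assumptions as the answer (strings of the form 'YYYYWW', fiscal year starting on October 1st of year YYYY, and the week number counted as whole weeks added to that date); the sample values are the question's fiscal weeks without the appended trailing 1.

```python
import pandas as pd

df = pd.DataFrame({
    "Fiscal Week": ["201903", "201904", "201922"],
    "Amount sold": [24, 47, 34],
})

# Split the string into the fiscal-year label and the week number.
year = df["Fiscal Week"].str[:4]
week = df["Fiscal Week"].str[4:6].astype(int)

# October 1st of the labelled year, plus the given number of whole weeks.
fiscal_start = pd.to_datetime(year + "-10-01")
df["Calendar Date"] = fiscal_start + pd.to_timedelta(week * 7, unit="D")

print(df)
```

Under these assumptions, 201903 maps to 2019-10-22 and 201922 maps to 2020-03-03. If week 1 is meant to begin on October 1st itself, subtract one from `week` before converting.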
I have a table that is updated manually every week using Excel. I would like to automate this process using python/pandas. I want to update the report week number (this number indicates how many times we have reported on that month so far for a given quarter) based on the financial week and month. Obviously we are now in September, but I will show you the first week to give you an idea of how it's updated. The first week for 2021 would start on 01/04/2021 (first Monday of the year) and the last would be 12/27/2021 (last Monday of the year).
This script is to be run weekly, so the next time it is run, 01/04/2021 --> 01/11/2021: 1 week is added and the "Report Week" should increase by 1 too, unless the report week is greater than 13. If "Report Week" is greater than 13, then we stop updating that month and add the next month. So in this case we drop December and start reporting March, and its Report Week becomes 1, as this is the first time we are reporting on it.
Month     Finance Week  Report Week
December  01/04/2021    13
January   01/04/2021    8
February  01/04/2021    4
January   01/11/2021    9
February  01/11/2021    5
March     01/11/2021    1
When January hits Report Week 13 we will stop updating that month and move on to April, giving it a value of 1 for its Report Week, and so on for every month.
I am not sure what is the best way to go about this. I read here https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iterrows.html that one should not update when iterating a df so I'm not sure what to do.
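One way to avoid iterating at all is to update whole columns and then filter. The following is only a sketch under several assumptions that are not stated in the question: the frame holds just the current week's rows, the rows are ordered by month, at most one month expires per run, and "Finance Week" is already a datetime column. The function name `advance_one_week` and the `MONTHS` list are made up for illustration.

```python
import pandas as pd

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def advance_one_week(report: pd.DataFrame) -> pd.DataFrame:
    """Move every row forward one week; swap an expired month for the next one."""
    out = report.copy()
    out["Finance Week"] = out["Finance Week"] + pd.Timedelta(weeks=1)
    out["Report Week"] = out["Report Week"] + 1

    # Months that have now been reported more than 13 times are dropped.
    expired = out["Report Week"] > 13
    if expired.any():
        # Assumption: the last row holds the latest month being reported.
        last_month = out["Month"].iloc[-1]
        next_month = MONTHS[(MONTHS.index(last_month) + 1) % 12]
        new_row = pd.DataFrame({
            "Month": [next_month],
            "Finance Week": [out["Finance Week"].iloc[0]],
            "Report Week": [1],
        })
        out = pd.concat([out[~expired], new_row], ignore_index=True)
    return out

current = pd.DataFrame({
    "Month": ["December", "January", "February"],
    "Finance Week": pd.to_datetime(["01/04/2021"] * 3, format="%m/%d/%Y"),
    "Report Week": [13, 8, 4],
})
updated = advance_one_week(current)
print(updated)
```

Applied to the 01/04/2021 rows from the table, this yields January 9, February 5 and March 1 for 01/11/2021.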
If I have a column of dates filled like the below:
Date
2021-08-01
2021-08-02
2021-08-03
2021-08-01
2021-08-02
What I wish to do is add a new column that tells me, for each date, which occurrence of its weekday it is in the year (for example, the nth Monday of the year).
So for the first record, the 1st of August was a Sunday and it was the 31st Sunday of the year, whereas the 12th was a Thursday and was the 32nd Thursday of the year.
Date Number Of WeekDay in Year
2021-08-01 31
2021-08-02 31
2021-08-03 31
2021-08-12 32
... ...
If it makes it easier is there a way to do it using the python tool within Alteryx?
The answer by johnjps111 explains it all, but here's an implementation using python only (no alteryx):
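import math
from datetime import datetime

def get_weekday_occurrences(date_string):
    # Day of the year (1-366) for the given 'YYYY-MM-DD' string
    year_day = datetime.strptime(date_string, '%Y-%m-%d').timetuple().tm_yday
    # Every block of 7 days adds one more occurrence of each weekday
    return math.ceil(year_day / 7)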
Which can be used as follows:
get_weekday_occurrences('2021-08-12')
Note we use datetime.timetuple to get the day of the year (tm_yday).
For Alteryx, try the formula Ceil(DateTimeFormat([date],'%j') / 7). Explanation: regardless of the day of the week, if it's the first day of the year, it's also the first "of that weekday" of the year. At day number 8, it becomes the 2nd "of that weekday" of the year, and so on. Since Alteryx gives you the "day of the year" for free using the DateTimeFormat function, it's then a simple division and a Ceil() call.
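The same day-of-year reasoning can be applied to a whole pandas column without a Python-level loop. This is a sketch assuming the column is named "Date" as in the question; `(n + 6) // 7` is integer ceiling division by 7.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-08-01", "2021-08-02", "2021-08-03", "2021-08-12"])
})

# Day of year 1-7 -> 1st occurrence, 8-14 -> 2nd occurrence, and so on.
df["Number Of WeekDay in Year"] = (df["Date"].dt.dayofyear + 6) // 7

print(df)
```

For the four sample dates this gives 31, 31, 31 and 32, matching the expected table in the question.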
To get the week in the year from a date string:
from datetime import datetime
a = '2021-08-02'
b = datetime.fromisoformat(a)
print('week of the year:', b.strftime('%W'))
output:
week of the year: 31
For more information, see the Python documentation for the datetime module.
I have some data with a date column. The command below goes through the DF and counts all the events that happened during each week.
df['date'].groupby(df.date.dt.to_period("W")).agg('count')
The result is something like:
2018-04-16/2018-04-22 40
2018-04-23/2018-04-29 18
The weeks start on Monday and end on Sunday.
I want the week to start on Sunday and end on Saturday. So, the data should be
2018-04-15/2018-04-21 40
2018-04-22/2018-04-28 18
Use:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Date':np.random.choice(pd.date_range('2018-04-10',periods=365, freq='D'),1000)})
df.groupby(df['Date'].dt.to_period('W-SAT')).agg('count')
Output:
                       Date
Date
2018-04-08/2018-04-14    12
2018-04-15/2018-04-21    19
2018-04-22/2018-04-28    21
2018-04-29/2018-05-05    16
2018-05-06/2018-05-12    21
Use an anchored offset. Excerpt from the linked table:
W-SUN weekly frequency (Sundays). Same as ‘W’
W-MON weekly frequency (Mondays)
W-TUE weekly frequency (Tuesdays)
W-WED weekly frequency (Wednesdays)
W-THU weekly frequency (Thursdays)
W-FRI weekly frequency (Fridays)
W-SAT weekly frequency (Saturdays)
Since you want the week to end on Saturday, W-SAT should suffice.
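To see the effect of the anchor on fixed dates rather than random ones, here is a small sketch with three hand-picked dates around the boundary (2018-04-15 is a Sunday); the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2018-04-15", "2018-04-21", "2018-04-22"]))

# Default 'W' is the same as 'W-SUN': weeks run Monday through Sunday.
monday_weeks = s.dt.to_period("W")
# 'W-SAT' anchors the week end on Saturday: weeks run Sunday through Saturday.
sunday_weeks = s.dt.to_period("W-SAT")

counts = s.groupby(sunday_weeks).count()
print(counts)
```

With 'W-SAT', the Sunday 2018-04-15 and the Saturday 2018-04-21 fall into the same week, while 2018-04-22 opens the next one.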