Does anyone know how to generate a calendar list in Python (or some other platform) with the "even days", month and year from 2018 until 2021?
Example:
Sun, 02 Jan 2019
Tue, 04 Jan 2019
Thur, 06 Jan 2019
Sat, 08 Jan 2019
Sun, 10 Jan 2019
Tue, 12 Jan 2019
Thur, 14 Jan 2019
Sat, 16 Jan 2019
Sun, 18 Jan 2019
Tue, 20 Jan 2019
Thur, 22 Jan 2019
and so on, respecting the calendar until 2021.
EDIT:
How can I generate, in Python, a calendar list between 2018 and 2022 in 2 formats:
Day of the week, Date Month Year Time (hours:minutes:seconds) - Year-Month-Date Time (hours:minutes:seconds)
Note:
Dates: even dates only
Times: randomly generated
Example:
Tue, 02 Jan 2018 00:59:23 - 2018-01-02 00:59:23
Thu, 04 Jan 2018 10:24:52 - 2018-01-04 10:24:52
Sat, 06 Jan 2018 04:11:09 - 2018-01-06 04:11:09
Mon, 08 Jan 2018 16:12:40 - 2018-01-08 16:12:40
Wed, 10 Jan 2018 10:08:15 - 2018-01-10 10:08:15
Fri, 12 Jan 2018 07:10:09 - 2018-01-12 07:10:09
Sun, 14 Jan 2018 11:50:10 - 2018-01-14 11:50:10
Tue, 16 Jan 2018 02:29:22 - 2018-01-16 02:29:22
Thu, 18 Jan 2018 19:07:20 - 2018-01-18 19:07:20
Sat, 20 Jan 2018 08:50:13 - 2018-01-20 08:50:13
Mon, 22 Jan 2018 02:40:02 - 2018-01-22 02:40:02
and so on, until the year 2022 ...
Here's something fairly simple that seems to work and handles leap years:
from calendar import isleap
from datetime import date

# Days in each month (1-12).
MDAYS = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def dim(year, month):
    """ Number of days in month of the given year. """
    return MDAYS[month] + ((month == 2) and isleap(year))

start_year, end_year = 2018, 2021
for year in range(start_year, end_year+1):
    for month in range(1, 12+1):
        days = dim(year, month)
        for day in range(1, days+1):
            if day % 2 == 0:
                dt = date(year, month, day)
                print(dt.strftime('%a, %d %b %Y'))
Output:
Tue, 02 Jan 2018
Thu, 04 Jan 2018
Sat, 06 Jan 2018
Mon, 08 Jan 2018
Wed, 10 Jan 2018
Fri, 12 Jan 2018
Sun, 14 Jan 2018
Tue, 16 Jan 2018
...
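As a possible simplification, the standard library's calendar.monthrange already returns the number of days in a month (leap years included), so the hand-written MDAYS table is not strictly needed. A minimal sketch, where the generator name even_dates is made up for illustration:

```python
from calendar import monthrange
from datetime import date

def even_dates(start_year, end_year):
    """Yield every even-numbered day of the month from start_year through end_year."""
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            # monthrange returns (weekday of the 1st, number of days in the month).
            days_in_month = monthrange(year, month)[1]
            # Step through 2, 4, 6, ... so no modulo test is needed.
            for day in range(2, days_in_month + 1, 2):
                yield date(year, month, day)

if __name__ == "__main__":
    for d in even_dates(2018, 2021):
        print(d.strftime('%a, %d %b %Y'))
```

Because it is a generator, the dates can be printed, collected into a list, or combined with times later without changing the loop.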
Edit:
Here's a way to do what (I think) you asked how to do in your follow-on question:
from calendar import isleap
from datetime import date, datetime, time
from random import randrange

# Days in each month (1-12).
MDAYS = [0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def dim(year, month):
    """ Number of days in month of the given year. """
    return MDAYS[month] + ((month == 2) and isleap(year))

def whenever():
    """ Gets the time value. """
    # Currently just returns a randomly selected time of day.
    return time(*map(randrange, (24, 60, 60)))  # hour:minute:second

start_year, end_year = 2018, 2021
for year in range(start_year, end_year+1):
    for month in range(1, 12+1):
        days = dim(year, month)
        for day in range(1, days+1):
            if day % 2 == 0:
                dt, when = date(year, month, day), whenever()
                dttm = datetime.combine(dt, when)
                print(dt.strftime('%a, %d %b %Y'), when, '-', dttm)
Output:
Tue, 02 Jan 2018 00:54:02 - 2018-01-02 00:54:02
Thu, 04 Jan 2018 10:19:51 - 2018-01-04 10:19:51
Sat, 06 Jan 2018 22:48:09 - 2018-01-06 22:48:09
Mon, 08 Jan 2018 06:48:46 - 2018-01-08 06:48:46
Wed, 10 Jan 2018 14:01:54 - 2018-01-10 14:01:54
Fri, 12 Jan 2018 05:42:43 - 2018-01-12 05:42:43
Sun, 14 Jan 2018 21:42:37 - 2018-01-14 21:42:37
Tue, 16 Jan 2018 08:08:39 - 2018-01-16 08:08:39
...
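Both requested formats can also come from a single datetime object via strftime, which avoids printing the date and the time as separate pieces. The sketch below is only an illustration: the helper names random_time and format_both are invented here, and a seeded random.Random is used so that runs are repeatable.

```python
import random
from datetime import date, datetime, time

def random_time(rng):
    """Return a random time of day drawn from the given random.Random instance."""
    return time(rng.randrange(24), rng.randrange(60), rng.randrange(60))

def format_both(dttm):
    """Format a datetime as 'Day, DD Mon YYYY HH:MM:SS - YYYY-MM-DD HH:MM:SS'."""
    return dttm.strftime('%a, %d %b %Y %H:%M:%S - %Y-%m-%d %H:%M:%S')

if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, chosen only for repeatability
    for day in (2, 4, 6):
        print(format_both(datetime.combine(date(2018, 1, day), random_time(rng))))
```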
What about:
import datetime

d = datetime.date.today()  # Define start date
while d.year <= 2021:  # This will go *through* 2021
    if d.day % 2 == 0:  # Print if even date
        print(d.strftime('%a, %d %b %Y'))
    d += datetime.timedelta(days=1)  # Jump forward a day
Output (from a run started on Wed, 31 Oct 2018):
Fri, 02 Nov 2018
Sun, 04 Nov 2018
Tue, 06 Nov 2018
Thu, 08 Nov 2018
Sat, 10 Nov 2018
Mon, 12 Nov 2018
Wed, 14 Nov 2018
Fri, 16 Nov 2018
Sun, 18 Nov 2018
Tue, 20 Nov 2018
Thu, 22 Nov 2018
...
Fri, 24 Dec 2021
Sun, 26 Dec 2021
Tue, 28 Dec 2021
Thu, 30 Dec 2021
I have a dataframe df as below:
Student_id Date_of_visit(d/m/y)
1 1/4/2020
1 30/12/2019
1 26/12/2019
2 3/1/2021
2 10/1/2021
3 4/5/2020
3 22/8/2020
How can I get a bar graph with month-year on the x-axis (e.g. x-ticks: Dec 2019, Jan 2020, Feb 2020) and, on the y-axis, the total number of students (count) who visited in a particular month?
Convert the values to datetimes, then use DataFrame.resample with Resampler.size to get the counts, and build the new label format with DatetimeIndex.strftime:
import pandas as pd
df['Date_of_visit'] = pd.to_datetime(df['Date_of_visit'], dayfirst=True)
# 'M' means month-end frequency; pandas 2.2 and later prefer the alias 'ME'.
s = df.resample('M', on='Date_of_visit')['Student_id'].size()
s.index = s.index.strftime('%b %Y')
print(s)
Date_of_visit
Dec 2019 2
Jan 2020 0
Feb 2020 0
Mar 2020 0
Apr 2020 1
May 2020 1
Jun 2020 0
Jul 2020 0
Aug 2020 1
Sep 2020 0
Oct 2020 0
Nov 2020 0
Dec 2020 0
Jan 2021 2
Name: Student_id, dtype: int64
If you need to count only unique Student_id values, use Resampler.nunique:
s = df.resample('M', on='Date_of_visit')['Student_id'].nunique()
s.index = s.index.strftime('%b %Y')
print(s)
Date_of_visit
Dec 2019 1
Jan 2020 0
Feb 2020 0
Mar 2020 0
Apr 2020 1
May 2020 1
Jun 2020 0
Jul 2020 0
Aug 2020 1
Sep 2020 0
Oct 2020 0
Nov 2020 0
Dec 2020 0
Jan 2021 1
Name: Student_id, dtype: int64
Finally, plot with Series.plot.bar:
s.plot.bar()
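To sanity-check the monthly counts without pandas, the same tally can be sketched with the standard library alone. This is not part of the answer above; the function name visits_per_month is hypothetical, and the dates are the ones from the question.

```python
from collections import Counter
from datetime import datetime

def visits_per_month(date_strings):
    """Count visits per calendar month, including months with no visits."""
    dates = [datetime.strptime(s, '%d/%m/%Y') for s in date_strings]
    counts = Counter((d.year, d.month) for d in dates)
    year, month = min(counts)
    last = max(counts)
    result = []
    while (year, month) <= last:
        label = datetime(year, month, 1).strftime('%b %Y')
        result.append((label, counts[(year, month)]))  # Counter gives 0 for empty months
        # Advance to the next month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return result

visits = ['1/4/2020', '30/12/2019', '26/12/2019', '3/1/2021',
          '10/1/2021', '4/5/2020', '22/8/2020']

if __name__ == "__main__":
    for label, n in visits_per_month(visits):
        print(label, n)
```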
I am running into an issue trying to convert datetime values consistently into years, weeks, and months.
I was able to figure out how to convert a particular date into a Year/Wk/Month combination, but because week and month boundaries overlap, I am getting duplicate combinations that I want to account for. For example:
2019/ Week 31 / Aug: August 1 is still part of week 31 in the calendar, but the extracted month is August
2019/ Week 31 / Jul: July 31 is also part of week 31 in the calendar, but the extracted month is July
My goal is to avoid duplicates and wrongly extracted values. Another example:
2019/ Week 01 / Dec: December 31 is part of week 01 of the new year, but it is tied to calendar year 2019.
This is my code:
req_df is the original dataframe
req_total_grouped for me to group values based on a loc/filter, grouping by datecol which is a date value (ex: 2020-01-01)
import calendar
req_total_grouped = req_df.loc[req_df['datecol'] >= '2019-07-01'].groupby(req_df['datecol'])
req_total_df = req_total_grouped.count()
req_total_df['YEAR'] = req_total_df['datecol'].dt.year
req_total_df['WEEK'] = req_total_df['datecol'].dt.week.map("{:02}".format)
req_total_df['MONTH'] = req_total_df['datecol'].dt.month.apply(lambda x: calendar.month_abbr[x])
req_total_df['YR_WK_MTH'] = req_total_df['YEAR'].astype(str) + \
'/ Week ' + \
req_total_df['WEEK'].astype(str) + \
' / ' \
+ req_total_df['MONTH']
My desired output:
In cases where there are month overlaps, I want a uniform value. It doesn't matter which month I take; the dates just need to be under the same week (e.g. 2019/ Week 31 / Aug and 2019/ Week 31 / Jul should consolidate into the single value '2019/ Week 31 / Aug').
In cases where there are year overlaps, e.g. 2019 / Week 01 / Dec, the value should be 2020 / Week 01 / Jan.
I guess grouping the rows by 'year' and 'week' and keeping the last value of each group gives your desired result. Can you try this?
Data (same as yours?)
import calendar
import pandas as pd
df = pd.DataFrame({'date': pd.date_range('01/01/2019', '12/31/2020', freq='D')})
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month.apply(lambda x: calendar.month_abbr[x])
# .dt.week was removed in pandas 2.0; .dt.isocalendar().week returns the same ISO week number.
df['week'] = df['date'].dt.isocalendar().week.astype(str).str.zfill(2)
df['yr_wk_mth'] = df['year'].astype(str) + ' / Week ' + df['week'] + ' / ' + df['month']
Code:
# Keep the last row of each (year, week) group; this prints the date, month and yr_wk_mth columns.
print(df.groupby(['year', 'week']).last())
Result:
date month yr_wk_mth
year week
2019 01 2019-12-31 Dec 2019 / Week 01 / Dec
02 2019-01-13 Jan 2019 / Week 02 / Jan
03 2019-01-20 Jan 2019 / Week 03 / Jan
04 2019-01-27 Jan 2019 / Week 04 / Jan
05 2019-02-03 Feb 2019 / Week 05 / Feb
06 2019-02-10 Feb 2019 / Week 06 / Feb
07 2019-02-17 Feb 2019 / Week 07 / Feb
08 2019-02-24 Feb 2019 / Week 08 / Feb
09 2019-03-03 Mar 2019 / Week 09 / Mar
10 2019-03-10 Mar 2019 / Week 10 / Mar
11 2019-03-17 Mar 2019 / Week 11 / Mar
12 2019-03-24 Mar 2019 / Week 12 / Mar
13 2019-03-31 Mar 2019 / Week 13 / Mar
14 2019-04-07 Apr 2019 / Week 14 / Apr
15 2019-04-14 Apr 2019 / Week 15 / Apr
16 2019-04-21 Apr 2019 / Week 16 / Apr
17 2019-04-28 Apr 2019 / Week 17 / Apr
18 2019-05-05 May 2019 / Week 18 / May
19 2019-05-12 May 2019 / Week 19 / May
20 2019-05-19 May 2019 / Week 20 / May
21 2019-05-26 May 2019 / Week 21 / May
22 2019-06-02 Jun 2019 / Week 22 / Jun
23 2019-06-09 Jun 2019 / Week 23 / Jun
24 2019-06-16 Jun 2019 / Week 24 / Jun
25 2019-06-23 Jun 2019 / Week 25 / Jun
26 2019-06-30 Jun 2019 / Week 26 / Jun
27 2019-07-07 Jul 2019 / Week 27 / Jul
28 2019-07-14 Jul 2019 / Week 28 / Jul
29 2019-07-21 Jul 2019 / Week 29 / Jul
30 2019-07-28 Jul 2019 / Week 30 / Jul
31 2019-08-04 Aug 2019 / Week 31 / Aug
32 2019-08-11 Aug 2019 / Week 32 / Aug
33 2019-08-18 Aug 2019 / Week 33 / Aug
34 2019-08-25 Aug 2019 / Week 34 / Aug
35 2019-09-01 Sep 2019 / Week 35 / Sep
36 2019-09-08 Sep 2019 / Week 36 / Sep
37 2019-09-15 Sep 2019 / Week 37 / Sep
38 2019-09-22 Sep 2019 / Week 38 / Sep
39 2019-09-29 Sep 2019 / Week 39 / Sep
40 2019-10-06 Oct 2019 / Week 40 / Oct
41 2019-10-13 Oct 2019 / Week 41 / Oct
42 2019-10-20 Oct 2019 / Week 42 / Oct
43 2019-10-27 Oct 2019 / Week 43 / Oct
44 2019-11-03 Nov 2019 / Week 44 / Nov
45 2019-11-10 Nov 2019 / Week 45 / Nov
46 2019-11-17 Nov 2019 / Week 46 / Nov
47 2019-11-24 Nov 2019 / Week 47 / Nov
48 2019-12-01 Dec 2019 / Week 48 / Dec
49 2019-12-08 Dec 2019 / Week 49 / Dec
50 2019-12-15 Dec 2019 / Week 50 / Dec
51 2019-12-22 Dec 2019 / Week 51 / Dec
52 2019-12-29 Dec 2019 / Week 52 / Dec
2020 01 2020-01-05 Jan 2020 / Week 01 / Jan
02 2020-01-12 Jan 2020 / Week 02 / Jan
03 2020-01-19 Jan 2020 / Week 03 / Jan
04 2020-01-26 Jan 2020 / Week 04 / Jan
05 2020-02-02 Feb 2020 / Week 05 / Feb
06 2020-02-09 Feb 2020 / Week 06 / Feb
07 2020-02-16 Feb 2020 / Week 07 / Feb
08 2020-02-23 Feb 2020 / Week 08 / Feb
09 2020-03-01 Mar 2020 / Week 09 / Mar
10 2020-03-08 Mar 2020 / Week 10 / Mar
11 2020-03-15 Mar 2020 / Week 11 / Mar
12 2020-03-22 Mar 2020 / Week 12 / Mar
13 2020-03-29 Mar 2020 / Week 13 / Mar
14 2020-04-05 Apr 2020 / Week 14 / Apr
15 2020-04-12 Apr 2020 / Week 15 / Apr
16 2020-04-19 Apr 2020 / Week 16 / Apr
17 2020-04-26 Apr 2020 / Week 17 / Apr
18 2020-05-03 May 2020 / Week 18 / May
19 2020-05-10 May 2020 / Week 19 / May
20 2020-05-17 May 2020 / Week 20 / May
21 2020-05-24 May 2020 / Week 21 / May
22 2020-05-31 May 2020 / Week 22 / May
23 2020-06-07 Jun 2020 / Week 23 / Jun
24 2020-06-14 Jun 2020 / Week 24 / Jun
25 2020-06-21 Jun 2020 / Week 25 / Jun
26 2020-06-28 Jun 2020 / Week 26 / Jun
27 2020-07-05 Jul 2020 / Week 27 / Jul
28 2020-07-12 Jul 2020 / Week 28 / Jul
29 2020-07-19 Jul 2020 / Week 29 / Jul
30 2020-07-26 Jul 2020 / Week 30 / Jul
31 2020-08-02 Aug 2020 / Week 31 / Aug
32 2020-08-09 Aug 2020 / Week 32 / Aug
33 2020-08-16 Aug 2020 / Week 33 / Aug
34 2020-08-23 Aug 2020 / Week 34 / Aug
35 2020-08-30 Aug 2020 / Week 35 / Aug
36 2020-09-06 Sep 2020 / Week 36 / Sep
37 2020-09-13 Sep 2020 / Week 37 / Sep
38 2020-09-20 Sep 2020 / Week 38 / Sep
39 2020-09-27 Sep 2020 / Week 39 / Sep
40 2020-10-04 Oct 2020 / Week 40 / Oct
41 2020-10-11 Oct 2020 / Week 41 / Oct
42 2020-10-18 Oct 2020 / Week 42 / Oct
43 2020-10-25 Oct 2020 / Week 43 / Oct
44 2020-11-01 Nov 2020 / Week 44 / Nov
45 2020-11-08 Nov 2020 / Week 45 / Nov
46 2020-11-15 Nov 2020 / Week 46 / Nov
47 2020-11-22 Nov 2020 / Week 47 / Nov
48 2020-11-29 Nov 2020 / Week 48 / Nov
49 2020-12-06 Dec 2020 / Week 49 / Dec
50 2020-12-13 Dec 2020 / Week 50 / Dec
51 2020-12-20 Dec 2020 / Week 51 / Dec
52 2020-12-27 Dec 2020 / Week 52 / Dec
53 2020-12-31 Dec 2020 / Week 53 / Dec
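The result above still contains 2019 / Week 01 / Dec, so the year overlap from the question is not resolved by grouping alone. One possible way to get a single, consistent label per week is to use the ISO year together with the ISO week, and to take the month from the Thursday of that week, since under ISO 8601 a week belongs to the year that contains its Thursday. A minimal sketch using only the standard library; the function name year_week_month is an assumption for illustration:

```python
import calendar
from datetime import date, timedelta

def year_week_month(d):
    """Return an 'ISO year / Week NN / Mon' label shared by all days of one ISO week."""
    iso_year, iso_week, iso_weekday = d.isocalendar()  # Monday is 1, Sunday is 7
    # Move to the Thursday (ISO weekday 4) of the same week.
    thursday = d + timedelta(days=4 - iso_weekday)
    return '{} / Week {:02} / {}'.format(
        iso_year, iso_week, calendar.month_abbr[thursday.month])

if __name__ == "__main__":
    for d in (date(2019, 7, 31), date(2019, 8, 1), date(2019, 12, 31)):
        print(d, '->', year_week_month(d))
```

With pandas, the function could probably be applied to a datetime column with something like df['date'].map(year_week_month), since pandas Timestamp objects subclass datetime.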
I have this dataframe:
date value
1 Thu 17th Nov 2016 385.943800
2 Fri 18th Nov 2016 1074.160340
3 Sat 19th Nov 2016 2980.857860
4 Sun 20th Nov 2016 1919.723960
5 Mon 21st Nov 2016 884.279340
6 Tue 22nd Nov 2016 869.071070
7 Wed 23rd Nov 2016 760.289260
8 Thu 24th Nov 2016 2481.689270
9 Fri 25th Nov 2016 2745.990070
10 Sat 26th Nov 2016 2273.413250
11 Sun 27th Nov 2016 2630.414900
12 Mon 28th Nov 2016 817.322310
13 Tue 29th Nov 2016 1766.876030
14 Wed 30th Nov 2016 469.388420
I would like to change the format of the date column to this format YYYY-MM-DD. The dataframe consists of more than 200 rows, and every day new rows will be added, so I need to find a way to do this automatically.
This link is not helping because it sets the dates like this: dates = ['30th November 2009', '31st March 2010', '30th September 2010'], and I can't do that for every row. Does anyone know a way to solve this?
dateutil will do this job:
from dateutil import parser
print(df)
df2 = df.copy()
# parser.parse copes with ordinal suffixes such as "17th"; the result is a datetime column.
df2['date'] = df2['date'].apply(parser.parse)
print(df2)
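If dateutil is not available, a possible alternative is to strip the ordinal suffix (st, nd, rd, th) with a regular expression and parse the rest with datetime.strptime. A minimal sketch, assuming every value looks like 'Thu 17th Nov 2016'; the helper name to_iso is made up here:

```python
import re
from datetime import datetime

# Matches a day number followed by an English ordinal suffix, e.g. "17th" or "21st".
ORDINAL = re.compile(r'(\d+)(st|nd|rd|th)\b')

def to_iso(text):
    """Convert e.g. 'Thu 17th Nov 2016' to '2016-11-17'."""
    cleaned = ORDINAL.sub(r'\1', text)  # '17th' becomes '17'
    return datetime.strptime(cleaned, '%a %d %b %Y').strftime('%Y-%m-%d')
```

On a dataframe this could be applied with something like df['date'] = df['date'].apply(to_iso), which also covers rows added later as long as they keep the same format.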
I want to write a program where I can compare the current date with a couple of dates that I have.
My data is:
12 JUN 2016
21 MAR 1989
15 MAR 1958
15 SEP 1958
23 OCT 1930
15 SEP 1928
10 MAR 2010
23 JAN 1928
15 NOV 1925
26 AUG 2009
29 APR 1987
20 JUL 1962
10 MAY 1960
13 FEB 1955
10 MAR 1956
3 MAR 2010
14 NOV 1958
4 AUG 1985
24 AUG 1956
15 FEB 1955
19 MAY 1987
30 APR 1990
8 SEP 2014
18 JAN 2012
14 DEC 1960
1 AUG 1998
7 SEP 1963
9 MAR 2012
1 MAY 1990
14 MAY 1985
15 JUN 1945
5 APR 1995
26 FEB 1987
13 DEC 1983
15 AUG 2009
16 SEP 1980
16 JAN 2005
19 JUN 2011
Now how can I compare these to the current date, to check that a date does not exceed the current date (i.e. 13/JUN/2016)?
Please help me! Thank you.
You have to create a datetime object using the string data. You can create the object by parsing the date string using strptime method.
from datetime import datetime
mydate = datetime.strptime("19 JUN 2011", "%d %b %Y")
And then use the object to compare it with today's date.
print(mydate < datetime.today())
True
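Applied to a whole list, the same comparison might look like the sketch below. The function name not_after and its today parameter are illustrative; passing the reference date explicitly just makes the result reproducible. The entry '14 JUN 2016' is an invented future date, not part of the question's data.

```python
from datetime import date, datetime

def not_after(date_strings, today=None):
    """Return the strings whose date does not exceed `today` (default: the current date)."""
    if today is None:
        today = date.today()
    kept = []
    for text in date_strings:
        # %b matches month abbreviations case-insensitively, so 'JUN' parses fine.
        parsed = datetime.strptime(text, '%d %b %Y').date()
        if parsed <= today:
            kept.append(text)
    return kept

if __name__ == "__main__":
    sample = ['12 JUN 2016', '21 MAR 1989', '14 JUN 2016']  # last entry is hypothetical
    print(not_after(sample, today=date(2016, 6, 13)))
```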
I have a list of timestamps and I want to calculate the mean of the list, but I need to ignore the weekend days which are Saturday and Sunday and consider Friday and Monday as one day. I only want to include the working days from Monday to Friday. This is an example of the list. I wrote the timestamps in readable format to follow the process easily.
Example:
['Wed Feb 17 12:57:40 2011', ' Wed Feb 8 12:57:40 2011', 'Tue Jan 25 17:15:35 2011']
MIN='Tue Jan 25 17:15:35 2011'
'Wed Feb 17 12:57:40 2011': since there are 6 weekend days between this timestamp and the MIN, I shift it back 6 days. It becomes 'Fri Feb 11 12:57:40 2011'.
'Wed Feb 8 12:57:40 2011': since there are 4 weekend days between this timestamp and the MIN, I shift it back 4 days. It becomes 'Wed Feb 4 12:57:40 2011'.
The new list is now ['Fri Feb 11 12:57:40 2011', 'Wed Feb 4 12:57:40 2011', 'Tue Jan 25 17:15:35 2011']
MAX= 'Fri Feb 11 12:57:40 2011'
average= (Fri Feb 11 12:57:40 2011 + Wed Feb 4 12:57:40 2011 + Tue Jan 25 17:15:35 2011) /3
difference= MAX - average
Edit: [Removed previous code, which had an error; replaced with code below.]
Here is some output from code that squeezes out weekends, computes average, and puts weekends back in to get an apparently valid average. The code is shown after the output from some test cases.
['Fri Jan 13 12:00:00 2012', 'Mon Jan 16 11:00:00 2012']
Average = Fri Jan 13 23:30:00 2012
['Fri Jan 13 12:00:00 2012', 'Mon Jan 16 13:00:00 2012']
Average = Mon Jan 16 00:30:00 2012
['Fri Jan 13 14:17:58 2012', 'Sat Jan 14 1:2:3 2012', 'Sun Jan 15 4:5:6 2012', 'Mon Jan 16 11:03:29 2012', 'Wed Jan 18 14:27:17 2012', 'Mon Jan 23 10:02:12 2012', 'Mon Jan 30 10:02:12 2012']
Average = Thu Jan 19 16:46:37 2012
['Fri Jan 14 14:17:58 2011', 'Mon Jan 17 11:03:29 2011', 'Wed Jan 19 14:27:17 2011', 'Mon Jan 24 10:02:12 2011']
Average = Wed Jan 19 00:27:44 2011
Python code:
from time import strptime, mktime, localtime, asctime
from math import floor

def averageBusinessDay(dates):
    f = [mktime(strptime(x)) for x in dates]
    h = [x for x in f if localtime(x).tm_wday < 5]  # Get rid of weekend days
    bweek, cweek, dweek = 3600*24*5, 3600*24*7, 3600*24*2
    e = localtime(h[0])  # Get struct_time for first item
    # fm is first Monday in local time
    fm = mktime((e.tm_year, e.tm_mon, e.tm_mday-e.tm_wday, 0,0,0,0,0,0))
    i = [x-fm for x in h]  # Subtract leading Monday
    j = [x-floor(x/cweek)*dweek for x in i]  # Squeeze out weekends
    avx = sum(j)/len(j)
    avt = asctime(localtime(avx+floor(avx/bweek)*dweek+fm))
    return avt

def atest(dates):
    print(dates)
    print('Average =', averageBusinessDay(dates))

atest(['Fri Jan 13 12:00:00 2012', 'Mon Jan 16 11:00:00 2012'])
atest(['Fri Jan 13 12:00:00 2012', 'Mon Jan 16 13:00:00 2012'])
atest(['Fri Jan 13 14:17:58 2012', 'Sat Jan 14 1:2:3 2012', 'Sun Jan 15 4:5:6 2012', 'Mon Jan 16 11:03:29 2012', 'Wed Jan 18 14:27:17 2012', 'Mon Jan 23 10:02:12 2012', 'Mon Jan 30 10:02:12 2012'])
atest(['Fri Jan 14 14:17:58 2011', 'Mon Jan 17 11:03:29 2011', 'Wed Jan 19 14:27:17 2011', 'Mon Jan 24 10:02:12 2011'])
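The same squeeze-and-restore idea can also be sketched with datetime and timedelta, which keeps the arithmetic independent of the local time zone and daylight-saving changes that mktime and localtime are subject to. This is a variation for illustration, not the answer's original code; the name average_business_time and the constants are assumptions, and it anchors on the earliest timestamp rather than the first list item.

```python
from datetime import datetime, time, timedelta

FMT = '%a %b %d %H:%M:%S %Y'
WEEK, BUSINESS_WEEK, WEEKEND = timedelta(days=7), timedelta(days=5), timedelta(days=2)

def average_business_time(stamps):
    """Average timestamps as if weekends did not exist; returns a datetime."""
    times = [datetime.strptime(s.strip(), FMT) for s in stamps]
    times = [t for t in times if t.weekday() < 5]  # drop Saturday/Sunday entries
    first = min(times)
    # Midnight on the Monday of the earliest timestamp's week.
    monday = datetime.combine(first.date() - timedelta(days=first.weekday()), time.min)
    # Remove two days for every full calendar week elapsed since that Monday.
    squeezed = [(t - monday) - ((t - monday) // WEEK) * WEEKEND for t in times]
    mean = sum(squeezed, timedelta()) / len(squeezed)
    # Put the weekends back: two days per full five-day business week.
    return monday + mean + (mean // BUSINESS_WEEK) * WEEKEND

if __name__ == "__main__":
    print(average_business_time(['Fri Jan 13 12:00:00 2012', 'Mon Jan 16 11:00:00 2012']))
```

The sketch assumes at least one weekday timestamp remains after filtering; an empty list would make min raise a ValueError.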
Split each string on ' ' and take the first element; if it's not Sat or Sun, it's a weekday. Now I need to know what you mean by the "mean" of a list of dates.