I have a loop in Python that adds a bunch of dates to a list, but I can't for the life of me figure out how to remove Sundays from that list. I know how to count business days and weekends separately, but how do I eliminate JUST Sundays?
Here is my formula:
from datetime import datetime, timedelta
days = 100
i = 1
daterange = []
while i < days:
    yesterday = datetime.now() - timedelta(days=i)
    daterange.append(yesterday.strftime('%m%d%y'))
    i += 1
print(daterange)
Any help on this stubborn issue is appreciated :) Thanks
Use datetime.weekday() to exclude Sundays.
from datetime import datetime, timedelta
days = 100
daterange = []
for i in range(1, days):
    yesterday = datetime.now() - timedelta(days=i)
    if yesterday.weekday() != 6:  # weekday() returns 6 for Sunday
        daterange.append(yesterday.strftime('%m%d%y'))
print(*daterange, sep='\n')
Also, I'd rather use a for loop instead of a while loop here.
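If you need this in more than one place, the same weekday() check can be wrapped in a small helper. This is only a sketch: the name previous_days and its parameters are made up for illustration, and it works on date objects with a fixed start date so the result is reproducible.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def previous_days(start, count, skip_weekdays=(SUNDAY,)):
    """Return the `count` calendar days before `start`, newest first,
    leaving out any day whose weekday() is in `skip_weekdays`."""
    candidates = (start - timedelta(days=i) for i in range(1, count + 1))
    return [d for d in candidates if d.weekday() not in skip_weekdays]


# A fixed start date keeps the result reproducible; 2024-01-10 is a Wednesday.
recent = previous_days(date(2024, 1, 10), 7)
print([d.strftime('%m%d%y') for d in recent])
```

Passing skip_weekdays=(5, 6) would drop Saturdays as well.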
Related
start = "Nov20"
end = "Jan21"
# Expected output:
["Nov20", "Dec20", "Jan21"]
What I've tried so far is the following, but I am looking for a more elegant way.
from calendar import month_abbr
from time import strptime
def get_range(a, b):
    start = strptime(a[:3], '%b').tm_mon
    end = strptime(b[:3], '%b').tm_mon
    dates = []
    for m in month_abbr[start:]:
        dates.append(m+a[-2:])
    for mm in month_abbr[1:end + 1]:
        dates.append(mm+b[-2:])
    print(dates)
get_range('Nov20', 'Jan21')
Note: I don't want to use pandas, since it makes little sense to import such a library just to generate dates.
The date range may span different years so one way is to loop from the start date to end date and increment the month by 1 until end date is reached.
Try this:
from datetime import datetime
def get_range(a, b):
    start = datetime.strptime(a, '%b%y')
    end = datetime.strptime(b, '%b%y')
    dates = []
    while start <= end:
        dates.append(start.strftime('%b%y'))
        if start.month == 12:
            start = start.replace(month=1, year=start.year+1)
        else:
            start = start.replace(month=start.month+1)
    return dates
dates = get_range("Nov20", "Jan21")
print(dates)
Output:
['Nov20', 'Dec20', 'Jan21']
You can use timedelta to step one month (31 days) forward, but make sure you stay on the 1st of the month, otherwise the days might add up and eventually skip a month.
from datetime import datetime
from datetime import timedelta
def get_range(a, b):
    start = datetime.strptime(a, '%b%y')
    end = datetime.strptime(b, '%b%y')
    dates = []
    while start <= end:
        dates.append(start.strftime('%b%y'))
        start = (start + timedelta(days=31)).replace(day=1)  # go to 1st of next month
    return dates
dates = get_range("Jan20", "Jan21")
print(dates)
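Another option, sketched here rather than taken from the answers above, is to number the months consecutively (year * 12 + month) so that crossing a year boundary is plain integer arithmetic. The function name month_range is just for illustration, and the month abbreviations assume the default English locale.

```python
from datetime import datetime


def month_range(first, last, fmt='%b%y'):
    """List every month from `first` to `last` inclusive, formatted with `fmt`."""
    start = datetime.strptime(first, fmt)
    end = datetime.strptime(last, fmt)
    # Number the months consecutively so that a year boundary is just +1.
    start_index = start.year * 12 + (start.month - 1)
    end_index = end.year * 12 + (end.month - 1)
    labels = []
    for index in range(start_index, end_index + 1):
        year, month0 = divmod(index, 12)
        labels.append(datetime(year, month0 + 1, 1).strftime(fmt))
    return labels


print(month_range('Nov20', 'Jan21'))
```

If the end month is before the start month, the range is empty and the function returns an empty list.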
My below working code calculates date/month ranges, but I am using the Pandas library, which I want to get rid of.
import pandas as pd
from pandas.tseries.offsets import MonthEnd
dates=pd.date_range("2019-11","2020-02",freq='MS').strftime("%Y%m%d").tolist()
#print dates : ['20191101','20191201','20200101','20200201']
df=(pd.to_datetime(dates,format="%Y%m%d") + MonthEnd(1)).strftime("%Y%m%d").tolist()
#print df : ['20191130','20191231','20200131','20200229']
How can I rewrite this code without using Pandas?
I don't want to use Pandas library as I am triggering my job through Oozie and we don't have Pandas installed on all our nodes.
Pandas offers some nice functionalities when using datetimes which the standard library datetime module does not have (like the frequency or the MonthEnd). You have to reproduce these yourself.
import datetime as DT
def next_first_of_the_month(dt):
    """return a new datetime where the month has been increased by 1 and
    the day is always the first
    """
    new_month = dt.month + 1
    if new_month == 13:
        new_year = dt.year + 1
        new_month = 1
    else:
        new_year = dt.year
    return DT.datetime(new_year, new_month, day=1)
start, stop = [DT.datetime.strptime(dd, "%Y-%m") for dd in ("2019-11", "2020-02")]
dates = [start]
cd = next_first_of_the_month(start)
while cd <= stop:
    dates.append(cd)
    cd = next_first_of_the_month(cd)
str_dates = [d.strftime("%Y%m%d") for d in dates]
print(str_dates)
# prints: ['20191101', '20191201', '20200101', '20200201']
end_dates = [next_first_of_the_month(d) - DT.timedelta(days=1) for d in dates]
str_end_dates = [d.strftime("%Y%m%d") for d in end_dates]
print(str_end_dates)
# prints ['20191130', '20191231', '20200131', '20200229']
Here I used a function that returns a datetime for the first day of the month following the input datetime. Sadly, timedelta does not work with months, and adding 30 days is of course not feasible (not all months have 30 days).
Then a while loop collects the sequence of first days of the month up to the stop date.
To get the end of each month, take the first day of the following month for each datetime in the list and subtract one day.
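As an alternative for the month-end part, the standard library's calendar.monthrange reports how many days a month has, so the last day can be built directly. A minimal sketch; the helper name month_bounds is hypothetical.

```python
import calendar
from datetime import date


def month_bounds(year, month):
    """Return (first_day, last_day) of the given month as date objects."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    _, days_in_month = calendar.monthrange(year, month)
    return date(year, month, 1), date(year, month, days_in_month)


first, last = month_bounds(2020, 2)
print(first.strftime('%Y%m%d'), last.strftime('%Y%m%d'))
```

Leap years are handled by monthrange itself, so February 2020 ends on the 29th.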
Let's say I have 11 sessions to complete. I haven't set dates for these sessions, only the weekdays on which a session takes place. Let's say that when scheduling these sessions, I chose MON, TUE and WED. This means that, starting after today, I want the dates for my 11 sessions, which would be 4 Mondays, 4 Tuesdays and 3 Wednesdays from now, after which my sessions will be completed.
I want to automatically get the dates for these days until there are 11 dates in total.
I really hope this makes sense... Please help me. I've been scratching my head over this for 3 hours straight.
Thanks,
You can use pd.date_range and the CustomBusinessDay offset to do this very easily.
CustomBusinessDay lets you specify which weekdays count as your "business days", and you create your date range from it:
import pandas as pd
from datetime import date
session_days = pd.offsets.CustomBusinessDay(weekmask="Mon Tue Wed")
dates = pd.date_range(date.today(), freq=session_days, periods=11)
I figured it out a while ago but my internet died. All it took was Dunhill and some rest.
import datetime
def get_dates():
    #This is the max number of dates you want. In my case, sessions.
    required_sessions = 11
    #These are the weekdays you want these sessions to be
    #(weekday() numbering: Monday=0 ... Sunday=6, so this is Mon, Tue, Wed)
    days = [0,1,2]
    #An empty list to store the dates you get
    dates = []
    #Initialize a variable for the while loop
    current_sessions = 0
    #I will start counting from today but you can choose any date
    now = datetime.datetime.now()
    #For my use case, I don't want a session on the same day I run this function.
    #I will start counting from the next day
    if now.weekday() in days:
        now = now + datetime.timedelta(days=1)
    while current_sessions != required_sessions:
        #Iterate over every day in your desired days
        for day in days:
            #Just a precautionary measure so the for loop breaks as soon as you have the max number of dates
            #Or the while loop will run for ever
            if current_sessions == required_sessions:
                break
            #If it's Sunday (weekday() == 6), you wanna hop onto the next week
            if now.weekday() == 6:
                #Check if Monday (0) is in the days, add it
                if 0 in days:
                    date = now + datetime.timedelta(days=1)
                    dates.append(date)
                    current_sessions += 1
                    now = date
            else:
                #Explains itself.
                if now.weekday() == day:
                    dates.append(now)
                    now = now + datetime.timedelta(days=1)
                    current_sessions += 1
                #If the weekday today is greater than the day you're iterating over, this means you've iterated over all the days in a NUMERIC ORDER
                #NOTE: This only works if the days in your "days" list are in a correct numeric order meaning 0 - 6. If it's random, you'll have trouble
                elif not now.weekday() > day:
                    difference = day - now.weekday()
                    date = now + datetime.timedelta(days=difference)
                    dates.append(date)
                    now = date
                    current_sessions += 1
        #Reset the cycle after the for loop is done so you can hop on to the next week.
        reset_cycle_days = 6 - now.weekday()
        if reset_cycle_days == 0:
            original_now = now + datetime.timedelta(days=1)
            now = original_now
        else:
            original_now = now + datetime.timedelta(days=reset_cycle_days)
            now = original_now
    for date in dates:
        print(date.strftime("%d/%m/%y"), date.weekday())
get_dates()
Btw, I know this answer is pointless compared to @Daniel Geffen's answer. If I were you, I would definitely choose his answer as it is very simple. This was just my contribution to my own question in case anyone wants to jump into the "technicalities" of how it's done using only datetime. For me, this works best as I'm having issues with _bz2 in Python 3.7.
Thank you all for your help.
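For anyone who wants the datetime-only route in fewer lines, here is a sketch that simply walks forward one day at a time and keeps the days that match. The generator name upcoming_dates is made up for illustration, the weekday set is assumed to be non-empty (otherwise the generator never yields), and a fixed start date is used so the result is reproducible.

```python
from datetime import date, timedelta
from itertools import islice


def upcoming_dates(start, weekdays):
    """Yield every date after `start` whose weekday() is in `weekdays`."""
    current = start
    while True:
        current += timedelta(days=1)
        if current.weekday() in weekdays:
            yield current


# Monday=0, Tuesday=1, Wednesday=2 in date.weekday() numbering.
# 2024-01-04 is a Thursday, used instead of date.today() for reproducibility.
sessions = list(islice(upcoming_dates(date(2024, 1, 4), {0, 1, 2}), 11))
for d in sessions:
    print(d.strftime('%d/%m/%y'), d.weekday())
```

Because the generator starts at the day after start, a session never lands on the day you run it, and the order of the weekdays in the set does not matter.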
I have a dataframe (df) with a start_date column and an add_days column (=10). I want to create a target_date column (= start_date + add_days) that excludes weekends and holidays (the holidays are in a separate dataframe).
I did some research and tried this:
import datetime
import pandas as pd
df["start_date"] = pd.to_datetime(df["start_date"])
Holidays['Date_holi'] = pd.to_datetime(Holidays['Date_holi'])
def date_by_adding_business_days(from_date, add_days, holidays):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5: # saturday = 5, sunday = 6
            continue
        if current_date in holidays:
            continue
        business_days_to_add -= 1
    return current_date
#demo:
df["Target_date"]=date_by_adding_business_days(df["start_date"], 10, Holidays['Date_holi'])
But I get this error:
AttributeError: 'Series' object has no attribute 'weekday'
Thank you for your help.
The comments by ALollz are very valid; customizing your date during creation to only keep what is defined as business day for your problem would be optimal.
However, I assume that you cannot define the business day beforehand and that you need to solve the problem with the data frame constructed as is.
Here is one possible solution:
import pandas as pd
import numpy as np
from datetime import timedelta
# Goal is to offset a start date by N business days (weekday + not a holiday)
# Here we fake the dataset as it was not provided
num_row = 1000
df = pd.DataFrame()
df['start_date'] = pd.date_range(start='1/1/1979', periods=num_row, freq='D')
df['add_days'] = pd.Series([10]*num_row)
# Define what is a week day
week_day = [0,1,2,3,4] # Monday to Friday
# Define what is a holiday with month and day without year (you can add more)
holidays = ['10-30','12-24']
def add_days_to_business_day(df, week_day, holidays, increment=10):
    '''
    modify the dataframe to increment only the days that are part of a weekday
    and not part of a pre-defined holiday
    >>> add_days_to_business_day(df, [0,1,2,3,4], ['10-31','12-31'])
    this will increment by 10 the days from Monday to Friday excluding Halloween and new year-eve
    '''
    # Increment everything that is in a business day
    df.loc[df['start_date'].dt.dayofweek.isin(week_day),'target_date'] = df['start_date'] + timedelta(days=increment)
    # Remove every increment done on a holiday
    df.loc[df['start_date'].dt.strftime('%m-%d').isin(holidays), 'target_date'] = np.datetime64('NaT')
add_days_to_business_day(df, week_day, holidays)
df
Note: I'm not using the 'add_days' column since it's just a repeated value. Instead, my function takes an increment parameter that offsets by N days (with a default of N = 10).
Hope it helps!
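As for the AttributeError in the question: it appears because the function written for a single date is handed a whole Series. One way around it, sketched here with plain date objects and made-up sample data, is to keep the function scalar and call it once per value. With a DataFrame the same function could presumably be passed to Series.apply, as long as the holidays are stored in a set of the same type as the column values.

```python
from datetime import date, timedelta


def add_business_days(start, days_to_add, holidays=frozenset()):
    """Step forward from `start` until `days_to_add` business days have passed.

    Saturdays, Sundays and anything in `holidays` are not counted.
    """
    current = start
    remaining = days_to_add
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() >= 5 or current in holidays:
            continue  # weekend or holiday: move on without counting it
        remaining -= 1
    return current


# Hypothetical sample data, standing in for the start_date and holiday columns.
holidays = {date(2024, 1, 1)}
start_dates = [date(2023, 12, 22), date(2024, 1, 2)]
targets = [add_business_days(d, 10, holidays) for d in start_dates]
print(targets)
```

Unlike the vectorised version above, this counts business days one at a time, so it is slower on large frames but follows the rule from the question exactly.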
Python noob here
from datetime import datetime, time
now = datetime.now()
now_time = now.time()
if now_time >= time(10,30) and now_time <= time(13,30):
    print("yes, within the interval")
I would like the check to cover the interval between 10:30 AM today and 10:00 AM the next day. Changing time(13,30) to time(10,00) will not work, because I need to tell Python that 10:00 is on the next day. I probably need to use datetime objects but don't know how. Any tips or examples appreciated.
The combine method on the datetime class will help you a lot, as will the timedelta class. Here's how you would use them:
from datetime import datetime, timedelta, date, time
today = date.today()
tomorrow = today + timedelta(days=1)
interval_start = datetime.combine(today, time(10, 30))
interval_end = datetime.combine(tomorrow, time(10, 0))
time_to_check = datetime.now() # Or any other datetime
if interval_start <= time_to_check <= interval_end:
    print("Within the interval")
Notice how I did the comparison. Python lets you "nest" comparisons like that, which is usually more succinct than writing if start <= x and x <= end.
P.S. Read https://docs.python.org/2/library/datetime.html for more details about these classes.
Consider this:
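from datetime import datetime, timedelta
now = datetime.now()
# Zero the seconds and microseconds so the bounds are exactly 10:30 and 10:00
today_10 = now.replace(hour=10, minute=30, second=0, microsecond=0)
tomorrow_10 = (now + timedelta(days=1)).replace(hour=10, minute=0, second=0, microsecond=0)
if today_10 <= now <= tomorrow_10:
    print("yes, within the interval")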
The logic is to create 3 datetime objects: one for today at 10:30 AM, one for right now and one for tomorrow at 10 AM. Then simply check the condition.
An alternative to creating time objects for the sake of comparison is to simply query the hour and minute attributes:
from datetime import datetime
now = datetime.now().time()
# True at any time of day except between 10:00 and 10:30
if now.hour < 10 or now.hour > 10 or (now.hour == 10 and now.minute >= 30):
    print('hooray')
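If the window should repeat every day, so that 9 AM still counts as inside the window that opened at 10:30 the day before, comparing plain time objects with an "or" for the wrap past midnight is enough. This is a sketch under that assumption; the function name in_daily_window is hypothetical.

```python
from datetime import datetime, time


def in_daily_window(moment, start, end):
    """Return True if the time `moment` lies in the window from `start` to `end`.

    When `start` is later than `end`, the window is taken to wrap past midnight.
    """
    if start <= end:
        return start <= moment <= end
    return moment >= start or moment <= end


window_start, window_end = time(10, 30), time(10, 0)
print(in_daily_window(datetime.now().time(), window_start, window_end))
```

The same function also covers the original same-day case, for example time(10, 30) to time(13, 30).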