Python - Find Dates in between Dates

I have an array of dates. They are the available dates for a schedule; the unavailable dates are not included. I'd like to find the unavailable dates. What would be my best bet to accomplish this?
Note: I'll be converting them to Unix time afterwards, so if the solution involves converting them to Unix time first, that's fine. Any help is appreciated.
['2013-10-22', '2013-10-23', '2013-10-24', '2013-10-25', '2013-10-26', '2013-10-27', '2013-10-28', '2013-10-29', '2013-10-30', '2013-10-31', '2013-11-01', '2013-11-01', '2013-11-02', '2013-11-03', '2013-11-04', '2013-11-05', '2013-11-06', '2013-11-07', '2013-11-08', '2013-11-09', '2013-11-10', '2013-11-11', '2013-11-12', '2013-11-13', '2013-11-14', '2013-11-15', '2013-11-16', '2013-11-17', '2013-11-18', '2013-11-19', '2013-11-20', '2013-11-21', '2013-11-22', '2013-11-23', '2013-11-24', '2013-11-25', '2013-11-26', '2013-11-27', '2013-11-28', '2013-11-29', '2013-11-30', '2013-12-01', '2013-12-02', '2013-12-03', '2013-12-04', '2013-12-05', '2013-12-06', '2013-12-07', '2013-12-08', '2013-12-09', '2013-12-10', '2013-12-11', '2013-12-12', '2013-12-13', '2013-12-14', '2013-12-15', '2013-12-16', '2013-12-17', '2013-12-18', '2013-12-19', '2013-12-20', '2013-12-21', '2013-12-22', '2013-12-23', '2013-12-24', '2013-12-24', '2013-12-25', '2013-12-26', '2013-12-26', '2013-12-27', '2013-12-28', '2013-12-29', '2013-12-30', '2013-12-31', '2014-01-01', '2014-01-02', '2014-01-03', '2014-01-04', '2014-01-04', '2014-01-05', '2014-01-06', '2014-01-07', '2014-01-07', '2014-01-08', '2014-01-09', '2014-01-10', '2014-01-11', '2014-01-12', '2014-01-13', '2014-01-14', '2014-01-15', '2014-01-16', '2014-01-17', '2014-01-18', '2014-01-19', '2014-01-20', '2014-01-21', '2014-01-22', '2014-01-23', '2014-01-24', '2014-01-25', '2014-01-26', '2014-01-27', '2014-01-28', '2014-01-29', '2014-01-30', '2014-01-31', '2014-02-01', '2014-02-02', '2014-02-03', '2014-02-04', '2014-02-05', '2014-02-06', '2014-02-07', '2014-02-08', '2014-02-09', '2014-02-10', '2014-02-11', '2014-02-12', '2014-02-13', '2014-02-14', '2014-02-15', '2014-02-16', '2014-02-17', '2014-02-18', '2014-02-19', '2014-02-20', '2014-02-21', '2014-02-22', '2014-02-23', '2014-02-24', '2014-02-25', '2014-02-26', '2014-02-27', '2014-02-28', '2014-03-01', '2014-03-01', '2014-03-02', '2014-03-03', '2014-03-04', '2014-03-05', '2014-03-06', 
'2014-03-07', '2014-03-08', '2014-03-09', '2014-03-10', '2014-03-11', '2014-03-12', '2014-03-13', '2014-03-14', '2014-03-15', '2014-03-16', '2014-03-17', '2014-03-18', '2014-03-19', '2014-03-20', '2014-03-21', '2014-03-22', '2014-03-23', '2014-03-24', '2014-03-25', '2014-03-26', '2014-03-27', '2014-03-28', '2014-03-29', '2014-03-30', '2014-03-31', '2014-04-01', '2014-04-01', '2014-04-02', '2014-04-03', '2014-04-04', '2014-04-05', '2014-04-06', '2014-04-07', '2014-04-08', '2014-04-09', '2014-04-10', '2014-04-11', '2014-04-12', '2014-04-13', '2014-04-13', '2014-04-14', '2014-04-15', '2014-04-16', '2014-04-17', '2014-04-17', '2014-04-18', '2014-04-19', '2014-04-20', '2014-04-21', '2014-04-22', '2014-04-23', '2014-04-24', '2014-04-24', '2014-04-25', '2014-04-26', '2014-04-27', '2014-04-28', '2014-04-28', '2014-04-29', '2014-04-30', '2014-05-01', '2014-05-02', '2014-05-03', '2014-05-04', '2014-05-05', '2014-05-06', '2014-05-07', '2014-05-08', '2014-05-09', '2014-05-10', '2014-05-11', '2014-05-12', '2014-05-13', '2014-05-14', '2014-05-15', '2014-05-16', '2014-05-17', '2014-05-18', '2014-05-19', '2014-05-20', '2014-05-21', '2014-05-22', '2014-05-23', '2014-05-24', '2014-05-25', '2014-05-26', '2014-05-27', '2014-05-28', '2014-05-29', '2014-05-30', '2014-05-31', '2014-06-01', '2014-06-02', '2014-06-03', '2014-06-04', '2014-06-05', '2014-06-06', '2014-06-07', '2014-06-08', '2014-06-09', '2014-06-10', '2014-06-11', '2014-06-12', '2014-06-13', '2014-06-14', '2014-06-15', '2014-06-16', '2014-06-17', '2014-06-18', '2014-06-19', '2014-06-20', '2014-06-21', '2014-06-22', '2014-06-23', '2014-06-24', '2014-06-25', '2014-06-26', '2014-06-27', '2014-06-28', '2014-06-29', '2014-06-30', '2014-07-01', '2014-07-01', '2014-07-02', '2014-07-03', '2014-07-04', '2014-07-05', '2014-07-06', '2014-07-07', '2014-07-08', '2014-07-09', '2014-07-10', '2014-07-11', '2014-07-12', '2014-07-13', '2014-07-14', '2014-07-15', '2014-07-16', '2014-07-17', '2014-07-18', '2014-07-19', '2014-07-20', 
'2014-07-21', '2014-07-22', '2014-07-23', '2014-07-24', '2014-07-25', '2014-07-26', '2014-07-27', '2014-07-28', '2014-07-29', '2014-07-30', '2014-07-31', '2014-08-01', '2014-08-01', '2014-08-02', '2014-08-03', '2014-08-04', '2014-08-05', '2014-08-06', '2014-08-07', '2014-08-08', '2014-08-09', '2014-08-10', '2014-08-11', '2014-08-12', '2014-08-13', '2014-08-14', '2014-08-15', '2014-08-16', '2014-08-17', '2014-08-18', '2014-08-19', '2014-08-20', '2014-08-21', '2014-08-22', '2014-08-23', '2014-08-24', '2014-08-25', '2014-08-26', '2014-08-27', '2014-08-28', '2014-08-29', '2014-08-30', '2014-08-31', '2014-09-01', '2014-09-01', '2014-09-02', '2014-09-03', '2014-09-04', '2014-09-05', '2014-09-06', '2014-09-07', '2014-09-08', '2014-09-09', '2014-09-10', '2014-09-11', '2014-09-12', '2014-09-13', '2014-09-14', '2014-09-15', '2014-09-16', '2014-09-17', '2014-09-18', '2014-09-19', '2014-09-20', '2014-09-21', '2014-09-22', '2014-09-23', '2014-09-24', '2014-09-25', '2014-09-26', '2014-09-27', '2014-09-28', '2014-09-29', '2014-09-30', '2014-10-01', '2014-10-02', '2014-10-03', '2014-10-04', '2014-10-05', '2014-10-06', '2014-10-07', '2014-10-08', '2014-10-09', '2014-10-10', '2014-10-11', '2014-10-12', '2014-10-13', '2014-10-14', '2014-10-15', '2014-10-16', '2014-10-17', '2014-10-18', '2014-10-19', '2014-10-20', '2014-10-21', '2014-10-22', '2014-10-23', '2014-10-24', '2014-10-25', '2014-10-26', '2014-10-27', '2014-10-28', '2014-10-29', '2014-10-30', '2014-10-31', '2014-11-01', '2014-11-01', '2014-11-02', '2014-11-03', '2014-11-04', '2014-11-05', '2014-11-06', '2014-11-07', '2014-11-08', '2014-11-09', '2014-11-10', '2014-11-11', '2014-11-12', '2014-11-13', '2014-11-14', '2014-11-15', '2014-11-16', '2014-11-17', '2014-11-18', '2014-11-19', '2014-11-20', '2014-11-21', '2014-11-22', '2014-11-23', '2014-11-24', '2014-11-25', '2014-11-26', '2014-11-27', '2014-11-28', '2014-11-29', '2014-11-30', '2014-12-01', '2014-12-02', '2014-12-03', '2014-12-04', '2014-12-05', '2014-12-06', 
'2014-12-07', '2014-12-08', '2014-12-09', '2014-12-10', '2014-12-11', '2014-12-12', '2014-12-13', '2014-12-14', '2014-12-15', '2014-12-16', '2014-12-17', '2014-12-18', '2014-12-19', '2014-12-20', '2014-12-21', '2014-12-22', '2014-12-23', '2014-12-24', '2014-12-24', '2014-12-25', '2014-12-26', '2014-12-26', '2014-12-27', '2014-12-28', '2014-12-29', '2014-12-30', '2014-12-31', '2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04', '2015-01-04', '2015-01-05', '2015-01-06', '2015-01-07', '2015-01-07', '2015-01-08', '2015-01-09', '2015-01-10', '2015-01-11', '2015-01-12', '2015-01-13', '2015-01-14', '2015-01-15', '2015-01-16', '2015-01-17', '2015-01-18', '2015-01-19', '2015-01-20', '2015-01-21', '2015-01-22', '2015-01-23', '2015-01-24', '2015-01-25', '2015-01-26', '2015-01-27', '2015-01-28', '2015-01-29', '2015-01-30', '2015-01-31', '2015-02-01', '2015-02-02', '2015-02-03', '2015-02-04', '2015-02-05', '2015-02-06', '2015-02-07', '2015-02-08', '2015-02-09', '2015-02-10', '2015-02-11', '2015-02-12', '2015-02-13', '2015-02-14', '2015-02-15', '2015-02-16', '2015-02-17', '2015-02-18', '2015-02-19', '2015-02-20', '2015-02-21', '2015-02-22', '2015-02-23', '2015-02-24', '2015-02-25', '2015-02-26', '2015-02-27', '2015-02-28', '2015-03-01', '2015-03-01', '2015-03-02', '2015-03-03', '2015-03-04', '2015-03-05', '2015-03-06', '2015-03-07', '2015-03-08', '2015-03-09', '2015-03-10', '2015-03-11', '2015-03-12', '2015-03-13', '2015-03-14', '2015-03-15', '2015-03-16', '2015-03-17', '2015-03-18', '2015-03-19', '2015-03-20', '2015-03-21', '2015-03-22', '2015-03-23', '2015-03-24', '2015-03-25', '2015-03-26', '2015-03-27', '2015-03-28', '2015-03-29', '2015-03-29', '2015-03-30', '2015-03-31', '2015-04-01', '2015-04-02', '2015-04-02', '2015-04-03', '2015-04-04', '2015-04-05', '2015-04-06', '2015-04-07', '2015-04-08', '2015-04-09', '2015-04-09', '2015-04-10', '2015-04-11', '2015-04-12', '2015-04-13', '2015-04-13', '2015-04-14', '2015-04-15', '2015-04-16', '2015-04-17', '2015-04-18', 
'2015-04-19', '2015-04-20', '2015-04-21', '2015-04-22', '2015-04-23', '2015-04-24', '2015-04-25', '2015-04-26', '2015-04-27', '2015-04-28', '2015-04-29', '2015-04-30', '2015-05-01', '2015-05-02', '2015-05-03', '2015-05-04', '2015-05-05', '2015-05-06', '2015-05-07', '2015-05-08', '2015-05-09', '2015-05-10', '2015-05-11', '2015-05-12', '2015-05-13', '2015-05-14', '2015-05-15', '2015-05-16', '2015-05-17', '2015-05-18', '2015-05-19', '2015-05-20', '2015-05-21', '2015-05-22', '2015-05-23', '2015-05-24', '2015-05-25', '2015-05-26', '2015-05-27', '2015-05-28', '2015-05-29', '2015-05-30', '2015-05-31', '2015-06-01', '2015-06-02', '2015-06-03', '2015-06-04', '2015-06-05', '2015-06-06', '2015-06-07', '2015-06-08', '2015-06-09', '2015-06-10', '2015-06-11', '2015-06-12', '2015-06-13', '2015-06-14', '2015-06-15', '2015-06-16', '2015-06-17', '2015-06-18', '2015-06-19', '2015-06-20', '2015-06-21', '2015-06-22', '2015-06-23', '2015-06-24', '2015-06-25', '2015-06-26', '2015-06-27', '2015-06-28', '2015-06-29', '2015-06-30', '2015-07-01', '2015-07-01', '2015-07-02', '2015-07-03', '2015-07-04', '2015-07-05', '2015-07-06', '2015-07-07', '2015-07-08', '2015-07-09', '2015-07-10', '2015-07-11', '2015-07-12', '2015-07-13', '2015-07-14', '2015-07-15', '2015-07-16', '2015-07-17', '2015-07-18', '2015-07-19', '2015-07-20', '2015-07-21', '2015-07-22', '2015-07-23', '2015-07-24', '2015-07-25', '2015-07-26', '2015-07-27', '2015-07-28', '2015-07-29', '2015-07-30', '2015-07-31', '2015-08-01', '2015-08-01', '2015-08-02', '2015-08-03', '2015-08-04', '2015-08-05', '2015-08-06', '2015-08-07', '2015-08-08', '2015-08-09', '2015-08-10', '2015-08-11', '2015-08-12', '2015-08-13', '2015-08-14', '2015-08-15', '2015-08-16', '2015-08-17', '2015-08-18', '2015-08-19', '2015-08-20', '2015-08-21', '2015-08-22', '2015-08-23', '2015-08-24', '2015-08-25', '2015-08-26', '2015-08-27', '2015-08-28', '2015-08-29', '2015-08-30', '2015-08-31', '2015-09-01', '2015-09-01', '2015-09-02', '2015-09-03', '2015-09-04', 
'2015-09-05', '2015-09-06', '2015-09-07', '2015-09-08', '2015-09-09', '2015-09-10', '2015-09-11', '2015-09-12', '2015-09-13', '2015-09-14', '2015-09-15', '2015-09-16', '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21', '2015-09-22', '2015-09-23', '2015-09-24', '2015-09-25', '2015-09-26', '2015-09-27', '2015-09-28', '2015-09-29', '2015-09-30', '2015-10-01', '2015-10-02', '2015-10-03', '2015-10-04', '2015-10-05', '2015-10-06', '2015-10-07', '2015-10-08', '2015-10-09', '2015-10-10', '2015-10-11', '2015-10-12', '2015-10-13', '2015-10-14', '2015-10-15', '2015-10-16', '2015-10-17', '2015-10-18', '2015-10-19', '2015-10-20', '2015-10-21']

You could convert all the dates to datetime objects first, and then iterate through them and check for holes. Or you could build a set of every date in the range and then remove the available ones from that set.
>>> from datetime import datetime, timedelta
>>> available = list(map(lambda x: datetime.strptime(x, '%Y-%m-%d'), available))
>>> dateRange = available[-1] - available[0]
>>> allDays = set((available[0] + timedelta(days=i)) for i in range(dateRange.days))
>>> allDays - set(available)
{datetime.datetime(2013, 10, 26, 0, 0)}
(I took that one out from the original input; there weren’t any missing days)
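For reference, the set-difference approach can be wrapped in a self-contained function. This is only a minimal sketch: the names `missing_dates` and `to_unix` are made up for illustration, and `to_unix` assumes each date should be treated as midnight UTC, which the question does not specify.

```python
import calendar
from datetime import datetime, timedelta


def missing_dates(date_strings, fmt="%Y-%m-%d"):
    """Return the sorted dates absent between the earliest and latest input date."""
    # A set also discards the duplicate entries present in the input.
    available = {datetime.strptime(s, fmt).date() for s in date_strings}
    first, last = min(available), max(available)
    # Every calendar day from first to last, inclusive.
    all_days = {first + timedelta(days=i) for i in range((last - first).days + 1)}
    return sorted(all_days - available)


def to_unix(d):
    """Convert a date to a Unix timestamp, assuming midnight UTC (an assumption)."""
    return calendar.timegm(d.timetuple())


gaps = missing_dates(["2013-10-22", "2013-10-23", "2013-10-25", "2013-10-25"])
print(gaps)
print([to_unix(d) for d in gaps])
```

Working with `date` objects rather than `datetime` objects keeps the time-of-day out of the comparison until the final conversion.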

from datetime import datetime, timedelta
available_date_strings = ['2013-01-01','2013-01-03','2013-01-15']
available_dates = [datetime.strptime(d, '%Y-%m-%d').date()
                   for d in available_date_strings]
unavailable_dates = []
prev_day = available_dates[0]
while (prev_day < available_dates[-1]):
    the_date = prev_day + timedelta(days=1)
    if the_date not in available_dates:
        unavailable_dates.append(the_date)
    prev_day = the_date
unavailable_dates =>
[datetime.date(2013, 1, 2),
datetime.date(2013, 1, 4),
datetime.date(2013, 1, 5),
datetime.date(2013, 1, 6),
datetime.date(2013, 1, 7),
datetime.date(2013, 1, 8),
datetime.date(2013, 1, 9),
datetime.date(2013, 1, 10),
datetime.date(2013, 1, 11),
datetime.date(2013, 1, 12),
datetime.date(2013, 1, 13),
datetime.date(2013, 1, 14)]

I would simply map each date to an index number and mark off the ones that are available. Then run through the index range once to find the unmarked, unavailable dates. This is an O(N) solution in the number of days covered.
Mapping example:
[..., '2013-10-22', '2013-10-25', ...] -> [..., 295, 298, ...]
October 22 is 295th day of the year 2013:
http://www.soils.wisc.edu/cgi-bin/asig/doyCal.rb
Make sure you use the right starting point (epoch). You could use the unix epoch as well.
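The index-mapping idea could look roughly like the sketch below. It is an illustration under assumptions, not the answerer's code: it uses `date.toordinal()` (days counted from 0001-01-01) as the index instead of the day of the year, so ranges that span several years still work, and the function name `unavailable_by_index` is hypothetical. `date.fromisoformat` needs Python 3.7 or later.

```python
from datetime import date


def unavailable_by_index(date_strings):
    """Mark available days in a boolean list, then scan once for the unmarked ones."""
    ordinals = [date.fromisoformat(s).toordinal() for s in date_strings]
    start, end = min(ordinals), max(ordinals)
    # One slot per day in the range; index 0 corresponds to the earliest date.
    is_available = [False] * (end - start + 1)
    for n in ordinals:
        is_available[n - start] = True
    return [
        date.fromordinal(start + i).isoformat()
        for i, ok in enumerate(is_available)
        if not ok
    ]


print(unavailable_by_index(["2013-10-22", "2013-10-25", "2013-10-26"]))
# ['2013-10-23', '2013-10-24']
```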

I would do something like this:
from datetime import timedelta
from dateutil import parser as dp
available = ['2013-10-22', '2013-10-23', '2013-10-25', '2013-10-26', '2013-10-27', '2013-10-28', '2013-10-30']
# tuple() makes the sort key comparable on Python 3
sorted_available = sorted(available, key=lambda d: tuple(map(int, d.split('-'))))
first = dp.parse(sorted_available[0])
last = dp.parse(sorted_available[-1])
daysInRange = (last - first).days
booked = []
d = first
for x in range(0, daysInRange):
    d = d + timedelta(days=1)
    if d.strftime("%Y-%m-%d") not in available:
        booked.append(d.strftime("%Y-%m-%d"))
print(booked)

Related

Select specific dates from a data frame in python using MODIS data in NETCDF4

I am not well-versed in python, and I'm sure there is a simple solution to this (although, I have looked). I got this code from an lpdaac tutorial.
My input is a NETCDF4 file downloaded from MODIS satellite. Printing the metadata of the file returns the variables
file_in = Dataset(file_list[0], 'r', format = 'NETCDF4')
#print metadata
list(file_in.variables)
Out[19]: ['crs', 'time', 'lat', 'lon', '_1_km_16_days_EVI', '_1_km_16_days_VI_Quality']
I want to convert the time variable to date format, and then only select 1 date from each year. Here is the code to convert to date format:
from netCDF4 import num2date
times = file_in.variables["time"] #import time variables
dates = num2date(times[:], times.units) #get the time info
dates = [date.strftime("%Y-%m-%d") for date in dates] #get the list of datetime
print(dates)
The dates are as follows:
['2000-06-25', '2000-07-11', '2000-07-27', '2000-08-12', '2000-08-28', '2000-09-13', '2000-09-29', '2000-10-15', '2000-10-31', '2000-11-16', '2000-12-02', '2000-12-18', '2001-01-01', '2001-01-17', '2001-02-02', '2001-02-18', '2001-03-06', '2001-03-22', '2001-04-07', '2001-04-23', '2001-05-09', '2001-05-25', '2001-06-10', '2001-06-26', '2001-07-12', '2001-07-28', '2001-08-13', '2001-08-29', '2001-09-14', '2001-09-30', '2001-10-16', '2001-11-01', '2001-11-17', '2001-12-03', '2001-12-19', '2002-01-01', '2002-01-17', '2002-02-02', '2002-02-18', '2002-03-06', '2002-03-22', '2002-04-07', '2002-04-23', '2002-05-09', '2002-05-25', '2002-06-10', '2002-06-26', '2002-07-12', '2002-07-28', '2002-08-13', '2002-08-29', '2002-09-14', '2002-09-30', '2002-10-16', '2002-11-01', '2002-11-17', '2002-12-03', '2002-12-19', '2003-01-01', '2003-01-17', '2003-02-02', '2003-02-18', '2003-03-06', '2003-03-22', '2003-04-07', '2003-04-23', '2003-05-09', '2003-05-25', '2003-06-10', '2003-06-26', '2003-07-12', '2003-07-28', '2003-08-13', '2003-08-29', '2003-09-14', '2003-09-30', '2003-10-16', '2003-11-01', '2003-11-17', '2003-12-03', '2003-12-19', '2004-01-01', '2004-01-17', '2004-02-02', '2004-02-18', '2004-03-05', '2004-03-21', '2004-04-06', '2004-04-22', '2004-05-08', '2004-05-24', '2004-06-09', '2004-06-25', '2004-07-11', '2004-07-27', '2004-08-12', '2004-08-28', '2004-09-13', '2004-09-29', '2004-10-15', '2004-10-31', '2004-11-16', '2004-12-02', '2004-12-18', '2005-01-01', '2005-01-17', '2005-02-02', '2005-02-18', '2005-03-06', '2005-03-22', '2005-04-07', '2005-04-23', '2005-05-09', '2005-05-25', '2005-06-10', '2005-06-26', '2005-07-12', '2005-07-28', '2005-08-13', '2005-08-29', '2005-09-14', '2005-09-30', '2005-10-16', '2005-11-01', '2005-11-17', '2005-12-03', '2005-12-19', '2006-01-01', '2006-01-17', '2006-02-02', '2006-02-18', '2006-03-06', '2006-03-22', '2006-04-07', '2006-04-23', '2006-05-09', '2006-05-25', '2006-06-10', '2006-06-26', '2006-07-12', '2006-07-28', '2006-08-13', 
'2006-08-29', '2006-09-14', '2006-09-30', '2006-10-16', '2006-11-01', '2006-11-17', '2006-12-03', '2006-12-19', '2007-01-01', '2007-01-17', '2007-02-02', '2007-02-18', '2007-03-06', '2007-03-22', '2007-04-07', '2007-04-23', '2007-05-09', '2007-05-25', '2007-06-10', '2007-06-26', '2007-07-12', '2007-07-28', '2007-08-13', '2007-08-29', '2007-09-14', '2007-09-30', '2007-10-16', '2007-11-01', '2007-11-17', '2007-12-03', '2007-12-19', '2008-01-01', '2008-01-17', '2008-02-02', '2008-02-18', '2008-03-05', '2008-03-21', '2008-04-06', '2008-04-22', '2008-05-08', '2008-05-24', '2008-06-09', '2008-06-25', '2008-07-11', '2008-07-27', '2008-08-12', '2008-08-28', '2008-09-13', '2008-09-29', '2008-10-15', '2008-10-31', '2008-11-16', '2008-12-02', '2008-12-18', '2009-01-01', '2009-01-17', '2009-02-02', '2009-02-18', '2009-03-06', '2009-03-22', '2009-04-07', '2009-04-23', '2009-05-09', '2009-05-25', '2009-06-10', '2009-06-26', '2009-07-12', '2009-07-28', '2009-08-13', '2009-08-29', '2009-09-14', '2009-09-30', '2009-10-16', '2009-11-01', '2009-11-17', '2009-12-03', '2009-12-19', '2010-01-01', '2010-01-17', '2010-02-02', '2010-02-18', '2010-03-06', '2010-03-22', '2010-04-07', '2010-04-23', '2010-05-09', '2010-05-25', '2010-06-10', '2010-06-26', '2010-07-12', '2010-07-28', '2010-08-13', '2010-08-29', '2010-09-14', '2010-09-30', '2010-10-16', '2010-11-01', '2010-11-17', '2010-12-03', '2010-12-19', '2011-01-01', '2011-01-17', '2011-02-02', '2011-02-18', '2011-03-06', '2011-03-22', '2011-04-07', '2011-04-23', '2011-05-09', '2011-05-25', '2011-06-10', '2011-06-26', '2011-07-12', '2011-07-28', '2011-08-13', '2011-08-29', '2011-09-14', '2011-09-30', '2011-10-16', '2011-11-01', '2011-11-17', '2011-12-03', '2011-12-19', '2012-01-01', '2012-01-17', '2012-02-02', '2012-02-18', '2012-03-05', '2012-03-21', '2012-04-06', '2012-04-22', '2012-05-08', '2012-05-24', '2012-06-09', '2012-06-25', '2012-07-11', '2012-07-27', '2012-08-12', '2012-08-28', '2012-09-13', '2012-09-29', '2012-10-15', 
'2012-10-31', '2012-11-16', '2012-12-02', '2012-12-18', '2013-01-01', '2013-01-17', '2013-02-02', '2013-02-18', '2013-03-06', '2013-03-22', '2013-04-07', '2013-04-23', '2013-05-09', '2013-05-25', '2013-06-10', '2013-06-26', '2013-07-12', '2013-07-28', '2013-08-13', '2013-08-29', '2013-09-14', '2013-09-30', '2013-10-16', '2013-11-01', '2013-11-17', '2013-12-03', '2013-12-19', '2014-01-01', '2014-01-17', '2014-02-02', '2014-02-18', '2014-03-06', '2014-03-22', '2014-04-07', '2014-04-23', '2014-05-09', '2014-05-25', '2014-06-10', '2014-06-26', '2014-07-12', '2014-07-28', '2014-08-13', '2014-08-29', '2014-09-14', '2014-09-30', '2014-10-16', '2014-11-01', '2014-11-17', '2014-12-03', '2014-12-19', '2015-01-01', '2015-01-17', '2015-02-02', '2015-02-18', '2015-03-06', '2015-03-22', '2015-04-07', '2015-04-23', '2015-05-09', '2015-05-25', '2015-06-10', '2015-06-26', '2015-07-12', '2015-07-28', '2015-08-13', '2015-08-29', '2015-09-14', '2015-09-30', '2015-10-16', '2015-11-01', '2015-11-17', '2015-12-03', '2015-12-19', '2016-01-01', '2016-01-17', '2016-02-02', '2016-02-18', '2016-03-05', '2016-03-21', '2016-04-06', '2016-04-22', '2016-05-08', '2016-05-24', '2016-06-09', '2016-06-25', '2016-07-11', '2016-07-27', '2016-08-12', '2016-08-28', '2016-09-13', '2016-09-29', '2016-10-15', '2016-10-31', '2016-11-16', '2016-12-02', '2016-12-18', '2017-01-01', '2017-01-17', '2017-02-02', '2017-02-18', '2017-03-06', '2017-03-22', '2017-04-07', '2017-04-23', '2017-05-09', '2017-05-25', '2017-06-10', '2017-06-26', '2017-07-12', '2017-07-28', '2017-08-13', '2017-08-29', '2017-09-14', '2017-09-30', '2017-10-16', '2017-11-01', '2017-11-17', '2017-12-03', '2017-12-19', '2018-01-01', '2018-01-17', '2018-02-02', '2018-02-18', '2018-03-06', '2018-03-22', '2018-04-07', '2018-04-23', '2018-05-09', '2018-05-25', '2018-06-10', '2018-06-26', '2018-07-12', '2018-07-28', '2018-08-13', '2018-08-29', '2018-09-14', '2018-09-30', '2018-10-16', '2018-11-01', '2018-11-17', '2018-12-03', '2018-12-19', 
'2019-01-01', '2019-01-17', '2019-02-02', '2019-02-18', '2019-03-06', '2019-03-22', '2019-04-07', '2019-04-23', '2019-05-09', '2019-05-25', '2019-06-10', '2019-06-26', '2019-07-12', '2019-07-28', '2019-08-13', '2019-08-29', '2019-09-14', '2019-09-30', '2019-10-16', '2019-11-01', '2019-11-17', '2019-12-03', '2019-12-19', '2020-01-01', '2020-01-17', '2020-02-02', '2020-02-18', '2020-03-05', '2020-03-21', '2020-04-06', '2020-04-22', '2020-05-08', '2020-05-24', '2020-06-09', '2020-06-25', '2020-07-11', '2020-07-27']
And these are the dates I want in the data frame:
dates = ['2000-07-11', '2001-07-12', '2002-07-12', '2003-07-12', '2004-07-11',
'2005-07-12', '2006-07-12', '2007-07-12', '2008-07-11', '2009-07-12',
'2010-07-12', '2011-07-12', '2012-07-11', '2013-07-12', '2014-07-12',
'2015-07-12', '2016-07-11', '2017-07-12', '2018-07-12', '2019-07-12',
'2020-07-11']
I tried just defining a new dates data frame, but I think that caused problems for me later in the code, so I would like to just subset the first dates data frame if there is an easy way to do it.
Thank you for your help
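One way to subset the existing list rather than define a new one is to look up the positions of the wanted dates. The snippet below is a minimal sketch on a shortened copy of the lists above; the variable names `all_dates`, `wanted`, `keep` and `subset_dates` are chosen for illustration.

```python
# Shortened stand-ins for the full date list and the wanted dates.
all_dates = ["2000-06-25", "2000-07-11", "2000-07-27",
             "2001-06-26", "2001-07-12", "2001-07-28"]
wanted = {"2000-07-11", "2001-07-12"}

# Positions of the wanted dates along the time axis.
keep = [i for i, d in enumerate(all_dates) if d in wanted]
subset_dates = [all_dates[i] for i in keep]

print(keep)
print(subset_dates)
```

The index list could then be used to select the matching slices along the time axis of the data variables; how exactly depends on the array layout, which the question does not show.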

python last working day of month (with CustomBusinessDay)?

I would like to calculate the last working day before or after a specific date (taking holidays into account, not just weekends).
import datetime as dt
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday, nearest_workday, \
    USMartinLutherKingJr, USPresidentsDay, GoodFriday, USMemorialDay, \
    USLaborDay, USThanksgivingDay
class USTradingCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday('NewYearsDay', month=1, day=1, observance=nearest_workday),
        USMartinLutherKingJr,
        USPresidentsDay,
        GoodFriday,
        USMemorialDay,
        Holiday('USIndependenceDay', month=7, day=4, observance=nearest_workday),
        USLaborDay,
        USThanksgivingDay,
        Holiday('Christmas', month=12, day=25, observance=nearest_workday)
    ]
def get_trading_close_holidays(fromyear, toyear):
    inst = USTradingCalendar()
    return inst.holidays(dt.datetime(fromyear-1, 12, 31), dt.datetime(toyear, 12, 31))
print(get_trading_close_holidays(2018,2018))
>> DatetimeIndex(['2018-01-01', '2018-01-15', '2018-02-19', '2018-03-30', '2018-05-28', '2018-07-04', '2018-09-03', '2018-11-22', '2018-12-25'], dtype='datetime64[ns]', freq=None)
import datetime as dt
from pandas.tseries.offsets import CustomBusinessDay
bday_us = CustomBusinessDay(calendar=get_trading_close_holidays(2000,2050))
d = dt.datetime(2018, 3, 31)
d - bday_us
>> Timestamp('2018-03-30 00:00:00')
This falls on Good Friday, which is one of the holidays shown above, so the result should be one day earlier: 2018-03-29.
What's the issue?
I was able to reproduce the problem and after some testing I've narrowed it down to using a DatetimeIndex as the input of the calendar parameter in CustomBusinessDay.
You can skip that and use the calendar instance directly:
import datetime as dt
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday, nearest_workday, \
    USMartinLutherKingJr, USPresidentsDay, GoodFriday, USMemorialDay, \
    USLaborDay, USThanksgivingDay
from pandas.tseries.offsets import CustomBusinessDay, BDay
class USTradingCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday('NewYearsDay', month=1, day=1, observance=nearest_workday),
        USMartinLutherKingJr,
        USPresidentsDay,
        GoodFriday,
        USMemorialDay,
        Holiday('USIndependenceDay', month=7, day=4, observance=nearest_workday),
        USLaborDay,
        USThanksgivingDay,
        Holiday('Christmas', month=12, day=25, observance=nearest_workday)
    ]
bday_us = CustomBusinessDay(calendar=USTradingCalendar())
d = dt.datetime(2018, 3, 31)
c = d - bday_us
print(c)
The output:
2018-03-29 00:00:00
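To make the expected behaviour concrete, here is a standard-library sketch of the same idea: step back day by day until the date is neither a weekend nor a holiday. It is not how pandas implements `CustomBusinessDay`; the function name `previous_business_day` is hypothetical, and the holiday set contains only Good Friday 2018, taken from the output printed in the question.

```python
from datetime import date, timedelta


def previous_business_day(d, holidays):
    """Return the latest day strictly before d that is neither a weekend nor a holiday."""
    d -= timedelta(days=1)
    # weekday() is 5 for Saturday and 6 for Sunday.
    while d.weekday() >= 5 or d in holidays:
        d -= timedelta(days=1)
    return d


holidays_2018 = {date(2018, 3, 30)}  # Good Friday only, for illustration
print(previous_business_day(date(2018, 3, 31), holidays_2018))
# 2018-03-29
```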

How to filter two datetime indices?

I have two datetime indices - one being a date_range of business days and the other being a list of holidays.
I filter the holiday list by a start and end date. But now I need to join them and drop any duplicates (holidays and trading days both exist).
Finally, I need to convert the date range into a list of formatted strings (i.e. yyyy_mm_dd) that I can iterate through later.
Here is my code so far:
import datetime
import pandas as pd
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday, nearest_workday, \
    USMartinLutherKingJr, USPresidentsDay, GoodFriday, USMemorialDay, \
    USLaborDay, USThanksgivingDay
class USTradingCalendar(AbstractHolidayCalendar):
    rules = [
        Holiday('NewYearsDay', month=1, day=1, observance=nearest_workday),
        USMartinLutherKingJr,
        USPresidentsDay,
        GoodFriday,
        USMemorialDay,
        Holiday('USIndependenceDay', month=7, day=4, observance=nearest_workday),
        USLaborDay,
        USThanksgivingDay,
        Holiday('Christmas', month=12, day=25, observance=nearest_workday)
    ]
def get_trading_close_holidays(year):
    inst = USTradingCalendar()
    return inst.holidays(datetime.datetime(year-1, 12, 31),
                         datetime.datetime(year, 12, 31))
start_date = "2017_07_01"
end_date = "2017_08_31"
start_date = datetime.datetime.strptime(start_date,"%Y_%m_%d").date()
end_date = datetime.datetime.strptime(end_date,"%Y_%m_%d").date()
date_range = pd.bdate_range(start = start_date, end = end_date, name = "trading_days")
holidays = get_trading_close_holidays(start_date.year)
holidays = holidays.where((holidays.date > start_date) &
                          (holidays.date < end_date))
holidays = holidays.dropna(how = 'any')
date_range = date_range.where(~(date_range.trading_days.isin(holidays)))
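Consider filtering with a boolean mask. Using isin() works for any number of holidays in the range, whereas comparing date_range.date with holidays.date directly only works when exactly one holiday is left:
date_range = date_range[~date_range.isin(holidays)]
print(date_range) # THE ONE HOLIDAY, 2017-07-04, DOES NOT APPEAR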
# DatetimeIndex(['2017-07-03', '2017-07-05', '2017-07-06', '2017-07-07',
# '2017-07-10', '2017-07-11', '2017-07-12', '2017-07-13',
# '2017-07-14', '2017-07-17', '2017-07-18', '2017-07-19',
# '2017-07-20', '2017-07-21', '2017-07-24', '2017-07-25',
# '2017-07-26', '2017-07-27', '2017-07-28', '2017-07-31',
# '2017-08-01', '2017-08-02', '2017-08-03', '2017-08-04',
# '2017-08-07', '2017-08-08', '2017-08-09', '2017-08-10',
# '2017-08-11', '2017-08-14', '2017-08-15', '2017-08-16',
# '2017-08-17', '2017-08-18', '2017-08-21', '2017-08-22',
# '2017-08-23', '2017-08-24', '2017-08-25', '2017-08-28',
# '2017-08-29', '2017-08-30', '2017-08-31'],
# dtype='datetime64[ns]', name='trading_days', freq=None)
Then use astype() to convert the dates to a string array, and tolist() to convert that array to a list:
strdates = date_range.date.astype('str').tolist()
print(strdates)
# ['2017-07-03', '2017-07-05', '2017-07-06', '2017-07-07', '2017-07-10',
# '2017-07-11', '2017-07-12', '2017-07-13', '2017-07-14', '2017-07-17',
# '2017-07-18', '2017-07-19', '2017-07-20', '2017-07-21', '2017-07-24',
# '2017-07-25', '2017-07-26', '2017-07-27', '2017-07-28', '2017-07-31',
# '2017-08-01', '2017-08-02', '2017-08-03', '2017-08-04', '2017-08-07',
# '2017-08-08', '2017-08-09', '2017-08-10', '2017-08-11', '2017-08-14',
# '2017-08-15', '2017-08-16', '2017-08-17', '2017-08-18', '2017-08-21',
# '2017-08-22', '2017-08-23', '2017-08-24', '2017-08-25', '2017-08-28',
# '2017-08-29', '2017-08-30', '2017-08-31']
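The strings above use dashes, while the question asks for the yyyy_mm_dd form. As a cross-check that does not depend on pandas, the whole filter can be sketched with the standard library. This is an illustration only: `trading_day_strings` is a hypothetical helper, and the holiday set is supplied by hand instead of coming from the calendar class.

```python
from datetime import date, timedelta


def trading_day_strings(start, end, holidays, fmt="%Y_%m_%d"):
    """List weekdays from start to end (inclusive) that are not holidays, formatted with fmt."""
    days = []
    d = start
    while d <= end:
        # weekday() is 0-4 for Monday to Friday.
        if d.weekday() < 5 and d not in holidays:
            days.append(d.strftime(fmt))
        d += timedelta(days=1)
    return days


print(trading_day_strings(date(2017, 7, 3), date(2017, 7, 7), {date(2017, 7, 4)}))
# ['2017_07_03', '2017_07_05', '2017_07_06', '2017_07_07']
```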

pandas.concat() does not fill the columns

I am trying to create dummy data as follows:
import numpy as np
import pandas as pd
def dummy_historical(seclist, dates, startvalues):
    dfHist = pd.DataFrame(0, index=[0], columns=seclist)
    for sec in seclist:
        # (works fine)
        svalue = startvalues[sec].max()
        # this creates a random sequence of 84 rows and 1 column (works fine)
        dfRandom = pd.DataFrame(np.random.randint(svalue-10,svalue+10, size=(dates.size, 1 )), index=dates, columns=[sec])
        # does not work
        dfHist[sec] = pd.concat([ dfHist[sec] , dfRandom ])
    return dfHist
When I print dfHist, it only shows me the first row (as when initiated). Thus nothing has been filled.
Here is an example of the data:
seclist = ['AAPL', 'GOOGL']
# use any number for startvalues
dates = DatetimeIndex(['2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
'2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12',
'2017-01-13', '2017-01-14', '2017-01-15', '2017-01-16',
'2017-01-17', '2017-01-18', '2017-01-19', '2017-01-20',
'2017-01-21', '2017-01-22', '2017-01-23', '2017-01-24',
'2017-01-25', '2017-01-26', '2017-01-27', '2017-01-28',
'2017-01-29', '2017-01-30', '2017-01-31', '2017-02-01',
'2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05',
'2017-02-06', '2017-02-07', '2017-02-08', '2017-02-09',
'2017-02-10', '2017-02-11', '2017-02-12', '2017-02-13',
'2017-02-14', '2017-02-15', '2017-02-16', '2017-02-17',
'2017-02-18', '2017-02-19', '2017-02-20', '2017-02-21',
'2017-02-22', '2017-02-23', '2017-02-24', '2017-02-25',
'2017-02-26', '2017-02-27', '2017-02-28', '2017-03-01',
'2017-03-02', '2017-03-03', '2017-03-04', '2017-03-05',
'2017-03-06', '2017-03-07', '2017-03-08', '2017-03-09',
'2017-03-10', '2017-03-11', '2017-03-12', '2017-03-13',
'2017-03-14', '2017-03-15', '2017-03-16', '2017-03-17',
'2017-03-18', '2017-03-19', '2017-03-20', '2017-03-21',
'2017-03-22', '2017-03-23', '2017-03-24', '2017-03-25',
'2017-03-26', '2017-03-27', '2017-03-28', '2017-03-29'],
dtype='datetime64[ns]', freq='D')
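You need to pass axis=1 to concat if you want to concatenate columns. Assigning the result to dfHist[sec] also aligns it to dfHist's existing index, which only contains the row 0, so the new rows are not kept. In addition, you don't need to initialize your data frame with data at the beginning (unless you want to keep the row of 0 values):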
def dummy_historical(seclist, dates, startvalues):
    dfHist = pd.DataFrame()
    for sec in seclist:
        svalue = startvalues[sec].max()
        dfRandom = pd.DataFrame(np.random.randint(svalue-10,svalue+10, size=(dates.size, 1 )), index=dates, columns=[sec])
        dfHist = pd.concat([ dfHist , dfRandom ], axis=1)
    return dfHist
You can write this more concisely and avoid concat altogether:
def generate(sec):
    svalue = startvalues[sec].max()
    return np.random.randint(svalue-10,svalue+10, size=dates.size)
dfHist = pd.DataFrame({sec: generate(sec) for sec in seclist}, index=dates)

Finding the previous month

I've seen some methods that use the dateutil module to do this, but is there a way to do it using just the built-in libs?
For example, the current month right now is July. I can get that using the datetime.now() function.
What would be the easiest way for Python to return the previous month?
It's very easy:
>>> from datetime import datetime
>>> previous_month = datetime.now().month - 1
>>> if previous_month == 0:
...     previous_month = 12
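You can use the calendar module. month_name[0] is an empty string, so for January the expression falls back to the last entry, December:
>>> from datetime import datetime
>>> from calendar import month_name, month_abbr
>>> d = datetime.now()
>>> month_name[d.month - 1] or month_name[-1]
'June'
>>> month_abbr[d.month - 1] or month_abbr[-1]
'Jun'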
If you just want the name as a string, you can index into a tuple that is shifted by one month:
import datetime
months = ("Blank", "December", "January", "February", "March", "April",
          "May", "June", "July", "August", "September", "October", "November")
d = datetime.date.today()
print(months[d.month])
Generalized function finding the year and month, based on a month delta:
# %% function
from datetime import date
def get_year_month(ref_date, month_delta):
    year_delta, month_index = divmod(ref_date.month - 1 + month_delta, 12)
    year = ref_date.year + year_delta
    month = month_index + 1
    return year, month
# %% test
some_date = date(2022, 5, 31)
for delta in range(-12, 12):
    year, month = get_year_month(some_date, delta)
    print(f"{delta=}, {year=}, {month=}")
delta=-12, year=2021, month=5
delta=-11, year=2021, month=6
delta=-10, year=2021, month=7
delta=-9, year=2021, month=8
delta=-8, year=2021, month=9
delta=-7, year=2021, month=10
delta=-6, year=2021, month=11
delta=-5, year=2021, month=12
delta=-4, year=2022, month=1
delta=-3, year=2022, month=2
delta=-2, year=2022, month=3
delta=-1, year=2022, month=4
delta=0, year=2022, month=5
delta=1, year=2022, month=6
delta=2, year=2022, month=7
delta=3, year=2022, month=8
delta=4, year=2022, month=9
delta=5, year=2022, month=10
delta=6, year=2022, month=11
delta=7, year=2022, month=12
delta=8, year=2023, month=1
delta=9, year=2023, month=2
delta=10, year=2023, month=3
delta=11, year=2023, month=4
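If you want a date object, subtracting 30 days gives an approximation (it does not always land in the previous month, for example on the 31st of a month):
import datetime
d = datetime.date.today() - datetime.timedelta(days=30)
# datetime.date(2015, 6, 29) when run on 2015-07-29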
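A variant that always lands in the previous month, using only the standard library, is to step back one day from the first of the current month. This is a small sketch; the function name `last_day_of_previous_month` is made up for illustration.

```python
from datetime import date, timedelta


def last_day_of_previous_month(ref):
    """Step back one day from the first day of ref's month."""
    return ref.replace(day=1) - timedelta(days=1)


prev = last_day_of_previous_month(date(2015, 7, 29))
print(prev, prev.month)
# 2015-06-30 6
print(last_day_of_previous_month(date(2022, 1, 15)))
# 2021-12-31
```

The result carries both the month and the year, so the January case rolls over to December of the previous year without special handling.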
