I am analysing daily rainfall data to calculate insurance payouts for farmers under excess-rainfall coverage. The policy pays out if the cumulative rainfall over 3 consecutive days is greater than 50 mm between 1 October and 31 October.
I was able to write Python code that finds the periods matching this criterion. But when it rains continuously, the results contain overlapping date ranges, which is not acceptable for a payout.
I need help calculating the best payout option when the date ranges overlap.
for dist in data["dcode"].unique():
    d_data = data[data["dcode"] == dist]
    #print(dist)
    for block in d_data["mandal"].unique():
        prev_rain = 0
        prev_to_date = "01/12/2022"
        for each in rain_dev_input:
            #[(rain_dev_input["TERM"] == "DEFICIT RAINFALL") & (rain_dev_input["DIST_CODE"] == 240)]:
            #print(each)
            distcode = each["DIST_CODE"]
            #print(distcode)
            term = str(each["TERM"])
            if (distcode == dist) & (term == "EXCESS RAINFALL"):
                start_date = each["FROM_PERIOD"]
                end_date = each["TO_PERIOD"]
                s_date = datetime.datetime.strptime(start_date, "%d/%m/%y")
                e_date = datetime.datetime.strptime(end_date, "%d/%m/%y")
                #s_date = start_date.strftime(start_date, "%Y-%m-%d")
                #e_date = end_date.strftime(end_date, "%Y-%m-%d")
                #count1 = daterange(s_date, e_date)
                #print(s_date, ": ", e_date)
                m_data = d_data[d_data["mandal"] == block]
                p_data = m_data.loc[start_date:end_date]
                for singledate in daterange(s_date, e_date):
                    #print("Inside Excess Rain")
                    from_date = datetime.datetime.strftime(singledate, "%Y-%m-%d")
                    to_date = datetime.datetime.strftime(
                        (singledate + timedelta(2)), "%Y-%m-%d"
                    )
                    total_rain = p_data.loc[from_date:to_date]["rain"].sum()
                    #print(total_rain)
                    range1 = float(each["RANGE1"])
                    range2 = float(each["RANGE2"])
                    if (total_rain >= range1) & (total_rain < range2):
                        #print("inside write to file")
                        if (from_date <= prev_to_date <= to_date) & (prev_rain <= total_rain):
                            temp["Max"] = total_rain
                        prev_rain = total_rain
                        prev_to_date = to_date
                        temp = dict()
                        temp["district"] = each["DIST_NAME"]
                        temp["mandal"] = block
                        temp[
                            "category"
                        ] = "excess rainfall for 3 consecutive days, cumulative"
                        temp["rainfall"] = total_rain
                        temp["from_date"] = from_date
                        temp["to_date"] = to_date
                        temp["phase"] = each["PHASE"]
                        #temp["insurance"] = insurance_excessrain(each, total_rain)
                        temp["insurance"] = (total_rain - range1) * each["PAYOUT"]
                        excess_rainfall.append(temp)
# %%
# output excess rainfall rain_dev_list data
excess_rain = pd.DataFrame(excess_rainfall)
excess_rain.to_csv(str(output_folder) + "/excess_rainfall_2020_h2.csv", index=False)
This result has overlapping dates
Daily Rainfall sample data
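One way to frame "best payout with no overlapping dates" is as weighted interval scheduling: from all qualifying 3-day windows, choose a set that never shares a day and has the highest total payout. The sketch below is only an illustration under that assumption. The function name `best_non_overlapping` and the sample windows are made up, and whether the policy really wants the maximum total (rather than, say, the first window) is something the policy wording has to decide.

```python
from bisect import bisect_right
from datetime import date, timedelta


def best_non_overlapping(windows):
    """Pick non-overlapping windows with the highest total payout.

    windows: list of (from_date, to_date, payout), both dates inclusive.
    Returns (total_payout, chosen_windows).
    """
    windows = sorted(windows, key=lambda w: w[1])  # sort by end date
    ends = [w[1] for w in windows]
    # best[i] = (total, chosen) when only the first i windows are considered
    best = [(0.0, [])]
    for i, (start, end, payout) in enumerate(windows):
        # k = number of earlier windows that end strictly before this one starts
        k = bisect_right(ends, start - timedelta(days=1), 0, i)
        take_total = best[k][0] + payout
        if take_total > best[i][0]:
            best.append((take_total, best[k][1] + [(start, end, payout)]))
        else:
            best.append(best[i])
    return best[-1]


# Hypothetical qualifying windows for one mandal (payout values are made up)
windows = [
    (date(2020, 10, 1), date(2020, 10, 3), 100.0),
    (date(2020, 10, 2), date(2020, 10, 4), 250.0),
    (date(2020, 10, 3), date(2020, 10, 5), 120.0),
    (date(2020, 10, 5), date(2020, 10, 7), 90.0),
]
total, chosen = best_non_overlapping(windows)
print(total)
for start, end, payout in chosen:
    print(start, end, payout)
```

In the loop above, this would run once per district and mandal on the rows collected in `excess_rainfall`, after converting `from_date` and `to_date` to date objects.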
The data frame shows, for each date, the import and export quantities,
further split into coastal and regional data, for each day
of one month.
What I want to do is sum all the data shown (one month in this
case) so that the result has a single entry, dated at the
month end, with all the corresponding fields added up.
This is my code:
df = pd.read_csv('output.csv',
                 encoding="utf-8", skipinitialspace=True, engine='python')
datadf = df
datadf = datadf.dropna(axis=0, how='any')
# recent pandas versions reject the unit-less 'datetime64'; spell out the unit
datadf = datadf.astype({'ForeignType': 'category', 'ImportType': 'category', 'ArrDate': 'datetime64[ns]',
                        'DepDate': 'datetime64[ns]'})
# datadf = datadf.groupby(datadf['ArrDate'].dt.strftime('%B'))['ComoQty'].sum()
datadf1 = datadf.groupby(['ArrDate','ImportType','ForeignType'])['ComoQty'].sum()
datadf2 = datadf1.to_frame()
datadf2.fillna(value=0,inplace=True)
# datadf2 = datadf2.reset_index('ImportType')
# datadf2 = datadf2.reset_index('ForeignType')
# datadf2 = datadf2.reset_index('ArrDate')
datadf2
datadf1 = datadf.drop(columns='Unnamed: 0')
prac = datadf1
prac =prac.set_index('ArrDate')
prac_dates = prac.copy()
prac = prac.resample('D').apply({'ShipName':'count','ComoQty':'sum'}).reset_index()
prac_dates = ((prac_dates.resample('M').apply({'ComoQty':'sum'}))/1000).reset_index()
prac_dates['Month'] = pd.DatetimeIndex(prac_dates['ArrDate']).strftime('%B')
del prac_dates['ArrDate']
# prac_dates
prac['Month'] = pd.DatetimeIndex(prac['ArrDate']).strftime('%B')
# prac['Month'] = pd.to_datetime(prac['Month'], format='%B')
prac['ArrDate'] = pd.DatetimeIndex(prac['ArrDate']).strftime('%d')
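Independent of pandas, the aggregation described above amounts to mapping every arrival date to the last day of its month and summing the quantity per key. The following is a plain-Python sketch of that logic, not the pandas call itself. The function name `monthly_totals` and the sample rows (including the category values) are made up for illustration; only the column names come from the code above.

```python
import calendar
from collections import defaultdict
from datetime import date


def monthly_totals(rows):
    """Sum ComoQty per (month end, ImportType, ForeignType)."""
    totals = defaultdict(float)
    for arr_date, import_type, foreign_type, qty in rows:
        # monthrange returns (weekday of the first day, number of days in the month)
        last_day = calendar.monthrange(arr_date.year, arr_date.month)[1]
        month_end = date(arr_date.year, arr_date.month, last_day)
        totals[(month_end, import_type, foreign_type)] += qty
    return dict(totals)


# Made-up sample rows: (ArrDate, ImportType, ForeignType, ComoQty)
rows = [
    (date(2019, 4, 1), "Import", "Coastal", 100.0),
    (date(2019, 4, 15), "Import", "Coastal", 50.0),
    (date(2019, 4, 15), "Export", "Regional", 30.0),
    (date(2019, 4, 30), "Export", "Regional", 20.0),
]
totals = monthly_totals(rows)
for key, qty in sorted(totals.items()):
    print(key, qty)
```

In pandas, grouping by `pd.Grouper(key='ArrDate', freq='M')` together with the two category columns should give the same month-end keys.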
I am using the SpreadedLinearZeroInterpolatedTermStructure class to price a bond. I have 35 key rates, from 1M to 30Y, and a daily spot curve. I want to pass the 35 key rates extracted from the daily spot curve to the class, then change individual key rates to see how the bond price responds.
Credit goes to GB and his article here:
http://gouthamanbalaraman.com/blog/bonds-with-spreads-quantlib-python.html
I followed his method, which worked well: the bond price changes as different values are set on the key rates.
Then I replaced his flat curve with my daily spot curve, his list of handles with mine (35 handles), and his two dates with my 35 dates.
I set values on some of the key rates, but the NPV did not move, even when I applied a huge shock. I also tried giving only two key rates on a zero curve, and that worked. So I guess 35 key rates is too many? Any help is appreciated.
import QuantLib as ql
# =============================================================================
# normal yc term structure
# =============================================================================
todaysDate = ql.Date(24,5,2019)
ql.Settings.instance().evaluationDate = todaysDate
KR1 = [0, 1, 3, 6, 9] # KR in month unit
KR2 = [x for x in range(1,31)] # KR in year unit
spotDates = [] # starting from today
for kr in KR1:
    p = ql.Period(kr,ql.Months)
    spotDates.append(todaysDate+p)
for kr in KR2:
    p = ql.Period(kr,ql.Years)
    spotDates.append(todaysDate+p)
spotRates = [0.02026,
0.021569,
0.02326,
0.025008,
0.026089,
0.026679,
0.028753,
0.029376,
0.030246,
0.031362,
0.033026,
0.034274,
0.033953,
0.033474,
0.033469,
0.033927,
0.03471,
0.035596,
0.036396,
0.036994,
0.037368,
0.037567,
0.037686,
0.037814,
0.037997,
0.038247,
0.038562,
0.038933,
0.039355,
0.039817,
0.040312,
0.040832,
0.041369,
0.041922,
0.042487] # matching points
dayCount = ql.Thirty360()
calendar = ql.China()
interpolation = ql.Linear()
compounding = ql.Compounded
compoundingFrequency = ql.Annual
spotCurve = ql.ZeroCurve(spotDates, spotRates, dayCount, calendar,
interpolation,compounding, compoundingFrequency)
spotCurveHandle = ql.YieldTermStructureHandle(spotCurve)
# =============================================================================
# bond settings
# =============================================================================
issue_date = ql.Date(24,5,2018)
maturity_date = ql.Date(24,5,2023)
tenor = ql.Period(ql.Semiannual)
calendar = ql.China()
business_convention = ql.Unadjusted
date_generation = ql.DateGeneration.Backward
month_end = False
schedule = ql.Schedule(issue_date,maturity_date,tenor,calendar,
business_convention, business_convention,
date_generation,month_end)
settlement_days = 0
day_count = ql.Thirty360()
coupon_rate = 0.03
coupons = [coupon_rate]
face_value = 100
fixed_rate_bond = ql.FixedRateBond(settlement_days,
face_value,
schedule,
coupons,
day_count)
#bond_engine = ql.DiscountingBondEngine(spotCurveHandle)
#fixed_rate_bond.setPricingEngine(bond_engine)
#print(fixed_rate_bond.NPV())
# =============================================================================
# non-parallel shift of yc
# =============================================================================
#def KRshocks(kr0=0.0, kr_1M=0.0, kr_3M=0.0, kr_6M=0.0, kr_9M=0.0,
# kr_1Y=0.0,kr_2Y=0.0, kr_3Y=0.0, kr_4Y=0.0, kr_5Y=0.0, kr_6Y=0.0,
# kr_7Y=0.0, kr_8Y=0.0, kr_9Y=0.0, kr_10Y=0.0, kr_11Y=0.0, kr_12Y=0.0,
# kr_13Y=0.0, kr_14Y=0.0, kr_15Y=0.0, kr_16Y=0.0, kr_17Y=0.0, kr_18Y=0.0,
# kr_19Y=0.0, kr_20Y=0.0, kr_21Y=0.0, kr_22Y=0.0, kr_23Y=0.0, kr_24Y=0.0,
# kr_25Y=0.0, kr_26Y=0.0, kr_27Y=0.0, kr_28Y=0.0, kr_29Y=0.0, kr_30Y=0.0):
# '''
#
# Parameters:
# Input shocks for each key rate.
# kr0 = today's spot rate shock;
# kr_1M = 0.083 year(1 month) later spot rate shock;
# kr_1Y = 1 year later spot rate shock;
# .
# .
# .
#
# '''
#
# krs = list(locals().keys())
# KRHandles = {}
# for k in krs:
# KRHandles['{}handle'.format(k)] = ql.QuoteHandle(ql.SimpleQuote(locals()[k]))
# return list(KRHandles.values())
#handles = KRshocks()
kr = ['kr0', 'kr_1M', 'kr_3M', 'kr_6M', 'kr_9M', 'kr_1Y','kr_2Y', 'kr_3Y',
'kr_4Y', 'kr_5Y', 'kr_6Y','kr_7Y', 'kr_8Y', 'kr_9Y', 'kr_10Y', 'kr_11Y',
'kr_12Y', 'kr_13Y', 'kr_14Y', 'kr_15Y', 'kr_16Y', 'kr_17Y', 'kr_18Y',
'kr_19Y', 'kr_20Y', 'kr_21Y', 'kr_22Y', 'kr_23Y', 'kr_24Y','kr_25Y',
'kr_26Y', 'kr_27Y', 'kr_28Y', 'kr_29Y', 'kr_30Y']
#KRQuotes = {}
handles = []
#for k in range(len(kr)):
# KRQuotes['{}'.format(kr[k])] = ql.SimpleQuote(spotRates[k])
# handles.append(ql.QuoteHandle(ql.SimpleQuote(spotRates[k])))
kr0 = ql.SimpleQuote(spotRates[0])
kr_1M = ql.SimpleQuote(spotRates[1])
kr_3M = ql.SimpleQuote(spotRates[2])
kr_6M = ql.SimpleQuote(spotRates[3])
kr_9M = ql.SimpleQuote(spotRates[4])
kr_1Y = ql.SimpleQuote(spotRates[5])
kr_2Y = ql.SimpleQuote(spotRates[6])
kr_3Y = ql.SimpleQuote(spotRates[7])
kr_4Y = ql.SimpleQuote(spotRates[8])
kr_5Y = ql.SimpleQuote(spotRates[9])
kr_6Y = ql.SimpleQuote(spotRates[10])
kr_7Y = ql.SimpleQuote(spotRates[11])
kr_8Y = ql.SimpleQuote(spotRates[12])
kr_9Y = ql.SimpleQuote(spotRates[13])
kr_10Y = ql.SimpleQuote(spotRates[14])
kr_11Y = ql.SimpleQuote(spotRates[15])
kr_12Y = ql.SimpleQuote(spotRates[16])
kr_13Y = ql.SimpleQuote(spotRates[17])
kr_14Y = ql.SimpleQuote(spotRates[18])
kr_15Y = ql.SimpleQuote(spotRates[19])
kr_16Y = ql.SimpleQuote(spotRates[20])
kr_17Y = ql.SimpleQuote(spotRates[21])
kr_18Y = ql.SimpleQuote(spotRates[22])
kr_19Y = ql.SimpleQuote(spotRates[23])
kr_20Y = ql.SimpleQuote(spotRates[24])
kr_21Y = ql.SimpleQuote(spotRates[25])
kr_22Y = ql.SimpleQuote(spotRates[26])
kr_23Y = ql.SimpleQuote(spotRates[27])
kr_24Y = ql.SimpleQuote(spotRates[28])
kr_25Y = ql.SimpleQuote(spotRates[29])
kr_26Y = ql.SimpleQuote(spotRates[30])
kr_27Y = ql.SimpleQuote(spotRates[31])
kr_28Y = ql.SimpleQuote(spotRates[32])
kr_29Y = ql.SimpleQuote(spotRates[33])
kr_30Y = ql.SimpleQuote(spotRates[34])
handles.append(ql.QuoteHandle(kr0))
handles.append(ql.QuoteHandle(kr_1M))
handles.append(ql.QuoteHandle(kr_3M))
handles.append(ql.QuoteHandle(kr_6M))
handles.append(ql.QuoteHandle(kr_9M))
handles.append(ql.QuoteHandle(kr_1Y))
handles.append(ql.QuoteHandle(kr_2Y))
handles.append(ql.QuoteHandle(kr_3Y))
handles.append(ql.QuoteHandle(kr_4Y))
handles.append(ql.QuoteHandle(kr_5Y))
handles.append(ql.QuoteHandle(kr_6Y))
handles.append(ql.QuoteHandle(kr_7Y))
handles.append(ql.QuoteHandle(kr_8Y))
handles.append(ql.QuoteHandle(kr_9Y))
handles.append(ql.QuoteHandle(kr_10Y))
handles.append(ql.QuoteHandle(kr_11Y))
handles.append(ql.QuoteHandle(kr_12Y))
handles.append(ql.QuoteHandle(kr_13Y))
handles.append(ql.QuoteHandle(kr_14Y))
handles.append(ql.QuoteHandle(kr_15Y))
handles.append(ql.QuoteHandle(kr_16Y))
handles.append(ql.QuoteHandle(kr_17Y))
handles.append(ql.QuoteHandle(kr_18Y))
handles.append(ql.QuoteHandle(kr_19Y))
handles.append(ql.QuoteHandle(kr_20Y))
handles.append(ql.QuoteHandle(kr_21Y))
handles.append(ql.QuoteHandle(kr_22Y))
handles.append(ql.QuoteHandle(kr_23Y))
handles.append(ql.QuoteHandle(kr_24Y))
handles.append(ql.QuoteHandle(kr_25Y))
handles.append(ql.QuoteHandle(kr_26Y))
handles.append(ql.QuoteHandle(kr_27Y))
handles.append(ql.QuoteHandle(kr_28Y))
handles.append(ql.QuoteHandle(kr_29Y))
handles.append(ql.QuoteHandle(kr_30Y))
ts_spreaded2 = ql.SpreadedLinearZeroInterpolatedTermStructure(spotCurveHandle,
handles,
spotDates)
ts_spreaded_handle2 = ql.YieldTermStructureHandle(ts_spreaded2)
bond_engine = ql.DiscountingBondEngine(ts_spreaded_handle2)
fixed_rate_bond.setPricingEngine(bond_engine)
#print(fixed_rate_bond.NPV())
kr0.setValue(0.1)
kr_10Y.setValue(0.2)
kr_12Y.setValue(0.2)
print(fixed_rate_bond.NPV())
No errors are raised, but the bond price is the same as the price before the spreads were added.
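One possible explanation, sketched in plain Python rather than QuantLib: if the class interpolates the spreads linearly between adjacent key dates, as its name suggests, a change at one key date only reaches times between its two neighbouring key dates. The bond above pays semiannually from 24/11/2019 to 24/5/2023, so its cash flows fall roughly between 0.5 and 4 years, while the shocked quotes sit at 0, 10 and 12 years. The helper `interpolated_spread` and the year-fraction times are assumptions made for this illustration, and the unshocked spreads are set to zero because only the change matters here.

```python
def interpolated_spread(t, key_times, spreads):
    """Linearly interpolate the spread at time t (in years) between key times."""
    if t <= key_times[0]:
        return spreads[0]
    if t >= key_times[-1]:
        return spreads[-1]
    for i in range(len(key_times) - 1):
        t0, t1 = key_times[i], key_times[i + 1]
        if t0 <= t <= t1:
            s0, s1 = spreads[i], spreads[i + 1]
            return s0 + (s1 - s0) * (t - t0) / (t1 - t0)


# Approximate key times in years: today, 1M, 3M, 6M, 9M, then 1Y..30Y
key_times = [0, 1 / 12, 0.25, 0.5, 0.75] + list(range(1, 31))
base = [0.0] * len(key_times)

# Apply the same three changes as in the question: kr0, kr_10Y, kr_12Y
shocked = list(base)
shocked[key_times.index(0)] = 0.1
shocked[key_times.index(10)] = 0.2
shocked[key_times.index(12)] = 0.2

# Semiannual cash flow times of the bond: 0.5, 1.0, ..., 4.0 years
cashflow_times = [0.5 * k for k in range(1, 9)]
changes = [
    interpolated_spread(t, key_times, shocked) - interpolated_spread(t, key_times, base)
    for t in cashflow_times
]
print(changes)  # every entry is zero: no cash flow sees the shocks
print(interpolated_spread(10.5, key_times, shocked))  # a 10.5Y cash flow would
```

If this reading is right, changing quotes whose neighbourhood contains a cash flow (for example `kr_1Y` to `kr_4Y`) should move the NPV, and the number of key rates is not the problem.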
I'm fairly new to Python. I'm parsing an XML file and the following code returns undesired results. I understand why I'm getting them: there are two escalations in the XML for this deal, and I'm getting a full set of results for each one. I need help updating my code so that it returns only the monthly rent that applies under each escalation in the XML:
<RentEscalations>
  <RentEscalation ID="354781">
    <BeginIn>7</BeginIn>
    <Escalation>3.8</Escalation>
    <RecurrenceInterval>12</RecurrenceInterval>
    <EscalationType>bump</EscalationType>
  </RentEscalation>
  <RentEscalation ID="354782">
    <BeginIn>61</BeginIn>
    <Escalation>1.0</Escalation>
    <RecurrenceInterval>12</RecurrenceInterval>
    <EscalationType>bump</EscalationType>
  </RentEscalation>
</RentEscalations>
The rent starts at $3.00/sqft for the first 6 months. This XML block shows that, starting in month 7 (BeginIn), the rent for the next 12 months (RecurrenceInterval) will be $6.80/sqft ($3.00 base + $3.80 escalation). The following twelve months will be $10.60 ($6.80 + $3.80). Each year, the amount per square foot will increase by $3.80 until the 61st month in the term. At that point, the rent will increase by $1.00/sqft for the remainder of the term. The entire term of the lease is 120 months.
My output contains 114 rows based on the first escalation ($3.80/sqft), followed by 114 rows computed as if the rent started at $3.00/sqft and increased by $1.00/sqft each year.
Any help is appreciated!
import xml.etree.ElementTree as ET
import pyodbc
import dateutil.relativedelta as rd
import datetime as dt
tree = ET.parse('C:\\FileLocation\\DealData.xml')
root = tree.getroot()
for deal in root.findall("Deals"):
    for dl in deal.findall("Deal"):
        dealid = dl.get("DealID")
        for dts in dl.findall("DealTerms/DealTerm"):
            dtid = dts.get("ID")
            darea = float(dts.find("RentableArea").text)
            dterm = int(dts.find("LeaseTerm").text)
            for brrent in dts.findall("BaseRents/BaseRent"):
                brid = brrent.get("ID")
                rent = float(brrent.find("Rent").text)
                darea = float(dts.find("RentableArea").text)
                per = brrent.find("Period").text
                dtstart = dts.find("CommencementDate").text
                startyr = int(dtstart[0:4])
                startmo = int(dtstart[5:7])
                startday = int(dtstart[8:])
                start = dt.date(startyr, startmo, startday)
                end = start + rd.relativedelta(months=dterm)
                if brrent.find("Duration").text is None:
                    duration = 0
                else:
                    duration = int(brrent.find("Duration").text)
                termbal = dterm - duration
                for resc in dts.findall("RentEscalations/RentEscalation"):
                    rescid = resc.get("ID")
                    esctype = resc.find("EscalationType").text
                    begmo = int(resc.find("BeginIn").text)
                    esc = float(resc.find("Escalation").text)
                    intrvl = int(resc.find("RecurrenceInterval").text)
                    if intrvl != 0:
                        pers = termbal / intrvl
                    else:
                        pers = 0
                    escst = start + rd.relativedelta(months=begmo - 1)
                    i = 0
                    x = begmo
                    newrate = rent
                    while i < termbal:
                        billdt = escst + rd.relativedelta(months=i)
                        if per == "rsf/year":
                            monthlyamt = (newrate + esc) * darea / 12.0
                        if per == "month":
                            monthlyamt = newrate + esc
                        if per == "year":
                            monthlyamt = (newrate + esc) / 12.0
                        if per == "rsf/month":
                            monthlyamt = (newrate + esc) * darea
                        try:
                            if i % intrvl == 0:
                                level = x + 1
                                newrent = monthlyamt
                                x += 1
                                newrate += esc
                            else:
                                level = x
                        except ZeroDivisionError:
                            break
                        i += 1
                        if dealid == "1254278":
                            print(dealid, dtid, rescid, dterm, darea, escst, rent, intrvl, esctype, termbal, \
                                  monthlyamt, billdt, pers, level, newrate, newrent)
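The schedule described above can be built in a single pass over the months of the term instead of one full pass per escalation. The sketch below is a minimal illustration, not a drop-in replacement. It assumes that only the escalation with the latest BeginIn at or before the current month is active, and that the rate carries over when the next escalation takes over. The function name `rate_schedule` is made up, and the result is the rate per square foot, before any conversion by Period or RentableArea.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

XML = """<RentEscalations>
  <RentEscalation ID="354781">
    <BeginIn>7</BeginIn>
    <Escalation>3.8</Escalation>
    <RecurrenceInterval>12</RecurrenceInterval>
    <EscalationType>bump</EscalationType>
  </RentEscalation>
  <RentEscalation ID="354782">
    <BeginIn>61</BeginIn>
    <Escalation>1.0</Escalation>
    <RecurrenceInterval>12</RecurrenceInterval>
    <EscalationType>bump</EscalationType>
  </RentEscalation>
</RentEscalations>"""


def rate_schedule(base_rate, term_months, escalations_xml):
    """Return the rate per sqft for months 1..term_months as a list."""
    root = ET.fromstring(escalations_xml)
    # (BeginIn, Escalation, RecurrenceInterval), sorted by BeginIn
    escs = sorted(
        (
            int(e.findtext("BeginIn")),
            Decimal(e.findtext("Escalation")),
            int(e.findtext("RecurrenceInterval")),
        )
        for e in root.findall("RentEscalation")
    )
    rates = []
    rate = base_rate
    for month in range(1, term_months + 1):
        # Assumption: only the latest escalation that has already begun applies
        active = [e for e in escs if e[0] <= month]
        if active:
            begin, bump, interval = active[-1]
            # Bump at BeginIn and then every RecurrenceInterval months after it
            if month == begin or (interval > 0 and (month - begin) % interval == 0):
                rate += bump
        rates.append(rate)
    return rates


rates = rate_schedule(Decimal("3.00"), 120, XML)
for month in (1, 7, 19, 55, 61, 120):
    print(month, rates[month - 1])
```

With the values from the post this gives one row per month of the 120-month term: $3.00 for months 1 to 6, $3.80 steps from month 7, and $1.00 steps from month 61.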
After solving a naive datetime problem, I am facing a new problem in a view that generates graphs. Now I get "mktime argument out of range".
I have no idea how to solve it. I didn't write the code; I got it from a colleague, and I can't seem to understand why it fails. I think it has to do with a function that runs for too long before the error pops up.
#login_required(login_url='/accounts/login/')
def loggedin(request):
data = []
data2 = []
data3 = []
dicdata2 = {}
dicdata3 = {}
datainterior = []
today = timezone.localtime(timezone.now()+timedelta(hours=1)).date()
tomorrow = today + timedelta(1)
semana= today - timedelta(7)
today = today - timedelta(1)
semana_start = datetime.combine(today, time())
semana_start = timezone.make_aware(semana_start, timezone.utc)
today_start = datetime.combine(today, time())
today_start = timezone.make_aware(today_start, timezone.utc)
today_end = datetime.combine(tomorrow, time())
today_end = timezone.make_aware(today_end, timezone.utc)
for modulo in Repository.objects.values("des_especialidade").distinct():
dic = {}
mod = str(modulo['des_especialidade'])
dic["label"] = str(mod)
dic["value"] = Repository.objects.filter(des_especialidade__iexact=mod).count()
data.append(dic)
for modulo in Repository.objects.values("modulo").distinct():
dic = {}
mod = str(modulo['modulo'])
dic["label"] = str(mod)
dic["value"] = Repository.objects.filter(modulo__iexact=mod, dt_diag__gte=semana_start).count()
datainterior.append(dic)
# print mod, Repository.objects.filter(modulo__iexact=mod).count()
# data[mod] = Repository.objects.filter(modulo__iexact=mod).count()
dicdata2['values'] = datainterior
dicdata2['key'] = "Cumulative Return"
dicdata3['values'] = data
dicdata3['color'] = "#d67777"
dicdata3['key'] = "Diagnosticos Identificados"
data3.append(dicdata3)
data2.append(dicdata2)
#-------sunburst
databurst = []
dictburst = {}
dictburst['name'] = "CHP"
childrenmodulo = []
for modulo in Repository.objects.values("modulo").distinct():
childrenmodulodic = {}
mod = str(modulo['modulo'])
childrenmodulodic['name'] = mod
childrenesp = []
for especialidade in Repository.objects.filter(modulo__iexact=mod).values("des_especialidade").distinct():
childrenespdic = {}
esp = str(especialidade['des_especialidade'])
childrenespdic['name'] = esp
childrencode = []
for code in Repository.objects.filter(modulo__iexact=mod,des_especialidade__iexact=esp).values("cod_diagnosis").distinct():
childrencodedic = {}
codee= str(code['cod_diagnosis'])
childrencodedic['name'] = 'ICD9 - '+codee
childrencodedic['size'] = Repository.objects.filter(modulo__iexact=mod,des_especialidade__iexact=esp,cod_diagnosis__iexact=codee).count()
childrencode.append(childrencodedic)
childrenespdic['children'] = childrencode
#childrenespdic['size'] = Repository.objects.filter(des_especialidade__iexact=esp).count()
childrenesp.append(childrenespdic)
childrenmodulodic['children'] = childrenesp
childrenmodulo.append(childrenmodulodic)
dictburst['children'] = childrenmodulo
databurst.append(dictburst)
# print databurst
# --------stacked area chart
datastack = []
for modulo in Repository.objects.values("modulo").distinct():
datastackdic = {}
mod = str(modulo['modulo'])
datastackdic['key'] = mod
monthsarray = []
year = timezone.localtime(timezone.now()+timedelta(hours=1)).year
month = timezone.localtime(timezone.now()+timedelta(hours=1)).month
last = timezone.localtime(timezone.now()+timedelta(hours=1)) - relativedelta(years=1)
lastyear = int(last.year)
lastmonth = int(last.month)
#i = 1
while lastmonth <= int(month) or lastyear<int(year):
date = str(lastmonth) + '/' + str(lastyear)
if (lastmonth < 12):
datef = str(lastmonth + 1) + '/' + str(lastyear)
else:
lastmonth = 01
lastyear = int(lastyear)+1
datef = str(lastmonth)+'/'+ str(lastyear)
lastmonth = 0
datainicial = datetime.strptime(date, '%m/%Y')
datainicial = timezone.make_aware(datainicial, timezone.utc)
datafinal = datetime.strptime(datef, '%m/%Y')
datafinal = timezone.make_aware(datafinal, timezone.utc)
#print "lastmonth",lastmonth,"lastyear", lastyear
#print "datainicial:",datainicial,"datafinal: ",datafinal
filtro = Repository.objects.filter(modulo__iexact=mod)
count = filtro.filter(dt_diag__gte=datainicial, dt_diag__lt=datafinal).count()
conv = datetime.strptime(date, '%m/%Y')
ms = datetime_to_ms_str(conv)
monthsarray.append([ms, count])
#i += 1
lastmonth += 1
datastackdic['values'] = monthsarray
datastack.append(datastackdic)
#print datastack
if request.user.last_login is not None:
#print(request.user.last_login)
contador_novas = Repository.objects.filter(dt_diag__lte=today_end, dt_diag__gte=today_start).count()
return render_to_response('loggedin.html',
{'user': request.user.username, 'contador': contador_novas, 'data': data, 'data2': data2,
'data3': data3,
'databurst': databurst, 'datastack':datastack})
def datetime_to_ms_str(dt):
return str(1000 * mktime(dt.timetuple()))
I think the problem is with this condition.
while lastmonth <= int(month) or lastyear<int(year):
During December, month = 12, so lastmonth <= int(month) is always True. The loop condition therefore never becomes False, even once lastyear is greater than the current year. Presumably lastyear then keeps growing until the date is too large for mktime, which raises the error.
You want to keep looping while the loop date is in a previous year, or is in the current year with a month that is not in the future. Therefore, I think you want to change it to the following:
while lastyear < year or (lastyear == year and lastmonth <= month):
To be sure that the code is working, and to understand it, add print statements to the loops, watch how lastmonth and lastyear change, and check that the loop exits when you expect it to. You also need to test it with other values of year and month so that it doesn't break next month. Ideally, extract this bit of the code into a separate function. The loop would be easier to understand if it only returned a list of (month, year) integers, instead of doing lots of date formatting at the same time. It would then also be easier to add unit tests.
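A minimal sketch of such a function, using the corrected loop condition. The name `months_between` and its argument order are made up for illustration. It only produces (month, year) pairs, so the date formatting and the database query stay in the view.

```python
def months_between(start_year, start_month, end_year, end_month):
    """Return [(month, year), ...] from the start month to the end month, inclusive."""
    months = []
    year, month = start_year, start_month
    # Same condition as suggested above: stop once we pass the end month
    while year < end_year or (year == end_year and month <= end_month):
        months.append((month, year))
        month += 1
        if month > 12:
            month = 1
            year += 1
    return months


# The December case that made the original loop run away
december = months_between(2014, 12, 2015, 12)
print(len(december), december[0], december[-1])

# A mid-year case, to check that other months still work
march = months_between(2015, 3, 2016, 3)
print(len(march), march[0], march[-1])
```

In the view, the start would come from `last.year` and `last.month` and the end from `year` and `month`. Because the function has no Django dependencies, it can be unit tested on its own.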