split data in nested json to make some calculations - python

I have a nested JSON file and I'm trying to calculate the age difference between two fields, start_date and end_date, which are in mm/yyyy format only. I'm trying to split them so I can calculate the year difference between end_date and start_date; if it is over 10 years, I add the record to another list.
This is my code below, but it prints an empty list and I don't know how to fix it. Any tips or directions would be appreciated.
I have to use the default Python libraries, so even though pandas would be easier, I can't use it.
remove_card=[]
def datebreakdown(data_file):
    expr1 = data_file['Credit Card']['start_date']
    expr2 = data_file['Credit Card']['end_date']
    breakdown1 = expr1.split('/')
    breakdown2 = expr2.split('/')
    card_month = int(breakdown1[0]) - int(breakdown2[0])
    card_year= int(breakdown1[1]) - int(breakdown2[1])
    if card_year >= 10:
        return True
    elif card_year == 10 and card_year > 0:
        return True
    else:
        return False
for line in data_json: #data_json is name of the json file.
    if datebreakdown(data_file) == True:
        remove_card.append(data_file)

I think these are the conditions you want:
if card_year > 10:
    return True
elif card_year == 10 and card_month > 0:
    return True
else:
    return False
The first condition should be strictly >, not >=. The second condition should compare the months when the year difference is exactly 10.
Another problem is that you're subtracting the dates in the wrong order. You're subtracting the end from the start, so the result will always be negative. The subtractions should be:
card_month = int(breakdown2[0]) - int(breakdown1[0])
card_year = int(breakdown2[1]) - int(breakdown1[1])
def datebreakdown(data_file):
    expr1 = data_file['Credit Card']['start_date']
    expr2 = data_file['Credit Card']['end_date']
    # the dates are in mm/yyyy format, so the month comes first
    month1, year1 = expr1.split('/')
    month2, year2 = expr2.split('/')
    start_date = int(year1) + int(month1)/12
    end_date = int(year2) + int(month2)/12
    return end_date - start_date > 10
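As an alternative to the fractional-year arithmetic, here is a minimal sketch that keeps everything in whole months and so avoids comparing floats. The helper names `months_since_year_zero` and `is_older_than` and the sample records are made up for illustration; only the 'Credit Card', 'start_date' and 'end_date' keys come from the question.

```python
def months_since_year_zero(mm_yyyy):
    """Convert an 'mm/yyyy' string to a count of whole months."""
    month, year = mm_yyyy.split('/')
    return int(year) * 12 + int(month)


def is_older_than(record, years=10):
    """Return True if end_date is more than `years` years after start_date."""
    card = record['Credit Card']
    span = (months_since_year_zero(card['end_date'])
            - months_since_year_zero(card['start_date']))
    return span > years * 12


# Hypothetical sample data shaped like the records in the question.
records = [
    {'Credit Card': {'start_date': '01/2005', 'end_date': '03/2016'}},
    {'Credit Card': {'start_date': '06/2012', 'end_date': '06/2022'}},
]
remove_card = [r for r in records if is_older_than(r)]
print(len(remove_card))  # 1
```

The second record spans exactly 10 years, so it is not selected; change `>` to `>=` if "10 years or more" is the intended rule.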

Related

In python, how can you compare two CalVer strings to determine if one is greater than, lesser than, or equal to, the other?

I have the occasional need to adjust my python scripts based on the versions of various dependencies. Most often in my case, a python codebase works alongside front-end javascript that may be running releases spanning multiple years. If a javascript dependency has a version greater than A, the python should do B. If the dependency has a version less than X, the python should do Y, etc.
These dependencies are calendar versioned (CalVer). While I've located many tools for maintaining a project's own CalVer, I was unable to find a ready-made solution to evaluate CalVers in this fashion.
if "YY.MM.DD" > "YY.MM.DD.MICRO":
    # Do this thing
else:
    # Do that thing
Comparing dates is easy enough, but when MICRO versions come into the mix, things get more complex.
The Python Packaging Authority (PyPA) maintains the packaging library, which, among other things, implements version handling according to PEP 440 ("Version Identification and Dependency Specification"), including Calendar Versioning.
Examples (taken from Dennis's answer):
>>> from packaging import version
>>> version.parse('2021.01.31') >= version.parse('2021.01.30.dev1')
True
>>> version.parse('2021.01.31.0012') >= version.parse('2021.01.31.1012')
False
I ended up writing my own solution to allow me to compare CalVer strings as shown below.
subject = "2021.01.31"
test = "2021.01.30.dev1"
if calver_evaluate(operator="gte", subject=subject, test=test):
    # if "2021.01.31" >= "2021.01.30.dev1"
    result = True
subject = "2021.01.31.0012"
test = "2021.01.31.1012"
if calver_evaluate(operator="gte", subject=subject, test=test):
    # if "2021.01.31.0012" >= "2021.01.31.1012"
    result = False
Full details on the operations are included in the function's docstring. Note the limited rules around evaluating micros that cannot be converted to integers.
import datetime
def calver_evaluate(operator=None, subject=None, test=None):
    """Evaluates two calver strings based on the operator.
    Params
    ------
    operator : str
        Defines how to evaluate the subject and test params. Acceptable values are:
        - "gt" or ">" for greater than
        - "gte" or ">=" for greater than or equal to
        - "e", "eq", "equal", "=", or "==" for equal to
        - "lt" or "<" for less than
        - "lte" or "<=" for less than or equal to
    subject : str
        A calver string formatted as YYYY.0M.0D.MICRO (recommended) or YY.MM.DD.MICRO.
        https://calver.org/calendar_versioning.html
    test : str
        A calver string to evaluate against the subject, formatted as YYYY.0M.0D.MICRO
        (recommended) or YY.MM.DD.MICRO.
        https://calver.org/calendar_versioning.html
    Returns
    -------
    bool
        The results of the `subject`:`test` evaluation using the `operator`.
    Notes
    -----
    The MICRO segment of the calver strings is only considered in the following
    scenarios.
    1. One calver has a MICRO value and the other does not. The calver without a
       MICRO value is evaluated as `0`, making the calver *with* the MICRO, no matter
       what the value, the greater of the two.
       `2021.01.01 == 2021.01.01.0`, therefore `2021.01.01.2 > 2021.01.01` and
       `2021.01.01.dev1 > 2021.01.01`
    2. Both calvers have MICRO values that are numeric and able to be converted to
       integers.
    3. Both calvers have string MICRO values **and** the operator selected is
       "equals".
    """
    if not operator or not subject or not test:
        raise Exception("calver_evaluate: Missing keyword argument.")
    allowed = ["lt","<","lte","<=","e","eq","equal","=","==","gte",">=","gt",">"]
    if operator not in allowed:
        raise Exception("calver_evaluate: Unrecognized evaluation operator.")
    sparts = subject.split(".")
    syear = int(sparts[0]) if int(sparts[0]) > 100 else int(sparts[0]) + 2000
    smonth = int(sparts[1])
    sday = int(sparts[2])
    sdate = datetime.date(syear, smonth, sday)
    smicro = sparts[3] if len(sparts) > 3 else 0
    tparts = test.split(".")
    tyear = int(tparts[0]) if int(tparts[0]) > 100 else int(tparts[0]) + 2000
    tmonth = int(tparts[1])
    tday = int(tparts[2])
    tdate = datetime.date(tyear, tmonth, tday)
    tmicro = tparts[3] if len(tparts) > 3 else 0
    # str() replaces the Python 2 unicode() builtin
    if str(smicro).isnumeric() and str(tmicro).isnumeric():
        smicro = int(smicro)
        tmicro = int(tmicro)
    elif smicro == 0:
        tmicro = 1
    elif tmicro == 0:
        smicro = 1
    lt = ["lt","<"]
    lte = ["lte","<="]
    equal = ["e","eq","equal","=","=="]
    gte = ["gte",">="]
    gt = ["gt",">"]
    check_micro = (
        (
            isinstance(smicro, int) and isinstance(tmicro, int) and
            (smicro > 0 or tmicro > 0)
        ) or
        (
            operator in equal and
            not isinstance(smicro, int) and
            not isinstance(tmicro, int)
        )
    )
    def evaluate_micro(operator, smicro, tmicro):
        if operator in lt:
            if smicro < tmicro:
                return True
        elif operator in lte:
            if smicro <= tmicro:
                return True
        elif operator in equal:
            if smicro == tmicro:
                return True
        elif operator in gte:
            if smicro >= tmicro:
                return True
        elif operator in gt:
            if smicro > tmicro:
                return True
        return False
    if operator in lt and sdate <= tdate:
        if sdate < tdate:
            return True
        elif sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
    elif operator in lte and sdate <= tdate:
        if sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
        return True
    elif operator in equal:
        if sdate == tdate:
            if check_micro:
                return evaluate_micro(operator, smicro, tmicro)
            return True
    elif operator in gte and sdate >= tdate:
        if sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
        return True
    elif operator in gt and sdate >= tdate:
        if sdate > tdate:
            return True
        elif sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
    return False
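For version strings whose segments are all numeric, a much smaller sketch is possible with only the standard library: split on dots, convert to integers, pad to a common length, and let tuple comparison do the rest. The names `calver_key` and `calver_compare` are invented here. Non-numeric segments such as `dev1` are rejected rather than ordered, and two-digit years are not expanded, so this is not a drop-in replacement for the function above or for `packaging`.

```python
import operator

# Map operator symbols to functions from the standard operator module.
_OPS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    ">=": operator.ge, ">": operator.gt,
}


def calver_key(version, width=4):
    """Turn 'YYYY.MM.DD[.MICRO]' into a tuple of ints, padding with zeros."""
    parts = [int(p) for p in version.split(".")]  # raises ValueError on 'dev1'
    return tuple(parts + [0] * (width - len(parts)))


def calver_compare(subject, op, test):
    """Compare two all-numeric CalVer strings using the given operator symbol."""
    return _OPS[op](calver_key(subject), calver_key(test))


print(calver_compare("2021.01.31.0012", ">=", "2021.01.31.1012"))  # False
print(calver_compare("2021.01.31", "==", "2021.01.31.0"))          # True
```

Padding with zeros mirrors the rule in the docstring above that a missing MICRO is treated as `0`.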

Code to check if current time is before due time

I am trying to check whether the current hour, minute and section (AM/PM) are before the due hour, due minute and due section; if so, it should print True, otherwise False. My code is not working and I've been working on this for 2 hours.
current_hour = 12
current_minute = 37
current_section = "PM"
due_hour = 9
due_minute = 0
due_section = "AM"
if (((current_hour < 9) and (current_hour != 12)) and (current_minute != 0) and current_section):
    print("True")
else:
    print("False")
Your current code is failing (presumably) because you're using 'and current_section', which evaluates as true for any non-empty value of current_section.
Using the datetime library makes this quite simple:
from datetime import datetime
due_time = datetime.strptime('9:00AM','%I:%M%p')
curr_time = datetime.strptime('12:37PM','%I:%M%p')
diff_seconds = (curr_time - due_time).total_seconds()
if diff_seconds > 0:
    print('False')
else:
    print('True')
You can also add dates to make it more robust (see https://stackoverflow.com/a/466376/10475762 for more information on how to use strptime).
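To illustrate the point about including dates, here is a minimal sketch that parses full timestamps, so that a due time on a different day is handled correctly. The format string, the function name `is_before_due` and the sample timestamps are assumptions chosen for the example.

```python
from datetime import datetime

# Assumed input format: 'YYYY-MM-DD H:MMAM' or 'YYYY-MM-DD H:MMPM'
FMT = '%Y-%m-%d %I:%M%p'


def is_before_due(current, due, fmt=FMT):
    """Return True if the current timestamp is strictly before the due one."""
    return datetime.strptime(current, fmt) < datetime.strptime(due, fmt)


print(is_before_due('2021-03-01 12:37PM', '2021-03-01 9:00AM'))  # False
print(is_before_due('2021-03-01 12:37PM', '2021-03-02 9:00AM'))  # True
```

Comparing the parsed `datetime` objects directly avoids having to compute and inspect a difference in seconds.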

Comparing Dates string in Python

I need help with a Python function that compares two dates (as strings) and returns True if date1 is earlier than date2. Here is my code, but I don't know why it returns True for the test case ("2013/10/24", "2013/9/24").
# str, str -> boolean
def dateLessThan(date1,date2):
    date1 = date1.split('/')
    date2 = date2.split('/')
    if date1[0] < date2[0]:
        return True
    elif date1[0] == date2[0] and date1[1] < date2[1]:
        return True
    elif date1[0] == date2[0] and date1[1] == date2[1] and date1[2] < date2[2]:
        return True
    else:
        return False
Just use the datetime.strptime class method instead of doing your own parsing.
import datetime
def dateLessThan(date1,date2):
    date1 = datetime.datetime.strptime(date1, "%Y/%m/%d")
    date2 = datetime.datetime.strptime(date2, "%Y/%m/%d")
    return date1 < date2
Consider using datetime objects (assuming your date format is YYYY/mm/dd):
from datetime import datetime
def dateLessThan(date1,date2):
    datetime1 = datetime.strptime(date1, '%Y/%m/%d')
    datetime2 = datetime.strptime(date2, '%Y/%m/%d')
    return datetime1 < datetime2
Your test fails because of lexicographical comparison of strings: "10" < "9" is True.
Without using datetime or time parsing (which is required when there are complex formats, months names...), it's possible to do something simple since there's only numbers involved (and you have years/month/day, so you're close to the ISO date format where you can compare lexicographically).
Just map the values into integers and convert to lists and let the natural/lexicographical order of lists do the rest:
def dateLessThan(date1,date2):
    return [int(x) for x in date1.split('/')] < [int(x) for x in date2.split('/')]
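The string-comparison pitfall can be reproduced directly with the test case from the question. The sketch below compares the same two dates three ways; the variable names are chosen only for this illustration.

```python
from datetime import datetime

d1, d2 = "2013/10/24", "2013/9/24"

# String fields compare character by character, so "10" sorts before "9".
as_strings = d1.split('/') < d2.split('/')

# Integer fields compare numerically.
as_ints = [int(x) for x in d1.split('/')] < [int(x) for x in d2.split('/')]

# Parsed dates compare chronologically; %m accepts "9" as well as "09".
as_dates = datetime.strptime(d1, "%Y/%m/%d") < datetime.strptime(d2, "%Y/%m/%d")

print(as_strings, as_ints, as_dates)  # True False False
```

Only the first comparison claims that October 24 is earlier than September 24, which is the wrong result the question describes.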

selecting date in date ranges wrong result

The following if statement occasionally gives the wrong result. Any idea why?
if self.start_date <= datetime.datetime.now().date() <= self.end_date:
    self._is_current = True
else:
    self._is_current = False
return self._is_current

Add n business days to a given date ignoring holidays and weekends in python

I'm trying to add n (an integer) working days to a given date. The addition has to skip holidays and weekends, since they are not counted as working days.
Skipping weekends would be pretty easy doing something like this:
import datetime
def date_by_adding_business_days(from_date, add_days):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5: # saturday = 5, sunday = 6
            continue
        business_days_to_add -= 1
    return current_date
#demo:
print('10 business days from today:')
print(date_by_adding_business_days(datetime.date.today(), 10))
The problem with holidays is that they vary a lot by country or even by region, religion, etc. You would need a list/set of holidays for your use case and then skip them in a similar way. A starting point may be the calendar feed that Apple publishes for iCal (in the ics format), the one for the US would be http://files.apple.com/calendars/US32Holidays.ics
You could use the icalendar module to parse this.
If you don't mind using a 3rd party library then dateutil is handy
from dateutil.rrule import *
print("In 4 business days, it's", rrule(DAILY, byweekday=(MO,TU,WE,TH,FR))[4])
You can also look at rruleset and using .exdate() to provide the holidays to skip those in the calculation, and optionally there's a cache option to avoid re-calculating that might be worth looking in to.
There is no real shortcut to do this. Try this approach:
Create a class which has a method skip(self, d) which returns True for dates that should be skipped.
Create a dictionary in the class which contains all holidays as date objects. Don't use datetime or similar because the fractions of a day will kill you.
Return True for any date that is in the dictionary or d.weekday() >= 5
To add N days, use this method:
def advance(d, days):
    delta = datetime.timedelta(1)
    for x in range(days):
        d = d + delta
        while holidayHelper.skip(d):
            d = d + delta
    return d
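The answer above leaves the helper class unspecified. A minimal sketch of what it could look like follows. The class name `HolidayHelper`, its constructor and the sample holiday are assumptions for illustration, and a set is used in place of the dictionary mentioned above since only membership is needed.

```python
import datetime


class HolidayHelper:
    """Decides which dates are not working days (hypothetical helper)."""

    def __init__(self, holidays=()):
        # Store plain date objects so that time-of-day never affects lookups.
        self.holidays = set(holidays)

    def skip(self, d):
        """Return True for weekends (Saturday=5, Sunday=6) and listed holidays."""
        return d.weekday() >= 5 or d in self.holidays


def advance(d, days, helper):
    """Move d forward by `days` working days, skipping dates the helper rejects."""
    delta = datetime.timedelta(days=1)
    for _ in range(days):
        d += delta
        while helper.skip(d):
            d += delta
    return d


helper = HolidayHelper([datetime.date(2012, 10, 3)])
print(advance(datetime.date(2012, 10, 2), 2, helper))  # 2012-10-05
```

Passing the helper as an argument, rather than relying on a global, keeps the function easy to test with different holiday lists.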
Thanks. Based on omz's code, I made a few small changes that may be helpful for other users:
import datetime
def date_by_adding_business_days(from_date, add_days,holidays):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        weekday = current_date.weekday()
        if weekday >= 5: # saturday = 5, sunday = 6
            continue
        if current_date in holidays:
            continue
        business_days_to_add -= 1
    return current_date
#demo:
Holidays =[datetime.datetime(2012,10,3),datetime.datetime(2012,10,4)]
print(date_by_adding_business_days(datetime.datetime(2012,10,2), 10,Holidays))
I wanted a solution that wasn't O(N) and it looked like a fun bit of code golf. Here's what I banged out in case anyone's interested. Works for positive and negative numbers. Let me know if I missed anything.
from datetime import timedelta
def add_business_days(d, business_days_to_add):
    # floor division keeps the Python 2 integer behaviour, including for negatives
    num_whole_weeks = business_days_to_add // 5
    extra_days = num_whole_weeks * 2
    first_weekday = d.weekday()
    remainder_days = business_days_to_add % 5
    natural_day = first_weekday + remainder_days
    if natural_day > 4:
        if first_weekday == 5:
            extra_days += 1
        elif first_weekday != 6:
            extra_days += 2
    return d + timedelta(business_days_to_add + extra_days)
I know it does not handle holidays, but I found this solution more helpful because it is constant in time. It consists of counting the number of whole weeks, adding holidays is a little more complex. I hope it can help somebody :)
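import datetime
from math import ceil
def add_days(days):
    today = datetime.date.today()
    # weekday index the result would have if weekends did not exist
    weekday = today.weekday() + ceil(days)
    # every 5 working days crosses one weekend (assumes today is Monday-Friday)
    complete_weeks = weekday // 5
    added_days = ceil(days) + complete_weeks * 2
    return today + datetime.timedelta(days=added_days)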
This will take some work since there isn't any defined construct for holidays in any library (to my knowledge at least). You will need to create your own enumeration of those.
Checking for weekend days is done easily by testing .weekday() >= 5 on your datetime object (Saturday is 5 and Sunday is 6).
Refactoring omz code, and using holidays package, this is what I use to add business days taking into account the country's holidays
import datetime
import holidays
def today_is_holiday(date):
    isHoliday = date.date() in [key for key in holidays.EN(years = date.year).keys()]
    isWeekend = date.weekday() >= 5
    return isWeekend or isHoliday
def date_by_adding_business_days(from_date, add_days):
    business_days_to_add = add_days
    current_date = from_date
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=1)
        if today_is_holiday(current_date):
            continue
        business_days_to_add -= 1
    return current_date
Hope this helps. It's not O(N) but O(holidays). Also, holidays only works when the offset is positive.
import datetime
def add_working_days(start, working_days, holidays=()):
    """
    Add working_days to the start date, skipping weekends and holidays.
    :param start: the date to start from
    :type start: datetime.datetime|datetime.date
    :param working_days: offset in working days you want to add (can be negative)
    :type working_days: int
    :param holidays: iterator of datetime.datetime or datetime.date instances
    :type holidays: iter(datetime.date|datetime.datetime)
    :return: the new date, working_days working days from start
    :rtype: datetime.datetime
    :raise:
        ValueError if working_days < 0 and holidays
    """
    assert isinstance(start, (datetime.date, datetime.datetime)), 'start should be a datetime instance'
    assert isinstance(working_days, int)
    if working_days < 0 and holidays:
        raise ValueError('Holidays and a negative offset is not implemented. ')
    if working_days == 0:
        return start
    # first just add the days
    new_date = start + datetime.timedelta(working_days)
    # now compensate for the weekends.
    # add 2 days for every weekend crossed: (offset + starting weekday) // 5.
    # For example, adding 1 to a Friday gives (4 + 1) // 5 = 1 weekend.
    new_date += datetime.timedelta(2 * ((working_days + start.weekday()) // 5))
    # now compensate for the holidays
    # process only the relevant dates so order the list and abort the handling when the holiday is no longer
    # relevant. Check each holiday not being in a weekend, otherwise we don't mind because we skip them anyway
    # next, if a holiday is found, just add 1 to the date, using the add_working_days function to compensate for
    # weekends. Don't pass the holidays to avoid recursion more than 1 call deep.
    for hday in sorted(holidays):
        if hday < start:
            # ignore holidays before start, we don't care
            continue
        if hday.weekday() > 4:
            # skip holidays in weekends
            continue
        if hday <= new_date:
            # only work with holidays up to and including the current new_date.
            # increment using recursion to compensate for weekends
            new_date = add_working_days(new_date, 1)
        else:
            break
    return new_date
If someone needs to add/subtract days, extending @omz's answer:
import datetime
def add_business_days(from_date, ndays):
    business_days_to_add = abs(ndays)
    current_date = from_date
    # direction of travel; avoids dividing by zero when ndays == 0
    sign = 1 if ndays >= 0 else -1
    while business_days_to_add > 0:
        current_date += datetime.timedelta(days=sign)
        weekday = current_date.weekday()
        if weekday >= 5: # saturday = 5, sunday = 6
            continue
        business_days_to_add -= 1
    return current_date
Similar to @omz's solution, but recursive:
from datetime import timedelta
def add_days_skipping_weekends(start_date, days):
    if not days:
        return start_date
    start_date += timedelta(days=1)
    if start_date.weekday() < 5:
        days -= 1
    return add_days_skipping_weekends(start_date, days)
If you are interested in using NumPy, then you can follow the solution below:
import numpy as np
from datetime import datetime, timedelta
def get_future_date_excluding_weekends(date,no_of_days):
    """This methods return future date by adding given number of days excluding
    weekends"""
    future_date = date + timedelta(no_of_days)
    no_of_busy_days = int(np.busday_count(date.date(),future_date.date()))
    if no_of_busy_days != no_of_days:
        extend_future_date_by = no_of_days - no_of_busy_days
        future_date = future_date + timedelta(extend_future_date_by)
    return future_date
This solution has O(1) complexity (no loop) and needs no third-party library, but it does not take holidays into account:
def add_working_days_to_date(self, start_date, days_to_add):
    from datetime import timedelta
    start_weekday = start_date.weekday()
    # first week
    total_days = start_weekday + days_to_add
    if total_days < 5:
        # the result stays in the same week, so no weekend is crossed
        return start_date + timedelta(days=days_to_add)
    else:
        # first week
        total_days = 7 - start_weekday
        days_to_add -= 5 - start_weekday
        # middle whole weeks
        whole_weeks = days_to_add // 5
        remaining_days = days_to_add % 5
        total_days += whole_weeks * 7
        days_to_add -= whole_weeks * 5
        # last week
        total_days += remaining_days
        return start_date + timedelta(days=total_days)
Even though this does not fully solve your problem, I wanted to leave it here because all the solutions I found online for adding working days to dates have O(n) complexity.
Keep in mind that, if you want to add 500 days to a date, you will go through a loop and make the same set of computations 500 times. The above approach operates in the same amount of time, no matter how many days you have.
This was heavily tested.
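One way to gain confidence in a constant-time formula of this kind is to compare it against the straightforward day-by-day loop over many inputs. The sketch below does that for a closed-form variant restricted to weekday start dates and non-negative offsets. Both function names are made up for this example, and the formula is the weekend-compensation expression `2 * ((start.weekday() + n) // 5)`, not the exact code above.

```python
import datetime


def add_business_days_loop(start, n):
    """Reference implementation: step one day at a time, skipping weekends."""
    current = start
    while n > 0:
        current += datetime.timedelta(days=1)
        if current.weekday() < 5:
            n -= 1
    return current


def add_business_days_fast(start, n):
    """Closed form; assumes start is Monday-Friday and n >= 0."""
    weekends = (start.weekday() + n) // 5
    return start + datetime.timedelta(days=n + 2 * weekends)


# Cross-check both versions for every weekday start in a four-week window.
base = datetime.date(2022, 12, 5)  # a Monday
mismatches = 0
for offset in range(28):
    start = base + datetime.timedelta(days=offset)
    if start.weekday() >= 5:
        continue  # the closed form is only defined for weekday starts
    for n in range(300):
        if add_business_days_loop(start, n) != add_business_days_fast(start, n):
            mismatches += 1
print(mismatches)
```

A non-zero count would point at an input where the shortcut and the loop disagree, which is usually an edge case around Friday or a multiple of five days.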
Use numpy (you can skip holidays too):
import numpy as np
np.busday_offset(
    np.datetime64('2022-12-08'),
    offsets=range(12),
    roll='following',
    weekmask="1111100",
    holidays=[])
Result:
array(['2022-12-08', '2022-12-09', '2022-12-12', '2022-12-13',
'2022-12-14', '2022-12-15', '2022-12-16', '2022-12-19',
'2022-12-20', '2022-12-21', '2022-12-22', '2022-12-23'],
dtype='datetime64[D]')
I am using the following code (with pandas) to handle business-date deltas. For holidays, you need to create your own list to skip.
from datetime import datetime
from pandas.tseries.offsets import BDay
today = datetime.now()
t_1 = today - BDay(1)
t_5 = today - BDay(5)
t_1_str = datetime.strftime(t_1,"%Y%m%d")
