Function tests haven't gone as expected (part of AoC day 4) - Python

I wrote a function that checks if data is correct.
Requirements are as follows:
byr (Birth Year) - four digits; at least 1920 and at most 2002.
iyr (Issue Year) - four digits; at least 2010 and at most 2020.
eyr (Expiration Year) - four digits; at least 2020 and at most 2030.
def check_byr_iyr_eyr(line):
    statement = True
    if line[:3] == "byr":
        if (len(line[line.index(':')+1:]) != 4 or
                1920 > int(line[line.index(':')+1:]) > 2002 ):
            statement = False
    elif line[:3] == "iyr":
        if (len(line[line.index(':')+1:]) != 4 or
                2010 > int(line[line.index(':')+1:]) > 2020 ):
            statement = False
    elif line[:3] == "eyr":
        if (len(line[line.index(':')+1:]) != 4 or
                2020 > int(line[line.index(':')+1:]) > 2030 ):
            statement = False
    return statement

list = ['byr:1919', 'iyr:2010', 'eyr:2021', 'iyr:2019', 'iyr:1933',
        'byr:1946', 'iyr:1919', 'eyr:2005']
for i in list:
    print(check_byr_iyr_eyr(i))
'''
expected result:
False
True
True
True
False
True
False
False
'''
The results for the provided samples should match the "expected result" multi-line comment above, but unfortunately the result is always True.
I don't know what I'm doing wrong; the conditions seem fine to me...

Consider this line:
1920 > val > 2002
It is the same result as:
val < 1920 and val > 2002
It means that val is both less than 1920, and greater than 2002, which can never be true.
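A small sketch can make this concrete. It compares the chained form with its expanded `and` form and with the `or` form the check was presumably meant to express. The helper names `chained`, `expanded` and `out_of_range` are made up for this illustration.

```python
def chained(val):
    # Python evaluates a chained comparison as an `and` of its pairs.
    return 1920 > val > 2002


def expanded(val):
    # The explicit form of the chain above.
    return 1920 > val and val > 2002


def out_of_range(val):
    # Presumably the intended check: val lies outside [1920, 2002].
    return val < 1920 or val > 2002


for val in (1919, 1946, 2005):
    print(val, chained(val), expanded(val), out_of_range(val))
# 1919 False False True
# 1946 False False False
# 2005 False False True
```

The first two columns are always False, which is why the original function never set statement to False.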

A more elegant solution; using too many if statements is not DRY (don't repeat yourself):
def check_byr_iyr_eyr(line):
    # split the string on the colon to get the two parts
    prefix, year = line.split(':')
    # getting around python's lack of case statements with a dictionary
    cases = {
        'byr': {'min': 1920, 'max': 2002},
        'iyr': {'min': 2010, 'max': 2020},
        'eyr': {'min': 2020, 'max': 2030},
    }
    # get the corresponding min and max and check if the year is inclusively between them
    # (note the <= instead of <)
    return cases[prefix]['min'] <= int(year) <= cases[prefix]['max']

data = ['byr:1919', 'iyr:2010', 'eyr:2021', 'iyr:2019', 'iyr:1933',
        'byr:1946', 'iyr:1919', 'eyr:2005']
for i in data:
    print(check_byr_iyr_eyr(i))
Output:
False
True
True
True
False
True
False
False
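The dictionary version above drops the "four digits" requirement and raises a KeyError for any other field. A possible variant that keeps both behaviours of the original function is sketched below. The names `RANGES` and `check_year_field` are invented for this example, and returning True for unknown prefixes is an assumption based on the question's code, which leaves other fields untouched.

```python
# Inclusive (min, max) range per field, taken from the requirements in the question.
RANGES = {"byr": (1920, 2002), "iyr": (2010, 2020), "eyr": (2020, 2030)}


def check_year_field(line):
    prefix, _, value = line.partition(":")
    if prefix not in RANGES:
        # Assumption: fields other than byr/iyr/eyr are not this function's job.
        return True
    if len(value) != 4 or not value.isdigit():
        # Enforce the "four digits" rule before converting to int.
        return False
    low, high = RANGES[prefix]
    return low <= int(value) <= high


for sample in ["byr:1919", "iyr:2010", "eyr:02025", "byr:abcd", "hgt:170cm"]:
    print(sample, check_year_field(sample))
# byr:1919 False
# iyr:2010 True
# eyr:02025 False
# byr:abcd False
# hgt:170cm True
```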

There are two problems with your function:
Instead of doing complicated string slicing to check which field it is, just do string in line (for example, "byr" in line). It's much simpler and more readable.
The expression 5 > x > 7, as wim mentioned, is equivalent to 5 > x and x > 7. Since x cannot be both smaller than 5 and greater than 7, this is never True. Do x > 7 or x < 5 instead.
Below is the corrected code which has the correct output:
def check_byr_iyr_eyr(line):
    statement = True
    if "byr" in line:
        if (len(line[line.index(':')+1:]) != 4 or
                int(line[line.index(':')+1:]) > 2002 or
                int(line[line.index(':')+1:]) < 1920):
            statement = False
    elif "iyr" in line:
        if (len(line[line.index(':')+1:]) != 4 or
                int(line[line.index(':')+1:]) > 2020 or
                int(line[line.index(':')+1:]) < 2010):
            statement = False
    elif "eyr" in line:
        if (len(line[line.index(':')+1:]) != 4 or
                int(line[line.index(':')+1:]) > 2030 or
                int(line[line.index(':')+1:]) < 2020):
            statement = False
    return statement

list = ['byr:1919', 'iyr:2010', 'eyr:2021', 'iyr:2019', 'iyr:1933',
        'byr:1946', 'iyr:1919', 'eyr:2005']
for i in list:
    print(check_byr_iyr_eyr(i))
'''
expected result:
False
True
True
True
False
True
False
False
'''
Check out Tenacious B's answer for a much more elegant solution!

Related

In Python, does 'or not' have the precedence of 'not' or of 'or'?

First time asking for help here.
I am going through Kaggle's Python exercises, and the Booleans section has this example:
prepared_for_weather = have_umbrella or rain_level < 5 and have_hood or not rain_level > 0 and is_workday
They say that, to make the code more readable, it helps to use parentheses this way:
prepared_for_weather = have_umbrella or (rain_level < 5 and have_hood) or not (rain_level > 0 and is_workday)
I am confused about why (considering that the order of operator precedence is not, and, or) the and after rain_level > 0 at the end of the expression has higher precedence than the not before it.
Is it because the not should be considered as 'or not', giving it lower precedence than the and?
I hope this is clear enough, thanks!
Having read in Python's documentation that the order of precedence is not, and, or, I expected the not just before rain_level > 0 to have higher precedence than the and after it.
As stated in the docs, not x has precedence over and.
The suggested expression with parentheses "for readability" is not equivalent to the first one, e.g.
have_umbrella = False
have_hood = False
is_workday = False
rain_level = 3
# first
prepared_for_weather = have_umbrella or rain_level < 5 and have_hood or not rain_level > 0 and is_workday
print(prepared_for_weather)
# Kaggle's
prepared_for_weather = have_umbrella or (rain_level < 5 and have_hood) or not (rain_level > 0 and is_workday)
print(prepared_for_weather)
# mine from comments
prepared_for_weather = have_umbrella or rain_level < 5 and have_hood or ((not rain_level > 0) and is_workday)
print(prepared_for_weather)
output
False
True
False
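The same point can be checked exhaustively with a small truth table. This is only an illustrative sketch, and the three function names are invented for it. It shows that `not a and b` always agrees with `(not a) and b`, and differs from `not (a and b)` in some rows.

```python
from itertools import product


def unparenthesized(a, b):
    return not a and b


def not_first(a, b):
    return (not a) and b


def and_first(a, b):
    return not (a and b)


for a, b in product([False, True], repeat=2):
    print(a, b, unparenthesized(a, b), not_first(a, b), and_first(a, b))
# False False False False True
# False True True True True
# True False False False True
# True True False False False
```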

In Python, how can you compare two CalVer strings to determine if one is greater than, less than, or equal to the other?

I have the occasional need to adjust my python scripts based on the versions of various dependencies. Most often in my case, a python codebase works alongside front-end javascript that may be running releases spanning multiple years. If a javascript dependency has a version greater than A, the python should do B. If the dependency has a version less than X, the python should do Y, etc.
These dependencies are calendar versioned (CalVer). While I've located many tools for maintaining a project's own CalVer, I was unable to find a ready-made solution to evaluate CalVers in this fashion.
if "YY.MM.DD" > "YY.MM.DD.MICRO":
    # Do this thing
else:
    # Do that thing
Comparing dates is easy enough, but when MICRO versions come into the mix, things get more complex.
The Python Packaging Authority (PyPA) maintains the packaging library, which, among other things, implements version handling according to PEP 440 ("Version Identification and Dependency Specification"), including Calendar Versioning.
Examples (taken from Dennis's answer):
>>> from packaging import version
>>> version.parse('2021.01.31') >= version.parse('2021.01.30.dev1')
True
>>> version.parse('2021.01.31.0012') >= version.parse('2021.01.31.1012')
False
I ended up writing my own solution to allow me to compare CalVer strings like below.
subject = "2021.01.31"
test = "2021.01.30.dev1"
if calver_evaluate(operator="gte", subject=subject, test=test):
    # if "2021.01.31" >= "2021.01.30.dev1"
    result = True
subject = "2021.01.31.0012"
test = "2021.01.31.1012"
if calver_evaluate(operator="gte", subject=subject, test=test):
    # if "2021.01.31.0012" >= "2021.01.31.1012"
    result = False
Full details on the operations are included in the function's docstring. Note some of the limited rules around evaluating micros that cannot be converted to integers.
import datetime


def calver_evaluate(operator=None, subject=None, test=None):
    """Evaluates two calver strings based on the operator.

    Params
    ------
    operator : str
        Defines how to evaluate the subject and test params. Acceptable values are:
        - "gt" or ">" for greater than
        - "gte" or ">=" for greater than or equal to
        - "e", "eq", "equal", "=", or "==" for equal to
        - "lt" or "<" for less than
        - "lte" or "<=" for less than or equal to
    subject : str
        A calver string formatted as YYYY.0M.0D.MICRO (recommended) or YY.MM.DD.MICRO.
        https://calver.org/calendar_versioning.html
    test : str
        A calver string to evaluate against the subject, formatted as YYYY.0M.0D.MICRO
        (recommended) or YY.MM.DD.MICRO.
        https://calver.org/calendar_versioning.html

    Returns
    -------
    bool
        The results of the `subject`:`test` evaluation using the `operator`.

    Notes
    -----
    The MICRO segment of the calver strings are only considered in the following
    scenarios.
    1. One calver has a MICRO value and the other does not. The calver without a
       MICRO value is evaluated as `0`, making the calver *with* the MICRO, no matter
       what the value, as the greater of the two.
       `2021.01.01 == 2021.01.01.0`, therefore `2021.01.01.2 > 2021.01.01` and
       `2021.01.01.dev1 > 2021.01.01`
    2. Both calvers have MICRO values that are numeric and able to be converted to
       integers.
    3. Both calvers have string MICRO values **and** the operator selected is
       "equals".
    """
    if not operator or not subject or not test:
        raise Exception("calver_evaluate: Missing keyword argument.")

    allowed = ["lt","<","lte","<=","e","eq","equal","=","==","gte",">=","gt",">"]
    if operator not in allowed:
        raise Exception("calver_evaluate: Unrecognized evaluation operator.")

    # Split the subject into a date and an optional MICRO segment
    sparts = subject.split(".")
    syear = int(sparts[0]) if int(sparts[0]) > 100 else int(sparts[0]) + 2000
    smonth = int(sparts[1])
    sday = int(sparts[2])
    sdate = datetime.date(syear, smonth, sday)
    smicro = sparts[3] if len(sparts) > 3 else 0

    # Same for the test string
    tparts = test.split(".")
    tyear = int(tparts[0]) if int(tparts[0]) > 100 else int(tparts[0]) + 2000
    tmonth = int(tparts[1])
    tday = int(tparts[2])
    tdate = datetime.date(tyear, tmonth, tday)
    tmicro = tparts[3] if len(tparts) > 3 else 0

    # Convert both MICRO values to integers when both consist only of digits;
    # otherwise a missing MICRO (0) always ranks below a present one (1).
    if str(smicro).isdigit() and str(tmicro).isdigit():
        smicro = int(smicro)
        tmicro = int(tmicro)
    elif smicro == 0:
        tmicro = 1
    elif tmicro == 0:
        smicro = 1

    lt = ["lt","<"]
    lte = ["lte","<="]
    equal = ["e","eq","equal","=","=="]
    gte = ["gte",">="]
    gt = ["gt",">"]

    check_micro = (
        (
            isinstance(smicro, int) and isinstance(tmicro, int) and
            (smicro > 0 or tmicro > 0)
        ) or
        (
            operator in equal and
            not isinstance(smicro, int) and
            not isinstance(tmicro, int)
        )
    )

    def evaluate_micro(operator, smicro, tmicro):
        if operator in lt:
            if smicro < tmicro:
                return True
        elif operator in lte:
            if smicro <= tmicro:
                return True
        elif operator in equal:
            if smicro == tmicro:
                return True
        elif operator in gte:
            if smicro >= tmicro:
                return True
        elif operator in gt:
            if smicro > tmicro:
                return True
        return False

    if operator in lt and sdate <= tdate:
        if sdate < tdate:
            return True
        elif sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
    elif operator in lte and sdate <= tdate:
        if sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
        return True
    elif operator in equal:
        if sdate == tdate:
            if check_micro:
                return evaluate_micro(operator, smicro, tmicro)
            return True
    elif operator in gte and sdate >= tdate:
        if sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
        return True
    elif operator in gt and sdate >= tdate:
        if sdate > tdate:
            return True
        elif sdate == tdate and check_micro:
            return evaluate_micro(operator, smicro, tmicro)
    return False
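For CalVer strings whose segments are all numeric, a much shorter alternative is to turn each version into a tuple of integers and let Python's built-in tuple comparison do the work. This is only a sketch under that assumption. The helper name `calver_key` is made up, non-numeric micros such as dev1 are not handled, and a missing MICRO is padded with 0 to match rule 1 of the docstring.

```python
def calver_key(version):
    """Turn 'YYYY.MM.DD[.MICRO]' into a tuple of ints; a missing MICRO counts as 0."""
    parts = [int(part) for part in version.split(".")]
    while len(parts) < 4:
        parts.append(0)
    return tuple(parts)


# Tuples compare element by element, so the usual operators just work.
print(calver_key("2021.01.31") >= calver_key("2021.01.30.5"))          # True
print(calver_key("2021.01.31.0012") >= calver_key("2021.01.31.1012"))  # False
print(calver_key("2021.01.01") == calver_key("2021.01.01.0"))          # True
```

Because the result is an ordinary tuple, it can also be passed as the key function to sorted() or max() when ordering a list of versions.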

Get index number when condition is true in 3 columns

I've a question regarding some code in python. I'm trying to extract the index of the first row when the condition TRUE is satisfied in 3 different columns. This is the data I'm using:
0 1 2 3 4
0 TRUE TRUE TRUE 0.41871395 0.492517879
1 TRUE TRUE TRUE 0.409863582 0.519425031
2 TRUE TRUE TRUE 0.390077415 0.593127232
3 FALSE FALSE FALSE 0.372020631 0.704367199
4 FALSE FALSE FALSE 0.373546556 0.810876797
5 FALSE FALSE FALSE 0.398876919 0.86855678
6 FALSE FALSE FALSE 0.432142094 0.875576037
7 FALSE FALSE FALSE 0.454115421 0.863063448
8 FALSE TRUE FALSE 0.460676901 0.855739006
9 FALSE TRUE FALSE 0.458693197 0.855128636
10 FALSE FALSE FALSE 0.459201839 0.856451104
11 FALSE FALSE FALSE 0.458693197 0.855739006
12 FALSE FALSE FALSE 0.458082827 0.856349376
13 FALSE FALSE FALSE 0.456556902 0.856959746
14 TRUE TRUE TRUE 0.455946532 0.858180486
15 TRUE TRUE TRUE 0.455030976 0.858790857
16 TRUE TRUE TRUE 0.454725791 0.858485672
17 FALSE FALSE FALSE 0.454420606 0.857875301
18 FALSE FALSE FALSE 0.454725791 0.858383943
19 FALSE TRUE FALSE 0.453199866 0.856654561
20 FALSE FALSE FALSE 0.451979125 0.856349376
21 FALSE FALSE FALSE 0.45167394 0.856959746
22 FALSE FALSE FALSE 0.451775669 0.857570116
23 FALSE FALSE FALSE 0.45106357 0.857264931
24 TRUE TRUE TRUE 0.450758385 0.856654561
25 TRUE TRUE TRUE 0.4504532 0.856044191
26 TRUE TRUE TRUE 0.449232459 0.856349376
27 TRUE TRUE TRUE 0.448316904 0.855535549
and I need to get the index number only when there are 3 'True' conditions:
0
14
24
Thank you!
I guess everyone missed the "extract the index of the first row" part. One way is to remove consecutive duplicates first and then take the index where all three columns are True, so that you only get the first row of each run of True rows.
df=df[['0', '1', '2']]
df=df[df.shift()!=df].dropna().all(axis=1)
print(df[df].index.tolist())
OUTPUT:
[0, 14, 24]
I tried this on a demo dataframe and it seems to work for me.
df = pd.DataFrame(data={'A':[True,True,True,True,True,False,True,True],'B':[True,True,False,True,True,False,True,True],'C':[True,False,True,True,True,False,True,True]})
i =df[(df['A']==True) & (df['B']==True) & (df['C']==True)].index.to_list()
i = [x for x in i if x-1 not in i]
EDIT 2: I have a new answer in response to some clarifications.
You're looking for each row that has TRUE in columns 0, 1, and 2, BUT you'd like to ignore such rows that are not the first in a streak of them. The first part of my answer is still the same: I think you should create a mask that selects your TRUE triplet rows:
condition = df[[0, 1, 2]].all(axis='columns')
But now I present a possible way to filter out the rows you want to ignore. To be not-first in a streak of TRUE triplet rows means that the previous row also satisfies condition.
idx = df[condition].index
ignore = idx.isin(idx + 1)
result = idx[~ignore]
In other words, ignore rows where the index value is the successor of an index value satisfying condition.
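The same "first row of each streak" idea can be sketched without pandas, which may help when checking the logic by hand. The function name `streak_starts` and the sample rows are invented for this illustration. Note that it reports the start of every streak of all-True rows, including streaks of length one.

```python
def streak_starts(rows):
    """Return positions of rows that are all True and whose previous row is not."""
    condition = [all(row) for row in rows]
    return [
        i for i, ok in enumerate(condition)
        if ok and (i == 0 or not condition[i - 1])
    ]


rows = [
    (True, True, True),     # 0: starts a streak
    (True, True, True),     # 1
    (False, False, False),  # 2
    (False, True, False),   # 3: not all True
    (True, True, True),     # 4: starts a streak
    (True, True, True),     # 5
    (False, False, False),  # 6
    (True, True, True),     # 7: streak of length one
]
print(streak_starts(rows))  # [0, 4, 7]
```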
Hope this helps!
Keeping my original answer for record keeping:
I think you'll end up with the most readable solution by breaking this out into two steps:
First, find out which rows have the value True for all of the columns you're interested in:
condition = df[[0, 1, 2]].all(axis='columns')
Then, the index values you're interested in are simply df[condition].index.
EDIT: if, as Benoit points out may be the case, TRUE and FALSE are strings, that's fine, you just need a minor tweak to the first step:
condition = (df[[0, 1, 2]] == 'TRUE').all(axis='columns')
If the TRUE and FALSE in your DataFrame are actually the boolean values True and False then,
#This will look at the first 3 columns and return True if "all" are True else it will return False:
step1 = [all(q) for q in df[[0,1,2]].values]
id = []
cnt = 0
temp_cnt = 0
#this loop finds where the value is true and checks if the next 2 are also true
#it then appends the count-2 to a list named id, the -2 compensates for the index.
for q in step1:
    if q:
        cnt += 1
        if cnt == 3:
            id.append(temp_cnt - 2)
    else:
        cnt = 0
    temp_cnt += 1
#Then when printing "id" it will return the first index where AT LEAST 3 True values occur in sequence.
id
Out[108]: [0, 14, 24]
I think this could do the trick. As a general advice though, it always helps to name the columns in pandas.
Say that your pandas data frame is named data:
data[(data[0] == True) & (data[1] == True) & (data[2] == True)].index.values
or
list(data[(data[0] == True) & (data[1] == True) & (data[2] == True)].index.values)
Based on the answer here, something like this will provide a list of indices for the rows that meet all conditions:
df[(df[0]==True) & (df[1]==True) & (df[2]==True)].index.tolist()
The following will work regardless of the position of the 3 columns you wish to check for True values, and gives you back a list indicating which rows have 3 True values present:
Edit:
Now updated to better align with the OP's original request:
#df.iloc[:,:3] = df.iloc[:,:3].apply(lambda x: str(x) == "TRUE") # If necessary
s = (df == True).apply(sum, axis=1) == 3
s = s[s.shift() != s]
s.index[s].tolist()

Writing and using your own functions - basics

Your task is to write and test a function which takes two arguments (a year and a month) and returns the number of days for the given month/year pair (yes, we know that only February is sensitive to the year value, but we want our function to be universal). Now, convince the function to return None if its arguments don't make sense.
Use a list filled with the months' lengths. You can create it inside the function - this trick will significantly shorten the code.
I have got the code down but not the 'none' part. Can someone help me with this?
def IsYearLeap(year):
    if (year%4==0):
        return True
    if (year%4!=0):
        return False

def DaysInMonth(year,month):
    if month in {1, 3, 5, 7, 8, 10, 12}:
        return 31
    elif month==2:
        if IsYearLeap(year):
            return 29
        else:
            return 28
    elif month in {4,6,8,9,11}:
        return 30
    else:
        return none

testyears = [1900, 2000, 2016, 1987,2019]
testmonths = [ 2, 2, 1, 11,4]
testresults = [28, 29, 31, 30,33]
for i in range(len(testyears)):
    yr = testyears[i]
    mo = testmonths[i]
    print(yr,mo,"->",end="")
    result = DaysInMonth(yr,mo)
    if result == testresults[i]:
        print("OK")
    else:
        print("Failed")
It seems you have made a simple mistake. If you are not used to case-sensitive programming languages or have little programming experience, this is understandable.
The keyword None is misspelled as the undefined name none.
I think your testresults is wrong: April 2019 has 30 days, not 33. February 1900 does have 28 days, because 1900 was not a leap year, but your IsYearLeap treats it as one and returns 29. Also, it's None instead of none. Another thing: you could use lists for the months, i.e. [1, 3, 5, 7, ...] instead of {1, 3, 5, 7, ...}.
Also, none of your test cases will produce None. To cover that case, test with month = 13.
As a further comment to the other good answers to this question, the correct rule for leap years should be something like:
def is_leap_year(year):
    """ is it a leap year?

    >>> is_leap_year(1984)
    True
    >>> is_leap_year(1985)
    False
    >>> is_leap_year(1900)
    False
    >>> is_leap_year(2000)
    True
    """
    return (year % 4 == 0 and
            (year % 100 != 0 or year % 400 == 0))
Similarly, the test cases need to be clear that 1900 was not a leap year, 2000 was. I recommend writing a separate set of test cases for is_leap_year. Ultimately, in production code, you will be better off to use one of the many time/date libraries. The comments that I've provided make use of doctest to provide this unit test quickly.
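As one example of leaning on a library, the standard-library calendar module already knows the leap-year rule and the month lengths. The wrapper below is a minimal sketch, and the name `days_in_month_stdlib` is invented here. The explicit month check is there so that invalid input returns None, as the exercise asks, rather than raising an exception.

```python
import calendar


def days_in_month_stdlib(year, month):
    # Return None for nonsensical months, as the exercise requires.
    if not isinstance(month, int) or not 1 <= month <= 12:
        return None
    # monthrange() returns (weekday of the first day, number of days in the month).
    return calendar.monthrange(year, month)[1]


print(calendar.isleap(1900), calendar.isleap(2000))  # False True
print(days_in_month_stdlib(1900, 2))   # 28
print(days_in_month_stdlib(2000, 2))   # 29
print(days_in_month_stdlib(2019, 13))  # None
```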
A function which does not explicitly return anything implicitly returns None.
In addition to the spelling error (none vs None) you are using this by accident here:
def IsYearLeap(year):
    if (year%4==0):
        return True
    if (year%4!=0):
        return False
Can you see what will happen if neither of the conditions is true? It won't return either False or True, which presumably the caller expects. (Though if you check whether None == True you will get False, and not None is True, so you won't get a syntax error, just a result which might be different from what you expect - the worst kind of bug!)
def IsYearLeap(year):
    # Use the boolean operators and/or here: the bitwise & and | bind more
    # tightly than == and !=, which would change the meaning of the expression.
    return year % 4 == 0 and (year % 400 == 0 or year % 100 != 0)

def DaysInMonth(year,month):
    if month in [1, 3, 5, 7, 8, 10, 12]:
        return 31
    elif month==2:
        if IsYearLeap(year):
            return 29
        else:
            return 28
    elif month in [4,6,9,11]:
        return 30
    else:
        return None
#
testYears = [1900, 2000, 2016, 1987]
testMonths = [2, 2, 1, 11]
testResults = [28, 29, 31, 30]
for i in range(len(testYears)):
    yr = testYears[i]
    mo = testMonths[i]
    print(yr, mo, "->", end="")
    result = DaysInMonth(yr, mo)
    if result == testResults[i]:
        print("OK")
    else:
        print("Failed")
def is_year_leap(year):
    return year % 4 == 0 and year % 100 != 0 or year % 400 == 0

def days_in_month(year, month):
    days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    if type(year) != int or year < 1582 or\
            type(month) != int or month < 1 or month > 12:
        return None
    elif is_year_leap(year):
        # February has 29 days in a leap year
        days[1] = 29
    return days[month - 1]

test_years = [1900, 2000, 2016, 1987]
test_months = [2, 2, 1, 11]
test_results = [28, 29, 31, 30]
for i in range(len(test_years)):
    yr = test_years[i]
    mo = test_months[i]
    print(yr, mo, "->", end="")
    result = days_in_month(yr, mo)
    if result == test_results[i]:
        print("OK")
    else:
        print("Failed")
A few minor remarks:
You should remove the duplicate 8th month (August is listed for both 30 and 31 days),
Better replace the set braces {} with list brackets [],
Replace none with None (Python is case-sensitive, None is the keyword),
Add the century conditions to the leap-year check:
(year % 400 == 0) and (year % 100 == 0) -> return True
(year % 4 == 0) and (year % 100 != 0) -> return True

Strange logic with bool

I can't understand one thing with logic in python. Here is the code:
maxCounter = 1500
localCounter = 0
while True:
    print str(localCounter) + ' >= ' + str(maxCounter)
    print localCounter >= maxCounter
    if localCounter >= maxCounter:
        break
    localCounter += 30
And the result output:
...
1440 >= 1500
False
1470 >= 1500
False
1500 >= 1500
False
1530 >= 1500
False
1560 >= 1500
False
...
So I end up with an infinite loop. Why?
topPos = someClass.get_element_pos('element')
scrolledHeight = 0
while True:
    print str(scrolledHeight) + ' >= ' + str(topPos)
    print scrolledHeight >= topPos
    if scrolledHeight >= topPos:
        print 'break'
        break
    someClass.run_javascript("window.scrollBy(0, 30)")
    scrolledHeight += 30
    print scrolledHeight
    time.sleep(0.1)
To fix your code try this:
topPos = int(someClass.get_element_pos('element'))
Why?
When I copy and paste your original code I get this:
...
1440 >= 1500
False
1470 >= 1500
False
1500 >= 1500
True
One small change that I can find to make to your code that reproduces the behaviour you are seeing is to change the first line to this:
maxCounter = '1500' # string instead of integer
After making this change I can also see the output you get:
1410 >= 1500
False
1440 >= 1500
False
1470 >= 1500
False
1500 >= 1500
False
1530 >= 1500
False
etc..
The problem seems to be at this line:
topPos = someClass.get_element_pos('element')
This is likely to assign a string to topPos, instead of a numeric variable. You need to convert this string to a numeric variable so you can do a numeric comparison against it.
topPos = int(someClass.get_element_pos('element'))
Otherwise, in the CPython implementation of Python 2.7 for example, any int is always going to compare less than any string.
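For comparison, Python 3 refuses to order an int against a str and raises a TypeError, so this kind of bug surfaces immediately instead of looping forever. The sketch below uses Python 3 syntax, unlike the Python 2 code in the question. The helper name `reached` and the literal "1500" are stand-ins for the value returned by get_element_pos, which is assumed here to be a numeric string.

```python
def reached(scrolled, top_pos):
    # int() accepts both numbers and numeric strings such as "1500".
    return scrolled >= int(top_pos)


top_pos = "1500"  # assumption: the position arrives as a string

try:
    0 >= top_pos
except TypeError as exc:
    # Python 3 does not silently order int against str.
    print("TypeError:", exc)

scrolled = 0
while not reached(scrolled, top_pos):
    scrolled += 30
print(scrolled)  # 1500
```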
Related questions
How does Python compare string and int?
