As a beginner in Python I decided to have a go at the Codewars puzzles.
Codewars uses Python 2.7.6.
The second puzzle requires you to:
Write a function that will return the count of distinct case-insensitive alphabetic characters and numeric digits that occur more than once in the input string. The input string can be assumed to contain only alphabets (both uppercase and lowercase) and numeric digits.
For example, if you give the program "abcde" it should give you 0, because there are no duplicates. But, if you give it "indivisibilities" it should give you 2, because there are 2 duplicate letters: i (occurs 7 times) and s (occurs twice).
As a beginner I came up with an approach that I imagine is very crude, but nevertheless it works perfectly on my system:
def duplicate_count(text):
    # the number of duplicates
    dupes = 0
    # convert input string to lower case and split into individual characters
    list_of_chars = list(text.lower())
    # sort list into groups
    sorted_chars = sorted(list_of_chars)
    # get length of list
    n = len(sorted_chars)
    # check whether the first element of the list is the same as the second. If
    # it is, add one to the dupes count
    if sorted_chars[0] == sorted_chars[1]:
        dupes += 1
    else:
        dupes += 0
    # start with the second element (index: 1) and finish with the (n - 1)-th
    # element
    for i in range(1, n - 1):
        # if the ith element of the list is the same as the next one, add one
        # to the dupes count. However, since we only want to count each
        # duplicate once, we must check that the ith element is not the same as
        # the previous one
        if sorted_chars[i] == sorted_chars[i + 1] and sorted_chars[i] != sorted_chars[i - 1]:
            dupes += 1
        else:
            dupes += 0
    return dupes
This passes all of the automated tests, but when I submit this as a solution I get an STDERR:
Traceback:
in <module>
in duplicate_count
IndexError: list index out of range
As I understand it, this error is given if I try and access an element of the list that does not exist. But I cannot see where in my code I am doing that. I calculate the length of my list and store it in n. So let's say I supply the string "ababa" to duplicate_count, it should generate a list sorted_chars: ['a', 'a', 'a', 'b', 'b'] of length 5. So n = 5. Therefore range(1, n - 1) = range(1, 4) which will generate the numbers 1, 2 and 3. Thus for i in range(1, n - 1) is, mathematically speaking, for each i ϵ I = {1, 2, 3}. The largest index I therefore use in this code is 4 (if sorted_chars[i] == sorted_chars[i + 1]), which is fine, because there is an element at index 4 (in this case 'b').
Why, then, is Codewars giving me this error?
In this case, your function requires at least two characters to work, because it reads sorted_chars[0] and sorted_chars[1] before the loop starts. Try running duplicate_count('a') (or duplicate_count('')) and see the error it throws. Add the following after n = len(sorted_chars):
if n < 2:
    return 0
That will stop running the rest of the function and return 0 duplicates (because you can't have any if there are fewer than two characters).
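Putting the guard into the original sort-and-compare approach could look like the sketch below. It is written for Python 3, folds the separate first-element check into the loop (the helper name starts_run is made up for illustration), and assumes that 0 is the right answer for inputs shorter than two characters.

```python
def duplicate_count(text):
    # Lower-case the input and sort it so equal characters sit next to each other.
    sorted_chars = sorted(text.lower())
    n = len(sorted_chars)
    # Guard: with fewer than two characters there can be no duplicates.
    if n < 2:
        return 0
    dupes = 0
    for i in range(n - 1):
        # A run starts at index 0 or wherever the previous character differs.
        starts_run = i == 0 or sorted_chars[i] != sorted_chars[i - 1]
        # Count each duplicated character once, at the start of its run.
        if starts_run and sorted_chars[i] == sorted_chars[i + 1]:
            dupes += 1
    return dupes


print(duplicate_count("indivisibilities"))
print(duplicate_count("a"))
```

The two calls at the end exercise the example from the puzzle text and the single-character input that triggered the IndexError.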
I'm trying to write a function that returns the length of the longest run of repetition in a given list
Here is my code:
def longest_repetition(a):
    longest = 0
    j = 0
    run2 = 0
    while j <= len(a)-1:
        for i in a:
            run = a.count(a[j] == i)
            if run == 1:
                run2 += 1
        if run2 > longest:
            longest = run2
        j += 1
        run2 = 0
    return longest
print(longest_repetition([4,1,2,4,7,9,4]))
print(longest_repetition([5,3,5,6,9,4,4,4,4]))
3
0
The first test function works fine, but the second test function is not counting at all and I'm not sure why. Any insight is much appreciated
Just noticed that the question I was given and the expected results are not consistent. So what I'm basically trying to do is find the most repeated element in a list and the output would be the number of times it is repeated. That said, the output for the second test function should be 4 because the element '4' is repeated four times (elements are not required to be in one run as implied in my original question)
First of all, let's check if you were consistent with your question (function that returns the length of the longest run of repetition):
e.g.:
a = [4,1,2,4,7,9,4]
b = [5,3,5,6,9,4,4,4,4]
(I am assuming you only compare single elements; e.g. c = [1,2,3,1,2,3] could be seen as one repetition of the sequence 1,2,3, but I assume that is not your goal.)
So:
for a, there are no repetitions of the same value, therefore the length equals 0
for b, you have one quadruple repetition of 4, therefore the length equals 4
First, set max_amount_of_repetitions = 0 and current_repetitions_run = 0. To detect a repetition, simply check whether the (n-1)th and nth elements have the same value. If so, increment current_repetitions_run; otherwise, reset current_repetitions_run = 0.
The last step is to check whether your current run is the longest of all:
max_amount_of_repetitions = max(max_amount_of_repetitions, current_repetitions_run)
To be sure that both n-1 and n are within your list range, I'd simply start the iteration from the second element. That way, n-1 is the first element.
for n in range(1, len(a)):
    if a[n-1] == a[n]:
        print("I am sure, you can figure out the rest")
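For reference, one way the remaining steps might be filled in is sketched below (Python 3). To reproduce the lengths stated above (0 for a, 4 for b), it assumes that the first matching pair starts a run of length 2, since two equal elements are involved; that starting value is my assumption and is not spelled out in the description.

```python
def longest_repetition(a):
    max_amount_of_repetitions = 0
    current_repetitions_run = 0
    # Start at the second element so that a[n - 1] is always a valid index.
    for n in range(1, len(a)):
        if a[n - 1] == a[n]:
            # Assumption: the first matching pair counts as a run of 2,
            # every further match extends the run by 1.
            if current_repetitions_run == 0:
                current_repetitions_run = 2
            else:
                current_repetitions_run += 1
        else:
            current_repetitions_run = 0
        max_amount_of_repetitions = max(max_amount_of_repetitions,
                                        current_repetitions_run)
    return max_amount_of_repetitions


print(longest_repetition([4, 1, 2, 4, 7, 9, 4]))
print(longest_repetition([5, 3, 5, 6, 9, 4, 4, 4, 4]))
```

Note that this measures adjacent runs only; it does not count how often a value appears in the list overall.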
You can use a hash table (a dictionary) to count how often each element occurs and then take the maximum of those frequencies.
Using a functional approach:
from collections import Counter
def longest_repetition(array):
    return max(Counter(array).values())
Another way, without using Counter:
def longest_repetition(array):
    freq = {}
    for val in array:
        if val not in freq:
            freq[val] = 0
        freq[val] += 1
    values = freq.values()
    return max(values)
can someone explain this function to me?
#from the geeksforgeeks website
def isPalimdrome(str):
    for i in range(0, int(len(str)/2)):
        if str[i] != str[len(str)-i-1]:
            return False
    return True
I don't understand the for loop and the if statement.
A - why is the range from 0 to the length of the string divided by 2?
B - what does "str[len(str)-i-1]" do?
//sorry, ik I ask stupid questions
To determine whether a string is a palindrome, we can split it in half and compare each letter of the first half with its mirror-image letter in the second half.
Consider the example
string ABCCBA
The range in the for loop sets this up by iterating over only the first n/2 characters. int(n/2) is used to force an integer, because range does not accept a float (question A).
ex_str = 'ABCCBA'
for s in range(int(len(ex_str)/2)):
    print(ex_str[s])
A
B
C
we now have to look at the letters in the other half, CBA, in reverse order
adding an index to our example to visualize this
string ABCCBA
index 012345
to determine if string is a palindrome, we can compare indices 0 to 5, 1 to 4, and 2 to 3
len(str)-i-1 gives us the correct index of the other half for each i (question B)
example:
ex_str = 'ABCCBA'
for s in range(int(len(ex_str)/2)):
    print(f'compare index {s} to index {len(ex_str)-s-1}')
    print(f"{ex_str[s]} to {ex_str[len(ex_str) - s - 1]}")
compare index 0 to index 5
A to A
compare index 1 to index 4
B to B
compare index 2 to index 3
C to C
for i in range(0, int(len(str)/2)):
Iterate through (go one by one from) 0 (because the first letter of a string has index 0) to half the length of the string.
Why to only half length?
Because in a palindrome you need to compare only half length of string to the other half.
e.g., RADAR. 0=R, 1=A, 2=D, 3=A, 4=R. Number of letters = 5.
int(len(str)/2) will evaluate to 2. So first two letters will be compared with last two letters and middle one is common so will not be compared.
if str[i] != str[len(str)-i-1]:
Now, length of string is 5 but index of letters in string goes from 0 to 4, which is why len(str)-1 (5-1 = 4, i.e., last letter R).
len(str)-1-i Since i is a loop variable, it will be incremented by 1 every time for loop runs. In first run i is 0, in second 1....
The for loop will run two times.
str[i] != str[len(str)-1-i] will be evaluated as-
0 != 4 i.e. R != R FALSE
1 != 3 i.e. A != A FALSE
This code is not very readable and can be simplified as pointed out by others. This also reflects why code readability is important.
1. why is the range from 0 to length of the string divided by 2?
That's because we don't need to iterate all the way through the string but just halfway through it.
2. what does "str[len(str)-i-1]" do?
It returns the ith element counting from the end, i.e. for the string "noon", when i is 0 it gets str[3], which is n.
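The same "ith element from the end" can also be written with a negative index, since text[-(i + 1)] and text[len(text) - i - 1] refer to the same position. The sketch below (Python 3) shows the loop in that form; the names is_palindrome and text are made up here so that the built-in str is not shadowed.

```python
def is_palindrome(text):
    # Only the first half needs checking; // keeps the result an integer.
    for i in range(len(text) // 2):
        # text[-(i + 1)] is the i-th character counted from the end,
        # the same position as text[len(text) - i - 1].
        if text[i] != text[-(i + 1)]:
            return False
    return True


print(is_palindrome("noon"))
print(is_palindrome("RADAR"))
print(is_palindrome("ABCA"))
```

For a string of odd length the middle character is never compared, which is fine because it mirrors onto itself.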
The easiest way to check for a palindrome is this:
def isPalimdrome(s):
    return s == s[::-1]
A palindrome reads the same from the beginning as it does in reverse.
I am trying to solve the question given in this video https://www.youtube.com/watch?reload=9&v=XCeDBWI4sa4
My list contains sub-lists that constitute each digit of a number of the type strings.
Example: I turned my list of strings
['58','12','50','17'] into four sub-lists like so [['5','8'],['1','2'],['5','0'],['1','7']] because I want to compare the first digit of each number and if the first digits are equal, I increment the variable "pair" which is currently 0. pair=0
Since 58 and 50 have the same first digit, they constitute a pair, same goes for 12 and 17. Also, a pair can only be made if both the numbers are at either even position or odd position. 58 and 50 are at even indices, hence they satisfy the condition. also, at most two pairs can be made for the same first digit. So 51,52, 53 would constitute only 2 pairs instead of three. How do I check this? A simple solution will be appreciated.
list_1=[['5','8'],['1','2'],['5','0'],['1','7']]
and test_list= ['58','12','50','17']
for i in range(0,len(test_list)):
    for j in range(1,len(test_list)):
        if (list_1[i][0] == list_1[j][0] and (i,j%2==0 or i,j%2==1)):
            pair =pair+1
print (pair)
That is what I came up with but I am not getting the desired output.
pair = 0
val_list = ['58','12','50','17', '57', '65', '51']
first_digit, visited_item_list = list(), list()
for item in val_list:
    curr = int(item[0])
    first_digit.append(curr)
for item in first_digit:
    if item not in visited_item_list:
        occurences = first_digit.count(item)
        if occurences % 2 == 0:
            pair = pair + occurences // 2
        visited_item_list.append(item)
print(pair)
Use collections.Counter to count the occurrences of each first digit, then sum the counts and subtract the number of distinct digits (so that a digit seen only once contributes 0).
The even and odd positions are handled separately.
Uncomment #return sum(min(x, 2) for x in c) - len(c) if you want it to never count more than 2 for digit duplicates. With that cap each digit contributes at most 1 per parity group, so e.g. [51,52,53,54,56,57,58,59,50,...] will still return 2, no matter how many more 5X you add. (min(x, 2) guarantees that no count exceeds 2.)
from collections import Counter
a = ['58','12','50','17','50','18']
def dupes(a):
    c = Counter(a).values() # count instances of each element in a, get the counts
    #return sum(min(x, 2) for x in c) - len(c) # cap each count at 2
    return sum(c) - len(c) # sum up all the counts, subtract unique elements (you want the counts starting from 0)
even = dupes(a[x][0] for x in range(0, len(a), 2))
# a[x][0]: first digit of even a elements
# range(0, len(a), 2): range of numbers from 0 to length of a, skip by 2 (evens)
# call dupes(first digits of the even-indexed elements)
odd = dupes(a[x][0] for x in range(1, len(a), 2))
# same for odd
print(even+odd)
Here's a fairly simple solution:
import collections
l= [['5','8'],['1','2'],['5','0'],['1','7']]
c = collections.Counter([i[0] for i in l])
# Counter counts the occurrences of items in a list (or other
# collection). After the previous line, c is
# Counter({'5': 2, '1': 2})
sum([c-1 for c in c.values()])
The output, in this case, is 2.
I have a nested list, with every second element having varying lengths:
lst = [['a','bcbcbcbcbc'],['e','bbccbbccb'],['i','ccbbccbb'],['o','cbbccbb']]
My output is a csv of dataframe with this look:
comparison similarity_score
a:e *some score
a:i *some score
a:o *some score
e:i *some score
e:o *some score
i:o *some score
my code:
similarity = []
for i in lst:
    name = i[0]
    string = i[1]
    score = 0.0
    length =(len(string))
    for i in range(length):
        if string[i]==string[i+1]:
            score += 1.0
    new_score = (100.0*score)/length
    name_seq = name[i] + ':' + name[i+1]
    similarity.append(name_seq,new_score)
similarity.pdDataFrame(similarity, columns = ['comparison' , 'similarity_score'])
similarity.to_csv('similarity_score.csv')
but I am receiving an error:
if codes[i]==codes[i+1]:
IndexError: string index out of range
any advice? thanks!
According to Python's documentation, range does the following (the example is from Python 2, where range returns a list):
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In your code (assuming variable names have not changed):
...
length =(len(string)) # For an input of 'bcb' length will be 3
for i in range(length): # For an input of 'bcb' range will be [0, 1, 2]
    if string[i]==string[i+1]: # When i == 2 i + 1 == 3 which gives you the
                               # IndexError: string index out of range
In other words, given an input bcb, your if statement will look at the following indices:
(0, 1)
(1, 2)
(2, 3) <-- The 3 in this case is your issue.
To fix your issue, iterate over range(len(string) - 1) instead, so that i stops at len(string) - 2 and i + 1 never goes past the last index.
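As a small illustration of that fix, the sketch below (Python 3) counts adjacent equal characters without running off the end of the string. The function name count_adjacent_matches is made up for this example; it covers only the inner loop of the question's code.

```python
def count_adjacent_matches(string):
    score = 0
    # Stop one short of the end so that string[i + 1] always exists.
    for i in range(len(string) - 1):
        if string[i] == string[i + 1]:
            score += 1
    return score


# For 'bcb' the index pairs checked are (0, 1) and (1, 2); (2, 3) is never reached.
print(count_adjacent_matches("bcb"))
print(count_adjacent_matches("bbccb"))
```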
I think your biggest issue is that at the top level you're just iterating on one name,string pair at a time, not a pair of name,string pairs like you want to see in your output (as shown by the paired names a:e).
You're trying to index the name and string values later on, but doing so is not achieving what you want (comparing two strings to each other to compute a score), since you're only accessing adjacent characters in the same string. The exception you're getting is because i+1 may go off the end of the string. There's further confusion since you're using i for both the index in the inner loop and as the items taken from the outer loop (the name, string pairs).
To get pairs of pairs, I suggest using itertools.combinations:
import itertools
for [name1, string1], [name2, string2] in itertools.combinations(lst, 2):
Now you can use the two name and two string variables in the rest of the loop.
I'm not entirely sure I understand how you want to compare the strings to get your score, since they're not the same length as one another. If you want to compare just the initial parts of the strings (and ignore the trailing bit of the longer one), you could use zip to get pairs of corresponding characters between the two strings. You can then compare them in a generator expression and add up the bool results (True is a special version of the integer 1 and False is a version of 0). You can then divide by the smaller of the string's lengths (or maybe the larger if you want to penalize length differences):
common_letters = sum(c1 == c2 for c1, c2 in zip(string1, string2))
new_score = common_letters * 100 / min(len(string1), len(string2))
There's one more obvious issue, where you're calling append with two arguments. If you really want to be appending a 2-tuple, you need an extra set of parentheses:
similarity.append((name_seq, new_score))
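Putting these pieces together, a minimal sketch of the whole loop might look like this (Python 3). It assumes the list holds quoted strings, that the score is the percentage of matching positions over the shorter string (one of the two options mentioned above), and that an empty string should score 0.0; the function name similarity_scores is made up, and the result is returned as a list of tuples rather than written to CSV.

```python
import itertools


def similarity_scores(lst):
    similarity = []
    # combinations(lst, 2) yields every unordered pair of [name, string] entries.
    for (name1, string1), (name2, string2) in itertools.combinations(lst, 2):
        # zip stops at the shorter string, so trailing characters are ignored.
        common_letters = sum(c1 == c2 for c1, c2 in zip(string1, string2))
        shorter = min(len(string1), len(string2))
        # Assumption: an empty string scores 0.0 instead of dividing by zero.
        new_score = common_letters * 100 / shorter if shorter else 0.0
        name_seq = name1 + ':' + name2
        similarity.append((name_seq, new_score))
    return similarity


lst = [['a', 'bcbcbcbcbc'], ['e', 'bbccbbccb'], ['i', 'ccbbccbb'], ['o', 'cbbccbb']]
for comparison, score in similarity_scores(lst):
    print(comparison, round(score, 1))
```

The resulting list of (comparison, score) tuples has the two-column shape the question wants for its DataFrame.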
I am working through the prep materials for my application to a coding bootcamp. This is a practice problem I am struggling with (using Python):
"Write a function 'lucky_sevens(numbers)', which takes in a list of integers and print True if any three consecutive elements sum to 7.
Make sure your code correctly checks for the first and last elements of the array."
I know how to loop through an array one element at a time, but don't know how to "hold" one element while also assessing the second and third elements in relation to the first, as this prompt requires. As you can see from my attempt below, I'm not sure when/where/how to increment the index values to search the whole list.
def lucky_sevens(numbers):
    index1 = 0 # For assessing 1st of 3 numbers
    index2 = index1 + 1 # For assessing 2nd of 3 numbers
    index3 = index2 + 1 # For assessing 3rd of 3 numbers
    # If all index values are within the list...
    if index1 <= (len(numbers) - 2) and index2 <= (len(numbers) - 1) and index3 <= len(numbers):
        # If the values at those indices sum to 7...
        if numbers[index1] + numbers[index2] + numbers[index3] == 7:
            print True
        else:
            print False
        # I think the increments below may be one of the places I am incorrect
        index1 += 1
        index2 += 1
        index3 += 1
When I run
lucky_sevens([2, 1, 5, 1, 0])
It is printing False, I think because it is only considering elements in the 0th, 1st and 2nd positions (sums to 8, not 7, as required).
It should print True, because elements in the 1st, 2nd and 3rd positions sum to 7. (1 + 5 + 1 = 7).
Can anyone please provide a suggestion? I would be most appreciative.
Yes, in your case it is only considering the first, second and third elements. This is because you do not have any loops in your function.
In Python the loop constructs are for and while, so you would need to use one of them.
I can give you some hints for the problem, but I am not going to provide the complete code (since otherwise how would you learn?):
1. You need to loop through the indexes from the first index (0) up to, but not including, len(numbers) - 2. An easy function that can help you do this is enumerate(); it yields the index as well as the actual element when you iterate over it with a for loop. (If you are using enumerate, you need to add a condition checking that the index is less than len(numbers) - 2.)
2. You should then get the elements at positions index+1 and index+2 as well, sum the three and check whether that equals 7; if so, you should return True.
3. A common mistake is to return False as soon as the condition in (2) is not met. What you actually need to do is return False only when there are no matches at all (at the end of the function).
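For anyone who wants to check their attempt afterwards, here is one way those three hints could be combined. It is a sketch in Python 3 that uses range instead of enumerate and returns the result rather than printing it, which is a small departure from the exercise text.

```python
def lucky_sevens(numbers):
    # Hint 1: the last valid starting index is len(numbers) - 3.
    for index in range(len(numbers) - 2):
        # Hint 2: sum the element with the two that follow it.
        if numbers[index] + numbers[index + 1] + numbers[index + 2] == 7:
            return True
    # Hint 3: report False only after every window has been checked.
    return False


print(lucky_sevens([2, 1, 5, 1, 0]))
print(lucky_sevens([7, 7, 7, 7]))
```

Lists with fewer than three elements fall straight through to return False, because range(len(numbers) - 2) is empty for them.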
You need a loop through the list to evaluate all elements. In your code, you only evaluate the first 3 elements.
Try this:
def lucky_sevens(numbers):
    # stop at len(numbers) - 2 so every slice holds exactly three elements
    for i in range(0, len(numbers) - 2):
        if sum(numbers[i:i + 3]) == 7:
            print True
            return
    print False
The reason yours doesn't work is because you're not looping it, you only check the first 3 elements in the list.
What about using recursion?
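def lucky_sevens(numbers, index=0):
    # only continue while three elements remain from index onwards
    if index <= len(numbers) - 3:
        if sum(numbers[index:index + 3]) == 7:
            return True
        else:
            # move the window one position to the right
            return lucky_sevens(numbers, index + 1)
    return False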