I am trying to find the highest-scoring word in a string, where each letter is scored by its rank in the alphabet (a=1, b=2, ..., z=26) and a word's score is the sum of its letters. For example, for the string s = 'abcd a' I intend to return 10 [a=1 + b=2 + c=3 + d=4]. But I am getting 7 as output. When I debugged the code, I noticed that in the while loop my code skips index 2 and jumps directly to index 3. Where am I going wrong? Below is my code.
class Solution(object):
    def highest_scoring_word(self,s):
        # Dictionary of English letters
        dt = {'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,
              'g':7,'h':8,'i':9,'j':10,'k':11,'l':12,
              'm':13,'n':14,'o':15,'p':16,'q':17,
              'r':18,'s':19,'t':20,'u':21,'v':22,
              'w':23,'x':24,'y':25,'z':26}
        value_sum =0
        max_value =value_sum
        for i in range(0,len(s)):
            if s.upper():
                s= s.lower()
            words = s.split()
            # convert the string in char array
            to_char_array = list(words[i])
            j=0
            while j<len(to_char_array):
                if to_char_array[j] in dt.keys() :
                    value_sum = max(dt.get(to_char_array[j]),value_sum + dt.get(to_char_array[j]))
                    max_value = max(value_sum,max_value)
                else:
                    pass
                j +=j+1
            return max_value
if __name__ == '__main__':
    p = 'abcd a'
    print(Solution().highest_scoring_word(p))
I have created a dictionary that stores every letter of the English alphabet with its value. I then split the string into words using split(), convert each word into a character array, and traverse it, looking up each character in the dictionary and adding its value to the running total. I expect to get the correct score for each word and, finally, the greatest score.
As you are using a class and methods, make use of them:
from string import ascii_lowercase as dt
class Solution(object):
    def __init__(self, data):
        self.scores = {}
        self.words = data.lower().strip().split()
    def get_scoring(self):
        # for each word calculate the score
        for word in self.words:
            score = 0
            # for each character in the word, find its index in 'a..z' and add it to score
            # same as in your dt implementation (just using index not absolute values)
            for c in word:
                score += dt.find(c) + 1
            self.scores[word] = score
        print(self.scores)
        # filter the dictionary by its greatest value in order to get the word with max score:
        return max(self.scores.keys(), key=lambda k: self.scores[k])
if __name__ == '__main__':
    p = 'abcd fg11'
    maxWord = Solution(p).get_scoring()
    print(maxWord)
Out:
{'abcd': 10, 'fg11': 13}
fg11
Try using this:
class Solution(object):
    def highest_scoring_word(self,s):
        # Dictionary of English letters
        dt = {'a':1,'b':2,'c':3,'d':4,'e':5,'f':6,
              'g':7,'h':8,'i':9,'j':10,'k':11,'l':12,
              'm':13,'n':14,'o':15,'p':16,'q':17,
              'r':18,'s':19,'t':20,'u':21,'v':22,
              'w':23,'x':24,'y':25,'z':26}
        value_sum1 =0
        max_value1 =value_sum1
        value_sum2 =0
        max_value2 =value_sum2
        for i in range(0,len(s)):
            if s.upper():
                s= s.lower()
            words = s.split()
            if len(words)>1:
                # convert the string in char array
                to_char_array = list(words[0])
                j=0
                while j<len(to_char_array):
                    if to_char_array[j] in dt.keys() :
                        value_sum1 = max(dt.get(to_char_array[j]),value_sum1 + dt.get(to_char_array[j]))
                        max_value1 = max(value_sum1,max_value1)
                    else:
                        pass
                    j=j+1
                to_char_array = list(words[1])
                j=0
                while j<len(to_char_array):
                    if to_char_array[j] in dt.keys() :
                        value_sum2 = max(dt.get(to_char_array[j]),value_sum2 + dt.get(to_char_array[j]))
                        max_value2 = max(value_sum2,max_value2)
                    else:
                        pass
                    j=j+1
                if max_value2>max_value1:
                    return max_value2
                elif max_value1>max_value2:
                    return max_value1
                else:
                    return 'Both words have equal score'
            else:
                # convert the string in char array
                to_char_array = list(words[i])
                j=0
                while j<len(to_char_array):
                    if to_char_array[j] in dt.keys() :
                        value_sum1 = max(dt.get(to_char_array[j]),value_sum1 + dt.get(to_char_array[j]))
                        max_value1 = max(value_sum1,max_value1)
                    else:
                        pass
                    j=j+1
                return max_value1
if __name__ == '__main__':
    p = 'abcd fg'
    print(Solution().highest_scoring_word(p))
It may be of interest that the code can be greatly simplified by using features available in Python:
the_sum = sum(ord(c)-96 for c in s.lower() if c.isalpha())
To break this down: for c in s.lower() gets the lower-case characters one by one; the function ord() gives the character's numerical value, where 'a' is 97, so we subtract 96 to get 1. The if clause checks whether the character is a letter and only then accepts it. Finally, sum() adds up all the numbers. You could break up this one line and check how the separate parts work.
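The one-liner above scores the whole string at once. A minimal sketch of applying the same idea word by word might look like this; the names word_score and highest_score are made up for illustration, and restricting the check to ascii_lowercase (instead of isalpha()) is my assumption so that accented letters do not produce odd values:

```python
from string import ascii_lowercase

def word_score(word):
    """Sum of alphabet ranks (a=1 ... z=26); any other character scores 0."""
    return sum(ord(c) - 96 for c in word.lower() if c in ascii_lowercase)

def highest_score(s):
    """Highest per-word score in s, or 0 if s contains no words."""
    return max((word_score(w) for w in s.split()), default=0)

print(highest_score('abcd a'))  # 10
```

The default=0 argument only matters for an empty or all-whitespace string, where max() would otherwise raise ValueError.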
I've done some digging and most solutions use arrays, but our class has not gotten that far, and we are supposed to use mostly for loops to write a function that returns the most repeated letter.
Here is my code so far, but all I could get it to do was return the count of the first letter.
def most_repeated_letters(word_1):
  x = 0
  z = 0
  for letter in word_1:
    y = word_1.count(letter[0:])
    if y > z:
      z = y
    x += 1
    return z
print most_repeated_letters('jackaby')
Make use of collections.Counter:
from collections import Counter
c = Counter('jackaby').most_common(1)
print(c)
# [('a', 2)]
There are a few problems with your code:
you calculate the count of the most common letter, but not the letter itself
you return inside the loop and thus after the very first letter
also, you never use x, and the slicing of letter is unnecessary
Some suggestions to better spot those errors yourself:
use more meaningful variable names
use more than two spaces for indentation
Fixing those, your code might look something like this:
def most_repeated_letters(word_1):
    most_common_count = 0
    most_common_letter = None
    for letter in word_1:
        count = word_1.count(letter)
        if count > most_common_count:
            most_common_count = count
            most_common_letter = letter
    return most_common_letter
Once you are comfortable with Python's basic language features, you should have a closer look at the builtin functions. In fact, your entire function can be reduced to a single line using max, using the word_1.count as the key function for comparison.
def most_repeated_letters(word_1):
    return max(word_1, key=word_1.count)
But while this is very short, it is not very efficient, as the count function is called for each letter in the word, giving the function quadratic complexity O(n²). Instead, you can use a dict to store counts of individual letters and increase those counts in a single pass over the word in O(n).
def most_repeated_letters(word_1):
    counts = {}
    for letter in word_1:
        if letter not in counts:
            counts[letter] = 1
        else:
            counts[letter] += 1
    return max(counts, key=counts.get)
And this is basically the same as what collections.Counter would do, as already described in another answer.
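For comparison, a small sketch of the Counter-based equivalent of the dict version; the function name most_repeated_letter and the choice to return None for an empty word are my own assumptions, not part of the answers above:

```python
from collections import Counter

def most_repeated_letter(word):
    """Return the most frequent character of word, or None if word is empty."""
    if not word:
        return None
    # most_common(1) returns a one-element list of (character, count) pairs
    letter, _count = Counter(word).most_common(1)[0]
    return letter

print(most_repeated_letter('jackaby'))  # a
```

Counter does the counting in a single pass, so this keeps the O(n) behaviour of the hand-written dict loop.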
If you don't want to use Counter:
def most_repeated_letters(word_1):
    lettersCount = {}
    for ch in word_1:
        if ch not in lettersCount:
            lettersCount[ch] = 1
        else:
            lettersCount[ch] += 1
    return max(lettersCount, key=lettersCount.get)
print(most_repeated_letters('jackabybb'))
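Here is code that also works when several letters tie for the highest count:
def most_repeated_letters(word_1):
    d = {}
    for letter in word_1:
        if not d.get(letter):
            d[letter] = 0
        d[letter] = d.get(letter) + 1
    # keep every letter whose count equals the maximum (handles ties)
    highest = max(d.values())
    ret = {}
    for k, v in d.items():  # items() replaces Python 2's iteritems()
        if v == highest:
            ret[k] = v
    return ret
print(most_repeated_letters('jackaby'))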
If you don't want to use the collections module:
def mostRepeatedLetter(text):
    counter = {}
    for letter in text:
        if letter in counter:
            counter[letter] += 1
        else:
            counter[letter] = 1
    # best letter seen so far; string keys, and a name that does not shadow the builtin max
    best = {'letter': None, 'quantity': 0}
    for key, value in counter.items():
        if value > best['quantity']:
            best['letter'], best['quantity'] = key, value
    return best
I'd like to compare 2 strings and keep the matched, splitting off where the comparison fails.
So if I have 2 strings:
string1 = "apples"
string2 = "appleses"
answer = "apples"
Another example, as the string could have more than one word:
string1 = "apple pie available"
string2 = "apple pies"
answer = "apple pie"
I'm sure there is a simple Python way of doing this but I can't work it out, any help and explanation appreciated.
For completeness, difflib in the standard library provides loads of sequence-comparison utilities, for instance find_longest_match, which finds the longest common substring when used on strings. Example use:
from difflib import SequenceMatcher
string1 = "apple pie available"
string2 = "come have some apple pies"
match = SequenceMatcher(None, string1, string2).find_longest_match()
print(match) # -> Match(a=0, b=15, size=9)
print(string1[match.a:match.a + match.size]) # -> apple pie
print(string2[match.b:match.b + match.size]) # -> apple pie
If you're using a Python version older than 3.9, you need to call find_longest_match() with the following arguments:
SequenceMatcher(None, string1, string2).find_longest_match(0, len(string1), 0, len(string2))
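Since the explicit four-argument form is also accepted on newer versions, a small wrapper can hide the version difference. This is only a sketch: longest_common_substring is a hypothetical helper name, and autojunk=False is my addition to switch off difflib's "popular element" heuristic, which can otherwise skew results on long inputs:

```python
from difflib import SequenceMatcher

def longest_common_substring(a, b):
    """Return one longest common substring of a and b ('' if there is none)."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # explicit bounds work both before and after Python 3.9
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]

print(longest_common_substring("apple pie available", "come have some apple pies"))
```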
One might also consider os.path.commonprefix that works on characters and thus can be used for any strings.
import os
common = os.path.commonprefix(['apple pie available', 'apple pies'])
assert common == 'apple pie'
As the function name indicates, this only considers the common prefix of two strings.
def common_start(sa, sb):
    """ returns the longest common substring from the beginning of sa and sb """
    def _iter():
        for a, b in zip(sa, sb):
            if a == b:
                yield a
            else:
                return
    return ''.join(_iter())
>>> common_start("apple pie available", "apple pies")
'apple pie'
Or a slightly stranger way (note that it relies on StopIteration escaping from a generator expression, which only works before Python 3.7; since PEP 479 this raises RuntimeError instead):
def stop_iter():
    """An easy way to break out of a generator"""
    raise StopIteration
def common_start(sa, sb):
    return ''.join(a if a == b else stop_iter() for a, b in zip(sa, sb))
Which might be more readable as
def terminating(cond):
    """An easy way to break out of a generator"""
    if cond:
        return True
    raise StopIteration
def common_start(sa, sb):
    return ''.join(a for a, b in zip(sa, sb) if terminating(a == b))
It's called the Longest Common Substring problem. Here I present a simple, easy to understand but inefficient solution. It will take a long time to produce correct output for large strings, as the complexity of this algorithm is O(N^2).
def longestSubstringFinder(string1, string2):
    answer = ""
    len1, len2 = len(string1), len(string2)
    for i in range(len1):
        match = ""
        for j in range(len2):
            if (i + j < len1 and string1[i + j] == string2[j]):
                match += string2[j]
            else:
                if (len(match) > len(answer)): answer = match
                match = ""
    return answer
print(longestSubstringFinder("apple pie available", "apple pies"))
print(longestSubstringFinder("apples", "appleses"))
print(longestSubstringFinder("bapples", "cappleses"))
Output
apple pie
apples
apples
A fix for the bugs in the first answer:
def longestSubstringFinder(string1, string2):
    answer = ""
    len1, len2 = len(string1), len(string2)
    for i in range(len1):
        for j in range(len2):
            lcs_temp = 0
            match = ''
            while ((i+lcs_temp < len1) and (j+lcs_temp<len2) and string1[i+lcs_temp] == string2[j+lcs_temp]):
                match += string2[j+lcs_temp]
                lcs_temp += 1
            if len(match) > len(answer):
                answer = match
    return answer
print(longestSubstringFinder("dd apple pie available", "apple pies"))
print(longestSubstringFinder("cov_basic_as_cov_x_gt_y_rna_genes_w1000000", "cov_rna15pcs_as_cov_x_gt_y_rna_genes_w1000000"))
print(longestSubstringFinder("bapples", "cappleses"))
print(longestSubstringFinder("apples", "apples"))
The same as Evo's, but with an arbitrary number of strings to compare:
def common_start(*strings):
    """ Returns the longest common substring
        from the beginning of the `strings`
    """
    def _iter():
        for z in zip(*strings):
            if z.count(z[0]) == len(z):  # check all elements in `z` are the same
                yield z[0]
            else:
                return
    return ''.join(_iter())
The fastest way I've found is to use suffix_trees package:
from suffix_trees import STree
a = ["xxxabcxxx", "adsaabc"]
st = STree.STree(a)
print(st.lcs()) # "abc"
This script asks you for the minimum common substring length and prints all common substrings of two strings. It also eliminates shorter substrings that are already contained in longer ones.
def common_substrings(str1,str2):
    len1,len2=len(str1),len(str2)
    if len1 > len2:
        str1,str2=str2,str1
        len1,len2=len2,len1
    #short string=str1 and long string=str2
    min_com = int(input('Please enter the minimum common substring length:'))
    cs_array=[]
    for i in range(len1,min_com-1,-1):
        for k in range(len1-i+1):
            if (str1[k:i+k] in str2):
                flag=1
                for m in range(len(cs_array)):
                    if str1[k:i+k] in cs_array[m]:
                        #print(str1[k:i+k])
                        flag=0
                        break
                if flag==1:
                    cs_array.append(str1[k:i+k])
    if len(cs_array):
        print(cs_array)
    else:
        print('There is no common substring for the given parameters')
common_substrings('ciguliuana','ciguana')
common_substrings('apples','appleses')
common_substrings('apple pie available','apple pies')
Try:
import itertools as it
''.join(el[0] for el in it.takewhile(lambda t: t[0] == t[1], zip(string1, string2)))
It does the comparison from the beginning of both strings.
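Wrapped into a function, the takewhile idea might look like the sketch below; the name common_prefix is mine, chosen only for illustration:

```python
from itertools import takewhile

def common_prefix(string1, string2):
    """Return the characters the two strings share from the start."""
    # zip pairs up characters; takewhile stops at the first unequal pair
    pairs = takewhile(lambda t: t[0] == t[1], zip(string1, string2))
    return ''.join(a for a, _ in pairs)

print(common_prefix("apple pie available", "apple pies"))  # apple pie
```

Because zip stops at the shorter string, no explicit length check is needed.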
def matchingString(x,y):
    match=''
    for i in range(0,len(x)):
        for j in range(0,len(y)):
            k=1
            # grow the candidate while both slices stay in range and still match
            while (i+k <= len(x) and j+k <= len(y) and x[i:i+k]==y[j:j+k]):
                if len(match) <= len(x[i:i+k]):
                    match = x[i:i+k]
                k=k+1
    return match
print(matchingString('apple','ale')) #le
print(matchingString('apple pie available','apple pies')) #apple pie
A Trie data structure would work the best, better than DP.
Here is the code.
class TrieNode:
    def __init__(self):
        self.child = [None]*26
        self.endWord = False
class Trie:
    def __init__(self):
        self.root = self.getNewNode()
    def getNewNode(self):
        return TrieNode()
    def insert(self,value):
        root = self.root
        for i,character in enumerate(value):
            index = ord(character) - ord('a')
            if not root.child[index]:
                root.child[index] = self.getNewNode()
            root = root.child[index]
        root.endWord = True
    def search(self,value):
        root = self.root
        for i,character in enumerate(value):
            index = ord(character) - ord('a')
            if not root.child[index]:
                return False
            root = root.child[index]
        return root.endWord
def main():
    # Input keys (use only 'a' through 'z' and lower case)
    keys = ["the","anaswe"]
    output = ["Not present in trie",
              "Present in trie"]
    # Trie object
    t = Trie()
    # Construct trie
    for key in keys:
        t.insert(key)
    # Search for different keys
    print("{} ---- {}".format("the",output[t.search("the")]))
    print("{} ---- {}".format("these",output[t.search("these")]))
    print("{} ---- {}".format("their",output[t.search("their")]))
    print("{} ---- {}".format("thaw",output[t.search("thaw")]))
if __name__ == '__main__':
    main()
Let me know in case of doubts.
In case we have a list of words and need to find all common substrings: I checked some of the codes above and the best was https://stackoverflow.com/a/42882629/8520109, but it has some bugs, for example with 'histhome' and 'homehist'. In this case, we should get 'hist' and 'home' as a result. Furthermore, the result differs if the order of the arguments is changed. So I changed the code to find every block of substring, and it returns a set of common substrings:
main = input().split(" ") #a string of words separated by space
def longestSubstringFinder(string1, string2):
    '''Find the longest matching word'''
    answer = ""
    len1, len2 = len(string1), len(string2)
    for i in range(len1):
        for j in range(len2):
            lcs_temp=0
            match=''
            while ((i+lcs_temp < len1) and (j+lcs_temp<len2) and string1[i+lcs_temp] == string2[j+lcs_temp]):
                match += string2[j+lcs_temp]
                lcs_temp+=1
            if (len(match) > len(answer)):
                answer = match
    return answer
def listCheck(main):
    '''control the input for finding substring in a list of words'''
    string1 = main[0]
    result = []
    for i in range(1, len(main)):
        string2 = main[i]
        res1 = longestSubstringFinder(string1, string2)
        res2 = longestSubstringFinder(string2, string1)
        result.append(res1)
        result.append(res2)
    result.sort()
    return result
first_answer = listCheck(main)
final_answer = []
for item1 in first_answer: #to remove some incorrect match
    string1 = item1
    double_check = True
    for item2 in main:
        string2 = item2
        if longestSubstringFinder(string1, string2) != string1:
            double_check = False
    if double_check:
        final_answer.append(string1)
print(set(final_answer))
main = 'ABACDAQ BACDAQA ACDAQAW XYZCDAQ' #>>> {'CDAQ'}
main = 'homehist histhome' #>>> {'hist', 'home'}
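def LongestSubString(s1,s2):
    if len(s1)<len(s2) :
        s1,s2 = s2,s1
    maxsub =''
    for i in range(len(s2)):
        # try the longest candidate starting at i first
        for j in range(len(s2),i,-1):
            if s2[i:j] in s1 and j-i>len(maxsub):
                # remember it and move on to the next start position
                maxsub = s2[i:j]
                break
    return maxsub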
Returns the first longest common substring:
def compareTwoStrings(string1, string2):
    list1 = list(string1)
    list2 = list(string2)
    match = []
    output = ""
    length = 0
    for i in range(0, len(list1)):
        if list1[i] in list2:
            match.append(list1[i])
            # + 1 so that slices ending at the last character are also tried
            for j in range(i + 1, len(list1) + 1):
                if ''.join(list1[i:j]) in string2:
                    match.append(''.join(list1[i:j]))
                else:
                    continue
        else:
            continue
    for string in match:
        if length < len(list(string)):
            length = len(list(string))
            output = string
        else:
            continue
    return output
**Return the longest common substring**
def longestSubString(str1, str2):
    longestString = ""
    maxLength = 0
    for i in range(0, len(str1)):
        if str1[i] in str2:
            # + 1 so that slices ending at the last character are also tried
            for j in range(i + 1, len(str1) + 1):
                if str1[i:j] in str2:
                    if(len(str1[i:j]) > maxLength):
                        maxLength = len(str1[i:j])
                        longestString = str1[i:j]
    return longestString
This is the classroom problem called 'Longest sequence finder'. Here is some simple code that worked for me; my inputs are lists representing a sequence, which can also be a string:
def longest_substring(list1,list2):
    both=[]
    if len(list1)>len(list2):
        small=list2
        big=list1
    else:
        small=list1
        big=list2
    removes=0
    stop=0
    for i in small:
        for j in big:
            if i!=j:
                removes+=1
                if stop==1:
                    break
            elif i==j:
                both.append(i)
                for q in range(removes+1):
                    big.pop(0)
                stop=1
                break
        removes=0
    return both
As if this question doesn't have enough answers, here's another option:
from collections import defaultdict
def LongestCommonSubstring(string1, string2):
    match = ""
    matches = defaultdict(list)
    str1, str2 = sorted([string1, string2], key=lambda x: len(x))
    for i in range(len(str1)):
        for k in range(i, len(str1)):
            cur = match + str1[k]
            if cur in str2:
                match = cur
            else:
                match = ""
            if match:
                matches[len(match)].append(match)
    if not matches:
        return ""
    longest_match = max(matches.keys())
    return matches[longest_match][0]
Some example cases:
LongestCommonSubstring("whose car?", "this is my car")
> ' car'
LongestCommonSubstring("apple pies", "apple? forget apple pie!")
> 'apple pie'
This isn't the most efficient way to do it but it's what I could come up with and it works. If anyone can improve it, please do. What it does is it makes a matrix and puts 1 where the characters match. Then it scans the matrix to find the longest diagonal of 1s, keeping track of where it starts and ends. Then it returns the substring of the input string with the start and end positions as arguments.
Note: This only finds one longest common substring. If there's more than one, you could make an array to store the results in and return that. Also, it's case sensitive, so (Apple pie, apple pie) will return pple pie.
import numpy  # needed for numpy.zeros below
def longestSubstringFinder(str1, str2):
    answer = ""
    if len(str1) == len(str2):
        if str1==str2:
            return str1
        else:
            longer=str1
            shorter=str2
    elif (len(str1) == 0 or len(str2) == 0):
        return ""
    elif len(str1)>len(str2):
        longer=str1
        shorter=str2
    else:
        longer=str2
        shorter=str1
    matrix = numpy.zeros((len(shorter), len(longer)))
    for i in range(len(shorter)):
        for j in range(len(longer)):
            if shorter[i]== longer[j]:
                matrix[i][j]=1
    longest=0
    start=[-1,-1]
    end=[-1,-1]
    for i in range(len(shorter)-1, -1, -1):
        for j in range(len(longer)):
            count=0
            begin = [i,j]
            while matrix[i][j]==1:
                finish=[i,j]
                count=count+1
                if j==len(longer)-1 or i==len(shorter)-1:
                    break
                else:
                    j=j+1
                    i=i+1
            i = i-count
            if count>longest:
                longest=count
                start=begin
                end=finish
                break
    answer=shorter[int(start[0]): int(end[0])+1]
    return answer
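The matrix idea described above (mark matching characters, then follow diagonals) is commonly written as a dynamic-programming table in which each cell stores the length of the diagonal run ending there. The following is a rough sketch of that variant without NumPy, not the code from this answer; the function name and the two-row layout are my own choices:

```python
def longest_common_substring(s1, s2):
    """Return one longest common substring of s1 and s2 ('' if none)."""
    # prev[j] = length of the common run ending at s1[i-2] and s2[j-1]
    prev = [0] * (len(s2) + 1)
    best_len, best_end = 0, 0  # best run length and its end index in s1
    for i in range(1, len(s1) + 1):
        cur = [0] * (len(s2) + 1)
        for j in range(1, len(s2) + 1):
            if s1[i - 1] == s2[j - 1]:
                # extend the diagonal run coming from the previous row
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s1[best_end - best_len:best_end]

print(longest_common_substring("Apple pie", "apple pie"))  # pple pie
```

Keeping only two rows brings memory down to O(len(s2)) while the running time stays O(len(s1) * len(s2)). Like the original, it is case sensitive and reports only the first longest match.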
First a helper function adapted from the itertools pairwise recipe to produce substrings.
import itertools
def n_wise(iterable, n = 2):
    '''n = 2 -> (s0,s1), (s1,s2), (s2, s3), ...
    n = 3 -> (s0,s1, s2), (s1,s2, s3), (s2, s3, s4), ...'''
    a = itertools.tee(iterable, n)
    for x, thing in enumerate(a[1:]):
        for _ in range(x+1):
            next(thing, None)
    return zip(*a)
Then a function that iterates over substrings, longest first, and tests for membership. (efficiency not considered)
def foo(s1, s2):
    '''Finds the longest matching substring
    '''
    # the longest matching substring can only be as long as the shortest string
    #which string is shortest?
    shortest, longest = sorted([s1, s2], key = len)
    #iterate over substrings, longest substrings first
    #(down to length 1, so that one- and two-character matches are also found)
    for n in range(len(shortest), 0, -1):
        for sub in n_wise(shortest, n):
            sub = ''.join(sub)
            if sub in longest:
                #return the first one found, it should be the longest
                return sub
s = "fdomainster"
t = "exdomainid"
print(foo(s,t))
>>>
domain
>>>
def LongestSubString(s1,s2):
    left = 0
    right =len(s2)
    while(left<right):
        if(s2[left] not in s1):
            left = left+1
        else:
            if(s2[left:right] not in s1):
                right = right - 1
            else:
                return(s2[left:right])
s1 = "pineapple"
s2 = "applc"
print(LongestSubString(s1,s2))
This is supposed to become a random name generator in the end, and the random part is working. The only problem is that it is REALLY random, producing weird stuff like aaaaaaaa.
So I'm trying to add a rule that does not allow two vowels in a row (and the same goes for consonants).
Please help me out here. I've been looking through this code for 2 hours now and I can't find the problem.
Here is my entire code.
import random
import string
import numpy as np
from sys import argv
import csv
# abcdefghijklmnopqrstuvwxyz
# Example output: floke fl0ke flok3 fl0k3
#
class facts:
    kons = list('bcdfghjklmnpqrstvwxz') #20
    voks = list('aeiouy') #6
    abc = list('abcdefghijklmnopqrstuvwxyz')
def r_trfa(): #True Or False (1/0)
    x = random.randrange(0, 2)
    return x;
def r_kons(): #Consonant
    y = random.randrange(0, 20)
    x = facts.kons[y]
    return x;
def r_vok(): #Vowel
    y = random.randrange(0, 6)
    x = facts.voks[y]
    return x;
def r_len(): #Length
    x = random.randrange(4, 8)
    return x;
def r_type():
    x = random.randrange(1, 4)
    return x;
def r_structure(length): #Builds the structure
    y = r_type()
    if y == 0:
        no1 = 1
    else:
        no1 = 2
    i = 0
    x = [no1]
    y = r_type()
    if not no1 == y:
        x.append(y)
    while i < length:
        y = r_type()
        if not x[i] == y:
            x.append(y)
        i = i + 1
    x2 = list(x)
    return x2;
def name(): #Final product
    struct = r_structure(r_len())
    name = struct
You've got several bugs. For example, you're checking the value y against 0 even though it is always in the range 1-3 (random.randrange(1, 4) excludes 4), which is probably unintended behavior. Furthermore, you never actually call a function that gets you a character, and you never create a string. Thus it's not clear what you're trying to do.
Here's how I'd rewrite things based on my guess of what you want to do.
import random, itertools
voks = frozenset('aeiouy')
abc = 'abcdefghijklmnopqrstuvwxyz'
def r_gen():
    last=None #both classes ok
    while 1:
        new = random.choice(abc)
        if (new in voks) != last:
            yield new
            last = (new in voks)
def name(): #Final product
    length = random.randrange(4, 8)
    return ''.join(itertools.islice(r_gen(), length))
The problem you're having is that your loop increments i always, but only adds an additional value to your x list if the random value doesn't match x[i]. This means that if you get several matches in a row, i may become larger than the largest index into x and so you'll get an IndexError exception.
I'm not entirely sure I understand what you're trying to do, but I think this will do something similar to your current r_structure function:
def r_structure(length):
    """Returns a list of random "types", avoiding any immediate repeats"""
    x = [r_type()]
    while len(x) < length:
        y = r_type()
        if y != x[-1]: # check against the last item in the list
            x.append(y)
    return x
If your goal is simply to randomly generate a sequence of alternating vowels and consonants, there's an easier way than what you seem to be doing. First off, you can use random.choice to pick your characters. Further, rather than picking many letters and rejecting ones that are of the wrong type, you can simply pick from one string, then pick from the other, for as long as you need:
import random
def alternating_characters(length):
    characters = ["aeiouy", "bcdfghjklmnpqrstvwxz"]
    char_type = random.randrange(2) # pick a random letter type to start with
    results = []
    while len(results) < length:
        results.append(random.choice(characters[char_type])) # pick random char
        char_type = 1-char_type # pick from the other list next time
    return "".join(results)
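To check that a generated name really obeys the "no two vowels or two consonants in a row" rule, a small validator can help while debugging. This is only a sketch: is_alternating is a hypothetical helper, and it treats 'y' as a vowel as the question's voks list does:

```python
VOWELS = frozenset('aeiouy')  # 'y' counted as a vowel, as in the question

def is_alternating(name):
    """True if no two adjacent letters are both vowels or both consonants."""
    # compare each letter's class with that of its right-hand neighbour
    return all((a in VOWELS) != (b in VOWELS) for a, b in zip(name, name[1:]))

print(is_alternating('banana'))  # True
print(is_alternating('floke'))   # False ('f' and 'l' are both consonants)
```

Running such a check over a few thousand generated names is a cheap way to confirm the generator never breaks the rule.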
Well, it's unclear what you want to do. As the condition on vowels and on consonants is the same, why do you need to differentiate between them?
All you need to do is take a random letter and check that it doesn't match the last letter.
Here's some code:
import random
abc = 'abcdefghijklmnopqrstuvwxyz'
def gen_word(length):
    last = ''
    while length > 0:
        l = random.choice(abc)
        if l != last:
            length -= 1
            last = l  # remember the letter so the next one must differ
            yield l
if __name__ == '__main__':
    word = ''.join(gen_word(10))
    print(word)