Find the Word Pair - Codewars 5 kyu

Find the Word Pair - Codewars 5 kyu - python

Hello everyone, I have a problem with 5-kyu kata:
Given an array of words and a target compound word, your objective is to find the two words which combine into the target word, returning both words in the order they appear in the array, and their respective indices in the order they combine to form the target word. Words in the array you are given may repeat, but there will only be one unique pair that makes the target compound word. If there is no match found, return null/nil/None.
Examples:
fn(['super','bow','bowl','tar','get','book','let'], "superbowl") => ['super','bowl', [0,2]]
fn(['bow','crystal','organic','ally','rain','line'], "crystalline") => ['crystal','line', [1,5]]
fn(['bow','crystal','organic','ally','rain','line'], "rainbow") => ['bow','rain', [4,0]]
fn(['bow','crystal','organic','ally','rain','line'], "organically") => ['organic','ally', [2,3]]
fn(['top','main','tree','ally','fin','line'], "mainline") => ['main','line', [1,5]]
fn(['top','main','tree','ally','fin','line'], "treetop") => ['top','tree', [2,0]]
This kata on Codewars
I got a solution:
def compound_match(words, target):
#check input
if words == None or target == None:
return None
#find solution
indx = []
result = []
for i in range(len(words)-1):
for k in range(i+1, len(words)):
if words[i]+words[k] == target:
indx[0:1] = i, k
result[0:2] = words[i], words[k], indx
return result
elif words[k]+words[i] == target:
indx[0:1] = k, i
result[0:2] = words[i], words[k], indx
return result
return None
I pass TEST, pass "Some very short random tests" and "Some short random tests", but cant pass "Some very long random tests" and got error:
STDERR
Execution Timed Out (12000 ms)
I understand that my code is not optimal or perfect, but what can I do to improve?

Here's a complete solution.
def compound_match(words, target):
if words is None or target is None:
return None
word_dict = {w:i for i, w in enumerate(words)}
for i, w in enumerate(words):
w = words[i]
if target.startswith(w):
tail = target[len(w):]
if tail in word_dict:
k = word_dict[tail]
idx = [i, k]
result = [words[min(i,k)], words[max(i,k)], idx]
return result
return None
When run on the sampe input, it produces the following:
['super', 'bowl', [0, 2]]
['crystal', 'line', [1, 5]]
['bow', 'rain', [4, 0]]
['organic', 'ally', [2, 3]]
['main', 'line', [1, 5]]
['top', 'tree', [2, 0]]
Note: I left out any check for duplicate words, or for i and k being the same. It can be adapted to handle that, although the constraints about duplicate words, or a target that consists of the same word repeated, weren't entirely clear to me.
However, I did paste it into the code wars link from your post, and it said it passed all the tests.

Prob. you can speed up by using the set idea, like this:
I think it will work, did not do much testing though. Let me know if you find out it helps.
def compound_match(words, target):
S = set(words)
for i in range(1, len(target)):
x, y = target[:i], target[i:]
if x in S and y in S:
ix, iy = words.index(x), words.index(y)
ordered = [x, y] if ix < iy else [y, x]
return [*ordered, [ix, iy]]

Related

Finding all the elements of the deepest list within a list in python without NUMPY

I am trying to code to retrieve all the list which is the "deepest" within a list. I managed to retrieve only one of the list which is the deepest and I am unable to print out all if they have the same "depth". I tried hardcoding it by it will not work if there is 2 or more values of the same depth. Can anyone share how can I modify my code to make it work?
Input: [1,2,[[3]],[[4]],[[5]], [6]]
Output: [3],[4],[5]
def get_deepest(L):
def get_dpst(L, maxdepth):
deepest_list = [L, maxdepth]
for e in L:
is_list = isinstance(e,list)
if is_list:
rList = get_dpst(e, maxdepth+1)
if rList[1] > deepest_list[1]:
deepest_list = rList
elif rList [1] == deepest_list[1]:
deepest_list[1] = rList[0]
return deepest_list
rList = get_dpst(L, 0)
return rList[0]
print (get_deepest(my_list))

How about a pair of functions like this?
unnest is a recursive function that walks a tree of lists, yielding tuples of any non-list value and its depth
get_deepest_values uses unnest and keeps track of the deepest depth it has returned, gathering only those values.
def unnest(lst, depth=0):
for atom in lst:
if isinstance(atom, list):
yield from unnest(atom, depth + 1)
else:
yield (atom, depth)
def get_deepest_values(inp):
deepest_depth = 0
deepest_depth_values = []
for value, depth in unnest(inp):
if depth > deepest_depth:
deepest_depth = depth
deepest_depth_values.clear()
if depth == deepest_depth:
deepest_depth_values.append(value)
return (deepest_depth, deepest_depth_values)
inp = [1, 2, [[3]], [[4]], [[5, "foo"]], [6]]
deepest_depth, deepest_values = get_deepest_values(inp)
print(f"{deepest_depth=} {deepest_values=}")
The output is
deepest_depth=2 deepest_values=[3, 4, 5, 'foo']

Split a list and create partitions(list) depending upon the unique set of values

deck = [1,2,3,4,4,3,2,1]
Output: True
Explanation: Possible partition [1,1],[2,2],[3,3],[4,4].
Input: deck = [1,1,1,2,2,2,3,3]
Output: False
Explanation: No possible partition.
The solution i tried was very , very basic but i couldn't solve further than this. P.S:I am starting my coding experience.
v=[1,2,3,4,4,3,2,1]
t=list()
flag=0
for i in v:
t.append(v.count(i))
for i in range(len(t)):
for j in range(len(t)):
if(t[i]==t[j]):
flag=0
else:
flag=1
if(flag==1):
print("NO")
else:
q=len(list(set(v)))
n = q
lists = [[] for _ in range(n)]

Try this:
How it works - It goes through all the list and stores the occurrence of each value. Then It makes of a set of it. Returns true if length of set is 1 i.e every number appears exactly the same number of times.
EDIT - set has a property that its element dont repeat. So say if the value of dikt is {1: 3, 2:3, 3:3} then the value of dikt.values is [3,3,3]. Making a set out of it will be (3).
IF the value of dikt was {1:3, 2:3, 3:2} then the set would be (3, 2)
def foo(x):
dikt = {}
for i in x:
if i in dikt.keys():
dikt[i]+=1
else:
dikt[i] = 1
return len(set(dikt.values())) == 1

You can use Counter here.
from collections import Counter
def partition(lst):
count=Counter(lst)
if len(set(count.values()))>1:
return 'No partition'
else:
return [[i]*n for i,n in count.items()]
deck = [1,2,3,4,4,3,2,1]
partition(deck)
#[[1, 1], [2, 2], [3, 3], [4, 4]]
deck=[1,1,1,2,2,2,3,3]
partition(deck)
#'No partition found'
If you just want the output to be True or False try this.
def partition(lst):
return len(set(Counter(lst).values()))==1

Algorithmics issue, python string, no idea

I have algorithm problem with Python and strings.
My issue:
My function should sum maximum values of substring.
For example:
ae-afi-re-fi -> 2+6+3+5=16
but
ae-a-fi-re-fi -> 2-10+5+3+5=5
I try use string.count function and counting substring, but this method is not good.
What would be the best way to do this in Python? Thanks in advance.
string = "aeafirefi"
Sum the value of substrings.

In my solution i'll use permutations from itertools module in order to list all the possible permutations of substrings that you gave in your question presented into a dict called vals. Then iterate through the input string and split the strings by all the permutations found below. Then sum the values of each permutations and finally get the max.
PS: The key of this solution is the get_sublists() method.
This is an example with some tests:
from itertools import permutations
def get_sublists(a, perm_vals):
# Find the sublists in the input string
# Based on the permutations of the dict vals.keys()
for k in perm_vals:
if k in a:
a = ''.join(a.split(k))
# Yield the sublist if we found any
yield k
def sum_sublists(a, sub, vals):
# Join the sublist and compare it to the input string
# Get the difference by lenght
diff = len(a) - len(''.join(sub))
# Sum the value of each sublist (on every permutation)
return sub , sum(vals[k] for k in sub) - diff * 10
def get_max_sum_sublists(a, vals):
# Get all the possible permutations
perm_vals = permutations(vals.keys())
# Remove duplicates if there is any
sub = set(tuple(get_sublists(a, k)) for k in perm_vals)
# Get the sum of each possible permutation
aa = (sum_sublists(a, k, vals) for k in sub)
# return the max of the above operation
return max(aa, key= lambda x: x[1])
vals = {'ae': 2, 'qd': 3, 'qdd': 5, 'fir': 4, 'afi': 6, 're': 3, 'fi': 5}
# Test
a = "aeafirefi"
final, s = get_max_sum_sublists(a, vals)
print("Sublists: {}\nSum: {}".format(final, s))
print('----')
a = "aeafirefiqdd"
final, s = get_max_sum_sublists(a, vals)
print("Sublists: {}\nSum: {}".format(final, s))
print('----')
a = "aeafirefiqddks"
final, s = get_max_sum_sublists(a, vals)
print("Sublists: {}\nSum: {}".format(final, s))
Output:
Sublists: ('ae', 'afi', 're', 'fi')
Sum: 16
----
Sublists: ('afi', 'ae', 'qdd', 're', 'fi')
Sum: 21
----
Sublists: ('afi', 'ae', 'qdd', 're', 'fi')
Sum: 1
Please try this solution with many input strings as you can and don't hesitate to comment if you found any wrong result.

Probably having a dictionary with:
key = substring: value = value
So if you have:
string = "aeafirefi"
first you look for the whole string in the dictionary, if you don't find it, you cut the last letter so you have "aeafiref", until you find a substring or you have an only letter.
then you skip the letters used: for example, if you found "aeaf", you start all over again using string = "iref".

Here's a brute force solution:
values_dict = {
'ae': 2,
'qd': 3,
'qdd': 5,
'fir': 4,
'afi': 6,
're': 3,
'fi': 5
}
def get_value(x):
return values_dict[x] if x in values_dict else -10
def next_tokens(s):
"""Returns possible tokens"""
# Return any tokens in values_dict
for x in values_dict.keys():
if s.startswith(x):
yield x
# Return single character.
yield s[0]
def permute(s, stack=[]):
"""Returns all possible variations"""
if len(s) == 0:
yield stack
return
for token in next_tokens(s):
perms = permute(s[len(token):], stack + [token])
for perm in perms:
yield perm
def process_string(s):
def process_tokens(tokens):
return sum(map(get_value, tokens))
return max(map(process_tokens, permute(s)))
print('Max: {}'.format(process_string('aeafirefi')))

Given a linear order completely represented by a list of tuples of strings, output the order as a list of strings

Given pairs of items of form [(a,b),...] where (a,b) means a > b, for example:
[('best','better'),('best','good'),('better','good')]
I would like to output a list of form:
['best','better','good']
This is very hard for some reason. Any thoughts?
======================== code =============================
I know why it doesn't work.
def to_rank(raw):
rank = []
for u,v in raw:
if u in rank and v in rank:
pass
elif u not in rank and v not in rank:
rank = insert_front (u,v,rank)
rank = insert_behind(v,u,rank)
elif u in rank and v not in rank:
rank = insert_behind(v,u,rank)
elif u not in rank and v in rank:
rank = insert_front(u,v,rank)
return [[r] for r in rank]
# #Use: insert word u infront of word v in list of words
def insert_front(u,v,words):
if words == []: return [u]
else:
head = words[0]
tail = words[1:]
if head == v: return [u] + words
else : return ([head] + insert_front(u,v,tail))
# #Use: insert word u behind word v in list of words
def insert_behind(u,v,words):
words.reverse()
words = insert_front(u,v,words)
words.reverse()
return words
=================== Update ===================
Per suggestion of many, this is a straight forward topological sort setting, I ultimately decided to use the code from this source: algocoding.wordpress.com/2015/04/05/topological-sorting-python/
which solved my problem.
def go_topsort(graph):
in_degree = { u : 0 for u in graph } # determine in-degree
for u in graph: # of each node
for v in graph[u]:
in_degree[v] += 1
Q = deque() # collect nodes with zero in-degree
for u in in_degree:
if in_degree[u] == 0:
Q.appendleft(u)
L = [] # list for order of nodes
while Q:
u = Q.pop() # choose node of zero in-degree
L.append(u) # and 'remove' it from graph
for v in graph[u]:
in_degree[v] -= 1
if in_degree[v] == 0:
Q.appendleft(v)
if len(L) == len(graph):
return L
else: # if there is a cycle,
return []
RockBilly's solution also work in my case, because in my setting, for every v < u, we are guaranteed to have a pair (u,v) in our list. So his answer is not very "computer-sciency", but it gets the job done in this case.

If you have a complete grammar specified then you can simply count up the items:
>>> import itertools as it
>>> from collections import Counter
>>> ranks = [('best','better'),('best','good'),('better','good')]
>>> c = Counter(x for x, y in ranks)
>>> sorted(set(it.chain(*ranks)), key=c.__getitem__, reverse=True)
['best', 'better', 'good']
If you have an incomplete grammar then you can build a graph and dfs all paths to find the longest. This isn't very inefficient, as I haven't thought about that yet :):
def dfs(graph, start, end):
stack = [[start]]
while stack:
path = stack.pop()
if path[-1] == end:
yield path
continue
for next_state in graph.get(path[-1], []):
if next_state in path:
continue
stack.append(path+[next_state])
def paths(ranks):
graph = {}
for n, m in ranks:
graph.setdefault(n,[]).append(m)
for start, end in it.product(set(it.chain(*ranks)), repeat=2):
yield from dfs(graph, start, end)
>>> ranks = [('black', 'dark'), ('black', 'dim'), ('black', 'gloomy'), ('dark', 'gloomy'), ('dim', 'dark'), ('dim', 'gloomy')]
>>> max(paths(ranks), key=len)
['black', 'dim', 'dark', 'gloomy']
>>> ranks = [('a','c'), ('b','a'),('b','c'), ('d','a'), ('d','b'), ('d','c')]
>>> max(paths(ranks), key=len)
['d', 'b', 'a', 'c']

What you're looking for is topological sort. You can do this in linear time using depth-first search (pseudocode included in the wiki I linked)

Here is one way. It is based on using the complete pairwise rankings to make an old-style (early Python 2) cmp function and then using functools.cmp_to_key to convert it to a key suitable for the Python 3 approach to sorting:
import functools
def sortByRankings(rankings):
def cmp(x,y):
if x == y:
return 0
elif (x,y) in rankings:
return -1
else:
return 1
items = list({x for y in rankings for x in y})
items.sort(key = functools.cmp_to_key(cmp))
return items
Tested like:
ranks = [('a','c'), ('b','a'),('b','c'), ('d','a'), ('d','b'), ('d','c')]
print(sortByRankings(ranks)) #prints ['d', 'b', 'a', 'c']
Note that to work correctly, the parameter rankings must contain an entry for each pair of distinct items. If it doesn't, you would first need to compute the transitive closure of the pairs that you do have before you feed it to this function.

You can take advantage of the fact that the lowest ranked item in the list will never appear at the start of any tuple. You can extract this lowest item, then remove all elements which contain this lowest item from your list, and repeat to get the next lowest.
This should work even if you have redundant elements, or have a sparser list than some of the examples here. I've broken it up into finding the lowest ranked item, and then the grunt work of using this to create a final ranking.
from copy import copy
def find_lowest_item(s):
#Iterate over set of all items
for item in set([item for sublist in s for item in sublist]):
#If an item does not appear at the start of any tuple, return it
if item not in [x[0] for x in s]:
return item
def sort_by_comparison(s):
final_list = []
#Make a copy so we don't mutate original list
new_s = copy(s)
#Get the set of all items
item_set = set([item for sublist in s for item in sublist])
for i in range(len(item_set)):
lowest = find_lowest_item(new_s)
if lowest is not None:
final_list.insert(0, lowest)
#For the highest ranked item, we just compare our current
#ranked list with the full set of items
else:
final_list.insert(0,set(item_set).difference(set(final_list)).pop())
#Update list of ranking tuples to remove processed items
new_s = [x for x in new_s if lowest not in x]
return final_list
list_to_compare = [('black', 'dark'), ('black', 'dim'), ('black', 'gloomy'), ('dark', 'gloomy'), ('dim', 'dark'), ('dim', 'gloomy')]
sort_by_comparison(list_to_compare)
['black', 'dim', 'dark', 'gloomy']
list2 = [('best','better'),('best','good'),('better','good')]
sort_by_comparison(list2)
['best', 'better', 'good']
list3 = [('best','better'),('better','good')]
sort_by_comparison(list3)
['best', 'better', 'good']

If you do sorting or create a dictionary from the list items, you are going to miss the order as #Rockybilly mentioned in his answer. I suggest you to create a list from the tuples of the original list and then remove duplicates.
def remove_duplicates(seq):
seen = set()
seen_add = seen.add
return [x for x in seq if not (x in seen or seen_add(x))]
i = [(5,2),(1,3),(1,4),(2,3),(2,4),(3,4)]
i = remove_duplicates(list(x for s in i for x in s))
print(i) # prints [5, 2, 1, 3, 4]
j = [('excellent','good'),('excellent','great'),('great','good')]
j = remove_duplicates(list(x for s in j for x in s))
print(j) # prints ['excellent', 'good', 'great']
See reference: How do you remove duplicates from a list in whilst preserving order?
For explanation on the remove_duplicates() function, see this stackoverflow post.

If the list is complete, meaning has enough information to do the ranking(Also no duplicate or redundant inputs), this will work.
from collections import defaultdict
lst = [('best','better'),('best','good'),('better','good')]
d = defaultdict(int)
for tup in lst:
d[tup[0]] += 1
d[tup[1]] += 0 # To create it in defaultdict
print sorted(d, key = lambda x: d[x], reverse=True)
# ['best', 'better', 'good']
Just give them points, increment the left one each time you encounter it in the list.
Edit: I do think the OP has a determined type of input. Always have tuple count of combination nCr(n, 2). Which makes this a correct solution. No need to complain about the edge cases, which I already knew posting the answer(and mentioned it).

Find group of strings that are anagrams

This question refers to this problem on lintcode. I have a working solution, but it takes too long for the huge testcase. I am wondering how can it be improved? Maybe I can decrease the number of comparisons I make in the outer loop.
class Solution:
# #param strs: A list of strings
# #return: A list of strings
def anagrams(self, strs):
# write your code here
ret=set()
for i in range(0,len(strs)):
for j in range(i+1,len(strs)):
if i in ret and j in ret:
continue
if Solution.isanagram(strs[i],strs[j]):
ret.add(i)
ret.add(j)
return [strs[i] for i in list(ret)]
#staticmethod
def isanagram(s, t):
if len(s)!=len(t):
return False
chars={}
for i in s:
if i in chars:
chars[i]+=1
else:
chars[i]=1
for i in t:
if i not in chars:
return False
else:
chars[i]-=1
if chars[i]<0:
return False
for i in chars:
if chars[i]!=0:
return False
return True
Update: Just to add, not looking for built-in pythonic solutions such as using Counter which are already optimized. Have added Mike's suggestions, but still exceeding time-limit.

Skip strings you already placed in the set. Don't test them again.
# #param strs: A list of strings
# #return: A list of strings
def anagrams(self, strs):
# write your code here
ret=set()
for i in range(0,len(strs)):
for j in range(i+1,len(strs)):
# If both anagrams exist in set, there is no need to compare them.
if i in ret and j in ret:
continue
if Solution.isanagram(strs[i],strs[j]):
ret.add(i)
ret.add(j)
return [strs[i] for i in list(ret)]
You can also do a length comparison in your anagram test before iterating through the letters. Whenever the strings aren't the same length, they can't be anagrams anyway. Also, when a counter in chars reaches -1 when comparing values in t, just return false. Don't iterate through chars again.
#staticmethod
def isanagram(s, t):
# Test strings are the same length
if len(s) != len(t):
return False
chars={}
for i in s:
if i in chars:
chars[i]+=1
else:
chars[i]=1
for i in t:
if i not in chars:
return False
else:
chars[i]-=1
# If this is below 0, return false
if chars[i] < 0:
return False
for i in chars:
if chars[i]!=0:
return False
return True

Instead of comparing all pairs of strings, you can just create a dictionary (or collections.defaultdict) mapping each of the letter-counts to the words having those counts. For getting the letter-counts, you can use collections.Counter. Afterwards, you just have to get the values from that dict. If you want all words that are anagrams of any other words, just merge the lists that have more than one entry.
strings = ["cat", "act", "rat", "hut", "tar", "tact"]
anagrams = defaultdict(list)
for s in strings:
anagrams[frozenset(Counter(s).items())].append(s)
print([v for v in anagrams.values()])
# [['hut'], ['rat', 'tar'], ['cat', 'act'], ['tact']]
print([x for v in anagrams.values() if len(v) > 1 for x in v])
# ['cat', 'act', 'rat', 'tar']
Of course, if you prefer not to use builtin functionality you can with just a few more lines just as well use a regular dict instead of defaultdict and write your own Counter, similar to what you have in your isanagram method, just without the comparison part.

Your solution is slow because you're not taking advantage of python's data structures.
Here's a solution that collects results in a dict:
class Solution:
def anagrams(self, strs):
d = {}
for word in strs:
key = tuple(sorted(word))
try:
d[key].append(word)
except KeyError:
d[key] = [word]
return [w for ws in d.values() for w in ws if len(ws) > 1]

As an addition to #Mike's great answer, here is a nice Pythonic way to do it:
import collections
class Solution:
# #param strs: A list of strings
# #return: A list of strings
def anagrams(self, strs):
patterns = Solution.find_anagram_words(strs)
return [word for word in strs if ''.join(sorted(word)) in patterns]
#staticmethod
def find_anagram_words(strs):
anagrams = collections.Counter(''.join(sorted(word)) for word in strs)
return {word for word, times in anagrams.items() if times > 1}

Why not this?
str1 = "cafe"
str2 = "face"
def isanagram(s1,s2):
return all(sorted(list(str1)) == sorted(list(str2)))
if isanagram(str1, str2):
print "Woo"

The same can be done with a single line of code if you are using Linq in C#
string[] = strs; // Input string array
var result = strs.GroupBy(x => new string(x.ToCharArray().OrderBy(z => z).ToArray())).Select(g => g.ToList()).ToList();

Now to Group Anagrams in Python, We have to : Sort the lists. Then, Create a dictionary. Now dictionary will tell us where are those anagrams are( Indices of Dictionary). Then values of the dictionary is the actual indices of the anagrams.
def groupAnagrams(words):
# sort each word in the list
A = [''.join(sorted(word)) for word in words]
dict = {}
for indexofsamewords, names in enumerate(A):
dict.setdefault(names, []).append(indexofsamewords)
print(dict)
#{'AOOPR': [0, 2, 5, 11, 13], 'ABTU': [1, 3, 4], 'Sorry': [6], 'adnopr': [7], 'Sadioptu': [8, 16], ' KPaaehiklry': [9], 'Taeggllnouy': [10], 'Leov': [12], 'Paiijorty': [14, 18], 'Paaaikpr': [15], 'Saaaabhmryz': [17], ' CNaachlortttu': [19], 'Saaaaborvz': [20]}
for index in dict.values():
print([words[i] for i in index])
if __name__ == '__main__':
# list of words
words = ["ROOPA","TABU","OOPAR","BUTA","BUAT" , "PAROO","Soudipta",
"Kheyali Park", "Tollygaunge", "AROOP","Love","AOORP", "Protijayi","Paikpara","dipSouta","Shyambazaar",
"jayiProti", "North Calcutta", "Sovabazaar"]
groupAnagrams(words)
The Output :
['ROOPA', 'OOPAR', 'PAROO', 'AROOP', 'AOORP']
['TABU', 'BUTA', 'BUAT']
['Soudipta', 'dipSouta']
['Kheyali Park']
['Tollygaunge']
['Love']
['Protijayi', 'jayiProti']
['Paikpara']
['Shyambazaar']
['North Calcutta']
['Sovabazaar']

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Find the Word Pair - Codewars 5 kyu - python

Related

Finding all the elements of the deepest list within a list in python without NUMPY

Split a list and create partitions(list) depending upon the unique set of values

Algorithmics issue, python string, no idea

Given a linear order completely represented by a list of tuples of strings, output the order as a list of strings

Find group of strings that are anagrams

Categories

Resources