Find out all possible cartesian products using Recursion - python

The function takes a string (str) s and an integer (int) n.
The function returns a list (list) of all Cartesian products of s with length n.
The expression product(s,n) can be computed by adding each character in s to the
result of product(s,n-1).
>>> product('ab',3)
['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb']
My attempt:
def product(s, n):
    if n == 0:
        return ""
    string = ''
    for i in range(len(s)):
        string += s[i] + product(s, n - 1)
    return string
Disclaimer: It doesn't work^

Your code is building a single string. That doesn't match what the function is supposed to do, which is return a list of strings. You need to add the list-building logic, and deal with the fact that your recursive calls are going to produce lists as well.
I'd do something like this:
def product(s, n):
    if n == 0:
        return ['']
    result = []
    for prefix in s: # pick a first character
        for suffix in product(s, n-1): # recurse to get the rest
            result.append(prefix + suffix) # combine and add to our results
    return result
This produces the output in the desired order, but it recurses a lot more often than necessary. You could swap the order of the loops, though to avoid getting the results in a different order, you'd need to change the logic so that you pick the last character from s directly while letting the recursion produce the prefix.
def product(s, n):
    if n == 0:
        return ['']
    result = []
    for prefix in product(s, n-1): # recurse to get the start of each string
        for suffix in s: # pick a final character
            result.append(prefix + suffix) # combine and add to our results
    return result
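As a quick sanity check, the recursive version can be compared against itertools.product from the standard library. This is only a sketch: the comparison itself is not part of the original answer, and the alias iter_product is a name chosen here to avoid clashing with the function being tested.

```python
from itertools import product as iter_product

def product(s, n):
    """Return all strings of length n built from the characters of s."""
    if n == 0:
        return ['']
    result = []
    for prefix in product(s, n - 1):  # one recursive call per level
        for suffix in s:              # pick a final character
            result.append(prefix + suffix)
    return result

# itertools.product yields tuples of characters, so join them into strings
expected = [''.join(chars) for chars in iter_product('ab', repeat=3)]
print(product('ab', 3) == expected)  # True
```

Both produce the strings in the same order, because in each case the last position varies fastest.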

Related

Python recursion to split string by sliding window

Recently I faced an interesting coding task that involves splitting a string in multiple ways, given a limit k on the number of parts.
For example:
s = "iamfoobar"
k = 4 # the max number of the items on a list after the split
s can be split into the following combinations:
[
    ["i", "a", "m", "foobar"],
    ["ia", "m", "f", "oobar"],
    ["iam", "f", "o", "obar"]
    # etc
]
I tried to figure out how to do that with a quick recursive function, but I cannot get it to work.
I tried this, but it didn't seem to work:
def sliding(s, k):
    if len(s) < k:
        return []
    else:
        for i in range(0, k):
            return [s[i:i+1]] + sliding(s[i+1:len(s) - i], k)
print(sliding("iamfoobar", 4))
And only got this
['i', 'a', 'm', 'f', 'o', 'o']
Your first main problem is that although you use a loop, you immediately return a single list. So no matter how much you fix everything around it, your output will never match what you expect, because it will always be a single list.
Second, in the recursive call you start with s[i:i+1], but according to your example you want all prefixes, so something like s[:i] is more suitable.
Additionally, in the recursive call you never reduce k, which is the natural recursive step.
Lastly, your stop condition seems wrong also. As above, if the natural step is reducing k, the natural stop would be if k == 1 then return [[s]]. This is because the only way to split the string to 1 part is the string itself...
The important thing is to keep in mind your final output format and think how that can work in your step. In this case you want to return a list of all possible permutations as lists. So in case of k == 1, you simply return a list of a single list of the string.
Now as the step, you want to take a different prefix each time, and add to it all permutations from the call of the rest of the string with k-1. All in all the code can be something like this:
def splt(s, k):
    if k == 1: # base case - stop condition
        return [[s]]
    res = []
    # loop over all prefixes
    for i in range(1, len(s)-k+2):
        for tmp in splt(s[i:], k-1):
            # add to prefix all permutations of k-1 parts of the rest of s
            res.append([s[:i]] + tmp)
    return res
You can test it on some inputs and see how it works.
If you are not restricted to recursion, another approach is to use itertools.combinations. You can use that to create all combinations of indexes inside the string to split it into k parts, and then simply concatenate those parts and put them in a list. A raw version is something like:
from itertools import combinations
def splt(s, k):
    res = []
    for indexes in combinations(range(1, len(s)), k-1):
        indexes = [0] + list(indexes) + [len(s)] # add the edges to k-1 indexes to create k parts
        res.append([s[start:end] for start, end in zip(indexes[:-1], indexes[1:])]) # slice out the k parts
    return res
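To see that the two approaches agree, here is a small self-contained check. It is a sketch only: the names split_recursive and split_combinations are introduced here so both versions can live side by side, and the count comparison relies on the fact that choosing k-1 cut points among len(s)-1 positions gives C(len(s)-1, k-1) splits.

```python
from itertools import combinations
from math import comb  # available in Python 3.8+

def split_recursive(s, k):
    # base case: the only way to split s into one part is s itself
    if k == 1:
        return [[s]]
    res = []
    for i in range(1, len(s) - k + 2):  # leave at least k-1 characters for the rest
        for tail in split_recursive(s[i:], k - 1):
            res.append([s[:i]] + tail)
    return res

def split_combinations(s, k):
    res = []
    for cuts in combinations(range(1, len(s)), k - 1):
        bounds = [0] + list(cuts) + [len(s)]
        res.append([s[a:b] for a, b in zip(bounds[:-1], bounds[1:])])
    return res

s, k = "iamfoobar", 4
print(split_recursive(s, k) == split_combinations(s, k))            # True
print(len(split_recursive(s, k)) == comb(len(s) - 1, k - 1))        # True
```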
The main issue in your implementation is that your loop does not do what is supposed to do as it returns the first result instead of appending the results.
Here's an example of an implementation:
def sliding(s, k):
    # If there are not enough characters, or k is below 1,
    # there is no combination possible
    if len(s) < k or k < 1:
        return []
    # If k is one, we return a list containing all the combinations,
    # which is a single list containing the string
    if k == 1:
        return [[s]]
    results = []
    # Iterate through all the possible values for the first value
    for i in range(1, len(s) - k + 2):
        first_value = s[:i]
        # Append the result of the sub call to the first values
        for sub_result in sliding(s[i:], k - 1):
            results.append([first_value] + sub_result)
    return results
print(sliding("iamfoobar", 4))

Count max substring of the same character

I want to write a function that receives a string (s) and a single letter (c). The function needs to return the length of the longest substring made up of this letter. I don't know why the function I wrote doesn't work.
For example: print(count_longest_repetition('eabbaaaacccaaddd', 'a')) is supposed to print 4.
def count_longest_repetition(s, c):
    n= len(s)
    lst=[]
    length_charachter=0
    for i in range(n-1):
        if s[i]==c and s[i+1]==c:
            if s[i] in lst:
                lst.append(s[i])
                length_charachter= len(lst)
    return length_charachter
Due to the condition if s[i] in lst, nothing will be appended to 'lst' as originally 'lst' is empty and the if condition will never be satisfied. Also, to traverse through the entire string you need to use range(n) as it generates numbers from 0 to n-1. This should work -
def count_longest_repetition(s, c):
    n= len(s)
    length_charachter=0
    max_length = 0
    for i in range(n):
        if s[i] == c:
            length_charachter += 1
        else:
            length_charachter = 0
        max_length = max(max_length, length_charachter)
    return max_length
I might suggest using a regex approach here with re.findall:
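import re
def count_longest_repetition(s, c):
    # re.escape guards against c being a regex metacharacter such as '.'
    matches = re.findall(re.escape(c) + '+', s)
    if not matches:
        return 0
    matches = sorted(matches, key=len, reverse=True)
    return len(matches[0])
cnt = count_longest_repetition('eabbaaaacccaaddd', 'a')
print(cnt)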
This prints: 4
To better explain the above, given the inputs shown, the regex used is a+, that is, find groups of one or more a characters. The list of matches found by re.findall, once sorted, is:
['aaaa', 'aa', 'a']
By sorting descending by string length, we push the longest match to the front of the list. Then, we return this length from the function.
Your function doesn't work because if s[i] in lst: will initially evaluate to False, so nothing is ever added to the lst list (and the condition remains False throughout the loop).
You should look into regular expressions for this kind of string processing/search:
import re
def count_longest_repetition(s, c):
    return max((0,*map(len,re.findall(f"{re.escape(c)}+",s))))
If you're not allowed to use libraries, you could compute repetitions without using a list by adding matches to a counter that you reset on every mismatch:
def count_longest_repetition(s, c):
    maxCount = count = 0
    for b in s:
        count = (count+1)*(b==c)
        maxCount = max(count,maxCount)
    return maxCount
This can also be done by groupby
from itertools import groupby
def count_longest_repetition(text,let):
    return max([len(list(group)) for key, group in groupby(list(text)) if key==let])
count_longest_repetition("eabbaaaacccaaddd",'a')
#returns 4
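One caveat with the groupby version: max raises ValueError on an empty sequence, which happens when the letter does not occur in the text at all. A possible variant, sketched here under the assumption that 0 is the desired answer in that case, passes default=0 to max (supported since Python 3.4):

```python
from itertools import groupby

def count_longest_repetition(text, letter):
    # groupby yields (key, run) pairs, one per run of consecutive equal characters
    run_lengths = (sum(1 for _ in run) for key, run in groupby(text) if key == letter)
    # default=0 covers the case where the letter never appears
    return max(run_lengths, default=0)

print(count_longest_repetition("eabbaaaacccaaddd", "a"))  # 4
print(count_longest_repetition("eabbaaaacccaaddd", "z"))  # 0
```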

Given an integer, add operators between digits to get n and return list of correct answers

Here is the problem I'm trying to solve:
Given an int, ops, n, create a function(int, ops, n) that slots operators between the digits of int to create equations that evaluate to n. Return a list of all possible answers. Importing functions is not allowed.
For example,
function(111111, '+-%*', 11) => [1*1+11/1-1 = 11, 1*1/1-1+11 =11, ...]
The question recommended using interleave(str1, str2) where interleave('abcdef', 'ab') = 'aabbcdef' and product(str1, n) where product('ab', 3) = ['aaa','aab','abb','bbb','aba','baa','bba'].
I have written interleave(str1, str2) which is
def interleave(str1,str2):
    lsta,lstb,result= list(str1),list(str2),''
    while lsta and lstb:
        result += lsta.pop(0)
        result += lstb.pop(0)
    if lsta:
        for i in lsta:
            result+= i
    else:
        for i in lstb:
            result+=i
    return result
However, I have no idea how to code the product function. I assume it has to do something with recursion, so I'm trying to add 'a' and 'b' for every product.
def product(str1,n):
    if n ==1:
        return []
    else:
        return [product(str1,n-1)]+[str1[0]]
Please help me to understand how to solve this question. (Not only the product itself)
General solution
Assuming your implementation of interleave is correct, you can use it together with product (see my suggested implementation below) to solve the problem with something like:
def f(i, ops, n):
    int_str = str(i)
    retval = []
    for seq_len in range(1, len(int_str)):
        for op_seq in r_prod(ops, seq_len):
            eq = interleave(int_str, op_seq)
            if eval(eq) == n:
                retval.append(eq)
    return retval
The idea is that you interleave the digits of your string with your operators in a varying order. Basically I do that with all possible sequences of length seq_len, which varies from 1 to max, where max is the number of digits - 1 (see assumptions below!). Then you use the built-in function eval to evaluate the expression returned by interleave for a specific sequence of the operators and compare the result with the desired number, n. If the expression evaluates to n, you append it to the return array retval (initially empty). After you have evaluated all the expressions for all possible operator sequences (see assumptions!), you return the array.
Assumptions
It's not clear whether you can use the same operator multiple times or if you're allowed to omit using some. I assumed you can use the same operator many times and that you're allowed to omit using an operator. Hence, the r_prod was used (as suggested by your question). In case of such restrictions, you will want to use permutations (of possibly varying length) of the group of operators.
Secondly, I assumed that your implementation of the interleave function is correct. It is not clear if, for example, interleave("112", "*") should return both "1*12" and "11*2" or just "1*12" like your implementation does. In the case both should be returned, then you should also iterate over the possible ways the same ordered sequence of operators can be interleaved with the provided digits. I omitted that, because I saw that your function always returns a single string.
Product implementation
If you look at the itertools docs you can see the equivalent code for the function itertools.product. Using that you'd have:
def product(*args, repeat=1):
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
a = ["".join(x) for x in product('ab', repeat=3)]
print(a)
Which prints ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba', 'bbb'] -- what I guess is what you're after.
A more specific (assuming iterable is a string), less efficient, but hopefully more understandable solution would be:
def prod(string, r):
    if r < 1:
        return None
    retval = list(string)
    for i in range(r - 1):
        temp = []
        for l in retval:
            for c in string:
                temp.append(l + c)
        retval = temp
    return retval
The idea is simple. The second parameter r gives you the length of the strings you want to produce. The characters in the string give you the elements from which you build the string. Hence, you first generate a string of length 1 that starts with each possible character. Then for each of those strings you generate new strings by concatenating the old string with all of the possible characters.
For example, given a pool of characters "abc", you'll first generate strings "a", "b", and "c". Then you'll replace string "a" with strings "aa", "ab", and "ac". Similarly for "b" and "c". You repeat this process r - 1 times to get all possible strings of length r generated by drawing with replacement from the pool "abc".
I'd think it would be a good idea for you to try to implement the prod function recursively. You can see my ugly solution below, but I'd suggest you stop reading this now and try to do it without looking at my suggestion first.
SPOILER BELOW
def r_prod(string, r):
    if r == 1:
        return list(string)
    else:
        return [c + s for c in string for s in r_prod(string, r - 1)]
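Putting the pieces together, a small end-to-end run might look like the sketch below. The interleave here is a compact rewrite that is assumed to behave like the one in the question (alternate characters, then append the leftover), and the input 123 with operators '+*' is an example chosen for illustration. eval is used only because every expression is built from digits and the given operator characters; it should not be applied to untrusted input.

```python
def interleave(str1, str2):
    # alternate characters from both strings, then append whatever is left over
    paired = ''.join(a + b for a, b in zip(str1, str2))
    shorter = min(len(str1), len(str2))
    return paired + str1[shorter:] + str2[shorter:]

def r_prod(string, r):
    # all strings of length r drawn with replacement from the characters of string
    if r == 1:
        return list(string)
    return [c + s for c in string for s in r_prod(string, r - 1)]

def f(i, ops, n):
    int_str = str(i)
    retval = []
    for seq_len in range(1, len(int_str)):
        for op_seq in r_prod(ops, seq_len):
            eq = interleave(int_str, op_seq)
            if eval(eq) == n:  # safe here only because eq is digits plus known operators
                retval.append(eq)
    return retval

print(interleave('abcdef', 'ab'))  # aabbcdef
print(f(123, '+*', 6))             # ['1+2+3', '1*2*3']
```

With a single operator the candidates are '1+23' and '1*23', which is why only the two-operator expressions match here.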

Joining a string in another string

I have done this code but the output is not like what I want
def replace(s,p,n):
    return "".join("{}".format(p) if not i % n else char for i, char in enumerate(s,1))
print(replace("university","-",3))
the output that I get is un-ve-si-y
I must get it like :
uni-ver-sit-y
Here is one approach, using str slicing.
Demo:
def replace(s,p,n):
    return p.join([s[i:i+n] for i in range(0, len(s), n)])
print(replace("university","-",3))
Output:
uni-ver-sit-y
If you extend the code out over multiple lines:
chars_to_join = []
for i, char in enumerate(s,1):
    if not i % n:
        chars_to_join.append("{}".format(p))
    else:
        chars_to_join.append(char)
You'll see that when the if statement is true, it replaces the character rather than inserting the replacement character after it, so just modify the format string to include the currently iterated character as well:
"{}{}".format(char, p)
Alternatively you can do it functionally like this:
from itertools import repeat
def take(s, n):
    """take n characters from s"""
    return s[:n]
def skip(s, n):
    """skip n characters from s"""
    return s[n:]
def replace(s, p, n):
    # create intervals at which to prefix
    intervals = range(0, len(s), n)
    # create the prefix for all chunks
    prefix = map(skip, repeat(s), intervals)
    # trim prefix for n characters each
    chunks = map(take, prefix, repeat(n))
    return p.join(chunks)
And now:
replace('university', '-', 3)
Will give you:
'uni-ver-sit-y'
Note: this is sample code, if this is meant to be efficient you probably should use lazy evaluated functions (like islice) which can take a lot less memory for bigger inputs.
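A lazier variant along those lines could look like the following sketch. The helper name chunks is introduced here for illustration; it pulls at most n characters at a time from an iterator with itertools.islice, so it never builds the intermediate suffix strings.

```python
from itertools import islice

def chunks(s, n):
    """Lazily yield successive pieces of at most n characters from s."""
    it = iter(s)
    while True:
        piece = ''.join(islice(it, n))  # take the next n characters, fewer at the end
        if not piece:
            return
        yield piece

def replace(s, p, n):
    return p.join(chunks(s, n))

print(replace('university', '-', 3))  # uni-ver-sit-y
```

Because the separator is placed by join, nothing is appended after the final piece, even when len(s) is a multiple of n.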
For this question, I don't think a list comprehension is a very good idea, because it is hard to read. A plain loop makes the logic clearer:
def replace(s,p,n):
    new_list = []
    for i, c in enumerate(s, 1):
        new_list.append(c)
        # add the separator after every n-th character, but not after the last one
        if i % n == 0 and i < len(s):
            new_list.append(p)
    return "".join(new_list)
print(replace("university","-",3))

Determine prefix from a set of (similar) strings

I have a set of strings, e.g.
my_prefix_what_ever
my_prefix_what_so_ever
my_prefix_doesnt_matter
I simply want to find the longest common portion of these strings, here the prefix. In the above the result should be
my_prefix_
The strings
my_prefix_what_ever
my_prefix_what_so_ever
my_doesnt_matter
should result in the prefix
my_
Is there a relatively painless way in Python to determine the prefix (without having to iterate over each character manually)?
PS: I'm using Python 2.6.3.
Never rewrite what is provided to you: os.path.commonprefix does exactly this:
Return the longest path prefix (taken
character-by-character) that is a prefix of all paths in list. If list
is empty, return the empty string (''). Note that this may return
invalid paths because it works a character at a time.
For comparison to the other answers, here's the code:
# Return the longest prefix of all list elements.
def commonprefix(m):
    "Given a list of pathnames, returns the longest common leading component"
    if not m: return ''
    s1 = min(m)
    s2 = max(m)
    for i, c in enumerate(s1):
        if c != s2[i]:
            return s1[:i]
    return s1
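For completeness, calling the library function on the two sets of strings from the question looks like this (a minimal usage sketch; the variable names are chosen here):

```python
import os.path

first = ['my_prefix_what_ever', 'my_prefix_what_so_ever', 'my_prefix_doesnt_matter']
second = ['my_prefix_what_ever', 'my_prefix_what_so_ever', 'my_doesnt_matter']

print(os.path.commonprefix(first))   # my_prefix_
print(os.path.commonprefix(second))  # my_
print(repr(os.path.commonprefix([])))  # ''
```

Comparing only min(m) and max(m) is enough because every other string sorts between those two, so any prefix they share is shared by all of them.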
Ned Batchelder is probably right. But for the fun of it, here's a more efficient version of phimuemue's answer using itertools.
import itertools
strings = ['my_prefix_what_ever',
           'my_prefix_what_so_ever',
           'my_prefix_doesnt_matter']
def all_same(x):
    return all(x[0] == y for y in x)
char_tuples = itertools.izip(*strings)
prefix_tuples = itertools.takewhile(all_same, char_tuples)
''.join(x[0] for x in prefix_tuples)
As an affront to readability, here's a one-line version :)
>>> from itertools import takewhile, izip
>>> ''.join(c[0] for c in takewhile(lambda x: all(x[0] == y for y in x), izip(*strings)))
'my_prefix_'
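Note that itertools.izip exists only in Python 2, which matches the question. On Python 3 the built-in zip is already lazy, so the same idea might be written as in this sketch (the function name common_prefix is chosen here for illustration):

```python
from itertools import takewhile

def common_prefix(strings):
    # zip(*strings) yields one tuple of characters per position
    # and stops at the end of the shortest string
    columns = zip(*strings)
    # keep columns only while every string has the same character there
    same = takewhile(lambda chars: len(set(chars)) == 1, columns)
    return ''.join(chars[0] for chars in same)

strings = ['my_prefix_what_ever', 'my_prefix_what_so_ever', 'my_prefix_doesnt_matter']
print(common_prefix(strings))  # my_prefix_
```

takewhile stops at the first mismatching column, so later positions where the strings happen to agree again are not included.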
Here's my solution:
a = ["my_prefix_what_ever", "my_prefix_what_so_ever", "my_prefix_doesnt_matter"]
prefix_len = len(a[0])
for x in a[1 : ]:
    prefix_len = min(prefix_len, len(x))
    while not x.startswith(a[0][ : prefix_len]):
        prefix_len -= 1
prefix = a[0][ : prefix_len]
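The following works for the example, but is probably quite inefficient. Note that it keeps every position at which all strings agree, so it gives the true prefix only if the strings never agree again after their first difference.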
a = ["my_prefix_what_ever", "my_prefix_what_so_ever", "my_prefix_doesnt_matter"]
b = zip(*a)
c = [x[0] for x in b if x==(x[0],)*len(x)]
result = "".join(c)
For small sets of strings, the above is no problem at all. But for larger sets, I personally would code another, manual solution that checks each character one after another and stops when there are differences.
Algorithmically, this yields the same procedure, however, one might be able to avoid constructing the list c.
Just out of curiosity I figured out yet another way to do this:
def common_prefix(strings):
    if len(strings) == 1:#rule out trivial case
        return strings[0]
    prefix = strings[0]
    for string in strings[1:]:
        while string[:len(prefix)] != prefix and prefix:
            prefix = prefix[:len(prefix)-1]
        if not prefix:
            break
    return prefix
strings = ["my_prefix_what_ever","my_prefix_what_so_ever","my_prefix_doesnt_matter"]
print(common_prefix(strings))
#Prints "my_prefix_"
As Ned pointed out it's probably better to use os.path.commonprefix, which is a pretty elegant function.
The line that builds lot applies the reduce function to the characters at each position of the input strings. It produces a list of N+1 elements, where N is the length of the shortest input string.
Each element in lot is either (a) the input character, if all input strings match at that position, or (b) None. lot.index(None) is the position of the first None in lot: the length of the common prefix. out is that common prefix.
from functools import reduce  # built in on Python 2, required on Python 3
val = ["axc", "abc", "abc"]
lot = [reduce(lambda a, b: a if a == b else None, x) for x in zip(*val)] + [None]
out = val[0][:lot.index(None)]
Here's a simple clean solution. The idea is to use zip() function to line up all the characters by putting them in a list of 1st characters, list of 2nd characters,...list of nth characters. Then iterate each list to check if they contain only 1 value.
a = ["my_prefix_what_ever", "my_prefix_what_so_ever", "my_prefix_doesnt_matter"]
same = [all(x[i] == x[i+1] for i in range(len(x)-1)) for x in zip(*a)]
print(a[0][:same.index(False) if False in same else len(same)])
output: my_prefix_
Here is another way of doing this using OrderedDict with minimal code.
import collections
import itertools
def commonprefix(instrings):
    """ Common prefix of a list of input strings using OrderedDict """
    d = collections.OrderedDict()
    for instring in instrings:
        for idx,char in enumerate(instring):
            # Make sure index is added into key
            d[(char, idx)] = d.get((char,idx), 0) + 1
    # Return prefix of keys while value == length(instrings)
    return ''.join([k[0] for k in itertools.takewhile(lambda x: d[x] == len(instrings), d)])
I had a slight variation of the problem and Google sent me here, so I think it will be useful to document:
I have a list like:
my_prefix_what_ever
my_prefix_what_so_ever
my_prefix_doesnt_matter
some_noise
some_other_noise
So I would expect my_prefix to be returned. That can be done with:
from collections import Counter
def get_longest_common_prefix(values, min_length):
    substrings = [value[0: i-1] for value in values for i in range(min_length, len(value))]
    counter = Counter(substrings)
    # remove count of 1
    counter -= Counter(set(substrings))
    return max(counter, key=len)
In one line without using itertools, for no particular reason, although it does iterate through each character:
''.join([z[0] for z in zip(*(list(s) for s in strings)) if all(x==z[0] for x in z)])
Find the common prefix of all words in the given input list; if there is no common prefix, print -1.
stringList = ['my_prefix_what_ever', 'my_prefix_what_so_ever', 'my_prefix_doesnt_matter']
len2 = len( stringList )
if len2 != 0:
    # start with the lexicographically smallest word as the candidate prefix
    prefix = min( stringList )
    for i in range( len2 ):
        word = stringList[ i ]
        len1 = len( prefix )
        # slice each word to the length of the prefix
        word = word[ 0:len1 ]
        for j in range( len1 ):
            # compare each letter of word and prefix
            if word[ j ] != prefix[ j ]:
                # if a letter does not match, cut the prefix at that position
                prefix = prefix[ :j ]
                break # after getting the common prefix, move to the next word
    if len( prefix ) != 0:
        print("common prefix: ",prefix)
    else:
        print("-1")
else:
    print("string List is empty")
