more efficient method of substring calculation for advice

My code works, but I am looking for ideas to make it more efficient.
The similarity of two strings is defined as the length of their longest common prefix;
for example, the similarity of "abc" and "abd" is 2, and that of "aaa" and "aaab" is 3.
The problem is to calculate the sum of the similarities of a string S with each of its suffixes,
including S itself as the first suffix.
For example, for S="ababaa" the suffixes are "ababaa", "babaa", "abaa", "baa", "aa"
and "a", and the similarities sum to 6+0+3+0+1+1=11.
# Complete the function below.
from collections import defaultdict
class TrieNode:
    def __init__(self):
        self.children=defaultdict(TrieNode)
        self.isEnd=False
class TrieTree:
    def __init__(self):
        self.root=TrieNode()
    def insert(self, word):
        node = self.root
        for w in word:
            node = node.children[w]
        node.isEnd = True
    def search(self, word):
        node = self.root
        count = 0
        for w in word:
            node = node.children.get(w)
            if not node:
                break
            else:
                count += 1
        return count
def StringSimilarity(inputs):
    resultFormat=[]
    for word in inputs:
        # build Trie tree
        index = TrieTree()
        index.insert(word)
        result = 0
        # search for suffix
        for i in range(len(word)):
            result += index.search(word[i:])
        print(result)
        resultFormat.append(result)
    return resultFormat

def similarity(s, t):
    """ assumes len(t) <= len(s), which is easily doable"""
    i = 0
    while i < len(t) and s[i] == t[i]:
        i += 1
    return i
def selfSimilarity(s):
    return sum(similarity(s, s[i:]) for i in range(len(s)))
selfSimilarity("ababaa")
# 11

Here are 3 efficient approaches you may wish to consider:
Suffix Tree
Compute the suffix tree of the original string. Then descend down the principal path through the suffix tree, counting how many paths depart from the principal at each stage.
Suffix Array
Compute the suffix array and the longest common prefix array.
These arrays can be used to compute the longest common prefix of any pair of suffixes, and in particular the longest common prefix of the original string and each suffix.
Z function
The output you are trying to construct is known as the Z function.
It can be computed directly in linear time as shown here (Not Python code obviously):
vector<int> z_function(string s) {
    int n = (int) s.length();
    vector<int> z(n);
    for (int i = 1, l = 0, r = 0; i < n; ++i) {
        if (i <= r)
            z[i] = min (r - i + 1, z[i - l]);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]])
            ++z[i];
        if (i + z[i] - 1 > r)
            l = i, r = i + z[i] - 1;
    }
    return z;
}
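Since the question is about Python, here is a minimal sketch of the same Z function in Python. The names z_function and string_similarity are my own, and the convention that z[0] stays 0 (so the full length is added separately) is an assumption carried over from the C++ version above.

```python
def z_function(s):
    """Return z, where z[i] is the length of the longest common prefix of s and s[i:]."""
    n = len(s)
    z = [0] * n
    l = r = 0  # [l, r] is the rightmost segment known to match a prefix of s
    for i in range(1, n):
        if i <= r:
            # Reuse the value computed for the mirrored position inside [l, r].
            z[i] = min(r - i + 1, z[i - l])
        # Extend the match character by character.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] - 1 > r:
            l, r = i, i + z[i] - 1
    return z


def string_similarity(s):
    # z[0] is left at 0 by convention; the whole string matches itself, so add len(s).
    return len(s) + sum(z_function(s))


print(string_similarity("ababaa"))  # 11
```

Each character is compared at most a constant number of times on average, so this runs in linear time, compared with the quadratic worst case of checking every suffix directly.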

It takes a lot of work to build the TrieTree object. Skip that. Just do a double loop over all possible starting points of a match, and all possible offsets where you might still be matching.
Building complex objects like that only makes sense if you'll be querying your data structure many times. But here you aren't so it doesn't pay off.

Related

Leetcode 5: Longest Palindromic Substring

I have been working on the LeetCode problem 5. Longest Palindromic Substring:
Given a string s, return the longest palindromic substring in s.
But I kept getting time limit exceeded on large test cases.
I used dynamic programming as follows:
dp[(i, j)] = True implies that s[i] to s[j] is a palindrome. So if s[i] == s[j] and dp[(i+1, j-1)] is True, then s[i] to s[j] is also a palindrome.
How can I improve the performance of this implementation?
class Solution:
    def longestPalindrome(self, s: str) -> str:
        dp = {}
        res = ""
        for i in range(len(s)):
            # single character is always a palindrome
            dp[(i, i)] = True
            res = s[i]
        #fill in the table diagonally
        for x in range(len(s) - 1):
            i = 0
            j = x + 1
            while j <= len(s)-1:
                if s[i] == s[j] and (j - i == 1 or dp[(i+1, j-1)] == True):
                    dp[(i, j)] = True
                    if(j-i+1) > len(res):
                        res = s[i:j+1]
                else:
                    dp[(i, j)] = False
                i += 1
                j += 1
        return res

I think the time limit for this problem is rather tight; it took some time to get it to pass. Here is an improved version:
class Solution:
    def longestPalindrome(self, s: str) -> str:
        dp = {}
        res = ""
        for i in range(len(s)):
            dp[(i, i)] = True
            res = s[i]
        for x in range(len(s)): # iterate till the end of the string
            for i in range(x): # iterate up to the current state (less work) and for loop looks better here
                if s[i] == s[x] and (dp.get((i + 1, x - 1), False) or x - i == 1):
                    dp[(i, x)] = True
                    if x - i + 1 > len(res):
                        res = s[i:x + 1]
        return res

Here is another idea to improve the performance:
The nested loop will check over many cases where the DP value is already False for smaller ranges. We can avoid looking at large spans, by looking for palindromes from inside-out and stop extending the span as soon as it no longer is a palindrome. This process should be repeated at every offset in the source string, but this could still save some processing.
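The plain inside-out idea, before any further refinement, can be sketched as follows. This is only an illustration under my own naming (longest_palindrome is a hypothetical helper, not code from the question): it tries every odd and even center and stops extending as soon as the span stops being a palindrome.

```python
def longest_palindrome(s):
    """Expand-around-center sketch: O(n^2) time in the worst case, O(1) extra space."""
    best_start, best_len = 0, 0
    for center in range(len(s)):
        # Try an odd-length center (i, i) and an even-length center (i, i + 1).
        for lo, hi in ((center, center), (center, center + 1)):
            while lo >= 0 and hi < len(s) and s[lo] == s[hi]:
                lo -= 1
                hi += 1
            # The loop overshoots by one position on each side.
            length = hi - lo - 1
            if length > best_len:
                best_start, best_len = lo + 1, length
    return s[best_start:best_start + best_len]


print(longest_palindrome("cbbd"))  # bb
```

No table is stored, and a mismatch ends the work for that center immediately.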
The inputs on which the most time is wasted are those with long runs of the same letter, like "aaaaaaabcaaaaaaa". These lead to many iterations: each "a" or "aa" could be the center of a palindrome, but "growing" each of them separately is a waste of time. We should instead treat each run of consecutive "a" as a single unit and expand from there.
You can specifically deal with these cases by first grouping consecutive letters which are the same. So the above example would be turned into 4 groups: a(7)b(1)c(1)a(7)
Then let each group in turn be taken as the center of a palindrome. For each group, "fan out" to potentially include one or more neighboring groups on both sides in tandem. Continue fanning out until the outer groups either have different letters or have different sizes. From that result you can derive the largest palindrome around that center. In particular, when the outer groups have the same letter but different sizes, you still include that letter at the outside of the palindrome, but with a repetition count equal to the smaller of the two group sizes.
Here is an implementation. I used named tuples to make it more readable:
from itertools import groupby
from collections import namedtuple
Group = namedtuple("Group", "letter,size,end")
class Solution:
    def longestPalindrome(self, s: str) -> str:
        longest = ""
        x = 0
        groups = [Group(group[0], len(group), x := x + len(group)) for group in
                  ("".join(group[1]) for group in groupby(s))]
        for i in range(len(groups)):
            for j in range(0, min(i+1, len(groups) - i)):
                if groups[i - j].letter != groups[i + j].letter:
                    break
                left = groups[i - j]
                right = groups[i + j]
                if left.size != right.size:
                    break
            size = right.end - (left.end - left.size) - abs(left.size - right.size)
            if size > len(longest):
                x = left.end - left.size + max(0, left.size - right.size)
                longest = s[x:x+size]
        return longest

Alternatively, you can try this approach; it seems to be faster than 96% of Python submissions.
def longestPalindrome(self, s: str) -> str:
    N = len(s)
    if N == 0:
        return ""
    max_len, start = 1, 0
    for i in range(N):
        df = i - max_len
        if df >= 1 and s[df-1: i+1] == s[df-1: i+1][::-1]:
            start = df - 1
            max_len += 2
            continue
        if df >= 0 and s[df: i+1] == s[df: i+1][::-1]:
            start = df
            max_len += 1
    return s[start: start + max_len]

If you want to improve the performance, you should create a variable for len(s) at the beginning of the function and use it. That way instead of calling len(s) 3 times, you would do it just once.
Also, I see no reason to create a class for this function. A simple function will outrun a class method, albeit very slightly.

DP solution to find the maximum length of a contiguous subarray with equal number of 0 and 1

The question is from here https://leetcode.com/problems/contiguous-array/
Actually, I came up with a DP solution for this question.
However, it fails one test case.
Any thoughts?
DP[i][j] == 1 means that the subarray from nums[i] to nums[j] is valid.
Divide the problem into smaller ones:
DP[i][j] == 1
- if DP[i+2][j] == 1 and DP[i][i+1] == 1
- else if DP[i][j-2] == 1 and DP[j-1][j] == 1
- else if set([nums[i], nums[j]]) == set([0,1]) and DP[i+1][j-1] == 1
```
current_max_len = 0
if not nums:
    return current_max_len
dp = [] * len(nums)
for _ in range(len(nums)):
    dp.append([None] * len(nums))
for thisLen in range(2, len(nums)+1, 2):
    for i in range(len(nums)):
        last_index = i + thisLen -1
        if i + thisLen > len(nums):
            continue
        if thisLen==2:
            if set(nums[i:i+2]) == set([0, 1]):
                dp[i][last_index] = 1
        elif dp[i][last_index-2] and dp[last_index-1][last_index]:
            dp[i][last_index] = 1
        elif dp[i][i + 1] and dp[i + 2][last_index]:
            dp[i][last_index] = 1
        elif dp[i + 1][last_index-1] and set([nums[i], nums[last_index]]) == set([0, 1]):
            dp[i][last_index] = 1
        else:
            dp[i][last_index] = 0
        if dp[i][last_index] == 1:
            current_max_len = max(current_max_len, thisLen)
return current_max_len
```

Here is a counterexample: [1, 1, 0, 0, 0, 0, 1, 1]. The problem with your solution is that it requires a valid list to be built from a smaller valid list of size n-2 plus a valid pair; in this counterexample the list is instead built from two valid lists of length 4. -- SPOILER ALERT -- You can solve the problem with another DP technique: for every i, j you can find the number of ones and zeroes between them in constant time. To do that, just store the number of ones from the start of the list up to every index i.
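To make the prefix-count idea concrete, here is a minimal sketch. The function name find_max_length_quadratic and the ones array are my own, and it assumes nums contains only 0s and 1s. It is still O(n * n) overall, but each range check is constant time and it handles the counterexample correctly.

```python
from itertools import accumulate


def find_max_length_quadratic(nums):
    # ones[k] = number of 1s in nums[:k], so ones[j + 1] - ones[i] counts 1s in nums[i..j].
    ones = [0] + list(accumulate(nums))
    best = 0
    for i in range(len(nums)):
        # Only even-length ranges can hold equally many 0s and 1s.
        for j in range(i + 1, len(nums), 2):
            length = j - i + 1
            if 2 * (ones[j + 1] - ones[i]) == length:
                best = max(best, length)
    return best


print(find_max_length_quadratic([1, 1, 0, 0, 0, 0, 1, 1]))  # 8
```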

Here is the Python code:
def func(nums):
    track, has = 0, {0: -1}
    length = len(nums)
    ress_max = 0
    for i in range(0, length):
        track += (1 if nums[i] == 1 else -1)
        if track not in has:
            has[track] = i
        elif ress_max < i - has[track]:
            ress_max = i - has[track]
    return ress_max
l = list(map(int, input().strip().split()))
print(func(l))

Since the length of the binary array may be at most 50000, running an O(n * n) algorithm may exceed the time limit. I would suggest solving it in O(n) time and space. The idea is:
If we take any valid contiguous subarray and sum its numbers, treating 0 as -1, the total is always zero.
If we keep track of prefix sums, then the sum over the range L to R is zero exactly when the prefix sum up to L - 1 equals the prefix sum up to R.
Since we are looking for the maximum length, we store only the first index at which each prefix sum occurs; that entry is put into the hash map once and never overwritten.
Every time we compute the cumulative sum, we check whether it has occurred before. If it has, we compute the length and try to maximize it; otherwise this is its first occurrence, and we store it in the hash map with the current index as its value.
Note: to handle subarrays that start at index 0, we must treat the sum 0 as already in the map, paired with index -1.
The sample code in C++ is as follows:
int findMaxLength(vector<int>& nums) {
    unordered_map<int,int>lastIndex;
    lastIndex[0] = -1;
    int cumulativeSum = 0;
    int maxLen = 0;
    for (int i = 0; i < nums.size(); ++i) {
        cumulativeSum += (nums[i] == 0 ? -1 : 1);
        if (lastIndex.find(cumulativeSum) != lastIndex.end()) {
            maxLen = max(maxLen, i - lastIndex[cumulativeSum]);
        } else {
            lastIndex[cumulativeSum] = i;
        }
    }
    return maxLen;
}

Optimal Search Tree Using Python - Code Analysis

First of all, sorry about the naive question. But I couldn't find help elsewhere
I'm trying to create an Optimal Search Tree using Dynamic Programing in Python that receives two lists (a set of keys and a set of frequencies) and returns two answers:
1 - The smallest path cost.
2 - The generated tree for that smallest cost.
I basically need to create a tree organized with the most accessed items on top (the most accessed item is the root), and return the smallest path cost of that tree, using the dynamic programming solution.
I have implemented the following code in Python:
def optimalSearchTree(keys, freq, n):
    #Create an auxiliary 2D matrix to store results of subproblems
    cost = [[0 for x in xrange(n)] for y in xrange(n)]
    #For a single key, cost is equal to frequency of the key
    #for i in xrange (0,n):
    #    cost[i][i] = freq[i]
    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in xrange (2,n):
        for i in xrange(0,n-L+1):
            j = i+L-1
            cost[i][j] = sys.maxint
            for r in xrange (i,j):
                if (r > i):
                    c = cost[i][r-1] + sum(freq, i, j)
                elif (r < j):
                    c = cost[r+1][j] + sum(freq, i, j)
                elif (c < cost[i][j]):
                    cost[i][j] = c
    return cost[0][n-1]
def sum(freq, i, j):
    s = 0
    k = i
    for k in xrange (k,j):
        s += freq[k]
    return s
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
print(optimalSearchTree(keys, freq, n))
I'm trying to output the answer 1. The smallest cost for that tree should be 142 (the value stored on the Matrix Position [0][n-1], according to the Dynamic Programming solution). But unfortunately it's returning 0. I couldn't find any issues in that code. What's going wrong?

You have several very questionable statements in your code, definitely inspired by C/Java programming practices. For instance,
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
I think you think you calculate the number of items in the list. However, n is not 3:
sys.getsizeof(keys)/sys.getsizeof(keys[0])
3.142857142857143
What you need is this:
n = len(keys)
One more find: the condition elif (r < j) is always True whenever it is reached, because r is in the range between i (inclusive) and j (exclusive). As a result, the elif (c < cost[i][j]) branch is never reached, and the matrix cost is never updated in the loop - that's why you always end up with a 0.
Another suggestion: do not overwrite the built-in function sum(). Your namesake function calculates the sum of all items in a slice of a list:
sum(freq[i:j])

import sys
def optimalSearchTree(keys, freq):
    #Create an auxiliary 2D matrix to store results of subproblems
    n = len(keys)
    cost = [[0 for x in range(n)] for y in range(n)]
    storeRoot = [[0 for i in range(n)] for i in range(n)]
    #For a single key, cost is equal to frequency of the key
    for i in range (0,n):
        cost[i][i] = freq[i]
    # Now we need to consider chains of length 2, 3, ... .
    # L is chain length.
    for L in range (2,n+1):
        for i in range(0,n-L+1):
            j = i + L - 1
            cost[i][j] = sys.maxsize
            for r in range (i,j+1):
                c = (cost[i][r-1] if r > i else 0)
                c += (cost[r+1][j] if r < j else 0)
                c += sum(freq[i:j+1])
                if (c < cost[i][j]):
                    cost[i][j] = c
                    storeRoot[i][j] = r
    return cost[0][n-1], storeRoot
if __name__ == "__main__" :
    keys = [10,12,20]
    freq = [34,8,50]
    print(optimalSearchTree(keys, freq))
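The question also asks for the generated tree. One possible way to recover it from a root table is sketched below. This is my own illustration, not part of the answer above: the names optimal_search_tree and build_tree and the (key, left, right) tuple shape are assumptions, the root of a single-key range is recorded explicitly so reconstruction works for ranges of length 1, and the input is assumed to be non-empty.

```python
import sys


def optimal_search_tree(keys, freq):
    n = len(keys)
    cost = [[0] * n for _ in range(n)]
    root = [[0] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = freq[i]
        root[i][i] = i  # a single key is its own root
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = sys.maxsize
            total = sum(freq[i:j + 1])
            for r in range(i, j + 1):
                c = total
                c += cost[i][r - 1] if r > i else 0
                c += cost[r + 1][j] if r < j else 0
                if c < cost[i][j]:
                    cost[i][j], root[i][j] = c, r
    return cost[0][n - 1], root


def build_tree(keys, root, i, j):
    """Return the subtree for keys[i..j] as (key, left, right), or None if the range is empty."""
    if i > j:
        return None
    r = root[i][j]
    return (keys[r], build_tree(keys, root, i, r - 1), build_tree(keys, root, r + 1, j))


best_cost, root = optimal_search_tree([10, 12, 20], [34, 8, 50])
tree = build_tree([10, 12, 20], root, 0, 2)
print(best_cost)  # 142
print(tree)       # (20, (10, None, (12, None, None)), None)
```

For the sample input, key 20 ends up at the root with 10 as its left child and 12 below 10, which gives the expected cost of 142.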

Longest Common Subsequence of three strings

I've written these functions (which work) to find the longest common subsequence of two strings.
from collections import defaultdict
def lcs_grid(xs, ys):
    grid = defaultdict(lambda: defaultdict(lambda: (0,"")))
    for i,x in enumerate(xs):
        for j,y in enumerate(ys):
            if x == y:
                grid[i][j] = (grid[i-1][j-1][0]+1,'\\')
            else:
                if grid[i-1][j][0] > grid[i][j-1][0]:
                    grid[i][j] = (grid[i-1][j][0],'<')
                else:
                    grid[i][j] = (grid[i][j-1][0],'^')
    return grid
def lcs(xs,ys):
    grid = lcs_grid(xs,ys)
    i, j = len(xs) - 1, len(ys) - 1
    best = []
    length,move = grid[i][j]
    while length:
        if move == '\\':
            best.append(xs[i])
            i -= 1
            j -= 1
        elif move == '^':
            j -= 1
        elif move == '<':
            i -= 1
        length,move = grid[i][j]
    best.reverse()
    return best
Does anybody have a suggestion for modifying the functions so that they can print the longest common subsequence of three strings? I.e. the function call would be: lcs(str1, str2, str3)
So far I have managed it with reduce, but I'd like to have a function that really prints out the subsequence without using reduce.

To find the longest common substring of D strings, you cannot simply use reduce, since the longest common substring of 3 strings does not have to be a substring of the LCS of any two of them. Counterexample:
a = "aaabb"
b = "aaajbb"
c = "cccbb"
In the example, LCS(a,b) = "aaa" and LCS(a, b, c) = "bb". As you can see, "bb" is not a substring of "aaa".
In your case, since you implemented the dynamic programming version, you have to build a D-dimensional grid and adjust the algorithm accordingly.
You might want to look at suffix trees, which should make things faster, see Wikipedia. Also look at this stackoverflow question
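For the subsequence problem asked about in the question, a 3-dimensional grid could look like the following sketch. The name lcs3 and the list-of-lists table are my own choices rather than an adaptation of the defaultdict code in the question, and the table uses O(len(a) * len(b) * len(c)) memory, so it only suits short strings.

```python
def lcs3(a, b, c):
    """Longest common subsequence of three strings via a 3-D DP table."""
    la, lb, lc = len(a), len(b), len(c)
    # dp[i][j][k] = LCS length of a[:i], b[:j], c[:k]
    dp = [[[0] * (lc + 1) for _ in range(lb + 1)] for _ in range(la + 1)]
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            for k in range(1, lc + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    dp[i][j][k] = dp[i - 1][j - 1][k - 1] + 1
                else:
                    dp[i][j][k] = max(dp[i - 1][j][k], dp[i][j - 1][k], dp[i][j][k - 1])
    # Walk back from the far corner to recover one optimal subsequence.
    out = []
    i, j, k = la, lb, lc
    while i and j and k:
        if a[i - 1] == b[j - 1] == c[k - 1]:
            out.append(a[i - 1])
            i, j, k = i - 1, j - 1, k - 1
        elif dp[i - 1][j][k] == dp[i][j][k]:
            i -= 1
        elif dp[i][j - 1][k] == dp[i][j][k]:
            j -= 1
        else:
            k -= 1
    return "".join(reversed(out))


print(lcs3("aaabb", "aaajbb", "cccbb"))  # bb
```

The same pattern extends to D strings with a D-dimensional table, at a cost that grows as the product of the string lengths.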

Finding all possible permutations of a given string in python

I have a string. I want to generate all permutations from that string, by changing the order of characters in it. For example, say:
x='stack'
what I want is a list like this,
l=['stack','satck','sackt'.......]
Currently I am iterating on the list cast of the string, picking 2 letters randomly and transposing them to form a new string, and adding it to set cast of l. Based on the length of the string, I am calculating the number of permutations possible and continuing iterations till set size reaches the limit.
There must be a better way to do this.

The itertools module has a useful method called permutations(). The documentation says:
itertools.permutations(iterable[, r])
Return successive r length permutations of elements in the iterable.
If r is not specified or is None, then r defaults to the length of the
iterable and all possible full-length permutations are generated.
Permutations are emitted in lexicographic sort order. So, if the input
iterable is sorted, the permutation tuples will be produced in sorted
order.
You'll have to join your permuted letters as strings though.
>>> from itertools import permutations
>>> perms = [''.join(p) for p in permutations('stack')]
>>> perms
['stack', 'stakc', 'stcak', 'stcka', 'stkac', 'stkca', 'satck',
'satkc', 'sactk', 'sackt', 'saktc', 'sakct', 'sctak', 'sctka',
'scatk', 'scakt', 'sckta', 'sckat', 'sktac', 'sktca', 'skatc',
'skact', 'skcta', 'skcat', 'tsack', 'tsakc', 'tscak', 'tscka',
'tskac', 'tskca', 'tasck', 'taskc', 'tacsk', 'tacks', 'taksc',
'takcs', 'tcsak', 'tcska', 'tcask', 'tcaks', 'tcksa', 'tckas',
'tksac', 'tksca', 'tkasc', 'tkacs', 'tkcsa', 'tkcas', 'astck',
'astkc', 'asctk', 'asckt', 'asktc', 'askct', 'atsck', 'atskc',
'atcsk', 'atcks', 'atksc', 'atkcs', 'acstk', 'acskt', 'actsk',
'actks', 'ackst', 'ackts', 'akstc', 'aksct', 'aktsc', 'aktcs',
'akcst', 'akcts', 'cstak', 'cstka', 'csatk', 'csakt', 'cskta',
'cskat', 'ctsak', 'ctska', 'ctask', 'ctaks', 'ctksa', 'ctkas',
'castk', 'caskt', 'catsk', 'catks', 'cakst', 'cakts', 'cksta',
'cksat', 'cktsa', 'cktas', 'ckast', 'ckats', 'kstac', 'kstca',
'ksatc', 'ksact', 'kscta', 'kscat', 'ktsac', 'ktsca', 'ktasc',
'ktacs', 'ktcsa', 'ktcas', 'kastc', 'kasct', 'katsc', 'katcs',
'kacst', 'kacts', 'kcsta', 'kcsat', 'kctsa', 'kctas', 'kcast',
'kcats']
If you find yourself troubled by duplicates, try fitting your data into a structure with no duplicates like a set:
>>> perms = [''.join(p) for p in permutations('stacks')]
>>> len(perms)
720
>>> len(set(perms))
360
Thanks to @pst for pointing out that this is not what we'd traditionally think of as a type cast, but more of a call to the set() constructor.

You can get all N! permutations without much code:
def permutations(string, step = 0):
    # if we've gotten to the end, print the permutation
    if step == len(string):
        print("".join(string))
    # everything to the right of step has not been swapped yet
    for i in range(step, len(string)):
        # copy the string (store as array)
        string_copy = [character for character in string]
        # swap the current index with the step
        string_copy[step], string_copy[i] = string_copy[i], string_copy[step]
        # recurse on the portion of the string that has not been swapped yet (now its index will begin with step + 1)
        permutations(string_copy, step + 1)

Here is another way of permuting a string with minimal code, based on backtracking.
We basically create a loop in which we keep swapping two characters at a time.
Inside the loop we have the recursion. Notice that we only print when the index reaches the length of our string.
Example:
ABC
i for our starting point and our recursion param
j for our loop
here is a visual help how it works from left to right top to bottom (is the order of permutation)
the code :
def permute(data, i, length):
    if i==length:
        print(''.join(data) )
    else:
        for j in range(i,length):
            #swap
            data[i], data[j] = data[j], data[i]
            permute(data, i+1, length)
            data[i], data[j] = data[j], data[i]
string = "ABC"
n = len(string)
data = list(string)
permute(data, 0, n)

Stack Overflow users have already posted some strong solutions but I wanted to show yet another solution. This one I find to be more intuitive
The idea is that for a given string: we can recurse by the algorithm (pseudo-code):
permutations = char + permutations(string - char) for char in string
I hope it helps someone!
def permutations(string):
    """
    Create all permutations of a string with non-repeating characters
    """
    permutation_list = []
    if len(string) == 1:
        return [string]
    else:
        for char in string:
            for a in permutations(string.replace(char, "", 1)):
                permutation_list.append(char + a)
    return permutation_list

Here's a simple function to return unique permutations:
def permutations(string):
    if len(string) == 1:
        return string
    recursive_perms = []
    for c in string:
        for perm in permutations(string.replace(c,'',1)):
            recursive_perms.append(c+perm)
    return set(recursive_perms)

itertools.permutations is good, but it doesn't deal nicely with sequences that contain repeated elements. That's because internally it permutes the sequence indices and is oblivious to the sequence item values.
Sure, it's possible to filter the output of itertools.permutations through a set to eliminate the duplicates, but it still wastes time generating those duplicates, and if there are several repeated elements in the base sequence there will be lots of duplicates. Also, using a collection to hold the results wastes RAM, negating the benefit of using an iterator in the first place.
Fortunately, there are more efficient approaches. The code below uses the algorithm of the 14th century Indian mathematician Narayana Pandita, which can be found in the Wikipedia article on Permutation. This ancient algorithm is still one of the fastest known ways to generate permutations in order, and it is quite robust, in that it properly handles permutations that contain repeated elements.
def lexico_permute_string(s):
    ''' Generate all permutations in lexicographic order of string `s`
    This algorithm, due to Narayana Pandita, is from
    https://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order
    To produce the next permutation in lexicographic order of sequence `a`
    1. Find the largest index j such that a[j] < a[j + 1]. If no such index exists,
       the permutation is the last permutation.
    2. Find the largest index k greater than j such that a[j] < a[k].
    3. Swap the value of a[j] with that of a[k].
    4. Reverse the sequence from a[j + 1] up to and including the final element a[n].
    '''
    a = sorted(s)
    n = len(a) - 1
    while True:
        yield ''.join(a)
        #1. Find the largest index j such that a[j] < a[j + 1]
        for j in range(n-1, -1, -1):
            if a[j] < a[j + 1]:
                break
        else:
            return
        #2. Find the largest index k greater than j such that a[j] < a[k]
        v = a[j]
        for k in range(n, j, -1):
            if v < a[k]:
                break
        #3. Swap the value of a[j] with that of a[k].
        a[j], a[k] = a[k], a[j]
        #4. Reverse the tail of the sequence
        a[j+1:] = a[j+1:][::-1]
for s in lexico_permute_string('data'):
    print(s)
output
aadt
aatd
adat
adta
atad
atda
daat
data
dtaa
taad
tada
tdaa
Of course, if you want to collect the yielded strings into a list you can do
list(lexico_permute_string('data'))
or in recent Python versions:
[*lexico_permute_string('data')]
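As a sanity check on how many strings to expect when letters repeat, the count of distinct permutations is n! divided by the factorial of each letter's multiplicity. The helper below is a small sketch of that formula; the name distinct_permutation_count is my own.

```python
from collections import Counter
from math import factorial


def distinct_permutation_count(s):
    # n! orderings of positions, divided by the orderings of each repeated letter.
    count = factorial(len(s))
    for repeats in Counter(s).values():
        count //= factorial(repeats)
    return count


print(distinct_permutation_count("data"))  # 12
```

For 'data' this gives 4!/2! = 12, matching the twelve lines of output listed above.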

Here is another approach, different from what @Adriano and @illerucis posted. It has a better runtime; you can check that yourself by measuring the time:
def removeCharFromStr(str, index):
    endIndex = index if index == len(str) else index + 1
    return str[:index] + str[endIndex:]
# 'ab' -> a + 'b', b + 'a'
# 'abc' -> a + bc, b + ac, c + ab
#          a + cb, b + ca, c + ba
def perm(str):
    if len(str) <= 1:
        return {str}
    permSet = set()
    for i, c in enumerate(str):
        newStr = removeCharFromStr(str, i)
        retSet = perm(newStr)
        for elem in retSet:
            permSet.add(c + elem)
    return permSet
For an arbitrary string "dadffddxcf" it took 1.1336 sec for the permutation library, 9.125 sec for this implementation and 16.357 sec for @Adriano's and @illerucis' versions. Of course you can still optimize it.

Here's a slightly improved version of illerucis's code for returning a list of all permutations of a string s with distinct characters (not necessarily in lexicographic sort order), without using itertools:
def get_perms(s, i=0):
    """
    Returns a list of all (len(s) - i)! permutations t of s where t[:i] = s[:i].
    """
    # To avoid memory allocations for intermediate strings, use a list of chars.
    if isinstance(s, str):
        s = list(s)
    # Base Case: 0! = 1! = 1.
    # Store the only permutation as an immutable string, not a mutable list.
    if i >= len(s) - 1:
        return ["".join(s)]
    # Inductive Step: (len(s) - i)! = (len(s) - i) * (len(s) - i - 1)!
    # Swap in each suffix character to be at the beginning of the suffix.
    perms = get_perms(s, i + 1)
    for j in range(i + 1, len(s)):
        s[i], s[j] = s[j], s[i]
        perms.extend(get_perms(s, i + 1))
        s[i], s[j] = s[j], s[i]
    return perms

See itertools.combinations or itertools.permutations.

Why not simply do:
from itertools import permutations
perms = [''.join(p) for p in permutations(['s','t','a','c','k'])]
print(perms)
print(len(perms))
print(len(set(perms)))
You get no duplicates, as you can see:
['stack', 'stakc', 'stcak', 'stcka', 'stkac', 'stkca', 'satck', 'satkc',
'sactk', 'sackt', 'saktc', 'sakct', 'sctak', 'sctka', 'scatk', 'scakt', 'sckta',
'sckat', 'sktac', 'sktca', 'skatc', 'skact', 'skcta', 'skcat', 'tsack',
'tsakc', 'tscak', 'tscka', 'tskac', 'tskca', 'tasck', 'taskc', 'tacsk', 'tacks',
'taksc', 'takcs', 'tcsak', 'tcska', 'tcask', 'tcaks', 'tcksa', 'tckas', 'tksac',
'tksca', 'tkasc', 'tkacs', 'tkcsa', 'tkcas', 'astck', 'astkc', 'asctk', 'asckt',
'asktc', 'askct', 'atsck', 'atskc', 'atcsk', 'atcks', 'atksc', 'atkcs', 'acstk',
'acskt', 'actsk', 'actks', 'ackst', 'ackts', 'akstc', 'aksct', 'aktsc', 'aktcs',
'akcst', 'akcts', 'cstak', 'cstka', 'csatk', 'csakt', 'cskta', 'cskat', 'ctsak',
'ctska', 'ctask', 'ctaks', 'ctksa', 'ctkas', 'castk', 'caskt', 'catsk', 'catks',
'cakst', 'cakts', 'cksta', 'cksat', 'cktsa', 'cktas', 'ckast', 'ckats', 'kstac',
'kstca', 'ksatc', 'ksact', 'kscta', 'kscat', 'ktsac', 'ktsca', 'ktasc', 'ktacs',
'ktcsa', 'ktcas', 'kastc', 'kasct', 'katsc', 'katcs', 'kacst', 'kacts', 'kcsta',
'kcsat', 'kctsa', 'kctas', 'kcast', 'kcats']
120
120

def permute(seq):
    if not seq:
        yield seq
    else:
        for i in range(len(seq)):
            rest = seq[:i]+seq[i+1:]
            for x in permute(rest):
                yield seq[i:i+1]+x
print(list(permute('stack')))

All possible words from 'stack':
from itertools import permutations
for i in permutations('stack'):
    print(''.join(i))
permutations(iterable, r=None)
Return successive r length permutations of elements in the iterable.
If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.
Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.
Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each permutation.
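The last point is easiest to see with a repeated letter. In this small sketch (the input "aab" is just an example I picked), the two "a" characters sit at different positions, so each distinct string is produced twice:

```python
from itertools import permutations

perms = ["".join(p) for p in permutations("aab")]
print(perms)               # ['aab', 'aba', 'aab', 'aba', 'baa', 'baa']
print(len(perms))          # 6
print(sorted(set(perms)))  # ['aab', 'aba', 'baa']
```

Passing the result through set() removes the repeats, at the cost of generating them first.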

This is a recursive solution that produces all n! orderings and accepts duplicate elements in the string:
import math
def getFactors(root,num):
    sol = []
    # return condition
    if len(num) == 1:
        return [root+num]
    # looping in next iteration
    for i in range(len(num)):
        # Creating a substring with all remaining char but the taken in this iteration
        if i > 0:
            rem = num[:i]+num[i+1:]
        else:
            rem = num[i+1:]
        # Concatenating existing solutions with the solution of this iteration
        sol = sol + getFactors(root + num[i], rem)
    return sol
I validated the solution by checking two things: the number of results must be n!, and the results must not contain duplicates. So:
inpt = "1234"
results = getFactors("",inpt)
if len(results) != math.factorial(len(inpt)) or len(results) != len(set(results)):
    print("Wrong approach")
else:
    print("Correct Approach")

With a recursive approach:
def permute(word):
    if len(word) == 1:
        return [word]
    permutations = permute(word[1:])
    character = word[0]
    result = []
    for p in permutations:
        for i in range(len(p)+1):
            result.append(p[:i] + character + p[i:])
    return result
Running the code:
>>> permute('abc')
['abc', 'bac', 'bca', 'acb', 'cab', 'cba']

Yet another intuitive, recursive solution. The idea is to select a letter as a pivot and then build a word from the remaining letters.
def find_premutations(alphabet):
    words = []
    word =''
    def premute(new_word, alphabet):
        if not alphabet:
            # all letters used: new_word is a complete permutation
            words.append(new_word)
        else:
            for i in range(len(alphabet)):
                premute(new_word=new_word + alphabet[i], alphabet=alphabet[0:i] + alphabet[i+1:])
    premute(word, alphabet)
    return words
# let us try it with 'abc'
a = 'abc'
print("\n".join(find_premutations(a)))
Output:
abc
acb
bac
bca
cab
cba

Here's a really simple generator version:
def find_all_permutations(s, curr=[]):
    if len(s) == 0:
        yield curr
    else:
        for i, c in enumerate(s):
            for combo in find_all_permutations(s[:i]+s[i+1:], curr + [c]):
                yield "".join(combo)
I think it's not so bad!

def f(s):
    # strings shorter than 2 characters have exactly one permutation
    if len(s) < 2:
        return [s]
    if len(s) == 2:
        X = [s, (s[1] + s[0])]
        return X
    else:
        list1 = []
        for i in range(0, len(s)):
            Y = f(s[0:i] + s[i+1: len(s)])
            for j in Y:
                list1.append(s[i] + j)
        return list1
s = input()
z = f(s)
print(z)

Here's a simple and straightforward recursive implementation:
def stringPermutations(s):
    if len(s) < 2:
        yield s
        return
    for pos in range(0, len(s)):
        char = s[pos]
        permForRemaining = list(stringPermutations(s[0:pos] + s[pos+1:]))
        for perm in permForRemaining:
            yield char + perm

from itertools import permutations
perms = [''.join(p) for p in permutations('ABC')]
perms = [''.join(p) for p in permutations('stack')]

def perm(string):
    res = []
    for j in range(0, len(string)):
        if len(string) > 1:
            for i in perm(string[1:]):
                res.append(string[0] + i)
        else:
            return [string]
        string = string[1:] + string[0]
    return res
l = set(perm("abcde"))
This is one way to generate permutations with recursion; you can understand the code easily by tracing it with the strings 'a', 'ab' and 'abc' as input.
You get all N! permutations with this, without duplicates.

Everyone loves the smell of their own code. Just sharing the one I find the simplest:
def get_permutations(word):
    if len(word) == 1:
        yield word
    for i, letter in enumerate(word):
        for perm in get_permutations(word[:i] + word[i+1:]):
            yield letter + perm

This program does not eliminate the duplicates, but I think it is one of the most efficient approaches:
s=input("Enter a string: ")
print("Permutations :\n" + s)
size=len(s)
lis=list(range(0,size))
while(True):
    k=-1
    while(k>-size and lis[k-1]>lis[k]):
        k-=1
    if k>-size:
        p=sorted(lis[k-1:])
        e=p[p.index(lis[k-1])+1]
        lis.insert(k-1,'A')
        lis.remove(e)
        lis[lis.index('A')]=e
        lis[k:]=sorted(lis[k:])
        list2=[]
        for k in lis:
            list2.append(s[k])
        print("".join(list2))
    else:
        break

With Recursion
# swap ith and jth character of string
def swap(s, i, j):
    q = list(s)
    q[i], q[j] = q[j], q[i]
    return ''.join(q)
# recursive function
def _permute(p, s, permutes):
    if p >= len(s) - 1:
        permutes.append(s)
        return
    for i in range(p, len(s)):
        _permute(p + 1, swap(s, p, i), permutes)
# helper function
def permute(s):
    permutes = []
    _permute(0, s, permutes)
    return permutes
# TEST IT
s = "1234"
all_permute = permute(s)
print(all_permute)
With Iterative approach (Using Stack)
# swap ith and jth character of string
def swap(s, i, j):
    q = list(s)
    q[i], q[j] = q[j], q[i]
    return ''.join(q)
# iterative function
def permute_using_stack(s):
    stk = [(0, s)]
    permutes = []
    while len(stk) > 0:
        p, s = stk.pop(0)
        if p >= len(s) - 1:
            permutes.append(s)
            continue
        for i in range(p, len(s)):
            stk.append((p + 1, swap(s, p, i)))
    return permutes
# TEST IT
s = "1234"
all_permute = permute_using_stack(s)
print(all_permute)
With Lexicographically sorted
# swap ith and jth character of string
def swap(s, i, j):
    q = list(s)
    q[i], q[j] = q[j], q[i]
    return ''.join(q)
# finds next lexicographic string if exist otherwise returns -1
def next_lexicographical(s):
    for i in range(len(s) - 2, -1, -1):
        if s[i] < s[i + 1]:
            m = s[i + 1]
            swap_pos = i + 1
            for j in range(i + 1, len(s)):
                if m > s[j] > s[i]:
                    m = s[j]
                    swap_pos = j
            if swap_pos != -1:
                s = swap(s, i, swap_pos)
                s = s[:i + 1] + ''.join(sorted(s[i + 1:]))
                return s
    return -1
# helper function
def permute_lexicographically(s):
    s = ''.join(sorted(s))
    permutes = []
    while True:
        permutes.append(s)
        s = next_lexicographical(s)
        if s == -1:
            break
    return permutes
# TEST IT
s = "1234"
all_permute = permute_lexicographically(s)
print(all_permute)

This code makes sense to me. The logic is to loop through all characters, extract the ith character, permute the remaining characters, and prepend the ith character to each result.
If I'm asked to get all permutations manually for the string ABC, I would start by listing all permutations that begin with A:
A BC
A CB
Then all permutations that begin with B:
B AC
B CA
Then all permutations that begin with C:
C AB
C BA
def permute(s: str):
    n = len(s)
    if n == 1: return [s]
    if n == 2:
        return [s[0]+s[1], s[1]+s[0]]
    permutations = []
    for i in range(0, n):
        current = s[i]
        others = s[:i] + s[i+1:]
        otherPermutations = permute(others)
        for op in otherPermutations:
            permutations.append(current + op)
    return permutations

A simpler solution using itertools.permutations:
from itertools import permutations
def stringPermutate(s1):
    length=len(s1)
    if length < 2:
        # return a set here too, so the return type is consistent
        return {s1}
    perm = [''.join(p) for p in permutations(s1)]
    return set(perm)

def permute_all_chars(chars, begin, end):
    # chars is a list of characters; renamed from "list" to avoid shadowing the built-in
    if (begin == end):
        print(chars)
        return
    for current_position in range(begin, end + 1):
        chars[begin], chars[current_position] = chars[current_position], chars[begin]
        permute_all_chars(chars, begin + 1, end)
        chars[begin], chars[current_position] = chars[current_position], chars[begin]
given_str = 'ABC'
chars = list(given_str)
permute_all_chars(chars, 0, len(chars) - 1)

The itertools module in the standard library has a function for this which is simply called permutations.
import itertools
def minion_game(s):
vow ="aeiou"
lsword=[]
ta=[]
for a in range(1,len(s)+1):
t=list(itertools.permutations(s,a))
lsword.append(t)
for i in range(0,len(lsword)):
for xa in lsword[i]:
if vow.startswith(xa):
ta.append("".join(xa))
print(ta)
minion_game("banana")
