I'm here to ask help about my program.
I realise a program that raison d'ĂȘtre is to find the most occured four letters string on a x letters bigger string which have been generated randomly.
As example, if you would know the most occured sequence of four letters in 'abcdeabcdef' it's pretty easy to understand that is 'abcd' so the program will return this.
Unfortunately, my program works very slow, I mean, It take 119.7 seconds, for analyze all possibilities and display the results for only a 1000 letters string.
This is my program, right now :
import random
chars = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
string = ''
for _ in range(1000):
string += str(chars[random.randint(0, 25)])
print(string)
number = []
for ____ in range(0,26):
print(____)
for ___ in range(0,26):
for __ in range(0, 26):
for _ in range(0, 26):
test = chars[____] + chars[___] + chars[__] + chars[_]
print('trying :',test, end = ' ')
number.append(0)
for i in range(len(string) -3):
if string[i: i+4] == test:
number[len(number) -1] += 1
print('>> finished')
_max = max(number)
for i in range(len(number)-1):
if number[i] == _max :
j, k, l, m = i, 0, 0, 0
while j > 25:
j -= 26
k += 1
while k > 25:
k -= 26
l += 1
while l > 25:
l -= 26
m += 1
Result = chars[m] + chars[l] + chars[k] + chars[j]
print(str(Result),'occured',_max, 'times' )
I think there is ways to optimize it but at my level, I really don't know. Maybe the structure itself is not the best. Hope you'll gonna help me :D
You only need to loop through your list once to count the 4-letter sequences. You are currently looping n*n*n*n. You can use zip to make a four letter sequence that collects the 997 substrings, then use Counter to count them:
from collections import Counter
import random
chars = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
s = "".join([chars[random.randint(0, 25)] for _ in range(1000)])
it = zip(s, s[1:], s[2:], s[3:])
counts = Counter(it)
counts.most_common(1)
Edit:
.most_common(x) returns a list of the x most common strings. counts.most_common(1) returns a single item list with the tuple of letters and number of times it occurred like; [(('a', 'b', 'c', 'd'), 2)]. So to get a string, just index into it and join():
''.join(counts.most_common(1)[0][0])
Even with your current approach of iterating through every possible 4-letter combination, you can speed up a lot by keeping a dictionary instead of a list, and testing whether the sequence occurs at all first before trying to count the occurrences:
counts = {}
for a in chars:
for b in chars:
for c in chars:
for d in chars:
test = a + b + c + d
print('trying :',test, end = ' ')
if test in s: # if it occurs at all
# then record how often it occurs
counts[test] = sum(1 for i in range(len(s)-4)
if test == s[i:i+4])
The multiple loops can be replaced with itertools.permutations, though this improves readability rather than performance:
length = 4
for sequence in itertools.permutations(chars, length):
test = "".join(sequence)
if test in s:
counts[test] = sum(1 for i in range(len(s)-length) if test == s[i:i+length])
You can then display the results like this:
_max = max(counts.values())
for k, v in counts.items():
if v == _max:
print(k, "occurred", _max, "times")
Provided that the string is shorter or around the same length as 26**4 characters, then it is much faster still to iterate through the string rather than through every combination:
length = 4
counts = {}
for i in range(len(s) - length):
sequence = s[i:i+length]
if sequence in counts:
counts[sequence] += 1
else:
counts[sequence] = 1
This is equivalent to the Counter approach already suggested.
I have a string. I want to generate all permutations from that string, by changing the order of characters in it. For example, say:
x='stack'
what I want is a list like this,
l=['stack','satck','sackt'.......]
Currently I am iterating on the list cast of the string, picking 2 letters randomly and transposing them to form a new string, and adding it to set cast of l. Based on the length of the string, I am calculating the number of permutations possible and continuing iterations till set size reaches the limit.
There must be a better way to do this.
The itertools module has a useful method called permutations(). The documentation says:
itertools.permutations(iterable[, r])
Return successive r length permutations of elements in the iterable.
If r is not specified or is None, then r defaults to the length of the
iterable and all possible full-length permutations are generated.
Permutations are emitted in lexicographic sort order. So, if the input
iterable is sorted, the permutation tuples will be produced in sorted
order.
You'll have to join your permuted letters as strings though.
>>> from itertools import permutations
>>> perms = [''.join(p) for p in permutations('stack')]
>>> perms
['stack', 'stakc', 'stcak', 'stcka', 'stkac', 'stkca', 'satck',
'satkc', 'sactk', 'sackt', 'saktc', 'sakct', 'sctak', 'sctka',
'scatk', 'scakt', 'sckta', 'sckat', 'sktac', 'sktca', 'skatc',
'skact', 'skcta', 'skcat', 'tsack', 'tsakc', 'tscak', 'tscka',
'tskac', 'tskca', 'tasck', 'taskc', 'tacsk', 'tacks', 'taksc',
'takcs', 'tcsak', 'tcska', 'tcask', 'tcaks', 'tcksa', 'tckas',
'tksac', 'tksca', 'tkasc', 'tkacs', 'tkcsa', 'tkcas', 'astck',
'astkc', 'asctk', 'asckt', 'asktc', 'askct', 'atsck', 'atskc',
'atcsk', 'atcks', 'atksc', 'atkcs', 'acstk', 'acskt', 'actsk',
'actks', 'ackst', 'ackts', 'akstc', 'aksct', 'aktsc', 'aktcs',
'akcst', 'akcts', 'cstak', 'cstka', 'csatk', 'csakt', 'cskta',
'cskat', 'ctsak', 'ctska', 'ctask', 'ctaks', 'ctksa', 'ctkas',
'castk', 'caskt', 'catsk', 'catks', 'cakst', 'cakts', 'cksta',
'cksat', 'cktsa', 'cktas', 'ckast', 'ckats', 'kstac', 'kstca',
'ksatc', 'ksact', 'kscta', 'kscat', 'ktsac', 'ktsca', 'ktasc',
'ktacs', 'ktcsa', 'ktcas', 'kastc', 'kasct', 'katsc', 'katcs',
'kacst', 'kacts', 'kcsta', 'kcsat', 'kctsa', 'kctas', 'kcast',
'kcats']
If you find yourself troubled by duplicates, try fitting your data into a structure with no duplicates like a set:
>>> perms = [''.join(p) for p in permutations('stacks')]
>>> len(perms)
720
>>> len(set(perms))
360
Thanks to #pst for pointing out that this is not what we'd traditionally think of as a type cast, but more of a call to the set() constructor.
You can get all N! permutations without much code
def permutations(string, step = 0):
# if we've gotten to the end, print the permutation
if step == len(string):
print "".join(string)
# everything to the right of step has not been swapped yet
for i in range(step, len(string)):
# copy the string (store as array)
string_copy = [character for character in string]
# swap the current index with the step
string_copy[step], string_copy[i] = string_copy[i], string_copy[step]
# recurse on the portion of the string that has not been swapped yet (now it's index will begin with step + 1)
permutations(string_copy, step + 1)
Here is another way of doing the permutation of string with minimal code based on bactracking.
We basically create a loop and then we keep swapping two characters at a time,
Inside the loop we'll have the recursion. Notice,we only print when indexers reaches the length of our string.
Example:
ABC
i for our starting point and our recursion param
j for our loop
here is a visual help how it works from left to right top to bottom (is the order of permutation)
the code :
def permute(data, i, length):
if i==length:
print(''.join(data) )
else:
for j in range(i,length):
#swap
data[i], data[j] = data[j], data[i]
permute(data, i+1, length)
data[i], data[j] = data[j], data[i]
string = "ABC"
n = len(string)
data = list(string)
permute(data, 0, n)
Stack Overflow users have already posted some strong solutions but I wanted to show yet another solution. This one I find to be more intuitive
The idea is that for a given string: we can recurse by the algorithm (pseudo-code):
permutations = char + permutations(string - char) for char in string
I hope it helps someone!
def permutations(string):
"""
Create all permutations of a string with non-repeating characters
"""
permutation_list = []
if len(string) == 1:
return [string]
else:
for char in string:
[permutation_list.append(char + a) for a in permutations(string.replace(char, "", 1))]
return permutation_list
Here's a simple function to return unique permutations:
def permutations(string):
if len(string) == 1:
return string
recursive_perms = []
for c in string:
for perm in permutations(string.replace(c,'',1)):
recursive_perms.append(c+perm)
return set(recursive_perms)
itertools.permutations is good, but it doesn't deal nicely with sequences that contain repeated elements. That's because internally it permutes the sequence indices and is oblivious to the sequence item values.
Sure, it's possible to filter the output of itertools.permutations through a set to eliminate the duplicates, but it still wastes time generating those duplicates, and if there are several repeated elements in the base sequence there will be lots of duplicates. Also, using a collection to hold the results wastes RAM, negating the benefit of using an iterator in the first place.
Fortunately, there are more efficient approaches. The code below uses the algorithm of the 14th century Indian mathematician Narayana Pandita, which can be found in the Wikipedia article on Permutation. This ancient algorithm is still one of the fastest known ways to generate permutations in order, and it is quite robust, in that it properly handles permutations that contain repeated elements.
def lexico_permute_string(s):
''' Generate all permutations in lexicographic order of string `s`
This algorithm, due to Narayana Pandita, is from
https://en.wikipedia.org/wiki/Permutation#Generation_in_lexicographic_order
To produce the next permutation in lexicographic order of sequence `a`
1. Find the largest index j such that a[j] < a[j + 1]. If no such index exists,
the permutation is the last permutation.
2. Find the largest index k greater than j such that a[j] < a[k].
3. Swap the value of a[j] with that of a[k].
4. Reverse the sequence from a[j + 1] up to and including the final element a[n].
'''
a = sorted(s)
n = len(a) - 1
while True:
yield ''.join(a)
#1. Find the largest index j such that a[j] < a[j + 1]
for j in range(n-1, -1, -1):
if a[j] < a[j + 1]:
break
else:
return
#2. Find the largest index k greater than j such that a[j] < a[k]
v = a[j]
for k in range(n, j, -1):
if v < a[k]:
break
#3. Swap the value of a[j] with that of a[k].
a[j], a[k] = a[k], a[j]
#4. Reverse the tail of the sequence
a[j+1:] = a[j+1:][::-1]
for s in lexico_permute_string('data'):
print(s)
output
aadt
aatd
adat
adta
atad
atda
daat
data
dtaa
taad
tada
tdaa
Of course, if you want to collect the yielded strings into a list you can do
list(lexico_permute_string('data'))
or in recent Python versions:
[*lexico_permute_string('data')]
Here is another approach different from what #Adriano and #illerucis posted. This has a better runtime, you can check that yourself by measuring the time:
def removeCharFromStr(str, index):
endIndex = index if index == len(str) else index + 1
return str[:index] + str[endIndex:]
# 'ab' -> a + 'b', b + 'a'
# 'abc' -> a + bc, b + ac, c + ab
# a + cb, b + ca, c + ba
def perm(str):
if len(str) <= 1:
return {str}
permSet = set()
for i, c in enumerate(str):
newStr = removeCharFromStr(str, i)
retSet = perm(newStr)
for elem in retSet:
permSet.add(c + elem)
return permSet
For an arbitrary string "dadffddxcf" it took 1.1336 sec for the permutation library, 9.125 sec for this implementation and 16.357 secs for #Adriano's and #illerucis' version. Of course you can still optimize it.
Here's a slightly improved version of illerucis's code for returning a list of all permutations of a string s with distinct characters (not necessarily in lexicographic sort order), without using itertools:
def get_perms(s, i=0):
"""
Returns a list of all (len(s) - i)! permutations t of s where t[:i] = s[:i].
"""
# To avoid memory allocations for intermediate strings, use a list of chars.
if isinstance(s, str):
s = list(s)
# Base Case: 0! = 1! = 1.
# Store the only permutation as an immutable string, not a mutable list.
if i >= len(s) - 1:
return ["".join(s)]
# Inductive Step: (len(s) - i)! = (len(s) - i) * (len(s) - i - 1)!
# Swap in each suffix character to be at the beginning of the suffix.
perms = get_perms(s, i + 1)
for j in range(i + 1, len(s)):
s[i], s[j] = s[j], s[i]
perms.extend(get_perms(s, i + 1))
s[i], s[j] = s[j], s[i]
return perms
See itertools.combinations or itertools.permutations.
why do you not simple do:
from itertools import permutations
perms = [''.join(p) for p in permutations(['s','t','a','c','k'])]
print perms
print len(perms)
print len(set(perms))
you get no duplicate as you can see :
['stack', 'stakc', 'stcak', 'stcka', 'stkac', 'stkca', 'satck', 'satkc',
'sactk', 'sackt', 'saktc', 'sakct', 'sctak', 'sctka', 'scatk', 'scakt', 'sckta',
'sckat', 'sktac', 'sktca', 'skatc', 'skact', 'skcta', 'skcat', 'tsack',
'tsakc', 'tscak', 'tscka', 'tskac', 'tskca', 'tasck', 'taskc', 'tacsk', 'tacks',
'taksc', 'takcs', 'tcsak', 'tcska', 'tcask', 'tcaks', 'tcksa', 'tckas', 'tksac',
'tksca', 'tkasc', 'tkacs', 'tkcsa', 'tkcas', 'astck', 'astkc', 'asctk', 'asckt',
'asktc', 'askct', 'atsck', 'atskc', 'atcsk', 'atcks', 'atksc', 'atkcs', 'acstk',
'acskt', 'actsk', 'actks', 'ackst', 'ackts', 'akstc', 'aksct', 'aktsc', 'aktcs',
'akcst', 'akcts', 'cstak', 'cstka', 'csatk', 'csakt', 'cskta', 'cskat', 'ctsak',
'ctska', 'ctask', 'ctaks', 'ctksa', 'ctkas', 'castk', 'caskt', 'catsk', 'catks',
'cakst', 'cakts', 'cksta', 'cksat', 'cktsa', 'cktas', 'ckast', 'ckats', 'kstac',
'kstca', 'ksatc', 'ksact', 'kscta', 'kscat', 'ktsac', 'ktsca', 'ktasc', 'ktacs',
'ktcsa', 'ktcas', 'kastc', 'kasct', 'katsc', 'katcs', 'kacst', 'kacts', 'kcsta',
'kcsat', 'kctsa', 'kctas', 'kcast', 'kcats']
120
120
[Finished in 0.3s]
def permute(seq):
if not seq:
yield seq
else:
for i in range(len(seq)):
rest = seq[:i]+seq[i+1:]
for x in permute(rest):
yield seq[i:i+1]+x
print(list(permute('stack')))
All Possible Word with stack
from itertools import permutations
for i in permutations('stack'):
print(''.join(i))
permutations(iterable, r=None)
Return successive r length permutations of elements in the iterable.
If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.
Permutations are emitted in lexicographic sort order. So, if the input iterable is sorted, the permutation tuples will be produced in sorted order.
Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each permutation.
This is a recursive solution with n! which accepts duplicate elements in the string
import math
def getFactors(root,num):
sol = []
# return condition
if len(num) == 1:
return [root+num]
# looping in next iteration
for i in range(len(num)):
# Creating a substring with all remaining char but the taken in this iteration
if i > 0:
rem = num[:i]+num[i+1:]
else:
rem = num[i+1:]
# Concatenating existing solutions with the solution of this iteration
sol = sol + getFactors(root + num[i], rem)
return sol
I validated the solution taking into account two elements, the number of combinations is n! and the result can not contain duplicates. So:
inpt = "1234"
results = getFactors("",inpt)
if len(results) == math.factorial(len(inpt)) | len(results) != len(set(results)):
print("Wrong approach")
else:
print("Correct Approach")
With recursive approach.
def permute(word):
if len(word) == 1:
return [word]
permutations = permute(word[1:])
character = word[0]
result = []
for p in permutations:
for i in range(len(p)+1):
result.append(p[:i] + character + p[i:])
return result
running code.
>>> permute('abc')
['abc', 'bac', 'bca', 'acb', 'cab', 'cba']
Yet another initiative and recursive solution. The idea is to select a letter as a pivot and then create a word.
def find_premutations(alphabet):
words = []
word =''
def premute(new_word, alphabet):
if not alphabet:
words.append(word)
else:
for i in range(len(alphabet)):
premute(new_word=word + alphabet[i], alphabet=alphabet[0:i] + alphabet[i+1:])
premute(word, alphabet)
return words
# let us try it with 'abc'
a = 'abc'
find_premutations(a)
Output:
abc
acb
bac
bca
cab
cba
Here's a really simple generator version:
def find_all_permutations(s, curr=[]):
if len(s) == 0:
yield curr
else:
for i, c in enumerate(s):
for combo in find_all_permutations(s[:i]+s[i+1:], curr + [c]):
yield "".join(combo)
I think it's not so bad!
def f(s):
if len(s) == 2:
X = [s, (s[1] + s[0])]
return X
else:
list1 = []
for i in range(0, len(s)):
Y = f(s[0:i] + s[i+1: len(s)])
for j in Y:
list1.append(s[i] + j)
return list1
s = raw_input()
z = f(s)
print z
Here's a simple and straightforward recursive implementation;
def stringPermutations(s):
if len(s) < 2:
yield s
return
for pos in range(0, len(s)):
char = s[pos]
permForRemaining = list(stringPermutations(s[0:pos] + s[pos+1:]))
for perm in permForRemaining:
yield char + perm
from itertools import permutations
perms = [''.join(p) for p in permutations('ABC')]
perms = [''.join(p) for p in permutations('stack')]
def perm(string):
res=[]
for j in range(0,len(string)):
if(len(string)>1):
for i in perm(string[1:]):
res.append(string[0]+i)
else:
return [string];
string=string[1:]+string[0];
return res;
l=set(perm("abcde"))
This is one way to generate permutations with recursion, you can understand the code easily by taking strings 'a','ab' & 'abc' as input.
You get all N! permutations with this, without duplicates.
Everyone loves the smell of their own code. Just sharing the one I find the simplest:
def get_permutations(word):
if len(word) == 1:
yield word
for i, letter in enumerate(word):
for perm in get_permutations(word[:i] + word[i+1:]):
yield letter + perm
This program does not eliminate the duplicates, but I think it is one of the most efficient approaches:
s=raw_input("Enter a string: ")
print "Permutations :\n",s
size=len(s)
lis=list(range(0,size))
while(True):
k=-1
while(k>-size and lis[k-1]>lis[k]):
k-=1
if k>-size:
p=sorted(lis[k-1:])
e=p[p.index(lis[k-1])+1]
lis.insert(k-1,'A')
lis.remove(e)
lis[lis.index('A')]=e
lis[k:]=sorted(lis[k:])
list2=[]
for k in lis:
list2.append(s[k])
print "".join(list2)
else:
break
With Recursion
# swap ith and jth character of string
def swap(s, i, j):
q = list(s)
q[i], q[j] = q[j], q[i]
return ''.join(q)
# recursive function
def _permute(p, s, permutes):
if p >= len(s) - 1:
permutes.append(s)
return
for i in range(p, len(s)):
_permute(p + 1, swap(s, p, i), permutes)
# helper function
def permute(s):
permutes = []
_permute(0, s, permutes)
return permutes
# TEST IT
s = "1234"
all_permute = permute(s)
print(all_permute)
With Iterative approach (Using Stack)
# swap ith and jth character of string
def swap(s, i, j):
q = list(s)
q[i], q[j] = q[j], q[i]
return ''.join(q)
# iterative function
def permute_using_stack(s):
stk = [(0, s)]
permutes = []
while len(stk) > 0:
p, s = stk.pop(0)
if p >= len(s) - 1:
permutes.append(s)
continue
for i in range(p, len(s)):
stk.append((p + 1, swap(s, p, i)))
return permutes
# TEST IT
s = "1234"
all_permute = permute_using_stack(s)
print(all_permute)
With Lexicographically sorted
# swap ith and jth character of string
def swap(s, i, j):
q = list(s)
q[i], q[j] = q[j], q[i]
return ''.join(q)
# finds next lexicographic string if exist otherwise returns -1
def next_lexicographical(s):
for i in range(len(s) - 2, -1, -1):
if s[i] < s[i + 1]:
m = s[i + 1]
swap_pos = i + 1
for j in range(i + 1, len(s)):
if m > s[j] > s[i]:
m = s[j]
swap_pos = j
if swap_pos != -1:
s = swap(s, i, swap_pos)
s = s[:i + 1] + ''.join(sorted(s[i + 1:]))
return s
return -1
# helper function
def permute_lexicographically(s):
s = ''.join(sorted(s))
permutes = []
while True:
permutes.append(s)
s = next_lexicographical(s)
if s == -1:
break
return permutes
# TEST IT
s = "1234"
all_permute = permute_lexicographically(s)
print(all_permute)
This code makes sense to me. The logic is to loop through all characters, extract the ith character, perform the permutation on the other elements and append the ith character at the beginning.
If i'm asked to get all permutations manually for string ABC. I would start by checking all combinations of element A:
A AB
A BC
Then all combinations of element B:
B AC
B CA
Then all combinations of element C:
C AB
C BA
def permute(s: str):
n = len(s)
if n == 1: return [s]
if n == 2:
return [s[0]+s[1], s[1]+s[0]]
permutations = []
for i in range(0, n):
current = s[i]
others = s[:i] + s[i+1:]
otherPermutations = permute(others)
for op in otherPermutations:
permutations.append(current + op)
return permutations
Simpler solution using permutations.
from itertools import permutations
def stringPermutate(s1):
length=len(s1)
if length < 2:
return s1
perm = [''.join(p) for p in permutations(s1)]
return set(perm)
def permute_all_chars(list, begin, end):
if (begin == end):
print(list)
return
for current_position in range(begin, end + 1):
list[begin], list[current_position] = list[current_position], list[begin]
permute_all_chars(list, begin + 1, end)
list[begin], list[current_position] = list[current_position], list[begin]
given_str = 'ABC'
list = []
for char in given_str:
list.append(char)
permute_all_chars(list, 0, len(list) -1)
The itertools module in the standard library has a function for this which is simply called permutations.
import itertools
def minion_game(s):
vow ="aeiou"
lsword=[]
ta=[]
for a in range(1,len(s)+1):
t=list(itertools.permutations(s,a))
lsword.append(t)
for i in range(0,len(lsword)):
for xa in lsword[i]:
if vow.startswith(xa):
ta.append("".join(xa))
print(ta)
minion_game("banana")