Permutations Python - python

How do I take, for example, this tuple ("A", "E", "L") and generate all possible words without repeating the letters? The result would be 3 words with only one letter, 6 words with two letters and 6 words with 3 letters.
I tried this:
import itertools
def generate(tuplo_letras):
    return [i for i in itertools.permutations(tuplo_letras)]
def final(arg):
    return generate(list(map(''.join, itertools.permutations(arg))))

You can use itertools.permutations and iterate over all the lengths of the permutations you want to cover. Note that permutations takes two arguments: the iterable and the desired length of the permutations:
from itertools import permutations, chain
tpl = ("A", "E", "L")
[''.join(p) for p in chain(*(permutations(tpl, l+1) for l in range(len(tpl))))]
# ['A', 'E', 'L', 'AE', 'AL', 'EA', 'EL', 'LA', 'LE', 'AEL', 'ALE', 'EAL', 'ELA', 'LAE', 'LEA']
If you need them grouped you can nest the comprehensions accordingly:
[[''.join(p) for p in (permutations(tpl, l+1))] for l in range(len(tpl))]
# [['A', 'E', 'L'], ['AE', 'AL', 'EA', 'EL', 'LA', 'LE'], ['AEL', 'ALE', 'EAL', 'ELA', 'LAE', 'LEA']]
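As a quick sanity check on the counts mentioned in the question (3, 6 and 6 words), here is a minimal sketch. The helper name words_by_length is made up for illustration, and math.perm assumes Python 3.8 or newer.

```python
from itertools import permutations
from math import perm  # available from Python 3.8

def words_by_length(letters):
    """Map each length k to the list of k-letter words without repeated letters."""
    # words_by_length is a hypothetical helper, not part of the answer above
    return {
        k: [''.join(p) for p in permutations(letters, k)]
        for k in range(1, len(letters) + 1)
    }

grouped = words_by_length(("A", "E", "L"))
for k, words in grouped.items():
    # perm(n, k) = n! / (n - k)! is the expected number of k-letter words
    print(k, len(words), perm(3, k))
```

The two numbers printed on each line should agree, which confirms that permutations(tpl, k) covers every ordering of k distinct letters exactly once.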

Related

Permutations of two lists but it has to be alternating consonant and vowel

I have the string 'BANANA' (could be any).
I would like to find all possible permutations between characters but it needs to be one consonant and one vowel alternating.
So far, what I've achieved are normal permutations, but I'm struggling with the alternation part.
from itertools import permutations
def get_permutations(string_):  # function name assumed; the original def line was not included
    # Set for avoiding duplicates
    return set([''.join(j) for i in range(1, len(string_) + 1) for j
                in permutations(string_, i)])
The output I would expect is:
['B', 'A', 'N', 'BA', 'AN', 'NA', 'BAN', 'BANA', etc]
But not (since they would be consonant after consonant or vowel after vowel) things like:
['BN', 'NB', 'AA', etc]
In your list comprehension, you can filter out permutations where there are consecutive vowels or consonants.
from itertools import permutations
VOWELS = {"A", "E", "I", "O", "U"}
string_ = "BANANA"
def is_alternating(text):
    def is_vowel(ch):
        return ch in VOWELS
    return all(
        is_vowel(text[index-1]) != is_vowel(text[index])
        for index in range(1, len(text))
    )
perms = set(
    [
        ''.join(j)
        for i in range(1, len(string_) + 1)
        for j in permutations(string_, i)
        if is_alternating(j)
    ]
)
print(perms)
Output
{'ANABA', 'BAN', 'ANANA', 'BANA', 'BANANA', 'AB', 'ANANAB', 'N', 'ABANAN', 'NABA', 'A', 'ANA', 'ABANA', 'ANAN', 'B', 'ANABAN', 'NAB', 'NAN', 'NABANA', 'AN', 'ABA', 'NABAN', 'BA', 'NANAB', 'ABAN', 'NANA', 'ANAB', 'NANABA', 'NA', 'BANAN'}
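Filtering every permutation works, but it generates many candidates only to throw them away. As an alternative (not the answer's method), here is a sketch that builds only alternating words by backtracking over the remaining letter counts. The function name alternating_words is hypothetical.

```python
from collections import Counter

VOWELS = set("AEIOU")

def alternating_words(text):
    """Return all words over the letters of text that alternate vowel/consonant."""
    counts = Counter(text)   # how many copies of each letter are still unused
    results = set()

    def extend(prefix):
        if prefix:
            results.add(prefix)
        for ch in counts:
            if counts[ch] == 0:
                continue  # no copies of this letter left
            if prefix and (ch in VOWELS) == (prefix[-1] in VOWELS):
                continue  # would put two vowels or two consonants side by side
            counts[ch] -= 1
            extend(prefix + ch)
            counts[ch] += 1  # undo the choice before trying the next letter

    extend("")
    return results

print(len(alternating_words("BANANA")))
```

Because it iterates over distinct letters rather than positions, this version never produces the same word twice, so the set is only used as a convenient container.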

finding all possible subsequences in a given string

I have written this piece of code and it prints all substrings of a given string but I want it to print all the possible subsequences.
from itertools import combinations_with_replacement
s = 'MISSISSIPPI'
lst = []
for i,j in combinations_with_replacement(range(len(s)), 2):
    print(s[i:(j+1)])
Use combinations to get subsequences. That's what combinations is for.
from itertools import combinations
def all_subsequences(s):
    out = set()
    for r in range(1, len(s) + 1):
        for c in combinations(s, r):
            out.add(''.join(c))
    return sorted(out)
Example:
>>> all_subsequences('HELLO')
['E', 'EL', 'ELL', 'ELLO', 'ELO', 'EO', 'H', 'HE', 'HEL', 'HELL', 'HELLO', 'HELO',
'HEO', 'HL', 'HLL', 'HLLO', 'HLO', 'HO', 'L', 'LL', 'LLO', 'LO', 'O']
>>> all_subsequences('WORLD')
['D', 'L', 'LD', 'O', 'OD', 'OL', 'OLD', 'OR', 'ORD', 'ORL', 'ORLD', 'R', 'RD',
'RL', 'RLD', 'W', 'WD', 'WL', 'WLD', 'WO', 'WOD', 'WOL', 'WOLD', 'WOR', 'WORD',
'WORL', 'WORLD', 'WR', 'WRD', 'WRL', 'WRLD']
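To see why the set matters, here is a small sketch that compares the number of index choices made by combinations with the number of distinct strings they produce. The helper name count_subsequences is made up for this illustration.

```python
from itertools import combinations

def count_subsequences(s):
    """Return (total index choices, distinct subsequence strings), empty one excluded."""
    lengths = range(1, len(s) + 1)
    # combinations works on positions, so repeated letters give repeated strings
    total = sum(1 for r in lengths for _ in combinations(s, r))
    distinct = len({''.join(c) for r in lengths for c in combinations(s, r)})
    return total, distinct

print(count_subsequences('HELLO'))
print(count_subsequences('WORLD'))
```

For a string of length n the first number is always 2**n - 1. The second is smaller whenever a letter repeats, as with the two L's in 'HELLO'.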
One simple way to do so is to verify if the list you are making already has the case that you are iterating over. If you have already seen it, then skip it, if not, then append it to your list of seen combinations.
from itertools import combinations_with_replacement
s = 'MISSISSIPPI'
lst = []
for i,j in combinations_with_replacement(range(len(s)), 2):
    if s[i:(j+1)] not in lst:
        lst.append(s[i:(j+1)]) # save new combination into list
        print(lst[-1]) # print new combination
To be sure that all cases are covered, it really helps to draw the combinations that the loop will go over. Suppose a generic string whose letters are represented by their positions in the Python sequence, for example 0 to 3.
Here are the numbers generated by "combinations_with_replacement"
00, 01, 02, 03,
11, 12, 13,
22, 23,
33
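The index pairs above are exactly the (start, end) positions of every substring. Checking membership in a list gets slow for long strings, so here is a sketch of the same idea that keeps a set for the "already seen" test while a list preserves first-seen order. The name unique_substrings is an assumption for illustration.

```python
def unique_substrings(s):
    """Return each distinct substring of s once, in first-seen order."""
    seen = set()      # fast membership test
    ordered = []      # keeps the order in which substrings first appear
    for i in range(len(s)):
        for j in range(i, len(s)):   # same (i, j) pairs as in the drawing
            sub = s[i:j + 1]
            if sub not in seen:
                seen.add(sub)
                ordered.append(sub)
    return ordered

print(unique_substrings('ABA'))
```

Note that this still yields substrings (contiguous runs), not subsequences.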

Create DNA Sequences of length n

How can we use recursion in a function to compute all DNA sequences of length n?
For instance if the function is given 2, it returns ['AA', 'AC', 'AT', 'AG', 'CA', 'CC', 'CT', 'CG', 'TA', 'TC', 'TT', 'TG', 'GA', 'GC', 'GT', 'GG']
etc...
itertools.permutations will give all orderings of a given iterable; the second argument r is the length of the permutations returned. Note that it never reuses an element, so it will not produce sequences such as 'AA':
itertools.permutations('ACGT', length)
Here is one way:
def all_seq(n, curr, e, ways):
    """All possible sequences of size n given elements e.
    ARGS
    n: size of sequence
    curr: a list used for constructing sequences
    e: the list of possible elements (could have been a global list instead)
    ways: the final list of sequences
    """
    if len(curr) == n:
        ways.append(''.join(curr))
        return
    for element in e:
        all_seq(n, list(curr) + [element], e, ways)
perms = []
all_seq(2, [], ['A', 'C', 'T', 'G'], perms)
print(perms)
The output:
['AA', 'AC', 'AT', 'AG', 'CA', 'CC', 'CT', 'CG', 'TA', 'TC', 'TT', 'TG', 'GA', 'GC', 'GT', 'GG']
You actually want itertools.product('ACGT', repeat=n). Note that this will grow enormously fast (4^n elements of n length).
If your assignment is to do it recursively, consider how you would get all (n+1)-length options that start with an n-length prefix. The naive recursive option might be rather slow compared to itertools if you need to use it in anger.
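Both suggestions can be sketched side by side. The function names dna_product and dna_recursive are made up here, and the alphabet order "ACTG" is chosen only to match the ordering in the question's expected output.

```python
from itertools import product

def dna_product(n, alphabet="ACTG"):
    """All length-n sequences via itertools.product."""
    return [''.join(p) for p in product(alphabet, repeat=n)]

def dna_recursive(n, alphabet="ACTG"):
    """Same result, built by extending every (n-1)-length prefix by one base."""
    if n == 0:
        return ['']  # one empty sequence to extend
    return [prefix + base
            for prefix in dna_recursive(n - 1, alphabet)
            for base in alphabet]

print(dna_recursive(2))
print(dna_product(2) == dna_recursive(2))
```

Either way the result has 4**n entries, so the list grows quickly with n.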

how to get all possible strings for the alphabet letters in python?

For example, given the alphabet = 'abcd', how can I get this output in Python:
a
aa
b
bb
ab
ba
(...)
iteration by iteration.
I already tried the powerset() function that is found here on stackoverflow,
but that doesn't repeat letters in the same string.
Also, if I want to set a minimum and maximum limit that the string can have, how can I?
For example min=3 and max=4, abc, aaa, aba, ..., aaaa, abca, abcb, ...
You can use combinations_with_replacement from itertools (docs). The function combinations_with_replacement takes an iterable object as its first argument (e.g. your alphabet) and the desired length of the combinations to generate. Since you want strings of different lengths, you can loop over each desired length.
For example:
from itertools import combinations_with_replacement
def get_all_poss_strings(alphabet, min_length, max_length):
    poss_strings = []
    for r in range(min_length, max_length + 1):
        poss_strings += combinations_with_replacement(alphabet, r)
    # combinations_with_replacement returns tuples, so join them into individual strings
    return ["".join(s) for s in poss_strings]
Sample:
alphabet = "abcd"
min_length = 3
max_length = 4
get_all_poss_strings(alphabet, min_length, max_length)
Output:
['aaa', 'aab', 'aac', 'aad', 'abb', 'abc', 'abd', 'acc', 'acd', 'add', 'bbb', 'bbc', 'bbd', 'bcc', 'bcd', 'bdd', 'ccc', 'ccd', 'cdd', 'ddd', 'aaaa', 'aaab', 'aaac', 'aaad', 'aabb', 'aabc', 'aabd', 'aacc', 'aacd', 'aadd', 'abbb', 'abbc', 'abbd', 'abcc', 'abcd', 'abdd', 'accc', 'accd', 'acdd', 'addd', 'bbbb', 'bbbc', 'bbbd', 'bbcc', 'bbcd', 'bbdd', 'bccc', 'bccd', 'bcdd', 'bddd', 'cccc', 'cccd', 'ccdd', 'cddd', 'dddd']
Edit:
If order also matters for your strings (as indicated by having "ab" and "ba"), you can use the following function to get all permutations of all lengths in a given range:
from itertools import combinations_with_replacement, permutations
def get_all_poss_strings(alphabet, min_length, max_length):
    poss_strings = []
    for r in range(min_length, max_length + 1):
        combos = combinations_with_replacement(alphabet, r)
        perms_of_combos = []
        for combo in combos:
            perms_of_combos += permutations(combo)
        poss_strings += perms_of_combos
    # A set removes duplicates; sorting gives a stable, readable order
    return sorted(set("".join(s) for s in poss_strings))
Sample:
alphabet = "abcd"
min_length = 1
max_length = 2
get_all_poss_strings(alphabet, min_length, max_length)
Output:
['a', 'aa', 'ab', 'ac', 'ad', 'b', 'ba', 'bb', 'bc', 'bd', 'c', 'ca', 'cb', 'cc', 'cd', 'd', 'da', 'db', 'dc', 'dd']
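Permuting every combination with replacement and then removing duplicates should give the same strings as itertools.product with repeat=r, at least for an alphabet of distinct characters. Here is a small sketch that checks this; the helper names via_permutations and via_product are assumptions for illustration.

```python
from itertools import combinations_with_replacement, permutations, product

def via_permutations(alphabet, r):
    """Strings of length r: permute each multiset, let the set drop duplicates."""
    return {''.join(p)
            for combo in combinations_with_replacement(alphabet, r)
            for p in permutations(combo)}

def via_product(alphabet, r):
    """Strings of length r: one pass over the Cartesian product, no duplicates."""
    return {''.join(p) for p in product(alphabet, repeat=r)}

for r in (1, 2, 3):
    print(r, via_permutations('abcd', r) == via_product('abcd', r))
```

The product version does less work because it never generates a string more than once.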
You can use the product function of itertools with varying lengths. The result differs in order from the example you give, but this may be what you want. This results in a generator that you can use to get all your desired strings. This code lets you set a minimum and a maximum length of the returned strings. If you do not specify a value for parameter maxlen then the generator is infinite. Be sure you have a way to stop it or you will get an infinite loop.
import itertools
def allcombinations(alphabet, minlen=1, maxlen=None):
    thislen = minlen
    while maxlen is None or thislen <= maxlen:
        for prod in itertools.product(alphabet, repeat=thislen):
            yield ''.join(prod)
        thislen += 1
for c in allcombinations('abcd', minlen=1, maxlen=2):
    print(c)
This example gives the printout which is similar to your first example, though in a different order.
a
b
c
d
aa
ab
ac
ad
ba
bb
bc
bd
ca
cb
cc
cd
da
db
dc
dd
If you really want a full list, just use
list(allcombinations('abcd', minlen=1, maxlen=2))
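When maxlen is left as None, one way to stop the infinite generator is itertools.islice, which takes only the first few items. The sketch below repeats the generator so it runs on its own; the variable name first_six is just for illustration.

```python
import itertools

def allcombinations(alphabet, minlen=1, maxlen=None):
    thislen = minlen
    while maxlen is None or thislen <= maxlen:
        for prod in itertools.product(alphabet, repeat=thislen):
            yield ''.join(prod)
        thislen += 1

# No maxlen given, so the generator never ends; islice stops after six items
first_six = list(itertools.islice(allcombinations('ab'), 6))
print(first_six)
```

Calling list() directly on the unbounded generator would never return, so always bound it with islice, a break, or maxlen.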

How to generate subpeptides (special combinations) from a string representing a cyclic peptide?

Here is my problem: I have a sequence representing a cyclic peptide and I'm trying to create a function that generates all possible subpeptides. A subpeptide is created when bonds between 2 amino acids are broken. For example: for the peptide 'ABCD', its subpeptides would be 'A', 'B', 'C', 'D', 'AB', 'BC', 'CD', 'DA', 'ABC', 'BCD', 'CDA', 'DAB'. Thus, the number of possible subpeptides from a peptide of length n will always be n*(n-1). Note that not all of them are substrings of the peptide ('DA', 'CDA'...).
I've written code that generates combinations. However, there are some excess elements, such as amino acids that are not linked ('AC', 'BD'...). Does anyone have a hint on how I could eliminate those, given that the peptide may have a different length each time the function is called? Here's what I have so far:
def Subpeptides(peptide):
    subpeptides = []
    from itertools import combinations
    for n in range(1, len(peptide)):
        subpeptides.extend(
            [''.join(comb) for comb in combinations(peptide, n)]
        )
    return subpeptides
Here are the results for peptide 'ABCD':
['A', 'B', 'C', 'D', 'AB', 'AC', 'AD', 'BC', 'BD', 'CD', 'ABC', 'ABD', 'ACD', 'BCD']
The order of amino acids is not important, as long as they represent a real sequence of the peptide. For example, 'ABD' is a valid form of 'DAB', since D and A have a bond in the cyclic peptide.
I'm using Python.
It's probably easier to just generate them all:
def subpeptides(peptide):
    l = len(peptide)
    looped = peptide + peptide
    for start in range(0, l):
        for length in range(1, l):
            print(looped[start:start+length])
which gives:
>>> subpeptides("ABCD")
A
AB
ABC
B
BC
BCD
C
CD
CDA
D
DA
DAB
(If you want a list instead of printing, just change print(...) to yield ... and you have a generator.)
All the above does is enumerate the different places the first bond could be broken, and then the different products you would get if the next bond broke after one, two, or three (in this case) acids. looped is just an easy way to avoid the logic of going "round the loop".
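The generator variant mentioned above could look like the following sketch, with print(...) swapped for yield and the loop variable l renamed to n for readability.

```python
def subpeptides(peptide):
    """Yield every subpeptide of a cyclic peptide, excluding the full peptide."""
    n = len(peptide)
    looped = peptide + peptide   # doubling the string handles the wrap-around
    for start in range(n):           # where the first bond is broken
        for length in range(1, n):   # how many acids until the second break
            yield looped[start:start + length]

result = list(subpeptides("ABCD"))
print(result)
print(len(result))
```

For a peptide of length n this yields n*(n-1) pieces, which matches the count given in the question.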
The last term (the full peptide) is missing from the answer above.
You can use the code below:
def subpeptides(peptide):
    l = len(peptide)
    ls = []
    looped = peptide + peptide
    for start in range(0, l):
        for length in range(1, l):
            ls.append(looped[start:start+length])
    ls.append(peptide)
    return ls
You can also use this one:
>>> aa = 'ABCD'
>>> F = []
>>> B = []
>>> for j in range(1, len(aa) + 1, 1):
...     for i in range(0, len(aa), 1):
...         A = str.split((aa * j)[i:i + j])
...         B = B + A
...
>>> C = B[0:len(aa) * len(aa) - len(aa) + 1]
It gives you:
C=['A', 'B', 'C', 'D', 'AB', 'BC', 'CD', 'DA', 'ABC', 'BCD', 'CDA', 'DAB', 'ABCD']
I hope this helps. By the way, I'm doing the Coursera course too; if you would be interested in joining forces, let me know.
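The same length-first ordering can be written as a small function. This is only a sketch: the name subpeptides_by_length and the include_full flag are made up for illustration, and the peptide is assumed to be a plain string.

```python
def subpeptides_by_length(peptide, include_full=True):
    """Subpeptides of a cyclic peptide, shortest first, optionally with the full peptide."""
    n = len(peptide)
    looped = peptide + peptide   # lets slices wrap around the cycle
    result = [looped[start:start + length]
              for length in range(1, n)    # outer loop: piece length
              for start in range(n)]       # inner loop: starting position
    if include_full:
        result.append(peptide)   # the unbroken peptide, added once
    return result

print(subpeptides_by_length('ABCD'))
```

Passing include_full=False gives the n*(n-1) subpeptides from the question without the final 'ABCD'.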
