Find all possible substrings beginning with characters from capturing group - python

I have for example the string BANANA and want to find all possible substrings beginning with a vowel. The result I need looks like this:
"A", "A", "A", "AN", "AN", "ANA", "ANA", "ANAN", "ANANA"
I tried this: re.findall(r"([AIEOU]+\w*)", "BANANA")
but it only finds "ANANA" which seems to be the longest match.
How can I find all the other possible substrings?

s="BANANA"
vowels = 'AIEOU'
sorted(s[i:j] for i, x in enumerate(s) for j in range(i + 1, len(s) + 1) if x in vowels)

This is a simple way of doing it, though I'm sure there's an easier way.
def subs(txt, startswith):
    # Yield every substring of txt whose first character is in startswith
    for i in range(len(txt)):
        for j in range(1, len(txt) - i + 1):
            if txt[i].lower() in startswith.lower():
                yield txt[i:i + j]
s = 'BANANA'
vowels = 'AEIOU'
print(sorted(subs(s, vowels)))

A more pythonic way:
>>> s = 'BANANA'
>>> def grouper(s):
...     return [s[i:i+j] for j in range(1,len(s)+1) for i in range(len(s)-j+1)]
...
>>> vowels = {'A', 'I', 'O', 'U', 'E', 'a', 'i', 'o', 'u', 'e'}
>>> [t for t in grouper(s) if t[0] in vowels]
['A', 'A', 'A', 'AN', 'AN', 'ANA', 'ANA', 'ANAN', 'ANANA']
Benchmark with accepted answer:
from timeit import timeit
s1 = """
sorted(s[i:j] for i, x in enumerate(s) for j in range(i + 1, len(s) + 1) if x in vowels)
"""
s2 = """
def grouper(s):
    return [s[i:i+j] for j in range(1,len(s)+1) for i in range(len(s)-j+1)]
[t for t in grouper(s) if t[0] in vowels]
"""
print('1st:', timeit(stmt=s1,
                     number=1000000,
                     setup="vowels = 'AIEOU'; s = 'BANANA'"))
print('2nd :', timeit(stmt=s2,
                      number=1000000,
                      setup="vowels = {'A', 'I', 'O', 'U', 'E', 'a', 'i', 'o', 'u', 'e'}; s = 'BANANA'"))
Result:
1st: 6.08756995201
2nd : 5.25555992126

As already mentioned in the comments, Regex would not be the right way to go about this.
Try this:
def get_substr(string):
    holder = []
    for ix, elem in enumerate(string):
        if elem.lower() in "aeiou":
            # Collect every substring that starts at this vowel
            for r in range(len(string[ix:])):
                holder.append(string[ix:ix+r+1])
    return holder
print(get_substr("BANANA"))
## ['A', 'AN', 'ANA', 'ANAN', 'ANANA', 'A', 'AN', 'ANA', 'A']
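For completeness, here is a minimal sketch of how a regex can still help, assuming you only use it to locate the starting positions. A zero-width lookahead matches at every vowel without consuming characters, so overlapping starts are all found; the slicing is then done in plain Python. The function name `vowel_substrings` and its `vowels` parameter are made up for this illustration.

```python
import re

def vowel_substrings(text, vowels="AEIOU"):
    # (?=[...]) is a lookahead: it matches the empty string in front of
    # each vowel, so every vowel position is reported once.
    pattern = re.compile("(?=[{}])".format(re.escape(vowels)), re.IGNORECASE)
    result = []
    for match in pattern.finditer(text):
        start = match.start()
        # Every slice from this vowel to each later end position
        for end in range(start + 1, len(text) + 1):
            result.append(text[start:end])
    return result

print(sorted(vowel_substrings("BANANA")))
# ['A', 'A', 'A', 'AN', 'AN', 'ANA', 'ANA', 'ANAN', 'ANANA']
```

This does the same work as the loop-based answers; the regex only replaces the "is this character a vowel" check.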

Related

How to get each letter of word list python

l = ['hello', 'world', 'monday']
for i in range(n):
    word = input()
    l.append(word)
for j in l[0]:
    print(j)
Output : h e l l o
I would like to do this for every word in l.
I want to keep my list intact because I need to get the len() of each word, and I won't know in advance how many words I will get.
I don't know if I'm being clear enough; if you need more information, let me know. Thanks!
def split_into_letters(word):
    return ' '.join(word)
lst = ['hello', 'world', 'monday']
lst_2 = list(map(split_into_letters, lst))
print(lst_2)
You can map each word to a function that splits it into letters
l = ['hello', 'world', 'monday']
list(map(list, l))
#output
[['h', 'e', 'l', 'l', 'o'],
['w', 'o', 'r', 'l', 'd'],
['m', 'o', 'n', 'd', 'a', 'y']]
from itertools import chain
lst = ['hello', 'world', 'monday']
# Print all letters of all words separated by spaces
print(*chain.from_iterable(lst))
# Print all letters of all words separated by spaces,
# with each word on a new line
for word in lst:
    print(*word)
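Since the question also asks for the len() of each word while keeping the list intact, here is a small sketch that returns both pieces of information per word without modifying the input. The helper name `letters_and_lengths` is an assumption for illustration only.

```python
def letters_and_lengths(words):
    # Build (word, length, letters) tuples; the input list is not modified
    return [(word, len(word), list(word)) for word in words]

words = ['hello', 'world', 'monday']
for word, length, letters in letters_and_lengths(words):
    # e.g. "5 h e l l o"
    print(length, *letters)
```

Because a string is already iterable, there is no need to pre-split anything; `list(word)` is only needed if you want the letters as a separate list.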

What's the most effective way to Iterate, while manipulating a list or string?

My goal is to create a code breaker in python. So far I have jumbled up the letters and as a result have a list of individual characters.
#Result of inputting the string 'hello world'
['C', 'V', 'N', 'N', 'H', 'X', 'H', 'K', 'N', 'O']
My aim is to output this as a string with a space: 'CVNNH XHKNO'
I have several options but I'm unsure which one would be best:
should I convert it to a string before manipulating it, or manipulate the list before converting it to a string?
I have the following helpers available from the process so far (generated automatically):
length = [5,5] #list
total_chars = 10 #int
no_of_words = 2 #int
I have converted it to a string CVNNHXHKNO and thought about inserting the space after the 5th letter by calculating a start point[0], mid point[5] and end point[11].
start = 0
mid_point = total_chars - length[0]
print(mid_point)
first_word = message[start:mid_point]
print(first_word)
second_word = message[mid_point:total_chars]
print(second_word)
completed_word = first_word + ' ' + second_word
print(completed_word)
Unfortunately this is a manual approach and doesn't account for cases with 5 or more words. I have attempted to iterate over the original list of individual characters with nested for loops using the list length, but I seem to confuse myself and overthink it.
If you have this as inputs:
l = ['C', 'V', 'N', 'N', 'H', 'X', 'H', 'K', 'N', 'O']
length = [5,5] #list
total_chars = 10 #int
no_of_words = 2 #int
Then you can compute your output as follows:
words = []
pos = 0
for i in length:
    words.append("".join(l[pos:pos+i]))
    pos += i
result = " ".join(words)
print(words)
print(result)
Output:
['CVNNH', 'XHKNO']
CVNNH XHKNO
I do not fully understand your question, but probably what you want is something like:
letters = ['C', 'V', 'N', 'N', 'H', 'X', 'H', 'K', 'N', 'O']
length = [5, 5]
words = []
offset = 0
for i in length:
    words.append(''.join(letters[offset:offset+i]))
    offset += i
string_words = ' '.join(words)
print(string_words)
lst = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
WORD_SIZE = 5
NUM_WORDS = 2  # You could replace this with `len(lst) // WORD_SIZE`
result = ' '.join(
    ''.join(
        lst[i * WORD_SIZE: (i + 1) * WORD_SIZE]
    )
    for i in range(NUM_WORDS)
)
# result = 'abcde fghij'
To be able to use arbitrarily long lists of lengths as input, you could iterate over the length list and join the corresponding letters:
letters = ["A", "S", "I", "M", "P", "L", "E", "T", "E", "S", "T"]
length = [1, 6, 4]
starting_index = 0
for l in length:
    print("".join(letters[starting_index:starting_index+l]))
    starting_index += l
It looks like you just need your length list describing how many letters in each word:
message = ['C', 'V', 'N', 'N', 'H', 'X', 'H', 'K', 'N', 'O']
length = [5,5]
offset = 0
words = []
for size in length:
    words.append(''.join(message[offset:offset+size]))
    offset += size
completed_word = ' '.join(words)
print(completed_word)
Output:
CVNNH XHKNO
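The answers above all track an offset by hand. As an alternative sketch, a single iterator over the letters can be consumed in chunks with itertools.islice, which removes the offset bookkeeping. The function name `join_by_lengths` is hypothetical; the inputs are the ones from the question.

```python
from itertools import islice

def join_by_lengths(letters, lengths):
    # One shared iterator: each islice call takes the next n letters
    it = iter(letters)
    return ' '.join(''.join(islice(it, n)) for n in lengths)

message = ['C', 'V', 'N', 'N', 'H', 'X', 'H', 'K', 'N', 'O']
length = [5, 5]
print(join_by_lengths(message, length))
# CVNNH XHKNO
```

This works for any number of words, as long as the lengths add up to the number of letters; leftover letters are silently ignored in this sketch.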

python consecutive elements to swap the list items [duplicate]

This question already has answers here:
What is the simplest way to swap each pair of adjoining chars in a string with Python?
(20 answers)
Closed 3 years ago.
Here is my input:
['a','b','c','d','e','f']
Output:
['b','a','d','c','f','e']
I tried to swap consecutive items, but the resulting list has space strings between the elements. How can I remove those spaces?
s = list(input().split())
def swap(c, i, j):
    c[i], c[j] = c[j], c[i]
    return ' '.join(c)
result = swap(s, 0, 1)
print(list(result))
current output:- ['b', ' ', 'a', ' ', 'c', ' ', 'd', ' ', 'e', ' ', 'f']
expected output:-['b', 'a', 'c', 'd', 'e','f']
You just need to return c as a list; there is no need to convert it to a string and back into a list:
s = ['a','b','c','d','e','f']
def swap(c, i, j):
    c[i], c[j] = c[j], c[i]
    return c
result = swap(s, 0, 1)
print(result)
Output:
['b', 'a', 'c', 'd', 'e', 'f']
A simple function to swap pairs that does not change the input:
def swap_pairs(list_to_swap):
    s = list_to_swap[:]  # create copy to not touch the original sequence
    for i in range(0, len(s)-1, 2):
        s[i], s[i+1] = s[i+1], s[i]
    return s
s0 = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
s1 = ['a', 'b', 'c', 'd', 'e', 'f']
print(swap_pairs(s0))
print(swap_pairs(s1))
# ['b', 'a', 'd', 'c', 'f', 'e', 'g']
# ['b', 'a', 'd', 'c', 'f', 'e']
### check if s0 and s1 are untouched:
print(s0)
print(s1)
# ['a', 'b', 'c', 'd', 'e', 'f', 'g']
# ['a', 'b', 'c', 'd', 'e', 'f']
If you want to swap pairs in place, i.e. directly change the input, you could shorten the process to:
def swap_pairs(s):
    for i in range(0, len(s)-1, 2):
        s[i], s[i+1] = s[i+1], s[i]
    # return s
s1 = ['a', 'b', 'c', 'd', 'e', 'f']
swap_pairs(s1)
print(s1)
# ['b', 'a', 'd', 'c', 'f', 'e']
I think it's a matter of taste whether a return statement should be added here. I'd consider it clearer not to return anything, since it is not logically needed. Either way, be aware of variable scope.
This is the problem: you're joining on a space. Change it to the following:
def swap(c, i, j):
    c[i], c[j] = c[j], c[i]
    return ''.join(c)
For your output you could also do the following:
l = [x for x in [your output list] if x!= ' ']
or
l = [x for x in [your output list] if len(x.strip()) > 0]
Try returning only c and using recursion to swap all elements of the list; then you will get the expected output. Check the code below.
Output of the code below: ['b','a','d','c','f','e']
s = ['a','b','c','d','e','f']
def swap(c, i, j):
    # Swap the pair at (i, j), then recurse on the next pair while j is in range
    if j < len(c):
        c[i], c[j] = c[j], c[i]
        swap(c, i+2, j+2)
    return c
result = swap(s, 0, 1)
print(list(result))
If you only want the output ['b','a','c','d','e','f'], there is no need for recursion; just return c. Check the code below:
s = ['a','b','c','d','e','f']
def swap(c, i, j):
    c[i], c[j] = c[j], c[i]
    return c
result = swap(s, 0, 1)
print(list(result))
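As another option not mentioned in the answers above, adjacent pairs can be swapped with extended slice assignment instead of a loop. This is only a sketch; the name `swap_pairs_sliced` is made up, and the trailing element of an odd-length list is left where it is, matching the loop-based swap_pairs answer.

```python
def swap_pairs_sliced(items):
    swapped = list(items)  # copy so the input is left untouched
    # Only the first even_end elements form complete pairs
    even_end = len(swapped) - len(swapped) % 2
    # Right-hand side is evaluated first, then both slices are assigned
    swapped[0:even_end:2], swapped[1:even_end:2] = (
        swapped[1:even_end:2],
        swapped[0:even_end:2],
    )
    return swapped

print(swap_pairs_sliced(['a', 'b', 'c', 'd', 'e', 'f']))
# ['b', 'a', 'd', 'c', 'f', 'e']
print(swap_pairs_sliced(['a', 'b', 'c', 'd', 'e', 'f', 'g']))
# ['b', 'a', 'd', 'c', 'f', 'e', 'g']
```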

Splitting a python list into smaller lists at spaces

I have a list which consists of letters and spaces:
s = ['a','b',' ',' ','b','c',' ','d','e','f','g','h',' ','i','j']
I need to split it into smaller individual lists:
s=[['a','b'],['b','c'],['d','e','f','g','h'],['i','j']]
I am new to Python.
The entire code:
#To get the longest alphabetical substring from a given string
s = input("Enter any string: ")
alpha_string = []
for i in range(len(s)-1):  #if length is 5: 0,1,2,3
    if(s[i] <= s[i+1]):
        if i == len(s)-2:
            alpha_string.append(s[i])
            alpha_string.append(s[i+1])
        else:
            alpha_string.append(s[i])
    if(s[i] > s[i+1] and s[i-1] <= s[i]):
        alpha_string.append(s[i])
        alpha_string.append(" ")
    if(s[i] > s[i+1] and s[i-1] > s[i]):
        alpha_string.append(" ")
print(alpha_string)
#Getting the position of each space in the list
position = []
for j in range(len(alpha_string)):
    if alpha_string[j] == " ":
        position.append([j])
print(position)
#Using the position of each space to create slices into the list
start = 0
final_string = []
for k in range(len(position)):
    final_string.append(alpha_string[start:position[k]])
    temp = position[k]
    start = temp
print(final_string)
Try a list comprehension as follows
print([list(i) for i in ''.join(s).split(' ') if i != ''])
[['a', 'b'], ['b', 'c'], ['d', 'e', 'f', 'g', 'h'], ['i', 'j']]
A generator works well here:
s = ['a','b',' ',' ','b','c',' ','d','e','f','g','h',' ','i','j']
def generator_approach(list_):
    list_s=[]
    for i in list_:
        if i==' ':
            # A space closes the current group, if there is one
            if list_s:
                yield list_s
                list_s=[]
        else:
            list_s.append(i)
    # Yield the last group unless the input ended with a space
    if list_s:
        yield list_s
closure=generator_approach(s)
print(list(closure))
output:
[['a', 'b'], ['b', 'c'], ['d', 'e', 'f', 'g', 'h'], ['i', 'j']]
Or simply in one line, result = [list(item) for item in ''.join(s).split()]
This is one functional way.
s = ['a','b',' ',' ','b','c',' ','d','e','f','g','h',' ','i','j']
res = list(map(list, ''.join(s).split()))
# [['a', 'b'], ['b', 'c'], ['d', 'e', 'f', 'g', 'h'], ['i', 'j']]
from itertools import groupby
s = ['a','b',' ',' ','b','c',' ','d','e','f','g','h',' ','i','j']
t = [list(g) for k, g in groupby(s, str.isspace) if not k]
print(t)
OUTPUT
[['a', 'b'], ['b', 'c'], ['d', 'e', 'f', 'g', 'h'], ['i', 'j']]
This doesn't require the strings to be single letter like many of the join() and split() solutions:
>>> from itertools import groupby
>>>
>>> s = ['abc','bcd',' ',' ','bcd','cde',' ','def','efg','fgh','ghi','hij',' ','ijk','jkl']
>>>
>>> [list(g) for k, g in groupby(s, str.isspace) if not k]
[['abc', 'bcd'], ['bcd', 'cde'], ['def', 'efg', 'fgh', 'ghi', 'hij'], ['ijk', 'jkl']]
>>>
I can never pass up an opportunity to (ab)use groupby()
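The comment at the top of the question's code says the real goal is the longest alphabetical substring. If that is all you need, the intermediate list of letters and spaces can be skipped entirely. Below is a rough sketch under that assumption; the function name `longest_alphabetical_substring` is hypothetical, and ties are resolved in favour of the earliest run.

```python
def longest_alphabetical_substring(text):
    if not text:
        return ""
    best = current = text[0]
    # Compare each character with the one before it
    for prev, ch in zip(text, text[1:]):
        if prev <= ch:
            current += ch   # still in alphabetical order: extend the run
        else:
            current = ch    # order broken: start a new run
        if len(current) > len(best):
            best = current
    return best

print(longest_alphabetical_substring("azcbobobegghakl"))
# beggh
```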

Is there a function in python to split a word into a list? [duplicate]

This question already has answers here:
How do I split a string into a list of characters?
(15 answers)
Closed 2 years ago.
Is there a function in python to split a word into a list of single letters? e.g:
s = "Word to Split"
to get
wordlist = ['W', 'o', 'r', 'd', ' ', 't', 'o', ' ', 'S', 'p', 'l', 'i', 't']
>>> list("Word to Split")
['W', 'o', 'r', 'd', ' ', 't', 'o', ' ', 'S', 'p', 'l', 'i', 't']
The easiest way is probably just to use list(), but there is at least one other option as well:
s = "Word to Split"
wordlist = list(s) # option 1,
wordlist = [ch for ch in s] # option 2, list comprehension.
They should both give you what you need:
['W','o','r','d',' ','t','o',' ','S','p','l','i','t']
As stated, the first is likely the most preferable for your example but there are use cases that may make the latter quite handy for more complex stuff, such as if you want to apply some arbitrary function to the items, such as with:
[doSomethingWith(ch) for ch in s]
The list function will do this
>>> list('foo')
['f', 'o', 'o']
Abuse of the rules, same result:
(x for x in 'Word to split')
Actually an iterator, not a list. But it's likely you won't really care.
text = "just trying out"
word_list = []
for i in range(len(text)):
    word_list.append(text[i])
print(word_list)
Output:
['j', 'u', 's', 't', ' ', 't', 'r', 'y', 'i', 'n', 'g', ' ', 'o', 'u', 't']
The easiest option is to just use the list() function. However, if you don't want to use it or it does not work for some bizarre reason, you can always use this method.
word = 'foo'
splitWord = []
for letter in word:
    splitWord.append(letter)
print(splitWord) #prints ['f', 'o', 'o']
def count():
    # Renamed from `list` to avoid shadowing the built-in
    text = 'oixfjhibokxnjfklmhjpxesriktglanwekgfvnk'
    word_list = []
    # dict = {}
    # Split the string into a list of single characters
    for i in range(len(text)):
        word_list.append(text[i])
    # word_list1 = sorted(word_list)
    # Bubble sort the characters in place
    for i in range(len(word_list) - 1, 0, -1):
        for j in range(i):
            if word_list[j] > word_list[j + 1]:
                temp = word_list[j]
                word_list[j] = word_list[j + 1]
                word_list[j + 1] = temp
    print("final count of arrival of each letter is : \n", dict(map(lambda x: (x, word_list.count(x)), word_list)))
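The last snippet splits the string, sorts it by hand, and then counts each letter with repeated .count() calls. If the per-letter count is what you are after, the standard library's collections.Counter does the counting in one pass. This is only a sketch; the helper name `letter_counts` is made up for illustration.

```python
from collections import Counter

def letter_counts(text):
    # Counter iterates the string character by character;
    # sorting the items gives the letters in alphabetical order
    return dict(sorted(Counter(text).items()))

print(letter_counts("banana"))
# {'a': 3, 'b': 1, 'n': 2}
```

A string is iterated one character at a time, so no explicit split into a list is needed before counting.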
