How do I match vowels? - python

I am having trouble with a small component of a bigger program I am in the works on. Basically I need to have a user input a word and I need to print the index of the first vowel.
word= raw_input("Enter word: ")
vowel= "aeiouAEIOU"
for index in word:
if index == vowel:
print index
However, this isn't working. What's wrong?

Try:
word = raw_input("Enter word: ")
vowels = "aeiouAEIOU"
for index,c in enumerate(word):
if c in vowels:
print index
break
for .. in will iterate over actual characters in a string, not indexes. enumerate will return indexes as well as characters and make referring to both easier.

Just to be different:
import re
def findVowel(s):
match = re.match('([^aeiou]*)', s, flags=re.I)
if match:
index = len(match.group(1))
if index < len(s):
return index
return -1 # not found

The same idea using list comprehension:
word = raw_input("Enter word: ")
res = [i for i,ch in enumerate(word) if ch.lower() in "aeiou"]
print(res[0] if res else None)

index == vowel asks if the letter index is equal to the entire vowel list. What you want to know is if it is contained in the vowel list. See some of the other answers for how in works.

One alternative solution, and arguably a more elegant one, is to use the re library.
import re
word = raw_input('Enter a word:')
try:
print re.search('[aeiou]', word, re.I).start()
except AttributeError:
print 'No vowels found in word'
In essence, the re library implements a regular expression matching engine. re.search() searches for the regular expression specified by the first string in the second one and returns the first match. [aeiou] means "match a or e or i or o or u" and re.I tells re.search() to make the search case-insensitive.

for i in range(len(word)):
if word[i] in vowel:
print i
break
will do what you want.
"for index in word" loops over the characters of word rather than the indices. (You can loop over the indices and characters together using the "enumerate" function; I'll let you look that up for yourself.)

Related

Find the last vowel in a string

I cant seem to find the proper way to search a string for the last vowel, and store any unique consonants after that last vowel. I have it set up like this so far.
word = input('Input a word: ')
wordlow = word.lower()
VOWELS = 'aeiou'
last_vowel_index = 0
for i, ch in enumerate(wordlow):
if ch == VOWELS:
last_vowel_index += i
print(wordlow[last_vowel_index + 1:])
I like COLDSPEED's approach, but for completeness, I will suggest a regex based solution:
import re
s = 'sjdhgdfgukgdk'
re.search(r'([^AEIOUaeiou]*)$', s).group(1)
# 'kgdk'
# '[^AEIOUaeiou]' matches a non-vowel (^ being the negation)
# 'X*' matches 0 or more X
# '$' matches the end of the string
# () marks a group, group(1) returns the first such group
See the docs on python regular expression syntax. Further processing is also needed for the uniqueness part ;)
You can reverse your string, and use itertools.takewhile to take everything until the "last" (now the first after reversal) vowel:
from itertools import takewhile
out = ''.join(takewhile(lambda x: x not in set('aeiou'), string[::-1]))[::-1]
print(out)
'ng'
If there are no vowels, the entire string is returned. Another thing to note is that, you should convert your input string to lower case using a str.lower call, otherwise you risk not counting uppercase vowels.
If you want unique consonants only (without any repetition), a further step is needed:
from collections import OrderedDict
out = ''.join(OrderedDict.fromkeys(out).keys())
Here, the OrderedDict lets us keep order while eliminating duplicates, since, the keys must be unique in any dictionary.
Alternatively, if you want consonants that only appear once, use:
from collections import Counter
c = Counter(out)
out = ''.join(x for x in out if c[x] == 1)
You can simply write a function for that:
def func(astr):
vowels = set('aeiouAEIOU')
# Container for all unique not-vowels after the last vowel
unique_notvowels = set()
# iterate over reversed string that way you don't need to reset the index
# every time a vowel is encountered.
for idx, item in enumerate(astr[::-1], 1):
if item in vowels:
# return the vowel, the index of the vowel and the container
return astr[-idx], len(astr)-idx, unique_notvowels
unique_notvowels.add(item)
# In case no vowel is found this will raise an Exception. You might want/need
# a different behavior...
raise ValueError('no vowels found')
For example:
>>> func('asjhdskfdsbfkdes')
('e', 14, {'s'})
>>> func('asjhdskfdsbfkds')
('a', 0, {'b', 'd', 'f', 'h', 'j', 'k', 's'})
It returns the last vowel, the index of the vowel and all unique not-vowels after the last vowel.
In case the vowels should be ordered you need to use an ordered container instead of the set, for example a list (could be much slower) or collections.OrderedDict (more memory expensive but faster than the list).
You can just reverse your string and loop over each letter until you encounter the first vowel:
for i, letter in enumerate(reversed(word)):
if letter in VOWELS:
break
print(word[-i:])
last_vowel will return the last vowel in the word
last_index will give you the last index of this vowel in the input
Python 2.7
input = raw_input('Input a word: ').lower()
last_vowel = [a for a in input if a in "aeiou"][-1]
last_index = input.rfind(last_vowel)
print(last_vowel)
print(last_index)
Python 3.x
input = input('Input a word: ').lower()
last_vowel = [a for a in input if a in "aeiou"][-1]
last_index = input.rfind(last_vowel)
print(last_vowel)
print(last_index)

Python empty string index error

How do I stop the index error that occurs whenever I input an empty string?
s = input("Enter a phrase: ")
if s[0] in ["a","e","i","o","u","A","E","I","O","U"]:
print("an", s)
else:
print("a", s)
You can use the str.startswith() method to test if a string starts with a specific character; the method takes either a single string, or a tuple of strings:
if s.lower().startswith(tuple('aeiou')):
The str.startswith() method doesn't care if s is empty:
>>> s = ''
>>> s.startswith('a')
False
By using str.lower() you can save yourself from having to type out all vowels in both lower and upper case; you can just store vowels into a separate variable to reuse the same tuple everywhere you need it:
vowels = tuple('aeiou')
if s.lower().startswith(vowels):
In that case I'd just include the uppercase characters; you only need to type it out once, after all:
vowels = tuple('aeiouAEIOU')
if s.startswith(vowels):
This will check the boolean value of s first, and only if it's True it will try to get the first character. Since a empty string is boolean False it will never go there unless you have at least a one-character string.
if s and s[0] in ["a","e","i","o","u","A","E","I","O","U"]:
print("an", s)
General solution using try/except:
s = input("Enter a phrase: ")
try:
if s[0].lower() in "aeiou":
print("an", s)
else:
print("a", s)
except IndexError:
# stuff you want to do if string is empty
Another approach:
s = ""
while len(s) == 0:
s = input("Enter a phrase: ")
if s[0].lower() in "aeiou":
print("an", s)
else:
print("a", s)
Even shorter:if s[0:][0] in vowels: but of course this not pass as 'pythonic' I guess. What it does: a slice (variable[from:to]) may be empty without causing an error. We just have to make sure that only the first element is returned in case the input is longer than 1 character.
Edit: no, sorry, this will not work if s=''. You have to use 'if s[0:][0:] in vowels:' but this clearly crosses a line now. Ugly.
Just use
if s:
if s[0] in vowels:
as suggested before.

Python: Find the longest word in a string

I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.
For example: string = "Hello I like cookies". My program should then return "Cookies" and the length 7.
Now the thing is that I am not allowed to use any function from the class String for a full score, and for a full score I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem) and the solution shouldn't have too many for and while statements. The strings contains only letters and blanks and words are separated by one single blank.
Any suggestions? I'm lost i.e. I don't have any code.
Thanks.
EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.
EDIT2: Okay, with your help I think I'm onto something...
def longestword(x):
alist = []
length = 0
for letter in x:
if letter != " ":
length += 1
else:
alist.append(length)
length = 0
return alist
But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
I suppose using lists but not .split() is okay? It just said that functions from "String" weren't allowed or are lists part of strings?
You can try to use regular expressions:
import re
string = "Hello I like cookies"
word_pattern = "\w+"
regex = re.compile(word_pattern)
words_found = regex.findall(string)
if words_found:
longest_word = max(words_found, key=lambda word: len(word))
print(longest_word)
Finding a max in one pass is easy:
current_max = 0
for v in values:
if v>current_max:
current_max = v
But in your case, you need to find the words. Remember this quote (attribute to J. Zawinski):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Besides using regular expressions, you can simply check that the word has letters. A first approach is to go through the list and detect start or end of words:
current_word = ''
current_longest = ''
for c in mystring:
if c in string.ascii_letters:
current_word += c
else:
if len(current_word)>len(current_longest):
current_longest = current_word
current_word = ''
else:
if len(current_word)>len(current_longest):
current_longest = current_word
A final way is to split words in a generator and find the max of what it yields (here I used the max function):
def split_words(mystring):
current = []
for c in mystring:
if c in string.ascii_letters:
current.append(c)
else:
if current:
yield ''.join(current)
max(split_words(mystring), key=len)
Just search for groups of non-whitespace characters, then find the maximum by length:
longest = len(max(re.findall(r'\S+',string), key = len))
For python 3. If both the words in the sentence is of the same length, then it will return the word that appears first.
def findMaximum(word):
li=word.split()
li=list(li)
op=[]
for i in li:
op.append(len(i))
l=op.index(max(op))
print (li[l])
findMaximum(input("Enter your word:"))
It's quite simple:
def long_word(s):
n = max(s.split())
return(n)
IN [48]: long_word('a bb ccc dddd')
Out[48]: 'dddd'
found an error in a previous provided solution, he's the correction:
def longestWord(text):
current_word = ''
current_longest = ''
for c in text:
if c in string.ascii_letters:
current_word += c
else:
if len(current_word)>len(current_longest):
current_longest = current_word
current_word = ''
if len(current_word)>len(current_longest):
current_longest = current_word
return current_longest
I can see imagine some different alternatives. Regular expressions can probably do much of the splitting words you need to do. This could be a simple option if you understand regexes.
An alternative is to treat the string as a list, iterate over it keeping track of your index, and looking at each character to see if you're ending a word. Then you just need to keep the longest word (longest index difference) and you should find your answer.
Regular Expressions seems to be your best bet. First use re to split the sentence:
>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)
\S+ looks for all the non-whitespace characters and puts them in a list:
>>> string
['Hello', 'I', 'like', 'cookies']
Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:
>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']
This method uses only one for loop, doesn't use any methods in the String class, strictly accesses each character only once. You may have to modify it depending on what characters count as part of a word.
s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s+' ':
if c == ' ':
if len(word) > maxLen:
maxWord = word
word = ''
else:
word += c
print "Longest word:", maxWord
print "Length:", len(maxWord)
Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.
I do not want to solve your exercise for you, but here are a few pointers:
Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?
Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?
Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
My proposal ...
import re
def longer_word(sentence):
word_list = re.findall("\w+", sentence)
word_list.sort(cmp=lambda a,b: cmp(len(b),len(a)))
longer_word = word_list[0]
print "The longer word is '"+longer_word+"' with a size of", len(longer_word), "characters."
longer_word("Hello I like cookies")
import re
def longest_word(sen):
res = re.findall(r"\w+",sen)
n = max(res,key = lambda x : len(x))
return n
print(longest_word("Hey!! there, How is it going????"))
Output : there
Here I have used regex for the problem. Variable "res" finds all the words in the string and itself stores them in the list after splitting them.
It uses split() to store all the characters in a list and then regex does the work.
findall keyword is used to find all the desired instances in a string. Here \w+ is defined which tells the compiler to look for all the words without any spaces.
Variable "n" finds the longest word from the given string which is now free of any undesired characters.
Variable "n" uses lambda expressions to define the key len() here.
Variable "n" finds the longest word from "res" which has removed all the non-string charcters like %,&,! etc.
>>>#import regular expressions for the problem.**
>>>import re
>>>#initialize a sentence
>>>sen = "fun&!! time zone"
>>>res = re.findall(r"\w+",sen)
>>>#res variable finds all the words and then stores them in a list.
>>>res
Out: ['fun','time','zone']
>>>n = max(res)
Out: zone
>>>#Here we get "zone" instead of "time" because here the compiler
>>>#sees "zone" with the higher value than "time".
>>>#The max() function returns the item with the highest value, or the item with the highest value in an iterable.
>>>n = max(res,key = lambda x:len(x))
>>>n
Out: time
Here we get "time" because lambda expression discards "zone" as it sees the key is for len() in a max() function.
list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
listLen.append(len(i))
print list1[listLen.index(max(listLen))]
Output - Independence

Finding the number of words with all vowels

I am given a text file that is stored in a list called words_list:
if __name__ = "__main__":
words_file = open('words.txt')
words_list = []
for w in words_file:
w = w.strip().strip('\n')
words_list.append(w)
That's what the list of strings look like (it's a really, really long list of words)
I have to find "all the words" with all of the vowels; so far I have:
def all_vowel(words_list):
count = 0
for w in words_list:
if all_five_vowels(w): # this function just returns true
count = count + 1
if count == 0
print '<None found>'
else
print count
The problem with this is that count adds 1 every time it sees a vowel, whereas I want it to add 1 only if the entire word has all of the vowels.
Simply test if any of your words are a subset of the vowels set:
vowels = set('aeiou')
with open('words.txt') as words_file:
for word in words_file:
word = word.strip()
if vowels.issubset(word):
print word
set.issubset() works on any sequence (including strings):
>>> set('aeiou').issubset('word')
False
>>> set('aeiou').issubset('education')
True
Assuming the word_list variable is an actual list, probably your "all_five_vowels" function is wrong.
This could be an alternative implementation:
def all_five_vowels(word):
vowels = ['a','e','o','i','u']
for letter in word:
if letter in vowels:
vowels.remove(letter)
if len(vowels) == 0:
return True
return False
#Martijn Peters has already posted a solution that is probably the fastest solution in Python. For completeness, here is another good way to solve this in Python:
vowels = set('aeiou')
with open('words.txt') as words_file:
for word in words_file:
word = word.strip()
if all(ch in vowels for ch in word):
print word
This uses the built-in function all() with a generator expression, and it's a handy pattern to learn. This reads as "if all the characters in the word are vowels, print the word." Python also has any() which could be used for checks like "if any character in the word is a vowel, print the word".
More discussion of any() and all() here: "exists" keyword in Python?

How Do I Fix This String Issue in Python?

I am trying to add text with vowels in certain words (that are not consecutive vowels like ie or ei), for example:
Word: 'weird'
Text to add before vowel: 'ib'
Result: 'wibeird'
Thus the text 'ib' was added before the vowel 'e'. Notice how it didn't replace 'i' with 'ib' because when the vowel is consecutive I don't want it to add text.
However, when I do this:
Word: 'dog'
Text to add before vowel: 'ob'
Result: 'doboog'
Correct Result Should Be: 'dobog'
I've been trying to debug my program but I can't seem to figure out the logic in order to make sure it prints 'wibeird' and 'dobog' correctly.
Here is my code, substitute first_syl with 'ob' and word with 'dog' after you run it first with 'weird.
first_syl = 'ib'
word = 'weird'
vowels = "aeiouAEIOU"
diction = "bcdfghjklmnpqrstvwxyz"
empty_str = ""
word_str = ""
ch_str = ""
first_vowel_count = True
for ch in word:
if ch in diction:
word_str += ch
if ch in vowels and first_vowel_count == True:
empty_str += word_str + first_syl + ch
word_str = ""
first_vowel_count = False
if ch in vowels and first_vowel_count == False:
ch_str = ch
if word[-1] not in vowels:
final_str = empty_str + ch_str + word_str
print (final_str)
I am using Python 3.2.3. Also I don't want to use any imported modules, trying to do this to understand the basics of strings and loops in python.
Have you considered regular expressions?
import re
print (re.sub(r'(?<![aeiou])[aeiou]', r'ib\g<0>', 'weird')) #wibeird
print (re.sub(r'(?<![aeiou])[aeiou]', r'ob\g<0>', 'dog')) #dobog
Never use regex when you don't have to. There's a famous quote that goes
Some people, when confronted with a problem, think
“I know, I'll use regular expressions.” Now they have two problems.
This can easily be solved with basic if-then statements. Here's a commented version explaining the logic being used:
first_syl = 'ib' # the characters to be added
word = 'dOg' # the input word
vowels = "aeiou" # instead of a long list of possibilities, we'll use the
# <string>.lower() func. It returns the lowercase equivalent of a
# string object.
first_vowel_count = True # This will tell us if the iterator is at the first vowel
final_str = "" # The output.
for ch in word:
if ch.lower() not in vowels: # If we're at a consonant,
first_vowel_count = True # the next vowel to appear must be the first in
# the series.
elif first_vowel_count: # So the previous "if" statement was false. We're
# at a vowel. This is also the first vowel in the
# series. This means that before appending the vowel
# to output,
final_str += first_syl # we need to first append the vowel-
# predecessor string, or 'ib' in this case.
first_vowel_count = False # Additionally, any vowels following this one cannot
# be the first in the series.
final_str += ch # Finally, we'll append the input character to the
# output.
print(final_str) # "dibOg"

Categories