How Do I Fix This String Issue in Python?

I am trying to insert some text before the vowels in a word, but only before the first vowel of a run of consecutive vowels (such as ie or ei). For example:
Word: 'weird'
Text to add before vowel: 'ib'
Result: 'wibeird'
Thus the text 'ib' was added before the vowel 'e'. Notice that it was not also added before 'i': when vowels are consecutive, I only want the text added once.
However, when I do this:
Word: 'dog'
Text to add before vowel: 'ob'
Result: 'doboog'
Correct Result Should Be: 'dobog'
I've been trying to debug my program but I can't seem to figure out the logic in order to make sure it prints 'wibeird' and 'dobog' correctly.
Here is my code. Run it first with 'weird', then substitute 'ob' for first_syl and 'dog' for word.
first_syl = 'ib'
word = 'weird'
vowels = "aeiouAEIOU"
diction = "bcdfghjklmnpqrstvwxyz"
empty_str = ""
word_str = ""
ch_str = ""
first_vowel_count = True
for ch in word:
    if ch in diction:
        word_str += ch
    if ch in vowels and first_vowel_count == True:
        empty_str += word_str + first_syl + ch
        word_str = ""
        first_vowel_count = False
    if ch in vowels and first_vowel_count == False:
        ch_str = ch
if word[-1] not in vowels:
    final_str = empty_str + ch_str + word_str
print (final_str)
I am using Python 3.2.3. Also I don't want to use any imported modules, trying to do this to understand the basics of strings and loops in python.

Have you considered regular expressions?
import re
print (re.sub(r'(?<![aeiou])[aeiou]', r'ib\g<0>', 'weird')) #wibeird
print (re.sub(r'(?<![aeiou])[aeiou]', r'ob\g<0>', 'dog')) #dobog

Never use regex when you don't have to. There's a famous quote that goes
Some people, when confronted with a problem, think
“I know, I'll use regular expressions.” Now they have two problems.
This can easily be solved with basic if-then statements. Here's a commented version explaining the logic being used:
first_syl = 'ib'  # the characters to be added
word = 'dOg'      # the input word
vowels = "aeiou"  # instead of a long list of possibilities, we'll use the
                  # <string>.lower() func. It returns the lowercase equivalent of a
                  # string object.
first_vowel_count = True  # This will tell us if the iterator is at the first vowel
final_str = ""            # The output.
for ch in word:
    if ch.lower() not in vowels:   # If we're at a consonant,
        first_vowel_count = True   # the next vowel to appear must be the first in
                                   # the series.
    elif first_vowel_count:        # So the previous "if" statement was false. We're
                                   # at a vowel. This is also the first vowel in the
                                   # series. This means that before appending the vowel
                                   # to output,
        final_str += first_syl     # we need to first append the vowel-
                                   # predecessor string, or 'ib' in this case.
        first_vowel_count = False  # Additionally, any vowels following this one cannot
                                   # be the first in the series.
    final_str += ch                # Finally, we'll append the input character to the
                                   # output.
print(final_str)                   # "dibOg"
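To check both words from the question in one run, the same loop can be wrapped in a function. This is only a sketch: the function name add_before_vowel_runs and its parameters are made up for illustration and are not part of the answer above.

```python
def add_before_vowel_runs(word, prefix, vowels="aeiou"):
    """Insert `prefix` before the first vowel of every run of consecutive vowels."""
    result = ""
    at_run_start = True  # True until a vowel of the current run has been consumed
    for ch in word:
        if ch.lower() not in vowels:
            at_run_start = True    # a consonant ends any vowel run
        elif at_run_start:
            result += prefix       # first vowel of a run: emit the prefix first
            at_run_start = False
        result += ch               # the original character is always kept
    return result


print(add_before_vowel_runs("weird", "ib"))  # wibeird
print(add_before_vowel_runs("dog", "ob"))    # dobog
```

Keeping the logic in a function makes it easy to try further words without editing the variables at the top of the script.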

Related

How do I send a character from a string that is NOT a letter or a number to the end of the string?

I am doing a Pig Latin code in which the following words are supposed to return the following responses:
"computer" == "omputercay"
"think" == "inkthay"
"algorithm" == "algorithmway"
"office" == "officeway"
"Computer" == "Omputercay"
"Science!" == "Iencescay!"
However, for the last word, my code does not push the '!' to the end of the string. What is the code that will make this happen?
All of them return the correct word apart from the last which returns "Ience!Scay!"
def pigLatin(word):
    vowel = ("a","e","i","o","u")
    first_letter = word[0]
    if first_letter in vowel:
        return word +'way'
    else:
        l = len(word)
        i = 0
        while i < l:
            i = i + 1
            if word[i] in vowel:
                x = i
                new_word = word[i:] + word[:i] + "ay"
                if word[0].isupper():
                    new_word = new_word.title()
                return new_word

For simplicity, how about checking whether the word ends with an exclamation point (!)? If it does, remove it, and add it back when you are done. So instead of returning immediately, break out of the loop and then append the ! at the end (if you found one at the beginning).
def pigLatin(word):
    vowel = ("a","e","i","o","u")
    first_letter = word[0]
    if first_letter in vowel:
        return word +'way'
    else:
        hasExlamation = False
        if word[-1] == '!':
            word = word[:-1] # removes last letter
            hasExlamation = True
        l = len(word)
        i = 0
        while i < l:
            i = i + 1
            if word[i] in vowel:
                x = i
                new_word = word[i:] + word[:i] + "ay"
                if word[0].isupper():
                    new_word = new_word.title()
                break # do not return just break out of the `while` loop
        if hasExlamation:
            new_word += "!" # same as new_word = new_word + "!"
        return new_word
That way it does not treat ! as a normal letter, and the output is Iencescay!. You can of course handle any other trailing character in a similar way:
specialCharacters = ["!"] # define this outside the function
def pigLatin(word):
    # all of the code above, but test the last character against the list
    if word[-1] in specialCharacters:
        hasSpecialCharacter = True
    # then you can continue the same way
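One way to generalize this is to strip every trailing punctuation character before translating and re-attach it afterwards. The sketch below assumes that "special characters" means the characters in string.punctuation and that they only occur at the end of the word. The helper names split_trailing_punctuation and pig_latin_word are hypothetical, not taken from the answer.

```python
from string import punctuation

VOWELS = "aeiou"


def split_trailing_punctuation(word):
    """Return (core, tail), where tail holds any punctuation at the end of word."""
    core = word.rstrip(punctuation)
    return core, word[len(core):]


def pig_latin_word(word):
    core, tail = split_trailing_punctuation(word)
    if not core:
        return word  # nothing but punctuation: leave it unchanged
    if core[0].lower() in VOWELS:
        return core + "way" + tail
    # Index of the first vowel; if there is none, nothing is moved and "ay" is appended.
    first_vowel = next(
        (i for i, ch in enumerate(core) if ch.lower() in VOWELS), len(core)
    )
    result = core[first_vowel:] + core[:first_vowel] + "ay"
    if core[0].isupper():
        result = result.title()
    return result + tail  # put the punctuation back at the very end


print(pig_latin_word("Science!"))  # Iencescay!
print(pig_latin_word("computer"))  # omputercay
```

Using next() with a default also avoids running past the end of a word that contains no vowel at all.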

Regular expressions to the rescue. A regex pattern with word boundaries will make your life much easier in this case. A word boundary is exactly what it sounds like: it indicates the start or end of a word, and is represented in the pattern with \b. In your case, there is such a boundary between the final e of Science and the !. The "word" itself consists, for ASCII input, of any character in the set a-z, A-Z, 0-9 or underscore, and is represented by \w in the pattern. The + means one or more \w characters.
So, if the pattern is r"\b\w+\b", this will match any word (consisting of any of a-zA-Z0-9_), with leading or succeeding word boundaries.
import re
pattern = r"\b\w+\b"
sentence = "computer think algorithm office Computer Science!"
print(re.findall(pattern, sentence))
Output:
['computer', 'think', 'algorithm', 'office', 'Computer', 'Science']
>>>
Here, we're using re.findall to get a list of all substrings that matched the pattern. Notice, no whitespace or punctuation is included.
Let's introduce re.sub, which takes a pattern to look for, a string to look through, and another string with which to replace any match it finds. Instead of a replacement-string, you can instead pass in a function. This function must take a match object as a parameter, and must return a string with which to replace the current match.
import re
pattern = r"\b\w+\b"
sentence = "computer think algorithm office Computer Science!"
def replace(match):
    return "*" * len(match.group())
print(re.sub(pattern, replace, sentence))
Output:
******** ***** ********* ****** ******** *******!
>>>
That's just for demonstration purposes.
Let's change gears for a second:
from string import ascii_letters as alphabet
print(alphabet)
Output:
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
>>>
That's handy for creating a string containing only consonants:
from string import ascii_letters as alphabet
consonants = "".join(set(alphabet) ^ set("aeiouAEIOU"))
print(consonants)
Output:
nptDPbHvsxKNWdYyrTqVQRlBCZShzgGjfkJMLmFXwc
>>>
We've taken the symmetric difference (^) between the set of all alpha-characters and the set of only vowels. Because every vowel is also in the alphabet, this is the same as the plain difference and yields the set of only consonants. Notice that the order of the characters is not preserved in a set, but that doesn't matter in our case, since we'll effectively be treating this string as a set and testing for membership (if a character is in this string, it must be a consonant; the order does not matter).
Let's take advantage of this, and modify our pattern from earlier. Let's add two capturing groups - the first will capture any leading consonants (if they exist), the second will capture all remaining alpha characters (consonants or vowels) before the terminating word boundary:
import re
from string import ascii_letters as alphabet
consonants = "".join(set(alphabet) ^ set("aeiouAEIOU"))
pattern = fr"\b([{consonants}]*)(\w+)\b"
word = "computer"
match = re.match(pattern, word)
if match is not None:
    print(f"Group one is \"{match.group(1)}\"")
    print(f"Group two is \"{match.group(2)}\"")
Output:
Group one is "c"
Group two is "omputer"
>>>
As you can see, the first group captured c, and the second group captured omputer. Separating the match into two groups will be useful later when we construct the pig-latin translation. We can get even cuter by naming our capturing groups. This isn't required, but it will make things a bit easier to read later on:
pattern = fr"\b(?P<prefix>[{consonants}]*)(?P<rest>\w+)\b"
Now, the first capturing group is named prefix, and can be accessed via match.group("prefix"), rather than match.group(1). The second capturing group is named rest, and can be accessed via match.group("rest") instead of match.group(2).
Putting it all together:
import re
from string import ascii_letters as alphabet
consonants = "".join(set(alphabet) ^ set("aeiouAEIOU"))
pattern = fr"\b(?P<prefix>[{consonants}]*)(?P<rest>\w+)\b"
sentence = "computer think algorithm office Computer Science!"
def to_pig_latin(match):
    rest = match.group("rest")
    prefix = match.group("prefix")
    result = rest + prefix
    if len(prefix) == 0:
        # if the 'prefix' capturing group was empty
        # the word must have started with a vowel
        # so, the suffix is 'way'
        result += "way"
        # that also means we need to check if the first character...
        # ... (which must be in 'rest') was upper-case.
        if rest[0].isupper():
            result = result.title()
    else:
        result += "ay"
        if prefix[0].isupper():
            result = result.title()
    return result
print(re.sub(pattern, to_pig_latin, sentence))
Output:
omputercay inkthay algorithmway officeway Omputercay Iencescay!
>>>
That was the verbose version. The definition of to_pig_latin can be shortened to:
def to_pig_latin(match):
    rest = match.group("rest")
    prefix = match.group("prefix")
    return (str, str.title)[(prefix or rest)[0].isupper()](rest + prefix + "way"[bool(prefix):])

How to take non alpha characters out of a string and put them back into the string in the same order?

I'm trying to make a Pig Latin translator, but I have an issue with my code where it doesn't work properly when a word such as "hi" or /chair/ is input. This is because I need my code to detect that the input has a non-alpha character, take it out of the string, and put it back in when it's done changing the string. I am struggling to make this, though.
# Pig Latin 11/11/20
#!/usr/bin/env python3
vowels = ("A", "E", "I", "O", "U")
message = input("Input text to be translated to Pig Latin\n")
message = message.split()
not_alpha = {}
new_message = []
for word in message:
This commented-out section is how I tried to solve the problem. Before the word went through the editing process, it would pass through here, where the non-alpha characters would be removed and placed in a dictionary called not_alpha, with the character as the key and its index in the string as the value. Then, at the end, I would loop through every letter in word and reconstruct the word with all the non-alpha characters in order.
    # for letter in word:
    #     if not letter.isalpha():
    #         not_alpha[letter] = word.index(letter)
    # word = word
    # for k in not_alpha.keys():
    #     word.replace(k, "")
    letter_editing = word[0]
    if word.isalpha():
        if letter_editing.upper() in vowels:
            word += "yay"
        else:
            letter_editing = word[0:2]
            if letter_editing.upper() in vowels:
                word = word[1:] + word[0] + "ay"
            else:
                word = word[2:] + word[0:2] + "ay"
    # for letter in word:
    #     if word.index(letter) in not_alpha.values():

While I am not sure of all the Pig Latin rules you need to apply, from what I can see you are only applying two:
Rule 1 - First letter is a consonant: move the first letter to the end of the word and add ay.
Rule 2 - First letter is a vowel: simply add ay to the end of the word.
Given these two rules and the following observations:
The input message is a stream of alphanumeric characters, punctuation characters and whitespace characters of length L.
The start of a word within the message is delineated by one or more punctuation or whitespace characters preceding the word.
The end of a word is delineated by either a punctuation character, a whitespace character or the end of the message.
You can accomplish the translation as follows:
from string import whitespace, punctuation
vowels = 'aeiou'
message = "Oh! my what a beautiful day for the fox to jump the fence!"
output = "" #The output buffer
temp_buf = "" #Temp storage for latin add-ons
word_found = False #Flag to identify a word has been found
cp = 0 # Character pointer to letter of interest in message
while cp < len(message):
    ch = message[cp]
    if whitespace.find(ch) >= 0 or punctuation.find(ch) >= 0:
        word_found = False
        if temp_buf:
            output += temp_buf
            temp_buf = ""
        output += ch
    else:
        if word_found:
            output += ch
        else:
            word_found = True
            if vowels.find(ch.lower()) >= 0:
                temp_buf = "ay"
                output += ch
            else:
                temp_buf += ch + "ay"
    cp += 1
if temp_buf:
    output += temp_buf
print(output)

I'd implement this using the callback form of re.sub, where you can have a function determine the replacement for the regular expression match.
import re
vowels = "aeiou"
def mangle_word(match):
    word = match.group(0)
    if word[0].lower() in vowels:
        return word + "ay"
    return word[1:] + word[0] + "ay"
message = "Oh! my what a beautiful day for the fox to jump 653 fences!"
print(re.sub("[a-z]+", mangle_word, message, flags=re.I))
outputs
Ohay! ymay hatway aay eautifulbay ayday orfay hetay oxfay otay umpjay 653 encesfay!

Not converting letters to uppercase and lowercase in python

I'm trying to make a program that will convert any text into a different form. That means that a text such as 'hi there' becomes 'hI tHeRe'.
list = []
word = input('Enter in a word or a sentence! ')
for num in range(len(word)):
    list.clear()
    list.append('i')
    letter = word[num]
    for x in range(len(list)):
        if x % 2 == 0:
            i = word.index(letter)
            place = letter.lower()
            word = word.replace(word[i], place)
        if not x % 2 == 0:
            i = word.index(letter)
            place = letter.upper()
            word = word.replace(word[i], place)
print(word)
However, when I run the code it just prints the same string as normal.

When using replace, you have to assign the result to your variable:
word = word.replace(word[i], place)
However, replace is actually not what you want here. replace replaces all instances of a certain pattern with a new string. In your current code, every instance of whatever letter word[i] represents will be replaced with the result of .lower() or .upper().
You also don't want to use the word list, since doing so will shadow the Python built-in list class.
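A quick check makes the replace behaviour described above visible. The sample string here is made up purely for illustration:

```python
word = "banana"

# str.replace returns a new string and changes every occurrence,
# not just the character at one index.
print(word.replace(word[1], "A"))  # bAnAnA

# The original string is untouched, because Python strings are immutable.
print(word)  # banana
```

This is why changing a single position is easier with a list of characters, or by building up a new string piece by piece.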
If you want to keep most of your original logic, you can follow @khelwood's suggestion in the comments and end up with the following:
word = input('Enter in a word or a sentence! ')
wordList = list(word)
for i in range(len(word)):
    if i % 2 == 0:
        wordList[i] = word[i].lower()
    else:
        wordList[i] = word[i].upper()
print(''.join(wordList))

Here is some code I wrote previously; you can rename the variables to whatever you see fit.
s = input('Enter in a word or string.')
ret = ""
i = False  # whether to capitalize the next letter; start lowercase to get 'hI tHeRe'
for char in s:
    if i:
        ret += char.upper()
    else:
        ret += char.lower()
    if char != ' ':
        i = not i
print(ret)
I hope it works for you.

Try this one liner -
a = 'hi there'
''.join([i[1].lower() if i[0]%2==0 else i[1].upper() for i in enumerate(a)])
'hI ThErE'
If you care about each word starting from lowercase then this nested list comprehension works -
' '.join([''.join([j[1].lower() if j[0]%2==0 else j[1].upper() for j in enumerate(i)]) for i in a.split()])
'hI tHeRe'

The problem is the list.clear() call at the beginning of the for loop.
Because the list is cleared on every iteration, the inner for loop only ever runs on a single item.
Remove list.clear() and it should scan the whole input word.

Find out if string contains a combination of letters in a specific order

I am attempting to write a program to find words in the English language that contain 3 letters of your choice, in order, but not necessarily consecutively. For example, the letter combination EJS would output, among others, the word EJectS. You supply the letters, and the program outputs the words.
However, the program does not give the letters in the right order, and does not work at all with double letters, like the letters FSF or VVC. I hope someone can tell me how I can fix this error.
Here is the full code:
with open("words_alpha.txt") as words:
    wlist = list(words)
while True:
    elim1 = []
    elim2 = []
    elim3 = []
    search = input("input letters here: ")
    for element1 in wlist:
        element1 = element1[:-1]
        val1 = element1.find(search[0])
        if val1 > -1:
            elim1.append(element1)
    for element2 in elim1:
        val2 = element2[(val1):].find(search[2])
        if val2 > -1:
            elim2.append(element2)
    for element3 in elim2:
        val3 = element3[((val1+val2)):].find(search[1])
        if val3 > -1:
            elim3.append(element3)
    print(elim3)

You are making this very complicated for yourself. To test whether a word contains the letters E, J and S in that order, you can match it with the regex E.*J.*S:
>>> import re
>>> re.search('E.*J.*S', 'EJectS')
<_sre.SRE_Match object; span=(0, 6), match='EJectS'>
>>> re.search('E.*J.*S', 'JEt engineS') is None
True
So here's a simple way to write a function which tests for an arbitrary combination of letters:
import re
def contains_letters_in_order(word, letters):
    regex = '.*'.join(map(re.escape, letters))
    return re.search(regex, word) is not None
Examples:
>>> contains_letters_in_order('EJectS', 'EJS')
True
>>> contains_letters_in_order('JEt engineS', 'EJS')
False
>>> contains_letters_in_order('ABra Cadabra', 'ABC')
True
>>> contains_letters_in_order('Abra CadaBra', 'ABC')
False
If you want to test every word in a wordlist, it is worth doing pattern = re.compile(regex) once, and then pattern.search(word) for each word.
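A minimal sketch of that compile-once idea follows. The helper name make_matcher and the tiny word list are stand-ins invented for this example; in the question the words would come from words_alpha.txt.

```python
import re


def make_matcher(letters):
    """Compile the in-order pattern once and return a predicate for single words."""
    pattern = re.compile(".*".join(map(re.escape, letters)))
    return lambda word: pattern.search(word) is not None


# A tiny stand-in word list, used here instead of reading a file.
words = ["ejects", "jests", "enjoys", "subject"]
matches_ejs = make_matcher("ejs")
print([w for w in words if matches_ejs(w)])  # ['ejects', 'enjoys']
```

The pattern is built a single time per search, so the loop over a large word list only pays for pattern.search.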

You need to read the file correctly with read(), and since there is a newline between each word, call split('\n') to properly create the word list. The logic is simple. If all the letters are in the word, get the index for each letter, and check that the order of the indexes matches the order of the letters.
with open('words_alpha.txt') as file:
    word_list = file.read().split('\n')
search = input("input letters here: ").lower()
found = []
for word in word_list:
    if all(x in word for x in search):
        i = word.find(search[0])
        j = word.find(search[1], i + 1)
        k = word.find(search[2], j + 1)
        if i < j < k:
            found.append(word)
print(found)
Using Function:
def get_words_with_letters(word_list, search):
    search = search.lower()
    for word in word_list:
        if all(x in word for x in search):
            i = word.find(search[0])
            j = word.find(search[1], i + 1)
            k = word.find(search[2], j + 1)
            if i < j < k:
                yield word
words = list(get_words_with_letters(word_list, 'fsf'))

The issue with your code is that you're using val1 from a specific word in your first loop for a different word in your second loop. So val1 will be the wrong value most of the time, because for every word in your second loop you're using the position of the first letter in the last word you checked in your first loop.
There are a lot of ways to solve what you're trying to do. However, my code below should be fairly close to what you had in mind with your solution. I have tried to explain everything that's going on in the comments:
# Read words from file
with open("words_alpha.txt") as f:
    words = f.readlines()
# Begin infinite loop
while True:
    # Get user input
    search = input("Input letters here: ")
    # Loop over all words
    for word in words:
        # Remove newline characters at the end
        word = word.strip()
        # Start looking for the letters at the beginning of the word
        position = -1
        # Check position for each letter
        for letter in search:
            # Search only after the previous match; find() returns an index into the whole word
            position = word.find(letter, position + 1)
            # Break out of loop if letter not found
            if position < 0:
                break
        # If there was no `break` in the loop, the word contains all letters
        else:
            print(word)
For every new letter we start looking beginning at position + 1 where position is the position of the previously found letter. (That's why we have to do position = -1, so we start looking for the first letter at -1 + 1 = 0.)
You should ideally move the removal of \n outside of the loop, so you will have to do it once and not for every search. I just left it inside the loop for consistency with your code.
Also, by the way, there's no handling of uppercase/lowercase for now. So, for example, should the search for abc be different from Abc? I'm not sure, what you need there.
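If case should not matter, one option is to lower-case both the word and the search letters before comparing. The sketch below assumes that case-insensitive matching is what is wanted, which the question leaves open. It swaps the position bookkeeping for an iterator-based check, and the function name has_letters_in_order is hypothetical.

```python
def has_letters_in_order(word, letters):
    """True if `letters` appear in `word` in order, not necessarily adjacent."""
    remaining = iter(word.lower())
    # `ch in remaining` consumes the iterator up to and including the first match,
    # so each later letter is only searched for after the previous one.
    return all(ch in remaining for ch in letters.lower())


print(has_letters_in_order("EJectS", "ejs"))   # True
print(has_letters_in_order("xxab", "aba"))     # False
print(has_letters_in_order("falsify", "FSF"))  # True
```

Because the iterator is shared across all letters, double letters such as FSF work without any special handling.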

IndexError: string index out of range. Pig Latin

Sorry if I'm being really ignorant, I've started learning to code Python recently (first language) and have been working on this task on codewars.com to create a single word pig latin programme. It is pretty messy, but it seems to work aside from the fact that the message:
Traceback:
in
in pig_latin
IndexError: string index out of range
...comes up. I have looked online and I gather it is likely some piece of code that is out of line, or that I need a -1 somewhere. I was wondering if anyone could help me identify where this would be. It's not helped, of course, by the fact that I have made this difficult for myself with my inefficiency :P Thanks
def pig_latin(s):
    word = 'ay'
    word2 = 'way'
    total=0
    total2=0
    lst = []
    val = None
    #rejecting non character strings
    for c in s:
        if c.isalpha() == False:
            return None
    #code for no vowels and also code for all consonant strings
    for char in s:
        if char in 'aeiou':
            total+=1
            if total==0:
                return s + 'ay'
            else:
                pass
        elif char not in 'aeiou':
            total2+=1
            if total2 == len(s):
                answer_for_cons = s + word
                return answer_for_cons.lower()
    #first character is a vowel
    if s[0] in 'aeiou':
        return s + word2
    #normal rule
    elif s[0] not in 'aeiou':
        for c in s:
            if c in 'aeiou':
                lst.append(s.index(c))
                lst.sort()
                answer = s[lst[0]:len(s)] + str(s[:lst[0]]) + word
                return answer.lower()

The only point where an index is implicated is when you call s[0]. Have you maybe tried running pig_latin with an empty string?
Also, the formatting of your code makes no sense. I am assuming it was lost in the pasting? Everything below val = None should be at least one indent further right.

Now that the indentation is fixed, the code seems to run, but it does raise
IndexError: string index out of range
if we pass pig_latin an empty string. That's because of
if s[0] in 'aeiou':
That will fail if s is the empty string because you can't do s[0] on an empty string. s[0] refers to the first char in the string, but an empty string doesn't have a first char. And of course pig_latin returns None if we pass it a string that contains non-alpha characters.
So before you start doing the other tests, you should check that the string isn't empty, and return something appropriate if it is empty. The simplest way to do that is
if not s:
return ''
I suggest returning s or the empty string if you get passed an invalid string, rather than returning None. A function that returns different types depending on the value of the input is a bit messy to work with.
There are various simplifications and improvements that can be made to your code. For example, there's no need to do elif char not in 'aeiou' after you've already done if char in 'aeiou', since if char in 'aeiou' is false then char not in 'aeiou' must be true. However, we can simplify that whole section considerably.
Here's your code with a few other improvements. Rather than using index to find the location of the first vowel we can use enumerate to get both the letter and its index at the same time.
def pig_latin(s):
    word = 'ay'
    word2 = 'way'
    #return empty and strings that contain non-alpha chars unchanged
    if not s or not s.isalpha():
        return s
    #code for no vowels
    total = 0
    for char in s:
        if char in 'aeiou':
            total += 1
    if total == 0:
        return s.lower() + word
    #first character is a vowel
    if s[0] in 'aeiou':
        return s.lower() + word2
    #normal rule. This will always return before the end of the loop
    # because by this point `s` is guaranteed to contain at least one vowel
    for i, char in enumerate(s):
        if char in 'aeiou':
            answer = s[i:] + s[:i] + word
            return answer.lower()
# test
data = 'this is a pig latin test string aeiou bcdf 123'
s = ' '.join([pig_latin(w) for w in data.split()])
print(s)
output
isthay isway away igpay atinlay esttay ingstray aeiouway bcdfay 123
