Need help to translate a string to pyg latin - python

I want to write a function that will take a string and turn the words into Pyg Latin. That means that:
If a word begins with a vowel, add "-way" to the end. Example: "ant" becomes "ant-way".
If a word begins with a consonant cluster, move that cluster to the end and add "ay" to it. Example: "pant" becomes "ant-pay".
I've searched many posts and websites, but none of them do it the way I want. I have to run these functions against a test, and I have 4 test cases for this one: 'fish' should return 'ish-fay', 'frish' should return 'ish-fray', 'ish' should return 'ish-way', and 'tis but a scratch' should return 'is-tay ut-bay a-way atch-scray'.
I've found a program that can translate it CLOSE to what it has to be but I'm not sure how to edit it so it can return the result I'm looking for.
def pyg_latin(fir_str):
    pyg = 'ay'
    pyg_input = fir_str
    if len(pyg_input) > 0 and pyg_input.isalpha():
        lwr_input = pyg_input.lower()
        lst = lwr_input.split()
        latin = []
        for item in lst:
            frst = item[0]
            if frst in 'aeiou':
                item = item + pyg
            else:
                item = item[1:] + frst + pyg
            latin.append(item)
        return ' '.join(latin)
So, this is the result my code does:
pyg_latin('fish')
#it returns
'ishfay'
What I want it to return isn't much different, but I don't know how to add the hyphen:
pyg_latin('fish')
# it should return
'ish-fay'

Think about what the string should look like.
A chunk of text, followed by a hyphen, followed by the first letter (if it's not a vowel), followed by "ay".
You can use Python string formatting or just add the strings together:
item[1:] + "-" + frst + pyg
It is also worth learning how slicing works and how strings are sequences that can be indexed and sliced with the same notation as lists. The following code handles your test cases as long as the leading cluster has at most two consonants; 'scratch' starts with three ('scr'), so that case needs a more general rule. You should refactor it and understand what each line does. Make the solution more robust by adding test scenarios like '1st' or a sentence with punctuation. You could also build a function that creates the pig latin string and returns it, then refactor the code to use that.
def pg(w):
    w = w.lower()
    string = ''
    if w[0] not in 'aeiou':
        if w[1] not in 'aeiou':
            string = w[2:] + "-" + w[:2] + "ay"
            return string
        else:
            string = w[1:] + "-" + w[0] + "ay"
            return string
    else:
        string = w + "-" + "way"
        return string
words = ['fish', 'frish', 'ish', 'tis but a scratch']
for word in words:
    # Type check the incoming object and raise an error if it is not a string.
    # This allows handling both 'fish' and 'tis but a scratch' but not 5.
    if isinstance(word, str):
        new_phrase = ''
        if ' ' in word:
            for w in word.split(' '):
                new_phrase += (pg(w)) + ' '
        else:
            new_phrase = pg(word)
        print(new_phrase)
    # Raise a TypeError if the object being processed is not a string
    else:
        raise TypeError
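As a possible generalization of the two-consonant check above, here is a minimal sketch that moves the whole leading consonant cluster, however long it is. The names pyg_word and pyg_latin_sentence are illustrative, and leaving words without any vowel unchanged is an assumption, since the question doesn't say what should happen to them.

```python
VOWELS = "aeiou"


def pyg_word(word):
    """Translate one word, moving the full leading consonant cluster."""
    word = word.lower()
    # Index of the first vowel, or None if the word has no vowel at all.
    first_vowel = next((i for i, ch in enumerate(word) if ch in VOWELS), None)
    if first_vowel is None:
        # Assumption: words without vowels (including '') are left as they are.
        return word
    if first_vowel == 0:
        return word + "-way"
    return word[first_vowel:] + "-" + word[:first_vowel] + "ay"


def pyg_latin_sentence(text):
    """Translate every whitespace-separated word in text."""
    return " ".join(pyg_word(w) for w in text.split())


print(pyg_latin_sentence("tis but a scratch"))
```

Joining with " ".join also avoids the trailing space that string concatenation in a loop leaves behind.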

Related

Convert the string to a string in which the words are separated by spaces and only the first word starts with an uppercase letter [duplicate]

I am trying to make a script that will accept a string as input in which all of the words are run together, but the first character of each word is uppercase. It should convert the string to a string in which the words are separated by spaces and only the first word starts with an uppercase letter.
For Example (The Input):
"StopWhateverYouAreDoingInterestingIDontCare"
The expected output:
"Stop whatever you are doing interesting I dont care"
Here is the one I wrote so far:
string_input = "StopWhateverYouAreDoingInterestingIDontCare"
def organize_string():
    start_sentence = string_input[0]
    index_of_i = string_input.index("I")
    for i in string_input[1:]:
        if i == "I" and string_input[index_of_i + 1].isupper():
            start_sentence += ' ' + i
        elif i.isupper():
            start_sentence += ' ' + i.lower()
        else:
            start_sentence += i
    return start_sentence
While this takes care of some parts, I am struggling with differentiating if the letter "I" is single or a whole word. Here is my output:
"Stop whatever you are doing interesting i dont care"
Single "I" needs to be uppercased, while the "I" in the word "Interesting" should be lowercased "interesting".
I will really appreciate all the help!
A regular expression will do in this example.
import re
s = "StopWhateverYouAreDoingInterestingIDontCare"
t = re.sub(r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z])', ' ', s)
Explained:
(?<=[a-z])(?=[A-Z]) - a lookbehind for a lowercase letter followed by a lookahead uppercase letter
| - (signifies or)
(?<=[A-Z])(?=[A-Z]) - a lookbehind for an uppercase letter followed by a lookahead for an uppercase letter
This regex substitutes a space when there is a lowercase letter followed by an uppercase letter, OR, when there is an uppercase letter followed by an uppercase letter.
UPDATE: This doesn't correctly lowercase the words (with the exception of I and the first_word)
UPDATE2: The fix to this is:
import re
s = "StopWhateverYouAreDoingInterestingIDontCare"
first_word, *rest = re.split(r'(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z])', s)
rest = [word.lower() if word != 'I' else word for word in rest]
print(first_word, ' '.join(rest))
Prints:
Stop whatever you are doing interesting I dont care
Update 3: I looked at why your code failed to correctly form the sentence (which I should have done in the first place instead of posting my own solution :-)).
Here is the corrected code with some remarks about the changes.
string_input = "StopWhateverYouAreDoingInterestingIDontCare"
def organize_string():
    start_sentence = string_input[0]
    #index_of_i = string_input.index("I")
    for i, char in enumerate(string_input[1:], start=1):
        if char == "I" and string_input[i + 1].isupper():
            start_sentence += ' ' + char
        elif char.isupper():
            start_sentence += ' ' + char.lower()
        else:
            start_sentence += char
    return start_sentence
print(organize_string())
1. I commented out the line index_of_i = string_input.index("I") because it doesn't do what you need: it finds the index of the first capital I (the I in Interesting), not the standalone I in IDont further along in string_input.
2. for i, char in enumerate(string_input[1:], start=1): enumerate yields the index of each letter, starting at 1 because string_input[1:] starts at index 1 of the original string, so the two stay in sync. i is the index of the letter in string_input.
3. I renamed the loop variable from i to char to make it clearer that it holds the character. Other than these changes, the code stands as you wrote it.
Now the program gives the correct output.
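One thing the corrected loop still assumes is that the string never ends in a capital letter; string_input[i + 1] would raise an IndexError in that case. A minimal sketch that guards the lookahead and takes the text as a parameter follows. The name organize and the choice to treat a trailing "I" as a standalone word are assumptions, not part of the original code.

```python
def organize(text):
    """Insert spaces before capitals, keeping a standalone 'I' uppercase."""
    out = text[:1]
    for i, char in enumerate(text[1:], start=1):
        # i indexes into text itself, so text[i] == char.
        # Guard the lookahead so the last character cannot raise IndexError.
        next_char = text[i + 1] if i + 1 < len(text) else ""
        if char == "I" and (next_char == "" or next_char.isupper()):
            out += " " + char
        elif char.isupper():
            out += " " + char.lower()
        else:
            out += char
    return out


print(organize("StopWhateverYouAreDoingInterestingIDontCare"))
```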
string_input = "StopWhateverYouAreDoingInterestingIDontCare"
counter = 1
def organize_string():
    global counter
    start_sentence = string_input[0]
    for i in string_input[1:]:
        if i == "I" and string_input[counter+1].isupper():
            start_sentence += ' ' + i
        elif i.isupper():
            start_sentence += ' ' + i.lower()
        else:
            start_sentence += i
        counter += 1
    print(start_sentence)
organize_string()
I made some changes to your program. I used a counter to check the index position. I get your expected output:
Stop whatever you are doing interesting I dont care
s = 'StopWhateverYouAreDoingInterestingIDontCare'
ss = ' '
res = ''.join(ss + x if x.isupper() else x for x in s).strip(ss).split(ss)
sr = ''
for w in res:
    sr = sr + w.lower() + ' '
print(sr[0].upper() + sr[1:])
output
Stop whatever you are doing interesting i dont care
I hope this will work fine :-
string_input = "StopWhateverYouAreDoingInterestingIDontCare"
def organize_string():
    i=0
    while i<len(string_input):
        if string_input[i]==string_input[i].upper() and i==0 :
            print(' ',end='')
            print(string_input[i].upper(),end='')
        elif string_input[i]==string_input[i].upper() and string_input[i+1]==string_input[i+1].upper():
            print(' ',end='')
            print(string_input[i].upper(),end='')
        elif string_input[i]==string_input[i].upper() and i!=0:
            print(' ',end='')
            print(string_input[i].lower(),end='')
        if string_input[i]!=string_input[i].upper():
            print(string_input[i],end='')
        i=i+1
organize_string()
Here is one solution using the re module to split the string on the uppercase characters.
import re
text = "StopWhateverYouAreDoingInterestingIDontCare"
# Split text by upper character
text_splitted = re.split('([A-Z])', text)
print(text_splitted)
As we see in the output below the separator (The upper case character) and the text before and after is kept. This means that the upper case character is always followed by the rest of the word. The empty first string originates from the first upper case character, which is the first separator.
# Output of print
[
'',
'S', 'top',
'W', 'hatever',
'Y', 'ou',
'A', 're',
'D', 'oing',
'I', 'nteresting',
'I', '',
'D', 'ont',
'C', 'are'
]
As we have seen, each uppercase character is always followed by the rest of its word. By combining the two we get the split words. This also allows us to easily handle your special case with the I.
# Remove the first element because it is always empty when the text starts with an uppercase character
text_splitted = text_splitted[1:]
result = []
for i in range(0, len(text_splitted), 2):
    word = text_splitted[i]+text_splitted[i+1]
    if (i > 0) and (word != 'I') :
        word = word.lower()
    result.append(word)
result = ' '.join(result)
Split the sentence into individual words. If you find the word "I" in this list, leave it alone. Leave the first word alone. Cast all of the other words to lower case.
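A rough sketch of that idea, assuming the input contains only ASCII letters and starts with an uppercase letter (the function name is made up for illustration):

```python
import re


def split_run_together(text):
    """Split on capitals, keep the first word and any lone 'I' as they are."""
    # Each word is one uppercase letter followed by zero or more lowercase ones.
    words = re.findall(r"[A-Z][a-z]*", text)
    if not words:
        return text
    first, rest = words[0], words[1:]
    rest = [w if w == "I" else w.lower() for w in rest]
    return " ".join([first] + rest)


print(split_run_together("StopWhateverYouAreDoingInterestingIDontCare"))
```

Because the pattern only matches runs that begin with a capital, any lowercase characters before the first capital would be dropped, which is why the sketch assumes the text starts with an uppercase letter.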
You have to use some string manipulation like this:
output=string_input[0]
for l in string_input[1:]:
    if l.islower():
        output+=l
    else:
        output+=' '+l.lower()
print(output)

Python iterations mischaracterizes string value

For this problem, I am given strings ThatAreLikeThis where there are no spaces between words and the 1st letter of each word is capitalized. My task is to lowercase each capital letter and add spaces between words. The following is my code. What I'm doing there is using a while loop nested inside a for-loop. I've turned the string into a list and check if the capital letter is the 1st letter or not. If so, all I do is make the letter lowercase and if it isn't the first letter, I do the same thing but insert a space before it.
def amendTheSentence(s):
    s_list = list(s)
    for i in range(len(s_list)):
        while(s_list[i].isupper()):
            if (i == 0):
                s_list[i].lower()
            else:
                s_list.insert(i-1, " ")
                s_list[i].lower()
    return ''.join(s_list)
However, for the test case, this is the behavior:
Input: s: "CodesignalIsAwesome"
Output: undefined
Expected Output: "codesignal is awesome"
Console Output: Empty
You can use re.sub for this:
re.sub(r'(?<!\b)([A-Z])', ' \\1', s)
Code:
import re
def amendTheSentence(s):
    return re.sub(r'(?<!\b)([A-Z])', ' \\1', s).lower()
On run:
>>> amendTheSentence('GoForPhone')
'go for phone'
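As for why the original loop produces no output: str.lower() returns a new string and leaves the list element untouched, so s_list[i].isupper() stays true and the while loop never ends. A minimal sketch of the same character-by-character idea, building a new list instead of editing in place (the snake_case name is only there to keep it apart from the versions in this thread):

```python
def amend_the_sentence(s):
    """Lowercase every capital and put a space before each one except the first."""
    out = []
    for i, ch in enumerate(s):
        if ch.isupper():
            if i > 0:
                out.append(" ")
            # str.lower() returns a new string, so its result must be stored.
            out.append(ch.lower())
        else:
            out.append(ch)
    return "".join(out)


print(amend_the_sentence("CodesignalIsAwesome"))
```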
Try this:
def amendTheSentence(s):
    start = 0
    string = ""
    for i in range(1, len(s)):
        if s[i].isupper():
            string += (s[start:i] + " ")
            start = i
    string += s[start:]
    return string.lower()
print(amendTheSentence("CodesignalIsAwesome"))
print(amendTheSentence("ThatAreLikeThis"))
Output:
codesignal is awesome
that are like this
def amendTheSentence(s):
    new_sentence=''
    for char in s:
        if char.isupper():
            new_sentence=new_sentence + ' ' + char.lower()
        else:
            new_sentence=new_sentence + char
    # drop the leading space added before the first capital letter
    return new_sentence.lstrip()
new_sentence=amendTheSentence("CodesignalIsAwesome")
print (new_sentence)
The result is: codesignal is awesome

Python - I need to rearrange a character of a string element in a list

So I have a function that takes a string input and turns it into pig latin.
For all words that begin with consonants (everything except vowels), I have to take the first letter of that word and move it to the back and then add "ay" to the word.
For example "like" would become "ikelay".
In my program, the string input given to me is first split and then each element of that newly created list is checked to see if the first character of that element is either a vowel, a consonant, or otherwise.
def simple_pig_latin(input, sep=" ", end="."):
    splitinput = input.split(sep)
    for i in splitinput:
        if splitinput[splitinput.index(i)][0] in ['a','e','i','o','u']:
            splitinput[splitinput.index(i)] = str(i) + "way"
        elif splitinput[splitinput.index(i)][0] in ['b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','y','z']:
            splitinput[splitinput.index(i)] = str(i) + "ay"
        else:
            continue
    finalstring = ' '.join(splitinput)
    finalstring = finalstring + end
simple_pig_latin("i like this")
Notice in the elif branch, I am supposed to take the first letter of i and put it at the end of that word and add "ay" to it. Given the input string "i like this" I should turn the second word (since like starts with l, making it a consonant) into 'ikelay' How would I rearrange like so that it became ikel?
I tried to keep your structure while still removing the useless code :
def simple_pig_latin(input_text, sep=" ", end="."):
    words = input_text.split(sep)
    new_words = []
    for word in words:
        if word[0].lower() in ['a', 'e', 'i', 'o', 'u']:
            new_words.append(word + "way")
        else:
            new_words.append(word[1:] + word[0] + "ay")
    finalstring = sep.join(new_words)
    finalstring = finalstring + end
    return finalstring
print(simple_pig_latin("i like this"))
# iway ikelay histay.
Notes :
Your function needs to return something
It's probably easier to create a new list than to mutate the original one
if i is already a string, there's no need to call str(i)
i is usually used for an integer between 0 and n-1. Not for words.
word[0] is the first letter of your word
word[k:] is word without the first k letters
to simplify your code, I consider that if the first letter isn't a vowel, it must be a consonant.
I call lower() on the first letter in order to check if 'I' is a vowel.
For your question, you could change your code str(i) + "ay" to i[1:] + i[0] + "ay" in your elif branch.
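To make the slicing in these notes concrete, here is a tiny sketch of the rearrangement on its own; the helper name move_first_letter is just for illustration.

```python
def move_first_letter(word):
    """Move the first letter of word to the end and append 'ay'."""
    first_letter = word[0]   # 'l' for 'like'
    rest = word[1:]          # 'ike' for 'like'
    return rest + first_letter + "ay"


print(move_first_letter("like"))
```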

How to add strings to items in list that resulted from split?

I am building a function that accepts a string as input, splits it based on certain separator and ends on period. Essentially what I need to do is add certain pig latin words onto certain words within the string if they fit the criteria.
The criteria are:
if the word starts with a non-letter or contains no characters, do nothing to it
if the word starts with a vowel, add 'way' to the end
if the word starts with a consonant, place the first letter at the end and add 'ay'
For output example:
simple_pig_latin("i like this") → 'iway ikelay histay.'
--default sep(space) and end(dot)
simple_pig_latin("i like this", sep='.') → 'i like thisway.'
--separator is dot, so whole thing is a single “word”
simple_pig_latin("i.like.this",sep='.',end='!') → 'iway.ikelay.histay!'
--sep is '.' and end is '!'
simple_pig_latin(".") → '..'
--only word is '.', so do nothing to it and add a '.' to the end
It is now:
def simple_pig_latin(input, sep='', end='.'):
    words=input.split(sep)
    new_sentence=""
    Vowels= ('a','e','i','o','u')
    Digit= (0,1,2,3,4,5,6,7,8,9)
    cons=('b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','y','z')
    for word in words:
        if word[0] in Vowels:
            new_word= word+"way"
        if word[0] in Digit:
            new_word= word
        if word[0] in cons:
            new_word= word+"ay"
        else:
            new_word= word
        new_sentence= new_sentence + new_word+ sep
    new_sentence= new_sentence.strip(sep) + sentenceEndPunctuation
    return new_sentence
Example error:
ERROR: test_simple_pig_latin_8 (__main__.AllTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "testerl8.py", line 125, in test_simple_pig_latin_8
result = simple_pig_latin(input,sep='l',end='')
File "/Users/kgreenwo/Desktop/student.py", line 8, in simple_pig_latin
if word[0] in Vowels:
IndexError: string index out of range
You have the means of adding strings together correct: you use the + operator, as you have in new_string = new_string + "way".
You have two other major issues, however:
To determine whether a variable can be found in a list (in your case, a tuple), you’d probably want to use the in operator. Instead of if [i][0]==Vowels: you would use if [i][0] in Vowels:.
When you reconstruct the string with the new words, you will need to add the word to your new_string. Instead of new_string=new_string+"way" you might use new_string = new_string+word+"way". If you choose to do it this way, you’ll also need to decide when to add the sep back to each word.
Another way of joining smaller strings into larger ones with a known separator is to create a list of the new individual strings, and then join the strings back together using your known separator:
separator = ' '
words = sentence.split(separator)
newWords = []
for word in words:
    newWord = doSomething(word)
    newWords.append(newWord)
newSentence = separator.join(newWords)
In this way, you don’t have to worry about either the first or last word not needing a separator.
In your case, doSomething might look like:
def doSomething(word):
    if word[0] in Vowels:
        return word + "way"
    elif word[0] in Consonants:
        return word + "ay"
    #and so forth
How to write a function
On a more basic level, you will probably find it easier to create your functions in steps, rather than trying to write everything at once. At each step, you can be sure that the function (or script) works, and then move on to the next step. For example, your first version might be as simple as:
def simple_pig_latin(sentence, separator=' '):
    words = sentence.split(separator)
    for word in words:
        print(word)
simple_pig_latin("i like this")
This does nothing except print each word in the sentence, one per line, to show you that the function is breaking the sentence apart into words the way that you expect it to be doing. Since words are fundamental to your function, you need to be certain that you have words and that you know where they are before you can continue. Your error of trying to check [i][0] would have been caught much more easily in this version, for example.
A second version might then do nothing except return the sentence recreated, taking it apart and then putting it back together the same way it arrived:
def simple_pig_latin(sentence, separator=' '):
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        new_sentence = new_sentence + word + separator
    return new_sentence
print(simple_pig_latin("i like this"))
Your third version might try to add the end punctuation:
def simple_pig_latin(sentence, separator=' ', sentenceEndPunctuation='.'):
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        new_sentence = new_sentence + word + separator
    new_sentence = new_sentence + sentenceEndPunctuation
    return new_sentence
print(simple_pig_latin("i like this"))
At this point, you’ll realize that there’s an issue with the separator getting added on in front of the end punctuation, so you might fix that by stripping off the separator when done, or by using a list to construct the new_sentence, or any number of ways.
def simple_pig_latin(sentence, separator=' ', sentenceEndPunctuation='.'):
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        new_sentence = new_sentence + word + separator
    new_sentence = new_sentence.strip(separator) + sentenceEndPunctuation
    return new_sentence
print(simple_pig_latin("i like this"))
Only when you can return the new sentence without the pig latin endings, and understand how that works, would you add the pig latin to your function. And when you add the pig latin, you would do it one rule at a time:
def simple_pig_latin(sentence, separator=' ', sentenceEndPunctuation='.'):
    vowels= ('a','e','i','o','u')
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        if word[0] in vowels:
            new_word = word + "way"
        else:
            new_word = word
        new_sentence = new_sentence + new_word + separator
    new_sentence = new_sentence.strip(separator) + sentenceEndPunctuation
    return new_sentence
print(simple_pig_latin("i like this"))
And so on, adding each change one at a time, until the function performs the way you expect.
When you try to build the function complete all at once, you end up with competing errors that make it difficult to see where the function is going wrong. By building the function one step at a time, you should generally only have one error at a time to debug.
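For reference, one possible end point of that step-by-step process is sketched below. It uses the list-and-join approach rather than strip, and it guards against empty words, which is what triggers the IndexError in the traceback (splitting on a separator that appears twice in a row yields an empty string, and ''[0] fails). The helper name pig_word is an assumption; the parameter names follow the question.

```python
VOWELS = "aeiou"


def pig_word(word):
    """Apply the three rules from the question to a single word."""
    # Empty words and words starting with a non-letter are left unchanged.
    if not word or not word[0].isalpha():
        return word
    if word[0].lower() in VOWELS:
        return word + "way"
    # Consonant: move the first letter to the end and add 'ay'.
    return word[1:] + word[0] + "ay"


def simple_pig_latin(sentence, sep=" ", end="."):
    words = sentence.split(sep)
    # join puts sep only between words, so nothing needs stripping afterwards.
    return sep.join(pig_word(w) for w in words) + end


print(simple_pig_latin("i like this"))
```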

IndexError: string index out of range. Pig Latin

Sorry if I'm being really ignorant, I've started learning to code Python recently (first language) and have been working on this task on codewars.com to create a single word pig latin programme. It is pretty messy, but it seems to work aside from the fact that the message:
Traceback:
in
in pig_latin
IndexError: string index out of range
...comes up. I have looked online and I sort of gather it is likely some piece of code that is just out of line or i need a -1 somewhere or something. I was wondering if anyone could help me identify where this would be. It's not helped of course by the fact that I have made this difficult for myself with my inefficiency :P thanks
def pig_latin(s):
    word = 'ay'
    word2 = 'way'
    total=0
    total2=0
    lst = []
    val = None
    #rejecting non character strings
    for c in s:
        if c.isalpha() == False:
            return None
    #code for no vowels and also code for all consonant strings
    for char in s:
        if char in 'aeiou':
            total+=1
            if total==0:
                return s + 'ay'
            else:
                pass
        elif char not in 'aeiou':
            total2+=1
            if total2 == len(s):
                answer_for_cons = s + word
                return answer_for_cons.lower()
    #first character is a vowel
    if s[0] in 'aeiou':
        return s + word2
    #normal rule
    elif s[0] not in 'aeiou':
        for c in s:
            if c in 'aeiou':
                lst.append(s.index(c))
        lst.sort()
        answer = s[lst[0]:len(s)] + str(s[:lst[0]]) + word
        return answer.lower()
The only point where an index is implicated is when you call s[0]. Have you maybe tried running pig_latin with an empty string?
Also, the formatting of your code makes no sense. I am assuming it was lost in the pasting? Everything below val = None should be at least one indent further right.
Now that the indentation is fixed, the code seems to run, but it does raise
IndexError: string index out of range
if we pass pig_latin an empty string. That's because of
if s[0] in 'aeiou':
That will fail if s is the empty string because you can't do s[0] on an empty string. s[0] refers to the first char in the string, but an empty string doesn't have a first char. And of course pig_latin returns None if we pass it a string that contains non-alpha characters.
So before you start doing the other tests, you should check that the string isn't empty, and return something appropriate if it is empty. The simplest way to do that is
if not s:
return ''
I suggest returning s or the empty string if you get passed an invalid string, rather than returning None. A function that returns different types depending on the value of the input is a bit messy to work with.
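A small illustration of the difference, as a sketch with made-up helper names: indexing an empty string raises IndexError, while slicing with s[:1] simply gives back an empty string. The vowels are kept in a set on purpose, because '' in 'aeiou' is True for a plain string membership test.

```python
VOWEL_SET = {"a", "e", "i", "o", "u"}


def first_char_or_error(s):
    """Return s[0], or the text 'IndexError' if s is empty."""
    try:
        return s[0]
    except IndexError:
        return "IndexError"


def starts_with_vowel(s):
    # s[:1] is '' for an empty string, so this never raises.
    return s[:1].lower() in VOWEL_SET


print(first_char_or_error(""), starts_with_vowel(""), starts_with_vowel("egg"))
```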
There are various simplifications and improvements that can be made to your code. For example, there's no need to do elif char not in 'aeiou' after you've already done if char in 'aeiou', since if char in 'aeiou' is false then char not in 'aeiou' must be true. However, we can simplify that whole section considerably.
Here's your code with a few other improvements. Rather than using index to find the location of the first vowel we can use enumerate to get both the letter and its index at the same time.
def pig_latin(s):
    word = 'ay'
    word2 = 'way'
    #return empty and strings that contain non-alpha chars unchanged
    if not s or not s.isalpha():
        return s
    #code for no vowels
    total = 0
    for char in s:
        if char in 'aeiou':
            total += 1
    if total == 0:
        return s.lower() + word
    #first character is a vowel
    if s[0] in 'aeiou':
        return s.lower() + word2
    #normal rule. This will always return before the end of the loop
    # because by this point `s` is guaranteed to contain at least one vowel
    for i, char in enumerate(s):
        if char in 'aeiou':
            answer = s[i:] + s[:i] + word
            return answer.lower()
# test
data = 'this is a pig latin test string aeiou bcdf 123'
s = ' '.join([pig_latin(w) for w in data.split()])
print(s)
output
isthay isway away igpay atinlay esttay ingstray aeiouway bcdfay 123
