Why does str.capitalize() not work as I expect?

Please, let me know if I'm not providing enough information. The goal of the program is to capitalize the first letter of every sentence.
usr_str = input()
def fix_capitalization(usr_str):
    list_of_sentences = usr_str.split(".")
    list_of_sentences.pop()  # remove last element: ""
    new_str = ''
    for sentence in list_of_sentences:
        new_str += sentence.capitalize() + "."
    return new_str
print(fix_capitalization(usr_str))
For instance, if I input "hi. hello. hey." I expect it to output "Hi. Hello. Hey." but instead, it outputs "Hi. hello. hey."

An alternative would be to build a list of strings then concatenate them:
def fix_capitalization(usr_str):
    list_of_sentences = usr_str.split(".")
    output = []
    for sentence in list_of_sentences:
        new_sentence = sentence.strip().capitalize()
        # If empty, don't bother
        if new_sentence:
            output.append(new_sentence)
    # Finally, join everything
    return ". ".join(output) + "."

You've entered the sentences with spaces between them. When you split the string at the '.' character, those spaces remain at the start of each piece, so capitalize() is applied to a string whose first character is a space. I checked what the elements in the list were after the split, and the result was this:
'''
['hi', ' hello', ' hey', '']
'''
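To make the effect concrete: str.capitalize() uppercases only the very first character of a string (and lowercases the rest), so a leading space leaves the word untouched. The short sketch below assumes the question's sample input and shows that stripping each piece first gives the expected result; the variable names are only illustrative.

```python
# str.capitalize() acts on the first character only; a leading space blocks it.
pieces = "hi. hello. hey.".split(".")
print(pieces)                               # ['hi', ' hello', ' hey', '']
print(repr(" hello".capitalize()))          # ' hello'  (first character is a space)
print(repr(" hello".strip().capitalize()))  # 'Hello'

# Strip each piece, drop the empty one, then rebuild the text.
fixed = ". ".join(p.strip().capitalize() for p in pieces if p.strip()) + "."
print(fixed)                                # Hi. Hello. Hey.
```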

Related

Capitalizing first letter in split-append-join operation

I have the following code
def pig(text):
    message = text.split()
    pig_latin = []
    for word in message:
        word = word[1:] + word[0] + 'ay'
        pig_latin.append(word)
    return ' '.join(pig_latin)
def main():
    user = str(input("Enter a string: "))
    print(f"Pig latin: {pig(user)}")
Input: Practice makes perfect
Expected Output: Racticepay akesmay erfectpay
My translator is working fine, but I need to capitalize only the first letter of every sentence.
I can't figure out where to place the .capitalize() to get my desired output. I have put it in many locations and no luck so far.

In addition to what @BrokenBenchmark said, a format string and a generator expression would simplify your code down to a single, readable line of code.
def pig(text):
    return ' '.join(f"{w[1:]}{w[0]}ay" for w in text.split()).capitalize()

Capitalizing should be the last thing you do before you return, so put .capitalize() at the return statement.
Change
return ' '.join(pig_latin)
to
return ' '.join(pig_latin).capitalize()
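Note that capitalize() also lowercases everything after the first character, which is what turns the moved "P" of "Practice" into a lowercase "p". A small check using the question's sample input (a sketch, not the only way to write it):

```python
def pig(text):
    # Move the first letter of each word to the end and append 'ay'.
    words = [w[1:] + w[0] + "ay" for w in text.split()]
    # Capitalize once, on the finished sentence.
    return " ".join(words).capitalize()

print(pig("Practice makes perfect"))  # Racticepay akesmay erfectpay
```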

Just had to restart VS code now it works lol

removing the words after a specific sign in every sentence in a string

this is the string for example:
'I have an apple. I want to eat it. But it is so sore.'
and I want to convert it to this one:
'I have an apple want to eat it it is so sore'

Here is a way to do it without regexes, using del as you have mentioned:
def remove_after_sym(s, sym):
    # Find first word
    first = s.strip().split(' ')[0]
    # Split the string using the symbol
    l = []
    s = s.strip().split(sym)
    # Split words by space in each sentence
    for a in s:
        x = a.strip().split(' ')
        del x[0]
        l.append(x)
    # Join words in each sentence
    for i in range(len(l)):
        l[i] = ' '.join(l[i])
    # Combine sentences
    final = first + ' ' + ' '.join(l)
    final = final.strip() + '.'
    return final
Here, sym is a str (a single character).
Also, I have used the word 'sentence' loosely: in your example sym is a dot, but here 'sentence' really means any part of the string delimited by the symbol you choose.
Here is what it outputs.
In [1]: remove_after_sym(string, '.')
Out[1]: 'I have an apple want to eat it it is so sore.'

How to add strings to items in list that resulted from split?

I am building a function that accepts a string as input, splits it based on certain separator and ends on period. Essentially what I need to do is add certain pig latin words onto certain words within the string if they fit the criteria.
The criteria are:
if the word starts with a non-letter or contains no characters, do nothing to it
if the word starts with a vowel, add 'way' to the end
if the word starts with a consonant, place the first letter at the end and add 'ay'
For output example:
simple_pig_latin("i like this") → 'iway ikelay histay.'
--default sep(space) and end(dot)
simple_pig_latin("i like this", sep='.') → 'i like thisway.'
--separator is dot, so whole thing is a single “word”
simple_pig_latin("i.like.this",sep='.',end='!') → 'iway.ikelay.histay!'
--sep is '.' and end is '!'
simple_pig_latin(".") → '..'
--only word is '.', so do nothing to it and add a '.' to the end
It is now:
def simple_pig_latin(input, sep='', end='.'):
    words=input.split(sep)
    new_sentence=""
    Vowels= ('a','e','i','o','u')
    Digit= (0,1,2,3,4,5,6,7,8,9)
    cons=('b','c','d','f','g','h','j','k','l','m','n','p','q','r','s','t','v','w','x','y','z')
    for word in words:
        if word[0] in Vowels:
            new_word= word+"way"
        if word[0] in Digit:
            new_word= word
        if word[0] in cons:
            new_word= word+"ay"
        else:
            new_word= word
        new_sentence= new_sentence + new_word+ sep
    new_sentence= new_sentence.strip(sep) + sentenceEndPunctuation
    return new_sentence
Example error:
ERROR: test_simple_pig_latin_8 (__main__.AllTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "testerl8.py", line 125, in test_simple_pig_latin_8
result = simple_pig_latin(input,sep='l',end='')
File "/Users/kgreenwo/Desktop/student.py", line 8, in simple_pig_latin
if word[0] in Vowels:
IndexError: string index out of range

You have the means of adding strings together correct: you use the + operator, as you have in new_string = new_string + "way".
You have two other major issues, however:
To determine whether a variable can be found in a list (in your case, a tuple), you’d probably want to use the in operator. Instead of if [i][0]==Vowels: you would use if [i][0] in Vowels:.
When you reconstruct the string with the new words, you will need to add the word to your new_string. Instead of new_string=new_string+"way" you might use new_string = new_string+word+"way". If you choose to do it this way, you’ll also need to decide when to add the sep back to each word.
Another way of joining smaller strings into larger ones with a known separator is to create a list of the new individual strings, and then join the strings back together using your known separator:
separator = ' '
words = sentence.split(separator)
newWords = []
for word in words:
    newWord = doSomething(word)
    newWords.append(newWord)
newSentence = separator.join(newWords)
In this way, you don’t have to worry about either the first or last word not needing a separator.
In your case, doSomething might look like:
def doSomething(word):
    if word[0] in Vowels:
        return word + "way"
    elif word[0] in Consonants:
        return word + "ay"
    #and so forth
How to write a function
On a more basic level, you will probably find it easier to create your functions in steps, rather than trying to write everything at once. At each step, you can be sure that the function (or script) works, and then move on to the next step. For example, your first version might be as simple as:
def simple_pig_latin(sentence, separator=' '):
    words = sentence.split(separator)
    for word in words:
        print(word)
simple_pig_latin("i like this")
This does nothing except print each word in the sentence, one per line, to show you that the function is breaking the sentence apart into words the way that you expect it to be doing. Since words are fundamental to your function, you need to be certain that you have words and that you know where they are before you can continue. Your error of trying to check [i][0] would have been caught much more easily in this version, for example.
A second version might then do nothing except return the sentence recreated, taking it apart and then putting it back together the same way it arrived:
def simple_pig_latin(sentence, separator=' '):
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        new_sentence = new_sentence + word + separator
    return new_sentence
print(simple_pig_latin("i like this"))
Your third version might try to add the end punctuation:
def simple_pig_latin(sentence, separator=' ', sentenceEndPunctuation='.'):
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        new_sentence = new_sentence + word + separator
    new_sentence = new_sentence + sentenceEndPunctuation
    return new_sentence
print(simple_pig_latin("i like this"))
At this point, you’ll realize that there’s an issue with the separator getting added on in front of the end punctuation, so you might fix that by stripping off the separator when done, or by using a list to construct the new_sentence, or any number of ways.
def simple_pig_latin(sentence, separator=' ', sentenceEndPunctuation='.'):
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        new_sentence = new_sentence + word + separator
    new_sentence = new_sentence.strip(separator) + sentenceEndPunctuation
    return new_sentence
print(simple_pig_latin("i like this"))
Only when you can return the new sentence without the pig latin endings, and understand how that works, would you add the pig latin to your function. And when you add the pig latin, you would do it one rule at a time:
def simple_pig_latin(sentence, separator=' ', sentenceEndPunctuation='.'):
    vowels= ('a','e','i','o','u')
    words = sentence.split(separator)
    new_sentence = ""
    for word in words:
        if word[0] in vowels:
            new_word = word + "way"
        else:
            new_word = word
        new_sentence = new_sentence + new_word + separator
    new_sentence = new_sentence.strip(separator) + sentenceEndPunctuation
    return new_sentence
print(simple_pig_latin("i like this"))
And so on, adding each change one at a time, until the function performs the way you expect.
When you try to build the function complete all at once, you end up with competing errors that make it difficult to see where the function is going wrong. By building the function one step at a time, you should generally only have one error at a time to debug.
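For reference, here is one possible end state of that step-by-step process, written as a sketch rather than the required solution. It assumes Python 3 and a default separator of a single space (the question's default of sep='' would make split() raise ValueError), and the helper name pig_word is made up for illustration. The guard for empty words addresses the IndexError in the traceback, which occurs when split() yields an empty string, for example between two adjacent separators.

```python
VOWELS = "aeiou"

def pig_word(word):
    # Hypothetical helper: apply the three rules to a single word.
    # Empty strings and words starting with a non-letter are left unchanged.
    if not word or not word[0].isalpha():
        return word
    # Words starting with a vowel get 'way' appended.
    if word[0].lower() in VOWELS:
        return word + "way"
    # Otherwise move the first letter to the end and append 'ay'.
    return word[1:] + word[0] + "ay"

def simple_pig_latin(sentence, sep=" ", end="."):
    # Joining a list avoids having to strip a trailing separator afterwards.
    return sep.join(pig_word(w) for w in sentence.split(sep)) + end

print(simple_pig_latin("i like this"))                     # iway ikelay histay.
print(simple_pig_latin("i like this", sep="."))            # i like thisway.
print(simple_pig_latin("i.like.this", sep=".", end="!"))   # iway.ikelay.histay!
print(simple_pig_latin("."))                               # ..
```

Unlike the strip(separator) approach, join keeps any leading or trailing separators that were in the input; whether that is desirable depends on what the tests expect.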

How to make my code detect end of string in python?

I'm trying to write code to split a sentence without its punctuation. For example, if a user inputs "Hello, how are you?", I can split the sentence to ['hello','how','are','you']
userinput = str(raw_input("Enter your sentence: "))
def sentence_split(sentence):
    result = []
    current_word = ""
    for letter in sentence:
        if letter.isalnum():
            current_word += letter
        else: ## this is a symbol or punctuation, e.g. reach end of a word
            if current_word:
                result.append(current_word)
                current_word = "" ## reinitialise for creating a new word
    return result
print "Split of your sentence:", sentence_split(userinput)
So far my code works, but if I enter a sentence that doesn't end with punctuation, the last word doesn't show up in the result. For example, if the input were "Hello, how are you", the result would be ['hello','how','are']. I guess that's because there's no punctuation to tell the code that the string has ended. Is there a way to make the program detect the end of the string, so that even if the input were "Hello, how are you", the result would still be ['hello','how','are','you']?

I've not tried to adjust your algorithm myself, but I think the method below should achieve what you are after.
def sentence_split(sentence):
    new_sentence = sentence[:]
    for letter in sentence:
        if not letter.isalnum():
            new_sentence = new_sentence.replace(letter, ' ')
    return new_sentence.split()
Now with it running:
runfile(r'C:\Users\cat\test.py', wdir=r'C:\Users\cat')
['Hello', 'how', 'are', 'you']
Edit: Fixed a bug with initialisation of new_sentence.

You could try something like this:
def split_string(text, splitlist):
    for sep in splitlist:
        text = text.replace(sep, splitlist[0])
    return filter(None, text.split(splitlist[0])) if splitlist else [text]
If you set splitlist to "!?,." or whatever you need to split on, this will first replace every instance of punctuation with the first sep from splitlist, and finally will split the whole sentence on the first sep, while removing empty strings from the returned list (that's what filter(None, list) does).
Or you could use this simple regex solution:
>>> import re
>>> s = "Hello, how are you?"
>>> re.findall(r'([A-Za-z]+)', s)
['Hello', 'how', 'are', 'you']
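One caveat on the split_string approach: in Python 3, filter() returns a lazy iterator rather than a list. The variant below is a sketch that assumes Python 3 and uses a list comprehension instead, so the result can be printed or indexed directly. The separator string passed in the example is just an illustration, and it has to include the space for words to be separated.

```python
def split_string(text, splitlist):
    # With no separators there is nothing to split on.
    if not splitlist:
        return [text]
    first = splitlist[0]
    # Normalise every separator to the first one, then split once.
    for sep in splitlist:
        text = text.replace(sep, first)
    # Drop the empty strings left behind by adjacent separators.
    return [word for word in text.split(first) if word]

print(split_string("Hello, how are you", " ,.!?"))  # ['Hello', 'how', 'are', 'you']
```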

Since the algorithm expects every word to end with punctuation or a space, you could just add a space to the end of the input to make sure the algorithm terminates properly:
userinput = str(raw_input("Enter your sentence: ")) + " "
Result:
Enter your sentence: hello how are you
Split of your sentence: ['hello', 'how', 'are', 'you']

Method 1:
Why not just use re.split('[a list of chars i do not like]', s)?
https://docs.python.org/2/library/re.html
Method 2:
Sanitize string (remove unwanted characters):
http://pastebin.com/raw.php?i=1j7ACbyK
Then do s.split(' ').
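A minimal sketch of Method 1, assuming the unwanted characters are everything except ASCII letters and digits (adjust the character class to taste). Note that re.split leaves an empty string behind when the input ends with a separator, so the result is filtered afterwards.

```python
import re

s = "Hello, how are you?"
# Split on runs of characters that are not ASCII letters or digits.
parts = re.split(r"[^A-Za-z0-9]+", s)
print(parts)   # ['Hello', 'how', 'are', 'you', '']
# The trailing '?' leaves an empty string behind, so filter it out.
words = [p for p in parts if p]
print(words)   # ['Hello', 'how', 'are', 'you']
```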

The problem with your code is that you don’t do anything with current_word at the end, unless you hit a non-alphanum character:
for letter in sentence:
    if letter.isalnum():
        current_word += letter
    else:
        if current_word:
            result.append(current_word)
            current_word = ""
return result
If the last character is alphanumeric as well, it will just be added to current_word, but current_word will never be appended to the result. You can fix this by simply duplicating the append logic after the loop:
for letter in sentence:
    if letter.isalnum():
        current_word += letter
    else:
        if current_word:
            result.append(current_word)
            current_word = ""
if current_word:
    result.append(current_word)
return result
So now, when current_word is non-empty after the loop, it will be appended to the result as well. And in case the last character was some punctuation, current_word will be empty again, so the condition of the if after the loop won’t be true.

small issue with whitespace/punctuation in python?

I have this function that will convert text language into English:
def translate(string):
    textDict={'y':'why', 'r':'are', "l8":'late', 'u':'you', 'gtg':'got to go',
              'lol': 'laugh out loud', 'ur': 'your',}
    translatestring = ''
    for word in string.split(' '):
        if word in textDict:
            translatestring = translatestring + textDict[word]
        else:
            translatestring = translatestring + word
    return translatestring
However, if I want to translate y u l8? it will return whyyoul8?. How would I go about separating the words when I return them, and how do I handle punctuation? Any help appreciated!

oneliner comprehension:
''.join(textDict.get(word, word) for word in re.findall(r'\w+|\W+', string))
[Edit] Fixed regex.
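Spelled out as a runnable sketch (the dictionary is copied from the question; the snake_case name text_dict is only an illustration), the one-liner works because the pattern cuts the input into alternating runs of word characters and non-word characters, so spaces and punctuation pass through unchanged:

```python
import re

text_dict = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
             'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

def translate(string):
    # Tokens alternate between runs of word characters and runs of everything else.
    tokens = re.findall(r'\w+|\W+', string)
    # Look each token up; anything not in the dictionary is kept as-is.
    return ''.join(text_dict.get(token, token) for token in tokens)

print(translate('y u l8?'))  # why you late?
```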

You're adding words to a string without spaces. If you're going to do things this way (instead of the way suggested to you in your previous question on this topic), you'll need to manually re-add the spaces since you split on them.

"y u l8" split on " ", gives ["y", "u", "l8"]. After substitution, you get ["why", "you", "late"] - and you're concatenating these without adding spaces, so you get "whyyoulate". Both forks of the if should be inserting a space.
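The difference is easy to see side by side. This sketch assumes the question's dictionary (renamed text_dict here for illustration) and joins the translated words with and without a separator. It also shows the second problem from the question: with the question mark attached, "l8?" is not a key in the dictionary, so it is left untranslated.

```python
text_dict = {'y': 'why', 'r': 'are', 'l8': 'late', 'u': 'you',
             'gtg': 'got to go', 'lol': 'laugh out loud', 'ur': 'your'}

words = "y u l8?".split(" ")
# Concatenating with no separator loses the spaces that split() removed.
print("".join(text_dict.get(w, w) for w in words))   # whyyoul8?
# Joining on a space puts them back, but 'l8?' still does not match 'l8'.
print(" ".join(text_dict.get(w, w) for w in words))  # why you l8?
```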

You can just add a + ' ' + to add a space. However, I think what you're trying to do is this:
import re
def translate_string(text):
    textDict={'y':'why', 'r':'are', "l8":'late', 'u':'you', 'gtg':'got to go', 'lol': 'laugh out loud', 'ur': 'your',}
    translatestring = ''
    # The capturing group makes re.split keep the separators (spaces, punctuation)
    for word in re.split(r'(\W+)', text):
        if word in textDict:
            translatestring += textDict[word]
        else:
            translatestring += word
    return translatestring
print(translate_string('y u l8?'))
This will print:
why you late?
This code handles stuff like question marks a bit more gracefully and preserves spaces and other characters from your input string, while retaining your original intent.

I'd like to suggest the following replacement for this loop:
for word in string.split(' '):
    if word in textDict:
        translatestring = translatestring + textDict[word]
    else:
        translatestring = translatestring + word
Replace it with:
for word in string.split(' '):
    translatestring += textDict.get(word, word)
The dict.get(foo, default) will look up foo in the dictionary and use default if foo isn't already defined.
(Time to run, short notes now: When splitting, you could split based on punctuation as well as whitespace, save the punctuation or whitespace, and re-introduce it when joining the output string. It's a bit more work, but it'll get the job done.)
