Remove only trailing whitespace from output - python

I have a task that was assigned to me for homework. Basically the problem is:
Write a program that can get rid of the brand names and replace them with the generic names.
The table below shows some brand names that have generic names. The mapping has also been provided to you in your program as the BRANDS dictionary.
BRANDS = {
    'Velcro': 'hook and loop fastener',
    'Kleenex': 'tissues',
    'Hoover': 'vacuum',
    'Bandaid': 'sticking plaster',
    'Thermos': 'vacuum flask',
    'Dumpster': 'garbage bin',
    'Rollerblade': 'inline skate',
    'Asprin': 'acetylsalicylic acid'
}
This is my code:
sentence = input('Sentence: ')
sentencelist = sentence.split()
for c in sentencelist:
    if c in BRANDS:
        d = c.replace(c, BRANDS[c])
        print(d, end=' ')
    else:
        print(c, end=' ')
My output:
Sentence: I bought some Velcro shoes.
I bought some hook and loop fastener shoes.
Expected output:
Sentence: I bought some Velcro shoes.
I bought some hook and loop fastener shoes.
It looks the same, but in my output there was an extra whitespace after 'shoes.' when there isn't supposed to be a whitespace. So how do I remove this whitespace?
I know you could do rstrip() or replace() and I tried it, but it would just jumble everything together when I just need to remove the trailing whitespace and not remove any other whitespace. If the user put the brand name in the middle of the sentence, and I used rstrip(), it would join the brand name and the rest of the sentence together.

The key is to use a string's join method to concatenate everything for you. For example, to put a space between a bunch of strings without putting a space after the last bit, do
' '.join(bunch_of_strings)
The strings have to be in an iterable, like a list, for that to work. You could make the list like this:
edited_list = []
for word in sentence_list:
    if word in BRANDS:
        edited_list.append(BRANDS[word])
    else:
        edited_list.append(word)
A much shorter alternative would be
edited_list = [BRANDS.get(word, word) for word in sentence_list]
Either way, you can combine the edited sentence using the join method:
print(' '.join(edited_list))
This being Python, you can do the whole thing as a one-liner without using an intermediate list at all:
print(' '.join(BRANDS.get(word, word) for word in sentence_list))
Finally, you could do the joining in print itself using splat notation. Here, you would pass in each element of your list as a separate argument, and use the default sep argument to insert the spaces:
print(*edited_list)
As an aside, d = c.replace(c, BRANDS[c]) is a completely pointless equivalent of just d = BRANDS[c]. Since strings are immutable, any time you do c.replace(c, ...), you are just returning the replacement in a somewhat illegible manner.
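Putting the pieces of this answer together, a minimal sketch might look like the following. The function name replace_brands is my own invention, and BRANDS is trimmed to two entries to keep the example short.

```python
# Trimmed copy of the BRANDS mapping from the question (two entries only).
BRANDS = {
    'Velcro': 'hook and loop fastener',
    'Kleenex': 'tissues',
}


def replace_brands(sentence):
    """Swap brand names for generic names, with one space between words."""
    # replace_brands is a hypothetical helper name, not part of the assignment.
    words = sentence.split()
    # join only puts the separator between items, never after the last one.
    return ' '.join(BRANDS.get(word, word) for word in words)


print(replace_brands('I bought some Velcro shoes.'))
# I bought some hook and loop fastener shoes.
```

Because the result is built as a single string before printing, there is no trailing space to clean up afterwards.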

The problem is that print(c, end=' ') will always print a space after c. Here is a pretty minimal change to fix that:
sentence = input('Sentence: ')
sentencelist = sentence.split()
is_first = True
for c in sentencelist:
    if not is_first:
        print(' ', end='')
    is_first = False
    if c in BRANDS:
        d = c.replace(c, BRANDS[c])
        print(d, end='')
    else:
        print(c, end='')
As others have pointed out, this can be tidied up, e.g., d = c.replace(c, BRANDS[c]) is equivalent to d = BRANDS[c], and if you change it to c = BRANDS[c], then you could use a single print call and no else clause.
But you also have to be careful with your approach, because it will fail for sentences like "I bought a Hoover." The sentence.split() operation will keep "Hoover." as a single item, and that will fail the c in BRANDS test due to the extra period. You could try to separate words from punctuation, but that won't be easy. Another solution would be to apply all the replacements to each element, or equivalently, to the whole sentence. That should work fine in this case since you may not have to worry about replacement words that could be embedded in longer words (e.g., accidentally replacing 'cat' embedded in 'caterpillar'). So something like this may work OK:
new_sentence = sentence
for brand, generic in BRANDS.items():
    new_sentence = new_sentence.replace(brand, generic)
print(new_sentence)
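If you would rather keep the word-by-word lookup, one possible middle ground is to peel trailing punctuation off each word before checking the dictionary. This is only a sketch: replace_word and replace_brands are hypothetical names, BRANDS is trimmed, and leading punctuation (such as an opening quote) is deliberately not handled.

```python
import string

# Trimmed copy of the BRANDS mapping from the question.
BRANDS = {'Hoover': 'vacuum', 'Kleenex': 'tissues'}


def replace_word(word):
    """Look up a word in BRANDS, ignoring punctuation stuck to its end."""
    core = word.rstrip(string.punctuation)   # 'Hoover.' -> 'Hoover'
    trailing = word[len(core):]              # whatever was stripped, e.g. '.'
    return BRANDS.get(core, core) + trailing


def replace_brands(sentence):
    # Assumption: words are separated by whitespace only.
    return ' '.join(replace_word(word) for word in sentence.split())


print(replace_brands('I bought a Hoover.'))
# I bought a vacuum.
```

Unlike replacing across the whole sentence, this only matches complete words, so a brand name embedded in a longer word is left alone.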

Your end=' ' unconditionally appends extra spaces to your output. There is no consistent way to undo this (echoing a backspace character only works for terminals, seeking only works for files, etc.).
The trick is to avoid printing it in the first place:
sentence = input('Sentence: ')
sentencelist = sentence.split()
result = []
for c in sentencelist:
    # Perform replacement if needed
    if c in BRANDS:
        c = BRANDS[c]  # c.replace(c, BRANDS[c]) is a weird way to spell BRANDS[c]
    # Append possibly replaced value to list of results
    result.append(c)
# Add spaces only in between elements, not at the end, then print all at once
print(' '.join(result))
# Or as a trick to let print add the spaces and convert non-strings to strings:
print(*result)
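The difference between the two print calls only shows up when the list holds something other than strings. The sketch below uses made-up sample data and captures print's output in an io.StringIO buffer purely so it can be inspected.

```python
import io

# Made-up sample data mixing strings and numbers.
items = ['tissues', 3, 2.5]

# print converts each argument with str() and puts sep (default ' ')
# between them, so nothing extra is added after the last item except
# the usual newline.
buffer = io.StringIO()
print(*items, file=buffer)
printed = buffer.getvalue()

# str.join only accepts strings, so non-strings must be converted first;
# ' '.join(items) would raise TypeError here.
joined = ' '.join(str(item) for item in items)

print(repr(printed))  # 'tissues 3 2.5\n'
print(repr(joined))   # 'tissues 3 2.5'
```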

You don't have to iterate through the words with an explicit loop.
Try this code; it will work and will not have the trailing whitespace issue anymore:
sentence = ' '.join(str(BRANDS.get(word, word)) for word in input_words)
Here, make a list named "input_words" containing the words that you want to process.
Happy Learning!

Related

Looping and Lists - Grok Learning

I've just started learning to code Python today on Grok Learning and I'm currently stuck on this problem. I have to create a code that reads a message and:
read the words in reverse order
only take the words in the message that start with an uppercase letter
make everything lowercase
I've done everything right but I can't get rid of a space at the end. I was wondering if anyone knew how to remove it. Here is my code:
code = []
translation = []
msg = input("code: ")
code = msg.split()
code.reverse()
for c in code:
    if c[0].isupper():
        translation.append(c)
print("says: ", end='')
for c in translation:
    c = c.lower()
    print(c, end = ' ')
Thank you :)
You need to iterate over all of the words in translation except the last, then print the last one separately without the space:
for c in translation[:-1]:
    c = c.lower()
    print(c, end = ' ')
if translation:  # avoid an IndexError when no words were capitalised
    print(translation[-1].lower(), end='')
You can simply use join() and f-strings.
result = ' '.join(translation).lower()
print(f"says: {result}")
This is a common problem:
You have a sequence of n elements
You want to format them in a string using a separator between the elements, resulting in n-1 separators
I'd say the pythonic way to do this, if you really want to build the resulting string, is str.join(). It takes any iterable, for example a list, of strings, and joins all the elements together using the string it was called on as a separator. Take this example:
my_list = ["1", "2", "3"]
joined = ", ".join(my_list)
# joined == "1, 2, 3"
In your case, you could do
msg = "Hello hello test Test asd adsa Das"
code = msg.split()
code.reverse()
translation = [c.lower() for c in code if c[0].isupper()]
print("says: ", end='')
print(" ".join(translation))
# output:
# says: das test hello
For printing, note that print can also take multiple elements and print them using a separator. So, you could use this:
print(*translation, sep=" ")
You could also leave out explicitly setting sep because a space is the default:
print(*translation)
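As a rough sketch, the whole exercise can be wrapped in a function that returns the final string, which also makes the edge case of a message with no capitalised words easy to check. The name translate is my own choice, not something from the exercise.

```python
def translate(msg):
    """Return the decoded message in the 'says: ...' format."""
    # translate is a hypothetical name chosen for this example.
    # reversed() walks the words back to front without changing the list;
    # split() never yields empty strings, so word[0] is always safe.
    words = [word.lower() for word in reversed(msg.split()) if word[0].isupper()]
    return "says: " + " ".join(words)


print(translate("Hello hello test Test asd adsa Das"))
# says: das test hello
```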

How to make shortcut of first letters of any text?

I need to write a function that returns the first letters (and make it uppercase) of any text like:
shortened = shorten("Don't repeat yourself")
print(shortened)
Expected output:
DRY
and:
shortened = shorten("All terrain armoured transport")
print(shortened)
Expected output:
ATAT
Use list comprehension and join
shortened = "".join([x[0] for x in text.title().split(' ') if x])
Using regex you can match all characters except the first letter of each word, replace them with an empty string to remove them, then capitalize the resulting string:
import re
def shorten(sentence):
    return re.sub(r"\B[\S]+\s*","",sentence).upper()
print(shorten("Don't repeat yourself"))
Output:
DRY
text = 'this is a test'
output = ''.join(char[0] for char in text.title().split(' '))
print(output)
TIAT
Let me explain how this works.
My first step is to capitalize the first letter of each word
text.title()
Now I want to be able to separate each word by the space in between, this will become a list
text.title().split(' ')
With that I'd end up with 'This','Is','A','Test' so now I obviously only want the first character of each word in the list
for word in text.title().split(' '):
    print(word[0]) # T I A T
Now I can lump all that into something called list comprehension
output = [char[0] for char in text.title().split(' ')]
# ['T','I','A','T']
I can use ''.join() to combine them together, I don't need the [] brackets anymore because it doesn't need to be a list
output = ''.join(char[0] for char in text.title().split(' '))
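Wrapped up as the shorten function the question asks for, the same idea might look like this. It is a sketch that swaps in split() with no argument (so repeated spaces do not produce empty words) and upper-cases the result at the end instead of calling title() first.

```python
def shorten(text):
    """Return the upper-cased first letter of every word in text."""
    # split() with no argument splits on any run of whitespace,
    # so it never yields an empty string and word[0] is always valid.
    return ''.join(word[0] for word in text.split()).upper()


print(shorten("Don't repeat yourself"))            # DRY
print(shorten("All terrain armoured transport"))   # ATAT
```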

split strings with multiple special characters into lists without importing anything in python

i need to make a program that will capitalize the first word in a sentence and i want to be sure that all the special characters that are used to end a sentence can be used.
i can not import anything! this is for a class and i just want some examples to do this.
i have tried to use if to look in the list to see if it finds the matching character and do the correct split operation...
this is the function i have now... i know its not good at all as it just returns the original string...
def getSplit(userString):
    userStringList = []
    if "? " in userString:
        userStringList=userString.split("? ")
    elif "! " in userStringList:
        userStringList = userString.split("! ")
    elif ". " in userStringList:
        userStringList = userString.split(". ")
    else:
        userStringList = userString
    return userStringList
i want to be able to input something like this is a test. this is a test? this is definitely a test!
and get ['this is a test.', 'this is a test?', 'this is definitely a test!']
and then this is going to send the list of sentences to another function to make the first letter capitalized for each sentence.
this is an old homework assignment where i could only make it use one special character to separate the string into a list. but i want the user to be able to put in more than just one kind of sentence...
This may help. Use str.replace to replace the special chars with spaces and then use str.split
Ex:
def getSplit(userString):
    return userString.replace("!", " ").replace("?", " ").replace(".", " ").split()
print(list(map(lambda x: x.capitalize(), getSplit("sdfsdf! sdfsdfdf? sdfsfdsf.sdfsdfsd!fdfgdfg?dsfdsfgf"))))
Normally, you could use re.split(), but since you cannot import anything, the best option would be just to do a for loop. Here it is:
def getSplit(user_input):
    n = len(user_input)
    sentences =[]
    previdx = 0
    for i in range(n - 1):
        if(user_input[i:i+2] in ['. ', '! ', '? ']):
            sentences.append(user_input[previdx:i+2].capitalize())
            previdx = i + 2
    sentences.append(user_input[previdx:n].capitalize())
    return "".join(sentences)
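The question asks for a list of sentences that keep their ending punctuation. A variation on the same loop that returns such a list, again without importing anything, could look like this sketch. The names split_sentences and capitalise_first are hypothetical, and it assumes a sentence ends with '.', '!' or '?' followed by a single space.

```python
def split_sentences(text):
    """Split text into sentences, keeping the ending punctuation."""
    sentences = []
    start = 0
    for i in range(len(text) - 1):
        # A sentence ends at '.', '!' or '?' when a space follows it.
        if text[i] in '.!?' and text[i + 1] == ' ':
            sentences.append(text[start:i + 1])
            start = i + 2  # skip the space after the punctuation
    if start < len(text):
        sentences.append(text[start:])  # whatever remains is the last sentence
    return sentences


def capitalise_first(sentence):
    # Upper-case only the first character and leave the rest untouched.
    return sentence[:1].upper() + sentence[1:]


parts = split_sentences("this is a test. this is a test? this is definitely a test!")
print(parts)
# ['this is a test.', 'this is a test?', 'this is definitely a test!']
print([capitalise_first(part) for part in parts])
```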
I would split the string at each white space. Then scan the list for words that contain the special character. If any is present, the next word is capitalised. Join the list back at the end. Note that joining with single spaces collapses any runs of whitespace in the original text.
def capitalise(text):
    new_words = []
    capitalise_next = True  # the very first word starts a sentence
    for word in text.split():
        new_words.append(word.capitalize() if capitalise_next else word)
        # A word containing ., ! or ? ends a sentence, so capitalise the next one
        capitalise_next = "." in word or "!" in word or "?" in word
    return " ".join(new_words)
If you can use the re module which is available by default in python, this is how you could do it:
import re
a = 'test this. and that, and maybe something else?even without space. or with multiple.\nor line breaks.'
print(re.sub(r'[.!?]\s*\w', lambda x: x.group(0).upper(), a))
Would lead to:
test this. And that, and maybe something else?Even without space. Or with multiple.\nOr line breaks.

Python get the x first words in a string

I'm looking for a code that takes the 4 (or 5) first words in a script.
I tried this:
import re
my_string = "the cat and this dog are in the garden"
a = my_string.split(' ', 1)[0]
b = my_string.split(' ', 1)[1]
But I can't take more than 2 strings:
a = the
b = cat and this dog are in the garden
I would like to have:
a = the
b = cat
c = and
d = this
...
You can use slice notation on the list created by split:
my_string.split()[:4] # first 4 words
my_string.split()[:5] # first 5 words
N.B. these are example commands. You should use one or the other, not both in a row.
The second argument of the split() method is the limit. Don't use it and you will get all words.
Use it like this:
my_string = "the cat and this dog are in the garden"
splitted = my_string.split()
first = splitted[0]
second = splitted[1]
...
Also, don't call split() every time when you want a word, it is expensive. Do it once and then just use the results later, like in my example.
As you can see, there is no need to add the ' ' delimiter since the default delimiter for the split() function (None) matches all whitespace. You can use it however if you don't want to split on Tab for example.
You can split a string on whitespace easily enough, but if your string doesn't happen to have enough words in it, the assignment will fail because the list has fewer items than there are names to unpack into.
a, b, c, d, e = my_string.split()[:5] # May fail
You'd be better off keeping the list as is instead of assigning each member to an individual name.
words = my_string.split()
at_most_five_words = words[:5] # terrible variable name
That's a terrible variable name, but I used it to illustrate the fact that you're not guaranteed to get five words – you're only guaranteed to get at most five words.
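To make the "at most n words" behaviour concrete, here is a small sketch. The helper name first_words is hypothetical. It passes maxsplit so that split() stops early instead of splitting the whole string, and the last lines show one way to pad the result with None if you really do want a fixed number of names.

```python
def first_words(text, n):
    """Return at most the first n whitespace-separated words of text."""
    # maxsplit=n stops splitting after n words; the slice drops the
    # unsplit remainder that split() leaves as the final item.
    return text.split(maxsplit=n)[:n]


my_string = "the cat and this dog are in the garden"
print(first_words(my_string, 4))   # ['the', 'cat', 'and', 'this']
print(first_words("hi there", 5))  # ['hi', 'there']

# Padding with None makes fixed-size unpacking safe for short strings.
a, b, c = (first_words("hi there", 3) + [None] * 3)[:3]
print(a, b, c)                     # hi there None
```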

How do I calculate the number of times a word occurs in a sentence?

So I've been learning Python for some months now and was wondering how I would go about writing a function that will count the number of times a word occurs in a sentence. I would appreciate if someone could please give me a step-by-step method for doing this.
Quick answer:
def count_occurrences(word, sentence):
    return sentence.lower().split().count(word)
'some string'.split() will split the string on whitespace (spaces, tabs and linefeeds) into a list of word-ish things. Then ['some', 'string'].count(item) returns the number of times item occurs in the list.
That doesn't handle removing punctuation. You could do that using string.maketrans and str.translate.
# Make collection of chars to keep (don't translate them)
import string
keep = string.lowercase + string.digits + string.whitespace
table = string.maketrans(keep, keep)
delete = ''.join(set(string.printable) - set(keep))
def count_occurrences(word, sentence):
    return sentence.lower().translate(table, delete).split().count(word)
The key here is that we've constructed the string delete so that it contains all the ascii characters except letters, numbers and spaces. Then str.translate in this case takes a translation table that doesn't change the string, but also a string of chars to strip out.
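The snippet above relies on Python 2 names (string.lowercase, string.maketrans, and the two-argument form of translate). On Python 3 a roughly equivalent sketch would use str.maketrans with a third argument, which lists the characters to delete:

```python
import string

# Characters to keep: lowercase letters, digits and whitespace.
keep = string.ascii_lowercase + string.digits + string.whitespace
# Everything else in the printable ASCII set gets deleted.
delete = ''.join(set(string.printable) - set(keep))
# With three arguments, str.maketrans maps every character in the
# third argument to None, which makes translate() drop it.
table = str.maketrans('', '', delete)


def count_occurrences(word, sentence):
    # Lower-case first so that capital letters are not stripped out.
    return sentence.lower().translate(table).split().count(word)


print(count_occurrences("sentence", "This sentence is a simple sentence."))
# 2
```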
wilberforce has the quick, correct answer, and I'll give the long winded 'how to get to that conclusion' answer.
First, here are some tools to get you started, and some questions you need to ask yourself.
You need to read the section on Sequence Types, in the python docs, because it is your best friend for solving this problem. Seriously, read it. Once you have read that, you should have some ideas. For example you can take a long string and break it up using the split() function. To be explicit:
mystring = "This sentence is a simple sentence."
result = mystring.split()
print result
print "The total number of words is: " + str(len(result))
print "The word 'sentence' occurs: " + str(result.count("sentence"))
Takes the input string and splits it on any whitespace, and will give you:
["This", "sentence", "is", "a", "simple", "sentence."]
The total number of words is: 6
The word 'sentence' occurs: 1
Now note here that you do have the period still at the end of the second 'sentence'. This is a problem because 'sentence' is not the same as 'sentence.'. If you are going to go over your list and count words, you need to make sure that the strings are identical. You may need to find and remove some punctuation.
A naive approach to this might be:
no_period_string = mystring.replace(".", " ")
print no_period_string
To get me a period-less sentence:
"This sentence is a simple sentence"
You also need to decide if your input is going to be just a single sentence, or maybe a paragraph of text. If you have many sentences in your input, you might want to find a way to break them up into individual sentences, and find the periods (or question marks, or exclamation marks, or other punctuation that ends a sentence). Once you find out where in the string the 'sentence terminator' is you could maybe split up the string at that point, or something like that.
You should give this a try yourself - hopefully I've peppered in enough hints to get you to look at some specific functions in the documentation.
Simplest way:
def count_occurrences(word, sentence):
    return sentence.count(word)
text=input("Enter your sentence:")
print("'the' appears", text.count("the"),"times")
simplest way to do it
The problem with using the count() method is that it does not always give the correct number of occurrences when matches overlap, for example
print('banana'.count('ana'))
output
1
but 'ana' occurs twice in 'banana'
To solve this issue, i used
def total_occurrence(string,word):
    count = 0
    tempsting = string
    while(word in tempsting):
        count +=1
        tempsting = tempsting[tempsting.index(word)+1:]
    return count
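The same overlapping count can be done without building a new string on every step by using the optional start argument of str.find. This is just a sketch; count_overlapping is a hypothetical name, and the empty-word check is my own addition, since an empty string matches at every position.

```python
def count_overlapping(text, word):
    """Count occurrences of word in text, including overlapping ones."""
    if not word:
        raise ValueError("word must be non-empty")
    count = 0
    start = text.find(word)
    while start != -1:
        count += 1
        # Resume one character after the last match so overlaps are found.
        start = text.find(word, start + 1)
    return count


print(count_overlapping('banana', 'ana'))  # 2
print('banana'.count('ana'))               # 1
```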
You can do it like this:
def countWord(word):
    numWord = 0
    for i in range(1, len(word)-1):
        if word[i-1:i+3] == 'word':
            numWord += 1
    print 'Number of times "word" occurs is:', numWord
then calling the string:
countWord('wordetcetcetcetcetcetcetcword')
will return: Number of times "word" occurs is: 2
def check_Search_WordCount(mySearchStr, mySentence):
    len_mySentence = len(mySentence)
    len_Sentence_without_Find_Word = len(mySentence.replace(mySearchStr,""))
    len_Remaining_Sentence = len_mySentence - len_Sentence_without_Find_Word
    count = len_Remaining_Sentence/len(mySearchStr)
    return (int(count))
I assume that you just know about python string and for loop.
def count_occurences(s,word):
    count = 0
    for i in range(len(s)):
        if s[i:i+len(word)] == word:
            count += 1
    return count
mystring = "This sentence is a simple sentence."
myword = "sentence"
print(count_occurences(mystring,myword))
explanation:
s[i:i+len(word)]: slicing the string s to extract a word having the same length with the word (argument)
count += 1 : increase the counter whenever matched.
