how to remove substring with and without space

I have to remove the first instance of the word "hard" from a given string, but I am not sure how to remove it both with and without surrounding spaces.
For example:
string1 = "it is a hard rock" needs to become "it is a rock"
string2 = "play hard" needs to become "play"
However, when I use
string1 = string1.replace('hard' + ' ', '', 1)
it does not work on string2, because "hard" comes at the end with no trailing space. Is there any way to deal with this?
Lastly, if we have string3,
string3 = "play hard to be hard" needs to become "play to be hard"
We want only the first occurrence to be replaced.

Maybe a simple
.replace(" hard", "").replace("hard ", "")
already works?
If not, I would suggest using a regular expression. But then you would have to give us a few more examples that need to be covered.

Seems like a job for some regular expression:
import re
' '.join(filter(bool, re.split(r' *\bhard\b *', 'it is a hard rock', maxsplit=1)))
' *' eats up spaces around the word, \b guarantees that only full words match, filter(bool, ...) removes the empty string that is left over when the word sits at the very start or end of the input, and finally ' '.join reinstates a single space.
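As a minimal sketch, the one-liner above can be wrapped in a small helper and checked against the three strings from the question. The name remove_first_word is made up for illustration, and re.escape is added on the assumption that the word might contain regex metacharacters.

```python
import re

def remove_first_word(text, word):
    """Remove the first whole-word occurrence of `word` and tidy the spaces."""
    # re.escape protects against regex metacharacters inside `word`
    pattern = r' *\b' + re.escape(word) + r'\b *'
    # maxsplit=1 means only the first occurrence acts as a separator
    parts = re.split(pattern, text, maxsplit=1)
    # Drop empty pieces (word at the start or end) and rejoin with one space
    return ' '.join(part for part in parts if part)

print(remove_first_word("it is a hard rock", "hard"))
print(remove_first_word("play hard", "hard"))
print(remove_first_word("play hard to be hard", "hard"))
```

Because of the \b assertions, a string such as "hardly hard" keeps "hardly" and loses only the standalone word.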

Use str.partition:
# using an if block
" ".join(s.strip() for s in thestring.partition("hard") if s != "hard")
# or with slice notation
" ".join(s.strip() for s in thestring.partition("hard")[::2])

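A possible variant of the str.partition approach, sketched under the assumption that a trailing space is unwanted when the word is the last thing in the string. The helper name remove_first is hypothetical. Note that partition matches plain substrings, so unlike a regex with \b it would also cut "hard" out of "hardly".

```python
def remove_first(text, word):
    """Remove the first occurrence of `word` using str.partition."""
    # partition returns (before, separator, after); the separator is discarded
    before, _sep, after = text.partition(word)
    # Keep only non-empty pieces so no stray space is left behind
    pieces = [p.strip() for p in (before, after) if p.strip()]
    return " ".join(pieces)

print(remove_first("it is a hard rock", "hard"))
print(remove_first("play hard", "hard"))
print(remove_first("play hard to be hard", "hard"))
```
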
Related

how to specify the exact words, like Legend and not derivates like Legendary? [duplicate]

I need a way to find an exact word or phrase in a string.
Everything I have read online only shows how to search for a sequence of letters in a string, so
98787This is correct
will still come back as true in an if statement.
This is what I have so far.
if 'This is correct' in text:
    print("correct")
This matches with any combination of characters in front of This is correct. For example, fkrjThis is correct, 4123This is correct and lolThis is correct all come back as true in the if statement, whereas I want it to come back as true only if the text exactly matches This is correct.

You can use the word-boundaries of regular expressions. Example:
import re
s = '98787This is correct'
for words in ['This is correct', 'This', 'is', 'correct']:
    if re.search(r'\b' + words + r'\b', s):
        print('{0} found'.format(words))
That yields:
is found
correct found
For an exact match, replace the \b assertions with ^ and $ to anchor the match to the beginning and end of the line.
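The two options can be sketched side by side. The helper names contains_phrase and is_exactly are hypothetical; re.escape is an addition, assuming the phrase could contain characters that are special in a regex, and re.fullmatch is used here as an equivalent of anchoring with ^ and $ on a single-line string.

```python
import re

def contains_phrase(text, phrase):
    """True if `phrase` occurs in `text` delimited by word boundaries."""
    return re.search(r'\b' + re.escape(phrase) + r'\b', text) is not None

def is_exactly(text, phrase):
    """True only if the whole of `text` is `phrase`."""
    return re.fullmatch(re.escape(phrase), text) is not None

print(contains_phrase('98787This is correct', 'This is correct'))
print(contains_phrase('Yes. This is correct!', 'This is correct'))
print(is_exactly('This is correct', 'This is correct'))
print(is_exactly('98787This is correct', 'This is correct'))
```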

Use the comparison operator == instead of in then:
if text == 'This is correct':
    print("Correct")
This will check to see if the whole string is just 'This is correct'. If it isn't, it will be False

Actually, you should look for 'This is correct' string surrounded by word boundaries.
So
import re
if re.search(r'\bThis is correct\b', text):
    print('correct')
should work for you.

I suspect that you are looking for the startswith() method. It checks whether a string begins with the characters of another string:
"abcde".startswith("abc") -> True
"abcde".startswith("bcd") -> False
There is also the endswith() method, for checking at the other end.

You can make a few changes.
elif 'This is correct' in text[:len('This is correct')]:
or
elif ' This is correct ' in ' '+text+' ':
Both work. The latter is more flexible.

This can be a complicated problem to solve without regular expressions, but I came up with a little trick.
First we need to pad the original string with whitespaces.
After that we can search the text, which is also padded with whitespaces.
Example code here:
incorrect_str = "98787This is correct"
correct_str = "This is a great day and This is correct"
# Padding with whitespaces
new_incorrect_str = " " + incorrect_str + " "
new_correct_str = " " + correct_str + " "
if " This is correct " in new_correct_str:
    print("correct")
else:
    print("incorrect")
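The padding trick can be folded into a one-line helper so both sample strings are checked the same way. This is only a sketch: has_phrase is a hypothetical name, and the approach assumes the phrase is delimited by plain spaces, so "This is correct." with a trailing full stop would not match.

```python
def has_phrase(text, phrase):
    """True if `phrase` appears in `text` with a space (or string edge) on both sides."""
    # Padding both strings lets the start and end of `text` count as delimiters
    return f" {phrase} " in f" {text} "

print(has_phrase("98787This is correct", "This is correct"))
print(has_phrase("This is a great day and This is correct", "This is correct"))
```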

Break up the string into a list of strings with .split() then use the in operator.
This is much simpler than using regular expressions.
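A short sketch of that idea, using the Legend / Legendary example from the question title. The helper name has_word is made up for illustration. It only works for single words, since a phrase containing spaces can never equal one element of the split list, and punctuation stays attached to the words.

```python
def has_word(text, word):
    """True if `word` is one of the whitespace-separated tokens of `text`."""
    return word in text.split()

text = "The Legendary sword"
print("Legend" in text)          # substring test: matches inside "Legendary"
print(has_word(text, "Legend"))  # token test: no token equals "Legend"
print(has_word(text, "Legendary"))
```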

Below is a solution without using regular expressions. The program searches for an exact word, in this case 'CASINO', and prints the matching sentences.
words_list = ["The Learn Python Challenge Casino.",
              "They bought a car while at the casino",
              "Casinoville"]
search_string = 'CASINO'
def text_manipulation(words_list, search_string):
    search_result = []
    for sentence in words_list:
        # Strip basic punctuation, then compare whole words case-insensitively
        words = sentence.replace('.', '').replace(',', '').split(' ')
        if any(w.upper() == search_string for w in words):
            search_result.append(sentence)
    print(search_result)
text_manipulation(words_list, search_string)
This will print the results - ['The Learn Python Challenge Casino.', 'They bought a car while at the casino']

Make strings lowercase only after a right arrow

I'd like to make strings lowercase only after a right arrow (and until it hits a comma) in Python.
Also, I'd prefer to write it in one line, if I can.
Here's my code:
import re
line = "For example, →settle ACCOUNTs. After that, UPPER CASEs are OK."
string1 = re.sub(r'→([A-Za-z ]+)', r'→\1', line)
# string1 = re.sub(r'→([A-Za-z ]+)', r'→\1.lower()', line) # Pseudo-code in my brain
print(string1)
# Expecting: "For example, →settle accounts. After that, UPPER CASEs are OK."
# Should I always write two lines of code, like this:
string2 = re.findall(r'→([A-Za-z ]+)', line)
print('→' + string2[0].lower())
# ... and add "For example, " and ". After that, UPPER CASEs are OK." ... later?
I believe that there must be a better way. How would you do this?
Thank you in advance.

import re
line = "For example, →settle ACCOUNTs. After that, UPPER CASEs are OK."
string1 = re.sub(r'→[A-Za-z ]+', lambda match: match.group().lower(), line)
print(string1)
# For example, →settle accounts. After that, UPPER CASEs are OK.
From the documentation:
re.sub(pattern, repl, string, count=0, flags=0)
...repl can be a string or a function...
If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
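To make the function form of repl more concrete, here is the same substitution written with a named function instead of a lambda, plus a variant that uses capture groups. The names lower_match and lower_after_arrow are hypothetical and chosen only for this sketch.

```python
import re

line = "For example, →settle ACCOUNTs. After that, UPPER CASEs are OK."

def lower_match(match):
    # group(0) is the whole matched text, arrow included
    return match.group(0).lower()

def lower_after_arrow(match):
    # group(1) is the arrow, group(2) is the text that follows it
    return match.group(1) + match.group(2).lower()

result_a = re.sub(r'→[A-Za-z ]+', lower_match, line)
result_b = re.sub(r'(→)([A-Za-z ]+)', lower_after_arrow, line)
print(result_a)
print(result_b)
```

Both calls stop at the first character outside [A-Za-z ], which in this line is the full stop after "ACCOUNTs".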

How can I output a string excluding ALL whitespaces? [duplicate]

Basically, I'm trying to write a Python program where a user inputs a sentence. However, I need my code to remove ALL whitespace (e.g. tabs, space, index, etc.) and print the result.
This is what I have so far:
def output_without_whitespace(text):
    newText = text.split("")
    print('String with no whitespaces: '.join(newText))
I'm clear that I'm doing a lot wrong here and I'm missing plenty, but, I haven't been able to thoroughly go over splitting and joining strings yet, so it'd be great if someone explained it to me.
This is the whole code that I have so far:
text = input(str('Enter a sentence: '))
print(f'You entered: {text}')
def get_num_of_characters(text):
    result = 0
    for char in text:
        result += 1
    return result
print('Number of characters: ', get_num_of_characters(text))
def output_without_whitespace(text):
    newtext = "".join(text.split())
    print(f'String without whitespaces: {newtext}')
I FIGURED OUT MY PROBLEM!
I realize that instead of this line of code:
print(f'String without whitespaces: {newtext}')
it's supposed to be:
print('String without whitespaces: ', output_without_whitespace(text))
The sentence without whitespaces was not printing back out to me because I was never calling my function!

You have the right idea, but here's how to implement it with split and join:
def output_without_whitespace(text):
    return ''.join(text.split())
so that:
output_without_whitespace(' this\t is a\n test..\n ')
would return:
thisisatest..

A trivial solution is to just use split and rejoin (similar to what you are doing):
def output_without_whitespace(text):
    return ''.join(text.split())
First we split the initial string to a list of words, then we join them all together.
So to think about it a bit:
text.split()
will give us a list of words (split by any whitespace). So for example:
'hello world'.split() -> ['hello', 'world']
And finally
''.join(<result of text.split()>)
joins all of the words in the given list to a single string. So:
''.join(['hello', 'world']) -> 'helloworld'
See Remove all whitespace in a string in Python for more ways to do it.
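For comparison, a sketch of the split-and-join approach next to one regular-expression alternative. The function names are hypothetical. For ordinary spaces, tabs and newlines the two give the same result.

```python
import re

def strip_whitespace_split(text):
    # split() with no argument splits on any run of whitespace
    return ''.join(text.split())

def strip_whitespace_regex(text):
    # \s+ matches one or more whitespace characters
    return re.sub(r'\s+', '', text)

sample = ' this\t is a\n test..\n '
print(strip_whitespace_split(sample))
print(strip_whitespace_regex(sample))
```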

Get the input, split it, and join it:
s = ''.join(input('Enter string: ').split())
print(s)
Enter string: vash the stampede
vashthestampede

There are a few different ways to do this, but this seems the most obvious one to me. It is simple and efficient.
>>> with_spaces = ' The quick brown fox '
>>> list_no_spaces = with_spaces.split()
>>> ''.join(list_no_spaces)
'Thequickbrownfox'
.split() with no parameter splits a string into a list wherever there's one or more whitespace characters, leaving out the whitespace.
''.join(list_no_spaces) joins elements of the list into a string with nothing between the elements, which is what you want here: 'Thequickbrownfox'.
If you had used ','.join(list_no_spaces) you'd get 'The,quick,brown,fox'.
Experienced Python programmers tend to use regular expressions sparingly. Often it's better to use tools like .split() and .join() to do the work, and keep regular expressions for where there is no alternative.

split strings with multiple special characters into lists without importing anything in python

I need to make a program that will capitalize the first word in a sentence, and I want to be sure that all the special characters used to end a sentence are handled.
I cannot import anything! This is for a class and I just want some examples of how to do this.
I have tried using if to check whether a matching character is present and then do the corresponding split operation...
This is the function I have now. I know it's not good at all, as it just returns the original string...
def getSplit(userString):
    userStringList = []
    if "? " in userString:
        userStringList=userString.split("? ")
    elif "! " in userStringList:
        userStringList = userString.split("! ")
    elif ". " in userStringList:
        userStringList = userString.split(". ")
    else:
        userStringList = userString
    return userStringList
I want to be able to input something like this is a test. this is a test? this is definitely a test!
and get ['this is a test.', 'this is a test?', 'this is definitely a test!']
and then send the list of sentences to another function that capitalizes the first letter of each sentence.
This is an old homework assignment in which I could only make it use one special character to separate the string into a list, but I want the user to be able to put in more than just one kind of sentence...

This may help. Use str.replace to replace the special characters with spaces and then use str.split.
Ex:
def getSplit(userString):
    return userString.replace("!", " ").replace("?", " ").replace(".", " ").split()
print([s.capitalize() for s in getSplit("sdfsdf! sdfsdfdf? sdfsfdsf.sdfsdfsd!fdfgdfg?dsfdsfgf")])

Normally, you could use re.split(), but since you cannot import anything, the best option would be just to do a for loop. Here it is:
def getSplit(user_input):
    n = len(user_input)
    sentences = []
    previdx = 0
    for i in range(n - 1):
        if user_input[i:i+2] in ['. ', '! ', '? ']:
            sentences.append(user_input[previdx:i+2].capitalize())
            previdx = i + 2
    sentences.append(user_input[previdx:n].capitalize())
    return "".join(sentences)
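Since the question asks for a list of sentences that keep their end punctuation, here is a minimal import-free sketch of that variant. The names split_sentences and capitalize_first are hypothetical, and the code assumes every '.', '!' or '?' ends a sentence, so abbreviations and ellipses are not handled.

```python
def split_sentences(text):
    """Split `text` after each '.', '!' or '?', keeping the punctuation."""
    sentences = []
    start = 0
    for i, ch in enumerate(text):
        if ch in ".!?":
            sentence = text[start:i + 1].strip()
            if sentence:
                sentences.append(sentence)
            start = i + 1
    # Anything left after the last terminator is kept as a final sentence
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

def capitalize_first(sentence):
    # Unlike str.capitalize(), this leaves the rest of the sentence untouched
    return sentence[:1].upper() + sentence[1:]

parts = split_sentences("this is a test. this is a test? this is definitely a test!")
print(parts)
print([capitalize_first(s) for s in parts])
```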

I would split the string at each whitespace, then scan the list for words that contain one of the special characters. If one is present, the next word is capitalised. Join the list back at the end. Note that joining with single spaces collapses any run of consecutive whitespace in the original text.
def capitalise(text):
    words = text.split()
    new_words = []
    capitalise_next = True  # the first word always starts a sentence
    for word in words:
        new_words.append(word.capitalize() if capitalise_next else word)
        # A word containing '.', '!' or '?' ends the current sentence
        capitalise_next = any(ch in word for ch in ".!?")
    return " ".join(new_words)

If you can use the re module, which is part of Python's standard library, this is how you could do it:
import re
a = 'test this. and that, and maybe something else?even without space. or with multiple.\nor line breaks.'
print(re.sub(r'[.!?]\s*\w', lambda x: x.group(0).upper(), a))
Would lead to:
test this. And that, and maybe something else?Even without space. Or with multiple.\nOr line breaks.

Python function: Please help me in this one

These two functions are related to each other. Fortunately the first one is solved, but the second is a big mess: it should give me 17.5 but it only gives me 3. Why doesn't it work?
def split_on_separators(original, separators):
    """ (str, str) -> list of str
    Return a list of non-empty, non-blank strings from the original string
    determined by splitting the string on any of the separators.
    separators is a string of single-character separators.
    >>> split_on_separators("Hooray! Finally, we're done.", "!,")
    ['Hooray', ' Finally', " we're done."]
    """
    result = []
    newstring = ''
    for index,char in enumerate(original):
        if char in separators or index==len(original) -1:
            result.append(newstring)
            newstring=''
            if '' in result:
                result.remove('')
        else:
            newstring+=char
    return result
def average_sentence_length(text):
    """ (list of str) -> float
    Precondition: text contains at least one sentence. A sentence is defined
    as a non-empty string of non-terminating punctuation surrounded by
    terminating punctuation or beginning or end of file. Terminating
    punctuation is defined as !?.
    Return the average number of words per sentence in text.
    >>> text = ['The time has come, the Walrus said\n',
    'To talk of many things: of shoes - and ships - and sealing wax,\n',
    'Of cabbages; and kings.\n'
    'And why the sea is boiling hot;\n'
    'and whether pigs have wings.\n']
    >>> average_sentence_length(text)
    17.5
    """
    words=0
    Sentences=0
    for line in text:
        words+=1
    sentence=split_on_separators(text,'?!.')
    for sep in sentence:
        Sentences+=1
    ASL=words/Sentences
    return ASL

Words can be counted by splitting each sentence in the list on spaces and taking the length of the resulting list.
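A small sketch of that counting step, assuming the text has already been divided into sentences. The sample sentences and variable names are made up for illustration.

```python
sentences = ["The time has come", "And why the sea is boiling hot"]

# split() with no argument ignores repeated or leading whitespace
counts = [len(s.split()) for s in sentences]
average = sum(counts) / len(counts)

print(counts)
print(average)
```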

You can eliminate the need for your first function by using regular expressions to split on separators. The regular expression function is re.split(). Here is a cleaned up version that gets the right result:
import re
def average_sentence_length(text):
    # Join all the text into one string and remove all newline characters.
    # Joining all text into one string allows us to find the sentences much
    # easier, since multiple list items in 'text' could be one whole sentence
    text = "".join(text).replace('\n', '')
    # Use regex to split the sentences at delimiter characters !?.
    # Filter out any empty strings that result from this function,
    # otherwise they will count as words later on. list() is needed on
    # Python 3, where filter() returns an iterator that has no len()
    sentences = list(filter(None, re.split('[!?.]', text)))
    # Set the word sum variable
    wordsum = 0.0
    for s in sentences:
        # Split each sentence (s) into its separate words and add them
        # to the wordsum variable
        words = s.split(' ')
        wordsum += len(words)
    return wordsum / len(sentences)
data = ['The time has come, the Walrus said\n',
        ' To talk of many things: of shoes - and ships - and sealing wax,\n',
        'Of cabbages; and kings.\n'
        'And why the sea is boiling hot;\n'
        'and whether pigs have wings.\n']
print(average_sentence_length(data))
The one issue with this function is that with the text you provided, it returns 17.0 instead of 17.5. This is because there is no space in between "...the Walrus said" and "To talk of...". There is nothing that can be done there besides adding the space that should be there in the first place.
If the first function (split_on_separators) is required for the project, then you can replace the re.split() call with your function. Using regular expressions is a bit more reliable and a lot more lightweight than writing an entire function for it, however.
EDIT
I forgot to explain the filter() function. If you pass None as the first argument, it takes the second argument and removes all "false" items from it. Since an empty string is considered false in Python, it is removed.
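A short illustration of what filter(None, ...) does to the output of re.split. The sample text is made up. Note that a string containing only a space is still truthy, so it would not be removed.

```python
import re

text = "Hi there. How are you? Fine!"
parts = re.split('[!?.]', text)
# The trailing '!' leaves an empty string at the end of the list
print(parts)

# filter(None, ...) keeps only truthy items; list() materialises the result
sentences = list(filter(None, parts))
print(sentences)
print(len(sentences))
```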
