Find and select a required word from a sentence?

I am trying to get raw_input from user and then find a required word from that input. If the required word is there, then a function runs. So I tried .split to split the input but how do I find if the required word is in the list.

It's really simple to get this done. Python has an in operator that does exactly what you need. You can see if a word is present in a string and then do whatever else you'd like to do.
sentence = 'hello world'
required_word = 'hello'
if required_word in sentence:
    pass  # do whatever you'd like
You can see some basic examples of the in operator in action here.
Depending on the complexity of your input or lack of complexity of your required word, you may run into some problems. To deal with that you may want to be a little more specific with your required word.
Let's take this for example:
sentence = 'i am harrison'
required_word = 'is'
This example evaluates to True for if required_word in sentence: because the letters "is" are a substring of the word "harrison".
To fix that you would just simply do this:
sentence = 'i am harrison'
required_word = ' is '
By putting the empty space before and after the word it will specifically look for occurrences of the required word as a separate word, and not as a part of a word.
HOWEVER, if you are okay with matching substrings as well as word occurrences then you can ignore what I previously explained.
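Padding the word with spaces misses it at the very start or end of the sentence, or when it is followed by punctuation. A minimal sketch of one alternative, assuming the standard re module is acceptable; contains_word is a hypothetical helper name, not something from the answer above:

```python
import re

def contains_word(sentence, required_word):
    """Return True if required_word occurs in sentence as a whole word."""
    # re.escape keeps any regex metacharacters in the word literal;
    # \b matches at a word boundary (string start/end, space, punctuation).
    pattern = r'\b' + re.escape(required_word) + r'\b'
    return re.search(pattern, sentence) is not None

print(contains_word('i am harrison', 'is'))  # False
print(contains_word('this is it', 'is'))     # True
print(contains_word('is it', 'is'))          # True
```

Note that \b only behaves as expected when the required word starts and ends with a letter, digit or underscore.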
If there's a group of words and if any of them is the required one, then what should I do? Like, the required word is either "yes" or "yeah". And the input by user contains "yes" or "yeah".
As per this question, an implementation would look like this:
sentence = 'yes i like to code in python'
required_words = ['yes', 'yeah']
# add spaces before and after each word if you don't
# want to accidentally run into a case where either word
# is a substring of one of the words in sentence
if any(word in sentence for word in required_words):
    pass  # do whatever you'd like
This makes use of the built-in any() function. The if statement will evaluate to True as long as at least one of the words in required_words is found in sentence.
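If substring matches are a concern, one option is to compare against the set of whole words instead of the raw string. This is a sketch under the assumption that words are separated by whitespace and carry no attached punctuation; has_any_word is a hypothetical name used only for illustration:

```python
def has_any_word(sentence, required_words):
    """Return True if at least one required word is a whole word in sentence."""
    # Split on whitespace and lowercase so 'Yes' and 'yes' compare equal.
    words = set(sentence.lower().split())
    # isdisjoint is True when the two collections share no element.
    return not words.isdisjoint(required_words)

print(has_any_word('yes i like to code in python', ['yes', 'yeah']))  # True
print(has_any_word('eyes on the prize', ['yes', 'yeah']))             # False
```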

Harrison's way is one way. Here are other ways:
Way 1:
sentence = raw_input("enter input:")
words = sentence.split(' ')
desired_word = 'test'
if desired_word in words:
    pass  # do required operations
Way 2:
import re
sentence = raw_input("enter input:")
desired_word = 'test'
# (?:^|\s) and (?:\s|$) also accept the word at the start or end of the input
if re.search(r'(?:^|\s)' + re.escape(desired_word) + r'(?:\s|$)', sentence.strip()):
    pass  # do required operations
Way 3 (especially if there is punctuation at the end of the word):
import re
sentence = raw_input("enter input:")
desired_word = 'test'
if re.search(r'(?:^|\s)' + re.escape(desired_word) + r'(?:[\s,:;]|$)', sentence.strip()):
    pass  # do required operations

Related

How do I match a string with any word in the middle, and store it as a variable?

I'm trying to make a bot that defines a word when I ask "what does" + any word + "mean".
I don't know how to match an input with a string that has any word in the middle, and then store that word as a variable.
It should be something like this:
whatDoesWordMean = ["what does mean"]
sentance = input()
if any(x in sentance for x in whatDoesWordMean):
    #Stores word as a variable
    #DefineWord()
Right now, it only accepts the input "what does mean"

I would suggest using regular expressions.
For example:
import re
expressions = [r"what does (\w+) mean", r"meaning of (\w+)", r"what is (\w+)"]
patterns = [re.compile(expr, re.IGNORECASE) for expr in expressions]
while True:
    sentence = input("chat: ")
    pattern = next((p for p in patterns if p.match(sentence)), None)
    if pattern:
        word = pattern.match(sentence).group(1)
        print(word, "means...")
output:
chat: what is good
good means...
chat: what does IHMO mean
IHMO means...
chat: hello world
chat: meaning of Life
Life means...
chat:
If you don't want to use regular expressions, you can work with a prefix-suffix matching approach using string methods startswith() and endswith():
expressions = ["what does | mean", "meaning of |", "what is |"]
patterns = [(prefix, suffix) for e in expressions
            for prefix, suffix in [e.lower().split("|")]]
while True:
    sentence = input("chat: ")
    matches = [sentence[len(p):-len(s) or None] for p, s in patterns
               if sentence.lower().startswith(p)
               and sentence.lower().endswith(s)]
    if matches:
        word = matches[0]
        print(word, "means...")

A quick solution to find not only words, but phrases also would be this:
import re
msg = 'what does SO mean'
wordRegex = re.compile(r'what does (.*) mean')
print(wordRegex.findall(msg))
Essentially you are looking for everything that is between what does and mean. By using regular expressions it is really simple.
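As a small variation on the snippet above, the match can be wrapped in a function that returns the captured text or None. This is only a sketch: extract_term is a hypothetical name, and the non-greedy (.+?) and the IGNORECASE flag are assumptions layered on top of the original pattern.

```python
import re

# Non-greedy capture of whatever sits between 'what does' and 'mean'.
WORD_REGEX = re.compile(r'what does (.+?) mean', re.IGNORECASE)

def extract_term(msg):
    """Return the word or phrase being asked about, or None if msg does not fit."""
    match = WORD_REGEX.search(msg)
    return match.group(1) if match else None

print(extract_term('what does SO mean'))           # SO
print(extract_term('What does carpe diem mean?'))  # carpe diem
print(extract_term('hello world'))                 # None
```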

If I'm understanding your bot correctly, it will define whatever is between 'what does' and 'mean'. One solution I can think of is:
whatDoesWordMean = ['what','does','mean']
word_to_define = input()
split_input = word_to_define.split(' ')
definition = None
for word in split_input:
    if word not in whatDoesWordMean:
        definition = DefineWord(word)
        break
Of course this assumes you only want to define one word. Otherwise you could append the definitions to a list or do the following:
whatDoesWordMean = ['what','does','mean']
word_to_define = input()
split_input = word_to_define.split(' ')
to_define = []
for word in split_input:
    if word not in whatDoesWordMean:
        to_define.append(word)
definition = DefineWord(' '.join(to_define))  # keep spaces between the words

We can do something like this in a quick glance:
whatDoesWordMean = ["what", "does", "mean"]
sentence = input()
word_list = [x for x in sentence.split(" ") if x not in whatDoesWordMean]
This gives you a list of the words in the input other than 'what', 'does' and 'mean', stored in the variable word_list. You can join multiple words using Python's built-in join() method.
word = " ".join(word_list)
But I would suggest using the regular expression solutions. They are much more efficient.

split strings with multiple special characters into lists without importing anything in python

I need to make a program that will capitalize the first word in a sentence, and I want to be sure that all the special characters used to end a sentence are handled.
I cannot import anything! This is for a class and I just want some examples of how to do this.
I have tried to use if to check whether the string contains the matching character and then do the correct split operation...
This is the function I have now. I know it's not good at all, as it just returns the original string...
def getSplit(userString):
    userStringList = []
    if "? " in userString:
        userStringList=userString.split("? ")
    elif "! " in userStringList:
        userStringList = userString.split("! ")
    elif ". " in userStringList:
        userStringList = userString.split(". ")
    else:
        userStringList = userString
    return userStringList
I want to be able to input something like this is a test. this is a test? this is definitely a test!
and get ['this is a test.', 'this is a test?', 'this is definitely a test!']
and then send the list of sentences to another function that capitalizes the first letter of each sentence.
This is an old homework assignment in which I could only use one special character to separate the string into a list, but I want the user to be able to enter more than one kind of sentence...

This may help. Use str.replace to replace the special characters with spaces and then use str.split.
Ex:
def getSplit(userString):
    return userString.replace("!", " ").replace("?", " ").replace(".", " ").split()
# capitalize() must be called; a list comprehension also prints cleanly in Python 3
print([word.capitalize() for word in getSplit("sdfsdf! sdfsdfdf? sdfsfdsf.sdfsdfsd!fdfgdfg?dsfdsfgf")])

Normally, you could use re.split(), but since you cannot import anything, the best option would be just to do a for loop. Here it is:
def getSplit(user_input):
    n = len(user_input)
    sentences = []
    previdx = 0
    for i in range(n - 1):
        if user_input[i:i+2] in ['. ', '! ', '? ']:
            sentences.append(user_input[previdx:i+2].capitalize())
            previdx = i + 2
    sentences.append(user_input[previdx:n].capitalize())
    return "".join(sentences)
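The question asks for a list of sentences that keep their terminators, which a second function then capitalizes. A minimal sketch of that shape without any imports follows; split_sentences and capitalize_first are hypothetical names, and the sketch assumes every '.', '!' or '?' ends a sentence (so "3.14" or "..." would be split too).

```python
def split_sentences(text):
    """Split text into sentences, keeping the terminating character."""
    sentences = []
    start = 0
    for i, ch in enumerate(text):
        if ch in '.!?':
            sentence = text[start:i + 1].strip()
            if sentence:
                sentences.append(sentence)
            start = i + 1
    # Keep any trailing text that has no terminator.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

def capitalize_first(sentence):
    """Upper-case only the first character, leaving the rest untouched."""
    return sentence[:1].upper() + sentence[1:]

parts = split_sentences('this is a test. this is a test? this is definitely a test!')
print(parts)
print([capitalize_first(s) for s in parts])
```

Unlike str.capitalize(), capitalize_first does not lowercase the rest of the sentence, so names and acronyms survive.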

I would split the string at each white space. Then scan the list for words that contain the special character. If any is present, the next word is capitalised. Join the list back at the end. Of course, this assumes that there are no more than two consecutive spaces between words.
def capitalise(text):
    words = text.split()
    new_words = []
    capitalise_next = True  # the first word always starts a sentence
    for word in words:
        new_words.append(word.capitalize() if capitalise_next else word)
        # a word containing ., ! or ? ends a sentence, so capitalise the next one
        capitalise_next = any(mark in word for mark in ".!?")
    return " ".join(new_words)

If you can use the re module which is available by default in python, this is how you could do it:
import re
a = 'test this. and that, and maybe something else?even without space. or with multiple.\nor line breaks.'
print(re.sub(r'[.!?]\s*\w', lambda x: x.group(0).upper(), a))
Would lead to:
test this. And that, and maybe something else?Even without space. Or with multiple.\nOr line breaks.

Python 3 - How to capitalize first letter of every sentence when translating from morse code

I am trying to translate morse code into words and sentences and it all works fine... except for one thing. My entire output is lowercased and I want to be able to capitalize every first letter of every sentence.
This is my current code:
text = input()
if is_morse(text):
    lst = text.split(" ")
    text = ""
    for e in lst:
        text += TO_TEXT[e].lower()
    print(text)
Each element in the split list is equal to a character (but in morse) NOT a WORD. 'TO_TEXT' is a dictionary. Does anyone have an easy solution to this? I am a beginner in programming and Python btw, so I might not understand some solutions...

Maintain a flag telling you whether or not this is the first letter of a new sentence. Use that to decide whether the letter should be upper-case.
text = input()
if is_morse(text):
    lst = text.split(" ")
    text = ""
    first_letter = True
    for e in lst:
        if first_letter:
            this_letter = TO_TEXT[e].upper()
        else:
            this_letter = TO_TEXT[e].lower()
        # Period heralds a new sentence.
        first_letter = this_letter == "."
        text += this_letter
    print(text)

From what I can understand of your code, you can use Python's str.title() method.
For a stricter result, you can use the capwords() function from the string module. Note that both capitalize the first letter of every word, not only of every sentence.
This is what you get from Python docs on capwords:
Split the argument into words using str.split(), capitalize each word using str.capitalize(), and join the capitalized words using str.join(). If the optional second argument sep is absent or None, runs of whitespace characters are replaced by a single space and leading and trailing whitespace are removed, otherwise sep is used to split and join the words.
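To make the difference between the two concrete, here is a short comparison on an arbitrary sample string chosen for illustration. It shows that both capitalize each word rather than each sentence, and that title() also capitalizes the letter after an apostrophe.

```python
import string

text = "hello world. it's a test"

# str.title() treats every run of letters as a word, so the 's' after
# the apostrophe is capitalized as well.
print(text.title())           # Hello World. It'S A Test

# string.capwords() splits on whitespace only, so "it's" stays one word.
print(string.capwords(text))  # Hello World. It's A Test
```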

Need assistance with sentence analysis

My code takes sentence and finds a given a word in that sentence.
If the word is in the sentence it needs to say that it has found the word and what positions said word is in.
If the word is not in the sentence it should display an error message.
I have this:
print("Please insert your sentence without punctuation")
sentence=(input())
variable1='sentence'
print("Which word would you like to find in your sentence?")
word=input()
variable2='word'
if 'word'=='COUNTRY':
    'variable3'==5
    'variable4'==17
if word in sentence:
    print([word], "is in positions", [variable3], "and", [variable4]);
else:
    print("Your word is not in the sentence!")

I want to deal with some misunderstandings in the presented code.
First,
print("Please insert your sentence without punctuation")
sentence=(input())
is simpler as
sentence = input("Please insert your sentence without punctuation")
Now I have a variable called sentence, which should not be muddled with the string 'sentence'.
Similarly we can say
word = input("Which word would you like to find in your sentence?")
which gives another variable, word, again not to be muddled with the string 'word'.
Suppose for the sake of argument we have,
sentence = "Has this got an elephant in?"
and we search for the word 'elephant'
The posted code attempts to use in, but this will happen:
>>> "elephant" in sentence
True
>>> "ele" in sentence
True
>>> "giraffe" in sentence
False
>>>
Close. But not close enough. It is not looking for a whole word, since we found 'ele' in 'elephant'.
If you split the sentence into words, as suggested by the other answer, you can then search for whole words and find the position. (Look up split; you can choose other characters than the default ' ').
words = sentence.split()
word = 'ele'
words.index(word)
If the word isn't there you will get an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: 'ele' is not in list
I will leave the error handling to you.

Python sequences provide the index method. It gives you the index of an element, or raises an error if the element is not in the sequence. On strings, it allows you to find substrings.
>>> 'hello world'.index('world')
6
>>> 'hello world'.index('word')
ValueError: substring not found
Basically, you have to add input for the sentence and the word to search. That's it.
print("Insert sentence without punctuation...")
sentence=input() # get input, store it to name `sentence`
print("Insert word to find...")
word=input()
try:
    idx = sentence.index(word)
except ValueError: # so it wasn't in the sentence after all...
    print('Word', word, 'not in sentence', repr(sentence))
else: # if we get here, ValueError was NOT raised
    print('Word', word, 'first occurs at position', idx)
There are some caveats here, for example 'fooworldbar' will match as well. The correct handling of such things depends on what precisely one wants. I'm guessing you actually want word positions.
If you need positions in the meaning of "the nth word", you must transform the sentence to a list of words. str.split does that. You can then work with index again. Also, if you want all positions, you must call index repeatedly.
print("Insert sentence without punctuation...")
sentence = input() # get input, store it to name `sentence`
words = sentence.split() # split at whitespace, creating a list of words
print("Insert word to find...")
word=input()
positions, idx = [], -1
while idx < len(words):
    try:
        idx = words.index(word, idx+1)
    except ValueError: # so it wasn't in the rest of the sentence after all...
        break
    else: # if we get here, ValueError was NOT raised
        positions.append(idx) # store the index to list of positions
# if we are here, we have searched through the whole list of words
if positions: # positions is not an empty list, so we have found some
    print('Word', word, 'appears at positions', ', '.join(str(pos) for pos in positions))
else:
    print('Word', word, 'is not in the sentence')
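The same word positions can also be collected in a single pass with enumerate, which avoids the repeated index calls. This is a sketch rather than a replacement for the code above; word_positions is a hypothetical name, and positions are zero-based as in the loop version.

```python
def word_positions(sentence, word):
    """Return the zero-based positions at which word occurs as a whole word."""
    return [i for i, w in enumerate(sentence.split()) if w == word]

positions = word_positions('the cat and the hat', 'the')
if positions:
    print('Word appears at positions', ', '.join(str(p) for p in positions))
else:
    print('Word is not in the sentence')
```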

You can use re module:
import re
sentence = input('Sentence: ')
word = input('Word: ')
## convert word to a regular expression for the search.
## re.escape keeps special characters in the word literal.
regex_word = r'(' + re.escape(word) + r')(?=\s|$)'
## collect every match; finditer reports positions in the original sentence.
matches = list(re.finditer(regex_word, sentence))
if matches:
    for match in matches:
        print(word + ' is in position ' + str(match.start()))
else:
    print(word + ' is not present in this sentence')

How do I calculate the number of times a word occurs in a sentence?

So I've been learning Python for some months now and was wondering how I would go about writing a function that will count the number of times a word occurs in a sentence. I would appreciate if someone could please give me a step-by-step method for doing this.

Quick answer:
def count_occurrences(word, sentence):
return sentence.lower().split().count(word)
'some string'.split() will split the string on whitespace (spaces, tabs and linefeeds) into a list of word-ish things. Then ['some', 'string'].count(item) returns the number of times item occurs in the list.
That doesn't handle removing punctuation. You could do that using string.maketrans and str.translate.
# Python 2 only: string.lowercase and string.maketrans do not exist in Python 3
# Make collection of chars to keep (don't translate them)
import string
keep = string.lowercase + string.digits + string.whitespace
table = string.maketrans(keep, keep)
delete = ''.join(set(string.printable) - set(keep))
def count_occurrences(word, sentence):
    return sentence.lower().translate(table, delete).split().count(word)
The key here is that we've constructed the string delete so that it contains all the ascii characters except letters, numbers and spaces. Then str.translate in this case takes a translation table that doesn't change the string, but also a string of chars to strip out.
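The snippet above relies on Python 2 string functions. A rough Python 3 equivalent, as a sketch, uses the three-argument form of str.maketrans, whose third argument lists characters to delete. DELETE_PUNCTUATION is a name introduced here for illustration, and only ASCII punctuation is stripped.

```python
import string

# The third argument of str.maketrans maps each listed character to None,
# so translate() removes it from the string.
DELETE_PUNCTUATION = str.maketrans('', '', string.punctuation)

def count_occurrences(word, sentence):
    """Count whole-word occurrences of word, ignoring case and punctuation."""
    cleaned = sentence.lower().translate(DELETE_PUNCTUATION)
    return cleaned.split().count(word)

print(count_occurrences('sentence', 'This sentence is a simple sentence.'))  # 2
```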

wilberforce has the quick, correct answer, and I'll give the long winded 'how to get to that conclusion' answer.
First, here are some tools to get you started, and some questions you need to ask yourself.
You need to read the section on Sequence Types, in the python docs, because it is your best friend for solving this problem. Seriously, read it. Once you have read that, you should have some ideas. For example you can take a long string and break it up using the split() function. To be explicit:
mystring = "This sentence is a simple sentence."
result = mystring.split()
print result
print "The total number of words is: " + str(len(result))
print "The word 'sentence' occurs: " + str(result.count("sentence"))
Takes the input string and splits it on any whitespace, and will give you:
['This', 'sentence', 'is', 'a', 'simple', 'sentence.']
The total number of words is: 6
The word 'sentence' occurs: 1
Now note here that you do have the period still at the end of the second 'sentence'. This is a problem because 'sentence' is not the same as 'sentence.'. If you are going to go over your list and count words, you need to make sure that the strings are identical. You may need to find and remove some punctuation.
A naive approach to this might be:
no_period_string = mystring.replace(".", " ")
print no_period_string
To get me a period-less sentence:
"This sentence is a simple sentence"
You also need to decide if your input is going to be just a single sentence, or maybe a paragraph of text. If you have many sentences in your input, you might want to find a way to break them up into individual sentences, and find the periods (or question marks, or exclamation marks, or other punctuation that ends a sentence). Once you find out where in the string the 'sentence terminator' is, you could split up the string at that point, or something like that.
You should give this a try yourself - hopefully I've peppered in enough hints to get you to look at some specific functions in the documentation.

Simplest way:
def count_occurrences(word, sentence):
return sentence.count(word)

text=input("Enter your sentence:")
print("'the' appears", text.count("the"),"times")
simplest way to do it

The problem with the count() method is that it does not count overlapping occurrences, for example
print('banana'.count('ana'))
output
1
but 'ana' occurs twice in 'banana'
To solve this issue, I used
def total_occurrence(string,word):
    count = 0
    tempsting = string
    while(word in tempsting):
        count +=1
        tempsting = tempsting[tempsting.index(word)+1:]
    return count
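The same overlapping count can be done without building a new string on every step by passing a start offset to str.find. This is a sketch of that variation; count_overlapping is a hypothetical name, and returning 0 for an empty search string is a choice made here, not something the answer specifies.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, including overlapping ones."""
    if not sub:
        return 0  # an empty substring would otherwise match everywhere
    count = 0
    start = text.find(sub)
    while start != -1:
        count += 1
        # Resume one character after the last hit so overlaps are seen.
        start = text.find(sub, start + 1)
    return count

print(count_overlapping('banana', 'ana'))  # 2
print('banana'.count('ana'))               # 1
```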

You can do it like this:
def countWord(word):
    numWord = 0
    for i in range(1, len(word)-1):
        if word[i-1:i+3] == 'word':
            numWord += 1
    print 'Number of times "word" occurs is:', numWord
then calling the function:
countWord('wordetcetcetcetcetcetcetcword')
will print: Number of times "word" occurs is: 2

def check_Search_WordCount(mySearchStr, mySentence):
    len_mySentence = len(mySentence)
    len_Sentence_without_Find_Word = len(mySentence.replace(mySearchStr,""))
    len_Remaining_Sentence = len_mySentence - len_Sentence_without_Find_Word
    count = len_Remaining_Sentence/len(mySearchStr)
    return (int(count))

I assume that you just know about python string and for loop.
def count_occurences(s,word):
    count = 0
    for i in range(len(s)):
        if s[i:i+len(word)] == word:
            count += 1
    return count
mystring = "This sentence is a simple sentence."
myword = "sentence"
print(count_occurences(mystring,myword))
explanation:
s[i:i+len(word)]: slicing the string s to extract a word having the same length with the word (argument)
count += 1 : increase the counter whenever matched.
