Need assistance with sentence analysis

Need assistance with sentence analysis - python

My code takes a sentence and looks for a given word in that sentence.
If the word is in the sentence, it needs to say that it has found the word and report the positions at which the word occurs.
If the word is not in the sentence it should display an error message.
I have this:
print("Please insert your sentence without punctuation")
sentence=(input())
variable1='sentence'
print("Which word would you like to find in your sentence?")
word=input()
variable2='word'
if 'word'=='COUNTRY':
    'variable3'==5
    'variable4'==17
if word in sentence:
    print([word], "is in positions", [variable3], "and", [variable4]);
else:
    print("Your word is not in the sentence!")

I want to deal with some misunderstandings in the presented code.
First,
print("Please insert your sentence without punctuation")
sentence=(input())
is simpler as
sentence = input("Please insert your sentence without punctuation")
Now I have a variable called sentence, which should not be confused with the string 'sentence'.
Similarly, we can say
word = input("Which word would you like to find in your sentence?")
which gives another variable, word, again not to be confused with the string 'word'.
Suppose for the sake of argument we have,
sentence = "Has this got an elephant in?"
and we search for the word 'elephant'
The posted code attempts to use in, but this will happen:
>>> "elephant" in sentence
True
>>> "ele" in sentence
True
>>> "giraffe" in sentence
False
>>>
Close. But not close enough. It is not looking for a whole word, since we found 'ele' in 'elephant'.
If you split the sentence into words, as suggested by the other answer, you can then search for whole words and find the position. (Look up split; you can choose other characters than the default ' ').
words = sentence.split()
word = 'ele'
words.index(word)
If the word isn't there you will get an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: 'ele' is not in list
I will leave the error handling to you.
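For reference, one way the error handling could look, as a minimal sketch. The helper name find_word is hypothetical (it is not in the question), and it assumes you want a 1-based word position, with None meaning "not found":

```python
def find_word(sentence, word):
    """Return the 1-based position of the first whole-word match, or None."""
    words = sentence.split()
    try:
        return words.index(word) + 1
    except ValueError:
        # list.index raises ValueError when the word is not in the list
        return None


sentence = "Has this got an elephant in"
for word in ("elephant", "ele"):
    position = find_word(sentence, word)
    if position is None:
        print("Your word is not in the sentence!")
    else:
        print(word, "is word number", position)
```

Because the search runs over the list of words rather than the raw string, 'ele' no longer matches inside 'elephant'.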

Python sequences provide the index method. It gives you the index of an element, or raises an error if the element is not in the sequence. On strings, it allows you to find substrings.
>>> 'hello world'.index('world')
6
>>> 'hello world'.index('word')
ValueError: substring not found
Basically, you have to add input for the sentence and the word to search. That's it.
print("Insert sentence without punctuation...")
sentence = input()  # get input, store it to name `sentence`
print("Insert word to find...")
word = input()
try:
    idx = sentence.index(word)
except ValueError:  # so it wasn't in the sentence after all...
    print('Word', word, 'not in sentence', repr(sentence))
else:  # if we get here, ValueError was NOT raised
    print('Word', word, 'first occurs at position', idx)
There are some caveats here; for example, searching for 'world' will also match inside 'fooworldbar'. The correct handling of such things depends on what precisely one wants. I'm guessing you actually want word positions.
If you need positions in the meaning of "the nth word", you must transform the sentence to a list of words. str.split does that. You can then work with index again. Also, if you want all positions, you must call index repeatedly.
print("Insert sentence without punctuation...")
sentence = input()  # get input, store it to name `sentence`
words = sentence.split()  # split at whitespace, creating a list of words
print("Insert word to find...")
word = input()
positions, idx = [], -1
while idx < len(words):
    try:
        idx = words.index(word, idx + 1)
    except ValueError:  # so it wasn't in the rest of the sentence after all...
        break
    else:  # if we get here, ValueError was NOT raised
        positions.append(idx)  # store the index to list of positions
# if we are here, we have searched through the whole list of words
if positions:  # positions is not an empty list, so we have found some
    print('Word', word, 'appears at positions', ', '.join(str(pos) for pos in positions))
else:
    print('Word', word, 'is not in the sentence')

You can use the re module:
import re
sentence = input('Sentence: ')
word = input('Word: ')
## build a whole-word pattern; re.escape protects regex metacharacters in the word.
regex_word = r'\b' + re.escape(word) + r'\b'
## collect the starting character offset of every match in the original sentence.
positions = [match.start() for match in re.finditer(regex_word, sentence)]
if positions:
    for pos in positions:
        print(word + ' is in position ' + str(pos))
else:
    print(word + ' is not present in this sentence')

Related

How to check generated strings against a text file

I'm trying to have the user input a string of characters with one asterisk. The asterisk indicates a character that can be subbed out for a vowel (a,e,i,o,u) in order to see what substitutions produce valid words.
Essentially, I want to take an input "l*g" and have it return "lag, leg, log, lug" because "lig" is not a valid English word. Below I have invalid words to be represented as "x".
I've gotten it to properly output each possible combination (e.g., including "lig"), but once I try to compare these words with the text file I'm referencing (for the list of valid words), it'll only return 5 lines of x's. I'm guessing it's that I'm improperly importing or reading the file?
Here's the link to the file I'm looking at so you can see the formatting:
https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/corpora/words.zip
Using the "en" file ~2.5MB
It's not in a dictionary layout i.e. no corresponding keys/values, just lines (maybe I could use the line number as the index, but I don't know how to do that). What can I change to check the test words to narrow down which are valid words based on the text file?
import os

with open(os.path.expanduser('~/Downloads/words/en')) as f:
    words = f.readlines()
inputted_word = input("Enter a word with ' * ' as the missing letter: ")
letters = []
for l in inputted_word:
    letters.append(l)
### find the index of the blank
asterisk = inputted_word.index('*') # also used a redundant int(), works fine
### sub in vowels
vowels = ['a','e','i','o','u']
list_of_new_words = []
for v in vowels:
    letters[asterisk] = v
    new_word = ''.join(letters)
    list_of_new_words.append(new_word)
for w in list_of_new_words:
    if w in words:
        print(new_word)
    else:
        print('x')
There are probably more efficient ways to do this, but I'm brand new to this. The last two for loops could probably be combined but debugging it was tougher that way.

print(list_of_new_words)
gives
['lag', 'leg', 'lig', 'log', 'lug']
So far, so good.
But this:
for w in list_of_new_words:
    if w in words:
        print(new_word)
    else:
        print('x')
prints new_word, which was defined in the previous for loop:
for v in vowels:
    letters[asterisk] = v
    new_word = ''.join(letters) # <----
    list_of_new_words.append(new_word)
So after the loop, new_word still holds the last value assigned to it: "lug" (if the script input was l*g).
You probably meant w instead?
for w in list_of_new_words:
    if w in words:
        print(w)
    else:
        print('x')
But it still prints 5 xs...
So that means that w in words is always False. How is that?
Looking at words :
print(words[0:10]) # the first 10 will suffice
['A\n', 'a\n', 'aa\n', 'aal\n', 'aalii\n', 'aam\n', 'Aani\n', 'aardvark\n', 'aardwolf\n', 'Aaron\n']
All the words read from the file end with a newline character (\n). You may not have been aware that this is what readlines does. So I recommend using:
words = f.read().splitlines()
instead.
With these 2 modifications (w and splitlines):
Enter a word with ' * ' as the missing letter: l*g
lag
leg
x
log
lug
🎉
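Putting both fixes together, the check could be sketched as below. This is only an illustration: the function name valid_substitutions is made up here, a short hard-coded list stands in for the lines of the "en" file, and a set is used (my choice, not something the question requires) so that each membership test is fast.

```python
def valid_substitutions(pattern, valid_words):
    """Replace the single '*' in pattern with each vowel; keep the real words."""
    candidates = [pattern.replace('*', vowel, 1) for vowel in 'aeiou']
    return [candidate for candidate in candidates if candidate in valid_words]


# Stand-in for f.readlines() on the word file; note the trailing newlines.
raw_lines = ['lag\n', 'leg\n', 'log\n', 'lug\n', 'aardvark\n']
# Strip the newlines and build a set of the words.
valid_words = {line.strip() for line in raw_lines}

print(valid_substitutions('l*g', valid_words))
```

With the real file you would build valid_words from f.read().splitlines() instead of raw_lines.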

How to extract first letter of every nth word in a sentence?

I was trying to extract the first letter of every 5th word, and after doing a bit of research I was able to figure out how to obtain every 5th word. But how do I now extract the first letter of every 5th word and put them together to make a word out of them? This is my progress so far:
def extract(text):
    for word in text.split()[::5]:
        print(word)
extract("I like to jump on trees when I am bored")

As the comment pointed out, split it and then just access the first character:
def extract(text):
    for word in text.split(" "):
        print(word[0])
text.split(" ") returns a list and we loop through that list. word is the current entry (a string) in that list. In Python you can index a string the same way you index a list, so word[0] returns the first character of that word, and word[-1] would return the last character. To keep only every 5th word, loop over text.split()[::5] as in the question.

I don't know how you solved the first part but could not solve the second one, but anyway: strings in Python are sequences of characters, so to access the first character you take index 0. Applied to your example, as the comment mentioned, that is word[0].
You can print word[0], or collect the first characters in a list for further operations (I believe that is what you want to do, not just print them):
def extract(text):
    mychars = []
    for word in text.split()[::5]:
        mychars.append(word[0])
    print(mychars)
extract("I like to jump on trees when I am bored")
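To go one step further and put the letters together into a single word, as the question asks, str.join can glue them. A small sketch; the name first_letters and the step parameter are just for illustration:

```python
def first_letters(text, step=5):
    """Join the first letter of every `step`-th word, starting with the first."""
    return ''.join(word[0] for word in text.split()[::step])


print(first_letters("I like to jump on trees when I am bored"))
```

For this sentence the selected words are "I" and "trees", so the result is "It".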

The below code might help you out. Just an example idea based on what you said.
#
# str text : A string of words, such as a sentence.
# int split : Take every nth word of the string
# int maxLen : Max number of chars extracted from beginning of each word
#
def extract(text, split, maxLen):
    newWord = ""
    # Every nth word
    for word in text.split()[::split]:
        if len(word) < maxLen:
            newWord += word[0:]  # Entire word (if the word is shorter than maxLen)
        else:
            newWord += word[:maxLen]  # Beginning of word, up to maxLen letters
    return (None if newWord == "" else newWord)
text = "The quick brown fox jumps over the lazy dog."
result = extract(text, split=5, maxLen=2)  # Use split=5, maxLen=1 to do what you said specifically
if (result):
    print(result)  # Expected output: "Thov"

Take word by word input and output a sentence

I am trying to write a program that inputs a sentence from the keyboard, word by word, into a list. The program should output the following.
The complete sentence, with the first word capitalized if it wasn't already, spaces between each word, and a period at the end.
The count of the number of words in the sentence.
For instance, if the input is:
the
cat
ran
home
quickly
Your program should output:
The cat ran home quickly.
There are 5 words in the sentence.
listMessage = []
message = input('Enter first word of your message: ')
while message != 'done!':
    listMessage.append(message)
    message = input('Please enter the next word of your message or type done! when complete ')
return listMessage

Given that you already have listMessage, you can simply:
' '.join(listMessage).capitalize() + '.'

def function():
    listMessage = []
    message = input('Enter first word of your message: ').strip()
    while message != 'done!':
        listMessage.append(message)
        message = input('Please enter the next word of your message or type done! when complete ')
    text = ' '.join(listMessage).capitalize()+'.'
    return text

You've got a couple things you might want to check here, according to your problem description.
If you want spaces between each word, you'll likely want to check to make sure that the words themselves, when entered, don't have leading or trailing spaces already. Use .strip() on your input to ensure this is the case.
If you want to capitalize the first letter of your sentence, you can check listMessage[0][0].isupper(). This tests whether the first letter of the first word is already upper-case.
If you'd like to add spaces to each string when you concatenate it, you can try a for loop over the list:
finalStr = ""
for word in listMessage:
    finalStr += (word + " ")
(This will leave a space at the end, remember to .strip() it.)
Put it all together, and you've got your code.
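As a rough sketch of how those pieces might fit together (the function name format_sentence is hypothetical, and the word list is hard-coded here instead of being read with input()):

```python
def format_sentence(words):
    """Build a sentence from a list of words and count the words."""
    # Drop stray whitespace and skip entries that are empty after stripping.
    cleaned = [word.strip() for word in words if word.strip()]
    if not cleaned:
        return "", 0
    text = ' '.join(cleaned)
    # Upper-case only the first character; str.capitalize() would also
    # lower-case the rest of the string (for example proper nouns).
    text = text[0].upper() + text[1:] + '.'
    return text, len(cleaned)


sentence, count = format_sentence(['the', 'cat', 'ran', 'home', 'quickly'])
print(sentence)
print('There are', count, 'words in the sentence.')
```

Upper-casing just the first character is a deliberate choice here: ' '.join(listMessage).capitalize() is shorter, but it changes the case of every other letter as well.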

You can try this:
word = ""
sentence = ""
while True:
    word = input("Enter a word: ")
    if word == 'done!':
        break
    sentence = sentence + word + " "
print(sentence)

Find and select a required word from a sentence?

I am trying to get raw_input from user and then find a required word from that input. If the required word is there, then a function runs. So I tried .split to split the input but how do I find if the required word is in the list.

It's really simple to get this done. Python has an in operator that does exactly what you need. You can see if a word is present in a string and then do whatever else you'd like to do.
sentence = 'hello world'
required_word = 'hello'
if required_word in sentence:
    # do whatever you'd like
    pass
You can see some basic examples of the in operator in action here.
Depending on the complexity of your input or lack of complexity of your required word, you may run into some problems. To deal with that you may want to be a little more specific with your required word.
Let's take this for example:
sentence = 'i am harrison'
required_word = 'is'
This example will evaluate to True if you were to do if required_word in sentence: because technically the letters is are a substring of the word "harrison".
To fix that you would just simply do this:
sentence = 'i am harrison'
required_word = ' is '
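By putting a space before and after the word, it will specifically look for occurrences of the required word as a separate word, and not as a part of a word. Note that this misses the word at the very start or end of the sentence, where there is no surrounding space.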
HOWEVER, if you are okay with matching substrings as well as word occurrences then you can ignore what I previously explained.
If there's a group of words and if any of them is the required one, then what should I do? Like, the required word is either "yes" or "yeah". And the input by user contains "yes" or "yeah".
As per this question, an implementation would look like this:
sentence = 'yes i like to code in python'
required_words = ['yes', 'yeah']
# add spaces before and after each word if you don't
# want to accidentally run into a chance where either word
# is a substring of one of the words in sentence
if any(word in sentence for word in required_words):
    # do whatever you'd like
    pass
This makes use of the built-in any function. The if statement will evaluate to true as long as at least one of the words in required_words is found in sentence.
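If you want whole-word matches without padding the required words with spaces, one alternative is to split the sentence first and test against the resulting words. This is a sketch of that variation rather than the approach above; the name contains_any_word is made up, and lower-casing the input is an assumption about what you want:

```python
def contains_any_word(sentence, required_words):
    """Return True if at least one required word appears as a whole word."""
    words_in_sentence = set(sentence.lower().split())
    return any(word in words_in_sentence for word in required_words)


print(contains_any_word('yes i like to code in python', ['yes', 'yeah']))
print(contains_any_word('i am harrison', ['is']))
```

The first call finds 'yes' even though it is the first word of the sentence, and the second does not match 'is' inside 'harrison'.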

Harrison's way is one way. Here are other ways:
Way 1:
sentence = raw_input("enter input:")
words = sentence.split(' ')
desired_word = 'test'
if desired_word in words:
    # do required operations
    pass
Way 2:
import re
sentence = raw_input("enter input:")
desired_word = 'test'
# (^|\s) and (\s|$) also match a word at the very start or end of the input
if re.search(r'(^|\s)' + re.escape(desired_word) + r'(\s|$)', sentence.strip()):
    # do required operations
    pass
Way 3 (especially if there is punctuation at the end of the word):
import re
sentence = raw_input("enter input:")
desired_word = 'test'
if re.search(r'(^|\s)' + re.escape(desired_word) + r'([\s,:;]|$)', sentence.strip()):
    # do required operations
    pass

How do I find the word position of a word that appears more than once in a string using a for loop?

loop = True
while loop yes no maybe
sentence is : ", myList.index(wordchosen) + 1)
sorry i had to do this :(

Your question was very confusing because it contained only code which wasn't styled as code. You should always include some plain English description of the problem. I tried to guess :)
Your code has some indentation issues, it doesn't run as such.
You use the input() function, which fails when a sentence is entered:
write a sentence
tre sdf fre dfg
Traceback (most recent call last):
File "bin/tst.py", line 14, in <module>
sentence = input()
File "<string>", line 1
tre sdf fre dfg
^
SyntaxError: invalid syntax
From the input() documentation:
Consider using the raw_input() function for general input from
users.
After switching from input() to raw_input() another issue pops up: the code doesn't appear to work. That's because you're using the is identity operator to check equality, which won't always produce the result you're expecting. See: Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?
After switching is to == the code works as you probably intended it:
write a sentence
this is the input sentence
this is the input sentence
choose a word from the sentence that you would like to find the word postion of
sentence
('word position for your sentence is : ', 5)
write a sentence
Side note: you don't need to use parentheses around variables: (wordchosen).
There are more efficient/elegant/pythonic ways of checking if wordchosen is in myList, see the other answers.

I think this is what you want
while True:
    sentence = input('Enter a sentence:\n')
    sentence = sentence.lower().split()
    print(sentence)
    word = input('Enter a word you want to find in the sentence:\n').lower()
    print([counter for counter, ele in enumerate(sentence) if word == ele])
    decision = input('Continue? (yes/no) ').lower()
    if decision == 'yes':
        continue
    else:
        break
This prints a list of all the indexes of the word you're looking for in the sentence.
sentence.index() will not work because it returns only the position of the first matching element.
To get 1-based positions, as a reader would count the words, just change the print to
print([counter + 1 for counter, ele in enumerate(sentence) if word == ele])

I'm not quite sure what you are asking, but this pseudocode should be somewhat like what you (likely) want to do.
Try something such as this:
get the user's string and the word they want to search for
parse the sentence into an array (split it at spaces)
loop through the array and increase a count each time the word appears
save the positions where the word appears into another array that you can later use or return
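The steps above could be sketched in Python roughly as follows. The function name find_positions is made up, positions are counted from 1, and the comparison is case-insensitive; both of those are assumptions about what you want:

```python
def find_positions(sentence, word):
    """Return the 1-based positions of every occurrence of word in sentence."""
    words = sentence.lower().split()  # parse the sentence into a list of words
    target = word.lower()
    positions = []
    for index, candidate in enumerate(words, start=1):
        if candidate == target:
            positions.append(index)  # remember where the word appeared
    return positions


positions = find_positions("the cat saw the other cat", "cat")
print("count:", len(positions), "positions:", positions)
```

The count from the pseudocode is simply the length of the returned list, so no separate counter is needed.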
