I'm new to programming in Python, and I've been attempting a challenge for a few days now, but I can't figure out what is wrong with my code. My code takes a text file and tells me how many sentences, words, and syllables are in the text. Everything runs fine, except that my code counts a syllable containing consecutive vowels as multiple syllables, and I can't figure out how to fix it. Any help at all would be appreciated.
For example, if the file contains this:
"Or to take arms against a sea of troubles, And by opposing end them? To die: to sleep."
It should report that the text has 21 syllables, but the program tells me it has 26 because it counts the consecutive vowels more than once.
fileName = input("Enter the file name: ")
inputFile = open(fileName, 'r')
text = inputFile.read()
# Count the sentences
sentences = text.count('.') + text.count('?') + \
            text.count(':') + text.count(';') + \
            text.count('!')
# Count the words
words = len(text.split())
# Count the syllables
syllables = 0
vowels = "aeiouAEIOU"
for word in text.split():
    for vowel in vowels:
        syllables += word.count(vowel)
    for ending in ['es', 'ed', 'e']:
        if word.endswith(ending):
            syllables -= 1
    if word.endswith('le'):
        syllables += 1
# Compute the Flesch Index and Grade Level
index = 206.835 - 1.015 * (words / sentences) - \
        84.6 * (syllables / words)
level = int(round(0.39 * (words / sentences) + 11.8 * \
                  (syllables / words) - 15.59))
# Output the results
print("The Flesch Index is", index)
print("The Grade Level Equivalent is", level)
print(sentences, "sentences")
print(words, "words")
print(syllables, "syllables")
Instead of counting the number of occurrences of each vowel for each word, we can iterate through the characters of the word, and only count a vowel if it isn't preceded by another vowel:
# Count the syllables
syllables = 0
vowels = "aeiou"
for word in (x.lower() for x in text.split()):
    syllables += word[0] in vowels
    for i in range(1, len(word)):
        syllables += word[i] in vowels and word[i - 1] not in vowels
    for ending in {'es', 'ed', 'e'}:
        if word.endswith(ending):
            syllables -= 1
    if word.endswith('le'):
        syllables += 1
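To check which words this heuristic handles well, the same logic can be wrapped in a function and tried on single words. This is only a sketch: the name count_syllables and the sample words are made up for illustration, and the ending adjustments are copied unchanged from the question's code.

```python
VOWELS = "aeiou"

def count_syllables(word):
    """Estimate syllables in one word: count vowel groups, then adjust for endings."""
    word = word.lower()
    if not word:
        return 0
    # A vowel starts a new syllable only if the previous character is not a vowel.
    count = int(word[0] in VOWELS)
    for i in range(1, len(word)):
        count += word[i] in VOWELS and word[i - 1] not in VOWELS
    # Same ending adjustments as in the question's code.
    for ending in ('es', 'ed', 'e'):
        if word.endswith(ending):
            count -= 1
    if word.endswith('le'):
        count += 1
    return count

for w in ["sea", "sleep", "against", "take"]:
    print(w, count_syllables(w))  # sea 1, sleep 1, against 2, take 1
```

Note that punctuation attached to a word (for example "troubles,") hides its ending from endswith(), so stripping punctuation first may change the totals.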
Related
Given a string consisting of words separated by spaces (one or more).
Find the average length of all words.
Average word length = total number of characters in words (excluding spaces) divided by the number of words.
My attempt is below, but the output is incorrect. Can you help me?
sentence = input("sentence: ")
words = sentence.split()
total_number_of_characters = 0
number_of_words = 0
for word in words:
    total_number_of_characters += len(sentence)
    number_of_words += len(words)
average_word_length = total_number_of_characters / number_of_words
print(average_word_length)
When you're stuck, one nice trick is to use very verbose variable names that match the task description as closely as possible, for example:
words = sentence.split()
total_number_of_characters = 0
number_of_words = 0
for word in words:
    total_number_of_characters += WHAT?
    number_of_words += WHAT?
average_word_length = total_number_of_characters / number_of_words
Can you do the rest?
I think maybe it should be
for char in word:
Rather than
for char in words:
You may use the mean() function to calculate the average.
>>> from statistics import mean
>>> sentence = 'The quick brown fox jumps over the lazy dog'
>>> mean(len(word) for word in sentence.split())
3.888888888888889
The statistics library was introduced with Python 3.4.
https://docs.python.org/3/library/statistics.html#statistics.mean
There is a simpler way to solve this problem. You can get the number of words with len(words), and the number of letters by taking the original sentence and removing all spaces from it (check the replace() method).
Now it's your turn to piece this information together!
Edit: Here's an example:
sentence = input("Sentence: ")
words = len(sentence.split())
chars = len(sentence.replace(" ", ""))
print(chars / words)
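One caveat: replace(" ", "") removes only space characters, so tabs or newlines would still be counted, and an empty input divides by zero. A minimal sketch that sums the word lengths directly and guards against empty input follows; the function name average_word_length and the choice to return 0.0 for blank input are assumptions made here, not part of the task.

```python
def average_word_length(sentence):
    """Return the mean length of the whitespace-separated words, or 0.0 if there are none."""
    words = sentence.split()  # split() with no argument collapses runs of whitespace
    if not words:
        return 0.0  # avoid ZeroDivisionError on empty or blank input
    return sum(len(word) for word in words) / len(words)

print(average_word_length("The quick brown fox jumps over the lazy dog"))  # 3.888888888888889
print(average_word_length("  two   words "))  # 4.0
```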
Given a sentence string, output the shortest word in it. If there are several such words, output the last one. A word is a sequence of characters that contains no spaces or punctuation marks and is delimited by spaces, punctuation marks, or the beginning/end of a line.
Input: sentence = “I LOVE python version three and point 10”
Output: "I"
My attempt:
sentence = input("sentence: ")
words = sentence.split()
min_word = None
for word in words:
    if len(word) < len(words):
        min_word = word
print(min_word)
But the output is: 10
Can you help me?
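This bug is caused by if len(word) < len(words):, which compares each word's length with the number of words in the list. It should compare against len(min_word) instead. To avoid calling len(None), initialize min_word with the first word. Using <= instead of < lets a later word win a tie, so the last of several shortest words is printed, as the task requires:
sentence = input("sentence: ")
words = sentence.split()
min_word = words[0]
for word in words:
    if len(word) <= len(min_word):  # <= so the last of several shortest words wins
        min_word = word
print(min_word)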
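The task also says that punctuation marks delimit words, which split() alone does not handle. A possible sketch, assuming that a "word" can be approximated by the regular expression \w+ (letters, digits, and underscores) and using a made-up function name shortest_word:

```python
import re

def shortest_word(sentence):
    """Return the last of the shortest words, or None if there are no words."""
    # \w+ matches runs of letters, digits and underscores, so punctuation
    # and spaces both act as delimiters.
    words = re.findall(r"\w+", sentence)
    if not words:
        return None
    # min() returns the first minimal item it sees; scanning in reverse
    # therefore yields the last shortest word of the original order.
    return min(reversed(words), key=len)

print(shortest_word("I LOVE python version three and point 10"))  # I
print(shortest_word("to be, or not to be"))  # be
```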
Your task is to write a program that loops through the words in the provided jane_eyre.txt file and counts those words which fulfill both of the following conditions:
The word has more than 10 characters
The word does not contain the letter "e"
Once you've finished the aggregation, let your program print out the following message:
There are 10 long words without an 'e'.
My work:
count = 0
f = open("jane_eyre.txt").read()
words = f.split()
if len(words) > 10 and "e" not in words:
    count += 1
print("There are", count, "long words without an 'e'.")
The count result is 1 but it should be 10. What's wrong with my work??
You have to iterate over each word. Following best practices:
with open('jane_eyre.txt') as fp:  # use a context manager
    count = 0  # initialize counter to 0
    for word in fp.read().split():  # for each word
        if len(word) > 10 and 'e' not in word:  # your condition
            count += 1  # increment counter
print(f"There are {count} long words without an 'e'.")  # use f-strings
But pay attention to punctuation: "imaginary," is 10 characters long, but the word itself has only 9 characters.
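One way to deal with attached punctuation is to strip it from both ends of each token before measuring. The sketch below is an assumption about how that could look: the function name count_long_words_without_e, the min_length parameter, and the sample sentence are invented for illustration, and string.punctuation covers only ASCII punctuation.

```python
import string

def count_long_words_without_e(text, min_length=10):
    """Count words longer than min_length characters that contain no 'e'."""
    count = 0
    for raw_word in text.split():
        # Drop punctuation attached to either end, e.g. "imaginary," -> "imaginary".
        word = raw_word.strip(string.punctuation)
        if len(word) > min_length and "e" not in word:
            count += 1
    return count

sample = "An imaginary, extraordinary companionship; a Constantinople!"
print(count_long_words_without_e(sample))  # 1 (only "companionship" qualifies)
```

For the real file, the text read from jane_eyre.txt would be passed in place of sample.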
You need to loop through each word. So something like:
count = 0
f = open("jane_eyre.txt").read()
words = f.split()
for word in words:
    if len(word) > 10 and "e" not in word:
        count += 1
print("There are", count, "long words without an 'e'.")
I have to count the number of syllables in a text file. My problem is that I don't know how to iterate over each character of each string. My idea was to check whether a letter is a vowel and, if the following letter is not a vowel, increase the count by 1. But I can't increment "letter". I've also tried to use range, but I have problems with that too. What can I try? Thank you.
PS: I can only use Python built-in methods.
txt = ['countingwords', 'house', 'plant', 'alpha', 'syllables']
This is my code so far.
def syllables(text_file):
    count = 0
    vowels = ['a','e','i','o','u','y']
    with open(text_file, 'r') as f:
        txt = f.readlines()
        txt = [line.replace(' ','') for line in txt]
        txt = [line.replace(',','') for line in txt]
        txt = [y.lower() for y in txt]
    for word in txt:
        for letter in word:
            if letter is in vowel and [letter + 1] is not in vowel:
                count += 1
You might try this:
lines = ["You should count me too"]
count = 0
vowels = "aeiouy"
for line in lines:
    for word in line.lower().split(" "):
        for i in range(len(word)):
            if word[i] in vowels and (i == 0 or word[i-1] not in vowels):
                count += 1
print(count) # -> 5
I'm trying to create a sentiment analyser in Python that downloads text and analyses it against a list of negative and positive words. For every match within the text with a word in poswords.txt there should be a +1 score and for every match within the text in negwords.txt there should be a -1 score, the overall score for the text will be the sentiment score. This is how I have tried to do it but I keep just getting a score of 0.
The answer below does not seem to work, I keep getting a sentiment score of 0.
split = text.split()
poswords = open('poswords.txt','r')
for word in split:
    if word in poswords:
        sentimentScore +=1
poswords.close()
negwords = open('negwords.txt','r')
for word in split:
    if word in negwords:
        sentimentScore -=1
negwords.close()
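poswords and negwords in your code are just file handles; you are not reading the words in those files. Testing word in poswords iterates over the file's lines, which still end with a newline character, so a bare word never matches.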
Here:
split = text.split()
poswords = open('poswords.txt','r')
pos = []
for line in poswords:
    pos.append(line.strip())
for word in split:
    if word in pos:
        sentimentScore +=1
poswords.close()
negwords = open('negwords.txt','r')
neg = []
for line in negwords:
    neg.append(line.strip())
for word in split:
    if word in neg:
        sentimentScore -=1
negwords.close()
If the files are huge, the above is not an optimal solution, because every membership test scans a whole list. Create a dictionary for the positive and negative words:
from collections import defaultdict

input_text = text.split()  # renamed from split to avoid confusion with the str.split() method
poswords = open('poswords.txt','r')
pos_dict = defaultdict(int)
for line in poswords:
    pos_dict[line.strip()] += 1
poswords.close()
negwords = open('negwords.txt','r')
neg_dict = defaultdict(int)
for line in negwords:
    neg_dict[line.strip()] += 1
negwords.close()
sentiment_score = 0
for word in input_text:
    if word in pos_dict:
        sentiment_score += 1
    elif word in neg_dict:
        sentiment_score -= 1
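Since only membership is tested, a set would serve the same purpose as a dictionary here. The following is a sketch under a few assumptions: the helper names load_words and sentiment_score are invented, the word files are assumed to hold one word per line in UTF-8, and small in-memory sets stand in for poswords.txt and negwords.txt in the example.

```python
def load_words(path):
    """Read one word per line from path into a set, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:  # closes the file automatically
        return {line.strip() for line in handle if line.strip()}

def sentiment_score(text, pos_words, neg_words):
    """+1 for each word of text found in pos_words, -1 for each found in neg_words."""
    score = 0
    for word in text.split():
        if word in pos_words:
            score += 1
        elif word in neg_words:
            score -= 1
    return score

# Small in-memory word sets stand in for poswords.txt and negwords.txt here.
review = "good food and great service but bad parking"
print(sentiment_score(review, {"good", "great"}, {"bad"}))  # 1
```

With the real files, the sets would come from load_words('poswords.txt') and load_words('negwords.txt').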