Word Frequency counter not counting past 1 per user input - python

I am very new to Python and new to this website as well, so I apologize if this question isn't worded well. The program I am trying to create takes user inputs, counts the words in them, and also counts the occurrences of a specific list of common words. My problem is that when a test case has more than one occurrence of a word in my list, the counter does not count past 1. For example, if the user input is "A statement is a program instruction", it only counts the word "a" once. I have included my code below; this is my first attempt at using loops. I believe my problem lies within lines 31-32.
#step1
from re import X
words = ['the','be','to', 'of', 'and','a','in','that','have','i']
frequency = [0,0,0,0,0,0,0,0,0,0]
assert len(words) == len(frequency), "the length of words and frequency should be equal!"
total_frequency = 0
#step2
text = input("Enter some text:").lower()
#step3 repeated until == quit through step 7
while text != 'quit':
    #step4 split the input
    text_list = text.split(' ')
    #step5 update the frequency
    total_frequency = total_frequency+len(text_list)
    #step6 update the counter
    for idx, this_word in enumerate(words):
        # in the inner loop, you have to compare this_word with each element of text_list if they are the same, add on to frequency[idx]
        if this_word in text_list:
            frequency[idx]+=1
    #step7: ask for another input
    text = input("Enter some text:").lower()
#step 8 format output
keys = ' '.join(words)
print(f'Total number of words:{total_frequency}')
print("{:^50}".format('Word frequency of top ten words'))
print('-'*50)
print(keys)
for idx, this_word in enumerate(words):
    print(f'{frequency[idx]:^{len(this_word)}} ',end="")
print()
# END
OUTPUT:
Enter some text:A statement is a program instruction
Enter some text:quit
Total number of words:6
         Word frequency of top ten words
--------------------------------------------------
the be to of and a in that have i
 0  0  0  0   0  1 0   0    0   0

Replace your inner loop with this:
for word in text_list:
    if word in words:
        frequency[words.index(word)] += 1
and you will get the results you want. Note that it would be better to store your words and frequencies in a dict, so you could let Python do the searching, like this:
words = {'the':0,'be':0,'to':0, 'of':0, 'and':0,'a':0,'in':0,'that':0,'have':0,'i':0}
total_frequency = 0
#step2
text = input("Enter some text:").lower()
#step3 repeated until == quit through step 7
while text != 'quit':
    #step4 split the input
    text_list = text.split(' ')
    #step5 update the frequency
    total_frequency = total_frequency+len(text_list)
    #step6 update the counter
    for word in text_list:
        if word in words:
            words[word] += 1
    text = input("Enter some text:").lower()
#step 8 format output
keys = ' '.join(words.keys())
print(f'Total number of words:{total_frequency}')
print("{:^50}".format('Word frequency of top ten words'))
print('-'*50)
print(keys)
for word, cnt in words.items():
    print(f'{cnt:^{len(word)}} ',end="")
print()
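The dict-based counting above can also be sketched with collections.Counter from the standard library. This is only an illustration, not the answerer's code: the function name count_tracked and the constant TRACKED are made up here, and the input is passed in as a list of strings instead of being read with input() so the sketch can run unattended.

```python
from collections import Counter

# The ten tracked words from the question.
TRACKED = ['the', 'be', 'to', 'of', 'and', 'a', 'in', 'that', 'have', 'i']

def count_tracked(lines, tracked=TRACKED):
    """Return (total word count, Counter of tracked words) for an iterable of lines."""
    total = 0
    counts = Counter()
    for line in lines:
        tokens = line.lower().split()  # split() with no argument also collapses repeated spaces
        total += len(tokens)
        counts.update(t for t in tokens if t in tracked)
    return total, counts

total, counts = count_tracked(["A statement is a program instruction"])
print(total)          # 6
print(counts['a'])    # 2
print(counts['the'])  # 0, because a Counter returns 0 for keys it has not seen
```

A Counter behaves like a dict whose missing keys read as 0, so the tracked words do not need to be pre-filled with zeros.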

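The test "if this_word in text_list" is only True or False: it tells you whether this_word occurs in the list, not how many times. You need another loop, inside the existing one, that compares this_word with each element:
for word in text_list:
    if this_word == word:
        frequency[idx]+=1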
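If you would rather keep the original outer loop over words, the built-in list.count method does the inner comparison for you. A minimal sketch, using a fixed sentence in place of input():

```python
words = ['the', 'be', 'to', 'of', 'and', 'a', 'in', 'that', 'have', 'i']
frequency = [0] * len(words)

# Fixed sample text standing in for the user's input.
text_list = "a statement is a program instruction".split()

for idx, this_word in enumerate(words):
    # list.count returns how many elements equal this_word (0 if none do)
    frequency[idx] += text_list.count(this_word)

print(frequency)  # [0, 0, 0, 0, 0, 2, 0, 0, 0, 0]
```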

Related

How to select certain characters in a string in Python?

My name is Shaun. I am 13 years old and trying to learn python.
I am trying to make a program that finds vowels in an input and then prints how many vowels there are in the input the user gives.
Here is the code:
s = (input('Enter a string: ')) # Let the user give an input (has to be a string)
Vwl = [] # Create an array where we will append the values when the program finds a vowel or several vowels
for i in s: # Create a loop to check for each letter in s
    count_a = 0 # Create a variable to count how many vowels in a
    count_e = 0 # Create a variable to count how many vowels in e
    count_i = 0 # Create a variable to count how many vowels in i
    count_o = 0 # Create a variable to count how many vowels in o
    count_u = 0 # Create a variable to count how many vowels in u
The code below is pretty long to explain, so in summary: it looks for a vowel in s (the input) and increases one or more of the counters by 1. For the sake of learning, we append the vowels to the list Vwl. Then it prints Vwl and how many letters are in the list, using len.
if s.find("a" or "A") != -1:
    count_a = count_a + 1
    Vwl.append('a')
elif s.find("e" or "E") != -1:
    count_e = count_e + 1
    Vwl.append("e")
elif s.find("i" or "I") != -1:
    count_i = count_i + 1
    Vwl.append("i")
elif s.find("o" or "O") != -1:
    count_o = count_o + 1
    Vwl.append("o")
elif s.find("u" or "U") != -1:
    count_u = count_u + 1
    Vwl.append("u")
print(Vwl)
print(f"How many vowels in the sentence: {len(Vwl)}")
For some odd reason, however, my program finds only the first vowel it sees and treats the whole string as that vowel. It then prints the wrong number of vowels, based on the contents of the list Vwl.
Could someone please help me?
The reason your code only finds the first vowel is that the if statements are not inside a loop. That part of the code runs once and then finishes, so it only manages to find one vowel.
Here are a couple of ways you can do what you are trying to do:
Way 1: If you just want to count the vowels:
s = input()
vowel_counter = 0
for letter in s:
    if letter in "aeiou":
        vowel_counter+=1
print(f"How many vowels in the sentence: {vowel_counter}")
Way 2: Use a python dictionary to keep track of how many of each vowel you have
s = input()
vowel_dict = {}
for letter in s:
    if letter in "aeiou":
        if letter not in vowel_dict:
            vowel_dict[letter]=0
        vowel_dict[letter]+=1
print(f"How many vowels in the sentence: {sum(vowel_dict.values())}")
print(vowel_dict)
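Two further points are worth noting, independent of where the if statements sit. First, in the question's code the expression "a" or "A" is evaluated before find is called, and it always yields "a", so the uppercase letter is never searched for. Second, both ways above only match lowercase vowels. A possible sketch that handles both cases follows; the function name count_vowels is made up for illustration, and a fixed string stands in for input().

```python
from collections import Counter

# `or` returns its first truthy operand, so s.find("a" or "A") is just s.find("a").
print("a" or "A")  # a

def count_vowels(s):
    """Count each vowel in s, treating upper and lower case alike."""
    return Counter(ch for ch in s.lower() if ch in "aeiou")

counts = count_vowels("Apples And Oranges")
print(sum(counts.values()))  # 6
print(counts['a'])           # 3
```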

How to loop through words with conditions

Your task is to write a program that loops through the words in the provided jane_eyre.txt file and counts those words which fulfill both of the following conditions:
The word has more than 10 characters
The word does not contain the letter "e"
Once you've finished the aggregation, let your program print out the following message:
There are 10 long words without an 'e'.
My work:
count = 0
f = open("jane_eyre.txt").read()
words = f.split()
if len(words) > 10 and "e" not in words:
    count += 1
print("There are", count, "long words without an 'e'.")
The count result is 1 but it should be 10. What's wrong with my work??
You have to iterate over each word. Following best practices:
with open('jane_eyre.txt') as fp:  # use a context manager
    count = 0  # initialize counter to 0
    for word in fp.read().split():  # for each word
        if len(word) > 10 and 'e' not in word:  # your condition
            count += 1  # increment counter
print(f"There are {count} long words without an 'e'.")  # use f-strings
But pay attention to punctuation: "imaginary," is 10 characters long, but the word itself has only 9 characters.
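One way to deal with the punctuation issue is to strip punctuation from both ends of each token before measuring it. This is a sketch under assumptions: the function name count_long_words_without_e is made up, it works on a string rather than on jane_eyre.txt, and whether the exercise expects punctuation to be stripped at all is not stated in the question.

```python
import string

def count_long_words_without_e(text, min_len=10):
    """Count words longer than min_len characters that contain no 'e'."""
    count = 0
    for raw in text.split():
        word = raw.strip(string.punctuation)  # drops leading/trailing punctuation only
        if len(word) > min_len and 'e' not in word:
            count += 1
    return count

sample = "An extraordinary, astonishing companionship; imaginary, though."
print(count_long_words_without_e(sample))         # 2
print(count_long_words_without_e("blacksmith,"))  # 0, the word itself has 10 characters
```

To apply it to the file, pass in the result of reading it, for example the string returned by fp.read() inside the with block.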
You need to loop through each word. So something like:
count = 0
f = open("jane_eyre.txt").read()
words = f.split()
for word in words:
    if len(word) > 10 and "e" not in word:
        count += 1
print("There are", count, "long words without an 'e'.")

Comparing user input from a list of words

So I am writing a program that will take in user input and then will take the user input and compare it to a set list and then tell me how many words from the user input are in the given list.
For Example:
list = ['I','like','apples'] # set list
user_in = input('Say a phrase:')
# the user types: I eat apples.
#
# then the code will count and total the similar words
# in the list from the user input.
I've gotten close with this. I know I might have to convert the user input into a list itself; I just need help comparing and counting matching words.
Thank you.
len([word for word in user_in.split() if word in list])
Try it like this:
similarWords=0 #initialize a counter
for word in user_in.split():
    if word in list: #check and compare if word is in set list
        similarWords+=1 #increase counter by 1 every time a word matches
Well, you can split your user input with user_in.split(' ').
Then compare each word in the user_in_list with a word in the check list and increase the counter when this is the case:
list = ['I','like','apples'] # set list
user_in = input('Say a phrase:')
ui = user_in.split(' ')
count = 0
for word in ui:
    if word in list:
        count += 1
print(count)
Try this:
l1 = ['I','like','apples'] # set list
user = input('Say a phrase:')
a=user.split(' ')
k=0
print(a)
for i in a:
    if i in l1:
        k+=1
print("similar words:",k)
Hope this helps you!
By using the function split() you can split the given phrase into words.
It is also better to call ui.lower() or ui.upper() first, so that the comparison is not case-sensitive:
li = ['i', 'like', 'apples'] #initial list
ui = input('say a phrase') #taking input from user
ui = ui.lower() #converting string into lowercase
ui_list = ui.split() #converting phrase into list containing words
count = 0
for word in ui_list:
    if word in li:
        print(word)
        count += 1
print(count) #printing the number of matched words
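Note that in the question's example the user types "I eat apples." with a trailing period, so a plain split leaves the token "apples." which does not equal "apples". A possible sketch that ignores both case and punctuation at the edges of each token; the function name count_matches is made up for illustration.

```python
import string

def count_matches(phrase, wanted):
    """Count the words in phrase that appear in wanted, ignoring case and edge punctuation."""
    wanted_set = {w.lower() for w in wanted}  # a set gives fast membership tests
    tokens = (t.strip(string.punctuation).lower() for t in phrase.split())
    return sum(1 for t in tokens if t in wanted_set)

print(count_matches("I eat apples.", ['I', 'like', 'apples']))  # 2
```

The sketch also avoids naming a variable list, which would shadow the built-in type of the same name.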

Print values from list based from separate text file

How do I print a list of words from a separate text file? I want to print all the words unless the word has a length of 4 characters.
words.txt file looks like this:
abate chicanery disseminate gainsay latent aberrant coagulate dissolution garrulous laud
It has 334 total words in it. I'm trying to display the list until it reaches a word with a length of 4 and stops.
wordsFile = open("words.txt", 'r')
words = wordsFile.read()
wordsFile.close()
wordList = words.split()
#List outputs length of words in list
lengths= [len(i) for i in wordList]
for i in range(10):
    if i >= len(lengths):
        break
    print(lengths[i], end = ' ')
# While loop displays names based on length of words in list
while words != 4:
    if words in wordList:
        print("\nSelected words are:", words)
    break
output
5 9 11 7 6 8 9 11 9 4
sample desired output
Selected words are:
Abate
Chicanery
disseminate
gainsay
latent
aberrant
coagulate
dissolution
garrulous
Given that you only want the first 10 words, there isn't much point reading all 4 lines. You can safely read just the first and save yourself some time.
#from itertools import chain
with open('words.txt') as f:
    # could raise `StopIteration` if file is empty
    words = next(f).strip().split()
    # to read all lines
    #words = []
    #for line in f:
    #    words.extend(line.strip().split())
    # more functional way
    # words = list(chain.from_iterable(line.strip().split() for line in f))
print("Selected words are:")
for word in words[:10]:
    if len(word) != 4:
        print(word)
There are a few alternative methods I left in there but commented out.
Edit using a while loop.
i = 0
while i < 10:
    if len(words[i]) != 4:
        print(words[i])
    i += 1
Since you know how many iterations you can do, you can hide the mechanics of the iteration using a for loop. A while does not facilitate this very well and is better used when you don't know how many iterations you will do.
To read all words from a text file, and print each of them unless they have a length of 4:
with open("words.txt","r") as wordsFile:
    words = wordsFile.read()
wordsList = words.split()
print("Selected words are:")
for word in wordsList:
    if len(word) != 4: # ..unless it has a length of 4
        print(word)
Later in your question you write, "I'm trying to display the first 10 words...". If so, add a counter, and print a word only while the counter's value is <= 10.
While I'd use a for or a while loop, as Paul Rooney suggested, you can also adapt your code.
When you create the list lengths, you create a list with ALL the lengths of the words contained in wordList.
You then cycle through the first 10 lengths in lengths with the for loop.
If you need to use this method, you can use the same index to compare words and lengths:
#lengths contains all the lengths of the words in wordList
lengths= [len(i) for i in wordList]
#foo contains all the words in wordList
foo = [i for i in wordList]
#for the first 10 elements of lengths, if the element isn't 4 chars long
#print the foo element with the same index
for i in range(10):
    if i >= len(lengths):
        break
    if lengths[i] != 4:
        print(foo[i])
I hope this is clear and it's the answer you were looking for.
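The question's sample output suggests a slightly different reading: print words until the first four-letter word is reached, then stop, rather than skipping four-letter words. If that is the intent, itertools.takewhile expresses it directly. In this sketch the function name words_before_length is made up, and the word list is the ten-word sample from the question rather than the contents of words.txt.

```python
from itertools import takewhile

def words_before_length(words, stop_len=4):
    """Return the leading words, stopping at the first one whose length equals stop_len."""
    return list(takewhile(lambda w: len(w) != stop_len, words))

sample = ("abate chicanery disseminate gainsay latent "
          "aberrant coagulate dissolution garrulous laud").split()

print("Selected words are:")
for word in words_before_length(sample):
    print(word)  # prints abate through garrulous and stops before "laud"
```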

How to print words that only contain letters from a list?

Hello, I have recently been trying to create a program in Python 3 that reads a text file containing 23005 words. The user then enters a string of 9 characters, which the program uses to create words and compare them to the ones in the text file.
I want to print words which contains between 4-9 letters and that also contains the letter in the middle of my list. For example if the user enters the string "anitsksem" then the fifth letter "s" must be present in the word.
Here is how far I have gotten on my own:
# Open selected file & read
filen = open("svenskaOrdUTF-8.txt", "r")
# Read all rows and store them in a list
wordList = filen.readlines()
# Close File
filen.close()
# letterList index
i = 0
# List of letters that user will input
letterList = []
# List of words that are our correct answers
solvedList = []
# User inputs 9 letters that will be stored in our letterList
string = input(str("Ange Nio Bokstäver: "))
userInput = False
# Checks if user input is correct
while userInput == False:
    # if the string is equal to 9 letters
    # insert letter into our letterList.
    # also set userInput to True
    if len(string) == 9:
        userInput = True
        for char in string:
            letterList.insert(i, char)
            i += 1
    # If string not equal to 9 ask user for a new input
    elif len(string) != 9:
        print("Du har inte angivit nio bokstäver")
        string = input(str("Ange Nio Bokstäver: "))
# For each word in wordList
# and for each char within that word
# check if said word contains a letter from our letterList
# if it does and meets the requirements to be a correct answer
# add said word to our solvedList
for word in wordList:
    for char in word:
        if char in letterList:
            if len(word) >= 4 and len(word) <= 9 and letterList[4] in word:
                print("Char:", word)
                solvedList.append(word)
The issue I run into is that instead of printing words which contain only letters from my letterList, it prints words which contain at least one letter from my letterList. This also means that some words are printed multiple times, for example if a word contains multiple letters from letterList.
I've been trying to solve these problems for a while but I just can't seem to figure it out. I have also tried using permutations to create all possible combinations of the letters in my list and then comparing them to my word list; however, I felt that solution was too slow given the number of combinations that must be created.
Also, since I'm kind of new to Python, if you have any general tips to share, I would really appreciate it.
You get multiple words mainly because you iterate through each character in a given word and if that character is in the letterList you append and print it.
Instead, iterate on a word basis and not on a character basis while also using the with context managers to automatically close files:
with open('american-english') as f:
    for w in f:
        w = w.strip()
        cond = all(i in letterList for i in w) and letterList[4] in w
        if 9 >= len(w) >= 4 and cond:
            print(w)
Here cond is used to trim down the if statement, all(..) is used to check if every character in the word is in letterList, w.strip() is to remove any redundant white-space.
Additionally, to populate your letterList when the input is 9 letters, don't use insert. Instead, just supply the string to list and the list will be created in a similar, but noticeably faster, fashion:
This:
if len(string) == 9:
    userInput = True
    for char in string:
        letterList.insert(i, char)
        i += 1
Can be written as:
if len(string) == 9:
    userInput = True
    letterList = list(string)
With these changes, the initial open and readlines are not needed, neither is the initialization of letterList.
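One caveat about the all(...) check: it accepts a word that uses a letter more often than it appears in the input. The question does not say whether letters may be reused. If each of the nine letters may be used at most once (an assumption), comparing letter counts with collections.Counter is one way to enforce that. The function names can_build and solve and the candidate word list below are made up for illustration.

```python
from collections import Counter

def can_build(word, letters):
    """True if word can be spelled using each character of letters at most once."""
    need = Counter(word)
    have = Counter(letters)
    return all(have[ch] >= n for ch, n in need.items())

def solve(words, letters):
    """Words of 4 to 9 characters that contain the middle letter and can be built from letters."""
    middle = letters[len(letters) // 2]  # index 4 for a nine-letter string
    return [w for w in words
            if 4 <= len(w) <= 9 and middle in w and can_build(w, letters)]

candidates = ["mask", "mass", "stats", "main", "ask", "steaks"]
print(solve(candidates, "anitsksem"))  # ['mask', 'mass', 'steaks']
```

Here "stats" is rejected because the input has only one "t", "main" lacks the middle letter "s", and "ask" is too short.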
You can try this logic:
for word in wordList:
    word = word.strip()  # readlines() keeps the trailing newline
    # if not a valid word, skip - moving this check outside the inner for-each will improve performance
    if len(word) < 4 or len(word) > 9 or letterList[4] not in word:
        continue
    # find the number of matching characters
    match_count = 0
    for char in word:
        if char in letterList:
            match_count += 1
    # check if the total number of matches is equal to the word length
    if match_count == len(word):
        print("Char:", word)
        solvedList.append(word)
You can use a list comprehension to get this done.
I am just putting up a proof of concept here and leave it to you to convert it into a complete solution.
filen = open("test.text", "r")
word_list = filen.read().split()
filen.close()
print("Enter your string")
search_letter = input()[4]
solved_list = [word for word in word_list if len(word) >= 4 and len(word) <= 9 and search_letter in word]
print(solved_list)
