Iterating through a dictionary to find the most frequently used word

I wrote a quick bit of code that stores the words a user types and counts how many times each word was entered. When the user types print, it is supposed to print the most frequently used word and how many times it was used, for example:
The word 'Hello' has been used '12' times
I am just not sure how to iterate through the dictionary and find the most frequent word.
Here is the code:
d = {}
L2 = []
Counter = 0
checker = -1
while True:
    Text = str(input('enter word '))
    Text = Text.lower()
    if Text in ['q', 'Q']:
        break
    if Text in ['Print', 'print']:
        for word in L2:
            if word not in d:
                counter = L2.count(word)
                d[word] = counter
        #This part does not work
        for value in d[0]:
            if checker < value:
                checker = value
        print(checker)
    #This part does
    L2.append(Text)

d = {}
L2 = []
while True:
    text_input = input('enter word ')  # input() already returns a string
    text_input = text_input.lower()
    if text_input == 'q':  # after .lower() the input can never be 'Q', so one check is enough
        break
    elif text_input == 'print':
        for item in set(L2):  # set(L2) keeps only the unique words from L2
            d[item] = L2.count(item)  # store the count of each unique word in the dictionary
        for word, count in d.items():  # prints every word that reaches the maximum, so ties are all shown
            if count == max(d.values()):
                print("Word : {} repeated {} times.".format(word, count))
    else:
        L2.append(text_input)  # input is neither 'q' nor 'print', so add the word to the list
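The same tally can also be sketched with collections.Counter from the standard library. This is only a minimal sketch, not the method used in the answer above. The function name most_frequent_words and the sample list typed are made up for illustration, and the input loop is left out so the counting logic stands on its own.

```python
from collections import Counter


def most_frequent_words(words):
    """Return (list_of_top_words, count); all tied words are returned."""
    counts = Counter(words)
    if not counts:
        # Nothing has been typed yet.
        return [], 0
    top = max(counts.values())
    return [word for word, n in counts.items() if n == top], top


# Hypothetical sample standing in for the words collected from input().
typed = ["hello", "world", "hello", "python", "world", "hello"]
winners, times = most_frequent_words(typed)
for word in winners:
    print("The word '{}' has been used '{}' times".format(word, times))
```

Counter does the counting in a single pass over the list, whereas calling L2.count() for every unique word rescans the list each time.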

Related

Word Frequency counter not counting past 1 per user input

I am very new to Python and new to this website as well, so I apologize if this question isn't worded well. Basically, the program I am trying to create takes user inputs, counts the words in the inputs, and also counts the occurrences of a specific list of common words. My problem is that when a test case has more than one occurrence of a word in my list, the counter does not count past 1. For example, if the user input is "A statement is a program instruction", it will only count the word "a" one time. I have included my code below. This is also my first attempt at using loops. I believe my problem lies within lines 31-32.
#step1
from re import X
words = ['the','be','to', 'of', 'and','a','in','that','have','i']
frequency = [0,0,0,0,0,0,0,0,0,0]
assert len(words) == len(frequency), "the length of words and frequency should be equal!"
total_frequency = 0
#step2
text = input("Enter some text:").lower()
#step3 repeated until == quit through step 7
while text != 'quit':
    #step4 split the input
    text_list = text.split(' ')
    #step5 update the frequency
    total_frequency = total_frequency+len(text_list)
    #step6 update the counter
    for idx, this_word in enumerate(words):
        # in the inner loop, you have to compare this_word with each element of text_list if they are the same, add on to frequency[idx]
        if this_word in text_list:
            frequency[idx]+=1
    #step7: ask for another input
    text = input("Enter some text:").lower()
#step 8 format output
keys = ' '.join(words)
print(f'Total number of words:{total_frequency}')
print("{:^50}".format('Word frequency of top ten words'))
print('-'*50)
print(keys)
for idx, this_word in enumerate(words):
    print(f'{frequency[idx]:^{len(this_word)}} ',end="")
print()
# END
OUTPUT:
Enter some text:A statement is a program instruction
Enter some text:quit
Total number of words:6
Word frequency of top ten words
--------------------------------------------------
the be to of and a in that have i
0 0 0 0 0 1 0 0 0 0

Replace your inner loop with this:
for word in text_list:
    if word in words:
        frequency[words.index(word)] += 1
and you will get the results you want. Note that it would be better to store your words and frequencies in a dict, so you could let Python do the searching, like this:
words = {'the':0,'be':0,'to':0, 'of':0, 'and':0,'a':0,'in':0,'that':0,'have':0,'i':0}
total_frequency = 0
#step2
text = input("Enter some text:").lower()
#step3 repeated until == quit through step 7
while text != 'quit':
    #step4 split the input
    text_list = text.split(' ')
    #step5 update the frequency
    total_frequency = total_frequency+len(text_list)
    #step6 update the counter
    for word in text_list:
        if word in words:
            words[word] += 1
    text = input("Enter some text:").lower()
#step 8 format output
keys = ' '.join(words.keys())
print(f'Total number of words:{total_frequency}')
print("{:^50}".format('Word frequency of top ten words'))
print('-'*50)
print(keys)
for word, cnt in words.items():
    print(f'{cnt:^{len(word)}} ',end="")
print()

The test if this_word in text_list is only True or False: it tells you whether this_word occurs at all, not how many times it occurs. You need another loop inside the existing one:
for word in text_list:
    if this_word == word:
        frequency[idx]+=1
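As a further variation on the answers above, the counting can be sketched with collections.Counter: count every word in the input once, then look up only the tracked words. This is a sketch under assumptions, not the asker's required structure. The names TRACKED and tracked_frequencies are hypothetical, and the input loop is replaced by a fixed sentence.

```python
from collections import Counter

# The same ten common words used in the question.
TRACKED = ['the', 'be', 'to', 'of', 'and', 'a', 'in', 'that', 'have', 'i']


def tracked_frequencies(text, tracked=TRACKED):
    """Count every word in text, then report only the tracked ones."""
    counts = Counter(text.lower().split())
    # A Counter returns 0 for missing keys, so absent words need no special case.
    return {word: counts[word] for word in tracked}


result = tracked_frequencies("A statement is a program instruction")
print(result['a'])  # both "A" and "a" are counted after lower()
```

Because a missing key in a Counter evaluates to 0, words that never appear need no separate check.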

How to find the largest number of times a word is repeated consecutively in a given string?

Okay, so this is kind of a confusing question; I will try to word it as best I can.
I'm trying to figure out how to find the largest number of consecutive repeats of a word in a string in Python.
For example, let's say the word I want to look for is "apple" and the string is: "applebananaorangeorangeorangebananaappleappleorangeappleappleappleapple". Here, the largest number of consecutive repeats for the word "apple" is 4 (the run at the end of the string).
I have tried numerous ways of finding repeating characters, such as this:
word="100011010" #word = "1"
count=1
length=""
if len(word)>1:
    for i in range(1,len(word)):
        if word[i-1]==word[i]:
            count+=1
        else :
            length += word[i-1]+" repeats "+str(count)+", "
            count=1
    length += ("and "+word[i]+" repeats "+str(count))
else:
    i=0
    length += ("and "+word[i]+" repeats "+str(count))
print (length)
But this works on single characters (digits here) and not on words. It also outputs how many times each character repeats in every run, but does not identify the largest number of consecutive repeats. I hope that makes sense. My brain is kind of all over the place right now, so I apologize if this is unclear.

Here is a solution I came up with that I believe solves your problem. There is almost certainly a simpler/faster way to do it if you spend more time with the problem which I would encourage.
import re
search_string = "applebananaorangeorangeorangebananaappleappleorangeappleappleappleapple"
search_term = "apple"
def search_for_term(search_string, search_term):
    #split string into array on search_term
    #keeps search term in array unlike normal string split
    #re.escape guards against search terms that contain regex metacharacters
    split_string = re.split(f'({re.escape(search_term)})', search_string)
    #remove empty strings left over from the split
    split_string = list(filter(lambda x: x != "", split_string))
    #enumerate string and filter out instances that aren't the search term
    enum_string = list(filter(lambda x: x[1] == search_term, enumerate(split_string)))
    #loop through each of the items in the enumerated list and save to the current chain
    #once a chain breaks i.e. the next element is not in order append the current_chain to
    #the chains list and start over
    chains = []
    current_chain = []
    for idx, val in enum_string:
        if len(current_chain) == 0:
            current_chain.append(idx)
        elif idx == current_chain[-1] + 1:
            current_chain.append(idx)
        else:
            chains.append(current_chain)
            current_chain = [idx]
    #append anything leftover in the current_chain list to the chains list
    if len(current_chain) > 0:
        chains.append(current_chain)
    #find the max length nested list in the chains list and return it
    #default=0 covers the case where the search term never occurs
    max_length = max(map(len, chains), default=0)
    return max_length
max_length = search_for_term(search_string, search_term)
print(max_length)

Here is how I would do this. First check for 'apple' in randString, then check for 'appleapple', then 'appleappleapple', and so on until the search fails. Keep track of the iteration count and voilà.
randString = "applebananaorangeorangeorangebananaappleappleorangeappleappleappleapple"
find = input('type in word to search for: ')
def consecutive():
    count =0
    for i in range(len(randString)):
        count +=1
        seachword = [find*count]
        check = [item for item in seachword if item in randString]
        if len(check) != 0:
            continue
        else:
            # Need to remove 1 from the final count.
            print (find, ":", count -1)
            break
consecutive()
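Another way to approach the same problem, sketched here as an alternative rather than a replacement for either answer above, is to let a regular expression find every run of back-to-back copies of the term and keep the longest one. The function name max_consecutive_repeats is made up for this example.

```python
import re


def max_consecutive_repeats(text, term):
    """Return the largest number of back-to-back copies of term in text."""
    if not term:
        # An empty term has no meaningful repeat count.
        return 0
    # (?:term)+ matches one or more adjacent copies; re.escape keeps any
    # regex metacharacters in term literal.
    pattern = re.compile('(?:' + re.escape(term) + ')+')
    run_lengths = (len(m.group(0)) // len(term) for m in pattern.finditer(text))
    return max(run_lengths, default=0)


sample = "applebananaorangeorangeorangebananaappleappleorangeappleappleappleapple"
print(max_consecutive_repeats(sample, "apple"))
print(max_consecutive_repeats(sample, "orange"))
```

Each match is one whole run, so dividing its length by the length of the term gives the number of repeats in that run.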

Count vowels and print duplicate vowels

I have a code that counts the number of vowels from the user input and prints them. Furthermore, what I also want to do is to print out the duplicate vowels.
The first part of the code runs fine and it does print out the number of vowels in whatever the user gives input but the second part does not seem to work. I am attaching the code I have come up with.
user_name = input('Please enter your name: ')
count = 0
for vowels in user_name:
    if vowels.lower() == "a" or vowels.lower() == "e" or vowels.lower() == "i" or vowels.lower() == "o" \
            or vowels.lower() == "u":
        count = count + 1
print(f'Number of vowels are {count}')
dupes = ""
for rep_vows in user_name:
    if rep_vows not in dupes:
        # dupes.append(rep_vows)
        print(dupes)

If you want to get the duplicates, you should check whether each new vowel was already added to the list (or string) of vowels found so far.
A simple modification will get you this:
duples = ''
for rep_vows in user_name:
    if rep_vows in duples:
        print(rep_vows)
    if rep_vows.lower() in "aeiou":
        duples += rep_vows
Since you know how to use in, you can change the first part to:
for vowels in user_name:
    if vowels.lower() in "aeiou":
        count = count + 1

For counting things, Python has the Counter class, a dict subclass. Here are some examples:
>>> from collections import Counter
>>> import re
>>> # Count vowels
>>> Counter(re.findall('[aieouAEIOU]', 'Daniel Hilst'))
Counter({'i': 2, 'a': 1, 'e': 1})
>>> # Summing up
>>> sum(Counter(re.findall('[aieouAEIOU]', 'Daniel Hilst')).values())
4
>>> # Count words
>>> Counter(re.findall(r'\w+', 'some text'))
Counter({'some': 1, 'text': 1})
>>>
You can find the documentation at collections package docs: https://docs.python.org/3.7/library/collections.html

I think a native Python Counter is really what you should try to use here. It is just a glorified dictionary but it really trims down the amount of code you need to write to achieve your goal.
from collections import Counter #import Counter from Python's collections standard library
user_name = input('Please enter your name: ')
vowels = ['a','e','i','o','u'] # create a list of your vowels
counter = Counter() # initialize counter
for letter in user_name:
    if letter.lower() in vowels: # lower() so that uppercase vowels are counted too
        print(letter)
        counter[letter.lower()]+=1
print(counter)

Your solution is OK, but dupes should be a list (of repeated vowels), and you can use another list for the vowels already seen. When you see a vowel, you check whether it is already in seen. If it is, you append it to dupes; otherwise, you append it to seen, so that when it repeats, it will be appended to dupes.
Finally, you print the list of dupes.
user_name = input('Please enter your name: ')
count = 0
for vowels in user_name:
    if vowels.lower() == "a" or vowels.lower() == "e" or vowels.lower() == "i" or vowels.lower() == "o" or vowels.lower() == "u":
        count = count + 1
print(f'Number of vowels are {count}')
seen = []
dupes = []
for rep_vows in user_name:
    if rep_vows.lower() not in "aeiou":
        continue  # skip consonants and other characters so only vowels are tracked
    if rep_vows not in seen:
        seen.append(rep_vows)
    else:
        dupes.append(rep_vows)
print(dupes)
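Both the total and the duplicates can be read off one Counter. The following is a minimal sketch, assuming that upper- and lowercase vowels should be treated as the same letter. The function name vowel_report is hypothetical, and the name is passed in directly instead of coming from input().

```python
from collections import Counter


def vowel_report(name):
    """Return (total_vowels, sorted list of vowels that occur more than once)."""
    # Lowercase first so 'A' and 'a' are counted as the same vowel.
    counts = Counter(ch for ch in name.lower() if ch in "aeiou")
    total = sum(counts.values())
    dupes = sorted(vowel for vowel, n in counts.items() if n > 1)
    return total, dupes


total, dupes = vowel_report("Daniel Hilst")
print(f'Number of vowels are {total}')
print(dupes)
```

A vowel is a duplicate exactly when its count is greater than 1, so no separate seen list is needed.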

Can someone explain how to store and display a line count with user input?

This simple while loop stops at a sentinel value and otherwise keeps asking for user input. How would I use the incrementing line count variable to display, afterwards, on which line the user entered each thing? I would guess a dictionary is needed?
lineCount = 1
d = {} #is this needed?
q = raw_input("enter something")
while q != "no":
    lineCount += 1
    q = raw_input("enter something")
#code here that stores each new input with its respective line and prints what the user inputted on each corresponding line when the loop ends
Many thanks in advance

Using a list:
lines = []
def add_line(line):
    lines.append(line)
def print_lines():
    for i in range(len(lines)):
        print "%d: %s" % (i, lines[i])
lineCount = 1
q = raw_input("enter something")
add_line(q)
while q != "no":
    lineCount += 1
    q = raw_input("enter something")
    if q != "no":
        add_line(q)
print_lines()

Exactly like you said, using a dictionary:
d = {}
lineCount = 1
q = raw_input("enter something: ")
while q != "no":
    lineCount += 1
    q = raw_input("enter something: ")
    d[lineCount] = q
Then you can query your dictionary to find out what the user entered at a given line with d[desired_line].
Assuming you have a simple dictionary, such as:
d = {2: 'b', 3: 'c', 4: 'd', 5: 'no', 6: 'c'}
Then if you'd like to print out the lines where a given word appears, you could define a function like so:
def print_lines(my_dict, word):
    lines = [line for line, ww in my_dict.iteritems() if ww == word]
    return lines
You could call this function with your dictionary (d) and the word you're looking for, let's say 'c':
print_lines(d, 'c')
Finally, you could iterate over the set of unique words (unique_words, built with set) and print the lines they appear on by calling the previous function:
def print_words(my_dict):
    unique_words = set([word for word in my_dict.itervalues() if word != 'no'])
    for word in sorted(unique_words):
        print("'{0}' occurs in lines {1}".format(word, print_lines(my_dict, word)))
UPDATE (for sentences instead of single words):
In case you want to work with sentences, rather than single words per line, you'll need to modify your code as follows.
First, you could define a function to handle the user input and build the dictionary:
def build_dict():
    d = {}
    lineCount = 1
    q = raw_input("enter something: ")
    while q != "no":
        d[lineCount] = q
        lineCount += 1
        q = raw_input("enter something: ")
    return d
Let's say you build a dictionary like this:
d = {1: 'a x', 2: 'a y', 3: 'b x'}
Then you can define a helper function that tells you on which lines a particular word occurs:
def occurs_at(my_dict, word):
    # split each sentence so that whole words are matched, not substrings
    lines = [line for line, words in my_dict.iteritems() if word in words.split(' ')]
    return lines
Finally, you can query this function to see where all your words occur (and you can decide which words to ignore, e.g. 'a').
def print_words(my_dict, ignored=['a']):
    sentences = [sentence.split(' ') for sentence in my_dict.itervalues()]
    unique_words = set([word for sentence in sentences for word in sentence])
    for word in sorted(unique_words):
        if word not in ignored:
            print("'{0}' occurs in lines {1}".format(word, occurs_at(my_dict, word)))
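The snippets above are written for Python 2 (raw_input, iteritems, itervalues). As a rough Python 3 sketch of the same idea, the word-to-lines mapping can be built in one pass with collections.defaultdict. The function name index_words is hypothetical, and the lines are passed in as a list instead of being read interactively.

```python
from collections import defaultdict


def index_words(lines, ignored=()):
    """Map each word to the 1-based numbers of the lines it appears on."""
    index = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # set() so a word repeated within one line is recorded only once.
        for word in set(line.split()):
            if word not in ignored:
                index[word].append(number)
    return dict(index)


# Same sample data as the dictionary {1: 'a x', 2: 'a y', 3: 'b x'}.
index = index_words(['a x', 'a y', 'b x'], ignored={'a'})
for word in sorted(index):
    print("'{0}' occurs in lines {1}".format(word, index[word]))
```

Building the index once avoids rescanning every line for every word, which is what calling occurs_at inside the loop does.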

Matching input letters with a dictionary in Python

I'm trying to make a program that reads in words from a .txt file, lets the user input letters of their own choosing, and prints out all the matching words.
This is what I have so far:
fil = open("example.txt", "r")
words = fil.readlines()
letters = raw_input("Type in letters: ")
compare = set(letters)
lista = []
for a_line in words:
    a_line = a_line.strip()
    lineword = set(a_line)
    if compare >= lineword:
        lista.append(a_line)
print lista
Now this works only to a certain degree. It does match the user input against the content of the .txt file, but I want it to be more precise. For example:
If I put in "hrose" it will find "horse", but it will also find "roses" with two s, since it only compares which letters are present and not how many of each.
How can I make the program use only the specified letters?

You can use Counter:
from collections import Counter
def compare(query, word):
    query_count = Counter(query)
    word_count = Counter(word)
    return all([query_count[char] >= word_count[char] for char in word])
>>> compare("hrose", "rose")
True
>>> compare("hrose", "roses")
False
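The same multiset check can also be sketched with Counter subtraction, which keeps only positive counts: if nothing is left after subtracting the available letters from the word's letters, the word can be built. This is an illustrative Python 3 sketch. The function name can_build and the candidates list are made up, and the list stands in for the lines of the .txt file.

```python
from collections import Counter


def can_build(letters, word):
    """True if word uses each letter no more often than it appears in letters."""
    # Subtraction drops zero and negative counts, so an empty result means
    # every letter of word is covered by letters.
    return not (Counter(word) - Counter(letters))


# Hypothetical stand-in for the stripped lines of example.txt.
candidates = ["horse", "roses", "rose", "shore"]
matches = [word for word in candidates if can_build("hrose", word)]
print(matches)
```

Here "roses" is rejected because it needs two s but only one was typed, which is exactly the case the set comparison in the question lets through.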

Counters are your friend:
from collections import Counter
fil = open("example.txt", "r")
words = [(a.strip(), Counter(a.strip())) for a in fil.readlines()]
letters = raw_input("Type in letters: ")
letter_count = Counter(letters)
word_list = []
for word, word_count in words:
    if all([letter_count[char] >= word_count[char] for char in word]):
        word_list.append(word)
print word_list
Looking at the comments, it's possible you only want exact matches (words that use every typed letter exactly once each). If so, you don't even need a Counter:
fil = open("example.txt", "r")
words = [(a.strip(), sorted(a.strip())) for a in fil.readlines()]
letters = sorted(raw_input("Type in letters: "))
word_list = [word for word, sorted_word in words if letters == sorted_word]
print word_list

You can build a dictionary for each word, with the letters as keys and the number of times each letter occurs in the word as values.
Then just compare the two dictionaries.
fil = open("example.txt", "r")
words = fil.readlines()
letters = raw_input("Type in letters: ")
compare = list(letters)
letter_dict = {}
for letter in compare:
    try:
        letter_dict[letter] += 1
    except KeyError:
        letter_dict[letter] = 1  # first occurrence counts as 1
lista = []
for a_line in words:
    a_line = a_line.strip()
    lineword = list(a_line)
    word_dict = {}
    for letter in lineword:
        try:
            word_dict[letter] += 1
        except KeyError:
            word_dict[letter] = 1  # first occurrence counts as 1
    flag = True
    for key, value in letter_dict.items():
        if key not in word_dict or word_dict[key] < value:
            flag = False
            break
    if flag:
        lista.append(a_line)
print lista

One approach you could follow is to use set functions.
Either use issubset/issuperset:
set("horse").issubset(set("hrose")) # returns True
set("horse").issubset(set("roses")) # returns False
or
set("horse").difference(set("hrose")) # returns an empty set; the size of the result tells you how close the match is
set("horse").difference(set("roses")) # returns set(['h'])
In the second approach, if you have to choose among multiple candidates, you could pick the one whose difference is smallest.
