Word Frequency Counter Python - python

Struggling with this exercise which must use a dictionary and count the number of times each word appears in a number of user inputs. It works in a fashion, but does not atomise each word from each line of user input. So instead of counting an input of 'happy days' as 1 x happy and 1 x days, it gives me 1 x happy days. I have tried split() along with the lower() but this converts the input to a list and I am struggling with then pouring that list into a dictionary.
As you may have guessed, I'm a bit of a novice, so all help would be greatly appreciated!
occurrences = {}
while True:
    word = input('Enter line: ')
    word = word.lower() #this is also where I have tried a split()
    if word =='':
        break
    occurrences[word]=occurrences.get(word,0)+1
for word in (occurrences):
    print(word, occurrences[word])
EDIT
Cheers for responses. This ended up being the final solution. They weren't worried about case and wanted the final results sorted().
occurrences = {}
while True:
    words = input('Enter line: ')
    if words =='':
        break
    for word in words.split():
        occurrences[word]=occurrences.get(word,0)+1
for word in sorted(occurrences):
    print(word, occurrences[word])

What you have is almost there. You just need to split each line and loop over the words when adding them to the dict:
occurrences = {}
while True:
    words = input('Enter line: ')
    words = words.lower()
    if words =='':
        break
    for word in words.split(): # split() turns the line into a list of words
        occurrences[word]=occurrences.get(word,0)+1
for word in (occurrences):
    print(word, occurrences[word])

If this line is indented under the if, it never gets executed: occurrences[word]=occurrences.get(word,0)+1
When the if is entered, break exits the loop before that line is reached. To move it outside the if, dedent it to the same level as the if.
In general, the indentation of the posted code is messed up; I guess it's not really like that in your actual code.

Do you want line-by-line stats or overall stats? I'm guessing you want line by line, but you can also get overall stats easily by commenting and uncommenting a few lines in the following code:
# occurrences = dict() # create the dictionary here if you want incremental overall stats
while True:
    words = input('Enter line: ')
    if words =='':
        break
    word_list = words.lower().split()
    print(word_list)
    occurrences = dict() # create the dict here if you want line-by-line stats
    for word in word_list:
        occurrences[word] = occurrences.get(word,0)+1
    ## use the lines below if you want line-by-line stats
    for k,v in occurrences.items():
        print(k, " X ", v)
## use the lines below if you want overall stats
# for k,v in occurrences.items():
#     print(k, " X ", v)
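As a side note, the counting loop in these answers can also be sketched with collections.Counter from the standard library. Counter is a dict subclass, so it may or may not satisfy an exercise that insists on a plain dictionary. The function name count_words and the sample lines are made up for illustration, and the user input is replaced by a list so the sketch runs unattended.

```python
from collections import Counter

def count_words(lines):
    """Count how often each lower-cased word appears across all lines."""
    occurrences = Counter()
    for line in lines:
        # split() with no arguments breaks on any run of whitespace
        occurrences.update(line.lower().split())
    return occurrences

# Hypothetical sample input standing in for the lines typed by the user
sample_lines = ["happy days", "Happy birthday", "days go by"]
counts = count_words(sample_lines)
for word in sorted(counts):
    print(word, counts[word])
```

Passing a list to update() adds one to the count of each element, which is the same bookkeeping as occurrences.get(word, 0) + 1 in the answers above.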

Related

How to choose a random word from a list in a file with a specific length in Python

I'm very new to Python. Actually, I'm not even a programmer, I'm a doctor :), and as a way to practice I decided to write my own version of hangman.
After some research I couldn't find any way to use the "random" module to return a word with a specific length. As a workaround, I wrote a routine that keeps trying random words until it finds one of the right length. It worked for the game, but I'm sure it's a bad solution, and of course it hurts performance. So, could someone give me a better solution? Thanks.
Here is my code:
import random
def get_palavra():
    palavras_testadas = 0
    num_letras = int(input("Choose the number of letters: "))
    while True:
        try:
            palavra = random.choice(open("wordlist.txt").read().split())
            escolhida = palavra
            teste = len(list(palavra))
            if teste == num_letras:
                return escolhida
            else:
                palavras_testadas += 1
                if palavras_testadas == 100: # in large wordlists this number must be higher
                    print("Unfortunately there are no words with {} letters...".format(num_letras))
                    break
                else:
                    continue
        except ValueError:
            pass
forca = get_palavra()
print(forca)
You may:
1. read the file once and store the content
2. remove the newline \n char from each line, because it counts as a character
3. filter first to keep only the lines of the right length, so you never pick from lines of the wrong length
4. if the good_len_lines list has no elements, you know immediately that you can stop; no need to do a hundred picks
5. otherwise, pick a word from good_len_lines
import random
def get_palavra():
    with open("wordlist.txt") as fic: # 1.
        lines = [line.rstrip() for line in fic.readlines()] # 2.
    num_letras = int(input("Choose the number of letters: "))
    good_len_lines = [line for line in lines if len(line) == num_letras] # 3.
    if not good_len_lines: # 4.
        print("Unfortunately there are no words with {} letters...".format(num_letras))
        return None
    return random.choice(good_len_lines) # 5.
Here is a working example. It returns None when no word has the requested length:
import random
def random_word(num_letras):
    all_words = []
    with open('wordlist.txt') as file:
        lines = [ line for line in file.read().split('\n') if line ]
        for line in lines:
            all_words += [word for word in line.split() if word]
    words = [ word for word in all_words if len(word) == num_letras ]
    if words:
        return random.choice(words)
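If the game asks for a word more than once, one possible refinement of the filtering idea above is to group the words by length a single time and then look candidates up by length. This is only a sketch: index_by_length, pick_word and the sample list are hypothetical names and data, and the word list is passed in as a Python list rather than read from wordlist.txt.

```python
import random
from collections import defaultdict

def index_by_length(words):
    """Group words into lists keyed by their length."""
    by_length = defaultdict(list)
    for word in words:
        by_length[len(word)].append(word)
    return by_length

def pick_word(by_length, length):
    """Return a random word of the given length, or None if there is none."""
    candidates = by_length.get(length)  # .get() does not create a missing key
    return random.choice(candidates) if candidates else None

# Hypothetical sample data standing in for the contents of the word file
sample_words = ["cat", "horse", "dog", "zebra", "ox"]
index = index_by_length(sample_words)
print(pick_word(index, 3))
print(pick_word(index, 7))  # no seven-letter word in the sample, so None
```

Building the index costs one pass over the word list; every later pick is a dictionary lookup followed by random.choice.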

How to take out punctuation from string and find a count of words of a certain length?

I am trying to create a function that opens a .txt file and counts the words that have the same length as the number specified by the user.
The .txt file is:
This is a random text document. How many words have a length of one?
How many words have the length three? We have the power to figure it out!
Is a function capable of doing this?
I'm able to open and read the file, but I am unable to exclude punctuation and find the length of each word.
def samplePractice(number):
    fin = open('sample.txt', 'r')
    lstLines = fin.readlines()
    fin.close()
    count = 0
    for words in lstLines:
        words = words.split()
    for i in words:
        if len(i) == number:
            count += 1
    return count
You can try calling replace() on the string once for each punctuation character, replacing it with an empty string ("").
It would look something like this:
puncstr = "Hello!"
nopuncstr = puncstr.replace(".", "").replace("?", "").replace("!", "")
I have written some sample code that removes punctuation and counts the number of words. Modify it according to your requirements.
import re
fin = """This is a random text document. How many words have a length of one? How many words have the length three? We have the power to figure it out! Is a function capable of doing this?"""
fin = re.sub(r'[^\w\s]','',fin)
print(len(fin.split()))
The above code prints the number of words. Hope this helps!!
Instead of cascading replace() calls, just use a single strip() call. Note that strip() only removes characters from the start and end of each word, which is enough for this text.
Edit: a cleaner version
pl = '?!."\'' # punctuation list
def samplePractice(number):
    with open('sample.txt', 'r') as fin:
        words = fin.read().split()
    # clean words
    words = [w.strip(pl) for w in words]
    count = 0
    for word in words:
        if len(word) == number:
            print(word, end=', ')
            count += 1
    return count
result = samplePractice(4)
print('\nResult:', result)
output:
This, text, many, have, many, have, have, this,
Result: 8
Your code is almost OK; the second for block is just in the wrong position. It has to be nested inside the first loop so that it runs for every line:
pl = '?!."\'' # punctuation list
def samplePractice(number):
    fin = open('sample.txt', 'r')
    lstLines = fin.readlines()
    fin.close()
    count = 0
    for words in lstLines:
        words = words.split()
        for i in words:
            i = i.strip(pl) # clean the word by strip
            if len(i) == number:
                count += 1
    return count
result = samplePractice(4)
print(result)
output:
8
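Another way to drop punctuation, sketched here as an alternative to the replace(), regex and strip() approaches above, is str.translate with a table built from string.punctuation. The function name count_words_of_length is hypothetical, and the text is passed in as a string instead of being read from sample.txt.

```python
import string

# Translation table that deletes every ASCII punctuation character
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)

def count_words_of_length(text, number):
    """Count the words in `text` that have exactly `number` characters."""
    words = text.translate(_STRIP_PUNCT).split()
    return sum(1 for word in words if len(word) == number)

sample_text = (
    "This is a random text document. How many words have a length of one?\n"
    "How many words have the length three? We have the power to figure it out!\n"
    "Is a function capable of doing this?"
)
print(count_words_of_length(sample_text, 4))
```

Unlike strip(), this also removes punctuation inside a word, so "don't" would be counted as the four-letter "dont". Whether that is wanted depends on the exercise.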

Length function in python is not working the way I want to

I am a newbie to programming and Python. I looked online for help and I am doing as they say, but I think I am making a mistake that I am not able to catch.
For now, all I'm trying to do is this: if a word in the file has the length that the user entered, add it to a list of such words. It sort of works if I replace userLength with the actual number, but it doesn't work with the variable userLength. I need that list later to develop Hangman.
Any help or recommendation on code will be great.
from collections import Counter
def welcome():
    print("Welcome to the Hangman: ")
    userLength = input ("Please tell us how long word you want to play : ")
    print(userLength)
    text = open("test.txt").read()
    counts = Counter([len(word.strip('?!,.')) for word in text.split()])
    counts[10]
    print(counts)
    for wl in text.split():
        if len(wl) == counts :
            wordLen = len(text.split())
            print (wordLen)
            print(wl)
    filename = open("test.txt")
    lines = filename.readlines()
    filename.close()
    print (lines)
    for line in lines:
        wl = len(line)
        print (wl)
        if wl == userLength:
            words = line
            print (words)
def main ():
    welcome()
main()
The input function returns a string, so you need to turn userLength into an int, like this:
userLength = int(userLength)
As it is, the line wl == userLength is always False.
Re: comment
Here's one way to build that list of words with the correct length:
def welcome():
    print("Welcome to the Hangman: ")
    userLength = int(input("Please tell us how long word you want to play : "))
    words = []
    with open("test.txt") as lines:
        for line in lines:
            word = line.strip() # drop the trailing newline before measuring
            if len(word) == userLength:
                words.append(word)
    return words
In Python 3.x, input() returns a string, so you must convert it to an int first.
userLength = int(input ("Please tell us how long word you want to play : "))
Also, instead of using readlines() you can iterate over the file one line at a time, which is more memory efficient. And use the with statement when handling files, as it automatically closes the file for you:
with open("test.txt") as f:
    for line in f: #fetches a single line each time
        line = line.strip() #remove newline or other whitespace characters
        wl = len(line)
        print (wl)
        if wl == userLength:
            words = line
            print (words)
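To make the type mismatch concrete, here is a small sketch that converts the raw input defensively and then filters the lines. The helper names parse_length and words_of_length and the sample lines are hypothetical; the file is replaced by a list of strings with trailing newlines so the sketch runs without test.txt.

```python
def parse_length(raw):
    """Convert user input to a positive int, or return None if it is invalid."""
    try:
        value = int(raw.strip())
    except ValueError:
        return None
    return value if value > 0 else None

def words_of_length(lines, length):
    """Return the stripped lines whose length equals `length`."""
    stripped = (line.strip() for line in lines)
    return [word for word in stripped if len(word) == length]

# A string never equals an int, which is why the original comparison fails
print("5" == 5)  # False

# Hypothetical lines as they would come from iterating over a file
sample_lines = ["apple\n", "kiwi\n", "grape\n", "fig\n"]
print(words_of_length(sample_lines, parse_length("5")))
```

Stripping each line before measuring matters as well: "apple\n" has six characters, so without strip() even an int comparison would miss it.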

Search string for set of characters

Some background: I've been working through Learn Python the Hard Way and have taken a little break to try a few fun things. I came across a suggestion on DaniWeb to create a program in which you enter a list of characters and it prints out any word that contains all of those characters.
I've figured out how to do it manually; here's the code:
string = raw_input("Please enter the Scrabble letters you have: ")
for line in open('/usr/share/dict/words', 'r').readlines():
    if string[0] in line and string[1] in line and string[2] in line:
        print line,
But I somehow cannot figure out how to get it to work using loops (so that the user can enter a list of characters of any length). I figured something like the below would work, but it doesn't appear to:
while i < len(string)-1:
    if string[i] in line: tally = tally + 1
    i = i + 1
    if tally == len(string)-1: print line
    else: i = 0
Any help in the right direction would be much appreciated, thanks.
I would use all() with a generator expression for this (the generator expression is itself a loop):
user_string = raw_input("Please enter the Scrabble letters you have: ")
for line in open('/usr/share/dict/words', 'r').readlines():
    if all(c in line for c in user_string):
        print line,
Set operations can come in handy here. The <= operator tests whether the input letters are a subset of the letters in the word:
inp = set(raw_input("Please enter the Scrabble letters you have: "))
with open('/usr/share/dict/words', 'r') as words:
    for word in words:
        if inp <= set(word):
            print word,
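Both answers above treat the input as a set of distinct letters, so entering "pp" matches any word with at least one p. If repeated letters should count, as they would with real Scrabble tiles, one possible variation (an assumption about the intent, not something the question asks for) is to compare letter counts with collections.Counter. The sketch is written for Python 3; the function names and the sample word list are hypothetical, and the list stands in for /usr/share/dict/words.

```python
from collections import Counter

def contains_all(word, letters):
    """True if every distinct character in `letters` occurs in `word`."""
    return set(letters) <= set(word)

def contains_all_counted(word, letters):
    """True if `word` has at least as many of each character as `letters`."""
    # Counter subtraction drops zero and negative counts, so an empty
    # result means no letter of `letters` is missing from `word`.
    return not (Counter(letters) - Counter(word))

# Hypothetical sample data standing in for the dictionary file
sample_words = ["apple", "plea", "lap", "pale"]
print([w for w in sample_words if contains_all(w, "pp")])
print([w for w in sample_words if contains_all_counted(w, "pp")])
```

The first list contains every sample word, while the second keeps only the words that really have two p's.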

Accessing certain words in a split list

I am trying to create a program in Python that takes a sentence from a user and jumbles the middle letters of each word while keeping the first and last letters intact. Right now my code rearranges the whole input and ignores the spaces. It works fine for a single-word input; I'll let my code speak for itself.
I need to jumble each word the user enters while leaving the rest of the sentence intact.
import random
words = input("Enter a word or sentence") #Gets user input
words.split()
for i in list(words.split()): #Runs the code for how many words there are
    first_letter = words[0] #Takes the first letter out and defines it
    last_letter = words[-1] #Takes the last letter out and defines it
    letters = list(words[1:-1]) #Takes the rest and puts them into a list
    random.shuffle(letters) #shuffles the list above
    middle_letters = "".join(letters) #Joins the shuffled list
    final_word_uncombined = (first_letter, middle_letters, last_letter) #Puts final word all back in place as a list
    final_word = "".join(final_word_uncombined) #Puts the list back together again
    print(final_word) #Prints out the final word all back together again
Your code is almost right. The main fix is to work on each word inside the loop instead of on the whole input. A corrected version would look like this:
import random
words = input("Enter a word or sentence: ")
jumbled = []
for word in words.split(): #Runs the code for how many words there are
    if len(word) > 2: # Only need to change long words
        first_letter = word[0] #Takes the first letter out and defines it
        last_letter = word[-1] #Takes the last letter out and defines it
        letters = list(word[1:-1]) #Takes the rest and puts them into a list
        random.shuffle(letters) #shuffles the list above
        middle_letters = "".join(letters) #Joins the shuffled list
        word = ''.join([first_letter, middle_letters, last_letter])
    jumbled.append(word)
jumbled_string = ' '.join(jumbled)
print(jumbled_string)
So I read this question during lunch at the apartment, then I had to wade through traffic. Anyway, here is my one-line contribution. Seriously, alexeys' answer is where it's at.
import random
sentence = input("Enter a word or sentence")
print(" ".join([word[0] + ''.join(random.sample(list(word[1:-1]), len(list(word[1:-1])))) + word[-1] if len(word) > 1 else word for word in sentence.split()]))
If I understand your question correctly, it looks like you are on the right track; you just have to extend this to every word:
randomized_words = []
for word in words.split():
    #perform your word jumbling here, storing the result in jumbled_word
    randomized_words.append(jumbled_word)
print(' '.join(randomized_words))
This creates a separate list of jumbled words. Each word in the user's input is jumbled and appended to the list, which preserves the order. At the end, the list is joined and printed, so the words appear in the order the user entered them but with their letters jumbled.
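Putting the pieces from the answers together, the per-word jumbling could be sketched as two small functions. The names jumble_word and jumble_sentence are hypothetical, and the optional rng parameter is an addition of this sketch that lets a seeded random.Random be passed in to get repeatable results.

```python
import random

def jumble_word(word, rng=random):
    """Shuffle the interior letters of `word`, keeping the first and last in place."""
    if len(word) <= 3:
        return word  # at most one interior letter: nothing to shuffle
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def jumble_sentence(sentence, rng=random):
    """Jumble every whitespace-separated word and keep the word order."""
    return " ".join(jumble_word(word, rng) for word in sentence.split())

print(jumble_sentence("reading jumbled words is surprisingly easy"))
```

Because split() breaks on any run of whitespace, repeated spaces in the input collapse to single spaces in the output. Punctuation attached to a word is treated as one of its letters.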
