Printing list of words from dictionary - python

(Sorry for the long question)
I need to write a print_most_common() function that is passed two parameters: a dictionary containing words and their corresponding frequencies, e.g.
{"fish":9, "parrot":8, "frog":9, "cat":9, "stork":1, "dog":4, "bat":9, "rat":3}
and an integer, the required number of characters. The function finds all the words of the required length that are keys of the dictionary and have the highest frequency among words of that length. It first prints the word length (the second parameter) followed by " letter keywords: ", then prints the list of those words, followed by the frequency value. The list of words must be sorted alphabetically.
e.g.
word_frequencies = {"fish":9, "parrot":8, "frog":9, "cat":9,
                    "stork":1, "dog":4, "bat":9, "rat":3}
print_most_common(word_frequencies, 3)
print_most_common(word_frequencies, 4)
print_most_common(word_frequencies, 5)
Will print:
3 letter keywords: ['bat', 'cat'] 9
4 letter keywords: ['fish', 'frog'] 9
5 letter keywords: ['stork'] 1
How would I define the print_most_common(words_dict, word_len) function?

How about this:
freq_dict = {k: v for k, v in word_frequencies.items() if len(k) == word_len}
For example:
>>> freq_dict = {k: v for k, v in word_frequencies.items() if len(k) == 3}
>>> print(freq_dict)
{'bat': 9, 'cat': 9, 'dog': 4, 'rat': 3}

This works in both Python 2 and Python 3.
Asav's answer shows how to get a dictionary of the words of length word_len and their frequencies. You can then take the maximum of those frequencies and collect the words that have it, sorted alphabetically.
def print_most_common(words_dict, word_len):
    # Keep only the words of the requested length.
    wl_dict = {k: v for k, v in words_dict.items() if len(k) == word_len}
    # Highest frequency among those words.
    max_value = wl_dict[max(wl_dict, key=wl_dict.get)]
    # All words with that frequency, in alphabetical order.
    res_list = sorted(key for key, val in wl_dict.items() if val == max_value)
    print('%d letter keywords: %s %d' % (word_len, res_list, max_value))
Do let me know if you want further decomposition or explanation.

Here is a possible solution:
Get all words of the required length.
filtered_words = {k: v for k, v in words_dict.items() if len(k) == word_len}
Get the maximum count for that length.
max_count = max(filtered_words.values())
Filter the words with that count and sort them alphabetically.
sorted(k for k, v in filtered_words.items() if v == max_count)
Full code (Python 3):
def print_most_common(words_dict, word_len):
    filtered_words = {k: v for k, v in words_dict.items() if len(k) == word_len}
    max_count = max(filtered_words.values())
    most_common = sorted(k for k, v in filtered_words.items() if v == max_count)
    print(word_len, 'letter keywords:', most_common, max_count)
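The filter, max, and sort steps described above can be put together in one runnable Python 3 sketch. The helper name most_common_of_length and the choice to return an empty list with None when no word has the requested length are assumptions made for illustration; the question does not say what should happen in that case.

```python
def most_common_of_length(words_dict, word_len):
    """Return (sorted words, frequency) for the most frequent words of word_len letters.

    Returns ([], None) when no key has that length (an assumed convention).
    """
    # Step 1: keep only the words of the requested length.
    candidates = {w: f for w, f in words_dict.items() if len(w) == word_len}
    if not candidates:
        return [], None
    # Step 2: highest frequency among those words.
    top = max(candidates.values())
    # Step 3: every word with that frequency, sorted alphabetically.
    return sorted(w for w, f in candidates.items() if f == top), top


def print_most_common(words_dict, word_len):
    words, freq = most_common_of_length(words_dict, word_len)
    if freq is None:
        print(f"{word_len} letter keywords: {words}")
    else:
        print(f"{word_len} letter keywords: {words} {freq}")


word_frequencies = {"fish": 9, "parrot": 8, "frog": 9, "cat": 9,
                    "stork": 1, "dog": 4, "bat": 9, "rat": 3}
print_most_common(word_frequencies, 3)  # 3 letter keywords: ['bat', 'cat'] 9
print_most_common(word_frequencies, 7)  # no 7-letter words in the sample data
```

Separating the lookup from the printing keeps the empty case explicit; calling max() directly on an empty collection would raise a ValueError.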

Related

Appending an integer to dictionary keys, python

I have a Python dictionary with several keys.
Example:
dicOut = dict(zip(keys, values))
for i in keys:
    print(i)
Output:
Trees
Cars
People
.... x n keys
I wish to put a number in front of each key:
1 Trees
2 Cars
3 People
.... x n keys
How do I write the for loop?
So far:
k = len(keys)
x = range(1, k+1)
for j in x:
    for k in keys:
        n = j, '-', k
        print(n)
However, it prints every key once for each number, e.g. 3 keys 3 times. How do I stop it at just the 3 distinct keys?
To print each key with its value:
for key in dicOut:
    print(key, '. ', dicOut[key])
To number the keys, use enumerate:
for i, k in enumerate(dicOut, start=1):
    print(i, k)
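As a small sketch of the enumerate approach, the loop below pairs each key with exactly one number, which avoids the repeated output of the nested loops in the question. The helper name number_keys, the " - " separator, and the sample values are assumptions for illustration only.

```python
def number_keys(mapping, start=1, sep=" - "):
    """Return one 'N<sep>key' string per key, numbered from start (hypothetical helper)."""
    return [f"{i}{sep}{key}" for i, key in enumerate(mapping, start=start)]


# Sample data assumed for illustration.
keys = ["Trees", "Cars", "People"]
values = [10, 20, 30]
dicOut = dict(zip(keys, values))

for line in number_keys(dicOut):
    print(line)
# 1 - Trees
# 2 - Cars
# 3 - People
```

Iterating over a dictionary yields its keys in insertion order on Python 3.7 and later, so the numbering follows the order in which the keys were added.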

ordering the dictionary values in list based on the position in the keys

I have a dictionary where the value is a list of a few substrings from the key which is a string.
For example:
d = {"How are things going": ["going","How"], "What the hell" : ["What", "hell"], "The police dept": ["dept","police"]}
I want to get a list of lists in which each value list is ordered by the position at which its items appear in the key. For example, in the case above:
output = [["How", "going"], ["What", "hell"], ["police", "dept"]]
I did not find an efficient way to do it so I used a hacky approach:
final_output = []
for key, value in d.items():
    if len(value) > 1:
        new_list = []
        for item in value:
            # Store each item together with its position in the key.
            new_list.append((item, key.find(item)))
        new_list.sort(key=lambda x: x[1])
        ordered_list = [i[0] for i in new_list]
        final_output.append(ordered_list)
Use sorted with str.find:
[sorted(v, key=k.find) for k, v in d.items()]
Output:
[['How', 'going'],
 ['What', 'hell'],
 ['police', 'dept']]
Using a list comprehension:
output = [[each_word for each_word in key.split() if each_word in value] for key, value in d.items()]
The words in each key are already in the right order, so we can skip sorting.
We can split the key on " " and filter the resulting words by the associated value in the dict:
d = {"How are things going": ["going","How"], "What the hell" : ["What", "hell"], "The police dept": ["dept","police"]}
lists = []
for k in d.keys():
    to_replace = k.split(" ")
    replaced = filter(lambda x: x in d[k], to_replace)
    lists.append(list(replaced))
print(lists)
Output:
[['How', 'going'], ['What', 'hell'], ['police', 'dept']]
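One difference between the approaches above may be worth a short sketch: str.find matches a substring anywhere in the key, while the split-based versions only match whole space-separated words. The helper names order_by_find and order_by_split, and the extra key "the cat he saw", are made up for illustration.

```python
def order_by_find(key, words):
    """Sort words by the index at which each first occurs in key (substring match)."""
    return sorted(words, key=key.find)


def order_by_split(key, words):
    """Keep the whole space-separated words of key that are listed in words."""
    wanted = set(words)
    return [w for w in key.split() if w in wanted]


d = {"How are things going": ["going", "How"],
     "What the hell": ["What", "hell"],
     "The police dept": ["dept", "police"]}

# Both approaches agree on the data from the question.
print([order_by_find(k, v) for k, v in d.items()])
print([order_by_split(k, v) for k, v in d.items()])

# Hypothetical key where they differ: "he" first occurs inside "the".
print(order_by_find("the cat he saw", ["he", "cat"]))   # ['he', 'cat']
print(order_by_split("the cat he saw", ["he", "cat"]))  # ['cat', 'he']
```

If the values are always whole words from the key, the split-based version is probably the safer choice; if they can be arbitrary substrings, only the find-based version applies.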

How to use loop to get the word frequency of a list object and store in a dict object?

I have a list called data and a dict object called word_count. Before converting the frequencies into unique integers, I want to return a dict object word_count (expected format: {'marjori': 1, 'splendid': 1, ...}) and then sort it by frequency.
data = [['marjori',
         'splendid'],
        ['rivet',
         'perform',
         'farrah',
         'fawcett']]
def build_dict(data, vocab_size = 5000):
    word_count = {}
    for w in data:
        word_count.append(data.count(w)) ????
    #print(word_count)
    # How can I sort the words so that sorted_words[0] is the most frequently appearing word and sorted_words[-1] is the least frequently appearing word?
    sorted_words = ??
I'm new to Python. Can someone help me? Thanks in advance. (I only want to use the numpy library and for loops.)
For each word, you need to create a dict entry if it doesn't exist yet, or add 1 to its value if it does. Since data is a list of lists, loop over each inner list as well:
word_count = dict()
for words in data:
    for w in words:
        if w in word_count:
            word_count[w] += 1
        else:
            word_count[w] = 1
Then you can sort your dictionary by value, most frequent first:
word_count = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1], reverse=True)}
The last part of your code is hard to follow, but if you only want to count the words, store the counts in a dictionary, and sort it by frequency in descending order, I would suggest using defaultdict, like this:
from collections import defaultdict
data = ['marjori',
        'splendid',
        'rivet',
        'farrah',
        'perform',
        'farrah',
        'fawcett']
def build_dict(data, vocab_size = 5000):
    """Return a dictionary of word counts, sorted from most to least frequent."""
    word_count = defaultdict(int) # A dict storing the words along with how often they occur
    for w in data:
        word_count[w] += 1
    # Sort so that the most frequently appearing word comes first.
    sorted_words = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1], reverse=True)}
    return sorted_words
print(build_dict(data))
Output:
{'farrah': 2, 'marjori': 1, 'splendid': 1, 'rivet': 1, 'perform': 1, 'fawcett': 1}

How to find multiple maximums in a dictionary

I'm trying to analyse some strings and count the number of times each one comes up. The counts are stored in a dictionary. If I use the max function, only the first key with the highest count is printed.
count = {"cow": 4, "moo": 4, "sheep": 1}
print(max(count.keys(), key=lambda x: count[x]))
cow
This yields cow as the max. How would I get both "cow" and "moo" printed?
count = {"cow": 4, "moo": 4, "sheep": 1}
cow, moo
Why not keep it simple?
mx = max(count.values())
print([k for k, v in count.items() if v == mx])
# ['cow', 'moo']
The bracketed expression in the second line is a list comprehension, essentially shorthand for a for loop that runs over one list-like object (an "iterable") and creates a new list as it goes along. A subtlety in this case is that there are two loop variables (k and v), which are assigned together by tuple unpacking (.items() yields (key, value) pairs one after the other). To summarize, the list comprehension here is roughly equivalent to:
result = []
for k, v in count.items():
    if v == mx:
        result.append(k)
But the list comprehension will usually run faster and is also easier to read once you get used to it.
Just group the counts with a defaultdict, and take the maximum:
from collections import defaultdict
count = {"cow": 4, "moo": 4, "sheep": 1}
d = defaultdict(list)
for animal, cnt in count.items():
    d[cnt].append(animal)
print(dict(d))
# {4: ['cow', 'moo'], 1: ['sheep']}
print(max(d.items())[1])
# ['cow', 'moo']

Python key error in function

I have a function that should return the number of words that contain each vowel (all lowercase), but I keep getting a key error. I'd appreciate any help in figuring it out. Thank you.
def vowelUseDict(t):
    '''computes and returns a dictionary with the number of words in t containing each vowel
    '''
    vowelsUsed = {}
    strList = t.split()
    newList = []
    vowels ='aeiou'
    for v in vowels:
        for strs in strList:
            if v in strs and strs not in newList:
                newList.append(strs)
                vowelsUsed[v] = 1
            if v in strs and strs in newList:
                vowelsUsed[v] += 1
    return vowelsUsed
text = 'like a vision she dances across the porch as the radio plays'
print(vowelUseDict(text))
#{'e': 5, 'u': 0, 'o': 4, 'a': 6, 'i': 3}
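from collections import Counter
def vowelUseDict(t):
    vowels = 'aeiou'
    # set() de-duplicates the letters of each word, so a vowel counts once per word.
    cnt = sum(map(Counter, map(set, t.split())), Counter())
    # A Counter returns 0 for keys it has not seen.
    return {k: cnt[k] for k in vowels}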
That's because newList keeps the words from the previous vowels. Once you reach "like" for "i", the word is already in newList since it was added for "e". This means that the code tries to add to the value for key "i" in vowelsUsed, which doesn't exist yet (the key only gets created the first time a word is found that hasn't already been added for another vowel).
Since (judging by the last line) you want every vowel to be in the resulting dict, you can just create the dict with all the vowels as keys and the values as zero, and then you don't even have to check whether a key exists. Just increase the value by one if the word contains the vowel.
The resulting code would be something like this:
def vowelUseDict(t):
    '''computes and returns a dictionary with the number of words in t containing each vowel
    '''
    strList = t.split()
    vowels ='aeiou'
    vowelsUsed = {v: 0 for v in vowels}
    for v in vowels:
        for strs in strList:
            if v in strs:
                vowelsUsed[v] += 1
    return vowelsUsed
text = 'like a vision she dances across the porch as the radio plays'
print(vowelUseDict(text))
#{'e': 5, 'u': 0, 'o': 4, 'a': 6, 'i': 3}
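The same idea, counting the words that contain each vowel without tracking which words were already seen, can also be written as a single dictionary comprehension. This is only a compact sketch of the fix described above; the snake_case name vowel_use_dict and the vowels parameter are assumptions, not part of the original code.

```python
def vowel_use_dict(text, vowels="aeiou"):
    """Map each vowel to the number of words in text that contain it."""
    words = text.split()
    # `v in w` is True or False; summing booleans counts the True values.
    return {v: sum(v in w for w in words) for v in vowels}


text = 'like a vision she dances across the porch as the radio plays'
print(vowel_use_dict(text))
# {'a': 6, 'e': 5, 'i': 3, 'o': 4, 'u': 0}
```

Because every vowel gets a key up front, a vowel that appears in no word (here "u") maps to 0 instead of raising a KeyError.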