How to count occurrences of key in list of dictionaries - python

I know this is a frequently asked question, but I do not have access to collections.Counter because I'm using Python 2.6. I want to count the number of times a specific key appears in a list of dictionaries.
If my list looks like this:
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
How would I find out how many times "a" appears? I've tried using len, but that only returns the number of values for one key.
len(data['a'])

You can use a list comprehension.
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
sum([1 for d in data if 'a' in d])
Explanation:
For each dictionary d in data, the comprehension checks whether the key 'a' is present; if it is, a 1 is added to the new list. sum then adds up that list, which gives the number of dictionaries that contain 'a'.
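The same count can be written without building the intermediate list by using a generator expression (these exist since Python 2.4, so this should also work on 2.6). A minimal sketch; the helper name count_key_occurrences is made up for illustration:

```python
def count_key_occurrences(key, dict_list):
    # Add 1 for every dictionary that contains the key.
    return sum(1 for d in dict_list if key in d)

data = [{'a': 1, 'b': 1}, {'a': 1, 'c': 1}, {'b': 1, 'c': 1},
        {'a': 1, 'c': 1}, {'a': 1, 'd': 1}]
print(count_key_occurrences('a', data))  # 4
```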

You won't have access to collections.Counter (it was added in Python 2.7), but collections.defaultdict was added in Python 2.5.
First, collect the keys and flatten them into one list:
data = [j for i in data for j in i.keys()]
# ['a', 'b', 'a', 'c', 'c', 'b', 'a', 'c', 'a', 'd']
Then count them with collections.defaultdict:
from collections import defaultdict
dct = defaultdict(int)
for key in data:
    dct[key] += 1
# defaultdict(<type 'int'>, {'a': 4, 'c': 3, 'b': 2, 'd': 1})
If you only need the count for a, there are simpler ways to do this, but this will give you the counts of all keys in your list of dictionaries.
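The two steps above can also be folded into a single pass that leaves the original data list untouched. This is only a sketch; the function name count_all_keys is an assumption, not part of the answer:

```python
from collections import defaultdict

def count_all_keys(dict_list):
    counts = defaultdict(int)
    for d in dict_list:
        # Iterating over a dict yields its keys.
        for key in d:
            counts[key] += 1
    return dict(counts)

data = [{'a': 1, 'b': 1}, {'a': 1, 'c': 1}, {'b': 1, 'c': 1},
        {'a': 1, 'c': 1}, {'a': 1, 'd': 1}]
key_counts = count_all_keys(data)
print(key_counts['a'])  # 4
```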

A one-line solution could be:
len([k for d in data for k in d.keys() if k == 'a'])

For this you could write the following function that would work for data in the structure you provided (a list of dicts):
def count_key(key,dict_list):
    keys_list = []
    for item in dict_list:
        keys_list += item.keys()
    return keys_list.count(key)
Then, you could invoke the function as follows:
data = [{'a':1, 'b':1}, {'a':1, 'c':1}, {'b':1, 'c':1}, {'a':1, 'c':1}, {'a':1, 'd':1}]
count_a = count_key('a',data)
In this case, count_a will be 4.

This question looks very much like a class assignment. Here is a simple bit of code that will do the job:
n=0
for d in data:
    if 'a' in d:
        n+=1
print(n)
Here n is a counter, and the for loop iterates through the list of dictionaries.
The expression 'a' in d evaluates to True if the key 'a' is in the dictionary d, in which case the counter n is incremented. At the end the result is printed. In Python 2.6 print is a statement, so the parentheses are optional there (I am using 3.6).

Related

Dictionary Comprehension in Python for key:[1,2,3] [duplicate]

This question already has answers here:
is it possible to reverse a dictionary in python using dictionary comprehension
(5 answers)
Closed 2 years ago.
While I've been improving my Python skills I have one question.
My code is below:
# def invertDictionary(dict):
#     new_dict = {}
#     for key, value in dict.items():
#         if value in new_dict:
#             new_dict[value].append(key)
#         else:
#             new_dict[value]=[key]
#     return new_dict
def invertDictionary(dict):
    new_dict = {value:([key] if value else [key]) for key, value in dict.items()}
    return new_dict;
invertDictionary({'a':3, 'b':3, 'c':3})
I am trying to get output like {3:['a','b','c']}. I have achieved that using a normal for loop; I just want to know how to get the same result using a dictionary comprehension. I tried, but I get an error when I use append. Please let me know how to achieve this.
Thanks in Advance!
You missed that you also need a list comprehension to build the list.
Iterate over the values in the dict, and build the needed list of keys for each one.
Note that this is a quadratic process, whereas the canonical (and more readable) for loop is linear.
d = {'a':3, 'b':3, 'c':3, 'e':4, 'f':4, 'g':0}
inv_dict = {v: [key for key, val in d.items() if val == v]
            for v in set(d.values())}
result:
{0: ['g'],
 3: ['a', 'b', 'c'],
 4: ['e', 'f']
}
Will this do?
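For comparison, the linear for loop that this answer calls canonical can be shortened a little with dict.setdefault. A sketch only; the name invert_dictionary is chosen here for illustration:

```python
def invert_dictionary(d):
    inverted = {}
    for key, value in d.items():
        # setdefault returns the existing list for this value,
        # or stores and returns a new empty list on first sight.
        inverted.setdefault(value, []).append(key)
    return inverted

inv = invert_dictionary({'a': 3, 'b': 3, 'c': 3, 'e': 4, 'f': 4, 'g': 0})
print(inv)
```

Each item is visited once, so the work grows linearly with the size of the dict.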
While your original version with a regular for loop is the best solution for this, here is a variation on @Prune's answer that doesn't go over the dict multiple times:
>>> import itertools
>>> d = {'a':3, 'b':3, 'c':3, 'e':4, 'f':4, 'g':0}
>>> {group_key:[k for k,_ in dict_items]
...  for group_key,dict_items in itertools.groupby(
...      sorted(d.items(),key=lambda x:x[-1]),
...      key=lambda x:x[-1]
...  )
... }
{0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
>>>
First we sort the items of the dict by value, using a lambda as the key function to extract the value part of each item tuple. Then groupby, called with the same key function, groups the items that share a value. Finally, a list comprehension extracts just the keys from each group.
--
As noted by Kelly, we can make this shorter by using the dict's get method as the key function and relying on the fact that iterating over a dict yields its keys:
>>> {k: list(g) for k, g in itertools.groupby(sorted(d, key=d.get), d.get)}
{0: ['g'], 3: ['a', 'b', 'c'], 4: ['e', 'f']}
>>>
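The sort matters because itertools.groupby only groups consecutive items with equal keys. A small illustration on a plain list of numbers (the sample values are made up):

```python
import itertools

values = [3, 0, 3, 4, 3]

# Without sorting, the three 3s land in separate groups.
unsorted_groups = [(k, list(g)) for k, g in itertools.groupby(values)]
print(unsorted_groups)  # [(3, [3]), (0, [0]), (3, [3]), (4, [4]), (3, [3])]

# After sorting, equal values are adjacent and form one group each.
sorted_groups = [(k, list(g)) for k, g in itertools.groupby(sorted(values))]
print(sorted_groups)    # [(0, [0]), (3, [3, 3, 3]), (4, [4])]
```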
You could use a defaultdict and the append method.
from collections import defaultdict
dict1 = {'a': 3, 'b': 3, 'c': 3}
dict2 = defaultdict(list)
{dict2[v].append(k) for k, v in dict1.items()}
dict2
>>> defaultdict(list, {3: ['a', 'b', 'c']})

correcting unhashable type: 'dict_keys'

I'm trying to find the max value in a nested dictionary, but I get an unhashable type: 'dict_keys' error.
Suppose I have this dictionary:
d = {'A': {'a':2, 'b':2, 'c':0},
     'B': {'a':2, 'b':0, 'c':1}}
I want the code to return all the key(s) that hold the maximum value within an inner dictionary (i.e. the maximum value in dictionary A is 2, and I want the code to return the corresponding keys, 'a' and 'b'):
['a','b']
here is the code I wrote:
max_value = max(d[Capital_Alph].values()))
return [key for key, value in d[Capital_Alph].items()
        if value == max_value]
So you have a dictionary with a str as key and a dict as value. You can do something like this:
d = {'A': {'a':2, 'b':2, 'c':0},
     'B': {'a':2, 'b':0, 'c':1}}
print(list(d['A'].keys()))
Returns:
['a', 'b', 'c']
Is this a viable solution to what you are trying to accomplish?
You cannot use unhashable types as set elements or dict keys. You can accomplish your task like this:
d = {'A': {'a':2, 'b':2, 'c':0},
     'B': {'a':2, 'b':0, 'c':1}}
max_v = {k:max(d[k].values()) for k in d }  # get the max value of each inner dict
print(max_v)
for inner in max_v:
    print("Max keys of dict {} are: {}".format(inner,
          [k for k,v in d[inner].items() if v == max_v[inner]]))
Output:
{'A': 2, 'B': 2} # max values of inner dicts
Max keys of dict A are: ['a', 'b']
Max keys of dict B are: ['a']
The part [k for k,v in d[inner].items() if v == max_v[inner]] is needed to get all inner keys (if several exist) that share the same maximum value.
There are two errors in your code: there are too many ) characters in your calculation of max_value and you can't use return outside a function.
But if I fix those issues and do this:
>>> d = {'A': {'a':2, 'b':2, 'c':0},
...      'B': {'a':2, 'b':0, 'c':1}}
>>> Capital_Alph = "A"
>>> max_value = max(d[Capital_Alph].values())
>>> [key for key, value in d[Capital_Alph].items()
...  if value == max_value]
['a', 'b']
it's clear that there isn't a lot else wrong here. To avoid complicating things I didn't put the obvious loop around this:
for Capital_Alph in d:
but you can manage that on your own. Your error message is because you tried to make Capital_Alph a dict_keys object, in other words d.keys(), and use that as a key. You can't do that. You have to step through the list of dictionary keys yourself.
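For reference, here is a sketch that first reproduces the error by using the whole keys view as a key, and then wraps the working snippet in the loop mentioned above. The function name max_keys_per_group is an assumption made for this example:

```python
d = {'A': {'a': 2, 'b': 2, 'c': 0},
     'B': {'a': 2, 'b': 0, 'c': 1}}

# Using d.keys() itself as a key fails because a keys view is unhashable.
try:
    d[d.keys()]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

def max_keys_per_group(nested):
    result = {}
    # Step through the outer keys one at a time.
    for outer_key, inner in nested.items():
        max_value = max(inner.values())
        result[outer_key] = [k for k, v in inner.items() if v == max_value]
    return result

print(max_keys_per_group(d))
```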

How to get multiple max key values in a dictionary?

Let's say I have a dictionary:
data = {'a':1, 'b':2, 'c': 3, 'd': 3}
I want to get the maximum value(s) in the dictionary. So far, I have been just doing:
max(zip(data.values(), data.keys()))[1]
but I'm aware that I could be missing another max value. What would be the most efficient way to approach this?
Based on your example, it seems like you're looking for the key(s) which map to the maximum value. You could use a list comprehension:
[k for k, v in data.items() if v == max(data.values())]
# ['c', 'd']
If you have a large dictionary, break this into two lines to avoid calculating max for as many items as you have:
mx = max(data.values())
[k for k, v in data.items() if v == mx]
In Python 2.x, .items() works too but builds a full list of pairs; .iteritems() avoids that copy.
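Wrapped in a function, the two-line version might look like the sketch below. The name keys_with_max_value and the choice to return an empty list for an empty dictionary are assumptions; max() on an empty sequence would otherwise raise ValueError.

```python
def keys_with_max_value(data):
    # Assumption: an empty dict yields an empty list instead of an error.
    if not data:
        return []
    mx = max(data.values())  # computed once, not once per item
    return [k for k, v in data.items() if v == mx]

print(keys_with_max_value({'a': 1, 'b': 2, 'c': 3, 'd': 3}))
```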
You could try collecting reverse value -> key pairs in a defaultdict, then output the values with the highest key:
from collections import defaultdict
def get_max_value(data):
    d = defaultdict(list)
    for key, value in data.items():
        d[value].append(key)
    return max(d.items())[1]
Which outputs:
>>> get_max_value({'a':1, 'b':2, 'c': 3, 'd': 3})
['c', 'd']
>>> get_max_value({'a': 10, 'b': 10, 'c': 4, 'd': 5})
['a', 'b']
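If building the whole reverse mapping feels like more than is needed, a single pass that tracks only the best value seen so far is another option. A sketch under that assumption; the function name max_keys_single_pass is hypothetical:

```python
def max_keys_single_pass(data):
    best_value = None
    best_keys = []
    for key, value in data.items():
        if not best_keys or value > best_value:
            # New maximum: start a fresh list of keys.
            best_value = value
            best_keys = [key]
        elif value == best_value:
            # Tie with the current maximum: remember this key too.
            best_keys.append(key)
    return best_keys

print(max_keys_single_pass({'a': 10, 'b': 10, 'c': 4, 'd': 5}))
```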
First of all, find the max value that occurs in the dictionary. Then collect every key that maps to that value, with something like this:
data = {'a':1, 'b':2, 'c': 3, 'd': 3}
max_value = max(data.values())  # max(data) alone would return the largest key, not the largest value
keys_with_max_value = []
for letter in data:
    if data.get(letter) == max_value:
        keys_with_max_value.append(letter)
print (keys_with_max_value)
Please let me know if that's not what you are trying to do and I will guide you through the right process.

Update a dictionary with values from a list in Python

I have a Dictionary here:
dic = {'A':1, 'B':6, 'C':42, 'D':1, 'E':12}
and a list here:
lis = ['C', 'D', 'C', 'C', 'F']
What I'm trying to do (it is also a homework requirement) is to check whether each value in lis matches a key in dic. If it does, that key's value is incremented by 1 (for example, there are three 'C's in lis, so in the output 'C' should be 45). If not, a new item is created in dic with the value 1.
So the example output should look like this:
dic = {'A':1, 'B':6, 'C':45, 'D':2, 'E':12, 'F':1}
Here's what my code is:
def addToInventory(dic, lis):
    for k,v in dic.items():
        for i in lis:
            if i == k:
                dic[k] += 1
            else:
                dic[i] = 1
        return dic
and execute by this code:
dic = addToInventory(dic,lis)
It runs without error, but the output is strange: it added the missing F to dic but didn't update the values correctly.
dic = {'A':1, 'B':6, 'C':1, 'D':1, 'E':12, 'F':1}
What am I missing here?
There's no need to iterate over a dictionary when it supports direct key lookup. You can use if x in dict to do this. Furthermore, your return statement needs to be outside the loop.
Try, instead:
def addToInventory(dic, lis):
    for i in lis:
        if i in dic:
            dic[i] += 1
        else:
            dic[i] = 1
    return dic
out = addToInventory(dic, lis)
print(out)
{'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}
As Harvey suggested, you can shorten the function a little by making use of dict.get.
def addToInventory(dic, lis):
    for i in lis:
        dic[i] = dic.get(i, 0) + 1
    return dic
The dic.get function takes two parameters: the key, and a default value to be returned if the key is not in the dictionary.
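A quick illustration of how get behaves, using a made-up inventory dictionary:

```python
inventory = {'A': 1}

present = inventory.get('A', 0)      # key exists: its value is returned
missing = inventory.get('Z', 0)      # key missing: the default is returned
no_default = inventory.get('Z')      # without a default, None is returned

print(present, missing, no_default)  # 1 0 None
print('Z' in inventory)              # False: get() never inserts the key
```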
If your professor allows the use of libraries, you can use the collections.Counter data structure, which is meant precisely for keeping counts.
from collections import Counter
c = Counter(dic)
for i in lis:
    c[i] += 1
print(dict(c))
{'A': 1, 'B': 6, 'C': 45, 'D': 2, 'E': 12, 'F': 1}
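The explicit loop can also be replaced by Counter.update, which adds one count per element when given an iterable. A short sketch using the same sample data (the variable name updated is chosen here):

```python
from collections import Counter

dic = {'A': 1, 'B': 6, 'C': 42, 'D': 1, 'E': 12}
lis = ['C', 'D', 'C', 'C', 'F']

c = Counter(dic)   # start from the existing counts
c.update(lis)      # add 1 for each occurrence in lis
updated = dict(c)
print(updated)
```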

Python remove duplicate value in a combined dictionary's list

I need a little bit of homework help. I have to write a function that combines several dictionaries into a new dictionary. If a key appears more than once, the value corresponding to that key in the new dictionary should be a list of unique values. As an example, this is what I have so far:
f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
def merge(*d):
    newdicts={}
    for dict in d:
        for k in dict.items():
            if k[0] in newdicts:
                newdicts[k[0]].append(k[1])
            else:
                newdicts[k[0]]=[k[1]]
    return newdicts
combined = merge(f, g, h, r)
print(combined)
The output looks like:
{'a': ['apple', 'adam'], 'c': ['cat', 'car'], 'b': ['bat', 'bat', 'boy'], 'e': ['elephant'], 'd': ['dog', 'deer']}
Under the 'b' key, 'bat' appears twice. How do I remove the duplicates?
I've looked at filter and lambda, but I couldn't figure out how to use them here (maybe because it's a list in a dictionary?).
Any help would be appreciated. And thank you in advance for all your help!
Just test for the element inside the list before adding it:
for k in dict.items():
    if k[0] in newdicts:
        if k[1] not in newdicts[k[0]]:   # Do this test before adding.
            newdicts[k[0]].append(k[1])
    else:
        newdicts[k[0]]=[k[1]]
And since you want only unique elements in the value list, you can use a set as the value instead. You can also use a defaultdict here, so that you don't have to test for key existence before adding.
Also, don't use built-in names for your variables. Use some other name instead of dict.
So, you can modify your merge method as:
from collections import defaultdict
def merge(*d):
    newdicts = defaultdict(set)  # Define a defaultdict
    for each_dict in d:
        # dict.items() returns (k, v) tuples,
        # so you can unpack each tuple directly into two loop variables.
        for k, v in each_dict.items():
            newdicts[k].add(v)
    # And if you want the exact representation that you have shown,
    # you can build a normal dict out of your newly built dict.
    unique = {key: list(value) for key, value in newdicts.items()}
    return unique
>>> import collections
>>> import itertools
>>> uniques = collections.defaultdict(set)
>>> for k, v in itertools.chain(f.items(), g.items(), h.items(), r.items()):
...     uniques[k].add(v)
...
>>> uniques
defaultdict(<type 'set'>, {'a': set(['apple', 'adam']), 'c': set(['car', 'cat']), 'b': set(['boy', 'bat']), 'e': set(['elephant']), 'd': set(['deer', 'dog'])})
Note the results are in a set, not a list -- far more computationally efficient this way. If you would like the final form to be lists then you can do the following:
>>> {x: list(y) for x, y in uniques.items()}
{'a': ['apple', 'adam'], 'c': ['car', 'cat'], 'b': ['boy', 'bat'], 'e': ['elephant'], 'd': ['deer', 'dog']}
In your for loop add this:
for dict in d:
    for k in dict.items():
        if k[0] in newdicts:
            # This line below
            if k[1] not in newdicts[k[0]]:
                newdicts[k[0]].append(k[1])
        else:
            newdicts[k[0]]=[k[1]]
This makes sure duplicates aren't added.
Use set when you want unique elements:
def merge_dicts(*d):
    result={}
    for dict in d:
        for key, value in dict.items():
            result.setdefault(key, set()).add(value)
    return result
Try to avoid using indices; unpack tuples instead.
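One thing to keep in mind with the set-based versions is that sets do not preserve insertion order, so converting them back to lists may give the values in any order. If first-seen order matters, a sketch that keeps lists and unpacks the tuples could look like this (the name merge_unique is an assumption):

```python
def merge_unique(*dicts):
    merged = {}
    for each_dict in dicts:
        for key, value in each_dict.items():
            values = merged.setdefault(key, [])
            # Only append values that are not already in the list.
            if value not in values:
                values.append(value)
    return merged

f = {'a': 'apple', 'c': 'cat', 'b': 'bat', 'd': 'dog'}
g = {'c': 'car', 'b': 'bat', 'e': 'elephant'}
h = {'b': 'boy', 'd': 'deer'}
r = {'a': 'adam'}
combined = merge_unique(f, g, h, r)
print(combined['b'])  # ['bat', 'boy']
```

The membership test on a list is linear in the list length, so this trades some speed for predictable ordering; for short value lists that is unlikely to matter.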
