Find common elements and their frequency in list of dictionaries

I have multiple (~40) lists that contain dictionaries, and I would like to find:
1. which list items (in this case dictionaries) are common to all lists;
2. how many times each unique item appears across all lists.
Some examples of the lists are:
a = [{'A': 0, 'B': 0},
     {'A': 0, 'C': 1},
     {'D': 1, 'C': 0},
     {'D': 1, 'E': 0}]
b = [{'A': 0},
     {'B': 0, 'C': 1},
     {'D': 1, 'C': 0},
     {'D': 1, 'E': 0}]
c = [{'C': 0},
     {'B': 1},
     {'D': 1, 'C': 0, 'E': 0},
     {'D': 1, 'E': 0}]
This is what I have tried so far, but it returns values that are not common to all lists:
def flatten(map_groups):
    items = []
    for group in map_groups:
        items.extend(group)
    return items

def intersection(map_groups):
    unique = []
    items = flatten(map_groups)
    for item in items:
        if item not in unique and items.count(item) > 1:
            unique.append(item)
    return unique

all_lists = [a, b, c]
intersection(all_lists)
What I would expect to get as a result would be:
1. {'D': 1, 'E': 0} as a common item in all lists
2. {'D': 1, 'E': 0}, 3
{'D': 1, 'C': 0}, 2
{'A': 0, 'B': 0}, 1
{'A': 0, 'C': 1}, 1
{'A': 0}, 1
{'B': 0, 'C': 1}, 1
{'C': 0},
{'B': 1},
{'D': 1, 'C': 0, 'E': 0}

To count things, Python comes with a nice class: collections.Counter. The question then is: what exactly do you want to count?
For example, if you want to count the dictionaries that have the same keys and values, you can do something like this (with a, b and c as defined in the question):
>>> from collections import Counter
>>> count = Counter(tuple(sorted(x.items())) for x in a+b+c)
>>> count.most_common(3)
[((('D', 1), ('E', 0)), 3), ((('C', 0), ('D', 1)), 2), ((('A', 0), ('B', 0)), 1)]
The dictionaries here are converted to tuples of sorted items to make them comparable and hashable. Getting, for example, the 3 most common back as a list of dictionaries is also not too hard:
>>> [dict(x[0]) for x in count.most_common(3)]
[{'D': 1, 'E': 0}, {'C': 0, 'D': 1}, {'A': 0, 'B': 0}]
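For the first part of the question (items present in every list), one possible sketch is to build a set of hashable keys per list and intersect those sets. The helper names freeze and common_to_all are made up for this illustration, and the approach assumes that all dictionary values are hashable and that the keys can be sorted.

```python
from collections import Counter

a = [{'A': 0, 'B': 0}, {'A': 0, 'C': 1}, {'D': 1, 'C': 0}, {'D': 1, 'E': 0}]
b = [{'A': 0}, {'B': 0, 'C': 1}, {'D': 1, 'C': 0}, {'D': 1, 'E': 0}]
c = [{'C': 0}, {'B': 1}, {'D': 1, 'C': 0, 'E': 0}, {'D': 1, 'E': 0}]


def freeze(d):
    """Turn a dict into a hashable key that ignores insertion order."""
    return tuple(sorted(d.items()))


def common_to_all(groups):
    """Return the dicts that occur in every list of `groups`."""
    key_sets = [{freeze(d) for d in group} for group in groups]
    shared = set.intersection(*key_sets)
    return [dict(key) for key in shared]


all_lists = [a, b, c]
common = common_to_all(all_lists)

# Frequency of every unique dict across all lists (part 2 of the question).
counts = Counter(freeze(d) for group in all_lists for d in group)

print(common)                             # [{'D': 1, 'E': 0}]
print(counts[freeze({'D': 1, 'C': 0})])   # 2
```

Using one set per list means a dictionary repeated inside a single list is not mistaken for one that is shared between lists, which is where the items.count(item) > 1 test in the question goes wrong.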

You can use a nested for loop:
a = [{'A': 0, 'B': 0},
     {'A': 0, 'C': 1},
     {'D': 1, 'C': 0},
     {'D': 1, 'E': 1}]
b = [{'A': 0},
     {'B': 0, 'C': 1},
     {'D': 1, 'C': 0},
     {'D': 1, 'E': 0}]
c = [{'C': 0},
     {'B': 1},
     {'D': 1, 'C': 0, 'E': 0},
     {'D': 1, 'E': 0}]
abc_list = [*a, *b, *c]
abc = list()
for d in abc_list:
    for i in abc:
        if d == i[0]:
            # d was seen before: increment its count and stop searching
            abc[abc.index(i)] = (d, i[1] + 1)
            break
    else:
        # no match found: first occurrence of d
        abc.append((d, 1))
print(abc)
Output:
[({'A': 0, 'B': 0}, 1),
 ({'A': 0, 'C': 1}, 1),
 ({'D': 1, 'C': 0}, 2),
 ({'D': 1, 'E': 1}, 1),
 ({'A': 0}, 1),
 ({'B': 0, 'C': 1}, 1),
 ({'D': 1, 'E': 0}, 2),
 ({'C': 0}, 1),
 ({'B': 1}, 1),
 ({'D': 1, 'C': 0, 'E': 0}, 1)]
Explanation:
The line
[*a, *b, *c]
unpacks all the values in lists a, b and c into a single list, which I named abc_list.
The break statement leaves the inner for loop as soon as a matching dictionary has been found and its count incremented. The else clause of the for loop runs only when the loop finishes without reaching break, so abc.append((d, 1)) is executed only for a dictionary that has not been seen before.
The above output answers question 2. For question 1, we can use the built-in max() function on the abc list, with a custom key:
print(max(abc, key=lambda x: x[1])[0])
Of course, it will only return one dictionary, {'D': 1, 'C': 0}. If you want to print out all the dictionaries that appear most frequently:
m = max(abc, key=lambda x: x[1])[1]
for d in abc:
    if d[1] == m:
        print(d[0])
Output:
{'D': 1, 'C': 0}
{'D': 1, 'E': 0}

Related

I am getting different value when printing and appending same variable to a list, Why is that?

The first piece of code gives me the output I want, but I want to append dct to a list so that I can use the values later. When I try to do that, it gives me a different output. Why?
lst = [{'a': 1, 'b': 2, 'c': 3}, {'e': 1, 'f': 2, 'g': 3}]
e = 0
while e < len(lst):
    for k in lst[e]:
        dct = {}
        x = lst[e][k]
        for key, value in lst[e].items():
            lst[e][key] = (value - x)
        dct[k] = (lst[e])
        print(dct)
    e += 1
Output:
{'a': {'a': 0, 'b': 1, 'c': 2}}
{'b': {'a': -1, 'b': 0, 'c': 1}}
{'c': {'a': -2, 'b': -1, 'c': 0}}
{'e': {'e': 0, 'f': 1, 'g': 2}}
{'f': {'e': -1, 'f': 0, 'g': 1}}
{'g': {'e': -2, 'f': -1, 'g': 0}}
This is what I tried in order to save it in a list:
e = 0
lst2 = []
while e < len(lst):
    for k in lst[e]:
        dct = {}
        x = lst[e][k]
        for key, value in lst[e].items():
            lst[e][key] = (value - x)
        dct[k] = (lst[e])
        lst2.append(dct)
    e += 1
print(lst2)
But when I print that list, every key in the different dictionaries has the same value:
Output:
[{'a': {'a': -2, 'b': -1, 'c': 0}},
 {'b': {'a': -2, 'b': -1, 'c': 0}},
 {'c': {'a': -2, 'b': -1, 'c': 0}},
 {'e': {'e': -2, 'f': -1, 'g': 0}},
 {'f': {'e': -2, 'f': -1, 'g': 0}},
 {'g': {'e': -2, 'f': -1, 'g': 0}}]

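If you want to use your existing code, change
dct[k] = (lst[e])
to
dct[k] = lst[e].copy()
(Every dct you append holds a reference to the same inner dictionary lst[e], which the later iterations keep modifying. Copying only the outer dictionary with dct.copy() is therefore not enough; the inner dictionary has to be copied. To understand why, read up on lists, references, and mutability.)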
Or, if you want to rewrite your code, you might use
list_ = [{'a': 1, 'b': 2, 'c': 3}, {'e': 1, 'f': 2, 'g': 3}]
result = {}
for d in list_:
    for key, value in d.items():
        result[key] = {k: d[k] - value for k in d}
which gives
>>> print(result)
{'a': {'a': 0, 'b': 1, 'c': 2},
 'b': {'a': -1, 'b': 0, 'c': 1},
 'c': {'a': -2, 'b': -1, 'c': 0},
 'e': {'e': 0, 'f': 1, 'g': 2},
 'f': {'e': -1, 'f': 0, 'g': 1},
 'g': {'e': -2, 'f': -1, 'g': 0},
}
(and if you're a fan of code-golf, here's a one-liner:)
result = {key: {k:d[k]-value for k in d} for d in list_ for key,value in d.items()}
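The behaviour described in the question comes from several containers holding a reference to one mutable dictionary. It can be reproduced in isolation with a minimal sketch like the following (the variable names are illustrative only and do not come from the original code):

```python
inner = {'a': 1}
by_reference = []
by_copy = []

for step in range(3):
    inner['a'] = step                      # mutate the one shared dict
    by_reference.append({'k': inner})      # new outer dict, same inner object
    by_copy.append({'k': inner.copy()})    # new outer dict, snapshot of inner

print(by_reference)  # [{'k': {'a': 2}}, {'k': {'a': 2}}, {'k': {'a': 2}}]
print(by_copy)       # [{'k': {'a': 0}}, {'k': {'a': 1}}, {'k': {'a': 2}}]
```

Printing inside the loop shows the value at that moment, whereas a list of references shows only the final state once the loop has finished.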

Create list with all combination of dictionaries, where key only appears once

I tried different approaches with itertools, but I just can't figure it out.
I need to find different combinations of dictionaries:
letters = ['a','b','c']
combinations = []
for i in range(3):
    for t in letters:
        one_combi = {str(t): i}
        combinations.append(one_combi)
Now I have a list of dictionaries {letter: number}.
Next, I need to create a list of combinations in which each key (letter) appears only once.
Expected output looks something like this:
[{'a':0,'b':0,'c':0},
{'a':1,'b':0,'c':0},
{'a':1,'b':1,'c':0},
{'a':1,'b':1,'c':1},
{'a':2,'b':0,'c':0},
...
{'a':2,'b':2,'c':2}]
Would be great if someone can help me out on this one!

You can generate all combinations of integers from a range derived from the length of the input, and then use zip:
letters = ['a','b','c']
def combos(d, c = []):
    if len(c) == len(d):
        yield dict(zip(letters, c))
    else:
        for i in d:
            yield from combos(d, c+[i])
print(list(combos(range(len(letters)))))
Output:
[{'a': 0, 'b': 0, 'c': 0},
{'a': 0, 'b': 0, 'c': 1},
{'a': 0, 'b': 0, 'c': 2},
{'a': 0, 'b': 1, 'c': 0},
{'a': 0, 'b': 1, 'c': 1},
...
{'a': 2, 'b': 2, 'c': 2}]

What you are looking for is itertools.product:
from itertools import product
lst = []
for a, b, c in product([0, 1, 2], repeat=3):
    lst.append({'a': a, 'b': b, 'c': c})
print(lst)
Output:
[{'a': 0, 'b': 0, 'c': 0},
 {'a': 0, 'b': 0, 'c': 1},
 {'a': 0, 'b': 0, 'c': 2},
 {'a': 0, 'b': 1, 'c': 0},
 {'a': 0, 'b': 1, 'c': 1},
 {'a': 0, 'b': 1, 'c': 2},
 {'a': 0, 'b': 2, 'c': 0},
 {'a': 0, 'b': 2, 'c': 1},...
Update
We can compact everything into a single line using a list comprehension:
from itertools import product
letters = ['a','b','c']
lst = [dict(zip(letters, x)) for x in product(range(len(letters)), repeat=len(letters))]
print(lst)
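If the letters do not all share the same range of values, the same idea still applies. The sketch below assumes a hypothetical choices mapping from each key to its allowed values (this mapping is not part of the question) and passes one iterable per key to itertools.product:

```python
from itertools import product

# Hypothetical input: each key has its own list of allowed values.
choices = {'a': [0, 1], 'b': [0, 1, 2], 'c': [5]}

keys = list(choices)
# product(*iterables) yields one tuple per combination, in key order.
combos = [dict(zip(keys, values)) for values in product(*choices.values())]

print(len(combos))   # 6
print(combos[0])     # {'a': 0, 'b': 0, 'c': 5}
print(combos[-1])    # {'a': 1, 'b': 2, 'c': 5}
```

Because every combination takes exactly one value per key, each key appears once in every resulting dictionary.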

Is there any way to sort this dictionaries by lowest value from keys?

I just want to sort these dictionaries, whose values come from an input file.
def sortdicts():
    listofs = []
    listofs = splitndict()
    print(sorted(listofs))
The splitndict() function has this output:
[{'a': 1, 'b': 2}, {'c': 2, 'd': 4}, {'a': 7, 'c': 3}, {'y': 5, 'x': 0}]
While the input is from another file and it's:
a 1
b 2
c 2
d 4
a 7
c 3
x 0
y 5
I used this to split the input into dictionaries:
def splitndict():
    listofd = []
    variablesRead = readfromfile()
    splitted = [i.split() for i in variablesRead]
    d = {}
    for lines in splitted:
        if lines:
            d[lines[0]] = int(lines[1])
        elif d == {}:
            pass
        else:
            listofd.append(d)
            d = {}
    print(listofd)
    return listofd
The output should look like this:
[{'y': 5, 'x': 0}, {'a': 1, 'b': 2}, {'c': 2, 'd': 4}, {'a': 7, 'c': 3}]
This is the expected order because the list needs to be sorted by the lowest value in each dictionary.

array = [{'y': 5, 'x': 0}, {'a': 1, 'b': 2}, {'c': 2, 'd': 4}, {'a': 7, 'c': 3}]
For the above array:
array = sorted(array, key=lambda element: min(element.values()))
Here element.values() returns all the values of a dictionary and min returns the smallest of them.
sorted passes each dictionary (an element) to the lambda function one by one and sorts on the basis of the results. Note that the function has to be passed as the key argument.

x = [{'y': 5, 'x': 0}, {'a': 1, 'b': 2}, {'c': 2, 'd': 4}, {'a': 7, 'c': 3}]
sorted(x, key=lambda i: min(i.values()))
Output is
[{'y': 5, 'x': 0}, {'a': 1, 'b': 2}, {'c': 2, 'd': 4}, {'a': 7, 'c': 3}]
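Putting parsing and sorting together, a rough end-to-end sketch could look like this. It assumes the groups in the input file are separated by blank lines (inferred from the splitndict() output in the question), uses an in-memory string in place of the file, and introduces a hypothetical helper parse_groups that is not part of the original code.

```python
# Stand-in for the input file; blank lines separate the groups (an assumption).
text = """a 1
b 2

c 2
d 4

a 7
c 3

x 0
y 5
"""


def parse_groups(text):
    """Split blank-line-separated 'key value' lines into a list of dicts."""
    groups, current = [], {}
    for line in text.splitlines():
        parts = line.split()
        if parts:
            current[parts[0]] = int(parts[1])
        elif current:
            groups.append(current)
            current = {}
    if current:  # keep the last group when there is no trailing blank line
        groups.append(current)
    return groups


groups = parse_groups(text)
ordered = sorted(groups, key=lambda d: min(d.values()))
print(ordered)
# [{'x': 0, 'y': 5}, {'a': 1, 'b': 2}, {'c': 2, 'd': 4}, {'a': 7, 'c': 3}]
```

The final `if current` check keeps the last group even when the input does not end with a blank line.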

Filter Dictionary keys of multilevel dictionary

I have the following dict structure:
{12345: {2006: [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 1, 'b': 5}]}, 12346: {2007: [{'a': 2, 'b': 7}, {'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}
I want to be able to filter based on the value stored under the key 'a' or 'b'.
For example, if 'a' is 1, then my filtered dict would look like:
{12345: {2006: [{'a': 1, 'b': 2}, {'a': 1, 'b': 5}]}, 12346: {2007: [{'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}
I have the following for loop, which gets me down to the inner dicts I want, but I am not sure how to put them back into a dict of the same structure.
d = {12345: {2006: [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 1, 'b': 5}]}, 12346: {2007: [{'a': 2, 'b': 7}, {'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}
d_filter = {}
for item_code in d.keys():
    for year in d[item_code]:
        for item_dict in d[item_code][year]:
            if item_dict['a'] == 1:
                print(item_dict)  # how to put this back in d_filter?
producing:
{'a': 1, 'b': 2}
{'a': 1, 'b': 5}
{'a': 1, 'b': 9}
{'a': 1, 'b': 12}
I am guessing there is a better way to filter that I can not find, or something with dictionary comprehension that my small mind can not grasp.
Any help would be appreciated.

Here's a dictionary comprehension that does just that; dct is your initial dictionary:
d = {k: {ky: [d for d in vl if d['a']==1] for ky, vl in v.items()}
     for k, v in dct.items()}
print(d)
# {12345: {2006: [{'a': 1, 'b': 2}, {'a': 1, 'b': 5}]}, 12346: {2007: [{'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}
You can change the inner filter (i.e. d['a']==1) to the dict key and/or value of your choice.

You could do something like this:
filtered = {
    item_code: {
        year: [item for item in items if item['a'] == 1]
        for year, items in years.items()
    }
    for item_code, years in d.items()
}
Which results in:
{12345: {2006: [{'a': 1, 'b': 2}, {'a': 1, 'b': 5}]},
12346: {2007: [{'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}
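To filter on either 'a' or 'b' without rewriting the comprehension each time, it could be wrapped in a small function. This is only a sketch: filter_nested is a made-up name, and using item.get(key) (so that dicts lacking the key are simply dropped) is an assumption about the desired behaviour.

```python
def filter_nested(data, key, value):
    """Keep only the inner dicts whose `key` equals `value`, preserving structure."""
    return {
        item_code: {
            year: [item for item in items if item.get(key) == value]
            for year, items in years.items()
        }
        for item_code, years in data.items()
    }


d = {12345: {2006: [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 1, 'b': 5}]},
     12346: {2007: [{'a': 2, 'b': 7}, {'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}

print(filter_nested(d, 'a', 1))
# {12345: {2006: [{'a': 1, 'b': 2}, {'a': 1, 'b': 5}]}, 12346: {2007: [{'a': 1, 'b': 9}, {'a': 1, 'b': 12}]}}
print(filter_nested(d, 'b', 7))
# {12345: {2006: []}, 12346: {2007: [{'a': 2, 'b': 7}]}}
```

Years with no matching items are kept with an empty list, so the result always has the same outer keys as the input.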

Using list comprehension to setup a list of unique dictionaries in Python

I have the following dictionary
stocklist = {'a': 0, 'b': 0, 'c': 0}
I want to set up a grid of HEIGHT by WIDTH where each cell has its own independent copy of stocklist, so that the cells can hold different values. Will this work?
stockmap = [[stocklist for w in range(WIDTH)] for h in range(HEIGHT)]
I have other lists of width by height where each cell contains only one value, and they work fine.
But previous to this I tried to solve my issue by using Classes and it was a nightmare as my instances contained a list that kept being identical.
I'm worried that if I start coding the above I'll end up with the same problem.

In your example each 'cell' of your grid will point to the same dictionary, stocklist. So if you modify one 'cell', all of them will change.
If you need to store a different dict in each cell, you should create deep copies of stocklist.
Try:
import copy
stocklist = {'a': 0, 'b': 0, 'c': 0}
stockmap = [[copy.deepcopy(stocklist) for w in range(WIDTH)] for h in range(HEIGHT)]
In this simple example, where stocklist does not contain any nested dict,
`stockmap = [[dict(stocklist) for w in range(WIDTH)] for h in range(HEIGHT)]`
will also work. However, remember that if your stocklist were something like {'a': 0, 'b': {'c': 0}}, the nested dict {'c': 0} would not be copied, and every 'cell' would share that dict.

As I suggested above, you need to instantiate a new object, for example using dict
stockmap = [[dict(stocklist) for w in range(WIDTH)] for h in range(HEIGHT)]
otherwise the very same dictionary instance would be used.
Let's check it out:
with your example
>>> HEIGHT = 3
>>> WIDTH = 3
>>> stocklist = {'a': 0, 'b': 0, 'c': 0}
>>> stockmap = [[stocklist for w in range(WIDTH)] for h in range(HEIGHT)]
>>> stockmap
[[{'a': 0, 'b': 0, 'c': 0}, {'a': 0, 'b': 0, 'c': 0}, {'a': 0, 'b': 0, 'c': 0}], [{'a': 0, 'b': 0, 'c': 0}, {'a': 0, 'b': 0, 'c': 0}, {'a': 0, 'b': 0, 'c': 0}], [{'a': 0, 'b': 0, 'c': 0}, {'a': 0, 'b': 0, 'c': 0}, {'a': 0, 'b': 0, 'c': 0}]]
>>> stocklist['a']=9
>>> stockmap
[[{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}], [{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}], [{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}]]
as you can clearly see, modifying one item in the original dictionary affects the newly created list (grid)
Whereas doing
>>> stockmap = [[dict(stocklist) for w in range(WIDTH)] for h in range(HEIGHT)]
>>> stockmap
[[{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}], [{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}], [{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}]]
>>> stocklist['a']=5
>>> stockmap
[[{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}], [{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}], [{'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}, {'a': 9, 'b': 0, 'c': 0}]]
leaves the grid unaltered
Note: as @damgad correctly points out, dict would not work for nested dictionaries. In such cases you need to use copy.deepcopy.
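The difference between the two kinds of copy can be seen in a short sketch. The nested stocklist below is the hypothetical {'a': 0, 'b': {'c': 0}} shape mentioned earlier, not the flat dictionary from the question.

```python
import copy

stocklist = {'a': 0, 'b': {'c': 0}}

shallow = dict(stocklist)          # copies the top level only
deep = copy.deepcopy(stocklist)    # copies nested objects as well

stocklist['a'] = 1                 # top-level change: neither copy is affected
stocklist['b']['c'] = 9            # nested change: visible through the shallow copy

print(shallow)  # {'a': 0, 'b': {'c': 9}}
print(deep)     # {'a': 0, 'b': {'c': 0}}
```

For a flat dictionary of immutable values, such as the one in the question, both approaches give independent cells; copy.deepcopy only becomes necessary once the values themselves are mutable.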
