How to find the difference between two lists of dictionaries? - python

I have two lists of dictionaries and I'd like to find the difference between them (i.e. what exists in the first list but not the second, and what exists in the second list but not the first list).
The issue is that it is a list of dictionaries
a = [{'a': '1'}, {'c': '2'}]
b = [{'a': '1'}, {'b': '2'}]
set(a) - set(b)
Result
TypeError: unhashable type: 'dict'
Desired Result:
{'c': '2'}
How do I accomplish this?

You can use the in operator to check whether a dictionary is in a list, and build the difference from that:
a = [{'a': '1'}, {'c': '2'}]
b = [{'a': '1'}, {'b': '2'}]
>>> {'a':'1'} in a
True
>>> {'a':'1'} in b
True
>>> [i for i in a if i not in b]
[{'c': '2'}]

I'd like to find the difference between them (i.e. what exists in the first list but not the second, and what exists in the second list but not the first list)
According to your definition, you are looking for a symmetric difference:
>>> import itertools
>>> a = [{'a': '1'}, {'c': '2'}]
>>> b = [{'a': '1'}, {'b': '2'}]
>>> intersec = [item for item in a if item in b]
>>> sym_diff = [item for item in itertools.chain(a,b) if item not in intersec]
>>> intersec
[{'a': '1'}]
>>> sym_diff
[{'c': '2'}, {'b': '2'}]
Alternatively (using the plain difference as given in your example):
>>> a_minus_b = [item for item in a if item not in b]
>>> b_minus_a = [item for item in b if item not in a]
>>> sym_diff = list(itertools.chain(a_minus_b,b_minus_a))
>>> a_minus_b
[{'c': '2'}]
>>> b_minus_a
[{'b': '2'}]
>>> sym_diff
[{'c': '2'}, {'b': '2'}]
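The `in` checks above compare each dict against every dict in the other list, which can get slow for long lists. A possible alternative, sketched here under the assumption that all values in the dicts are hashable, is to turn each dict into a frozenset of its items so that ordinary set lookups work. The helper names `dict_key` and `sym_diff_dicts` are made up for this illustration.

```python
def dict_key(d):
    # Hypothetical helper: a hashable stand-in for a flat dict.
    # Assumes every value in d is itself hashable.
    return frozenset(d.items())


def sym_diff_dicts(a, b):
    # Hashable keys allow constant-time membership tests on average.
    keys_a = {dict_key(d) for d in a}
    keys_b = {dict_key(d) for d in b}
    only_a = [d for d in a if dict_key(d) not in keys_b]
    only_b = [d for d in b if dict_key(d) not in keys_a]
    return only_a + only_b


a = [{'a': '1'}, {'c': '2'}]
b = [{'a': '1'}, {'b': '2'}]
print(sym_diff_dicts(a, b))  # [{'c': '2'}, {'b': '2'}]
```

This keeps the original dict objects and their order; it will raise a TypeError if a value is a list or another dict.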

You can also use filter with a lambda:
If you want the different items in each list:
print filter(lambda x: x not in b,a) + filter(lambda x: x not in a,b)
[{'c': '2'}, {'b': '2'}]
Or just filter(lambda x: x not in b,a) to get the elements in a but not in b
If you don't want to create the full list of dicts in memory, you can use itertools.ifilter (Python 2 only):
from itertools import ifilter
diff = ifilter(lambda x: x not in b,a)
Then just iterate over diff:
for uniq in diff:
    print uniq
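In Python 3, where the built-in filter is already lazy and print is a function, roughly the same idea could be written with a generator expression. This is only a sketch of the equivalent, reusing the a and b lists from the question:

```python
a = [{'a': '1'}, {'c': '2'}]
b = [{'a': '1'}, {'b': '2'}]

# Generator expression: items are produced one at a time, on demand,
# so the full result list is never built in memory.
diff = (x for x in a if x not in b)

for uniq in diff:
    print(uniq)
```

Note that a generator can only be consumed once; build it again if you need a second pass.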

Related

find particular value from list of dict in python

I have a list of dictionaries like this:
s = [{'a':1,'b':2},{'a':3},{'a':2},{'a':1}]
I want to remove the duplicate value pair and get a list of dictionaries like:
s = [{'a':1},{'a':3},{'a':2}]
Use a list comprehension with a dict comprehension that keeps only the key a:
s = [{k: v for k, v in x.items() if k =='a'} for x in s]
print (s)
[{'a': 1}, {'a': 3}, {'a': 2}]
You could use a list comprehension adding new dictionary entries only if 'a' is contained:
[{'a':d['a']} for d in s if 'a' in d]
# [{'a': 1}, {'a': 3}, {'a': 2}]
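Note that the comprehensions above keep every dictionary containing 'a', so with the four-element s from the question the repeated {'a': 1} would survive. If the duplicate really has to go, one possible approach is to track the values already seen. This is a sketch that assumes the values of 'a' are hashable:

```python
s = [{'a': 1, 'b': 2}, {'a': 3}, {'a': 2}, {'a': 1}]

seen = set()   # values of 'a' encountered so far
unique = []
for d in s:
    if 'a' not in d:
        continue  # skip dicts without the key
    value = d['a']
    if value not in seen:
        seen.add(value)
        unique.append({'a': value})

print(unique)  # [{'a': 1}, {'a': 3}, {'a': 2}]
```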
You can try this.
s = [{'a':1,'b':2},{'a':3},{'a':2}]
s=[{'a':d['a']} for d in s]
# [{'a': 1}, {'a': 3}, {'a': 2}]
If you want to have a list of singleton dictionaries with only a keys, you can do this:
>>> [{'a': d.get('a')} for d in s]
[{'a': 1}, {'a': 3}, {'a': 2}]
But this just seems more suitable for a list of tuples:
>>> [('a', d.get('a')) for d in s]
[('a', 1), ('a', 3), ('a', 2)]
From the docs for dict.get:
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
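To make the difference between d.get('a') and d['a'] concrete, here is a small illustration with a made-up list in which one dictionary has no 'a' key:

```python
# Hypothetical sample data: the second dict lacks the key 'a'.
s = [{'a': 1, 'b': 2}, {'b': 5}]

print([d.get('a') for d in s])     # [1, None]
print([d.get('a', 0) for d in s])  # [1, 0]

# Plain indexing raises KeyError for the dict without 'a'.
try:
    [d['a'] for d in s]
except KeyError as exc:
    missing = exc.args[0]
    print('missing key:', missing)
```

So the get-based version silently yields None entries, whereas the 'a' in d filter shown earlier drops such dictionaries altogether.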

inside list of dictionaries, merge lists based on key

I have nested dictionaries in a list of dictionaries, I want to merge the lists based on 'id'
res = [{'i': ['1'], 'id': '123'},
{'i': ['1'], 'id': '123'},
{'i': ['1','2','3','4','5','6'],'id': '123'},
{'i': ['1'], 'id': '234'},
{'i': ['1','2','3','4','5'],'id': '234'}]
Desired output:
[{'i': [1, 1, 1, 2, 3, 4, 5, 6], 'id': '123'},
{'i': [1, 1, 2, 3, 4, 5], 'id': '234'}]
I am trying to merge the nested dictionaries based on key "id". I couldn't figure out the best way out:
import collections
d = collections.defaultdict(list)
for i in res:
    for k, v in i.items():
        d[k].extend(v)
The above code merges all the lists, but I want to merge the lists based on the key "id".
Something like this should do the trick
from collections import defaultdict
merged = defaultdict(list)
for r in res:
    merged[r['id']].extend(r['i'])
output = [{'id': key, 'i': merged_list} for key, merged_list in merged.items()]
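The desired output in the question shows integers rather than strings. Assuming that conversion is actually wanted (the question does not say so explicitly), a complete run of the defaultdict approach with the sample data could look like this:

```python
from collections import defaultdict

res = [{'i': ['1'], 'id': '123'},
       {'i': ['1'], 'id': '123'},
       {'i': ['1', '2', '3', '4', '5', '6'], 'id': '123'},
       {'i': ['1'], 'id': '234'},
       {'i': ['1', '2', '3', '4', '5'], 'id': '234'}]

merged = defaultdict(list)
for r in res:
    # Assumption: every entry of r['i'] is a string holding an integer.
    merged[r['id']].extend(int(x) for x in r['i'])

output = [{'i': values, 'id': key} for key, values in merged.items()]
print(output)
```

On Python 3.7+ dicts keep insertion order, so the ids come out in the order they first appear in res.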
The following produces the desired output, using itertools.groupby:
from operator import itemgetter
from itertools import groupby
k = itemgetter('id')
[
    {'id': k, 'i': [x for d in g for x in d['i']]}
    for k, g in groupby(sorted(res, key=k), key=k)
]
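The sorted call matters here: groupby only groups consecutive items with equal keys. A small illustration with made-up rows (not the data from the question) shows what happens without it:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows where the id 'x' appears in two separate runs.
rows = [{'id': 'x', 'i': ['1']},
        {'id': 'y', 'i': ['2']},
        {'id': 'x', 'i': ['3']}]
get_id = itemgetter('id')

unsorted_groups = [key for key, _ in groupby(rows, key=get_id)]
sorted_groups = [key for key, _ in groupby(sorted(rows, key=get_id), key=get_id)]

print(unsorted_groups)  # ['x', 'y', 'x']
print(sorted_groups)    # ['x', 'y']
```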
I'm not sure what the expected behavior should be when there are duplicates -- for example, should the lists be:
treated like a set() ?
appended, and there could be multiple items, such as [1,1,2,3...] ?
doesn't matter -- just take any
Here would be one variation where we use a dict comprehension:
{item['id']: item for item in res}.values()
# [{'i': ['1', '2', '3', '4', '5'], 'id': '234'}, {'i': ['1', '2', '3', '4', '5', '6'], 'id': '123'}]
If you provide a bit more information in your question, I can update the answer accordingly.

Sorting a list of dicts based on another list of dicts in Python

I have 2 lists
A = [{'g': 'goal'}, {'b': 'ball'}, {'a': 'apple'}, {'f': 'float'}, {'e': 'egg'}]
B = [{'a': None}, {'e': None}, {'b': None}, {'g': None}, {'f': None}]
I want to sort A according to B. The reason I'm asking this is, I can't simply copy B's contents into A and over-writing A's object values with None. I want to retain A's values but sort it according to B's order.
How do I achieve this? Would prefer a solution in Python
spots = {next(iter(d)): i for i, d in enumerate(B)}
sorted_A = [None] * len(A)
for d in A:
    sorted_A[spots[next(iter(d))]] = d
Average-case linear time. Place each dict directly into the spot it needs to go, without slow index calls or even calling sorted.
You could store the indices of keys in a dictionary and use those in the sorting function. This would work in O(n log(n)) time:
>>> keys = {next(iter(v)): i for i, v in enumerate(B)}
>>> keys
{'a': 0, 'e': 1, 'b': 2, 'g': 3, 'f': 4}
>>> A.sort(key=lambda x: keys[next(iter(x))])
>>> A
[{'a': 'apple'}, {'e': 'egg'}, {'b': 'ball'}, {'g': 'goal'}, {'f': 'float'}]
You can avoid sorting by iterating over the existing, ordered keys in B:
Merge list A into a single lookup dict
Build a new list from the order in B, using the lookup dict to find the value matching each key
Code:
import itertools
merged_A = {k: v for d in A for k, v in d.items()}
sorted_A = [{k: merged_A[k]} for k in itertools.chain.from_iterable(B)]
# [{'a': 'apple'}, {'e': 'egg'}, {'b': 'ball'}, {'g': 'goal'}, {'f': 'float'}]
If required, you can preserve the original dict objects from A instead of building new ones:
keys_to_dicts = {k: d for d in A for k in d}
sorted_A = [keys_to_dicts[k] for k in itertools.chain.from_iterable(B)]
How about this? Create a lookup dict on A and then use B's keys to create a new list in the right order.
In [103]: lookup_list = {k : d for d in A for k in d}
In [104]: sorted_list = [lookup_list[k] for d in B for k in d]; sorted_list
Out[104]: [{'a': 'apple'}, {'e': 'egg'}, {'b': 'ball'}, {'g': 'goal'}, {'f': 'float'}]
Performance
Setup:
import random
import copy
x = list(range(10000))
random.shuffle(x)
A = [{str(i) : 'test'} for i in x]
B = copy.deepcopy(A)
random.shuffle(B)
# user2357112's solution
%%timeit
spots = {next(iter(d)): i for i, d in enumerate(B)}
sorted_A = [None] * len(A)
for d in A:
    sorted_A[spots[next(iter(d))]] = d
# Proposed in this post
%%timeit
lookup_list = {k : d for d in A for k in d}
sorted_list = [lookup_list[k] for d in B for k in d]; sorted_list
Results:
100 loops, best of 3: 9.27 ms per loop
100 loops, best of 3: 4.92 ms per loop
45% speedup to the original O(n), with twice the space complexity.
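The %%timeit cells above only work in IPython. Outside of it, a comparable measurement could be sketched with the standard timeit module. The function names below are invented for this example, and the timings will differ from machine to machine, so no expected numbers are given:

```python
import copy
import random
import timeit

x = list(range(10000))
random.shuffle(x)
A = [{str(i): 'test'} for i in x]
B = copy.deepcopy(A)
random.shuffle(B)


def place_by_index():
    # Map each key to its position in B, then drop each dict of A there.
    spots = {next(iter(d)): i for i, d in enumerate(B)}
    sorted_A = [None] * len(A)
    for d in A:
        sorted_A[spots[next(iter(d))]] = d
    return sorted_A


def lookup_then_build():
    # Map each key to its dict in A, then walk B's keys in order.
    lookup = {k: d for d in A for k in d}
    return [lookup[k] for d in B for k in d]


for fn in (place_by_index, lookup_then_build):
    best = min(timeit.repeat(fn, number=10, repeat=3))
    print(fn.__name__, best / 10 * 1000, 'ms per loop')
```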

How do I loop over dictionary and check for values passed by a variable in Python

I've got the below dictionary and list - how do I loop over the dictionary checking if b == '1' while passing '1' as variable from a list?
dic = {'info': [{'a':0, 'b':'1'},{'a':0, 'b':'3'},{'a':0, 'b':'3'},{'a':0, 'b':'1'}]}
lst = ['1']
I want to return {'a':0, 'b':'1'}, {'a':0, 'b':'1'}.
This is a general solution using the built-in filter function; you will have to adapt it to your needs:
>>> list(filter(lambda d: d['b'] in lst, dic['info']))
[{'b': '1', 'a': 0}, {'b': '1', 'a': 0}]
Converting the filter object into a list with the list constructor is necessary only in Python 3; in Python 2, filter already returns a list:
>>> filter(lambda d: d['b'] in lst, dic['info'])
[{'b': '1', 'a': 0}, {'b': '1', 'a': 0}]
EDIT: To make the solution more general in case multiple items in lst, then consider the following:
>>> dic
{'info': [{'b': '1', 'a': 0}, {'b': '3', 'a': 0}, {'b': '3', 'a': 0}, {'b': '1', 'a': 0}, {'b': '2', 'a': '1'}]}
>>>
>>> lst
['1', '2']
>>> def filter_dict(dic_lst, lst):
    lst_out = []
    for sub_d in dic_lst:
        if any(x == sub_d['b'] for x in lst):
            lst_out.append(sub_d)
    return lst_out
>>> filter_dict(dic['info'], lst)
[{'b': '1', 'a': 0}, {'b': '1', 'a': 0}, {'b': '2', 'a': '1'}]
OR:
>>> list(map(lambda x: list(filter(lambda d: d['b'] in x, dic['info'])),lst))
[[{'b': '1', 'a': 0}, {'b': '1', 'a': 0}], [{'b': '2', 'a': '1'}]]
Just a simple list comprehension:
In [22]: dic = {'info': [{'a':0, 'b':'1'},{'a':0, 'b':'3'},{'a':0, 'b':'3'},{'a':0, 'b':'1'}]}
In [23]: lst = ['1']
In [25]: [sub_dict for sub_dict in dic['info'] if sub_dict['b'] == lst[0]]
Out[25]: [{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}]
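The comparison with lst[0] only looks at the first entry of lst. If lst may hold several values, a membership test is one way to generalize it. The sketch below converts lst to a set first (the name wanted is just illustrative) and uses get so that sub-dictionaries without a 'b' key are skipped instead of raising KeyError:

```python
dic = {'info': [{'a': 0, 'b': '1'}, {'a': 0, 'b': '3'},
                {'a': 0, 'b': '3'}, {'a': 0, 'b': '1'}]}
lst = ['1']

wanted = set(lst)  # set lookups stay fast even for long lists
result = [sub for sub in dic['info'] if sub.get('b') in wanted]
print(result)  # [{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}]
```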
You could use a filter approach:
filter(lambda x: x['b'] in lst, dic['info'])
In Python 3 this returns a lazy iterator, which you can materialize into a list:
result = list(filter(lambda x: x['b'] in lst, dic['info']))
Note that the list of wanted values should not be named list, since that would shadow the built-in list type used here.
from collections import defaultdict
dic = {'info': [{'a':0, 'b':'1'},{'a':0, 'b':'3'},{'a':0, 'b':'3'},{'a':0, 'b':'1'}]}
d = defaultdict(list)
for each in dic['info']:
    d[each['b']].append(each)
out:
defaultdict(list,
{'1': [{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}],
'3': [{'a': 0, 'b': '3'}, {'a': 0, 'b': '3'}]})
in:
d['1']
out:
[{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}]
Building an index dict like this avoids iterating over the data again for each lookup.
First, here is my simple loop-and-iterate approach.
Input:
>>> dic
{'info': [{'a': 0, 'b': '1'}, {'a': 0, 'b': '3'}, {'a': 0, 'b': '3'}, {'a': 0, 'b': '1'}]}
>>> l
['1']
New List variable for result.
>>> result = []
Algorithm:
Iterate over the dictionary with its iteritems method (Python 2; use items in Python 3).
Each value of the main dictionary is a list, so iterate over that list with a second for loop.
Check that the key b is present in the sub-dictionary and that its value is in the given list l.
If so, append the sub-dictionary to the result list.
code:
>>> for k,v in dic.iteritems():
...     for i in v:
...         if "b" in i and i["b"] in l:
...             result.append(i)
...
Output:
>>> result
[{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}]
>>>
Notes:
Do not use list as a variable name: it is not a reserved keyword, but it shadows Python's built-in list type.
Read up on the basics of dictionaries and lists and their properties.
Try to write code first.
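For reference, the same loop in Python 3 would use items instead of iteritems. A minimal self-contained version, keeping the names dic, l and result from this answer:

```python
dic = {'info': [{'a': 0, 'b': '1'}, {'a': 0, 'b': '3'},
                {'a': 0, 'b': '3'}, {'a': 0, 'b': '1'}]}
l = ['1']

result = []
for key, sub_dicts in dic.items():
    # Each value of the outer dict is a list of sub-dictionaries.
    for sub in sub_dicts:
        if 'b' in sub and sub['b'] in l:
            result.append(sub)

print(result)  # [{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}]
```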
You can make use of a list comprehension, or just do it using filter.
list comprehension
dic = {'info': [{'a':0, 'b':'1'},{'a':0, 'b':'3'},{'a':0, 'b':'3'},{'a':0, 'b':'1'}]}
lst = ['1']
result = [i for i in dic['info'] if i['b'] == lst[0]]
print(result)  # [{'a': 0, 'b': '1'}, {'a': 0, 'b': '1'}]
filter
dic = {'info': [{'a':0, 'b':'1'},{'a':0, 'b':'3'},{'a':0, 'b':'3'},{'a':0, 'b':'1'}]}
lst = ['1']
list(filter(lambda i: i['b'] in lst, dic['info']))
# [{'b': '1', 'a': 0}, {'b': '1', 'a': 0}]

loop for to print a dictionary of dictionaries

I'd like to find a way to print a list of dictionaries line by line, so that the result is clear and easy to read.
the list is like this.
myList = {'1':{'name':'x',age:'18'},'2':{'name':'y',age:'19'},'3':{'name':'z',age:'20'}...}
and the result should be like this:
>>> '1':{'name':'x',age:'18'}
'2':{'name':'y',age:'19'}
'3':{'name':'z',age:'20'} ...
Using your example:
>>> myList = {'1':{'name':'x','age':'18'},'2':{'name':'y','age':'19'},'3':{'name':'z','age':'20'}}
>>> for k, d in myList.items():
    print k, d
1 {'age': '18', 'name': 'x'}
3 {'age': '20', 'name': 'z'}
2 {'age': '19', 'name': 'y'}
More examples:
A list of dictionaries:
>>> l = [{'a':'1'},{'b':'2'},{'c':'3'}]
>>> for d in l:
    print d
{'a': '1'}
{'b': '2'}
{'c': '3'}
A dictionary of dictionaries:
>>> D = {'d1': {'a':'1'}, 'd2': {'b':'2'}, 'd3': {'c':'3'}}
>>> for k, d in D.items():
    print d
{'b': '2'}
{'c': '3'}
{'a': '1'}
If you want the key of the dicts:
>>> D = {'d1': {'a':'1'}, 'd2': {'b':'2'}, 'd3': {'c':'3'}}
>>> for k, d in D.items():
    print k, d
d2 {'b': '2'}
d3 {'c': '3'}
d1 {'a': '1'}
>>> import json
>>> dicts = {1: {'a': 1, 'b': 2}, 2: {'c': 3}, 3: {'d': 4, 'e': 5, 'f':6}}
>>> print(json.dumps(dicts, indent=4))
{
    "1": {
        "a": 1,
        "b": 2
    },
    "2": {
        "c": 3
    },
    "3": {
        "d": 4,
        "e": 5,
        "f": 6
    }
}
One more option - pprint, made for pretty-printing.
The pprint module provides a capability to “pretty-print” arbitrary Python data structures in a form which can be used as input to the interpreter.
List of dictionaries:
from pprint import pprint
l = [{'a':'1'},{'b':'2'},{'c':'3'}]
pprint(l, width=1)
Output:
[{'a': '1'},
 {'b': '2'},
 {'c': '3'}]
Tuple of dictionaries with dictionaries as values (the top-level comma makes d a tuple):
from pprint import pprint
d = {'a':{'b':'c'}},{'d':{'e':'f'}}
pprint(d, width=1)
Output:
({'a': {'b': 'c'}},
 {'d': {'e': 'f'}})
myList = {'1':{'name':'x','age':'18'},
'2':{'name':'y','age':'19'},
'3':{'name':'z','age':'20'}}
for item in myList:
    print(item,':',myList[item])
Output:
3 : {'age': '20', 'name': 'z'}
2 : {'age': '19', 'name': 'y'}
1 : {'age': '18', 'name': 'x'}
item is used to iterate keys in the dict, and myList[item] is the value corresponding to the current key.
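If the lines should look like the format asked for in the question (quoted key, colon, inner dict) and come out in a predictable order, one option in Python 3.6+ is an f-string over the sorted keys. This is just a sketch; the name lines is introduced here for illustration:

```python
myList = {'1': {'name': 'x', 'age': '18'},
          '2': {'name': 'y', 'age': '19'},
          '3': {'name': 'z', 'age': '20'}}

# !r applies repr() to the key so that it is printed with its quotes.
lines = [f"{key!r}: {myList[key]}" for key in sorted(myList)]
for line in lines:
    print(line)
```

The first printed line is '1': {'name': 'x', 'age': '18'} on Python 3.7+, where dicts preserve insertion order.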
