Merge 2 dictionaries based on specific features - python

I have two JSON files and am trying to merge certain features of one into the other. I converted the JSON into dictionaries and want to merge specific features from one dictionary into the other without overwriting the initial values.
Dictionary A: [{a:1,b:2,f:10},{a:2,b:4,f:10}]
Dictionary B: [{f:1,g:1,k:1},{f:2,g:2,k:1}]
Desired:
Dictionary C:[{a:1,b:2,f:10,g:1,k:1},{a:2,b:4,f:10,g:2,k:1}]
Loop through all dictionaries simultaneously
for x,y in zip(A,B):
    x["g"]= y["g"]
    x["k"]= y["k"]

You can iterate over both lists with zip and merge each pair of dictionaries with the | operator in a list comprehension. Putting the dictionary from A on the right-hand side means its values win for duplicate keys:
# Python 3.9+
>>> [y|x for x,y in zip(A, B)]
# output:
[{'f': 10, 'g': 1, 'k': 1, 'a': 1, 'b': 2},
{'f': 10, 'g': 2, 'k': 1, 'a': 2, 'b': 4}]

This will preserve the order and not overwrite any duplicate keys in A.
lst_a = [{'a':1,'b':2,'f':10},{'a':2,'b':4,'f':10}]
lst_b = [{'f':1,'g':1,'k':1},{'f':2,'g':2,'k':1}]
lst_c = []
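for dict_a,dict_b in zip(lst_a,lst_b):
    # keep only the keys of dict_b that dict_a does not already have
    dict_b = {k:v for k,v in dict_b.items() if k not in dict_a}
    lst_c.append(dict_a | dict_b)  # the | operator needs Python 3.9+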
print(lst_c)
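If only certain keys from B should be copied, and existing values in A must never be overwritten, the two ideas above can be combined. The following is a minimal sketch, not a definitive implementation: the function name merge_selected and its keys parameter are made up for illustration, and it assumes both lists have the same length and order.

```python
def merge_selected(list_a, list_b, keys):
    """Return new dicts: each dict from list_a plus the chosen keys from list_b.

    Hypothetical helper: values already present in list_a are never overwritten.
    """
    merged = []
    for dict_a, dict_b in zip(list_a, list_b):
        result = dict(dict_a)  # copy so the input is left untouched
        for key in keys:
            if key in dict_b and key not in result:
                result[key] = dict_b[key]
        merged.append(result)
    return merged


A = [{'a': 1, 'b': 2, 'f': 10}, {'a': 2, 'b': 4, 'f': 10}]
B = [{'f': 1, 'g': 1, 'k': 1}, {'f': 2, 'g': 2, 'k': 1}]
C = merge_selected(A, B, keys=('f', 'g', 'k'))
print(C)
```

Because 'f' already exists in every dictionary of A, it keeps the value 10 even though it is listed in keys.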

Related

Get value of dictionaries into separate lists

I am trying to group a list of dictionaries into separate lists by the value of the first key.
The names of the keys are always the same and the number of elements is the same.
[{'a': 1, 'b':41, 'c':324}, {'a': 1, 'b':12, 'c':65}, {'a': 2, 'b':36, 'c':12}]
expected output:
[{'b':41, 'c':324}, {'b':12, 'c':65}]
[{'b':36, 'c':12}]
Make a new dictionary that uses the values of the a keys as its keys.
newdict = {}
for d in data:  # data is the list of dictionaries from the question
    newdict.setdefault(d['a'], []).append({'b': d['b'], 'c': d['c']})
result = list(newdict.values())
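A self-contained variant of the same grouping idea, as a rough sketch: it uses collections.defaultdict instead of setdefault and keeps every key except 'a', so it does not need to name 'b' and 'c' explicitly. The variable names groups and rest are chosen here for illustration.

```python
from collections import defaultdict

data = [{'a': 1, 'b': 41, 'c': 324}, {'a': 1, 'b': 12, 'c': 65}, {'a': 2, 'b': 36, 'c': 12}]

groups = defaultdict(list)
for d in data:
    # Keep every key except the grouping key 'a'.
    rest = {k: v for k, v in d.items() if k != 'a'}
    groups[d['a']].append(rest)

result = list(groups.values())
print(result)
```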

how to group list of dictionaries in a simple dictionary according to this method

I would apply the following code for list of dictionaries :
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = {**x, **y} # get z= {'a': 1, 'b': 3, 'c': 4} (without duplicated keys)
I create a list (List_Dict) of several dictionaries stored like:
List_Dict[0] = {'a': 1, 'b': 2}
List_Dict[1] = {'c': 3, 'a': 4}
List_Dict[2] = {'b': 1, 'c': 2}
... ... ... ... ...
List_Dict[5000] = {'d': 3, 'a': 4}
the list contains 5000 dictionaries
Is there a simple way to unpack all 5000 elements dynamically, instead of typing each one between the braces, so that I can apply this instruction:
z = {**List_Dict[0],**List_Dict[1],**List_Dict[2] ....,**List_Dict[5000]}
The way to do this in O(n) time is to use the dict.update method to collect all of the entries in a single dict.
result = dict()
for d in List_Dict:
    result.update(d)
Using update is better than the {**result, **d} syntax because that creates a new dictionary on every iteration; there are n iterations and the new dictionary's size is O(n), so that solution would take O(n^2) time.
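As a small sketch of the update approach on a short list (the function name merge_all is hypothetical, and the three dictionaries are the first three from the question), note that a later dictionary overwrites earlier values for the same key, exactly as the {**x, **y} syntax does:

```python
def merge_all(dicts):
    """Merge an iterable of dicts into one; later values win on duplicate keys."""
    result = {}
    for d in dicts:
        result.update(d)  # in place, no intermediate copies
    return result


List_Dict = [{'a': 1, 'b': 2}, {'c': 3, 'a': 4}, {'b': 1, 'c': 2}]
z = merge_all(List_Dict)
print(z)
```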
You can simply put it in a for loop:
z = {}
for d in List_Dict:
    z = {**z, **d}
print(z)

Over counting pairs in python loop

I have a list of dictionaries where each dict is of the form:
{'A': a,'B': b}
I want to iterate through the list and for every (a,b) pair, find the pair(s), (b,a), if it exists.
For example if for a given entry of the list A = 13 and B = 14, then the original pair would be (13,14). I would want to search the entire list of dicts to find the pair (14,13). If (14,13) occurred multiple times I would like to record that too.
I would like to count the number of times for all original (a,b) pairs in the list, when the complement (b,a) appears, and if so how many times. To do this I have two for loops and a counter when a complement pair is found.
pairs_found = 0
for i, val in enumerate( list_of_dicts ):
    for j, vol in enumerate( list_of_dicts ):
        if val['A'] == vol['B']:
            if vol['A'] == val['B']:
                pairs_found += 1
This generates a pairs_found greater than the length of list_of_dicts. I realize this is because the same pairs will be over-counted. I am not sure how I can overcome this degeneracy?
Edit for Clarity
list_of_dicts = [
    {'A': 14, 'B': 23},
    {'A': 235, 'B': 98},
    {'A': 686, 'B': 999},
    {'A': 128, 'B': 123},
    # ....
]
Let's say that the list has around 100000 entries. Somewhere in that list, there will be one or more entries of the form {'A': 23, 'B': 14}. If this is true then I would like a counter to increase its value by one. I would like to do this for every entry in the list.
Here is what I suggest:
Use tuple to represent your pairs and use them as dict/set keys.
Build a set of unique inverted pairs you'll look for.
Use a dict to store the number of times a pair appears inverted.
Then the code should look like this:
# Create a set of unique inverted pairs
inverted_pairs_set = {(d['B'],d['A']) for d in list_of_dicts}
# Create a counter for original pairs
pairs_counter_dict = {(ip[1],ip[0]):0 for ip in inverted_pairs_set}
# Create list of pairs
pairs_list = [(d['A'],d['B']) for d in list_of_dicts]
# Count, for each original pair, how many times its inverted pair appears
for p in pairs_list:
    if p in inverted_pairs_set:
        pairs_counter_dict[(p[1],p[0])] += 1
You can create a counter dictionary that contains the values of the 'A' and 'B' keys in all your dictionaries:
complements_cnt = {(dct['A'], dct['B']): 0 for dct in list_of_dicts}
Then all you need is to iterate over your dictionaries again and increment the value for the "complements":
for dct in list_of_dicts:
    try:
        complements_cnt[(dct['B'], dct['A'])] += 1
    except KeyError: # in case there is no complement there is nothing to increase
        pass
For example with such a list_of_dicts:
list_of_dicts = [{'A': 1, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]
This gives:
{(1, 2): 1, (2, 1): 2}
Which basically says that the {'A': 1, 'B': 2} has one complement (the second) and {'A': 2, 'B': 1} has two (the first and the last).
The solution is O(n) which should be quite fast even for 100000 dictionaries.
Note: This is quite similar to @debzsud's answer. I hadn't seen it before I posted this answer though. :(
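The same O(n) counting can also be sketched with collections.Counter, which returns 0 for missing keys and so avoids the try/except. This is only one possible reading of the question: the names pair_counts, complements and entries_with_complement are chosen here for illustration, and the last value counts each entry once if its reverse occurs anywhere in the list.

```python
from collections import Counter

list_of_dicts = [{'A': 1, 'B': 2}, {'A': 2, 'B': 1}, {'A': 1, 'B': 2}]

# How many times each (A, B) pair occurs.
pair_counts = Counter((d['A'], d['B']) for d in list_of_dicts)

# For each distinct pair, how many times its reverse (B, A) occurs.
complements = {(a, b): pair_counts[(b, a)] for (a, b) in pair_counts}

# Number of entries that have at least one complement somewhere in the list.
entries_with_complement = sum(
    1 for d in list_of_dicts if pair_counts[(d['B'], d['A'])] > 0
)
print(complements, entries_with_complement)
```

Since every entry is counted at most once, entries_with_complement can never exceed len(list_of_dicts), which avoids the over-counting described in the question.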
I am still not 100% sure what it is you want to do but here is my guess:
pairs_found = 0
for i, dict1 in enumerate(list_of_dicts):
    for j, dict2 in enumerate(list_of_dicts[i+1:]):
        if dict1['A'] == dict2['B'] and dict1['B'] == dict2['A']:
            pairs_found += 1
Note the slicing on the second for loop. This avoids checking pairs that have already been checked before (comparing D1 with D2 is enough; no need to compare D2 to D1)
This roughly halves the number of comparisons, but it is still O(n**2), so there is probably room for improvement.
You could first create a list with the values of each dictionary as tuples:
example_dict = [{"A": 1, "B": 2}, {"A": 4, "B": 3}, {"A": 5, "B": 1}, {"A": 2, "B": 1}]
dict_values = [tuple(x.values()) for x in example_dict]
Then create a second list with the number of occurrences of each element inverted:
occurrences = [dict_values.count(x[::-1]) for x in dict_values]
Finally, create a dict with dict_values as keys and occurrences as values:
dict(zip(dict_values, occurrences))
Output:
{(1, 2): 1, (4, 3): 0, (5, 1): 0, (2, 1): 1}
For each key, you have the number of inverted keys. You can also create the dictionary on the fly:
occurrences = {x: dict_values.count(x[::-1]) for x in dict_values}

Getting the difference (in values) between two dictionaries in python

Let's say you are given two dictionaries, A and B, with keys that can be the same but integer values that will differ. How can you compare the two so that, where a key matches, you get the difference between the values (e.g. if x is the value in A and y is the value in B for that key, the result should be x - y), preferably as a new dictionary?
Ideally you'd also be able to compare the gain in percent (how much the values changed percentage-wise between the 2 dictionaries which are snapshots of numbers at a specific time).
Given two dictionaries, A and B which may/may not have the same keys, you can do this:
A = {'a':5, 't':4, 'd':2}
B = {'s':11, 'a':4, 'd': 0}
C = {x: A[x] - B[x] for x in A if x in B}
Which only subtracts the keys that are the same in both dictionaries.
You could use a dict comprehension to loop through the keys, then subtract the corresponding values from each original dict.
>>> a = {'a': 5, 'b': 3, 'c': 12}
>>> b = {'a': 1, 'b': 7, 'c': 19}
>>> {k: b[k] - a[k] for k in a}
{'a': -4, 'b': 4, 'c': 7}
This assumes both dicts have exactly the same keys. If they don't, you have to decide what behavior you expect for keys that are in one dict but not the other (maybe some default value?).
If you want to evaluate only the shared keys, you can use the set intersection of the keys:
>>> {k: b[k] - a[k] for k in a.keys() & b.keys()}
{'a': -4, 'b': 4, 'c': 7}
def difference_dict(Dict_A, Dict_B):
    output_dict = {}
    for key in Dict_A.keys():
        if key in Dict_B.keys():
            output_dict[key] = abs(Dict_A[key] - Dict_B[key])
    return output_dict
>>> Dict_A = {'a': 4, 'b': 3, 'c':7}
>>> Dict_B = {'a': 3, 'c': 23, 'd': 2}
>>> Diff = difference_dict(Dict_A, Dict_B)
>>> Diff
{'a': 1, 'c': 16}
If you wanted to fit that all onto one line, it would be...
def difference_dict(Dict_A, Dict_B):
    output_dict = {key: abs(Dict_A[key] - Dict_B[key]) for key in Dict_A.keys() if key in Dict_B.keys()}
    return output_dict
If you want to get the difference of similar keys into a new dictionary, you could do something like the following:
new_dict={}
for key in A:
    if key in B:
        new_dict[key] = A[key] - B[key]
...which we can fit into one line
new_dict = { key : A[key] - B[key] for key in A if key in B }
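The question also asks for the change in percent, which the snippets above do not cover. A minimal sketch under stated assumptions: the helper name percent_change is made up here, the first dictionary is treated as the earlier snapshot, and keys whose earlier value is 0 are skipped because the percentage is undefined there.

```python
def percent_change(old, new):
    """Percent change from old[k] to new[k] for keys present in both dicts.

    Hypothetical helper; keys whose old value is 0 are skipped because the
    percentage is undefined there.
    """
    return {
        k: (new[k] - old[k]) * 100 / old[k]
        for k in old.keys() & new.keys()
        if old[k] != 0
    }


A = {'a': 5, 't': 4, 'd': 2}
B = {'s': 11, 'a': 4, 'd': 0}
change = percent_change(A, B)
print(change)
```

Here 'a' drops from 5 to 4 (a change of -20.0 percent) and 'd' drops from 2 to 0 (-100.0 percent); 't' and 's' are ignored because they appear in only one dictionary.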
Here is a Python package for this case:
https://dictdiffer.readthedocs.io/en/latest/
from dictdiffer import diff
print(list(diff(a, b)))
would do the trick.

Removing dictionaries from a list on the basis of duplicate value of key

I am new to Python. Suppose I have the following list of dictionaries:
mydictList= [{'a':1,'b':2,'c':3},{'a':2,'b':2,'c':4},{'a':2,'b':3,'c':4}]
From the above list, I want to remove dictionaries that repeat a value of key b already seen, keeping only the first one. So the resultant list should be:
mydictList = [{'a':1,'b':2,'c':3},{'a':2,'b':3,'c':4}]
You can create a new dictionary based on the value of b, iterating the mydictList backwards (since you want to retain the first value of b), and get only the values in the dictionary, like this
>>> {item['b'] : item for item in reversed(mydictList)}.values()
[{'a': 1, 'c': 3, 'b': 2}, {'a': 2, 'c': 4, 'b': 3}]
If you are using Python 3.x, you might want to use list function over the dictionary values, like this
>>> list({item['b'] : item for item in reversed(mydictList)}.values())
Note: This solution may not maintain the order of the dictionaries.
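If the original order matters, a variant that avoids reversing the list can be sketched with dict.setdefault, which stores a value only when the key is not present yet. This relies on dictionaries preserving insertion order (Python 3.7+); the names first_by_b and deduplicated are chosen here for illustration.

```python
mydictList = [{'a': 1, 'b': 2, 'c': 3}, {'a': 2, 'b': 2, 'c': 4}, {'a': 2, 'b': 3, 'c': 4}]

first_by_b = {}
for item in mydictList:
    # setdefault only stores the item if this 'b' value has not been seen yet.
    first_by_b.setdefault(item['b'], item)

deduplicated = list(first_by_b.values())
print(deduplicated)
```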
First, sort the list by b-values (Python's sorting algorithm is stable, so dictionaries with identical b values will retain their relative order).
from operator import itemgetter
tmp1 = sorted(mydictList, key=itemgetter('b'))
Next, use itertools.groupby to create subiterators that iterate over dictionaries with the same b value.
import itertools
tmp2 = itertools.groupby(tmp1, key=itemgetter('b'))
Finally, create a new list that contains only the first element of each subiterator:
# Each x is a tuple (some-b-value, iterator-over-dicts-with-b-equal-some-b-value)
newdictList = [ next(x[1]) for x in tmp2 ]
Putting it all together:
from itertools import groupby
from operator import itemgetter
by_b = itemgetter('b')
newdictList = [ next(x[1]) for x in groupby(sorted(mydictList, key=by_b), key=by_b) ]
A very straightforward approach can go something like this:
mydictList= [{'a':1,'b':2,'c':3},{'a':2,'b':2,'c':4},{'a':2,'b':3,'c':4}]
b_set = set()
new_list = []
for d in mydictList:
    if d['b'] not in b_set:
        new_list.append(d)
        b_set.add(d['b'])
Result:
>>> new_list
[{'a': 1, 'c': 3, 'b': 2}, {'a': 2, 'c': 4, 'b': 3}]
