Merge two lists of dictionaries with a common key - python

I have two lists of dictionaries:
x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started'},
     {'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'}]
I would like to merge them, so that y becomes:
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'},
     {'Name': 'AA', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'}]
The code below does not give the desired result:
for i in range(len(x)):
    for k, v in x[i]:
        y[i][k] = v
Note: x and y contain the same number of dictionaries, and every dictionary in x has a matching "Name" in y.

Here's a simpler way that does the same as what I think you're trying to do in your code:
>>> for d1, d2 in zip(x, y):
...     d2.update(d1)
...
>>> y
[{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'}, {'Name': 'AA', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'}]
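For reference, the loop in the question fails because iterating over a dictionary yields only its keys, so `for k, v in x[i]` tries to unpack each key string into two names and raises a ValueError. A minimal sketch of the same explicit loop using .items(), assuming (as the question states) that x and y are in the same order:

```python
x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started'},
     {'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'}]

# Pair the dictionaries by position, then copy every key/value from x into y.
for dx, dy in zip(x, y):
    for k, v in dx.items():  # .items() yields (key, value) pairs
        dy[k] = v
```

This modifies y in place, exactly like the update() version.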

If the dictionaries match up perfectly pairwise, see @Iguananaut's solution.
If not, and you have to match on the Name field, I'd suggest building an intermediate dict {Name: row} for each list, then iterating over the union of the names to collect whatever information exists in either one. This tolerates a Name that is missing from one of the lists.
x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started'},
     {'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'}]
prepare_x = {row['Name']: row for row in x}
prepare_y = {row['Name']: row for row in y}
result = [{**prepare_x.get(key, {}), **prepare_y.get(key, {})}
          for key in (prepare_x.keys() | prepare_y.keys())]
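One thing to keep in mind is that a set union has no guaranteed order, so the rows of result may come out in any order. A minimal sketch of a variant that keeps first-seen order, assuming Python 3.7+ (insertion-ordered dicts); the function name merge_by_name and the 'ZZ' row are made up for illustration:

```python
def merge_by_name(x, y, key="Name"):
    """Merge rows from both lists that share the same value under `key`."""
    merged = {}
    for row in x + y:
        # Create the entry on first sight, then layer later rows on top of it.
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())

x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'},
     {'Name': 'ZZ', 'Alias': 'red', 'Status': 'Stopped'}]  # 'SG' missing, 'ZZ' extra
result = merge_by_name(x, y)
```

As with the {**a, **b} version, values from y win if both lists define the same field for a name.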

Related

merge the dictionaries of sub fields into a single dictionary

Imagine I have the following list of dictionaries. For every record (row of data), I want to merge the dictionaries of sub-fields into a single dictionary, so that in the end I have a list of dictionaries, one per record.
Data = [{'Name': 'bob', 'age': '40'},
        {'Name': 'tom', 'age': '30'},
        {'Country': 'US', 'City': 'Boston'},
        {'Country': 'US', 'City': 'New York'},
        {'Email': 'bob#fake.com', 'Phone': 'bob phone'},
        {'Email': 'tom#fake.com', 'Phone': 'none'}]
Output = [
    {'Name': 'bob', 'age': '40', 'Country': 'US', 'City': 'Boston', 'Email': 'bob#fake.com', 'Phone': 'bob phone'},
    {'Name': 'tom', 'age': '30', 'Country': 'US', 'City': 'New York', 'Email': 'tom#fake.com', 'Phone': 'none'}
]
Related: How do I merge a list of dicts into a single dict?
I understand you know which dictionary relates to Bob and which dictionary relates to Tom by their position: dictionaries at even positions relate to Bob, while dictionaries at odd positions relate to Tom.
You can check whether a number is odd or even using % 2:
Data = [{'Name': 'bob', 'age': '40'},
        {'Name': 'tom', 'age': '30'},
        {'Country': 'US', 'City': 'Boston'},
        {'Country': 'US', 'City': 'New York'},
        {'Email': 'bob#fake.com', 'Phone': 'bob phone'},
        {'Email': 'tom#fake.com', 'Phone': 'none'}]
bob_dict = {}
tom_dict = {}
for i, d in enumerate(Data):
    if i % 2 == 0:
        bob_dict.update(d)
    else:
        tom_dict.update(d)
Output = [bob_dict, tom_dict]
Or alternatively:
Output = [{}, {}]
for i, d in enumerate(Data):
    Output[i % 2].update(d)
This second approach is not only shorter to write, it's also faster to execute and easier to scale if you have more than 2 people.
Splitting the list into more than 2 dictionaries
k = 4 # number of dictionaries you want
Data = [{'Name': 'Alice', 'age': '40'},
        {'Name': 'Bob', 'age': '30'},
        {'Name': 'Charlie', 'age': '30'},
        {'Name': 'Diane', 'age': '30'},
        {'Country': 'US', 'City': 'Boston'},
        {'Country': 'US', 'City': 'New York'},
        {'Country': 'UK', 'City': 'London'},
        {'Country': 'UK', 'City': 'Oxford'},
        {'Email': 'alice#fake.com', 'Phone': 'alice phone'},
        {'Email': 'bob#fake.com', 'Phone': '12345'},
        {'Email': 'charlie#fake.com', 'Phone': '0000000'},
        {'Email': 'diane#fake.com', 'Phone': 'none'}]
Output = [{} for j in range(k)]
for i, d in enumerate(Data):
    Output[i % k].update(d)
# Output = [
# {'Name': 'Alice', 'age': '40', 'Country': 'US', 'City': 'Boston', 'Email': 'alice#fake.com', 'Phone': 'alice phone'},
# {'Name': 'Bob', 'age': '30', 'Country': 'US', 'City': 'New York', 'Email': 'bob#fake.com', 'Phone': '12345'},
# {'Name': 'Charlie', 'age': '30', 'Country': 'UK', 'City': 'London', 'Email': 'charlie#fake.com', 'Phone': '0000000'},
# {'Name': 'Diane', 'age': '30', 'Country': 'UK', 'City': 'Oxford', 'Email': 'diane#fake.com', 'Phone': 'none'}
#]
Additionally, instead of hardcoding k = 4:
If you know the number of fields but not the number of people, you can compute k by dividing the initial number of dictionaries by the number of dictionary types:
fields = ['Name', 'Country', 'Email']
assert(len(Data) % len(fields) == 0) # make sure Data is consistent with number of fields
k = len(Data) // len(fields)
Or alternatively, you can compute k by counting how many dictionaries contain the 'Name' field:
k = sum(1 for d in Data if 'Name' in d)
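Putting the pieces together, here is a minimal sketch that derives k from the 'Name' count and then merges by position. The function name merge_interleaved and its first_field parameter are made up for illustration, and it assumes the data is laid out in equally sized blocks as in the examples above:

```python
def merge_interleaved(data, first_field="Name"):
    """Merge block-ordered dictionaries into one dictionary per record."""
    # One record per dictionary that carries the identifying field.
    k = sum(1 for d in data if first_field in d)
    if k == 0 or len(data) % k != 0:
        raise ValueError("data is not made of equally sized blocks")
    output = [{} for _ in range(k)]
    for i, d in enumerate(data):
        output[i % k].update(d)  # position within the block selects the record
    return output

Data = [{'Name': 'bob', 'age': '40'},
        {'Name': 'tom', 'age': '30'},
        {'Country': 'US', 'City': 'Boston'},
        {'Country': 'US', 'City': 'New York'}]
Output = merge_interleaved(Data)
```

Raising early on inconsistent input is a design choice here; silently merging misaligned blocks would attach fields to the wrong person.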

create dictionary of values based on matching keys in list from nested dictionary

I have a nested dictionary called mainlookup with up to 300 items, from TYPE1 to TYPE300:
mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
              'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
              'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
I also have an input list whose items I want to look up, based on the strings TYPE1, TYPE2 and so on:
input_list = ['thissong-fav-user:type1-chan-44-John',
              'thissong-fav-user:type1-chan-45-kelly-md',
              'thissong-fav-user:type2-rock-45-usa',
              'thissong-fav-user:type737-chan-45-patrick-md',
              'thissong-fav-user:type37-chan-45-kelly-md']
I want to find the TYPE string in each item of input_list and then create a dictionary as shown below:
Output_Desired = {'thissong-fav-user:type1-chan-44-John': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
                  'thissong-fav-user:type1-chan-45-kelly-md': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
                  'thissong-fav-user:type2-rock-45-usa': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
                  'thissong-fav-user:type37-chan-45-kelly-md': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
Note: thissong-fav-user:type737-chan-45-patrick-md in the list has no match, so I want to create a separate list for items that are not found in mainlookup:
Notfound_list = ['thissong-fav-user:type737-chan-45-patrick-md', and so on..]
I'd appreciate your help.
You can try this:
mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
              'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
              'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
input_list = ['thissong-fav-user:type1-chan-44-John',
              'thissong-fav-user:type1-chan-45-kelly-md',
              'thissong-fav-user:type737-chan-45-kelly-md']
dct = {i: mainlookup[i.split(':')[1].split('-')[0].upper()]
       for i in input_list
       if i.split(':')[1].split('-')[0].upper() in mainlookup}
Notfoundlist = [i for i in input_list if i not in dct]
print(dct)
print(Notfoundlist)
Output:
{'thissong-fav-user:type1-chan-44-John': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}], 'thissong-fav-user:type1-chan-45-kelly-md': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}]}
['thissong-fav-user:type737-chan-45-kelly-md']
An answer using regular expressions:
import re
from pprint import pprint
input_list = ['thissong-fav-user:type1-chan-44-John', 'thissong-fav-user:type1-chan-45-kelly-md', 'thissong-fav-user:type2-rock-45-usa', 'thissong-fav-user:type737-chan-45-patrick-md', 'thissong-fav-user:type37-chan-45-kelly-md']
mainlookup = {'TYPE2': {'Song': 'Reggaeton', 'Type': 'Hard', 'Price': '30'}, 'TYPE1': {'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}, 'TYPE737': {'Song': 'Jazz', 'Type': 'Hard', 'Price': '99'}, 'TYPE37': {'Song': 'Rock', 'Type': 'Soft', 'Price': '1'}}
pattern = re.compile('type[0-9]+')
matches = [re.search(pattern, x).group(0) for x in input_list]
result = {x: [mainlookup[matches[i].upper()]] for i, x in enumerate(input_list)}
pprint(result)
Output:
{'thissong-fav-user:type1-chan-44-John': [{'Price': '10',
                                           'Song': 'Rock',
                                           'Type': 'Hard'}],
 'thissong-fav-user:type1-chan-45-kelly-md': [{'Price': '10',
                                               'Song': 'Rock',
                                               'Type': 'Hard'}],
 'thissong-fav-user:type2-rock-45-usa': [{'Price': '30',
                                          'Song': 'Reggaeton',
                                          'Type': 'Hard'}],
 'thissong-fav-user:type37-chan-45-kelly-md': [{'Price': '1',
                                                'Song': 'Rock',
                                                'Type': 'Soft'}],
 'thissong-fav-user:type737-chan-45-patrick-md': [{'Price': '99',
                                                   'Song': 'Jazz',
                                                   'Type': 'Hard'}]}
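Note that the regex version uses its own mainlookup (it contains TYPE737 and stores plain dictionaries), and it would raise an exception for an item with no type token or with a type that is absent from the lookup. A minimal sketch that keeps the regex idea but also collects the unmatched items, assuming the mainlookup from the question; the name split_by_lookup is made up for illustration:

```python
import re

TYPE_PATTERN = re.compile(r'type[0-9]+')

def split_by_lookup(input_list, mainlookup):
    """Return (found, not_found) for the items of input_list."""
    found, not_found = {}, []
    for item in input_list:
        match = TYPE_PATTERN.search(item)
        # No type token at all is treated the same as an unknown type.
        key = match.group(0).upper() if match else None
        if key in mainlookup:
            found[item] = mainlookup[key]
        else:
            not_found.append(item)
    return found, not_found

mainlookup = {'TYPE1': [{'Song': 'Rock', 'Type': 'Hard', 'Price': '10'}],
              'TYPE2': [{'Song': 'Jazz', 'Type': 'Slow', 'Price': '5'}],
              'TYPE37': [{'Song': 'Country', 'Type': 'Fast', 'Price': '7'}]}
input_list = ['thissong-fav-user:type1-chan-44-John',
              'thissong-fav-user:type2-rock-45-usa',
              'thissong-fav-user:type737-chan-45-patrick-md',
              'thissong-fav-user:type37-chan-45-kelly-md']
found, not_found = split_by_lookup(input_list, mainlookup)
```

Because the pattern grabs the whole run of digits, type737 is looked up as TYPE737 and is not confused with TYPE7 or TYPE37.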

How to get/filter values in python3 json list dictionary response?

Below is the result I got from an API query:
[{'type':'book','title': 'example1', 'id': 12456, 'price': '8.20', 'qty': '12', 'status': 'available'},
{'type':'book','title': 'example2', 'id': 12457, 'price': '10.50', 'qty': '5', 'status': 'none'}]
How do I get only the title, price, and status key/value pairs in code?
So the result would look like:
[{'title': 'example1', 'price': '8.20', 'status': 'available'},
{'title': 'example2', 'price': '10.50', 'status': 'none'}]
You can use a dictionary comprehension within a list comprehension:
L = [{'type':'book','title': 'example1', 'id': 12456, 'price': '8.20', 'qty': '12', 'status': 'available'},
{'type':'book','title': 'example2', 'id': 12457, 'price': '10.50', 'qty': '5', 'status': 'none'}]
keys = ['title', 'price', 'status']
res = [{k: d[k] for k in keys} for d in L]
print(res)
[{'price': '8.20', 'status': 'available', 'title': 'example1'},
{'price': '10.50', 'status': 'none', 'title': 'example2'}]
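The comprehension above raises a KeyError if any record lacks one of the requested keys, which can happen with real API responses. A minimal sketch of a tolerant variant that skips absent keys; the function name pick_keys and the incomplete second record are made up for illustration:

```python
def pick_keys(records, keys):
    """Keep only the requested keys in each record, skipping absent ones."""
    return [{k: d[k] for k in keys if k in d} for d in records]

L = [{'type': 'book', 'title': 'example1', 'id': 12456, 'price': '8.20', 'qty': '12', 'status': 'available'},
     {'type': 'book', 'title': 'example2', 'id': 12457, 'qty': '5'}]  # no price or status
res = pick_keys(L, ['title', 'price', 'status'])
```

If a missing key should be treated as an error instead, the original d[k] form is the better choice.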

How to group an array by multiple keys?

I'd like a function that can group a list of dictionaries into sublists of dictionaries depending on an arbitrary set of keys that all dictionaries have in common.
For example, I'd like the following list to be grouped into sublists of dictionaries depending on a certain set of keys
l = [{'name':'b','type':'new','color':'blue','amount':100},{'name':'c','type':'new','color':'red','amount':100},{'name':'d','type':'old','color':'gold','amount':100},{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100},{'name':'g','type':'normal','color':'red','amount':100}]
If I wanted to group by type, the following list would result, where each sublist contains dictionaries of the same type:
[[{'name':'b','type':'new','color':'blue','amount':100},{'name':'c','type':'new','color':'red','amount':100}],[{'name':'d','type':'old','color':'gold','amount':100},{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100}],[{'name':'g','type':'normal','color':'red','amount':100}]]
If I wanted to group by type and color, the following would result where the list contains sublists that have the same type and color:
[[{'name':'b','type':'new','color':'blue','amount':100}],[{'name':'c','type':'new','color':'red','amount':100}],[{'name':'d','type':'old','color':'gold','amount':100}],[{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100}],[{'name':'g','type':'normal','color':'red','amount':100}]]
I understand the following function can group by one key, but I'd like to group by multiple keys:
import itertools
import operator
def group_by_key(l, i):
    return [list(grp) for key, grp in itertools.groupby(sorted(l, key=operator.itemgetter(i)), key=operator.itemgetter(i))]
This is my attempt using the group_by_key function above:
def group_by_multiple_keys(l, *keys):
    for key in keys:
        l = group_by_key(l, key)
        l = [item for sublist in l for item in sublist]
    return l
The issue there is that it ungroups it right after it grouped it by a key. Instead, I'd like to re-group it by another key and still have one list of sublists.
itertools.groupby() + operator.itemgetter() will do what you want. groupby() takes an iterable and a key function, and groups the items in the iterable by the value returned by passing each item to the key function. itemgetter() is a factory that returns a function, which gets the specified items from any item passed to it.
from __future__ import print_function
import pprint
from itertools import groupby
from operator import itemgetter
def group_by_keys(iterable, keys):
    key_func = itemgetter(*keys)
    # For groupby() to do what we want, the iterable needs to be sorted
    # by the same key function that we're grouping by.
    sorted_iterable = sorted(iterable, key=key_func)
    return [list(group) for key, group in groupby(sorted_iterable, key_func)]
dicts = [
    {'name': 'b', 'type': 'new', 'color': 'blue', 'amount': 100},
    {'name': 'c', 'type': 'new', 'color': 'red', 'amount': 100},
    {'name': 'd', 'type': 'old', 'color': 'gold', 'amount': 100},
    {'name': 'e', 'type': 'old', 'color': 'red', 'amount': 100},
    {'name': 'f', 'type': 'old', 'color': 'red', 'amount': 100},
    {'name': 'g', 'type': 'normal', 'color': 'red', 'amount': 100}
]
Examples (the groups come out in sorted key order, so 'normal' sorts between 'new' and 'old'):
>>> pprint.pprint(group_by_keys(dicts, ('type',)))
[[{'amount': 100, 'color': 'blue', 'name': 'b', 'type': 'new'},
  {'amount': 100, 'color': 'red', 'name': 'c', 'type': 'new'}],
 [{'amount': 100, 'color': 'red', 'name': 'g', 'type': 'normal'}],
 [{'amount': 100, 'color': 'gold', 'name': 'd', 'type': 'old'},
  {'amount': 100, 'color': 'red', 'name': 'e', 'type': 'old'},
  {'amount': 100, 'color': 'red', 'name': 'f', 'type': 'old'}]]
>>>
>>> pprint.pprint(group_by_keys(dicts, ('type', 'color')))
[[{'amount': 100, 'color': 'blue', 'name': 'b', 'type': 'new'}],
 [{'amount': 100, 'color': 'red', 'name': 'c', 'type': 'new'}],
 [{'amount': 100, 'color': 'red', 'name': 'g', 'type': 'normal'}],
 [{'amount': 100, 'color': 'gold', 'name': 'd', 'type': 'old'}],
 [{'amount': 100, 'color': 'red', 'name': 'e', 'type': 'old'},
  {'amount': 100, 'color': 'red', 'name': 'f', 'type': 'old'}]]
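Because group_by_keys sorts before grouping, the groups are returned in sorted key order rather than in the order the keys first appear in the input. If first-seen order matters, as in the expected output in the question, a minimal sketch using a plain dict avoids the sort. It assumes Python 3.7+ (insertion-ordered dicts) and hashable key values; the name group_by_keys_ordered is made up for illustration:

```python
def group_by_keys_ordered(iterable, keys):
    """Group dictionaries by the given keys, keeping first-seen group order."""
    groups = {}
    for item in iterable:
        group_key = tuple(item[k] for k in keys)  # e.g. ('old', 'red')
        groups.setdefault(group_key, []).append(item)
    return list(groups.values())

dicts = [
    {'name': 'b', 'type': 'new', 'color': 'blue', 'amount': 100},
    {'name': 'c', 'type': 'new', 'color': 'red', 'amount': 100},
    {'name': 'd', 'type': 'old', 'color': 'gold', 'amount': 100},
    {'name': 'e', 'type': 'old', 'color': 'red', 'amount': 100},
    {'name': 'f', 'type': 'old', 'color': 'red', 'amount': 100},
    {'name': 'g', 'type': 'normal', 'color': 'red', 'amount': 100},
]
by_type = group_by_keys_ordered(dicts, ('type',))
by_type_and_color = group_by_keys_ordered(dicts, ('type', 'color'))
```

This version also works when the key values cannot be compared with each other, since nothing is sorted.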

Duplicate python dict for each value

In a list containing dictionaries, how do I split it based on unique values of dictionaries? So for instance, this:
t = [
{'name': 'xyz', 'value': ['K','L', 'M', 'N']},
{'name': 'abc', 'value': ['O', 'P', 'K']}
]
becomes this:
t = [
{'name': 'xyz', 'value': 'K'},
{'name': 'xyz', 'value': 'L'},
{'name': 'xyz', 'value': 'M'},
{'name': 'xyz', 'value': 'N'},
{'name': 'abc', 'value': 'O'},
{'name': 'abc', 'value': 'P'},
{'name': 'abc', 'value': 'K'}
]
You can do this with a list comprehension. Iterate through each dictionary d, and create a new dictionary for each value in d['value']:
>>> t = [ dict(name=d['name'], value=v) for d in t for v in d['value'] ]
>>> t
[{'name': 'xyz', 'value': 'K'},
{'name': 'xyz', 'value': 'L'},
{'name': 'xyz', 'value': 'M'},
{'name': 'xyz', 'value': 'N'},
{'name': 'abc', 'value': 'O'},
{'name': 'abc', 'value': 'P'},
{'name': 'abc', 'value': 'K'}]
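The comprehension above copies only the name field. If the dictionaries carry other fields that should be duplicated as well, a minimal sketch using dict unpacking keeps them all; the function name explode and the extra 'id' field are made up for illustration:

```python
def explode(records, key):
    """Emit one copy of each record per element of the list stored under `key`."""
    # {**d, key: v} copies every field of d, then replaces the list with one value.
    return [{**d, key: v} for d in records for v in d[key]]

t = [
    {'name': 'xyz', 'id': 1, 'value': ['K', 'L']},
    {'name': 'abc', 'id': 2, 'value': ['O']},
]
flat = explode(t, 'value')
```

The original list t is left unchanged, because each output row is a new dictionary.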
