How to combine two nested dictionaries with same master keys - python

I have two nested dicts with same master keys:
dict1 = {'person1': {'name': 'John', 'sex': 'Male'},
'person2': {'name': 'Marie', 'sex': 'Female'},
'person3': {'name': 'Luna', 'sex': 'Female'},
'person4': {'name': 'Peter', 'sex': 'Male'}}
dict2 = {'person1': {'weight': '81.1', 'age': '27'},
'person2': {'weight': '56.7', 'age': '22'},
'person3': {'weight': '63.4', 'age': '24'},
'person4': {'weight': '79.1', 'age': '29'}}
I want to enrich dict1 with the key-value pairs from dict2.
I'm able to do so with a for loop...
for key in dict2:
    dict1[key]['weight'] = dict2[key]['weight']
    dict1[key]['age'] = dict2[key]['age']
Result:
dict1 = {'person1': {'name': 'John', 'sex': 'Male', 'weight': '81.1', 'age': '27'},
'person2': {'name': 'Marie', 'sex': 'Female', 'weight': '56.7', 'age': '22'},
'person3': {'name': 'Luna', 'sex': 'Female', 'weight': '63.4', 'age': '24'},
'person4': {'name': 'Peter', 'sex': 'Male', 'weight': '79.1', 'age': '29'}}
...but is there a more pythonic way to do so - e.g. with dict comprehension?

Yes:
dict3 = {k: {**v, **dict2[k]} for k, v in dict1.items()}
First, use .items() to iterate over keys and values at the same time.
Then, for each key k, the value is a new dict built by unpacking both v and dict2[k] into it. If the two inner dicts share a key, the value from dict2[k] wins.
UPDATE for Python >= 3.9:
Thanks to mwo for mentioning the pipe (|) operator:
dict3 = {k: v | dict2[k] for k, v in dict1.items()}
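Both comprehensions above assume that every key of dict1 also exists in dict2; a missing key raises KeyError. Here is a minimal sketch that tolerates missing keys, assuming an absent entry should simply contribute nothing. The shortened sample data and the gap for 'person3' are made up for illustration:

```python
dict1 = {'person1': {'name': 'John', 'sex': 'Male'},
         'person3': {'name': 'Luna', 'sex': 'Female'}}
dict2 = {'person1': {'weight': '81.1', 'age': '27'}}  # no 'person3' here (assumed gap)

# dict2.get(k, {}) falls back to an empty dict, so nothing is added for missing keys
dict3 = {k: {**v, **dict2.get(k, {})} for k, v in dict1.items()}
print(dict3)
```

Keys that exist only in dict2 are still ignored here, because the comprehension iterates over dict1.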

If you have control over the data source, flatten the dictionaries and then use the update method. For example:
dict1 = {('person1', 'name'): 'John'}
dict2 = {('person1', 'weight'): 81.1}
dict1.update(dict2)
>>> dict1
{('person1', 'name'): 'John',
('person1', 'weight'): 81.1}
It is much easier to deal with this kind of data structure, but if you are stuck with nested dictionaries you can use a NestedDict to achieve the same result with a similar interface.
from ndicts import NestedDict
nd1 = NestedDict(dict1)
nd2 = NestedDict(dict2)
nd1.update(nd2)
>>> nd1
NestedDict(
{'person1': {'name': 'John', 'weight': 81.1}}
)
Use nd1.to_dict() if you need the result as a dictionary.
To install ndicts, run pip install ndicts.
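If the data already arrives nested, the flattening step could be done by hand. The following is only a sketch: flatten is a hypothetical helper written for this example, not part of ndicts, and it assumes every non-dict value is a leaf.

```python
def flatten(nested, prefix=()):
    """Turn {'a': {'b': 1}} into {('a', 'b'): 1} (hypothetical helper)."""
    flat = {}
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Recurse into inner dicts; note that empty inner dicts vanish
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

dict1 = {'person1': {'name': 'John'}}
dict2 = {'person1': {'weight': 81.1}}

flat = flatten(dict1)
flat.update(flatten(dict2))
print(flat)
```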

Related

Filter nested dictionary in Python

I'm trying to remove key: value pairs from a nested dictionary within a nested dictionary, based on the value of a value within the double-nested dict.
The dictionary looks something like this, and I want to filter out entire entries of people with an age under 25 years old (while I do not want to filter out the outermost dictionary, so the "people group" one).
# Make a nested dictionary for test
people = {0:{1:{'name': 'John', 'age': '27', 'gender': 'Male'},
2: {'name': 'Marie', 'age': '22', 'gender': 'Female'},
3: {'name': 'Nicola', 'age': '19', 'gender': 'Non-binary'},
4: {'name': 'Garfield', 'age': '32', 'gender': 'Male'}},
1:{1:{'name': 'Katie', 'age': '24', 'gender': 'Male'},
2: {'name': 'Marigold', 'age': '42', 'gender': 'Female'},
3: {'name': 'James', 'age': '10', 'gender': 'Non-binary'},
4: {'name': 'Precious', 'age': '35', 'gender': 'Male'}}}
I have found my way to this thread, which is somewhat similar, although there's only one layer of "nestedness" there.
From it, I learnt that I could do something like this to filter keys with too low values tied to them, if my dictionary had only been nested one round:
{i:j for i,j in people.items() if j.get('age',0) >='25'}
How can I reach the element within a double-nested dictionary like this, and then remove the whole "single-nested dictionary", but keep the outermost one?
You can use a nested dict comprehension. The ages are stored as strings, so convert them with int() before comparing:
>>> {gid: {uid: user for uid, user in pg.items() if int(user.get('age', 0)) >= 25} for gid, pg in people.items()}
{0: {1: {'name': 'John', 'age': '27', 'gender': 'Male'},
4: {'name': 'Garfield', 'age': '32', 'gender': 'Male'}},
1: {2: {'name': 'Marigold', 'age': '42', 'gender': 'Female'},
4: {'name': 'Precious', 'age': '35', 'gender': 'Male'}}}
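The comprehension above raises ValueError if an age cannot be parsed as an integer. Here is a hedged sketch of a more defensive variant; filter_by_min_age is a hypothetical name, the sample data is shortened and made up, and treating unparseable ages as "too young" is an assumption you may want to change:

```python
def filter_by_min_age(groups, min_age):
    """Keep people whose 'age' parses as an int >= min_age; keep every group key."""
    def old_enough(person):
        try:
            return int(person.get('age', 0)) >= min_age
        except (TypeError, ValueError):
            return False  # assumption: unparseable ages are filtered out

    return {gid: {uid: p for uid, p in members.items() if old_enough(p)}
            for gid, members in groups.items()}

people = {0: {1: {'name': 'John', 'age': '27'},
              2: {'name': 'Marie', 'age': 'unknown'}},
          1: {1: {'name': 'James', 'age': '10'}}}

result = filter_by_min_age(people, 25)
print(result)
```

Group 1 stays in the result as an empty dict, which matches the requirement to keep the outermost level.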

Get specific nested key/values based on a condition from a Python nested dictionary

I'm stuck parsing the below python nested dictionary based on the nested key. I want to filter a key's value and return all the nested key/values related to that.
{ 'US': { 'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}},
'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}},
'Ohio': {'some city': {3: {'name': 'Luna', 'age': '24', 'gender': 'Female', 'married': 'No'}}},
'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}
For instance, filtering on gender "Male" should return the below:
US
Washington
Seattle
1
name:John
age: 27
US
Nevada
somecity
4
name:Peter
age: 29
married: Yes
Can you please suggest the best way to parse it? I tried to use contains within a loop, but that doesn't seem to work.
We can recursively explore the dict structure, keeping track of the path of keys at each point. When we reach a dict containing the target value, we yield the path and the content of the dict.
We can use this generator:
def recursive_search(dct, target, path=None):
    if path is None:
        path = []
    if target in dct.values():
        out = ' '.join(path) + ' ' + ' '.join(f'{key}:{value}' for key, value in dct.items())
        yield out
    else:
        for key, value in dct.items():
            if isinstance(value, dict):
                yield from recursive_search(value, target, path+[str(key)])
this way:
data = { 'US': { 'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}},
'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}},
'Ohio': {'some city': {3: {'name': 'Luna', 'age': '24', 'gender': 'Female', 'married': 'No'}}},
'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}
for match in recursive_search(data, 'Male'):
    print(match)
# US Washington Seattle 1 name:John age:27 gender:Male
# US Nevada some city 4 name:Peter age:29 gender:Male married:Yes
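The generator above matches the target against any value, so a record whose name happened to be 'Male' would match too. A possible variant, sketched here under the assumption that you want to match on one specific key, yields the path and the record instead of a formatted string. find_records and the shortened sample data are hypothetical:

```python
def find_records(dct, key, value, path=()):
    """Yield (path, record) for every nested dict where record[key] == value."""
    if key in dct and dct[key] == value:
        yield path, dct
    else:
        for k, v in dct.items():
            if isinstance(v, dict):
                yield from find_records(v, key, value, path + (k,))

data = {'US': {'Washington': {'Seattle': {1: {'name': 'John', 'gender': 'Male'}}},
               # Made-up edge case: the name is 'Male' but the gender is not
               'Florida': {'some city': {2: {'name': 'Male', 'gender': 'Female'}}}}}

matches = list(find_records(data, 'gender', 'Male'))
for path, record in matches:
    print(path, record)
```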
This code will work:
a_dict={ 'US': { 'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}}, 'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}}, 'Ohio': {'some city': {3: {'name': 'Luna', 'age': '24', 'gender': 'Female', 'married': 'No'}}}, 'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}
for k,v in a_dict.items():
    for k1,v1 in v.items():
        for k2,v2 in v1.items():
            for k3,v3 in v2.items():
                if v3["gender"]=="Male":
                    string=""
                    for k4,v4 in v3.items():
                        string=string+ k4+":"+v4+" "
                    print(k,k1,k2,k3, string.strip())

Extracting key value pair and transpose nested dict

From the data below:
people = {1: {'name': 'John', 'age': '27', 'sex': 'Male'},
2: {'name': 'Marie', 'age': '22', 'sex': 'Female'},
3: {'name': 'Luna', 'age': '24', 'sex': 'Female'},
4: {'name': 'Peter', 'age': '29', 'sex': 'Male'}}
How do I extract all the names, e.g. ['John','Marie','Luna','Peter']?
How do I transpose this dict to get something like the following?
new_dict = {name: {'John','Marie','Luna','Peter'},
age:{'27','22','24','29'},
sex:{'Male','Female','Female','Male'}}
Create a dataframe from your dict like:
import pandas as pd
df = pd.DataFrame.from_dict(people)
Transpose the dataframe
df2 = df.T
Convert the dataframe to dict
df2.to_dict()
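If pandas is not needed for anything else, both parts of the question can also be handled in plain Python. This is a sketch, not the answerer's method; it collects the values in lists rather than sets, because a set would drop duplicates such as the repeated 'Male' and 'Female':

```python
people = {1: {'name': 'John', 'age': '27', 'sex': 'Male'},
          2: {'name': 'Marie', 'age': '22', 'sex': 'Female'},
          3: {'name': 'Luna', 'age': '24', 'sex': 'Female'},
          4: {'name': 'Peter', 'age': '29', 'sex': 'Male'}}

# Part 1: all names, in insertion order
names = [person['name'] for person in people.values()]

# Part 2: "transpose" so each field maps to the list of its values
transposed = {}
for person in people.values():
    for field, value in person.items():
        transposed.setdefault(field, []).append(value)

print(names)
print(transposed)
```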

How to reorder a list in Python based on its content

I have a list of dictionaries in python like this;
l = [{'name': 'John', 'age': 23},
{'name': 'Steve', 'age': 35},
{'name': 'Helen'},
{'name': 'George'},
{'name': 'Jessica', 'age': 23}]
What I am trying to achieve here is to reorder the elements of l so that each entry containing the key age moves to the end of the list, like this:
End result:
l = [{'name': 'Helen'},
{'name': 'George'},
{'name': 'Jessica', 'age': 23},
{'name': 'John', 'age': 23},
{'name': 'Steve', 'age': 35}]
How can I do this?
You can sort the list:
l.sort(key=lambda d: 'age' in d)
The key returns either True or False, based on the presence of the 'age' key; True is sorted after False. Python's sort is stable, leaving the rest of the relative ordering intact.
Demo:
>>> from pprint import pprint
>>> l = [{'name': 'John', 'age': 23},
... {'name': 'Steve', 'age': 35},
... {'name': 'Helen'},
... {'name': 'George'},
... {'name': 'Jessica', 'age': 23}]
>>> l.sort(key=lambda d: 'age' in d)
>>> pprint(l)
[{'name': 'Helen'},
{'name': 'George'},
{'age': 23, 'name': 'John'},
{'age': 35, 'name': 'Steve'},
{'age': 23, 'name': 'Jessica'}]
If you also want to sort by age, return the age value from the key function, with a sentinel for the entries that have no age. float('-inf') compares lower than any other number, so those entries are sorted first:
l.sort(key=lambda d: d.get('age', float('-inf')))
Again, entries without an age are left in their original relative order:
>>> l.sort(key=lambda d: d.get('age', float('-inf')))
>>> pprint(l)
[{'name': 'Helen'},
{'name': 'George'},
{'age': 23, 'name': 'John'},
{'age': 23, 'name': 'Jessica'},
{'age': 35, 'name': 'Steve'}]
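The same idea can be turned around. As a sketch that assumes you would rather have the entries without an age at the end, a tuple key sorts first on whether the age is missing and then on the age itself; sorted() is used here so the original list is left untouched:

```python
l = [{'name': 'John', 'age': 23},
     {'name': 'Steve', 'age': 35},
     {'name': 'Helen'},
     {'name': 'George'},
     {'name': 'Jessica', 'age': 23}]

# False sorts before True, so entries that have an age come first;
# the second tuple element orders them by age (0 is an unused placeholder
# for entries without an age, which all share the same key)
by_age_missing_last = sorted(l, key=lambda d: ('age' not in d, d.get('age', 0)))

print([d['name'] for d in by_age_missing_last])
```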

how to convert list of dict to dict

How to convert list of dict to dict. Below is the list of dict
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
to
data = {'John Doe': {'name': 'John Doe', 'age': 37, 'sex': 'M'},
'Lisa Simpson': {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
'Bill Clinton': {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}}
A possible solution using names as the new keys:
new_dict = {}
for item in data:
    name = item['name']
    new_dict[name] = item
With Python 2.7+ and 3.x you can also use a dict comprehension to do the same thing more concisely:
new_dict = {item['name']:item for item in data}
As suggested in a comment by Paul McGuire, if you don't want the name in the inner dict, you can do:
new_dict = {}
for item in data:
    name = item.pop('name')
    new_dict[name] = item
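Note that pop modifies the dicts inside data in place. If the original list has to stay intact, one possible sketch builds a filtered copy of each inner dict instead:

```python
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
        {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
        {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]

# Copy every field except 'name' into a new inner dict; data is not modified
new_dict = {item['name']: {k: v for k, v in item.items() if k != 'name'}
            for item in data}

print(new_dict)
```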
With python 3.3 and above, you can use ChainMap
A ChainMap groups multiple dicts or other mappings together to create
a single, updateable view. If no maps are specified, a single empty
dictionary is provided so that a new chain always has at least one
mapping.
from collections import ChainMap
data = dict(ChainMap(*data))
If the dicts didn't share keys, you could use:
dict((key,d[key]) for d in data for key in d)
In your case it's probably better to generate a dict with lists as values:
newdict = {}
for k, v in [(key, d[key]) for d in data for key in d]:
    if k not in newdict: newdict[k] = [v]
    else: newdict[k].append(v)
This yields:
>>> newdict
{'age': [37, 17, 57], 'name': ['John Doe', 'Lisa Simpson', 'Bill Clinton'], 'sex': ['M', 'F', 'M']}
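The same grouping can be written with collections.defaultdict, which removes the explicit "is the key already there" check. This is an alternative sketch, not the answerer's code:

```python
from collections import defaultdict

data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
        {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
        {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]

grouped = defaultdict(list)  # missing keys start out as an empty list
for d in data:
    for key, value in d.items():
        grouped[key].append(value)

newdict = dict(grouped)  # convert back to a plain dict if needed
print(newdict)
```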
Try this approach:
{key: val for k in data for key, val in k.items()}
Let's not over complicate this:
simple_dictionary = dict(data[0])
Perhaps you want the name to be the key? You don't really specify, since your second example is invalid and not really meaningful.
Note that my example removes the key "name" from the value, which may be desirable (or perhaps not).
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
newdata = {}
for entry in data:
    name = entry.pop('name')  # remove and return the name field to use as a key
    newdata[name] = entry
print(newdata)
##{'John Doe': {'age': 37, 'sex': 'M'},
## 'Lisa Simpson': {'age': 17, 'sex': 'F'},
## 'Bill Clinton': {'age': 57, 'sex': 'M'}}
print(newdata['John Doe']['age'])
## 37
import pandas as pd
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
print(pd.DataFrame(data).to_dict())
My 5 cents, since I didn't like any of the answers:
from functools import reduce
collection = [{'hello': 1}, {'world': 2}]
answer = reduce(lambda aggr, new: aggr.update(new) or aggr, collection, {})
Just in case you wanted a functional alternative (also assuming the names are wanted as the new keys), you could do
from toolz.curried import *
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
newdata = pipe(data,
               map(lambda x: {x['name']: dissoc(x, 'name')}),
               lambda x: merge(*x)
               )
print(newdata)
