Filter nested dictionary in Python - python

I'm trying to remove key: value pairs from a nested dictionary within a nested dictionary, based on the value of a value within the double-nested dict.
The dictionary looks something like this, and I want to filter out entire entries of people with an age under 25 years old (while I do not want to filter out the outermost dictionary, so the "people group" one).
# Make a nested dictionary for test
people = {0:{1:{'name': 'John', 'age': '27', 'gender': 'Male'},
2: {'name': 'Marie', 'age': '22', 'gender': 'Female'},
3: {'name': 'Nicola', 'age': '19', 'gender': 'Non-binary'},
4: {'name': 'Garfield', 'age': '32', 'gender': 'Male'}},
1:{1:{'name': 'Katie', 'age': '24', 'gender': 'Male'},
2: {'name': 'Marigold', 'age': '42', 'gender': 'Female'},
3: {'name': 'James', 'age': '10', 'gender': 'Non-binary'},
4: {'name': 'Precious', 'age': '35', 'gender': 'Male'}}}
I have found my way to this thread, which is somewhat similar, although there's only one layer of "nestedness" there.
From it, I learnt that I could do something like this to filter keys with too low values tied to them, if my dictionary had only been nested one round:
{i:j for i,j in people.items() if j.get('age',0) >='25'}
How can I reach the element within a double-nested dictionary like this, and then remove the whole "single-nested dictionary", but keep the outermost one?

You can use a nested dict comprehension:
>>> {gid: {uid: user for uid, user in pg.items() if int(user.get('age', 0)) >= 25} for gid, pg in people.items()}
{0: {1: {'name': 'John', 'age': '27', 'gender': 'Male'},
4: {'name': 'Garfield', 'age': '32', 'gender': 'Male'}},
1: {2: {'name': 'Marigold', 'age': '42', 'gender': 'Female'},
4: {'name': 'Precious', 'age': '35', 'gender': 'Male'}}}
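The int() conversion matters here because the ages are stored as strings, and strings compare character by character rather than numerically. Here is a minimal sketch of the pitfall; the 'Kid' entry is a made-up example, not part of the question's data:

```python
# Strings compare lexicographically, so '9' counts as "greater" than '25'.
print('9' >= '25')      # True: '9' sorts after '2'
print(int('9') >= 25)   # False: numeric comparison

# Small sample; 'Kid' is a hypothetical entry added to show the difference.
people = {0: {1: {'name': 'John', 'age': '27'},
              2: {'name': 'Kid', 'age': '9'}}}

# Same nested comprehension as above, converting each age before comparing.
adults = {gid: {uid: user for uid, user in group.items()
                if int(user.get('age', 0)) >= 25}
          for gid, group in people.items()}
print(adults)  # {0: {1: {'name': 'John', 'age': '27'}}}
```

With a plain string comparison, the 'Kid' entry would have been kept.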

Related

How to get maximum value of one entity in nested dictionary?

people = {1: {'Name': 'John', 'Age': '22', 'Sex': 'Male'}, 2: {'Name': 'Marie', 'Age': '26', 'Sex': 'Female'}, 3: {'Name': 'Marie', 'Age': '25', 'Sex': 'Female'}, 4: {'Name': 'Marie', 'Age': '21', 'Sex': 'Female'}}
I want to get the maximum value of 'Age'. Kindly help me how to do this.
You can use max with a defined key and lambda.
people = {1: {'Name': 'John', 'Age': '22', 'Sex': 'Male'}, 2: {'Name': 'Marie', 'Age': '26', 'Sex': 'Female'}, 3: {'Name': 'Marie', 'Age': '25', 'Sex': 'Female'}, 4: {'Name': 'Marie', 'Age': '21', 'Sex': 'Female'}}
max(people.values(), key=lambda x: int(x['Age']))
# {'Name': 'Marie', 'Age': '26', 'Sex': 'Female'}
max(people.values(), key=lambda x: int(x['Age']))['Age']
# '26'
If you want just the maximum value, max with a generator expression suffices:
people = {1: {'Name': 'John', 'Age': '22', 'Sex': 'Male'}, 2: {'Name': 'Marie', 'Age': '26', 'Sex': 'Female'}, 3: {'Name': 'Marie', 'Age': '25', 'Sex': 'Female'}, 4: {'Name': 'Marie', 'Age': '21', 'Sex': 'Female'}}
result = max(int(p['Age']) for p in people.values())
I prefer this over the lambda; however, if you want the dictionary that has the max Age, then @I'mahdi's answer is what you want.
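If the outer key of the oldest person is what you need, the same key-function idea can be applied to the dictionary's keys instead of its values. This is only a sketch; the name oldest_id is an illustrative choice:

```python
people = {1: {'Name': 'John', 'Age': '22', 'Sex': 'Male'},
          2: {'Name': 'Marie', 'Age': '26', 'Sex': 'Female'},
          3: {'Name': 'Marie', 'Age': '25', 'Sex': 'Female'},
          4: {'Name': 'Marie', 'Age': '21', 'Sex': 'Female'}}

# Iterating a dict yields its keys; the key function looks up each age.
oldest_id = max(people, key=lambda k: int(people[k]['Age']))
print(oldest_id)          # 2
print(people[oldest_id])  # {'Name': 'Marie', 'Age': '26', 'Sex': 'Female'}
```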

How to combine two nested dictionaries with same master keys

I have two nested dicts with same master keys:
dict1 = {'person1': {'name': 'John', 'sex': 'Male'},
'person2': {'name': 'Marie', 'sex': 'Female'},
'person3': {'name': 'Luna', 'sex': 'Female'},
'person4': {'name': 'Peter', 'sex': 'Male'}}
dict2 = {'person1': {'weight': '81.1', 'age': '27'},
'person2': {'weight': '56.7', 'age': '22'},
'person3': {'weight': '63.4', 'age': '24'},
'person4': {'weight': '79.1', 'age': '29'}}
So I want to enrich dict1 with the key-value pairs from dict2.
I'm able to do so with a for loop...
for key in dict2:
    dict1[key]['weight'] = dict2[key]['weight']
    dict1[key]['age'] = dict2[key]['age']
Result:
dict1 = {'person1': {'name': 'John', 'sex': 'Male', 'weight': '81.1', 'age': '27'},
'person2': {'name': 'Marie', 'sex': 'Female', 'weight': '56.7', 'age': '22'},
'person3': {'name': 'Luna', 'sex': 'Female', 'weight': '63.4', 'age': '24'},
'person4': {'name': 'Peter', 'sex': 'Male', 'weight': '79.1', 'age': '29'}}
...but is there a more pythonic way to do so - e.g. with dict comprehension?
Yes:
dict3 = {k: {**v, **dict2[k]} for k, v in dict1.items()}
First, use .items() to iterate over both keys and values at the same time.
Then, for each key k, the value is a new dict created by unpacking both v and dict2[k] into it. If both contain the same inner key, the value from dict2[k] wins.
UPDATE for Python >= 3.9:
Thanks @mwo for mentioning the pipe | operator:
dict3 = {k: v | dict2[k] for k, v in dict1.items()}
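Both comprehensions above raise KeyError if a key of dict1 is missing from dict2. Here is a hedged variant, assuming you want to keep such entries unchanged; the trimmed sample data is for illustration only:

```python
dict1 = {'person1': {'name': 'John', 'sex': 'Male'},
         'person2': {'name': 'Marie', 'sex': 'Female'}}
# 'person2' is deliberately missing here to show the fallback.
dict2 = {'person1': {'weight': '81.1', 'age': '27'}}

# dict2.get(k, {}) falls back to an empty dict, so nothing is merged in.
merged = {k: {**v, **dict2.get(k, {})} for k, v in dict1.items()}
print(merged)
# {'person1': {'name': 'John', 'sex': 'Male', 'weight': '81.1', 'age': '27'},
#  'person2': {'name': 'Marie', 'sex': 'Female'}}
```

Note that this builds new inner dicts and leaves dict1 and dict2 untouched.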
If you have control over the data source, flatten the dictionaries and then use the update method. For example:
dict1 = {('person1', 'name'): 'John'}
dict2 = {('person1', 'weight'): 81.1}
dict1.update(dict2)
>>> dict1
{('person1', 'name'): 'John',
('person1', 'weight'): 81.1}
It is much easier to deal with this kind of data structure, but if you are stuck with nested dictionaries you can use a NestedDict to achieve the same result with a similar interface.
from ndicts import NestedDict
nd1 = NestedDict(dict1)
nd2 = NestedDict(dict2)
nd1.update(nd2)
>>> nd1
NestedDict(
{'person1': {'name': 'John', 'weight': 81.1}}
)
Use nd1.to_dict() if you need the result as a dictionary.
To install ndicts, run pip install ndicts.

Get specific the nested key/values based on a condition from python nested dictionary

I'm stuck parsing the below python nested dictionary based on the nested key. I want to filter a key's value and return all the nested key/values related to that.
{ 'US': { 'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}},
'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}},
'Ohio': {'some city': {3: {'name': 'Luna', 'age': '24', 'gender': 'Female', 'married': 'No'}}},
'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}
For instance, filtering on gender "Male" should return the below:
US
Washington
Seattle
1
name:John
age: 27
US
Nevada
somecity
4
name:Peter
age: 29
married: Yes
Can you please suggest the best way to parse it? I tried using contains within a loop, but that doesn't seem to work.
We can recursively explore the dict structure, keeping track of the path of keys at each point. When we reach a dict containing the target value, we yield the path and the content of the dict.
We can use this generator:
def recursive_search(dct, target, path=None):
    if path is None:
        path = []
    if target in dct.values():
        out = ' '.join(path) + ' ' + ' '.join(f'{key}:{value}' for key, value in dct.items())
        yield out
    else:
        for key, value in dct.items():
            if isinstance(value, dict):
                yield from recursive_search(value, target, path+[str(key)])
this way:
data = { 'US': { 'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}},
'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}},
'Ohio': {'some city': {3: {'name': 'Luna', 'age': '24', 'gender': 'Female', 'married': 'No'}}},
'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}
for match in recursive_search(data, 'Male'):
    print(match)
# US Washington Seattle 1 name:John age:27 gender:Male
# US Nevada some city 4 name:Peter age:29 gender:Male married:Yes
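If you need the matches as data rather than as formatted strings, a variant of the same recursion can yield the key path and the matching record. This is a sketch under two assumptions: the function name find_records is made up for illustration, and it matches on one specific key (here 'gender') instead of on any value:

```python
def find_records(dct, key, target, path=()):
    """Yield (path, record) for every nested dict whose `key` equals `target`."""
    if dct.get(key) == target:
        yield path, dct
        return
    for k, v in dct.items():
        if isinstance(v, dict):
            # Extend the path with the current key and search one level deeper.
            yield from find_records(v, key, target, path + (k,))

data = {'US': {'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}},
               'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}},
               'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}

matches = list(find_records(data, 'gender', 'Male'))
for path, record in matches:
    print(path, record['name'])
# ('US', 'Washington', 'Seattle', 1) John
# ('US', 'Nevada', 'some city', 4) Peter
```

Matching on a named key avoids false positives when the target string happens to appear under a different field.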
This code will also work, provided the records always sit exactly four levels deep:
a_dict={ 'US': { 'Washington': {'Seattle': {1: {'name': 'John', 'age': '27', 'gender': 'Male'}}}, 'Florida': {'some city': {2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}}}, 'Ohio': {'some city': {3: {'name': 'Luna', 'age': '24', 'gender': 'Female', 'married': 'No'}}}, 'Nevada': {'some city': {4: {'name': 'Peter', 'age': '29', 'gender': 'Male', 'married': 'Yes'}}}}}
for k,v in a_dict.items():
    for k1,v1 in v.items():
        for k2,v2 in v1.items():
            for k3,v3 in v2.items():
                if v3["gender"]=="Male":
                    string=""
                    for k4,v4 in v3.items():
                        string=string+ k4+":"+v4+" "
                    print(k,k1,k2,k3, string.strip())

Extracting key value pair and transpose nested dict

Given the data below:
people = {1: {'name': 'John', 'age': '27', 'sex': 'Male'},
2: {'name': 'Marie', 'age': '22', 'sex': 'Female'},
3: {'name': 'Luna', 'age': '24', 'sex': 'Female'},
4: {'name': 'Peter', 'age': '29', 'sex': 'Male'}}
How do I extract all the names, e.g. ['John','Marie','Luna','Peter']?
How do I transpose this dict to get something like the below?
new_dict = {name: {'John','Marie','Luna','Peter'},
age:{'27','22','24','29'},
sex:{'Male','Female','Female','Male'}}
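Create a dataframe from your dict:
import pandas as pd
df = pd.DataFrame.from_dict(people)
Transpose the dataframe so that each person becomes a row:
df2 = df.T
Convert the dataframe to a dict (to_dict is a method and must be called; orient 'list' gives one list of values per column):
df2.to_dict('list')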
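If pandas is not available, both steps can also be done in plain Python. This is a minimal sketch that collects the values into lists, since sets (as written in the question's desired output) would drop duplicates such as the second 'Female' and lose the ordering:

```python
people = {1: {'name': 'John', 'age': '27', 'sex': 'Male'},
          2: {'name': 'Marie', 'age': '22', 'sex': 'Female'},
          3: {'name': 'Luna', 'age': '24', 'sex': 'Female'},
          4: {'name': 'Peter', 'age': '29', 'sex': 'Male'}}

# Extract a single field from every record.
names = [person['name'] for person in people.values()]
print(names)  # ['John', 'Marie', 'Luna', 'Peter']

# Transpose: one list per field, in the order the people appear.
transposed = {}
for person in people.values():
    for field, value in person.items():
        transposed.setdefault(field, []).append(value)
print(transposed['age'])  # ['27', '22', '24', '29']
```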

How do I remove duplicate lines in a list using a function?

def remove_repeated_lines(data):
    lines_seen = set() # holds lines already seen
    d=[]
    for t in data:
        if t not in lines_seen: # check if line is not duplicate
            d.append(t)
            lines_seen.add(t)
    return d
a=[{'name': 'paul', 'age': '26.', 'hometown': 'AU', 'gender': 'male'},
{'name': 'mei', 'age': '26.', 'hometown': 'NY', 'gender': 'female'},
{'name': 'smith', 'age': '16.', 'hometown': 'NY', 'gender': 'male'},
{'name': 'raj', 'age': '13.', 'hometown': 'IND', 'gender': 'male'}]
age=[]
for line in a:
    for key,value in line.items():
        if key == 'age':
            age.append(remove_repeated_lines(value.replace('.','___')))
print(age)
the output is
[['2', '6', '___'], ['2', '6', '___'], ['1', '6', '___'], ['1', '3', '___']]
my desired output is ['26___','16___','13___']
Here is my code to remove repeated lines from the value of a dictionary. After I run the code, the repeated lines are not removed.
In [37]: a=[{'name': 'paul', 'age': '26.', 'hometown': 'AU', 'gender': 'male'},
...: {'name': 'mei', 'age': '26.', 'hometown': 'NY', 'gender': 'female'},
...: {'name': 'smith', 'age': '16.', 'hometown': 'NY', 'gender': 'male'},
...: {'name': 'raj', 'age': '13.', 'hometown': 'IND', 'gender': 'male'}]
In [40]: set(i["age"].replace(".","")+"_" for i in a)
Out[40]: {'13_', '16_', '26_'}
You can use set comprehension to do it with ease, in a more readable fashion:
age = list({
line['age'].replace('.', '___')
for line in a
if 'age' in line
})
Output (a set has no guaranteed order, so the order of the elements may differ):
['26___', '16___', '13___']
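If the order of first appearance matters, as in the desired output, one option is dict.fromkeys, which drops duplicates while keeping insertion order (guaranteed for dicts since Python 3.7). This is a sketch, not the only way to do it:

```python
a = [{'name': 'paul', 'age': '26.', 'hometown': 'AU', 'gender': 'male'},
     {'name': 'mei', 'age': '26.', 'hometown': 'NY', 'gender': 'female'},
     {'name': 'smith', 'age': '16.', 'hometown': 'NY', 'gender': 'male'},
     {'name': 'raj', 'age': '13.', 'hometown': 'IND', 'gender': 'male'}]

# Dict keys are unique and keep insertion order, so duplicates are dropped
# while the first occurrence of each value stays in place.
age = list(dict.fromkeys(
    line['age'].replace('.', '___') for line in a if 'age' in line
))
print(age)  # ['26___', '16___', '13___']
```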
