Get list of dictionaries based on their key values - python

I have several dictionaries, let's say 5:
dict1={'Age': 20, 'Name': 'Bob'}
dict2={'Age': 10, 'Name': 'Ane'}
dict3={'Age': 40, 'Name': 'Lee'}
dict4={'Age': 50, 'Name': 'Rob'}
dict5={'Age': 30, 'Name': 'Sia'}
and
arr=[50,40,30,20,10]
Can I get a list of dictionaries based on the values of age in arr?
Desired output:
[dict4,dict3,dict5,dict1,dict2]

Try using a lambda to sort by a certain property (in this case, Age):
Code:
dict1={'Age': 20, 'Name': 'Bob'}
dict2={'Age': 10, 'Name': 'Ane'}
dict3={'Age': 40, 'Name': 'Lee'}
dict4={'Age': 50, 'Name': 'Rob'}
dict5={'Age': 30, 'Name': 'Sia'}
dicts = [dict1, dict2, dict3, dict4, dict5]
dicts.sort(reverse=True, key=lambda x: x['Age'])
print(dicts)
Output:
[{'Age': 50, 'Name': 'Rob'}, {'Age': 40, 'Name': 'Lee'}, {'Age': 30, 'Name': 'Sia'}, {'Age': 20, 'Name': 'Bob'}, {'Age': 10, 'Name': 'Ane'}]
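If the original list should keep its order, the same idea can be sketched with sorted(), which returns a new list instead of sorting in place. The names people and by_age_desc below are illustrative and not part of the question.

```python
from operator import itemgetter

# Illustrative data mirroring dict1..dict5 from the question.
people = [
    {'Age': 20, 'Name': 'Bob'},
    {'Age': 10, 'Name': 'Ane'},
    {'Age': 40, 'Name': 'Lee'},
    {'Age': 50, 'Name': 'Rob'},
    {'Age': 30, 'Name': 'Sia'},
]

# sorted() builds a new list; `people` itself is left unchanged.
by_age_desc = sorted(people, key=itemgetter('Age'), reverse=True)
print([p['Name'] for p in by_age_desc])
```

itemgetter('Age') does the same job as lambda x: x['Age'] here; which one to use is mostly a matter of taste.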

Based on your comments:
arr = [50, 40, 30, 20, 10]
dicts = [dict1, dict2, dict3, dict4, dict5]
dicts = {d["Age"]: d for d in dicts}
dicts = [dicts[v] for v in arr]
print(dicts)
Prints:
[
{"Age": 50, "Name": "Rob"},
{"Age": 40, "Name": "Lee"},
{"Age": 30, "Name": "Sia"},
{"Age": 20, "Name": "Bob"},
{"Age": 10, "Name": "Ane"},
]
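The lookup above assumes every value in arr matches exactly one dictionary. A minimal sketch that also tolerates duplicate ages and values missing from the data could look like this; order_by_key and people are hypothetical names introduced only for illustration.

```python
from collections import defaultdict

def order_by_key(dicts, key, wanted):
    """Return the dicts whose `key` value appears in `wanted`, in that order."""
    groups = defaultdict(list)
    for d in dicts:
        groups[d[key]].append(d)  # several dicts may share the same value
    ordered = []
    for value in wanted:
        # .get() skips values that no dict has, without adding empty groups
        ordered.extend(groups.get(value, []))
    return ordered

people = [
    {'Age': 20, 'Name': 'Bob'},
    {'Age': 10, 'Name': 'Ane'},
    {'Age': 40, 'Name': 'Lee'},
    {'Age': 50, 'Name': 'Rob'},
    {'Age': 30, 'Name': 'Sia'},
]
print(order_by_key(people, 'Age', [50, 40, 30, 20, 10]))
```

Whether missing values should be skipped silently or raise an error depends on the use case; this sketch skips them.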

If the expected output is the dict names and not the values, you can create a mapping between Age and the dict name and iterate through arr and get the name of the dict by its age:
dict1 = {'Age': 20, 'Name': 'Bob'}
dict2 = {'Age': 10, 'Name': 'Ane'}
dict3 = {'Age': 40, 'Name': 'Lee'}
dict4 = {'Age': 50, 'Name': 'Rob'}
dict5 = {'Age': 30, 'Name': 'Sia'}
arr = [50, 40, 30, 20, 10]
age_to_dict_name = {globals()[x]['Age']: x for x in globals() if x.startswith("dict")}
expected_output = [age_to_dict_name[x] for x in arr]
print(expected_output) # ['dict4', 'dict3', 'dict5', 'dict1', 'dict2']
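Scanning globals() depends on variable naming: any other global whose name starts with "dict" (for example a list called dicts) would break the comprehension. A less fragile sketch keeps the names in an explicit mapping; named_dicts is an assumed name, not something from the question.

```python
# Explicit name -> dict mapping instead of looking variables up in globals().
named_dicts = {
    'dict1': {'Age': 20, 'Name': 'Bob'},
    'dict2': {'Age': 10, 'Name': 'Ane'},
    'dict3': {'Age': 40, 'Name': 'Lee'},
    'dict4': {'Age': 50, 'Name': 'Rob'},
    'dict5': {'Age': 30, 'Name': 'Sia'},
}
arr = [50, 40, 30, 20, 10]

# Invert the mapping: age -> name (assumes ages are unique).
age_to_name = {d['Age']: name for name, d in named_dicts.items()}
names_in_order = [age_to_name[age] for age in arr]
print(names_in_order)
```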

Related

Search by any character on a list of dict with Python

I want to search by any character on a list of dicts.
my_list = [
{"name": "MrA", "age": 20, "height": 185},
{"name": "MrsB", "age": 28, "height": 192},
{"name": "MrC", "age": 18, "height": 170},
{"name": "MrD", "age": 50, "height": 177},
{"name": "MrsE", "age": 32, "height": 200},
{"name": "Mrs18F", "age": 21, "height": 175}
]
keywords = "MrA"
my_list = [item for item in my_list if keywords in list(item.values())]
print(my_list) # result is [{"name": "MrA", "age": 20, "height": 185}]
As seen, this only matches when the keyword equals an entire value. I want the search to match any substring in any field, with these expected results:
With keywords = "Mrs":
[{"name": "MrsB", "age": 28, "height": 192},
{"name": "MrsE", "age": 32, "height": 200},
{"name": "Mrs18F", "age": 21, "height": 175}]
Or with keywords = 18:
[{"name": "MrA", "age": 20, "height": 185},
{"name": "MrC", "age": 18, "height": 170},
{"name": "Mrs18F", "age": 21, "height": 175}]
I don't know how to do this correctly. Is there any way to get the expected result?
Use:
my_list = [
{"name": "MrA", "age": 20, "height": 185},
{"name": "MrsB", "age": 28, "height": 192},
{"name": "MrC", "age": 18, "height": 170},
{"name": "MrD", "age": 50, "height": 177},
{"name": "MrsE", "age": 32, "height": 200},
{"name": "Mrs18F", "age": 21, "height": 175}
]
def match_keywords(it, keys):
    for val in it.values():
        if any(key in str(val) for key in keys):
            return True
    return False
keywords = ["Mrs", "MrD"]
my_list = [item for item in my_list if match_keywords(item, keywords)]
print(my_list)
Output
[{'name': 'MrsB', 'age': 28, 'height': 192}, {'name': 'MrD', 'age': 50, 'height': 177}, {'name': 'MrsE', 'age': 32, 'height': 200}, {'name': 'Mrs18F', 'age': 21, 'height': 175}]
Another example, starting again from the original my_list:
keywords = ["18"]
my_list = [item for item in my_list if match_keywords(item, keywords)]
print(my_list)
Output
[{'name': 'MrA', 'age': 20, 'height': 185}, {'name': 'MrC', 'age': 18, 'height': 170}, {'name': 'Mrs18F', 'age': 21, 'height': 175}]
This solution lets you search for several keywords at once across all values of each dictionary.
You want to check whether the keyword occurs in any of the values of each dictionary. Converting both the keyword and each value to a string makes the substring test work for numbers as well:
def find(term, l):
    term = str(term)
    return [d for d in l if any(term in str(v) for v in d.values())]
print(find("Mrs", my_list))
print(find(18, my_list))
This gives:
[{'name': 'MrsB', 'age': 28, 'height': 192}, {'name': 'MrsE', 'age': 32, 'height': 200}, {'name': 'Mrs18F', 'age': 21, 'height': 175}]
[{'name': 'MrA', 'age': 20, 'height': 185}, {'name': 'MrC', 'age': 18, 'height': 170}, {'name': 'Mrs18F', 'age': 21, 'height': 175}]
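The substring test above is case-sensitive. If that is not wanted, a variant can be sketched as follows; find_matches and its case_sensitive parameter are hypothetical additions, not part of any answer here.

```python
def find_matches(term, records, case_sensitive=True):
    """Return records where `term` occurs as a substring of any value."""
    term = str(term)
    if not case_sensitive:
        term = term.lower()

    def as_text(value):
        # Values may be ints, so convert before comparing.
        text = str(value)
        return text if case_sensitive else text.lower()

    return [r for r in records if any(term in as_text(v) for v in r.values())]

my_list = [
    {"name": "MrA", "age": 20, "height": 185},
    {"name": "MrsB", "age": 28, "height": 192},
    {"name": "MrC", "age": 18, "height": 170},
    {"name": "MrD", "age": 50, "height": 177},
    {"name": "MrsE", "age": 32, "height": 200},
    {"name": "Mrs18F", "age": 21, "height": 175},
]
print(find_matches("mrs", my_list, case_sensitive=False))
print(find_matches(18, my_list))
```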
You can turn the keyword and the values to a string and then check if the value contains the keyword, something like this:
def search_by_value(search_list, keyword):
    result = []
    keyword = str(keyword)
    for l in search_list:
        for k, v in l.items():
            if keyword in str(v):
                result.append(l)
                break
    return result
my_list = [
{"name": "MrA", "age": 20, "height": 185},
{"name": "MrsB", "age": 28, "height": 192},
{"name": "MrC", "age": 18, "height": 170},
{"name": "MrD", "age": 50, "height": 177},
{"name": "MrsE", "age": 32, "height": 200}
]
print(search_by_value(my_list, "Mrs"))
print(search_by_value(my_list, 18))
Output:
[{'name': 'MrsB', 'age': 28, 'height': 192}, {'name': 'MrsE', 'age': 32, 'height': 200}]
[{'name': 'MrA', 'age': 20, 'height': 185}, {'name': 'MrC', 'age': 18, 'height': 170}]
You can try this. Some values are ints, so compare str(query) against str(value):
def fnd_query(lst, query):
    res = []
    for dct in lst:
        for val in dct.values():
            if str(query) in str(val):
                res.append(dct)
                break
    return res
print(fnd_query(my_list, 18))
print(fnd_query(my_list, 'Mrs'))
Output:
[{'name': 'MrA', 'age': 20, 'height': 185}, {'name': 'MrC', 'age': 18, 'height': 170}, {'name': 'Mrs18F', 'age': 21, 'height': 175}]
[{'name': 'MrsB', 'age': 28, 'height': 192}, {'name': 'MrsE', 'age': 32, 'height': 200}, {'name': 'Mrs18F', 'age': 21, 'height': 175}]

Calculate average values in a nested dict of dicts

I have a dictionary which has the following structure;
d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
I want the output to be as follows;
b = {'average': {'salary': {'year1': 43.3, 'year2': 58.3}, 'age': 24}}
So the inner dict can contain values which are both numbers, or dictionaries. If it is a dictionary we are guaranteed to have the same keys for each constituent dictionary (ie : the same years will always appear in salary for each actor).
I don't have a problem finding the correct value for the age key, which can be done as follows;
import numpy as np
actor_keys = list(d)
b = {}
b['average'] = {}
b['average']['age'] = np.mean([d[i]['age'] for i in actor_keys])
Is there a nice similar kind of calculation that aggregates over the keys inside salary?
You can use recursion for a more robust solution to handle input of an unknown depth:
from itertools import groupby
data = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30}, 'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17}, 'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
def ave(d):
    # flatten all (key, value) pairs and sort them so groupby can group by key
    _data = sorted([i for b in d for i in b.items()], key=lambda x:x[0])
    _d = [(a, [j for _, j in b]) for a, b in groupby(_data, key=lambda x:x[0])]
    # recurse into nested dicts, average plain numbers
    return {a:ave(b) if isinstance(b[0], dict) else round(sum(b)/float(len(b)), 1) for a, b in _d}
result = {'average':ave(list(data.values()))}
print(result)
Output:
{'average': {'age': 24.0, 'salary': {'year1': 43.3, 'year2': 58.3}}}
Here is another recursive solution:
def average_dicts(dicts):
    result = {}
    for i, d in enumerate(dicts):
        for k, v in d.items():
            update_dict_average(result, k, v, i)
    return result
def update_dict_average(current, key, update, n):
    if isinstance(update, dict):
        subcurrent = current.setdefault(key, {})
        for subkey, subupdate in update.items():
            update_dict_average(subcurrent, subkey, subupdate, n)
    else:
        current[key] = (current.get(key, 0) * n + update) / (n + 1)
d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
result = {'average': average_dicts(d.values())}
print(result)
# {'average': {'salary': {'year1': 43.333333333333336, 'year2': 58.333333333333336}, 'age': 24.0}}
Here's what I would do.
def avg(nums):
    nums = list(nums)
    return round(sum(nums) / len(nums), 1)
d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
average = {'salary': {}}
average['age'] = avg(actor['age'] for actor in d.values())
for year in list(d.values())[0]['salary']:
    average['salary'][year] = avg(actor['salary'][year] for actor in d.values())
b = {'average': average}
>>> print(b)
{'average': {'salary': {'year1': 43.3, 'year2': 58.3}, 'age': 24.0}}
This can handle an arbitrary positive number of years and actors, and doesn't require itertools or numpy.
Functional approach:
import itertools
import operator
from statistics import mean
d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
#helpers
age = operator.itemgetter('age')
salary = operator.itemgetter('salary')
year = operator.itemgetter(0)
value = operator.itemgetter(1)
ages = map(age,d.values())
avg_age = mean(ages)
print(f'avg_age: {avg_age}')
salaries = map(dict.items, map(salary, d.values()))
salaries = sorted(itertools.chain.from_iterable(salaries), key=year)
for key, group in itertools.groupby(salaries, year):
    avg = mean(map(value, group))
    print(f'avg for {key}: {avg}')
Here is my solution, reusing what you did for age (it hard-codes the year1 and year2 keys):
b = {}
b['average'] = {}
b['average']["salary"] = {"year1":np.mean([d.get(i).get('salary').get('year1') for i in d]),"year2":np.mean([d.get(i).get('salary').get('year2') for i in d])}
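The hard-coded keys can be avoided with a short recursive helper. The following is only a sketch under the question's stated guarantee that every actor has the same keys; average_nested is a hypothetical name, and statistics.mean is swapped in for np.mean to keep the example free of third-party imports.

```python
from statistics import mean

def average_nested(dicts):
    """Average a sequence of identically shaped (possibly nested) dicts."""
    dicts = list(dicts)
    result = {}
    # The first dict serves as the template for which keys exist.
    for key, value in dicts[0].items():
        column = [d[key] for d in dicts]
        if isinstance(value, dict):
            result[key] = average_nested(column)  # recurse into sub-dicts
        else:
            result[key] = round(mean(column), 1)  # plain numbers: average
    return result

d = {'actor1': {'salary': {'year1': 60, 'year2': 65}, 'age': 30},
     'actor2': {'salary': {'year1': 20, 'year2': 30}, 'age': 17},
     'actor3': {'salary': {'year1': 50, 'year2': 80}, 'age': 25}}
b = {'average': average_nested(d.values())}
print(b)
```

Rounding to one decimal mirrors the desired output in the question; drop the round() call if full precision is preferred.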

How to sum specific values within dictionary in python

I have the following nested dictionary and need to figure out how to sum all 'qty'.
data1 = {
'Batch1': {
'Pink': {'qty': 25, 'ordered': 15},
'Blue': {'qty': 18, 'ordered': 20}
},
'Batch2': {
'Coke': {'qty': 50, 'ordered': 100},
'Sprite': {'qty': 30, 'ordered': 25}
}
}
So the outcome would be 123.
You can use sum:
data1 = {'Batch1': {'Pink': {'qty': 25, 'ordered':15}, 'Blue': {'qty':18, 'ordered':20}}, 'Batch2': {'Coke': {'qty': 50, 'ordered': 100},'Sprite': {'qty':30, 'ordered':25}}}
result = sum(b['qty'] for c in data1.values() for b in c.values())
print(result)
Output:
123
Your data1 was formatted funny, so here's what I used:
{'Batch1': {'Blue': {'ordered': 20, 'qty': 18},
'Pink': {'ordered': 15, 'qty': 25}},
'Batch2': {'Coke': {'ordered': 100, 'qty': 50},
'Sprite': {'ordered': 25, 'qty': 30}}}
If you're not sure how deeply nested your dict will be, you can write a function to recursively traverse the nested dict looking for the qty key and sum the values:
def find_key_vals(query_key, base_dict):
    values = []
    for k, v in base_dict.items():
        if k == query_key:
            values.append(v)
        elif isinstance(v, dict):
            # descend into nested dicts and collect their matches
            values += find_key_vals(query_key, v)
    return values
find_key_vals('qty', data1)
# => [50, 30, 25, 18]  (element order depends on the dict's iteration order)
sum(find_key_vals('qty', data1))
# => 123
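The same recursive traversal can be sketched as a generator, which avoids building intermediate lists when only the total is needed. iter_key_values is a hypothetical name, and the data literal repeats the question's data1.

```python
def iter_key_values(query_key, node):
    """Yield every value stored under `query_key` at any nesting depth."""
    for key, value in node.items():
        if key == query_key:
            yield value
        elif isinstance(value, dict):
            # delegate to the nested dict and pass its matches through
            yield from iter_key_values(query_key, value)

data1 = {
    'Batch1': {
        'Pink': {'qty': 25, 'ordered': 15},
        'Blue': {'qty': 18, 'ordered': 20},
    },
    'Batch2': {
        'Coke': {'qty': 50, 'ordered': 100},
        'Sprite': {'qty': 30, 'ordered': 25},
    },
}
total_qty = sum(iter_key_values('qty', data1))
print(total_qty)
```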

Parse list into dictionaries within a dictionary Python

dataset:
id = [1,2,3]
header = ['name','attack','defense']
stats = [['John',12,30], ['Amy',32,89], ['Lisa',45,21]]
I would like to obtain an output in the form of a nested dictionary. The keys of the outer dictionary will be the ids and the values will be dictionaries that contain the other data, i.e.:
dict = {
1: {'name': 'John', 'attack': 12, 'defense': 30},
2: {'name': 'Amy', 'attack': 32, 'defense': 89},
3: {'name': 'Lisa', 'attack': 45, 'defense': 21}
}
this is my current code:
dict = {}
for i in id:
    next_input = {}
    for index, h in enumerate (header):
        for sublist in stats:
            next_input[h] = sublist[index]
    dict[i] = next_input
It is not working because of the last for loop: the values of the inner dictionaries keep being overwritten until only the last sublist remains.
How can I correct this code?
You don't need to loop over the stats sublists. If you stick with the enumerate() approach you picked, add an index to the id loop and use it to pick the matching stats entry:
dict = {}
for id_index, i in enumerate(id):
    next_input = {}
    for index, h in enumerate(header):
        # row id_index of stats belongs to id i; column index belongs to header h
        next_input[h] = stats[id_index][index]
    dict[i] = next_input
However, you can use the zip() function to pair up two lists for parallel iteration:
result = {i: dict(zip(header, stat)) for i, stat in zip(id, stats)}
This uses a dictionary comprehension to build the outer mapping from each id value to the corresponding stats entry. The inner dictionary is simply built from the paired headers and statistics (dict() takes a sequence of (key, value) pairs).
Demo:
>>> id = [1,2,3]
>>> header = ['name','attack','defense']
>>> stats = [['John',12,30], ['Amy',32,89], ['Lisa',45,21]]
>>> {i: dict(zip(header, stat)) for i, stat in zip(id, stats)}
{1: {'attack': 12, 'defense': 30, 'name': 'John'}, 2: {'attack': 32, 'defense': 89, 'name': 'Amy'}, 3: {'attack': 45, 'defense': 21, 'name': 'Lisa'}}
>>> from pprint import pprint
>>> pprint(_)
{1: {'attack': 12, 'defense': 30, 'name': 'John'},
2: {'attack': 32, 'defense': 89, 'name': 'Amy'},
3: {'attack': 45, 'defense': 21, 'name': 'Lisa'}}
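Two practical caveats: zip() silently stops at the shortest input, and naming a variable dict (as in the question) shadows the built-in, so dict(zip(...)) would then fail. A sketch that sidesteps both, with build_records, ids and rows as hypothetical names:

```python
def build_records(ids, header, rows):
    """Map each id to a {header: value} dict built from the matching row."""
    if len(ids) != len(rows):
        raise ValueError("ids and rows must have the same length")
    records = {}
    for record_id, row in zip(ids, rows):
        if len(row) != len(header):
            raise ValueError(
                f"row for id {record_id!r} has {len(row)} fields, "
                f"expected {len(header)}"
            )
        # pair each header with the value in the same position
        records[record_id] = dict(zip(header, row))
    return records

ids = [1, 2, 3]
header = ['name', 'attack', 'defense']
rows = [['John', 12, 30], ['Amy', 32, 89], ['Lisa', 45, 21]]
print(build_records(ids, header, rows))
```

The explicit length checks are a design choice for this sketch; without them, mismatched inputs would simply produce truncated results.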
You can try this:
id = [1,2,3]
header = ['name','attack','defense']
stats = [['John',12,30], ['Amy',32,89], ['Lisa',45,21]]
new_dict = {a:{d:c for c, d in zip(b, header)} for a, b in zip(id, stats)}
Output:
{1: {'attack': 12, 'defense': 30, 'name': 'John'}, 2: {'attack': 32, 'defense': 89, 'name': 'Amy'}, 3: {'attack': 45, 'defense': 21, 'name': 'Lisa'}}
Another zip() variation:
d = {}
for i,s in enumerate(stats):
    d[id[i]] = dict((zip(header, s)))
print(d)
The output:
{1: {'attack': 12, 'name': 'John', 'defense': 30}, 2: {'attack': 32, 'name': 'Amy', 'defense': 89}, 3: {'attack': 45, 'name': 'Lisa', 'defense': 21}}
Use zip() and a list comprehension:
>>> dict(zip(id, [dict(zip(header, item)) for item in stats]))
{1: {'attack': 12, 'defense': 30, 'name': 'John'}, 2: {'attack': 32, 'defense': 89, 'name': 'Amy'}, 3: {'attack': 45, 'defense': 21, 'name': 'Lisa'}}
First, zip every item in stats with header:
>>> [dict(zip(header,item)) for item in stats]
[{'attack': 12, 'defense': 30, 'name': 'John'}, {'attack': 32, 'defense': 89, 'name': 'Amy'}, {'attack': 45, 'defense': 21, 'name': 'Lisa'}]
Second, zip id with the output of the first step (list() is needed on Python 3, where zip() returns an iterator):
>>> list(zip(id, [dict(zip(header, item)) for item in stats]))
[(1, {'attack': 12, 'defense': 30, 'name': 'John'}), (2, {'attack': 32, 'defense': 89, 'name': 'Amy'}), (3, {'attack': 45, 'defense': 21, 'name': 'Lisa'})]

sum value of two different dictionaries which is having same key

I have two dictionaries:
first = {'id': 1, 'age': 23}
second = {'id': 4, 'out': 100}
I want output dictionary as
{'id': 5, 'age': 23, 'out':100}
I tried
>>> dict(first.items() + second.items())
{'age': 23, 'id': 4, 'out': 100}
but I am getting id as 4, and I want it to be 5.
You want to use collections.Counter:
from collections import Counter
first = Counter({'id': 1, 'age': 23})
second = Counter({'id': 4, 'out': 100})
first_plus_second = first + second
print(first_plus_second)
Output:
Counter({'out': 100, 'age': 23, 'id': 5})
And if you need the result as a true dict, just use dict(first_plus_second):
>>> print(dict(first_plus_second))
{'id': 5, 'age': 23, 'out': 100}
If you want to add values from the second to the first, you can do it like this:
first = {'id': 1, 'age': 23}
second = {'id': 4, 'out': 100}
for k in second:
    if k in first:
        first[k] += second[k]
    else:
        first[k] = second[k]
print(first)
The above will output:
{'id': 5, 'age': 23, 'out': 100}
You can simply merge the two dictionaries and update the 'id' key afterwards (on Python 3, items() views cannot be added with +, so unpacking is used here):
result = {**first, **second}
result['id'] = first['id'] + second['id']
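One thing to keep in mind with the Counter approach is that Counter addition drops keys whose summed value is zero or negative. Where those should be kept, a plain-dict merge can be sketched as follows; sum_dicts is a hypothetical helper name.

```python
from collections import Counter

def sum_dicts(*dicts):
    """Merge dicts, adding the values of keys that occur more than once."""
    total = {}
    for d in dicts:
        for key, value in d.items():
            total[key] = total.get(key, 0) + value
    return total

first = {'id': 1, 'age': 23}
second = {'id': 4, 'out': 100}
print(sum_dicts(first, second))

# Counter addition discards non-positive results; sum_dicts keeps them.
print(dict(Counter({'id': 1}) + Counter({'id': -1, 'out': 0})))
print(sum_dicts({'id': 1}, {'id': -1, 'out': 0}))
```

Unlike the in-place loop shown earlier, this sketch leaves first and second unmodified and accepts any number of dictionaries.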
