jmespath search nested array issue

I need to search all the dicts in a nested array (shown below) by key with jmespath:
my_list = [[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}], [{'age': 3, 'name': 'kobe'}]]
This jmespath search returns an empty list: jmespath.search("[][?name=='kobe']", my_list)
How can I get the result [{'age': 1, 'name': 'kobe'}, {'age': 3, 'name': 'kobe'}] with a jmespath search?

Use the following JMESPath query:
[]|[?name=='kobe']
on input:
[[{"age": 1, "name": "kobe"}, {"age": 2, "name": "james"}], [{"age": 3, "name": "kobe"}]]
to get output:
[
  {
    "age": 1,
    "name": "kobe"
  },
  {
    "age": 3,
    "name": "kobe"
  }
]
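For intuition, the two stages of that query (flatten, then filter) can be mirrored in plain Python. This is only a sketch of the idea, not how the jmespath library works internally; the names flat and matches are illustrative, and it assumes every element of the outer list is itself a list.

```python
from itertools import chain

my_list = [[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}],
           [{'age': 3, 'name': 'kobe'}]]

# Stage 1: flatten one level, mirroring the leading `[]` in the query.
flat = list(chain.from_iterable(my_list))

# Stage 2: filter the flat list, mirroring `[?name=='kobe']` after the pipe.
matches = [d for d in flat if d.get('name') == 'kobe']

print(matches)
```

The pipe matters because it makes the filter see the flattened list as a whole rather than one element at a time.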

The problem is a type mismatch, which is why you don't get the expected result: after [] flattens the outer list, the filter [?name=='kobe'] is applied to each projected element, and each element is a dict rather than an array, so nothing matches.
What you should do is this:
jmespath.search("[].to_array(@)[?name=='kobe'][]", my_list)
Here is a breakdown using the Python console:
>>> my_list
[[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}], [{'age': 3, 'name': 'kobe'}]]
>>> jmespath.search("[]", my_list)
[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}, {'age': 3, 'name': 'kobe'}]
>>> jmespath.search("[].to_array(@)", my_list)
[[{'age': 1, 'name': 'kobe'}], [{'age': 2, 'name': 'james'}], [{'age': 3, 'name': 'kobe'}]]
>>> jmespath.search("[].to_array(@)[]", my_list)
[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}, {'age': 3, 'name': 'kobe'}]
>>> jmespath.search("[].to_array(@)[?name=='kobe']", my_list)
[[{'age': 1, 'name': 'kobe'}], [], [{'age': 3, 'name': 'kobe'}]]
>>> jmespath.search("[].to_array(@)[?name=='kobe'][]", my_list)
[{'age': 1, 'name': 'kobe'}, {'age': 3, 'name': 'kobe'}]
You can find more explanation with examples in this guide: https://www.doaws.pl/blog/2021-12-05-how-to-master-aws-cli-in-15-minutes/how-to-master-aws-cli-in-15-minutes

Use the code below:
my_list = [[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}], [{'age': 3, 'name': 'kobe'}]]
for l in my_list:
    for dictionary in l:
        Value_List = dictionary.values()
        # matches when "kobe" appears as any value, not only under 'name'
        if "kobe" in Value_List:
            print(dictionary)
Output:
{'age': 1, 'name': 'kobe'}
{'age': 3, 'name': 'kobe'}
Or:
my_list = [[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}],
           [{'age': 3, 'name': 'kobe'}]]
Match_List = []
for l in my_list:
    for dictionary in l:
        if dictionary["name"] == "kobe":
            Match_List.append(dictionary)
print(Match_List)
Output:
[{'age': 1, 'name': 'kobe'}, {'age': 3, 'name': 'kobe'}]
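The same nested loop can be written as a single comprehension. The sketch below is one possible shape; find_by_key is a hypothetical helper name, and using dict.get is an assumption made so that dicts without the key are skipped instead of raising KeyError.

```python
def find_by_key(nested, key, value):
    """Return every dict in a list of lists whose `key` equals `value`."""
    return [d for inner in nested for d in inner if d.get(key) == value]


my_list = [[{'age': 1, 'name': 'kobe'}, {'age': 2, 'name': 'james'}],
           [{'age': 3, 'name': 'kobe'}]]

print(find_by_key(my_list, 'name', 'kobe'))
```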

Related

how to add list values to existing dictionary in python

I'm trying to add each value of score to the corresponding dict in names, i.e. score[0] to names[0] and so on.
names=[{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]
Output should be like this
names=[{'id': 1, 'name': 'laptop','score':0.98}, {'id': 2, 'name': 'box','score':0.81}, {'id': 3, 'name': 'printer','score':0.78}]
How to achieve this? thanks in advance

I'd do it with a comprehension like this:
>>> [{**d, 'score':s} for d, s in zip(names, score)]
[{'id': 1, 'name': 'laptop', 'score': 0.9894376397132874}, {'id': 2, 'name': 'box', 'score': 0.819094657897949}, {'id': 3, 'name': 'printer', 'score': 0.78116521835327}]

Without a list comprehension:
for i, name in enumerate(names):
    name['score'] = score[i]
print(names)

This is an easy-to-understand solution. From your example, I understand you want to truncate the numbers rather than round them.
import math
def truncate(f, n):
    # drop everything after the n-th decimal place
    return math.floor(f * 10 ** n) / 10 ** n
names=[{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]
n = len(score)
for i in range(n):
    names[i]["score"] = truncate(score[i], 2)
print(names)
If you do want to round the numbers:
names=[{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]
n = len(score)
for i in range(n):
    names[i]["score"] = round(score[i], 2)
print(names)
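One caveat with floor-based truncation on binary floats: the product f * 10 ** n can land just below the expected value (0.29 * 100 evaluates to 28.999999999999996), so such an input would be cut to 0.28. A possible alternative, sketched below, goes through the decimal module; truncate_decimal is a hypothetical name, and converting via str(value) is an assumption that the float's printed form is the number you mean. Note that ROUND_DOWN truncates toward zero, which differs from math.floor for negative values.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_decimal(value, places):
    """Cut `value` to `places` decimal places without rounding up."""
    quantum = Decimal(10) ** -places  # Decimal('0.01') when places == 2
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN))


names = [{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]

# Build new dicts instead of mutating the originals.
result = [{**d, 'score': truncate_decimal(s, 2)} for d, s in zip(names, score)]
print(result)
```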

keep duplicates by key in a list of dictionaries

I have a list of dictionaries, and I would like to obtain those that have the same value in a key:
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
I want to keep those items that have the same 'name', so, I would like to obtain something like:
duplicates: [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
}, {
'id': 7,
'name': 'John'
}
]
I'm trying (not successfully):
duplicates = [item for item in my_list_of_dicts if len(my_list_of_dicts.get('name', None)) > 1]
I can see what the problem with this code is, but I can't work out the right expression.

Another concise way using collections.Counter:
from collections import Counter
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
c = Counter(x['name'] for x in my_list_of_dicts)
duplicates = [x for x in my_list_of_dicts if c[x['name']] > 1]
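If it is more useful to see the duplicates grouped by name rather than in their original order, a single pass with collections.defaultdict is one option. This is a sketch under that assumption; groups and duplicate_groups are illustrative names that do not appear in the original answers.

```python
from collections import defaultdict

my_list_of_dicts = [
    {'id': 3, 'name': 'John'}, {'id': 5, 'name': 'Peter'}, {'id': 2, 'name': 'Peter'},
    {'id': 6, 'name': 'Mariah'}, {'id': 7, 'name': 'John'}, {'id': 1, 'name': 'Louis'},
]

# Collect every item under its name in one pass over the list.
groups = defaultdict(list)
for item in my_list_of_dicts:
    groups[item['name']].append(item)

# Keep only the names that occur more than once.
duplicate_groups = {name: items for name, items in groups.items() if len(items) > 1}
print(duplicate_groups)
```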

You could use the following list comprehension:
>>> [d for d in my_list_of_dicts if len([e for e in my_list_of_dicts if e['name'] == d['name']]) > 1]
[{'id': 3, 'name': 'John'},
{'id': 5, 'name': 'Peter'},
{'id': 2, 'name': 'Peter'},
{'id': 7, 'name': 'John'}]

my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
import pandas as pd
df = pd.DataFrame(my_list_of_dicts)
# keep every row whose name appears more than once, then serialise to JSON
df[df.name.isin(df[df.name.duplicated()]['name'])].to_json(orient='records')

An attempt similar to @cucuru's.
Hopefully helpful.
The comments explain what I did differently.
my_list_of_dicts = [{
'id': 3,
'name': 'John'
},{
'id': 5,
'name': 'Peter'
},{
'id': 2,
'name': 'Peter'
},{
'id': 6,
'name': 'Mariah'
},{
'id': 7,
'name': 'John'
},{
'id': 1,
'name': 'Louis'
}
]
# Create a list of names
names = [person.get('name') for person in my_list_of_dicts]
# Add item to list if the name occurs more than once in names
duplicates = [item for item in my_list_of_dicts if names.count(item.get('name')) > 1]
print(duplicates)
produces
[{'id': 3, 'name': 'John'}, {'id': 5, 'name': 'Peter'}, {'id': 2, 'name': 'Peter'}, {'id': 7, 'name': 'John'}]

Sorted set list by dict value and group by names

It's hard to give this question a good title, so it will be easier to show an example. I have something like this:
[{'level': 4, 'name': 'Docker'}, {'level': 1, 'name': 'Python'}, {'level': 3, 'name': 'JavaScript'}, {'level': 1, 'name': 'HTML'}]
and I wish to get this:
[{'level': 4, 'name': ['Docker']}, {'level': 3, 'name': ['JavaScript']}, {'level': 1, 'name': ['Python', 'HTML']}]
I sorted the list by dictionary values with powers.sort(key=lambda x: x['level'], reverse=True) and got this, which I think is close to the solution:
[{'level': 4, 'name': 'Docker'}, {'level': 3, 'name': 'JavaScript'}, {'level': 1, 'name': 'Python'}, {'level': 1, 'name': 'HTML'}]
I'd be grateful for any help grouping the names by level!

As one commenter says you can do this with defaultdict :
from collections import defaultdict
lang_list = [{'level': 4, 'name': 'Docker'}, {'level': 1, 'name': 'Python'}, {'level': 3, 'name': 'JavaScript'}, {'level': 1, 'name': 'HTML'}]
lvl_dict = defaultdict(list)
for d in lang_list:
    lvl_dict[d['level']].append(d['name'])
lvl_list = [{'level': k, 'name': v} for k, v in lvl_dict.items()]
lvl_list.sort(key=lambda x: x['level'], reverse=True)
print(lvl_list)
# [{'level': 4, 'name': ['Docker']}, {'level': 3, 'name': ['JavaScript']}, {'level': 1, 'name': ['Python', 'HTML']}]

That's because you are only sorting; grouping is another explicit operation.
>>> from itertools import groupby
>>> from operator import itemgetter
>>> from pprint import pprint
>>> powers = [{'level': 4, 'name': 'Docker'}, {'level': 1, 'name': 'Python'}, {'level': 3, 'name': 'JavaScript'}, {'level': 1, 'name': 'HTML'}]
>>> get_level = itemgetter('level')
>>> get_name = itemgetter('name')
>>> def sort_and_group(lst, getter):
...     return groupby(sorted(lst, key=getter), getter)
...
>>> pprint([dict(level=k, name=list(map(get_name, v))) for k, v in sort_and_group(powers, get_level)])
[{'level': 1, 'name': ['Python', 'HTML']},
 {'level': 3, 'name': ['JavaScript']},
 {'level': 4, 'name': ['Docker']}]
In most cases, you want a single group for each common attribute, so sorting by the same attribute prior to grouping is common.
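To see why the sort comes first: itertools.groupby only merges adjacent items with equal keys, so on unsorted input the same level can show up in more than one group. The sketch below contrasts the two cases; names_by_level is a hypothetical helper, and reverse=True is used only to match the descending order asked for in the question.

```python
from itertools import groupby
from operator import itemgetter

powers = [{'level': 4, 'name': 'Docker'}, {'level': 1, 'name': 'Python'},
          {'level': 3, 'name': 'JavaScript'}, {'level': 1, 'name': 'HTML'}]
get_level = itemgetter('level')


def names_by_level(items):
    """Group adjacent items by level and collect their names."""
    return [(level, [d['name'] for d in group]) for level, group in groupby(items, key=get_level)]


# Unsorted input: level 1 is split because its two entries are not adjacent.
unsorted_groups = names_by_level(powers)

# Sorted input: each level forms exactly one group.
sorted_groups = names_by_level(sorted(powers, key=get_level, reverse=True))

print(unsorted_groups)
print(sorted_groups)
```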

Using pandas:
import pandas as pd
a = [{'level': 4, 'name': 'Docker'}, {'level': 1, 'name': 'Python'}, {'level': 3, 'name': 'JavaScript'}, {'level': 1, 'name': 'HTML'}]
res = (pd.DataFrame(a).groupby('level')['name']
       .apply(list).reset_index(name='name')
       .sort_values('level', ascending=False)
       .to_dict('records'))

Combining multiple lists of dictionaries

I have several lists of dictionaries, where each dictionary contains a unique id value that is common among all lists. I'd like to combine them into a single list of dicts, where each dict is joined on that id value.
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
I tried doing something like the answer found at https://stackoverflow.com/a/42018660/7564393, but I'm getting very confused since I have more than 2 lists. Should I try using a defaultdict approach? More importantly, I am NOT always going to know the other values, only that the id value is present in all dicts.

You can use itertools.groupby():
from itertools import groupby
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = []
for _, values in groupby(sorted([*list1, *list2, *list3], key=lambda x: x['id']), key=lambda x: x['id']):
    temp = {}
    for d in values:
        temp.update(d)
    desired_output.append(temp)
Result:
[{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
# combine all lists
merged = {}  # id -> merged dict
for lst in [list1, list2, list3]:
    for list_d in lst:
        if 'id' not in list_d:
            continue
        key = list_d['id']  # named key to avoid shadowing the built-in id()
        if key not in merged:
            merged[key] = dict(list_d)  # copy so the input dicts are not mutated
        else:
            merged[key].update(list_d)
# dicts with the same id are grouped together since the id is used as the key
res = list(merged.values())
print(res)

You can first build a dict of dicts, then turn it into a list:
from itertools import chain
from collections import defaultdict
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
dict_out = defaultdict(dict)
for d in chain(list1, list2, list3):
    dict_out[d['id']].update(d)
out = list(dict_out.values())
print(out)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
itertools.chain allows you to iterate on all the dicts contained in the 3 lists. We build a dict dict_out having the id as key, and the corresponding dict being built as value. This way, we can easily update the already built part with the small dict of our current iteration.
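Since the question mentions more than two lists and unknown extra keys, the same idea can be wrapped in a function that accepts any number of lists. This is a sketch only: merge_by_key, its key parameter, and list4 are hypothetical, and skipping dicts that lack the key is an assumption about the desired behaviour.

```python
from collections import defaultdict
from itertools import chain


def merge_by_key(*lists, key='id'):
    """Merge dicts from any number of lists, joining on `key`."""
    merged = defaultdict(dict)
    for d in chain.from_iterable(lists):
        if key in d:  # assumption: entries without the key are ignored
            merged[d[key]].update(d)
    return list(merged.values())


list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
list4 = [{'id': 2, 'note': 'extra'}]  # hypothetical fourth list covering only one id

out = merge_by_key(list1, list2, list3, list4)
print(out)
```

Because update() copies values into fresh dicts, the input lists are left unchanged.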

Here is a function-based approach that does not use itertools (which is excellent for rapid development work).
This solution works for any number of lists, since the function takes a variable number of arguments, and it also lets the caller choose the type of the returned output (list or dict).
By default it returns a list, as you asked; if you pass as_list=False it returns a dictionary instead.
I used a dictionary internally because lookups by id are fast.
Have a look at the get_packed_list() function below.
get_packed_list()
def get_packed_list(*dicts_lists, as_list=True):
    output = {}
    for dicts_list in dicts_lists:
        for dictionary in dicts_list:
            _id = dictionary.pop("id")  # id() is a built-in function, so _id is used instead
            if _id not in output:
                # Create new id
                output[_id] = {"id": _id}
            for key in dictionary:
                output[_id][key] = dictionary[key]
            dictionary["id"] = _id  # put 'id' back after use (the caller's dict is modified in place)
    if as_list:
        return [output[key] for key in output]
    return output  # dictionary
Test
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
output = get_packed_list(list1, list2, list3)
print(output)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
output = get_packed_list(list1, list2, list3, as_list=False)
print(output)
# {1: {'id': 1, 'value': 20, 'sum': 10, 'total': 30}, 2: {'id': 2, 'value': 21, 'sum': 11, 'total': 32}}

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
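print(list1 + list2 + list3)
Note that this only concatenates the three lists; it does not merge the dicts that share an id.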

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
result = []
for i in range(0, len(list1)):
    final_dict = dict(list(list1[i].items()) + list(list2[i].items()) + list(list3[i].items()))
    result.append(final_dict)
print(result)
Output: [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
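Merging by position relies on every list holding the same ids in the same order. The sketch below uses a reordered list2 (made up for illustration, not part of the question) to show how a positional merge can pair the wrong dicts, while a merge keyed on id does not.

```python
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 2, 'sum': 11}, {'id': 1, 'sum': 10}]  # same data, different order

# Positional merge: pairs list1[0] with list2[0] regardless of id.
by_position = [{**a, **b} for a, b in zip(list1, list2)]

# Keyed merge: pairs entries that share the same id.
by_id_map = {}
for d in list1 + list2:
    by_id_map.setdefault(d['id'], {}).update(d)
by_id = list(by_id_map.values())

print(by_position)  # values end up attached to the wrong id
print(by_id)
```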

How to remove a json string from list in python

I have two lists with particular data, and I would like to merge them into a single list without duplicates.
list1 =[{"id": "123","Name": "Sam", "Age": 10},{"id": "124","Name": "Ajay", "Age": 10}]
list2 =[{"id": "123","Name": "Sam"},{"id": "124","Name": "Ajay"},{"id": "125","Name": "Ram"}]
The output list should be like this
output= [{"id": "123","Name": "Sam", "Age": 10},{"id": "124","Name": "Ajay", "Age": 10},{"id": "125","Name": "Ram"}]

Presumably it is the id key that uniquely identifies the information. If so, collect all the info from the two lists in a dictionary, then produce a new list from that:
from itertools import chain
per_id = {}
for info in chain(list1, list2):
    per_id.setdefault(info['id'], {}).update(info)
output = list(per_id.values()) # Python 2 and 3 compatible
Demo:
>>> from itertools import chain
>>> list1 = [{'Age': 10, 'id': '123', 'Name': 'Sam'}, {'Age': 10, 'id': '124', 'Name': 'Ajay'}]
>>> list2 = [{'id': '123', 'Name': 'Sam'}, {'id': '124', 'Name': 'Ajay'}, {'id': '125', 'Name': 'Ram'}]
>>> per_id = {}
>>> for info in chain(list1, list2):
...     per_id.setdefault(info['id'], {}).update(info)
...
>>> list(per_id.values())
[{'Age': 10, 'id': '123', 'Name': 'Sam'}, {'Age': 10, 'id': '124', 'Name': 'Ajay'}, {'id': '125', 'Name': 'Ram'}]
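With update(), a field that appears in both lists takes the value from the later list. If the earlier list should win instead, one possible variation is to fill in only the fields that are still missing. This is a sketch; the conflicting name "Samuel" is made-up data used purely to show the difference.

```python
from itertools import chain

list1 = [{"id": "123", "Name": "Sam", "Age": 10}]
list2 = [{"id": "123", "Name": "Samuel"}, {"id": "125", "Name": "Ram"}]  # hypothetical conflict on Name

per_id = {}
for info in chain(list1, list2):
    target = per_id.setdefault(info["id"], {})
    for field, value in info.items():
        target.setdefault(field, value)  # keep the first value seen for each field

output = list(per_id.values())
print(output)
```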
