Comparing lists of dictionaries in Python

I have read various questions, but nothing I have found quite matches this scenario and I can't get my head round it.
I want to compare two lists of dictionaries. I don't want to check individual key-value pairs; I want to check each whole dictionary against the other list. The gotcha is that the dictionaries in one list have an extra item, 'id', which the other list doesn't have, so that key must be left out of the comparison.
status_code and desc are not unique.
Even if only desc changes, as far as I'm concerned the whole record has changed.
Sample data:
data_db = [
{ "id": 1, "status_code": 2, "desc": "Description sample1" },
{ "id": 2, "status_code": 4, "desc": "Description sample2" },
{ "id": 3, "status_code": 5, "desc": "Description sample3" },
{ "id": 4, "status_code": 5, "desc": "Description sample4" }
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
Expected output:
missing_from_db = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" } # because its desc is different in data_db
]
missing_from_api = [1,2,4] # This can just be the ids from data_db
I hope this makes sense (as it's confusing enough to me!).
Code wise I've not come up with anything remotely close or useful. Nearest thought I've had is reformatting data_db to this:
data_db = [
{
"id": 1,
"data": { "status_code": 2, "desc": "Description sample1" }
},
{
"id": 2,
"data": { "status_code": 4, "desc": "Description sample2" }
},
{
"id": 3,
"data": { "status_code": 5, "desc": "Description sample3" }
},
{
"id": 4,
"data": { "status_code": 5, "desc": "Description sample4" }
}
]
Thank you!

Reformatting your data_db should work:
data_db = [
{
"id": 1,
"data": { "status_code": 2, "desc": "Description sample1" }
},
{
"id": 2,
"data": { "status_code": 4, "desc": "Description sample2" }
},
{
"id": 3,
"data": { "status_code": 5, "desc": "Description sample3" }
},
{
"id": 4,
"data": { "status_code": 5, "desc": "Description sample4" }
}
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
# checking the dicts in data_api against the 'data' sub-dicts in data_db
missing_from_db = [d for d in data_api if d not in [x['data'] for x in data_db]]
# using similar comprehension to extract the 'id' vals of the 'data' in data_db which aren't in data_api
missing_from_api = [d['id'] for d in data_db if d['data'] not in data_api]
Results:
print(missing_from_db)
[{'status_code': 1, 'desc': 'Description sample5'},
{'status_code': 4, 'desc': 'Description sample6'}]
print(missing_from_api)
[1, 2, 4]
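The `not in` checks above scan a list for every record, which is fine for small inputs. As a rough alternative sketch (the helper name `key` is made up for illustration, and it assumes status_code and desc are the only fields that matter), each record can be reduced to a hashable tuple so that membership tests use sets. This works on the original flat data_db, without reformatting it:

```python
data_db = [
    {"id": 1, "status_code": 2, "desc": "Description sample1"},
    {"id": 2, "status_code": 4, "desc": "Description sample2"},
    {"id": 3, "status_code": 5, "desc": "Description sample3"},
    {"id": 4, "status_code": 5, "desc": "Description sample4"},
]
data_api = [
    {"status_code": 1, "desc": "Description sample5"},
    {"status_code": 4, "desc": "Description sample6"},
    {"status_code": 5, "desc": "Description sample3"},
]

def key(record):
    # Hypothetical helper: a hashable comparison key that leaves out 'id'.
    return (record["status_code"], record["desc"])

# Build the sets once so each membership test is a hash lookup.
db_keys = {key(row) for row in data_db}
api_keys = {key(row) for row in data_api}

missing_from_db = [row for row in data_api if key(row) not in db_keys]
missing_from_api = [row["id"] for row in data_db if key(row) not in api_keys]

print(missing_from_db)
print(missing_from_api)
```

If more fields need comparing later, only `key` has to change.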

This isn't a nice solution and it relies on the particular structure you have, but it works:
data_db = [
{ "id": 1, "status_code": 2, "desc": "Description sample1" },
{ "id": 2, "status_code": 4, "desc": "Description sample2" },
{ "id": 3, "status_code": 5, "desc": "Description sample3" },
{ "id": 4, "status_code": 5, "desc": "Description sample4" }
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
lst = []
for dct in data_api:
    for dct2 in data_db:
        if all(dct[key] == dct2[key] for key in dct):
            break
    else:
        # no break: nothing in data_db matched this API record
        lst.append(dct)
lst2 = []
for dct2 in data_db:
    for dct in data_api:
        if all(dct[key] == dct2[key] for key in dct):
            break
    else:
        # no break: nothing in data_api matched this DB record
        lst2.append(dct2["id"])
print(lst)
print(lst2)

Will this help?
def find_missing(data1, data2):
    # Return the items of data2 that have no match in data1.
    missing_from_data = []
    for item2 in data2:
        found = False
        for item1 in data1:
            # status_code and desc must both match within the same record
            if item2['status_code'] == item1['status_code'] and item2['desc'] == item1['desc']:
                found = True
                break
        if not found:
            missing_from_data.append(item2)
    return missing_from_data
data_db = [
{ "id": 1, "status_code": 2, "desc": "Description sample1" },
{ "id": 2, "status_code": 4, "desc": "Description sample2" },
{ "id": 3, "status_code": 5, "desc": "Description sample3" },
{ "id": 4, "status_code": 5, "desc": "Description sample4" }
]
data_api = [
{ "status_code": 1, "desc": "Description sample5" },
{ "status_code": 4, "desc": "Description sample6" },
{ "status_code": 5, "desc": "Description sample3" }
]
missing_from_data_db = find_missing(data_db, data_api)
missing_from_api = find_missing(data_api, data_db)
# keep only the ids of the data_db records that the API no longer has
missing_from_api_ids = [item['id'] for item in missing_from_api]
print(missing_from_data_db)
print(missing_from_api_ids)
Output:
[{'status_code': 1, 'desc': 'Description sample5'}, {'status_code': 4, 'desc': 'Description sample6'}]
[1, 2, 4]

Related

fetching multiple values and keys from dict

movies={
'actors':{'prabhas':{'knownAs':'Darling', 'awards':{'nandi':1, 'cinemaa':1, 'siima':1},'remuneration':100, 'hits':{'industry':2, 'super':3,'flops':8}, 'age':41, 'height':6.1, 'mStatus':'single','sRate':'35%'},
'pavan':{'knownAs':'Power Star', 'awards':{'nandi':2, 'cinemaa':2, 'siima':5}, 'hits':{'industry':2, 'super':7,'flops':16}, 'age':48, 'height':5.9, 'mStatus':'married','sRate':'37%','remuneration':50},
},
'actress':{
'tamanna':{'knownAs':'Milky Beauty', 'awards':{'nandi':0, 'cinemaa':1, 'siima':1}, 'remuneration':10, 'hits':{'industry':1, 'super':7,'flops':11}, 'age':28, 'height':5.9, 'mStatus':'single', 'sRate':'40%'},
'rashmika':{'knownAs':'Butter Milky Beauty', 'awards':{'nandi':0, 'cinemaa':0, 'siima':2}, 'remuneration':12,'hits':{'industry':0, 'super':4,'flops':2}, 'age':36, 'height':5.9, 'mStatus':'single', 'sRate':'30%'},
}
}
1. What is the total number of Nandi awards won by actors?
2. What is the success rate of Prince?
3. What is the name of Prince?

You can answer the first question with this:
import jmespath
movies={
"actors": {
"prabhas": {
"knownAs": "Darling",
"awards": {
"nandi": 1,
"cinemaa": 1,
"siima": 1
},
"remuneration": 100,
"hits": {
"industry": 2,
"super": 3,
"flops": 8
},
"age": 41,
"height": 6.1,
"mStatus": "single",
"sRate": "35%"
},
"pavan": {
"knownAs": "Power Star",
"awards": {
"nandi": 2,
"cinemaa": 2,
"siima": 5
},
"hits": {
"industry": 2,
"super": 7,
"flops": 16
},
"age": 48,
"height": 5.9,
"mStatus": "married",
"sRate": "37%",
"remuneration": 50
}
},
"actress": {
"tamanna": {
"knownAs": "Milky Beauty",
"awards": {
"nandi": 0,
"cinemaa": 1,
"siima": 1
},
"remuneration": 10,
"hits": {
"industry": 1,
"super": 7,
"flops": 11
},
"age": 28,
"height": 5.9,
"mStatus": "single",
"sRate": "40%"
},
"rashmika": {
"knownAs": "Butter Milky Beauty",
"awards": {
"nandi": 0,
"cinemaa": 0,
"siima": 2
},
"remuneration": 12,
"hits": {
"industry": 0,
"super": 4,
"flops": 2
},
"age": 36,
"height": 5.9,
"mStatus": "single",
"sRate": "30%"
}
}
}
total_nandies_by_actors = sum(jmespath.search('[]',jmespath.search('actors.*.*.nandi',movies)))
But there is no Prince in the data you've provided.
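For comparison, the first question can also be answered without any third-party library. This is a minimal sketch using a trimmed copy of the data (only the fields needed for the sum are kept), assuming every actor entry has an awards dictionary with a nandi key:

```python
# Trimmed copy of the 'actors' part of the data; other fields are omitted.
movies = {
    "actors": {
        "prabhas": {"knownAs": "Darling", "awards": {"nandi": 1, "cinemaa": 1, "siima": 1}},
        "pavan": {"knownAs": "Power Star", "awards": {"nandi": 2, "cinemaa": 2, "siima": 5}},
    },
}

# Walk over every actor and add up their Nandi awards.
total_nandies_by_actors = sum(
    actor["awards"]["nandi"] for actor in movies["actors"].values()
)
print(total_nandies_by_actors)  # 3
```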

Create one 'list' by userID

I want to create one list per user. I have this JSON file:
data = [
{
"id": "1",
"price": 1,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price": 3,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price":8,
},
]
I'm using Python and I want a result like this:
for the user with "id": "1": [1, 10, 10]
and for the user with "id": "2": [3, 8]
So there should be two lists of prices, one per id.
Is it possible to do that in Python?
Note: the user ids are in fact randomly generated UUIDs.
Edit: quantity was a mistake; all the data are price and id, sorry.

collections.defaultdict to the rescue.
Assuming you really do have mixed quantities and prices and you don't mind mixing them into the same list:
from collections import defaultdict
data = [
{
"id": "1",
"price": 1,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"quantity": 3,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price": 8,
},
]
by_id = defaultdict(list)
for item in data:
    item = item.copy()  # we need to mutate the item
    item_id = item.pop("id")
    # whatever is the other value in the dict, grab that:
    other_value = item.popitem()[1]
    by_id[item_id].append(other_value)
print(dict(by_id))
The output is
{'1': [1, 10, 10], '2': [3, 8]}
If you actually only do have prices, the loop is simpler:
by_id = defaultdict(list)
for item in data:
    by_id[item["id"]].append(item.get("price"))
or
by_id = defaultdict(list)
for item in data:
    by_id[item["id"]].append(item["price"])
to fail fast when the price is missing.

First, your data structure, {[]}, is not valid Python.
Assume your data is:
my_json = [
{
"id": "1",
"price": 1,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"quantity": 3,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price":8,
},
]
Then you can achieve it with this:
results = {}
for data in my_json:
    # take the price, falling back to the quantity when there is no price
    value = data.get('price') or data.get('quantity')
    if data.get('id') not in results:
        results[data.get('id')] = [value]
    else:
        results[data.get('id')].append(value)
print(results)
Output:
{'1': [1, 10, 10], '2': [3, 8]}

Maybe like this:
data = [
{
"id": "1",
"price": 1,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"quantity": 3,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price": 8,
}
]
result = {}
for item in data:
    try:
        result[item['id']].append(item.get('price'))
    except KeyError:
        # first time this id is seen: start its list
        result[item['id']] = [item.get('price')]
print(result)
Here None is put in place of the missing price for that entry, and the quantity key is ignored.
Result:
{'1': [1, 10, 10], '2': [None, 8]}

A simple loop that enumerates your list (it's not JSON) in conjunction with setdefault() is all you need:
data = [
{
"id": "1",
"price": 1,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price": 3,
},
{
"id": "1",
"price": 10,
},
{
"id": "2",
"price": 8,
}
]
dict_ = {}
for d in data:
    dict_.setdefault(d['id'], []).append(d['price'])
print(dict_)
Output:
{'1': [1, 10, 10], '2': [3, 8]}
Note:
This will fail (KeyError) if either 'id' or 'price' is missing from the dictionaries in the list

How to eliminate duplicate items while adding them to their own structure

I have a list of dictionary items, with each dictionary containing a list of presentation items. The sample dictionaries below are a small prototype of my real data set.
I need to remove duplicate presentations based on day (one presentation per day) and store them in a new dictionary with the same structure within the existing list.
So starting with:
[
{
"time": "04:00-20:59",
"category": 1,
"presentations": [
{
"presentation": "ABC",
"day": 7,
},
{
"presentation": "DEF",
"day": 7,
},
{
"presentation": "GHI",
"day": 8,
},
{
"presentation": "JKL",
"day": 8,
},
{
"presentation": "MNO",
"day": 9,
},
{
"presentation": "PQR",
"day": 9,
},
{
"presentation": "STU",
"day": 9,
}
]
} #only one dictionary item in the list for simplicity
]
The end result should be three dictionaries containing lists of presentations where there is one presentation for a given day:
[
{
"time": "04:00-20:59",
"category": 1,
"presentations": [
{
"presentation": "ABC",
"day": 7
},
{
"presentation": "DEF",
"day": 8
},
{
"presentation": "GHI",
"day": 9
}
]
},
{
"time": "04:00-20:59",
"category": 1,
"presentations": [
{
"presentation": "JKL",
"day": 7
},
{
"presentation": "MNO",
"day": 8
},
{
"presentation": "PQR",
"day": 9
}
]
},
{
"time": "04:00-20:59",
"category": 1,
"presentations": [
{
"presentation": "STU",
"day": 9
}
]
}
]
I don't know how to go about removing these duplicates (based on day) while adding them to their own dictionary.
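One possible approach, sketched under assumptions: walk through the presentations in order, count how many have already been seen for each day, and put the k-th presentation of a day into the k-th new dictionary. The function name `split_by_day` is made up for illustration. The sketch also assumes each presentation keeps the day it has in the input, so the grouping comes out as ABC/GHI/MNO, DEF/JKL/PQR and STU, which differs from the pairing shown in the sample result.

```python
from collections import defaultdict

def split_by_day(items):
    """Split every item so that each copy holds at most one presentation per day."""
    result = []
    for item in items:
        rounds = []              # rounds[k] collects the (k+1)-th presentation of each day
        seen = defaultdict(int)  # number of presentations seen so far, per day
        for pres in item["presentations"]:
            k = seen[pres["day"]]
            seen[pres["day"]] += 1
            if k == len(rounds):
                rounds.append([])
            rounds[k].append(pres)
        for presentations in rounds:
            # Copy every field except the presentation list itself.
            new_item = {key: value for key, value in item.items() if key != "presentations"}
            new_item["presentations"] = presentations
            result.append(new_item)
    return result

schedule = [{
    "time": "04:00-20:59",
    "category": 1,
    "presentations": [
        {"presentation": "ABC", "day": 7},
        {"presentation": "DEF", "day": 7},
        {"presentation": "GHI", "day": 8},
        {"presentation": "JKL", "day": 8},
        {"presentation": "MNO", "day": 9},
        {"presentation": "PQR", "day": 9},
        {"presentation": "STU", "day": 9},
    ],
}]

split = split_by_day(schedule)
for entry in split:
    print([p["presentation"] for p in entry["presentations"]])
```

The new dictionaries are returned as a fresh list, so the input list is left untouched.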

Correctly parsing data with jq

I have the following data:
[
{
"M": [
{
"id": 1,
"nk": "MATH$$SPRING$$INST1$$2",
"section": {
"nk": "MATH$$SPRING$$INST1",
"course": 1,
"id": 1
},
"location": {
"id": 1,
"nk": "mcu$$101",
"campus": {
"id": 1,
"nk": "mcu",
"name": "Main Campus"
},
"address": "1 st",
"building": "1",
"room": "101"
},
"day_of_week": 2,
"start_time": "09:00:00",
"end_time": "10:00:00"
},
{
"id": 3,
"nk": "ENG$$SPRING$$INST2$$2",
"section": {
"nk": "ENG$$SPRING$$INST2",
"course": 2,
"id": 4
},
"location": {
"id": 2,
"nk": "mcu$$201",
"campus": {
"id": 1,
"nk": "mcu",
"name": "Main Campus"
},
"address": "1 st",
"building": "1",
"room": "201"
},
"day_of_week": 2,
"start_time": "09:00:00",
"end_time": "10:00:00"
},
{
"id": 4,
"nk": "ENG$$SPRING$$INST2$$22",
"section": {
"nk": "ENG$$SPRING$$INST2",
"course": 2,
"id": 4
},
"location": {
"id": 2,
"nk": "mcu$$201",
"campus": {
"id": 1,
"nk": "mcu",
"name": "Main Campus"
},
"address": "1 st",
"building": "1",
"room": "201"
},
"day_of_week": 2,
"start_time": "10:00:00",
"end_time": "11:00:00"
}
]
},
{
"W": [
{
"id": 2,
"nk": "MATH$$SPRING$$INST1$$4",
"section": {
"nk": "MATH$$SPRING$$INST2",
"course": 1,
"id": 2
},
"location": {
"id": 2,
"nk": "mcu$$201",
"campus": {
"id": 1,
"nk": "mcu",
"name": "Main Campus"
},
"address": "1 st",
"building": "1",
"room": "201"
},
"day_of_week": 4,
"start_time": "08:00:00",
"end_time": "10:00:00"
}
]
}
]
I'm trying to extract "W"'s list.
When I run jq('[.[].W][]').transform(data) I get None, but when I run jq('[.[].M][]').transform(data) I get the desired result. Why is this happening?

I'm trying to extract "W"'s list.
OK, so let's first deal with jq, and then with the python interface.
jq
.[] yields all the items in the top-level array, and therefore
.[] | .W will yield two items:
null (because the first item does not have .W), and
the desired list
To extract just "W"'s list, you could use any of the following filters,
depending on your precise requirements:
.[] | select(has("W")) | .W
.[] | .W | select(.)
.[] | .W // empty
.[1].W
from jq import jq
As the documentation at https://pypi.org/project/pyjq/ says:
If multiple_output is False (the default), then the first output is used
For example:
print jq('1,2').transform(data)
yields just 1.
In summary
Depending on the precise requirements, you can use any of the filters given above, for example:
jq('.[] | .W // empty').transform(data)
Moral
If there's a moral to this tale, it might be that, when in doubt, one should consider using jq (the command-line executable) or jqplay to make sure your jq filter is doing what you want.
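To make the difference between the filters concrete, here is a rough plain-Python analogy (not jq itself, and using a heavily trimmed stand-in for the data). It shows why the first output is None when the first top-level item has no "W" key:

```python
# Trimmed stand-in for the data: the first item only has "M", the second only has "W".
data = [
    {"M": [{"id": 1}, {"id": 3}, {"id": 4}]},
    {"W": [{"id": 2}]},
]

# Roughly what `.[] | .W` produces: one value per top-level item,
# with None (jq's null) wherever "W" is absent.
all_outputs = [item.get("W") for item in data]

# Roughly what `.[] | select(has("W")) | .W` produces: only items that have "W".
w_lists = [item["W"] for item in data if "W" in item]

print(all_outputs)  # [None, [{'id': 2}]]
print(w_lists)      # [[{'id': 2}]]
```

Taking only the first element of `all_outputs` gives None, which mirrors the behaviour described above when only the first output is used.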

converting list of dictionary to dictionary tree based on parent id

I want to restructure a list of dictionaries so that every element that has a parent id becomes a child of its parent element.
Let's say we have a python list, which contains multiple dictionaries.
[{
"id": 1,
"title": "node1",
"parent": null
},
{
"id": 2,
"title": "node2",
"parent": 1
},
{
"id": 3,
"title": "node3",
"parent": 1
},
{
"id": 4,
"title": "node4",
"parent": 2
},
{
"id": 5,
"title": "node5",
"parent": 2
}]
And I want to convert this list to a tree based on the parent key, like:
[{
'id':1,
'title':'node1',
'childs':[
{
'id':2,
'title':'node2',
'childs':[
{
'id':4,
'title':'node4',
'childs': []
},
{
'id':5,
'title':'node5',
'childs': []
}
]
},
{
'id':3,
'title':'node3',
'childs':[]
}
]
}]

data = [{
"id": 1,
"title": "node1",
"parent": "null"
},
{ "id": 2,
"title": "node2",
"parent": "null"
},
{
"id": 2,
"title": "node2",
"parent": 1
},
{
"id": 3,
"title": "node3",
"parent": 1
},
{
"id": 4,
"title": "node4",
"parent": 2
},
{
"id": 5,
"title": "node5",
"parent": 2
}]
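parent_data = []
# First pass: collect the top-level nodes (those whose parent is "null").
for node in data:
    if node['parent'] == "null":
        node['childs'] = []
        parent_data.append(node)
# Second pass: attach each remaining node to its top-level parent.
# Note: this only builds one level of nesting below the top-level nodes.
for node in data:
    for parent in parent_data:
        if parent['id'] == node['parent']:
            parent['childs'].append(node)
print(parent_data)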

k = [{
"id": 1,
"title": "node1",
"parent": "null"
},
{
"id": 2,
"title": "node2",
"parent": 1
},
{
"id": 3,
"title": "node3",
"parent": 1
},
{
"id": 4,
"title": "node4",
"parent": 2
},
{
"id": 5,
"title": "node5",
"parent": 2
}]
result, t = [], {}  # t maps each id to its node in the tree
# Assumes every parent appears in the list before its children.
for i in k:
    i['childs'] = []
    if i['parent'] == 'null':
        del i['parent']
        result.append(i)
        t[i['id']] = i  # register the root under its own id
    else:
        parent_id = i.pop('parent')
        t[parent_id]['childs'].append(i)
        t[i['id']] = i
print(result)
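Both answers above depend on the order or depth of the input. As a hedged alternative sketch, a two-pass approach indexes every node by id first and only then links children to parents, so nesting depth and input order no longer matter. The name `build_tree` and its `root_marker` parameter are made up for illustration; the sketch assumes ids are unique, every non-root parent id exists in the list, and roots are marked with None (pass root_marker="null" for the string form used above).

```python
def build_tree(nodes, root_marker=None):
    """Nest nodes under their parents; hypothetical helper, not from the answers above."""
    # First pass: one output node per id, without the 'parent' key.
    by_id = {n["id"]: {"id": n["id"], "title": n["title"], "childs": []} for n in nodes}
    roots = []
    # Second pass: attach each node to its parent, or to the list of roots.
    for n in nodes:
        node = by_id[n["id"]]
        if n["parent"] == root_marker:
            roots.append(node)
        else:
            by_id[n["parent"]]["childs"].append(node)
    return roots

nodes = [
    {"id": 4, "title": "node4", "parent": 2},  # child listed before its parent on purpose
    {"id": 1, "title": "node1", "parent": None},
    {"id": 2, "title": "node2", "parent": 1},
    {"id": 3, "title": "node3", "parent": 1},
    {"id": 5, "title": "node5", "parent": 2},
]

tree = build_tree(nodes)
print(tree)
```

Because new dictionaries are built for the output, the input list is not modified.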
