Compare two lists of dictionaries in Python

I have two lists of dictionaries named category and sub_category:
category = [{'cat_id':1,'total':300,'from':250},{'cat_id':2,'total':100,'from':150}]
sub_category = [{'id':1,'cat_id':1,'charge':30},{'id':2,'cat_id':1,'charge':20},{'id':3,'cat_id':2,'charge':30}]
I want to set charge to 0 in sub_category whenever total >= from in the category entry with the same cat_id.
Expected result:
sub_category = [{'id':1,'cat_id':1,'charge':0},{'id':2,'cat_id':1,'charge':0},{'id':3,'cat_id':2,'charge':30}]
I managed to get the result using this:
for sub in sub_category:
    for cat in category:
        if cat['cat_id'] == sub['cat_id']:
            if cat['total'] >= cat['from']:
                sub['charge'] = 0
But I would like to know whether there is a better way of doing this. Any help would be highly appreciated.

This is one approach. Change category to a dict for easy lookup.
Example:
category = [{'cat_id':1,'total':300,'from':250},{'cat_id':2,'total':100,'from':150}]
sub_category = [{'id':1,'cat_id':1,'charge':30},{'id':2,'cat_id':1,'charge':20},{'id':3,'cat_id':2,'charge':30}]
category = {i.pop('cat_id'): i for i in category}
for i in sub_category:
    if i['cat_id'] in category:
        if category[i['cat_id']]['total'] >= category[i['cat_id']]['from']:
            i['charge'] = 0
print(sub_category)
Output:
[{'cat_id': 1, 'charge': 0, 'id': 1},
{'cat_id': 1, 'charge': 0, 'id': 2},
{'cat_id': 2, 'charge': 30, 'id': 3}]
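
One side effect of the dict comprehension in this answer is that i.pop('cat_id') removes the key from each original category dict. A minimal sketch of the same lookup idea that leaves category untouched might look like this (the name waived is only illustrative, not part of the original answer):

```python
category = [{'cat_id': 1, 'total': 300, 'from': 250},
            {'cat_id': 2, 'total': 100, 'from': 150}]
sub_category = [{'id': 1, 'cat_id': 1, 'charge': 30},
                {'id': 2, 'cat_id': 1, 'charge': 20},
                {'id': 3, 'cat_id': 2, 'charge': 30}]

# Map each cat_id to a boolean: True when total >= from for that category.
waived = {cat['cat_id']: cat['total'] >= cat['from'] for cat in category}

for sub in sub_category:
    # .get() falls back to False for a cat_id that has no category entry.
    if waived.get(sub['cat_id'], False):
        sub['charge'] = 0

print(sub_category)
```

The comparison is done once per category instead of once per sub-category row, and the category list can still be reused afterwards.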

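Try this:
I think the way I did it may not be suitable in some cases, but I like to use list comprehensions, so have a look. Note that it compares cat['cat_id'] with sub_cat['id'], stores the string '0', and emits one entry per (sub_cat, cat) pair, so the result differs from the expected output.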
category = [{'cat_id':1,'total':300,'from':250},{'cat_id':2,'total':100,'from':150}]
sub_category = [{'id':1,'cat_id':1,'charge':30},{'id':2,'cat_id':1,'charge':20},{'id':3,'cat_id':2,'charge':30}]
print([sub_cat if cat['cat_id'] == sub_cat['id'] and cat['total'] >= cat['from'] and not sub_cat.__setitem__('charge','0') else sub_cat for sub_cat in sub_category for cat in category])
Result:[{'cat_id': 1, 'charge': '0', 'id': 1}, {'cat_id': 1, 'charge': '0', 'id': 1}, {'cat_id': 1, 'charge': 20, 'id': 2}, {'cat_id': 1, 'charge': 20, 'id': 2}, {'cat_id': 2, 'charge': 30, 'id': 3}, {'cat_id': 2, 'charge': 30, 'id': 3}]

You can solve your problem using this approach:
target_categories = set([elem.get('cat_id') for elem in category if elem.get('total', 0) >= elem.get('from', 0)])
if None in target_categories:
    target_categories.remove(None)  # if there's no cat_id in one of the categories we will get None in target_categories. Remove it.
for elem in sub_category:
    if elem.get('cat_id') in target_categories:
        elem.update({'charge': 0})
Time comparison with another approach:
import numpy as np
size = 5000000
np.random.seed()
cat_ids = np.random.randint(50, size=(size,))
totals = np.random.randint(500, size=(size,))
froms = np.random.randint(500, size=(size,))
category = [{'cat_id': cat_id, 'total': total, 'from': from_} for cat_id, total, from_ in zip(cat_ids, totals, froms)]
sub_category = [{'id': 1, 'cat_id': np.random.randint(50), 'charge': np.random.randint(100)} for i in range(size)]
%%time
target_categories = set([elem.get('cat_id') for elem in category if elem.get('total', 0) >= elem.get('from', 0)])
if None in target_categories:
    target_categories.remove(None)  # if there's no cat_id in one of the categories we will get None in target_categories. Remove it.
for elem in sub_category:
    if elem.get('cat_id') in target_categories:
        elem.update({'charge': 0})
# Wall time: 3.47 s
%%time
category = {i.pop('cat_id'): i for i in category}
for i in sub_category:
    if i['cat_id'] in category:
        if category[i['cat_id']]['total'] >= category[i['cat_id']]['from']:
            i['charge'] = 0
# Wall time: 5.73 s
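
The %%time cells above need IPython. A rough equivalent using only the standard library's timeit module might look like the sketch below. The data sizes, the seed, and the helper names build_data, with_set and with_dict are assumptions made for illustration. The categories here have unique cat_id values so that both variants produce the same result, and the dict variant indexes by cat_id without pop so it can be rerun. Timings depend on the machine, so none are quoted.

```python
import random
import timeit

def build_data(n_categories=50, n_subs=100_000, seed=0):
    """Build random test data; sizes and seed are arbitrary choices."""
    rng = random.Random(seed)
    category = [{'cat_id': c, 'total': rng.randrange(500), 'from': rng.randrange(500)}
                for c in range(n_categories)]
    sub_category = [{'id': i, 'cat_id': rng.randrange(n_categories), 'charge': rng.randrange(100)}
                    for i in range(n_subs)]
    return category, sub_category

def with_set(category, sub_category):
    # Collect the qualifying cat_ids once, then test membership per row.
    targets = {c['cat_id'] for c in category if c['total'] >= c['from']}
    for sub in sub_category:
        if sub['cat_id'] in targets:
            sub['charge'] = 0
    return sub_category

def with_dict(category, sub_category):
    # Index the categories by cat_id, then compare total and from per row.
    by_id = {c['cat_id']: c for c in category}
    for sub in sub_category:
        cat = by_id.get(sub['cat_id'])
        if cat is not None and cat['total'] >= cat['from']:
            sub['charge'] = 0
    return sub_category

for func in (with_set, with_dict):
    category, sub_category = build_data()
    seconds = timeit.timeit(lambda: func(category, sub_category), number=5)
    print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```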

Solution:
# Input
category = [{'cat_id':1,'total':300,'from':250},{'cat_id':2,'total':100,'from':150}]
sub_category = [{'id':1,'cat_id':1,'charge':30},{'id':2,'cat_id':1,'charge':20},{'id':3,'cat_id':2,'charge':30}]
# Main code
for k in sub_category:
    if k["cat_id"] in [i["cat_id"] for i in category if i["total"] >= i["from"]]:
        k["charge"] = 0
print(sub_category)
# Output
[{'id': 1, 'cat_id': 1, 'charge': 0}, {'id': 2, 'cat_id': 1, 'charge': 0}, {'id': 3, 'cat_id': 2, 'charge': 30}]

Related

How to convert key to value in dictionary type?

I have a question about converting dictionary keys into values.
First, I have this kind of word count from a Data Frame.
[Example]
dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}
I want to get this result.
[Result]
result = {'name': 'forest', 'value': 10,
'name': 'station', 'value': 3,
'name': 'office', 'value': 7,
'name': 'park', 'value': 2}
Please check this issue.

As Rakesh said:
dict cannot have duplicate keys
The closest way to achieve what you want is to build something like this:
my_dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}
result = list(map(lambda x: {'name': x[0], 'value': x[1]}, my_dict.items()))
You will get
result = [
{'name': 'forest', 'value': 10},
{'name': 'station', 'value': 3},
{'name': 'office', 'value': 7},
{'name': 'park', 'value': 2},
]
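
The same list can also be built with a list comprehension instead of map and lambda, which some readers find easier to follow. A minimal sketch, reusing the my_dict name from this answer:

```python
my_dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}

# Unpack each (key, value) pair directly into a small dict.
result = [{'name': name, 'value': value} for name, value in my_dict.items()]

print(result)
```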

As Rakesh said, you can't have duplicate keys in a dictionary.
You can simply try this:
my_dict = {'forest': 10, 'station': 3, 'office': 7, 'park': 2}  # my_dict avoids shadowing the built-in dict
result = {}
count = 0
for key in my_dict:
    result[count] = {'name': key, 'value': my_dict[key]}
    count = count + 1
print(result)

how to add list values to existing dictionary in python

I'm trying to add each value of score to the corresponding dict in names, i.e. score[0] to names[0] and so on.
names=[{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]
Output should be like this
names=[{'id': 1, 'name': 'laptop','score':0.98}, {'id': 2, 'name': 'box','score':0.81}, {'id': 3, 'name': 'printer','score':0.78}]
How to achieve this? thanks in advance

I'd do it with a comprehension like this:
>>> [{**d, 'score':s} for d, s in zip(names, score)]
[{'id': 1, 'name': 'laptop', 'score': 0.9894376397132874}, {'id': 2, 'name': 'box', 'score': 0.819094657897949}, {'id': 3, 'name': 'printer', 'score': 0.78116521835327}]
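
Note that this comprehension builds new dictionaries and leaves names itself unchanged, so the result has to be assigned somewhere. A short sketch of that behaviour (the name with_scores is only illustrative; the {**d, ...} syntax needs Python 3.5 or later):

```python
names = [{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]

# Each {**d, 'score': s} is a fresh dict: a copy of d plus the new key.
with_scores = [{**d, 'score': s} for d, s in zip(names, score)]

print('score' in names[0])  # False: the original dicts were not modified

# Rebind the name if the old list is no longer needed.
names = with_scores
```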

Without list comprehension.
for i, name in enumerate(names):
    name['score'] = score[i]
print(names)

This is an easy-to-understand solution. From your example, I understand you don't want to round the numbers but truncate them.
import math
def truncate(f, n):
    return math.floor(f * 10 ** n) / 10 ** n
names=[{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]
n = len(score)
for i in range(n):
    names[i]["score"] = truncate(score[i], 2)
print(names)
If you do want to round the numbers:
names=[{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]
n = len(score)
for i in range(n):
    names[i]["score"] = round(score[i], 2)
print(names)
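
Truncating with floating-point multiplication can occasionally land one step too low, because some decimal values are stored slightly below their printed form. If that matters, one possible alternative is the standard decimal module with ROUND_DOWN. The helper name truncate_decimal below is hypothetical, and ROUND_DOWN truncates toward zero, which differs from math.floor for negative numbers.

```python
from decimal import Decimal, ROUND_DOWN

def truncate_decimal(value, places):
    """Cut a float to `places` decimals without rounding (hypothetical helper)."""
    quantum = Decimal(10) ** -places  # places=2 gives Decimal('0.01')
    # str(value) uses the float's shortest repr, so 0.29 stays '0.29'.
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN))

names = [{'id': 1, 'name': 'laptop'}, {'id': 2, 'name': 'box'}, {'id': 3, 'name': 'printer'}]
score = [0.9894376397132874, 0.819094657897949, 0.78116521835327]

for entry, value in zip(names, score):
    entry['score'] = truncate_decimal(value, 2)

print(names)
```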

Combining multiple lists of dictionaries

I have several lists of dictionaries, where each dictionary contains a unique id value that is common among all lists. I'd like to combine them into a single list of dicts, where each dict is joined on that id value.
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
I tried doing something like the answer found at https://stackoverflow.com/a/42018660/7564393, but I'm getting very confused since I have more than 2 lists. Should I try using a defaultdict approach? More importantly, I am NOT always going to know the other values, only that the id value is present in all dicts.

You can use itertools.groupby():
from itertools import groupby
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = []
for _, values in groupby(sorted([*list1, *list2, *list3], key=lambda x: x['id']), key=lambda x: x['id']):
    temp = {}
    for d in values:
        temp.update(d)
    desired_output.append(temp)
Result:
[{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
# combine all lists
d = {} # id -> dict
for l in [list1, list2, list3]:
    for list_d in l:
        if 'id' not in list_d: continue
        id = list_d['id']
        if id not in d:
            d[id] = list_d
        else:
            d[id].update(list_d)
# dicts with same id are grouped together since id is used as key
res = [v for v in d.values()]
print(res)

You can first build a dict of dicts, then turn it into a list:
from itertools import chain
from collections import defaultdict
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
dict_out = defaultdict(dict)
for d in chain(list1, list2, list3):
    dict_out[d['id']].update(d)
out = list(dict_out.values())
print(out)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
itertools.chain allows you to iterate on all the dicts contained in the 3 lists. We build a dict dict_out having the id as key, and the corresponding dict being built as value. This way, we can easily update the already built part with the small dict of our current iteration.
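
Since the question mentions having more than two lists and not knowing the other keys in advance, the same idea can be wrapped in a small function. This is only a sketch; the name merge_by_key and its key parameter are made up for illustration.

```python
from collections import defaultdict
from itertools import chain

def merge_by_key(*lists, key='id'):
    """Merge dicts from any number of lists, joining them on `key`."""
    merged = defaultdict(dict)
    # chain.from_iterable walks every dict of every list in order.
    for d in chain.from_iterable(lists):
        merged[d[key]].update(d)
    return list(merged.values())

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]

print(merge_by_key(list1, list2, list3))
```

Because update() copies into a fresh dict created by the defaultdict, the input dictionaries are left unmodified.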

Here is a functional approach that does not use itertools (which is excellent for rapid development work).
This solution works for any number of lists, since the function takes a variable number of arguments, and it also lets the caller choose the return type (list or dict).
By default it returns a list, as you asked; if you pass as_list=False it returns a dictionary instead.
I used a dictionary internally because lookups by id are fast.
Have a look at the get_packed_list() function below.
get_packed_list()
def get_packed_list(*dicts_lists, as_list=True):
    output = {}
    for dicts_list in dicts_lists:
        for dictionary in dicts_list:
            _id = dictionary.pop("id")  # id() is in-built function so preferred _id
            if _id not in output:
                # Create new id
                output[_id] = {"id": _id}
            for key in dictionary:
                output[_id][key] = dictionary[key]
            dictionary["id"] = _id  # push back the 'id' after work (call by reference mechanism)
    if as_list:
        return [output[key] for key in output]
    return output  # dictionary
Test
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
output = get_packed_list(list1, list2, list3)
print(output)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
output = get_packed_list(list1, list2, list3, as_list=False)
print(output)
# {1: {'id': 1, 'value': 20, 'sum': 10, 'total': 30}, 2: {'id': 2, 'value': 21, 'sum': 11, 'total': 32}}

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
print(list1+list2+list3)

list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
result = []
for i in range(0,len(list1)):
    final_dict = dict(list(list1[i].items()) + list(list2[i].items()) + list(list3[i].items()))
    result.append(final_dict)
print(result)
output : [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]

Ordering a Django queryset based on other list with ids and scores

I'm a bit mentally stuck at something, that seems really simple at first glance.
I'm grabbing a list of ids to be selected and scores to sort them based on.
My current solution is the following:
ids = [1, 2, 3, 4, 5]
items = Item.objects.filter(pk__in=ids)
Now I need to add a score based ordering somehow so I'll build the following list:
scores = [
{'id': 1, 'score': 15},
{'id': 2, 'score': 7},
{'id': 3, 'score': 17},
{'id': 4, 'score': 11},
{'id': 5, 'score': 9},
]
ids = [score['id'] for score in scores]
items = Item.objects.filter(pk__in=ids)
So far so good - but how do I actually add the scores as some sort of aggregate and sort the queryset based on them?

Sort the scores list, and fetch the queryset using in_bulk().
scores = [
{'id': 1, 'score': 15},
{'id': 2, 'score': 7},
{'id': 3, 'score': 17},
{'id': 4, 'score': 11},
{'id': 5, 'score': 9},
]
sorted_scores = sorted(scores, key=lambda s: s['score'])  # use reverse=True for descending order
ids = [score['id'] for score in sorted_scores]
items = Item.objects.in_bulk(ids)
Then generate a list of the items in the order you want:
items_in_order = [items[x] for x in ids]
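
in_bulk() returns a dict that maps each id to its object, which is why the final list comprehension can look items up by id. The sketch below imitates that mapping with plain strings so it runs without Django; the items dict is a stand-in for the in_bulk() result, not real model instances, and the scores are sorted by their 'score' key in descending order.

```python
scores = [
    {'id': 1, 'score': 15},
    {'id': 2, 'score': 7},
    {'id': 3, 'score': 17},
    {'id': 4, 'score': 11},
    {'id': 5, 'score': 9},
]

# Stand-in for Item.objects.in_bulk(ids): a dict keyed by primary key.
items = {pk: f"Item {pk}" for pk in (1, 2, 3, 4, 5)}

# Highest score first; drop reverse=True for ascending order.
sorted_scores = sorted(scores, key=lambda s: s['score'], reverse=True)
items_in_order = [items[s['id']] for s in sorted_scores]

print(items_in_order)  # ['Item 3', 'Item 1', 'Item 4', 'Item 5', 'Item 2']
```

The result is a plain list rather than a queryset, so further filtering has to happen before this step.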

Merging arrays of versioned dictionaries

Given the following two arrays of dictionaries, how can I merge them such that the resulting array of dictionaries contains only those dictionaries whose version is greatest?
data1 = [{'id': 1, 'name': u'Oneeee', 'version': 2},
{'id': 2, 'name': u'Two', 'version': 1},
{'id': 3, 'name': u'Three', 'version': 2},
{'id': 4, 'name': u'Four', 'version': 1},
{'id': 5, 'name': u'Five', 'version': 1}]
data2 = [{'id': 1, 'name': u'One', 'version': 1},
{'id': 2, 'name': u'Two', 'version': 1},
{'id': 3, 'name': u'Threeee', 'version': 3},
{'id': 6, 'name': u'Six', 'version': 2}]
The merged result should look like this:
data3 = [{'id': 1, 'name': u'Oneeee', 'version': 2},
{'id': 2, 'name': u'Two', 'version': 1},
{'id': 3, 'name': u'Threeee', 'version': 3},
{'id': 4, 'name': u'Four', 'version': 1},
{'id': 5, 'name': u'Five', 'version': 1},
{'id': 6, 'name': u'Six', 'version': 2}]

If you want to keep the highest version for each dictionary id, you can use itertools.groupby like this:
import itertools
sdata = sorted(data1 + data2, key=lambda x: x['id'])
res = []
for _, v in itertools.groupby(sdata, key=lambda x: x['id']):
    v = list(v)
    if len(v) > 1:  # the same id appeared in both lists
        # append the one with the higher version
        res.append(v[0] if v[0]['version'] > v[1]['version'] else v[1])
    else:  # the id appeared in only one of the two lists
        res.append(v[0])
The solution is not a one-liner, but I think it is simple enough (once you understand groupby(), which is not trivial).
This will result in res containing this list:
[{'id': 1, 'name': u'Oneeee', 'version': 2},
{'id': 2, 'name': u'Two', 'version': 1},
{'id': 3, 'name': u'Threeee', 'version': 3},
{'id': 4, 'name': u'Four', 'version': 1},
{'id': 5, 'name': u'Five', 'version': 1},
{'id': 6, 'name': u'Six', 'version': 2}]
I think it is possible to shrink the solution even more, but it could become quite hard to understand.
Hope this helps!
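
One possible shorter form, shown only as a sketch, replaces the explicit two-element comparison with max() over each group, which also handles more than two input lists. The function name merge_latest is made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def merge_latest(*lists):
    """Keep, for each id, the dict with the greatest version."""
    combined = sorted((d for lst in lists for d in lst), key=itemgetter('id'))
    # groupby needs its input sorted by the same key it groups on.
    return [max(group, key=itemgetter('version'))
            for _, group in groupby(combined, key=itemgetter('id'))]

data1 = [{'id': 1, 'name': 'Oneeee', 'version': 2},
         {'id': 2, 'name': 'Two', 'version': 1},
         {'id': 3, 'name': 'Three', 'version': 2},
         {'id': 4, 'name': 'Four', 'version': 1},
         {'id': 5, 'name': 'Five', 'version': 1}]
data2 = [{'id': 1, 'name': 'One', 'version': 1},
         {'id': 2, 'name': 'Two', 'version': 1},
         {'id': 3, 'name': 'Threeee', 'version': 3},
         {'id': 6, 'name': 'Six', 'version': 2}]

data3 = merge_latest(data1, data2)
print(data3)
```

When two entries share the same id and version, max() keeps the first one it sees, which is the entry from the earlier list because sorted() is stable.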

A fairly straightforward procedural solution, where we build a dictionary keyed by item id, and then replace the items:
indexed_data = { item['id']: item for item in data1 }
# or, pre-Python2.7:
# indexed_data = dict((item['id'], item) for item in data1)
for item in data2:
    if indexed_data.get(item['id'], {'version': float('-inf')})['version'] < item['version']:
        indexed_data[item['id']] = item
data3 = [item for (_, item) in sorted(indexed_data.items())]
The same thing, but using a more functional approach:
sorted_items = sorted(data1 + data2, key=lambda item: (item['id'], item['version']))
merged = { item['id']: item for item in sorted_items }
# or, pre-Python2.7:
# merged = dict((item['id'], item) for item in sorted_items )
data3 = [item for (_, item) in sorted(merged.items())]
