Creating a complex nested dictionary from multiple lists in Python - python

I am struggling to create a nested dictionary with the following data:
Team, Group, ID, Score, Difficulty
OneTeam, A, 0, 0.25, 4
TwoTeam, A, 1, 1, 10
ThreeTeam, A, 2, 0.64, 5
FourTeam, A, 3, 0.93, 6
FiveTeam, B, 4, 0.5, 7
SixTeam, B, 5, 0.3, 8
SevenTeam, B, 6, 0.23, 9
EightTeam, B, 7, 1.2, 4
Once imported as a Pandas Dataframe, I turn each feature into these lists:
teams, group, id, score, diff.
Using the Stack Overflow answer "Create a complex dictionary using multiple lists", I can create the following dictionary:
{'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2},
'FiveTeam': {'diff': 7, 'id': 4, 'score': 0.5},
'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93},
'OneTeam': {'diff': 4, 'id': 0, 'score': 0.25},
'SevenTeam': {'diff': 9, 'id': 6, 'score': 0.23},
'SixTeam': {'diff': 8, 'id': 5, 'score': 0.3},
'ThreeTeam': {'diff': 5, 'id': 2, 'score': 0.64},
'TwoTeam': {'diff': 10, 'id': 1, 'score': 1.0}}
using the code:
{team: {'id': i, 'score': s, 'diff': d} for team, i, s, d in zip(teams, id, score, diff)}
But what I'm after is having 'Group' as the main key, then team, and then id, score and difficulty within the team (as above).
I have tried:
{g: {team: {'id': i, 'score': s, 'diff': d}} for g, team, i, s, d in zip(group, teams, id, score, diff)}
but this doesn't work and results in only one team per group within the dictionary:
{'A': {'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93}},
'B': {'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2}}}
Below is how the dictionary should look, but I'm not sure how to get there - any help would be much appreciated!
{'A': {'FourTeam': {'diff': 6, 'id': 3, 'score': 0.93},
       'OneTeam': {'diff': 4, 'id': 0, 'score': 0.25},
       'ThreeTeam': {'diff': 5, 'id': 2, 'score': 0.64},
       'TwoTeam': {'diff': 10, 'id': 1, 'score': 1.0}},
 'B': {'EightTeam': {'diff': 4, 'id': 7, 'score': 1.2},
       'FiveTeam': {'diff': 7, 'id': 4, 'score': 0.5},
       'SevenTeam': {'diff': 9, 'id': 6, 'score': 0.23},
       'SixTeam': {'diff': 8, 'id': 5, 'score': 0.3}}}

A dict comprehension may not be the best way of solving this if your data is stored in a table like this.
Try something like
from collections import defaultdict
groups = defaultdict(dict)
for g, team, i, s, d in zip(group, teams, id, score, diff):
    groups[g][team] = {'id': i, 'score': s, 'diff': d}
With a defaultdict, groups[g] is created automatically as an empty dict the first time a group is seen; on later rows it already exists, so the new team is simply added to it as another key.
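For comparison, here is a minimal sketch of the same grouping with a plain dict and setdefault, using the data from the question. The list name ids is my own choice (the question calls it id, which shadows the built-in). It also shows why the attempted comprehension kept only one team per group: in a dict comprehension, a later entry with the same key replaces the earlier one.

```python
teams = ['OneTeam', 'TwoTeam', 'ThreeTeam', 'FourTeam',
         'FiveTeam', 'SixTeam', 'SevenTeam', 'EightTeam']
group = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
ids = [0, 1, 2, 3, 4, 5, 6, 7]
score = [0.25, 1, 0.64, 0.93, 0.5, 0.3, 0.23, 1.2]
diff = [4, 10, 5, 6, 7, 8, 9, 4]

# The attempted comprehension: every row of a group writes to the same key,
# so only the last team of each group survives.
overwritten = {g: {team: {'id': i, 'score': s, 'diff': d}}
               for g, team, i, s, d in zip(group, teams, ids, score, diff)}

# setdefault returns the existing inner dict, or inserts an empty one first.
groups = {}
for g, team, i, s, d in zip(group, teams, ids, score, diff):
    groups.setdefault(g, {})[team] = {'id': i, 'score': s, 'diff': d}

print(len(overwritten['A']), len(groups['A']))
```

Whether to prefer setdefault or defaultdict is mostly a matter of taste; setdefault leaves you with an ordinary dict that does not create keys on lookup.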
Edit: you edited your question to say that your data is in a pandas DataFrame. In that case you can skip turning the columns into lists and instead do, for example:
from collections import defaultdict
groups = defaultdict(dict)
for row in df.itertuples():
    groups[row.Group][row.Team] = {'id': row.ID, 'score': row.Score, 'diff': row.Difficulty}

If you absolutely want to use comprehension, then this should work:
z = list(zip(teams, group, id, score, diff))  # a list, so it can be re-scanned for every group
s = set(group)
d = {  # outer dict, one entry for each different group
    group: ({  # inner dict, one entry per team, filtered for group
        team: {'id': i, 'score': s, 'diff': d}
        for team, g, i, s, d in z
        if g == group
    })
    for group in s
}
I added linebreaks for clarity
EDIT:
After the comment, to better clarify my intention and out of curiosity, I ran a comparison:
from collections import defaultdict
import timeit
teams = ['OneTeam', 'TwoTeam', 'ThreeTeam', 'FourTeam', 'FiveTeam', 'SixTeam', 'SevenTeam', 'EightTeam']
group = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
id = [0, 1, 2, 3, 4, 5, 6, 7]
score = [0.25, 1, 0.64, 0.93, 0.5, 0.3, 0.23, 1.2]
diff = [4, 10, 5, 6, 7, 8, 9, 4]
def no_comprehension():
    global group, teams, id, score, diff
    groups = defaultdict(dict)
    for g, team, i, s, d in zip(group, teams, id, score, diff):
        groups[g][team] = {'id': i, 'score': s, 'diff': d}
def comprehension():
    global group, teams, id, score, diff
    z = list(zip(teams, group, id, score, diff))  # a list, so every group can re-scan it
    s = set(group)
    d = {group: ({team: {'id': i, 'score': s, 'diff': d} for team, g, i, s, d in z if g == group}) for group in s}
print("no comprehension:")
print(timeit.timeit(lambda : no_comprehension(), number=10000))
print("comprehension:")
print(timeit.timeit(lambda : comprehension(), number=10000))
Output:
no comprehension:
0.027287796139717102
comprehension:
0.028979241847991943
The two perform about the same. My sentence above was only meant to offer this as an alternative to the solution already posted by @JohnO.

Related

Using Counts in python to subtract list of dictionaries

I've worked out how to use Counter to add lists of dictionaries with the code below:
from collections import Counter
a = [{'num': 'star1', 'count': 1},
{'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
{'num': 'star2', 'count': 2},
{'num': 'star3', 'count': 1}]
joint = sum((Counter({elem['num']: elem['count']}) for elem in a + b), Counter())
[{'num': num, 'count': counts} for num, counts in joint.items()]
However, when I try to subtract I get an error. For example:
from collections import Counter
a = [{'num': 'star1', 'count': 1},
{'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
{'num': 'star2', 'count': 2},
{'num': 'star3', 'count': 1}]
joint = sum((Counter({elem['num']: elem['count']}) for elem in a - b), Counter())
[{'num': num, 'count': counts} for num, counts in joint .items()]
Is anyone aware of a work around to this? Or how I could approach this issue?
I've tried using the subtract function in counter but it still doesn't seem to work
Counter's - and -= operators drop any result that is zero or negative, but if you don't mind losing the negative counts, this will work (just building the two Counters separately):
from collections import Counter
a = [{'num': 'star1', 'count': 1},
{'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
{'num': 'star2', 'count': 2},
{'num': 'star3', 'count': 1}]
joint = sum((Counter({elem['num']: elem['count']}) for elem in a), Counter())
joint -= sum((Counter({elem['num']: elem['count']}) for elem in b), Counter())
[{'num': num, 'count': counts} for num, counts in joint.items()]
It outputs:
[{'num': 'star2', 'count': 1}]
The other values (star1 and star3) would both be less than 0 because they come to 1-7=-6 and 0-1=-1 respectively
Alternatively you can just not use counters and do it like this (this supports negative numbers and whatever operation you want):
a_info = {d['num']:d['count'] for d in a}
b_info = {d['num']:d['count'] for d in b}
[{'num':item, 'count': a_info.get(item, 0)-b_info.get(item, 0)} for item in set(a_info)|set(b_info)]
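If you would rather stay with Counter and still keep the negative results, the subtract method mentioned in the question is one option: unlike the - operator, it updates the Counter in place and keeps zero and negative counts. A minimal sketch with the same data (the helper name to_counter is my own):

```python
from collections import Counter

a = [{'num': 'star1', 'count': 1},
     {'num': 'star2', 'count': 3}]
b = [{'num': 'star1', 'count': 7},
     {'num': 'star2', 'count': 2},
     {'num': 'star3', 'count': 1}]

def to_counter(records):
    """Collapse a list of {'num': ..., 'count': ...} dicts into one Counter."""
    total = Counter()
    for rec in records:
        total[rec['num']] += rec['count']
    return total

difference = to_counter(a)
difference.subtract(to_counter(b))  # in place; negative counts are kept

result = [{'num': num, 'count': count}
          for num, count in sorted(difference.items())]
print(result)
```

Note that subtract returns None, so it has to be called as a statement rather than used inside an expression.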

Python - group/merge dictionaries based on key/values identity

I have a list containing many dictionaries with same keys but different values.
What I would like to do is to group/merge dictionaries based on the values of some of the keys.
It's probably faster to show an example rather than trying to explain:
[{'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 3, 'C2': 15},
{'zone': 'B', 'weekday': 2, 'hour': 6, 'C1': 5, 'C2': 27},
{'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 7, 'C2': 12},
{'zone': 'C', 'weekday': 5, 'hour': 8, 'C1': 2, 'C2': 13}]
So, what I want to achieve is merging the first and third dictionary, since they have the same "zone", "hour" and "weekday", summing the values in C1 and C2:
[{'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 10, 'C2': 27},
{'zone': 'B', 'weekday': 2, 'hour': 6, 'C1': 5, 'C2': 27},
{'zone': 'C', 'weekday': 5, 'hour': 8, 'C1': 2, 'C2': 13}]
Any help here? :) I've been struggling with this for a couple of days, I've got a bad unscalable solution, but I'm sure there is something far more pythonic that I could put in place.
Thanks!
Sort then group by the relevant keys; iterate over the groups and create new dictionaries with summed values.
import operator
import itertools
keys = operator.itemgetter('zone','weekday','hour')
c1_c2 = operator.itemgetter('C1','C2')
# data is your list of dicts
data.sort(key=keys)
grouped = itertools.groupby(data,keys)
new_data = []
for (zone,weekday,hour),g in grouped:
    c1,c2 = 0,0
    for d in g:
        c1 += d['C1']
        c2 += d['C2']
    new_data.append({'zone':zone,'weekday':weekday,
                     'hour':hour,'C1':c1,'C2':c2})
That last loop could also be written as:
for (zone,weekday,hour),g in grouped:
    cees = map(c1_c2,g)
    c1,c2 = map(sum,zip(*cees))
    new_data.append({'zone':zone,'weekday':weekday,
                     'hour':hour,'C1':c1,'C2':c2})
By using a defaultdict you can merge them in linear time.
from collections import defaultdict
res = defaultdict(lambda : defaultdict(int))
for d in dictionaries:  # dictionaries is your list of dicts
    res[(d['zone'],d['weekday'],d['hour'])]['C1'] += d['C1']
    res[(d['zone'],d['weekday'],d['hour'])]['C2'] += d['C2']
The drawback is that you need another pass to have the output as you've defined it.
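That extra pass can be sketched as follows, using the sample data from the question. The names data, totals and merged are my own; the idea is simply to unpack each key tuple back into the three grouping fields and merge in the summed values.

```python
from collections import defaultdict

data = [{'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 3, 'C2': 15},
        {'zone': 'B', 'weekday': 2, 'hour': 6, 'C1': 5, 'C2': 27},
        {'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 7, 'C2': 12},
        {'zone': 'C', 'weekday': 5, 'hour': 8, 'C1': 2, 'C2': 13}]

# First pass: accumulate C1 and C2 per (zone, weekday, hour) key.
totals = defaultdict(lambda: defaultdict(int))
for d in data:
    key = (d['zone'], d['weekday'], d['hour'])
    totals[key]['C1'] += d['C1']
    totals[key]['C2'] += d['C2']

# Second pass: rebuild the list-of-dicts shape asked for in the question.
merged = [{'zone': zone, 'weekday': weekday, 'hour': hour, **sums}
          for (zone, weekday, hour), sums in totals.items()]
print(merged)
```

Both passes are linear, so the overall cost stays O(n); the groups come out in the order in which each key was first seen.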
I've gone ahead and written a slightly longer solution, making use of namedtuples as keys of the dictionary:
from collections import namedtuple
zones = [{'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 3, 'C2': 15},
{'zone': 'B', 'weekday': 2, 'hour': 6, 'C1': 5, 'C2': 27},
{'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 7, 'C2': 12},
{'zone': 'C', 'weekday': 5, 'hour': 8, 'C1': 2, 'C2': 13}]
ZoneTime = namedtuple("ZoneTime", ["zone", "weekday", "hour"])
results = dict()
for zone in zones:
    zone_time = ZoneTime(zone['zone'], zone['weekday'], zone['hour'])
    if zone_time in results:
        results[zone_time]['C1'] += zone['C1']
        results[zone_time]['C2'] += zone['C2']
    else:
        results[zone_time] = {'C1': zone['C1'], 'C2': zone['C2']}
print(results)
This uses a namedtuple of (zone, weekday, hour) as the key to each dictionary. Then it's fairly trivial to either add to it if it already exists within results, or create a new entry in the dictionary.
You can definitely make this shorter and "smarter", but it may become less readable.
Edit: Run Time Comparison
My original answer (see below) was not a good one, but I think I had a useful contribution by doing a little bit of run time analysis on the other answers so I've edited that portion and put it at the top. Here I include the three other solutions, along with the required transformations to produce the desired output. For completeness I also include a version using pandas, which assumes that the user is working with a DataFrame (transforming from list of dicts to data frame and back was not even close to worth it). Comparison times vary a little depending on the random data generated, but these are fairly representative:
>>> run_timer(100)
Times with 100 values
...with defaultdict: 0.1496697600000516
...with namedtuple: 0.14976404899994122
...with groupby: 0.0690777249999428
...with pandas: 3.3165711250001095
>>> run_timer(1000)
Times with 1000 values
...with defaultdict: 1.267153091999944
...with namedtuple: 0.9605341750000207
...with groupby: 0.6634409229998255
...with pandas: 3.5146895360001054
>>> run_timer(10000)
Times with 10000 values
...with defaultdict: 9.194478484000001
...with namedtuple: 9.157486462000179
...with groupby: 5.18553969300001
...with pandas: 4.704001281000046
>>> run_timer(100000)
Times with 100000 values
...with defaultdict: 59.644778522000024
...with namedtuple: 89.26688319799996
...with groupby: 93.3517027989999
...with pandas: 14.495209061999958
Take aways:
working with pandas data frames pays off big time for large datasets
NOTE: I do not include conversion between list of dicts and data frame, which is definitely significant
otherwise the accepted solution (by wwii) wins for small to medium datasets, but for very large ones it may be the slowest
changing the sizes of the groups (e.g., by decreasing the number of zones) has a huge effect which is not examined here
Here is the script I used to generate the above.
import random
import pandas
from timeit import timeit
from functools import partial
from itertools import groupby
from operator import itemgetter
from collections import namedtuple, defaultdict
def with_pandas(df):
    return df.groupby(['zone', 'weekday', 'hour']).agg(sum).reset_index()
def with_groupby(data):
    keys = itemgetter('zone', 'weekday', 'hour')
    # data is your list of dicts
    data.sort(key=keys)
    grouped = groupby(data, keys)
    new_data = []
    for (zone, weekday, hour), g in grouped:
        c1, c2 = 0, 0
        for d in g:
            c1 += d['C1']
            c2 += d['C2']
        new_data.append({'zone': zone, 'weekday': weekday,
                         'hour': hour, 'C1': c1, 'C2': c2})
    return new_data
def with_namedtuple(zones):
    ZoneTime = namedtuple("ZoneTime", ["zone", "weekday", "hour"])
    results = dict()
    for zone in zones:
        zone_time = ZoneTime(zone['zone'], zone['weekday'], zone['hour'])
        if zone_time in results:
            results[zone_time]['C1'] += zone['C1']
            results[zone_time]['C2'] += zone['C2']
        else:
            results[zone_time] = {'C1': zone['C1'], 'C2': zone['C2']}
    return [
        {
            'zone': key[0],
            'weekday': key[1],
            'hour': key[2],
            **val
        }
        for key, val in results.items()
    ]
def with_defaultdict(dictionaries):
    res = defaultdict(lambda: defaultdict(int))
    for d in dictionaries:
        res[(d['zone'], d['weekday'], d['hour'])]['C1'] += d['C1']
        res[(d['zone'], d['weekday'], d['hour'])]['C2'] += d['C2']
    return [
        {
            'zone': key[0],
            'weekday': key[1],
            'hour': key[2],
            **val
        }
        for key, val in res.items()
    ]
def gen_random_vals(num):
    return [
        {
            'zone': random.choice('ABCDEFGHIJKLMNOPQRSTUVWXYZ'),
            'weekday': random.randint(1, 7),
            'hour': random.randint(0, 23),
            'C1': random.randint(1, 50),
            'C2': random.randint(1, 50),
        }
        for idx in range(num)
    ]
def run_timer(num_vals=1000, timeit_num=1000):
    vals = gen_random_vals(num_vals)
    df = pandas.DataFrame(vals)
    p_fmt = "\t...with %s: %s"
    times = {
        'defaultdict': timeit(stmt=partial(with_defaultdict, vals), number=timeit_num),
        'namedtuple': timeit(stmt=partial(with_namedtuple, vals), number=timeit_num),
        'groupby': timeit(stmt=partial(with_groupby, vals), number=timeit_num),
        'pandas': timeit(stmt=partial(with_pandas, df), number=timeit_num),
    }
    print("Times with %d values" % num_vals)
    for key, val in times.items():
        print(p_fmt % (key, val))
where
with_groupby uses the solution by wwii
with_namedtuple uses the solution by Jose Salvatierra
with_defaultdict uses the solution by abc
with_pandas uses the solution proposed by Alexander Cécile in comments
assumes data is already in a DataFrame and produces a DataFrame as result
Original answer:
Just for fun, here's a completely different approach using groupby. Granted, it's not the prettiest, but it should be fairly quick.
from itertools import groupby
from operator import itemgetter
from pprint import pprint
vals = [
    {'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 3, 'C2': 15},
    {'zone': 'B', 'weekday': 2, 'hour': 6, 'C1': 5, 'C2': 27},
    {'zone': 'A', 'weekday': 1, 'hour': 12, 'C1': 7, 'C2': 12},
    {'zone': 'C', 'weekday': 5, 'hour': 8, 'C1': 2, 'C2': 13}
]
ordered = sorted(
    [
        (
            (row['zone'], row['weekday'], row['hour']),
            row['C1'], row['C2']
        )
        for row in vals
    ]
)
def invert_columns(grp):
    return zip(*[g_row[1:] for g_row in grp])
merged = [
    {
        'zone': key[0],
        'weekday': key[1],
        'hour': key[2],
        **dict(
            zip(["C1", "C2"], [sum(col) for col in invert_columns(grp)])
        )
    }
    for key, grp in groupby(ordered, itemgetter(0))
]
pprint(merged)
which yields
[{'C1': 10, 'C2': 27, 'hour': 12, 'weekday': 1, 'zone': 'A'},
{'C1': 5, 'C2': 27, 'hour': 6, 'weekday': 2, 'zone': 'B'},
{'C1': 2, 'C2': 13, 'hour': 8, 'weekday': 5, 'zone': 'C'}]

List of dictionaries - stack one value of dictionary

I'm having trouble summing one value across dictionaries when certain conditions are met. For example, I have this list of dictionaries:
[{'plu': 1, 'price': 150, 'quantity': 2, 'stock': 5},
{'plu': 2, 'price': 150, 'quantity': 7, 'stock': 10},
{'plu': 1, 'price': 150, 'quantity': 6, 'stock': 5},
{'plu': 1, 'price': 200, 'quantity': 4, 'stock': 5},
{'plu': 2, 'price': 150, 'quantity': 3, 'stock': 10}
]
Then output should look like this:
[{'plu': 1, 'price': 150, 'quantity': 8, 'stock': 5},
{'plu': 1, 'price': 200, 'quantity': 4, 'stock': 5},
{'plu': 2, 'price': 150, 'quantity': 10, 'stock': 10}
]
Quantity should be added only if plu and price are the same, it should ignore key:values other than that (ex. stock). What is the most efficient way to do that?
#edit
I tried:
import itertools as it
keyfunc = lambda x: x['plu']
groups = it.groupby(sorted(new_data, key=keyfunc), keyfunc)
x = [{'plu': k, 'quantity': sum(x['quantity'] for x in g)} for k, g in groups]
But it groups only on plu, and then I get only the quantity value when building the HTML table in Django; the other fields are empty.
You need to sort/groupby the combined key, not just one key. Easiest/most efficient way to do this is with operator.itemgetter. To preserve an arbitrary stock value, you'll need to use the group twice, so you'll need to convert it to a sequence:
import itertools as it
from operator import itemgetter
keyfunc = itemgetter('plu', 'price')
# Unpack key and listify g so it can be reused
groups = ((plu, price, list(g))
          for (plu, price), g in it.groupby(sorted(new_data, key=keyfunc), keyfunc))
x = [{'plu': plu, 'price': price, 'stock': g[0]['stock'],
      'quantity': sum(x['quantity'] for x in g)}
     for plu, price, g in groups]
Alternatively, if stock is guaranteed to be the same for each unique plu/price pair, you can include it in the key to simplify matters, so you don't need to listify the groups:
keyfunc = itemgetter('plu', 'price', 'stock')
groups = it.groupby(sorted(new_data, key=keyfunc), keyfunc)
x = [{'plu': plu, 'price': price, 'stock': stock,
      'quantity': sum(x['quantity'] for x in g)}
     for (plu, price, stock), g in groups]
Optionally, you could create getquantity = itemgetter('quantity') at top level (like the keyfunc) and change sum(x['quantity'] for x in g) to sum(map(getquantity, g)) which pushes work to the C layer in CPython, and can be faster if your groups are large.
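Put together, the itemgetter variant described above might look like the following sketch. It uses the sample data from the question and assumes, as stated, that stock is the same for every row of a given plu/price pair.

```python
import itertools as it
from operator import itemgetter

new_data = [{'plu': 1, 'price': 150, 'quantity': 2, 'stock': 5},
            {'plu': 2, 'price': 150, 'quantity': 7, 'stock': 10},
            {'plu': 1, 'price': 150, 'quantity': 6, 'stock': 5},
            {'plu': 1, 'price': 200, 'quantity': 4, 'stock': 5},
            {'plu': 2, 'price': 150, 'quantity': 3, 'stock': 10}]

keyfunc = itemgetter('plu', 'price', 'stock')  # grouping key
getquantity = itemgetter('quantity')           # value to be summed

# groupby only merges adjacent rows, so the data is sorted by the same key first.
groups = it.groupby(sorted(new_data, key=keyfunc), keyfunc)
x = [{'plu': plu, 'price': price, 'stock': stock,
      'quantity': sum(map(getquantity, g))}
     for (plu, price, stock), g in groups]
print(x)
```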
The other approach is to avoid sorting entirely using collections.Counter (or collections.defaultdict(int), though Counter makes the intent more clear here):
from collections import Counter
grouped = Counter()
for plu, price, stock, quantity in map(itemgetter('plu', 'price', 'stock', 'quantity'), new_data):
    grouped[plu, price, stock] += quantity
then convert back to your preferred form with:
x = [{'plu': plu, 'price': price, 'stock': stock, 'quantity': quantity}
for (plu, price, stock), quantity in grouped.items()]
This should be faster for large inputs, since it replaces the O(n log n) sort with n dict operations, each of which costs roughly O(1), for O(n) overall.
Using pandas will make this a trivial problem:
import pandas as pd
data = [{'plu': 1, 'price': 150, 'quantity': 2, 'stock': 5},
{'plu': 2, 'price': 150, 'quantity': 7, 'stock': 10},
{'plu': 1, 'price': 150, 'quantity': 6, 'stock': 5},
{'plu': 1, 'price': 200, 'quantity': 4, 'stock': 5},
{'plu': 2, 'price': 150, 'quantity': 3, 'stock': 10}]
df = pd.DataFrame.from_records(data)
# df
#
# plu price quantity stock
# 0 1 150 2 5
# 1 2 150 7 10
# 2 1 150 6 5
# 3 1 200 4 5
# 4 2 150 3 10
new_df = df.groupby(['plu','price','stock'], as_index=False).sum()
new_df = new_df[['plu','price','quantity','stock']] # Optional: reorder the columns
# new_df
#
# plu price quantity stock
# 0 1 150 8 5
# 1 1 200 4 5
# 2 2 150 10 10
And finally, if you want to, port it back to dict (though I would argue pandas give you a lot more functionality to handle the data elements):
new_data = new_df.to_dict(orient='records')
# new_data
#
# [{'plu': 1, 'price': 150, 'quantity': 8, 'stock': 5},
# {'plu': 1, 'price': 200, 'quantity': 4, 'stock': 5},
# {'plu': 2, 'price': 150, 'quantity': 10, 'stock': 10}]

Ordering a Django queryset based on other list with ids and scores

I'm a bit mentally stuck at something, that seems really simple at first glance.
I'm grabbing a list of ids to be selected and scores to sort them based on.
My current solution is the following:
ids = [1, 2, 3, 4, 5]
items = Item.objects.filter(pk__in=ids)
Now I need to add a score based ordering somehow so I'll build the following list:
scores = [
{'id': 1, 'score': 15},
{'id': 2, 'score': 7},
{'id': 3, 'score': 17},
{'id': 4, 'score': 11},
{'id': 5, 'score': 9},
]
ids = [score['id'] for score in scores]
items = Item.objects.filter(pk__in=ids)
So far so good - but how do I actually add the scores as some sort of aggregate and sort the queryset based on them?
Sort the scores list, and fetch the queryset using in_bulk().
scores = [
{'id': 1, 'score': 15},
{'id': 2, 'score': 7},
{'id': 3, 'score': 17},
{'id': 4, 'score': 11},
{'id': 5, 'score': 9},
]
sorted_scores = sorted(scores, key=lambda s: s['score']) # use reverse=True for descending order
ids = [score['id'] for score in sorted_scores]
items = Item.objects.in_bulk(ids)
Then generate a list of the items in the order you want:
items_in_order = [items[x] for x in ids]
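The ordering step can be tried without a database. In this sketch a plain dict of placeholder strings stands in for the result of Item.objects.in_bulk(ids), which returns a dict mapping each primary key to its model instance; the placeholder values and the pk in items guard are my own additions for illustration.

```python
scores = [
    {'id': 1, 'score': 15},
    {'id': 2, 'score': 7},
    {'id': 3, 'score': 17},
    {'id': 4, 'score': 11},
    {'id': 5, 'score': 9},
]

# Highest score first; drop reverse=True for ascending order.
sorted_scores = sorted(scores, key=lambda s: s['score'], reverse=True)
ids = [s['id'] for s in sorted_scores]

# Stand-in for Item.objects.in_bulk(ids): a dict of pk -> object.
items = {pk: "Item #%d" % pk for pk in (1, 2, 3, 4, 5)}

# in_bulk leaves out ids that do not exist, so skip those instead of raising KeyError.
items_in_order = [items[pk] for pk in ids if pk in items]
print(items_in_order)
```

The result is a list rather than a queryset, so any further filtering has to happen before this step.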

Python dictionary: Add to the key sum of values fulfilling given condition

I have the following nested dictionary, where the first number is the resource ID (there are more than 100,000 IDs in total):
dict = {1: {'age':1,'cost':14,'score':0.3},
2: {'age':1,'cost':9,'score':0.5},
...}
I want to add to each resource the sum of the costs of all resources with a lower score than that resource. I can add a 'sum_cost' key equal to 0 with the following code:
for id in dict:
    dict[id]['sum_cost'] = 0
This gives me the following:
dict = {1: {'age':1,'cost':14,'score':0.3, 'sum_cost':0},
2: {'age':1,'cost':9,'score':0.5,'sum_cost':0},
...}
Now I would like to use, ideally, a for loop (to keep the code easily readable) to assign to each sum_cost the sum of the costs of the IDs with a lower score than the given ID.
The ideal output is a dictionary where the 'sum_cost' of each ID equals the total cost of the IDs with a lower score than that ID:
dict = {1: {'age':1,'cost':14,'score':0.3, 'sum_cost':0},
2: {'age':1,'cost':9,'score':0.5,'sum_cost':21},
3: {'age':13,'cost':7,'score':0.4,'sum_cost':14}}
Is there any way to do this?
Notes:
Use sorted to order the dictionary items by the score key,
the dictionary get method to read dictionary values,
and a temporary variable for the cumulative addition of sum_cost.
Code:
dicts = {1: {'age': 1, 'cost': 14, 'score': 0.3, 'sum_cost': 0},
         2: {'age': 1, 'cost': 9, 'score': 0.5, 'sum_cost': 0},
         3: {'age': 13, 'cost': 7, 'score': 0.4, 'sum_cost': 0}}
sum_addition = 0
for key, values in sorted(dicts.items(), key=lambda x: x[1].get('score', None)):
    if dicts[key].get('score') is not None:  # get gives None when the key is not available
        dicts[key]['sum_cost'] = sum_addition
        sum_addition += dicts[key]['cost']
    print(key, dicts[key])
An even more simplified method, following the advice of @BernarditoLuis and @Kevin Guan:
Code2:
dicts = {1: {'age': 1, 'cost': 14, 'score': 0.3, 'sum_cost': 0},
         2: {'age': 1, 'cost': 9, 'score': 0.5, 'sum_cost': 0},
         3: {'age': 13, 'cost': 7, 'score': 0.4, 'sum_cost': 0}}
sum_addition = 0
for key, values in sorted(dicts.items(), key=lambda x: x[1].get('score', None)):
    if dicts[key].get('score'):  # None when the key is missing; note a score of 0 is skipped too
        dicts[key]['sum_cost'] = sum_addition
        sum_addition += dicts[key]['cost']
    print(key, dicts[key])
Output:
1 {'age': 1, 'cost': 14, 'score': 0.3, 'sum_cost': 0}
3 {'age': 13, 'cost': 7, 'score': 0.4, 'sum_cost': 14}
2 {'age': 1, 'cost': 9, 'score': 0.5, 'sum_cost': 21}
What about using OrderedDict?
from collections import OrderedDict
origin_dict = {
1: {'age':1,'cost':14,'score':0.3},
2: {'age':1,'cost':9,'score':0.5},
3: {'age':1,'cost':8,'score':0.45}
}
# sort by score
sorted_dict = OrderedDict(sorted(origin_dict.items(), key=lambda x: x[1]['score']))
# now all you have to do is to count sum_cost successively starting from 0
sum_cost = 0
for key, value in sorted_dict.items():
    value['sum_cost'] = sum_cost
    sum_cost += value['cost']
print(sorted_dict)
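One case the running-total approach does not cover is two resources with the same score: the second one would receive the first one's cost, although its score is not lower. If ties can occur, a possible sketch is to process equal scores as one batch with itertools.groupby. Resource 4 below is a hypothetical entry added only to create a tie, and the names resources and score_of are my own.

```python
from itertools import groupby

resources = {
    1: {'age': 1, 'cost': 14, 'score': 0.3},
    2: {'age': 1, 'cost': 9, 'score': 0.5},
    3: {'age': 13, 'cost': 7, 'score': 0.4},
    4: {'age': 2, 'cost': 5, 'score': 0.4},  # hypothetical: ties with resource 3
}

def score_of(item):
    """Sort and group key: the score of a (resource_id, info) pair."""
    return item[1]['score']

running_total = 0
ordered = sorted(resources.items(), key=score_of)
for _score, batch in groupby(ordered, key=score_of):
    batch = list(batch)
    # Everyone in the batch sees only the cost of strictly lower scores.
    for _resource_id, info in batch:
        info['sum_cost'] = running_total
    # Only after the whole batch is assigned do its costs join the total.
    running_total += sum(info['cost'] for _resource_id, info in batch)

print(resources)
```

This is still one sort plus one linear pass, so it should scale to the 100,000 or so IDs mentioned in the question.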
