Python code works on one file but fails on another - python

Hi all, I have this code, which prints the minimum cost and the restaurant id for the requested item or items. The customer doesn't want to visit multiple restaurants. So, for example, if he asks for "A,B", the code should print a shop that offers them both, instead of scattering the order across different restaurants (even if some restaurant offers one of the items more cheaply).
Also, suppose the user asks for a burger. If restaurant 'X' offers a "burger" for $4, whereas restaurant 'Y' offers "burger+tuna+tofu" for $3, then we tell the user to go for restaurant 'Y', even though it includes extra items beyond the 'burger' the user asked for; we are happy to give them extra items as long as it is cheaper.
Everything is fine, except that the code behaves differently on two input files of the same format (it fails on input.csv but runs on input-2.csv): it gives the correct output for one and fails for the other. This is the only small error I need your help to fix. Please help me; I have hit a wall and can't think past it.
def build_shops(shop_text):
    shops = {}
    for item_info in shop_text:
        shop_id,cost,items = item_info.replace('\n', '').split(',')
        cost = float(cost)
        items = items.split('+')
        if shop_id not in shops:
            shops[shop_id] = {}
        shop_dict = shops[shop_id]
        for item in items:
            if item not in shop_dict:
                shop_dict[item] = []
            shop_dict[item].append([cost,items])
    return shops
def solve_one_shop(shop, items):
    if len(items) == 0:
        return [0.0, []]
    all_possible = []
    first_item = items[0]
    if first_item in shop:
        print "SHOP",shop.get(first_item)
    for (price,combo) in shop[first_item]:
        #print "items,combo=",items,combo
        sub_set = [x for x in items if x not in combo]
        #print "sub_set=",sub_set
        price_sub_set,solution = solve_one_shop(shop, sub_set)
        solution.append([price,combo])
        all_possible.append([price+price_sub_set, solution])
    cheapest = min(all_possible, key=(lambda x: x[0]))
    return cheapest
def solver(input_data, required_items):
    shops = build_shops(input_data)
    #print shops
    result_all_shops = []
    for shop_id,shop_info in shops.iteritems():
        (price, solution) = solve_one_shop(shop_info, required_items)
        result_all_shops.append([shop_id, price, solution])
    shop_id,total_price,solution = min(result_all_shops, key=(lambda x: x[1]))
    print('SHOP_ID=%s' % shop_id)
    sln_str = [','.join(items)+'(%0.2f)'%price for (price,items) in solution]
    sln_str = '+'.join(sln_str)
    print(sln_str + ' = %0.2f' % total_price)
shop_text = open('input-1.csv','rb')
solver(shop_text,['burger'])
=====input-1.csv=====restaurant_id, price, item
1,2.00,burger
1,1.25,tofulog
1,2.00,tofulog
1,1.00,chef_salad
1,1.00,A+B
1,1.50,A+CCC
1,2.50,A
2,3.00,A
2,1.00,B
2,1.20,CCC
2,1.25,D
=====output & error====:
{'1': {'A': [[1.0, ['A', 'B']], [1.5, ['A', 'CCC']], [2.5, ['A', 'D']]], 'B': [[1.0, ['A', 'B']]], 'D': [[2.5, ['A', 'D']]], 'chef_salad': [[1.0, ['chef_salad']]], 'burger': [[2.0, ['burger']]], 'tofulog': [[1.25, ['tofulog']], [2.0, ['tofulog']]], 'CCC': [[1.5, ['A', 'CCC']]]}, '2': {'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]], 'D': [[1.25, ['D']]], 'CCC': [[1.2, ['CCC']]]}}
SHOP [[2.0, ['burger']]]
Traceback (most recent call last):
  File "work.py", line 55, in <module>
    solver(shop_text,['burger'])
  File "work.py", line 43, in solver
    (price, solution) = solve_one_shop(shop_info, required_items)
  File "work.py", line 26, in solve_one_shop
    for (price,combo) in shop[first_item]:
KeyError: 'burger'
Whereas if I run the same code on input-2.csv and query for solver(shop_text,['A','CCC']), I get the correct result:
=====input-2.csv======
1,2.00,A
1,1.25,B
1,2.00,B
1,1.00,A
1,1.00,A+B
1,1.50,A+CCC
1,2.50,A+D
2,3.00,A
2,1.00,B
2,1.20,CCC
2,1.25,D
=========output====
{'1': {'A': [[2.0, ['A']], [1.0, ['A']], [1.0, ['A', 'B']], [1.5, ['A', 'CCC']], [2.5, ['A', 'D']]], 'B': [[1.25, ['B']], [2.0, ['B']], [1.0, ['A', 'B']]], 'D': [[2.5, ['A', 'D']]], 'CCC': [[1.5, ['A', 'CCC']]]}, '2': {'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]], 'D': [[1.25, ['D']]], 'CCC': [[1.2, ['CCC']]]}}
SHOP [[2.0, ['A']], [1.0, ['A']], [1.0, ['A', 'B']], [1.5, ['A', 'CCC']], [2.5, ['A', 'D']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[1.5, ['A', 'CCC']]]
SHOP [[3.0, ['A']]]
SHOP [[1.2, ['CCC']]]
SHOP_ID=1
A,CCC(1.50) = 1.50

You can find the error like this:
In your solve_one_shop method, print the dictionary shop after the line first_item = items[0]. Just before the failure, that prints:
{'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]], 'D': [[1.25, ['D']]], 'CCC': [[1.2, ['CCC']]]}
So burger is not one of its keys, and hence the lookup shop[first_item] throws a KeyError.
Add this line:
2,1.25,burger
to the end of your input.csv file and your code works fine.
To handle the case where an item may not be present, read the values from the shop dictionary inside a try/except block.
Note:
In your method build_shops the line:
shop_id,cost,items = item_info.replace('\n', '').split(',')
strips off the newline but does not strip off the carriage return. To fix that, do this:
shop_id,cost,items = item_info.replace('\n', '').replace('\r', '').split(',')
Hope this helps.
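The two suggestions above can be sketched as follows. This is only a minimal Python 3 illustration, not the original poster's code: parse_shop_line and price_options are hypothetical helper names, and the sketch assumes every record has exactly three comma-separated fields.

```python
def parse_shop_line(line):
    """Split one 'shop_id,cost,item+item' record into its three parts."""
    # strip() removes '\n', '\r' and surrounding spaces in a single call,
    # so there is no need to chain replace() calls.
    shop_id, cost, items = line.strip().split(',')
    return shop_id, float(cost), items.split('+')


def price_options(shop, item):
    """Return the offers a shop has for an item, or [] if it has none."""
    try:
        return shop[item]
    except KeyError:
        # The shop does not sell this item at all.
        return []


print(parse_shop_line('2,1.25,burger\r\n'))
shop_2 = {'A': [[3.0, ['A']]], 'B': [[1.0, ['B']]]}
print(price_options(shop_2, 'burger'))
```

Returning an empty list for a missing item lets the caller loop over the offers without a special case.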

I think I've fixed it...
solve_one_shop
The for loop should only run inside the if; otherwise you get a KeyError. Also, I have changed the function so that it only returns a result if all_possible contains anything (an empty list evaluates to False); otherwise it implicitly returns None.
edit To prevent a TypeError, I assign the recursive result to a temporary variable this_subset, and the rest of the loop body only runs if it is not None.
def solve_one_shop(shop, items):
    if len(items) == 0:
        return [0.0, []]
    all_possible = []
    first_item = items[0]
    if first_item in shop:
        for (price,combo) in shop[first_item]:
            sub_set = [x for x in items if x not in combo]
            this_subset = solve_one_shop(shop, sub_set)
            if this_subset is not None:
                price_sub_set,solution = this_subset
                solution.append([price,combo])
                all_possible.append([price+price_sub_set, solution])
    if all_possible:
        cheapest = min(all_possible, key=(lambda x: x[0]))
        return cheapest
solver
I have assigned the return value of solve_one_shop to an intermediate variable. If this is None, then the shop is not added to result_all_shops.
edit If result_all_shops is empty, then print a message instead of trying to find the min.
def solver(input_data, required_items):
    shops = build_shops(input_data)
    result_all_shops = []
    for shop_id,shop_info in shops.iteritems():
        this_shop = solve_one_shop(shop_info, required_items)
        if this_shop is not None:
            (price, solution) = this_shop
            result_all_shops.append([shop_id, price, solution])
    if result_all_shops:
        shop_id,total_price,solution = min(result_all_shops, key=(lambda x: x[1]))
        print('SHOP_ID=%s' % shop_id)
        sln_str = [','.join(items)+'(%0.2f)'%price for (price,items) in solution]
        sln_str = '+'.join(sln_str)
        print(sln_str + ' = %0.2f' % total_price)
    else:
        print "Item not available"
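For readers on Python 3, the whole fix can be sketched end to end as below. This is a hedged port rather than the answerer's code: cheapest_shop and the INPUT_1 list are names introduced here for illustration, the data rows are copied from input-1.csv in the question, and the functions return values instead of printing them.

```python
def build_shops(lines):
    """Map shop_id -> item -> list of (price, combo) offers."""
    shops = {}
    for line in lines:
        shop_id, cost, items = line.strip().split(',')
        combo = items.split('+')
        for item in combo:
            offers = shops.setdefault(shop_id, {}).setdefault(item, [])
            offers.append((float(cost), combo))
    return shops


def solve_one_shop(shop, items):
    """Return (total_price, offers) covering items, or None if impossible."""
    if not items:
        return 0.0, []
    candidates = []
    # .get() with a default avoids the KeyError for items the shop lacks.
    for price, combo in shop.get(items[0], []):
        remaining = [x for x in items if x not in combo]
        sub = solve_one_shop(shop, remaining)
        if sub is not None:
            sub_price, sub_offers = sub
            candidates.append((price + sub_price, sub_offers + [(price, combo)]))
    return min(candidates, key=lambda c: c[0]) if candidates else None


def cheapest_shop(lines, required_items):
    """Return (shop_id, total_price, offers), or None if no shop qualifies."""
    results = []
    for shop_id, shop in build_shops(lines).items():
        solved = solve_one_shop(shop, required_items)
        if solved is not None:
            results.append((shop_id, solved[0], solved[1]))
    return min(results, key=lambda r: r[1]) if results else None


INPUT_1 = [
    '1,2.00,burger', '1,1.25,tofulog', '1,2.00,tofulog', '1,1.00,chef_salad',
    '1,1.00,A+B', '1,1.50,A+CCC', '1,2.50,A',
    '2,3.00,A', '2,1.00,B', '2,1.20,CCC', '2,1.25,D',
]
print(cheapest_shop(INPUT_1, ['burger']))
print(cheapest_shop(INPUT_1, ['A', 'CCC']))
print(cheapest_shop(INPUT_1, ['pizza']))
```

Shop 2 does not sell a burger, so it is skipped instead of raising, and a request that no shop can satisfy yields None.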

Related

How to convert a flat list into a dictionary in python?

I have a flat list containing information of multiple variables and need to convert it into a dictionary. For example, 'a','b','c' are variable names and need to be the keys in the dictionary. The list could be split by '_' and ':'.
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
The desired output would be:
dict_x = {'a':[1,2,4],'b':[45,24,78],'c':['abc','def','xxx']}
I am not sure how to loop to get the keys for the dictionary since it is the same for all elements in the list.
lst = [y.split(":") for x in [x.split("_") for x in list_x] for y in x]
d = {x:[] for x in set([x[0] for x in lst])}
for k, v in lst:
    d[k].append(v)
# Out[40]: {'a': ['1', '2', '4'], 'c': ['abc', 'def', 'xxx'], 'b': ['45', '24', '78']}
Try this method (explanation inline as code comments) -
#Function to turn a list of tuples into a dict after converting integers and keeping string types.
def convert(tup):
    di = {}
    for a, b in tup:
        if b.isdecimal(): #convert to int if possible
            b = int(b)
        di.setdefault(a, []).append(b)
    return di
#convert the input into a list of tuples
k = [tuple(j.split(':')) for i in list_x for j in i.split('_')]
#convert list of tuples into dict
convert(k)
{'a': [1, 2, 4], 'b': [45, 24, 78], 'c': ['abc', 'def', 'xxx']}
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
result_dict = {}
for list_element in list_x:
    key_val_pair = list_element.split('_')
    for key_val in key_val_pair:
        key, val = key_val.split(':')
        if key not in result_dict:
            result_dict[key] = []
        result_dict[key].append(val)
print(result_dict)
You need to ensure that your dictionary maps each string key to a list. That is why I check whether the dictionary already contains the key: if it doesn't, I first add the key with an empty list, and in either case I then append the value.
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
print(list_x)
dic_x = dict()
for x in list_x:
    keyValueList = x.split('_')
    for keyValue in keyValueList:
        split = keyValue.split(':')
        key = split[0]
        value = split[1]
        if key in dic_x:
            dic_x[key].append(value)
        else:
            dic_x.update({key: [value]})
print(dic_x)
Assuming strings in your list_x always have the same format as: a:integer_b:integer_c:string, you can do this:
dict_x = {'a':[],'b':[],'c':[]}
for s in list_x:
    sl = s.split('_')
    dict_x['a'].append(int(sl[0][2:]))
    dict_x['b'].append(int(sl[1][2:]))
    dict_x['c'].append(sl[2][2:])
Maybe this can solve your problem in an easy way, without being too verbose or too compact. It is versatile, so you can add as many identifiers as you want, but, as you can see, they should all have the same format.
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
dict_x ={}
for val in list_x:
    elements = val.split('_')
    for el in elements:
        key, value = el.split(':')[0], el.split(':')[1]
        if dict_x.get(key) is None: #If the key is found for the first time
            dict_x[key] = [value]
        else: #If the key has already been found, the value is appended
            dict_x[key].append(value)
print(dict_x)
As you can see, the core is the if that checks whether the key has been seen before: if not, it creates a new list containing the first value found; otherwise it appends the value to the existing list.
First split each string based on _ as delimiter and then split it based on : as delimiter, and add each item to a dict
>>> list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
>>>
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for s in list_x:
...     for kv in s.split('_'):
...         k,v = kv.split(':')
...         d[k].append(int(v) if v.isdigit() else v)
...
>>> dict(d)
{'a': [1, 2, 4], 'b': [45, 24, 78], 'c': ['abc', 'def', 'xxx']}
First, let's split each string by '_' and then by ':'
list_x = ['a:1_b:45_c:abc','a:2_b:24_c:def','a:4_b:78_c:xxx']
def parse(s):
    return [t.split(":") for t in s.split("_")]
parsed_to_lists = [parse(st) for st in list_x]
We now have
[[['a', '1'], ['b', '45'], ['c', 'abc']], [['a', '2'], ['b', '24'], ['c', 'def']], [['a', '4'], ['b', '78'], ['c', 'xxx']]]
we can flatten that by
flat_list = [item for sublist in parsed_to_lists for item in sublist]
flat_list
Which returns
[['a', '1'], ['b', '45'], ['c', 'abc'], ['a', '2'], ['b', '24'], ['c', 'def'], ['a', '4'], ['b', '78'], ['c', 'xxx']]
We want the result as a dictionary of lists, so let's create an empty one
from collections import defaultdict
res = defaultdict(list)
and fill it
for k,v in flat_list:
    res[k].append(v)
res
defaultdict(<class 'list'>, {'a': ['1', '2', '4'], 'b': ['45', '24', '78'], 'c': ['abc', 'def', 'xxx']})

How to get a value in a tuple in a dictionary?

I want to access the values in a tuple within a dictionary using a lambda function.
I need to rank the subjects by average GPA, computed from the grades of the students in each class.
I have tried using a lambda, but I could not figure it out.
grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F' : 0.0}
subjects = {'math': {('Jack', 'A'),('Larry', 'C')}, 'English': {('Kevin', 'C'),('Tom','B')}}
def highestAverageOfSubjects(subjects):
    return
The output needs to be ['math','English'], since the average GPA of math (3.0) is greater than the average GPA of English (2.5).
You can easily sort everything by using sorted with a key function:
Grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F' : 0.0}
subject = {'math': {('Jack', 'A'),('Larry', 'C')}, 'English': {('Kevin', 'C'),('Tom','B')}}
result = sorted(subject, key=lambda x: sum(Grade[g] for _, g in subject[x]) / len(subject[x]), reverse=True)
print(result)
Output:
['math','English']
If, as a secondary, you want to sort by the number of students:
result = sorted(subject, key=lambda x: (sum(Grade[g] for _, g in subject[x]) / len(subject[x]), len(subject[x])), reverse=True)
print(result)
One of the issues with the way you have implemented this is that you have used a set as the values in your subject dict. This means you have to iterate over each element. But once you have an element, its grade can simply be indexed as elem[1].
For ex:
Grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F' : 0.0}
subject = {'math': {('Jack', 'A'),('Larry', 'C')}, 'English': {('Kevin', 'C'),('Tom','B')}}
for elem in subject['math']:
    print(elem[1])
Output:
C
A
If in the print above you just print(elem) then you'd see something like:
('Larry', 'C')
('Jack', 'A')
So this way you could easily extend your highAveSub implementation to get what you want.
To find the average grade of a subject:
def highAveSub(subname):
    total = 0
    for elem in subject[subname]: #Because your values are of type set, not dict.
        total = total + Grade[elem[1]] #This is how you will cross-reference the numerical value of the grade. You could also simply use enums and I'll leave that to you to find out
    avg = total / len(subject[subname])
    return avg
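Putting the pieces together, the function from the question could be filled in roughly as follows. This is a sketch in Python 3 using the question's own names grade and subjects; average_gpa is a hypothetical helper added here for readability.

```python
grade = {'A': 4.0, 'B': 3.0, 'C': 2.0, 'D': 1.0, 'F': 0.0}
subjects = {'math': {('Jack', 'A'), ('Larry', 'C')},
            'English': {('Kevin', 'C'), ('Tom', 'B')}}


def average_gpa(students):
    """Average the numeric grade over a set of (name, letter) tuples."""
    return sum(grade[letter] for _name, letter in students) / len(students)


def highestAverageOfSubjects(subjects):
    """Return the subject names ordered from highest to lowest average GPA."""
    return sorted(subjects,
                  key=lambda name: average_gpa(subjects[name]),
                  reverse=True)


print(highestAverageOfSubjects(subjects))  # ['math', 'English']
```

Unpacking each tuple as `_name, letter` avoids indexing with elem[1] and makes it explicit that only the letter grade is used.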

How to change the items in a list of sublists based on certain rules and conditions of those sublists?

I have a list of sublists that are made up of three items. Only the first and last item matter in the sublists, because I want to change the last item across all sublists based on the frequency of the last item across the list.
This is the list I have:
lst = [['A','abc','id1'],['A','def','id2'],['A','ghi','id1'],['A','ijk','id1'],['A','lmn','id2'],['B','abc','id3'],['B','def','id3'],['B','ghi','id3'],['B','ijk','id3'],['B','lmn','id'],['C','xyz','id6'],['C','lmn','id6'],['C','aaa','id5']]
For example, A appears the most with id1 instead of id2, so I'd like to replace all id2 that appear with A with id1. For B, id3 is the most common, so I'd like to replace any instance of anything else with id3, which means I'd want to replace 'id' with 'id3' only for B. For C, I'd like to replace the instance of 'id5' with 'id6,' because 'id6' appears the most with the list.
Desired_List = lst = [['A','abc','id1'],['A','def','id1'],['A','ghi','id1'],['A','ijk','id1'],['A','lmn','id1'],['B','abc','id3'],['B','def','id3'],['B','ghi','id3'],['B','ijk','id3'],['B','lmn','id3'],['C','xyz','id6'],['C','lmn','id6'],['C','aaa','id6']]
I should also mention that this is going to be done on a very large list, so speed and efficiency is needed.
Straight-up data processing using your ad-hoc requirement above, I can come up with the following algorithm.
First sweep: collect frequency information for every key (i.e. 'A', 'B', 'C'):
def generate_frequency_table(lst):
    assoc = {} # e.g. 'A': {'id1': 3, 'id2': 2}
    for key, unused, val in lst:
        freqs = assoc.get(key, None)
        if freqs is None:
            freqs = {}
            assoc[key] = freqs
        valfreq = freqs.get(val, None)
        if valfreq is None:
            freqs[val] = 1
        else:
            freqs[val] = valfreq + 1
    return assoc
>>> generate_frequency_table(lst)
{'A': {'id2': 2, 'id1': 3}, 'C': {'id6': 2, 'id5': 1}, 'B': {'id3': 4, 'id': 1}}
Then, find which 'value' is most frequently associated with each key (i.e. {'A': 'id1'}):
def generate_max_assoc(assoc):
    max = {} # e.g. {'A': 'id1'}
    for key, freqs in assoc.iteritems():
        curmax = ('', 0)
        for val, freq in freqs.iteritems():
            if freq > curmax[1]:
                curmax = (val, freq)
        max[key] = curmax[0]
    return max
>>> maxtable = generate_max_assoc(generate_frequency_table(lst))
>>> print maxtable
{'A': 'id1', 'C': 'id6', 'B': 'id3'}
Finally, iterate through the original list and replace values using the table above:
>>> newlst = [[key, unused, maxtable[key]] for key, unused, val in lst]
>>> print newlst
[['A', 'abc', 'id1'], ['A', 'def', 'id1'], ['A', 'ghi', 'id1'], ['A', 'ijk', 'id1'], ['A', 'lmn', 'id1'], ['B', 'abc', 'id3'], ['B', 'def', 'id3'], ['B', 'ghi', 'id3'], ['B', 'ijk', 'id3'], ['B', 'lmn', 'id3'], ['C', 'xyz', 'id6'], ['C', 'lmn', 'id6'], ['C', 'aaa', 'id6']]
This is pretty much the same solution as supplied by Santa, but I've combined a few steps into one, as we can scan for the maximum value while we are collecting the frequencies:
def fix_by_frequency(triple_list):
    freq = {}
    for key, _, value in triple_list:
        # Get existing data
        data = freq[key] = \
            freq.get(key, {'max_value': value, 'max_count': 1, 'counts': {}})
        # Increment the count
        count = data['counts'][value] = data['counts'].get(value, 0) + 1
        # Update the most frequently seen
        if count > data['max_count']:
            data['max_value'], data['max_count'] = value, count
    # Use the maximums to map the list
    return [[key, mid, freq[key]['max_value']] for key, mid, _ in triple_list]
This has been optimised a bit for readability (I think, be nice!) rather than raw speed. For example you might not want to write back to the dict when you don't need to, or maintain a separate max dict to prevent two key lookups in the list comprehension at the end.
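The same two-pass idea can also be written with collections.Counter, which does the frequency bookkeeping for you. This is a hedged Python 3 sketch, not either answerer's code; it reuses the name fix_by_frequency for illustration, and the sample list is a shortened version of the one in the question.

```python
from collections import Counter, defaultdict


def fix_by_frequency(triples):
    """Replace each triple's last item with the most common one for its key."""
    # First pass: one Counter of last-item frequencies per key.
    counts = defaultdict(Counter)
    for key, _mid, value in triples:
        counts[key][value] += 1
    # most_common(1) returns a one-element list of (value, count) pairs.
    winner = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    # Second pass: rewrite the last item using the per-key winner.
    return [[key, mid, winner[key]] for key, mid, _value in triples]


lst = [['A', 'abc', 'id1'], ['A', 'def', 'id2'], ['A', 'ghi', 'id1'],
       ['B', 'abc', 'id3'], ['B', 'lmn', 'id'], ['B', 'def', 'id3'],
       ['C', 'xyz', 'id6'], ['C', 'lmn', 'id6'], ['C', 'aaa', 'id5']]
print(fix_by_frequency(lst))
```

Both passes are linear in the length of the list, which matters for the very large input mentioned in the question. How ties between equally frequent values are resolved is left to Counter here.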

Python: merging tally data

Okay - I'm sure this has been answered here before but I can't find it....
My problem: I have a list of lists with this composition
0.2 A
0.1 A
0.3 A
0.3 B
0.2 C
0.5 C
My goal is to output the following:
0.6 A
0.3 B
0.7 C
In other words, I need to merge the data from multiple lines together.
Here's the code I'm using:
unique_percents = []
for line in percents:
    new_percent = float(line[0])
    for inner_line in percents:
        if line[1] == inner_line[1]:
            new_percent += float(inner_line[0])
        else:
            temp = []
            temp.append(new_percent)
            temp.append(line[1])
            unique_percents.append(temp)
            break
I think it should work, but it's not adding the percents up and still has the duplicates. Perhaps I'm not understanding how "break" works?
I'll also take suggestions of a better loop structure or algorithm to use. Thanks, David.
You want to use a dict, but collections.defaultdict can come in really handy here so that you don't have to worry about whether the key exists in the dict or not -- it just defaults to 0.0:
import collections
lines = [[0.2, 'A'], [0.1, 'A'], [0.3, 'A'], [0.3, 'B'], [0.2, 'C'], [0.5, 'C']]
amounts = collections.defaultdict(float)
for amount, letter in lines:
    amounts[letter] += amount
for letter, amount in sorted(amounts.iteritems()):
    print amount, letter
Try this out:
result = {}
for line in percents:
    value, key = line
    result[key] = result.get(key, 0) + float(value)
total = {}
data = [('0.1', 'A'), ('0.2', 'A'), ('.3', 'B'), ('.4', 'B'), ('-10', 'C')]
for amount, key in data:
    total[key] = total.get(key, 0.0) + float(amount)
for key, amount in total.items():
    print key, amount
Since all of the rows with the same letter are grouped together, you can use itertools.groupby (and if they are not, just sort the list ahead of time to make them so):
data = [
    [0.2, 'A'],
    [0.1, 'A'],
    [0.3, 'A'],
    [0.3, 'B'],
    [0.2, 'C'],
    [0.5, 'C'],
]
from itertools import groupby
summary = dict((k, sum(i[0] for i in items))
               for k,items in groupby(data, key=lambda x:x[1]))
print summary
Gives:
{'A': 0.60000000000000009, 'C': 0.69999999999999996, 'B': 0.29999999999999999}
If you have a list of lists like this:
[ [0.2, 'A'], [0.1, 'A'], ...] (in fact it looks like a list of tuples :)
res_dict = {}
for pair in lst:
    letter = pair[1]
    val = pair[0]
    try:
        res_dict[letter] += val
    except KeyError:
        res_dict[letter] = val
res_lst = [(val, letter) for letter, val in res_dict.items()] # note, a list of tuples!
Using collections.defaultdict to tally values
(assuming text data in d):
>>> s=collections.defaultdict(float)
>>> for ln in d:
...     v,k=ln.split()
...     s[k] += float(v)
>>> s
defaultdict(<type 'float'>, {'A': 0.60000000000000009, 'C': 0.69999999999999996, 'B': 0.29999999999999999})
>>> ["%s %s" % (v,k) for k,v in s.iteritems()]
['0.6 A', '0.7 C', '0.3 B']
>>>
If you are using Python 2.7, or Python 3.1 or newer, you can use collections.Counter. Also, I suggest using decimal.Decimal instead of floats:
# Counter requires Python 2.7, or Python 3.1 and newer
from collections import Counter
from decimal import Decimal
lines = ["0.2 A", "0.1 A", "0.3 A", "0.3 B", "0.2 C", "0.5 C"]
results = Counter()
for line in lines:
    percent, label = line.split()
    results[label] += Decimal(percent)
print(results)
The result is:
Counter({'C': Decimal('0.7'), 'A': Decimal('0.6'), 'B': Decimal('0.3')})
This is verbose, but works:
# Python 2.7
lines = """0.2 A
0.1 A
0.3 A
0.3 B
0.2 C
0.5 C"""
lines = lines.split('\n')
#print(lines)
pctg2total = {}
thing2index = {}
index = 0
for line in lines:
    pctg, thing = line.split()
    pctg = float(pctg)
    if thing not in thing2index:
        thing2index[thing] = index
        index = index + 1
        pctg2total[thing] = pctg
    else:
        pctg2total[thing] = pctg2total[thing] + pctg
output = ((pctg2total[thing], thing) for thing in pctg2total)
# Let's sort by the first occurrence.
output = list(sorted(output, key = lambda thing: thing2index[thing[1]]))
print(output)
>>>
[(0.60000000000000009, 'A'), (0.29999999999999999, 'B'), (0.69999999999999996, 'C')]
letters = {}
for line in open("data", "r"):
    lineStrip = line.strip().split()
    percent = float(lineStrip[0])
    letter = lineStrip[1]
    if letter in letters:
        letters[letter] = percent + letters[letter]
    else:
        letters[letter] = percent
for letter, percent in letters.items():
    print letter, percent
A 0.6
C 0.7
B 0.3
Let's say we have this:
data =[(b, float(a)) for a,b in
(line.split() for line in
"""
0.2 A
0.1 A
0.3 A
0.3 B
0.2 C
0.5 C""".splitlines()
if line)]
print data
# [('A', 0.2), ('A', 0.1), ('A', 0.3), ('B', 0.3), ('C', 0.2), ('C', 0.5)]
You can now just go through this and sum:
counter = {}
for letter, val in data:
    if letter in counter:
        counter[letter]+=val
    else:
        counter[letter]=val
print counter.items()
Or group values together and use sum:
from itertools import groupby
# you want the name and the sum of the values
print [(name, sum(value for k,value in grp))
# from each group
for name, grp in
# where the group name of a item `p` is given by `p[0]`
groupby(sorted(data), key=lambda p:p[0])]
>>> from itertools import groupby, imap
>>> from operator import itemgetter
>>> data = [['0.2', 'A'], ['0.1', 'A'], ['0.3', 'A'], ['0.3', 'B'], ['0.2', 'C'], ['0.5', 'C']]
>>> # data = sorted(data, key=itemgetter(1))
...
>>> for k, g in groupby(data, key=itemgetter(1)):
...     print sum(imap(float, imap(itemgetter(0), g))), k
...
0.6 A
0.3 B
0.7 C
>>>
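Most of the answers above are written for Python 2 (print statements, iteritems, imap). A rough Python 3 equivalent of the dictionary-based approach might look like this. merge_tally is a hypothetical name, the rounding to 10 decimal places is an assumption made here only to hide floating-point noise such as 0.60000000000000009, and the sketch relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import defaultdict


def merge_tally(rows):
    """Sum the amounts per label, keeping labels in first-seen order."""
    totals = defaultdict(float)
    for amount, label in rows:
        totals[label] += float(amount)
    # Round only for presentation; use decimal.Decimal if exactness matters.
    return [(round(total, 10), label) for label, total in totals.items()]


percents = [['0.2', 'A'], ['0.1', 'A'], ['0.3', 'A'],
            ['0.3', 'B'], ['0.2', 'C'], ['0.5', 'C']]
for total, label in merge_tally(percents):
    print(total, label)
```

Unlike the nested loop in the question, this visits each row exactly once, so there is no need for break at all.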

Make a python nested list for use in Django's unordered_list

I've got a Django model with a self-referencing foreign key, so my model (as a class in its most basic form) looks like:
class MyObj(object):
    def __init__(self, id, ttl, pid):
        self.id = id
        self.name = ttl
        self.parentid = pid
So a sample of my data might look like:
nodes = []
nodes.append(MyObj(1,'a',0))
nodes.append(MyObj(2,'b',0))
nodes.append(MyObj(3,'c',1))
nodes.append(MyObj(4,'d',1))
nodes.append(MyObj(5,'e',3))
nodes.append(MyObj(6,'f',2))
I've got to a point where I can convert this into a nested dictionary:
{'a': {'c': {'e': {}}, 'd': {}}, 'b': {'f': {}}}
using Converting tree list to hierarchy dict as a guide, but I need it in a form that I can use for Django's unordered_list filter.
So my question is, how can I get from (either) a nested dictionary to a nested list/tuple or straight from the source data to a nested list? I can't seem to get a recursive function to nest the lists correctly (as in a list I can't reference "sub trees" by name)
eval(string_rep_of_dictionary.replace(':',',').replace('{','[').replace('}',']')) seems to just about get me there, but that seems like a horrible solution.
Try
lists = {}
for n in nodes:
    b = lists.setdefault(n.id, [])
    lists.setdefault(n.parentid, []).extend([n.name, b])
print lists[0]
or, using collections.defaultdict
lists = collections.defaultdict(list)
for n in nodes:
    lists[n.parentid] += [n.name, lists[n.id]]
print lists[0]
both of which will print
['a', ['c', ['e', []], 'd', []], 'b', ['f', []]]
Edit: To get rid of the empty lists, iterate through the nodes a second time:
for n in nodes:
    if not lists[n.id]:
        lists[n.parentid].remove(lists[n.id])
def nested_dict_to_list(d):
    result = []
    for key, value in d.iteritems():
        try:
            value = nested_dict_to_list(value)
        except AttributeError:
            pass
        result += [key, value]
    return result
test = {'a': {'c': {'e': {}}, 'd': {}}, 'b': {'f': {}}}
desired_result = ['a', ['c', ['e', []], 'd', []], 'b', ['f', []]]
nested_dict_to_list(test) == desired_result
# True
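Going straight from the source data to a nested list without empty sublists can also be done recursively. The following is a Python 3 sketch under stated assumptions: Node is a hypothetical namedtuple standing in for MyObj, build_nested_list is a name introduced here, and root nodes are assumed to have parentid 0 as in the sample data.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the MyObj class in the question.
Node = namedtuple('Node', 'id name parentid')


def build_nested_list(nodes, root_id=0):
    """Build ['a', ['c', ['e'], 'd'], ...] from flat parent-linked nodes."""
    # Group the nodes by parent so each subtree can be looked up directly.
    children = defaultdict(list)
    for n in nodes:
        children[n.parentid].append(n)

    def subtree(parent_id):
        result = []
        for child in children.get(parent_id, []):
            result.append(child.name)
            nested = subtree(child.id)
            if nested:  # leaves get no empty sublist
                result.append(nested)
        return result

    return subtree(root_id)


nodes = [Node(1, 'a', 0), Node(2, 'b', 0), Node(3, 'c', 1),
         Node(4, 'd', 1), Node(5, 'e', 3), Node(6, 'f', 2)]
print(build_nested_list(nodes))  # ['a', ['c', ['e'], 'd'], 'b', ['f']]
```

Because the children are grouped by parent id first, the recursion never needs to reference a subtree by name, which was the sticking point in the question.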
