I have a list of lists containing dictionaries:
[[{'event_id': 1, 'order_id': 1, 'item_id': 1, 'count': 1, 'return_count': 0, 'status': 'OK'},
{'event_id': 2, 'order_id': 1, 'item_id': 1, 'count': 1, 'return_count': 0, 'status': 'OK'}],
[{'order_id': 2, 'item_id': 1, 'event_id': 1, 'count': 3, 'return_count': 1, 'status': 'OK'},
{'order_id': 2, 'event_id': 2, 'item_id': 1, 'count': 3, 'return_count': 1, 'status': 'OK'},
{'order_id': 2, 'event_id': 1, 'item_id': 2, 'count': 4, 'return_count': 2, 'status': 'OK'}]]
For each item in the given order I only need those dictionaries whose event_id is max. So I wrote the following code:
for el in lst:
    for element in el:
        if element['event_id'] != max(x['event_id'] for x in el if element['item_id'] == x['item_id']):
            el.remove(element)
lst is the initial list.
For some reason, after running the code lst remains unchanged.
This isn't a one-liner, but it returns the dictionaries with the maximum event_id across all orders:
dictlist = [
[{'event_id': 1, 'order_id': 1, 'item_id': 1, 'count': 1, 'return_count': 0, 'status': 'OK'},
{'event_id': 2, 'order_id': 1, 'item_id': 1, 'count': 1, 'return_count': 0, 'status': 'OK'}],
[{'order_id': 2, 'item_id': 1, 'event_id': 1, 'count': 3, 'return_count': 1, 'status': 'OK'},
{'order_id': 2, 'event_id': 2, 'item_id': 1, 'count': 3, 'return_count': 1, 'status': 'OK'},
{'order_id': 2, 'event_id': 1, 'item_id': 2, 'count': 4, 'return_count': 2, 'status': 'OK'}]]
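max_event_id = 0  # avoid shadowing the built-in max()
parsed = []
# First pass: find the largest event_id across all orders
for item in dictlist:
    for i in item:
        if i['event_id'] > max_event_id:
            max_event_id = i['event_id']
# Second pass: collect every dictionary that has that event_id
for item in dictlist:
    for dic in item:
        if dic['event_id'] == max_event_id:
            parsed.append(dic)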
You're removing elements from a list while iterating over it, which makes the loop skip elements, so it won't work reliably. It's also hard to tell exactly what you're trying to do; my suggestion would be something like this:
newlst = []
for el in lst:
    max_event_id = max(element['event_id'] for element in el)
    max_event_element = next(element for element in el if element['event_id'] == max_event_id)
    newlst.append(max_event_element)
The expected result ends up in the newlst variable.
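To see why removing during iteration misbehaves, here is a minimal sketch. The list nums and its values are made up for illustration and are not taken from the question. The for loop advances by position, so after a removal the next element slides into the slot that was just visited and is never examined. Building the filtered list first and then assigning it back avoids that.

```python
# Illustrative data only; `nums` is not part of the original question.
nums = [1, 2, 3, 4]
for n in nums:
    if n < 3:
        nums.remove(n)  # shifts later elements left while the loop moves right
skipped = list(nums)    # 2 was never examined, so it survives the loop

# Safer: build the filtered list first, then replace the contents in place.
nums = [1, 2, 3, 4]
nums[:] = [n for n in nums if n >= 3]
```

The slice assignment nums[:] = ... keeps the same list object, which matters if other code holds a reference to it.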
Sort on "event_id" and keep only the max (last element):
result = [sorted(l, key=lambda x: x["event_id"])[-1] for l in lst]
If you want to keep all dictionaries with the max "event_id":
lsts = [[x for x in l if x["event_id"] == max(l, key=lambda x: x["event_id"])["event_id"]] for l in lst]
result = [item for sublist in lsts for item in sublist]
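The question asks for the maximum event_id per item within each order, which the snippets above do not track separately. One possible sketch, assuming that reading of the question is right: keep a small dictionary keyed by item_id for every order and overwrite an entry whenever a higher event_id shows up. The function name keep_latest_events is hypothetical, and ties keep the first dictionary seen.

```python
def keep_latest_events(orders):
    """For each order (inner list), keep one dict per item_id:
    the one with the highest event_id."""
    result = []
    for order in orders:
        latest = {}  # item_id -> dict with the largest event_id seen so far
        for event in order:
            current = latest.get(event['item_id'])
            if current is None or event['event_id'] > current['event_id']:
                latest[event['item_id']] = event
        result.append(list(latest.values()))
    return result


lst = [
    [{'event_id': 1, 'order_id': 1, 'item_id': 1, 'count': 1, 'return_count': 0, 'status': 'OK'},
     {'event_id': 2, 'order_id': 1, 'item_id': 1, 'count': 1, 'return_count': 0, 'status': 'OK'}],
    [{'order_id': 2, 'item_id': 1, 'event_id': 1, 'count': 3, 'return_count': 1, 'status': 'OK'},
     {'order_id': 2, 'event_id': 2, 'item_id': 1, 'count': 3, 'return_count': 1, 'status': 'OK'},
     {'order_id': 2, 'event_id': 1, 'item_id': 2, 'count': 4, 'return_count': 2, 'status': 'OK'}],
]
filtered = keep_latest_events(lst)
```

This builds new inner lists instead of mutating lst, so the remove-while-iterating problem does not arise.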
Related
I want to consolidate a list of lists (of dicts), but I have honestly no idea how to get it done.
The list looks like this:
l1 = [
[
{'id': 1, 'category': 5}, {'id': 3, 'category': 7}
],
[
{'id': 1, 'category': 5}, {'id': 4, 'category': 8}, {'id': 6, 'category': 9}
],
[
{'id': 6, 'category': 9}, {'id': 9, 'category': 16}
],
[
{'id': 2, 'category': 4}, {'id': 5, 'category': 17}
]
]
If one of the dicts from l1[0] is also present in l1[1], I want to concatenate the two lists and delete l1[0]. Afterwards I want to check if there are values from l1[1] also present in l1[2].
So my desired output would eventually look like this:
new_list = [
[
{'id': 1, 'category': 5}, {'id': 3, 'category': 7}, {'id': 4, 'category': 8}, {'id': 6, 'category': 9}, {'id': 9, 'category': 16}
],
[
{'id': 2, 'category': 4}, {'id': 5, 'category': 17}
]
]
Any idea how it can be done?
I tried it with three nested for loops, but it wouldn't work: I change the length of the list while looping, which provokes an index-out-of-range error (apart from that, it would be an ugly solution anyway):
for list in l1:
    for dictionary in list:
        for index in range(0, len(l1), 1):
            if dictionary in l1[index]:
                dictionary in l1[index].append(list)
                dictionary.remove(list)
Can I apply some map or list_comprehension here?
Thanks a lot for any help!
IIUC, the following algorithm works.
Initialize result to empty
For each sublist in l1:
    if sublist and last item in result overlap
        append into last list of result without overlapping items
    otherwise
        append sublist at end of result
Code
# Helper functions
def append(list1, list2):
    ' append list1 and list2 (without duplicating elements) '
    return list1 + [d for d in list2 if d not in list1]
def is_intersect(list1, list2):
    ' True if list1 and list2 have an element in common '
    return any(d in list2 for d in list1) or any(d in list1 for d in list2)
# Generate desired result
result = [] # resulting list
for sublist in l1:
    if not result or not is_intersect(sublist, result[-1]):
        result.append(sublist)
    else:
        # Intersection with last list, so append to last list in result
        result[-1] = append(result[-1], sublist)
print(result)
Output
[[{'id': 1, 'category': 5},
{'id': 3, 'category': 7},
{'id': 4, 'category': 8},
{'id': 6, 'category': 9},
{'id': 9, 'category': 16}],
[{'id': 2, 'category': 4}, {'id': 5, 'category': 17}]]
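For longer lists, the repeated "dict in list" checks compare dictionaries one by one. A possible variant, sketched here under the assumption that all dictionary values are hashable, freezes each dict into a frozenset of its items so overlap checks become set operations. The names freeze and merge_adjacent are hypothetical. Unlike the helpers above, this version also drops duplicates that appear inside the sublist being merged.

```python
def freeze(d):
    """Hashable stand-in for a dict (assumes hashable values)."""
    return frozenset(d.items())


def merge_adjacent(groups):
    result = []  # merged groups, in order
    seen = []    # seen[i] is the set of frozen dicts already in result[i]
    for group in groups:
        keys = [freeze(d) for d in group]
        if result and not seen[-1].isdisjoint(keys):
            # Overlap with the last merged group: add only the new dicts
            for d, k in zip(group, keys):
                if k not in seen[-1]:
                    result[-1].append(d)
                    seen[-1].add(k)
        else:
            result.append(list(group))
            seen.append(set(keys))
    return result


l1 = [
    [{'id': 1, 'category': 5}, {'id': 3, 'category': 7}],
    [{'id': 1, 'category': 5}, {'id': 4, 'category': 8}, {'id': 6, 'category': 9}],
    [{'id': 6, 'category': 9}, {'id': 9, 'category': 16}],
    [{'id': 2, 'category': 4}, {'id': 5, 'category': 17}],
]
new_list = merge_adjacent(l1)
```

Like the algorithm above, it only compares each sublist with the most recent merged group, not with every earlier one.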
Maybe you can try appending the elements to a new list. That way the original list stays the same and no index-out-of-range error is raised.
new_list = []
for list in l1:
    inner_list = []
    for ...
        if dictionary in l1[index]:
            inner_list.append(list)
    ...
    new_list.append(inner_list)
I have a table of dicts looking like:
[{
'variant_id': 4126274,
'stock': [
{'stock_id': 6, 'quantity': 86},
{'stock_id': 4, 'quantity': 23},
{'stock_id': 3, 'quantity': 9}
]
}, ...]
My goal is to unzip every piece of stock to look like this:
[{'variant_id': 4126274, 'stock_id': 6, 'quantity': 86},
{'variant_id': 4126274, 'stock_id': 4, 'quantity': 23},
{'variant_id': 4126274, 'stock_id': 3, 'quantity': 9}, ...]
Is there any fast and optimal way to do this?
You could do something like:
result = [{'variant_id': entry['variant_id'],
'stock_id': stock_entry['stock_id'],
'quantity': stock_entry['quantity']} for entry in table for stock_entry in entry['stock']]
This gives
[{'quantity': 86, 'stock_id': 6, 'variant_id': 4126274},
{'quantity': 23, 'stock_id': 4, 'variant_id': 4126274},
{'quantity': 9, 'stock_id': 3, 'variant_id': 4126274}]
Here are two approaches: one with nested for loops, and one with a list comprehension.
data = [{'variant_id': 4126274, 'stock': [{'stock_id': 6, 'quantity': 86}, {'stock_id': 4, 'quantity': 23}, {'stock_id': 3, 'quantity': 9}]}]
result = []
for entry in data:
    for stock in entry['stock']:
        result.append({'variant_id': entry['variant_id'], 'stock_id': stock['stock_id'], 'quantity': stock['quantity']})
print(result)
result_list_comprehension = [{'variant_id': entry['variant_id'], 'stock_id': stock['stock_id'], 'quantity': stock['quantity']} for entry in data for stock in entry['stock']]
print(result_list_comprehension)
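If the stock entries may carry more fields than stock_id and quantity, naming every key by hand gets brittle. A minimal sketch using dictionary unpacking copies whatever keys each stock entry has. The function name flatten_stock is hypothetical.

```python
def flatten_stock(table):
    """One flat dict per stock entry, carrying the parent's variant_id."""
    return [
        {'variant_id': entry['variant_id'], **stock}
        for entry in table
        for stock in entry['stock']
    ]


data = [{'variant_id': 4126274,
         'stock': [{'stock_id': 6, 'quantity': 86},
                   {'stock_id': 4, 'quantity': 23},
                   {'stock_id': 3, 'quantity': 9}]}]
flat = flatten_stock(data)
```

If a stock entry happened to contain its own variant_id key, that value would win, because later keys in the unpacking override earlier ones.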
I have a list of dictionaries that state a date as well as a price. It looks like this:
dict = [{'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 50}, {'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 12}, {'Date':datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
I'd like to create a new list of dictionaries that sum all the Price values that are on the same date. So the output would look like this:
output_dict = [{'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 62}, {'Date':datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
How could I achieve this?
You can use Counter from collections module:
from collections import Counter
c = Counter()
for v in dict:
    c[v['Date']] += v['Price']
output_dict = [{'Date': name, 'Price': count} for name, count in c.items()]
Output:
[{'Date': datetime.datetime(2020, 6, 1, 0, 0), 'Price': 62},
{'Date': datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
OR, a new way:
You can use Pandas library to solve this:
Install pandas like:
pip install pandas
Then code would be:
import pandas as pd
output_dict = pd.DataFrame(dict).groupby('Date').agg(sum).to_dict()['Price']
Output:
{Timestamp('2020-06-01 00:00:00'): 62, Timestamp('2020-06-02 00:00:00'): 60}
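Another solution using itertools.groupby (this relies on the list already being sorted by 'Date', since groupby only groups consecutive items):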
import datetime
from itertools import groupby
dct = [{'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 50}, {'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 12}, {'Date':datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
out = []
for k, g in groupby(dct, lambda k: k['Date']):
    out.append({'Date': k, 'Price': sum(v['Price'] for v in g)})
print(out)
Prints:
[{'Date': datetime.datetime(2020, 6, 1, 0, 0), 'Price': 62}, {'Date': datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
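One caveat worth illustrating: itertools.groupby only groups consecutive items that share a key, so the input has to be sorted by that key first. The sketch below uses made-up records in a deliberately shuffled order; the helper name sum_by_date is hypothetical.

```python
import datetime
from itertools import groupby
from operator import itemgetter

# Illustrative data: the two June 1 entries are not adjacent.
records = [
    {'Date': datetime.datetime(2020, 6, 1), 'Price': 50},
    {'Date': datetime.datetime(2020, 6, 2), 'Price': 60},
    {'Date': datetime.datetime(2020, 6, 1), 'Price': 12},
]


def sum_by_date(rows):
    """Sum 'Price' over each run of consecutive rows with the same 'Date'."""
    return [
        {'Date': day, 'Price': sum(r['Price'] for r in group)}
        for day, group in groupby(rows, key=itemgetter('Date'))
    ]


unsorted_result = sum_by_date(records)  # June 1 shows up as two separate groups
sorted_result = sum_by_date(sorted(records, key=itemgetter('Date')))
```

A Counter or defaultdict accumulates by key regardless of order, so those approaches need no sorting step.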
You can use itertools' groupby, although I'd like to believe that defaultdict will be faster:
from itertools import groupby
from operator import itemgetter
# dicts is the input list of dictionaries
# sort dicts (groupby only groups consecutive items)
dicts = sorted(dicts, key=itemgetter("Date"))
# get the sum via itertools' groupby
result = [{"Date": key,
           "Price": sum(entry['Price'] for entry in value)}
          for key, value in
          groupby(dicts, key=itemgetter("Date"))]
print(result)
[{'Date': datetime.datetime(2020, 6, 1, 0, 0), 'Price': 62},
{'Date': datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
Using defaultdict
import datetime
from collections import defaultdict
dct = [{'Date': datetime.datetime(2020, 6, 1, 0, 0), 'Price': 50},
{'Date': datetime.datetime(2020, 6, 1, 0, 0), 'Price': 12},
{'Date': datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
sum_up = defaultdict(int)
for v in dct:
    sum_up[v['Date']] += v['Price']
print([{"Date": k, "Price": v} for k, v in sum_up.items()])
[{'Date': datetime.datetime(2020, 6, 1, 0, 0), 'Price': 62}, {'Date': datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
This is a good use case for defaultdict. Let's say our list of dicts is my_dict:
import datetime
my_dict = [{'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 50},
{'Date':datetime.datetime(2020, 6, 1, 0, 0), 'Price': 12},
{'Date':datetime.datetime(2020, 6, 2, 0, 0), 'Price': 60}]
We can accumulate prices using a defaultdict like so:
from collections import defaultdict
new_dict = defaultdict(int)
for dict_ in my_dict:
    new_dict[dict_['Date']] += dict_['Price']
Then we just convert this dict back into a list of dicts:
my_dict = [{'Date': date, 'Price': price} for date, price in new_dict.items()]
I have several lists of dictionaries, where each dictionary contains a unique id value that is common among all lists. I'd like to combine them into a single list of dicts, where each dict is joined on that id value.
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
I tried doing something like the answer found at https://stackoverflow.com/a/42018660/7564393, but I'm getting very confused since I have more than 2 lists. Should I try using a defaultdict approach? More importantly, I am NOT always going to know the other values, only that the id value is present in all dicts.
You can use itertools.groupby():
from itertools import groupby
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
desired_output = []
for _, values in groupby(sorted([*list1, *list2, *list3], key=lambda x: x['id']), key=lambda x: x['id']):
    temp = {}
    for d in values:
        temp.update(d)
    desired_output.append(temp)
Result:
[{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
# combine all lists
d = {} # id -> merged dict
for l in [list1, list2, list3]:
    for list_d in l:
        if 'id' not in list_d:
            continue
        id_ = list_d['id']  # id_ avoids shadowing the built-in id()
        if id_ not in d:
            d[id_] = dict(list_d)  # copy so the input dicts are not modified
        else:
            d[id_].update(list_d)
# dicts with same id are grouped together since id is used as key
res = list(d.values())
print(res)
You can first build a dict of dicts, then turn it into a list:
from itertools import chain
from collections import defaultdict
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
dict_out = defaultdict(dict)
for d in chain(list1, list2, list3):
    dict_out[d['id']].update(d)
out = list(dict_out.values())
print(out)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
itertools.chain allows you to iterate on all the dicts contained in the 3 lists. We build a dict dict_out having the id as key, and the corresponding dict being built as value. This way, we can easily update the already built part with the small dict of our current iteration.
Here is a functional approach that doesn't use itertools (which is excellent for rapid development work).
This solution works for any number of lists, since the function takes a variable number of arguments, and it lets the caller choose the type of the return value (list or dict).
By default it returns a list, as you asked; if you pass as_list=False it returns a dictionary instead.
I used a dictionary internally because lookups by key are fast.
Have a look at the get_packed_list() function below.
get_packed_list()
def get_packed_list(*dicts_lists, as_list=True):
    output = {}
    for dicts_list in dicts_lists:
        for dictionary in dicts_list:
            _id = dictionary.pop("id") # id() is in-built function so preferred _id
            if _id not in output:
                # Create new id
                output[_id] = {"id": _id}
            for key in dictionary:
                output[_id][key] = dictionary[key]
            dictionary["id"] = _id # push back the 'id' after work (call by reference mechanism)
    if as_list:
        return [output[key] for key in output]
    return output # dictionary
Test
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
output = get_packed_list(list1, list2, list3)
print(output)
# [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
output = get_packed_list(list1, list2, list3, as_list=False)
print(output)
# {1: {'id': 1, 'value': 20, 'sum': 10, 'total': 30}, 2: {'id': 2, 'value': 21, 'sum': 11, 'total': 32}}
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
print(list1+list2+list3)
list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
result = []
for i in range(0, len(list1)):
    final_dict = dict(list(list1[i].items()) + list(list2[i].items()) + list(list3[i].items()))
    result.append(final_dict)
print(result)
output : [{'id': 1, 'value': 20, 'sum': 10, 'total': 30}, {'id': 2, 'value': 21, 'sum': 11, 'total': 32}]
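The index-based merge above only gives the right answer when every list has the same ids in the same order. A hedged sketch of the same idea with a guard: zip the lists together and raise if the dictionaries at one position disagree on id. The function name merge_aligned is hypothetical. Note that zip stops at the shortest list, so extra trailing entries would be ignored silently.

```python
def merge_aligned(*lists):
    """Merge dicts position by position, checking that their ids match."""
    merged = []
    for group in zip(*lists):
        ids = {d['id'] for d in group}
        if len(ids) != 1:
            raise ValueError(f"misaligned ids at one position: {sorted(ids)}")
        combined = {}
        for d in group:
            combined.update(d)
        merged.append(combined)
    return merged


list1 = [{'id': 1, 'value': 20}, {'id': 2, 'value': 21}]
list2 = [{'id': 1, 'sum': 10}, {'id': 2, 'sum': 11}]
list3 = [{'id': 1, 'total': 30}, {'id': 2, 'total': 32}]
result = merge_aligned(list1, list2, list3)
```

When the order cannot be guaranteed, keying on id as in the dictionary-based answers is the more robust choice.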
I want to merge two lists of dictionaries, using multiple keys.
I have a single list of dicts with one set of results:
l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
{'id': 2, 'year': '2017', 'resultA': 3},
{'id': 1, 'year': '2018', 'resultA': 3},
{'id': 2, 'year': '2018', 'resultA': 5}]
And another list of dicts for another set of results:
l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
{'id': 2, 'year': '2017', 'resultB': 8},
{'id': 1, 'year': '2018', 'resultB': 7},
{'id': 2, 'year': '2018', 'resultB': 9}]
And I want to combine them using the 'id' and 'year' keys to get the following:
all = [{'id': 1, 'year': '2017', 'resultA': 2, 'resultB': 5},
{'id': 2, 'year': '2017', 'resultA': 3, 'resultB': 8},
{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 7},
{'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 9}]
I know that for combining two lists of dicts on a single key, I can use this:
l1 = {d['id']:d for d in l1}
all = [dict(d, **l1.get(d['id'], {})) for d in l2]
But it ignores the year, providing the following incorrect result:
all = [{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 5},
{'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 8},
{'id': 1, 'year': '2018', 'resultA': 3, 'resultB': 7},
{'id': 2, 'year': '2018', 'resultA': 5, 'resultB': 9}]
Treating this as I would in R, by adding in the second variable I want to merge on, I get a KeyError:
l1 = {d['id','year']:d for d in l1}
all = [dict(d, **l1.get(d['id','year'], {})) for d in l2]
How do I merge using multiple keys?
Instead of d['id','year'], use the tuple (d['id'], d['year']) as your key.
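Applied to the snippet from the question, that could look like the sketch below. The names by_key and merged are illustrative (chosen to avoid rebinding l1 and shadowing the built-in all); the data is the question's.

```python
l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]
l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

# Index l1 by the (id, year) tuple; tuples are hashable, so they work as dict keys.
by_key = {(d['id'], d['year']): d for d in l1}

# For each row of l2, merge in the matching l1 row (or nothing if there is none).
merged = [{**d, **by_key.get((d['id'], d['year']), {})} for d in l2]
```

d['id','year'] fails because it looks up the single key ('id', 'year') in d, which does not exist; hence the KeyError.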
You can combine both lists and group the resulting list by id and year, then merge the dicts that share the same keys.
Grouping can be done with itertools.groupby, and merging with collections.ChainMap:
>>> from itertools import groupby
>>> from collections import ChainMap
>>> [dict(ChainMap(*list(g))) for _,g in groupby(sorted(l1+l2, key=lambda x: (x['id'],x['year'])),key=lambda x: (x['id'],x['year']))]
[{'resultA': 2, 'id': 1, 'resultB': 5, 'year': '2017'}, {'resultA': 3, 'id': 1, 'resultB': 7, 'year': '2018'}, {'resultA': 3, 'id': 2, 'resultB': 8, 'year': '2017'}, {'resultA': 5, 'id': 2, 'resultB': 9, 'year': '2018'}]
Alternatively to avoid lambda you can also use operator.itemgetter
>>> from operator import itemgetter
>>> [dict(ChainMap(*list(g))) for _,g in groupby(sorted(l1+l2, key=itemgetter('id', 'year')),key=itemgetter('id', 'year'))]
Expanding on @AlexHall's suggestion, you can use collections.defaultdict to help you:
from collections import defaultdict
d = defaultdict(dict)
for i in l1 + l2:
    results = {k: v for k, v in i.items() if k not in ('id', 'year')}
    d[(i['id'], i['year'])].update(results)
Result
defaultdict(dict,
{(1, '2017'): {'resultA': 2, 'resultB': 5},
(1, '2018'): {'resultA': 3, 'resultB': 7},
(2, '2017'): {'resultA': 3, 'resultB': 8},
(2, '2018'): {'resultA': 5, 'resultB': 9}})
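The result above is keyed by (id, year) tuples rather than being the list of dicts the question asked for. A short sketch of one way to convert it back, repeating the grouping step so the block runs on its own; the name merged is illustrative.

```python
from collections import defaultdict

l1 = [{'id': 1, 'year': '2017', 'resultA': 2},
      {'id': 2, 'year': '2017', 'resultA': 3},
      {'id': 1, 'year': '2018', 'resultA': 3},
      {'id': 2, 'year': '2018', 'resultA': 5}]
l2 = [{'id': 1, 'year': '2017', 'resultB': 5},
      {'id': 2, 'year': '2017', 'resultB': 8},
      {'id': 1, 'year': '2018', 'resultB': 7},
      {'id': 2, 'year': '2018', 'resultB': 9}]

d = defaultdict(dict)
for row in l1 + l2:
    results = {k: v for k, v in row.items() if k not in ('id', 'year')}
    d[(row['id'], row['year'])].update(results)

# Unpack each (id, year) key back into fields next to the collected results.
merged = [{'id': id_, 'year': year, **results}
          for (id_, year), results in d.items()]
```

Dictionaries preserve insertion order, so merged follows the order in which each (id, year) pair first appears in l1 + l2.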