How to group an array by multiple keys? - python

I'd like a function that can group a list of dictionaries into sublists of dictionaries depending on an arbitrary set of keys that all dictionaries have in common.
For example, I'd like the following list to be grouped into sublists of dictionaries depending on a certain set of keys
l = [{'name':'b','type':'new','color':'blue','amount':100},{'name':'c','type':'new','color':'red','amount':100},{'name':'d','type':'old','color':'gold','amount':100},{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100},{'name':'g','type':'normal','color':'red','amount':100}]
If I wanted to group by type, the result would be the following list of sublists, where every dictionary in a sublist has the same type:
[[{'name':'b','type':'new','color':'blue','amount':100},{'name':'c','type':'new','color':'red','amount':100}],[{'name':'d','type':'old','color':'gold','amount':100},{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100}],[{'name':'g','type':'normal','color':'red','amount':100}]]
If I wanted to group by type and color, the following would result where the list contains sublists that have the same type and color:
[[{'name':'b','type':'new','color':'blue','amount':100}],[{'name':'c','type':'new','color':'red','amount':100}],[{'name':'d','type':'old','color':'gold','amount':100}],[{'name':'e','type':'old','color':'red','amount':100},
{'name':'f','type':'old','color':'red','amount':100}],[{'name':'g','type':'normal','color':'red','amount':100}]]
I understand the following function can group by one key, but I'd like to group by multiple keys:
import itertools
import operator
def group_by_key(l, i):
    return [list(grp) for key, grp in itertools.groupby(sorted(l, key=operator.itemgetter(i)), key=operator.itemgetter(i))]
This is my attempt using the group_by_key function above:
def group_by_multiple_keys(l, *keys):
    for key in keys:
        l = group_by_key(l, key)
        l = [item for sublist in l for item in sublist]
    return l
The problem is that it flattens the list again right after grouping by each key. Instead, I'd like to regroup by the next key and still end up with one list of sublists.

itertools.groupby() + operator.itemgetter() will do what you want. groupby() takes an iterable and a key function, and groups the items in the iterable by the value returned by passing each item to the key function. itemgetter() is a factory that returns a function, which gets the specified items from any item passed to it.
from __future__ import print_function
import pprint
from itertools import groupby
from operator import itemgetter
def group_by_keys(iterable, keys):
    key_func = itemgetter(*keys)
    # For groupby() to do what we want, the iterable needs to be sorted
    # by the same key function that we're grouping by.
    sorted_iterable = sorted(iterable, key=key_func)
    return [list(group) for key, group in groupby(sorted_iterable, key_func)]
dicts = [
{'name': 'b', 'type': 'new', 'color': 'blue', 'amount': 100},
{'name': 'c', 'type': 'new', 'color': 'red', 'amount': 100},
{'name': 'd', 'type': 'old', 'color': 'gold', 'amount': 100},
{'name': 'e', 'type': 'old', 'color': 'red', 'amount': 100},
{'name': 'f', 'type': 'old', 'color': 'red', 'amount': 100},
{'name': 'g', 'type': 'normal', 'color': 'red', 'amount': 100}
]
Examples (the groups come out in sorted key order, so 'normal' sorts between 'new' and 'old'):
>>> pprint.pprint(group_by_keys(dicts, ('type',)))
[[{'amount': 100, 'color': 'blue', 'name': 'b', 'type': 'new'},
  {'amount': 100, 'color': 'red', 'name': 'c', 'type': 'new'}],
 [{'amount': 100, 'color': 'red', 'name': 'g', 'type': 'normal'}],
 [{'amount': 100, 'color': 'gold', 'name': 'd', 'type': 'old'},
  {'amount': 100, 'color': 'red', 'name': 'e', 'type': 'old'},
  {'amount': 100, 'color': 'red', 'name': 'f', 'type': 'old'}]]
>>>
>>> pprint.pprint(group_by_keys(dicts, ('type', 'color')))
[[{'amount': 100, 'color': 'blue', 'name': 'b', 'type': 'new'}],
 [{'amount': 100, 'color': 'red', 'name': 'c', 'type': 'new'}],
 [{'amount': 100, 'color': 'red', 'name': 'g', 'type': 'normal'}],
 [{'amount': 100, 'color': 'gold', 'name': 'd', 'type': 'old'}],
 [{'amount': 100, 'color': 'red', 'name': 'e', 'type': 'old'},
  {'amount': 100, 'color': 'red', 'name': 'f', 'type': 'old'}]]
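Because groupby() only merges adjacent items, the function above has to sort first, so the groups come back in key order. If the order of first appearance matters instead, a plain dictionary keyed by the same itemgetter() result is one alternative. This is a minimal sketch, assuming Python 3.7+ (where dictionaries keep insertion order); group_by_keys_unsorted and sample are hypothetical names, not part of the answer above.

```python
from operator import itemgetter

def group_by_keys_unsorted(iterable, keys):
    """Group dicts by the given keys, keeping first-appearance order."""
    key_func = itemgetter(*keys)  # one key -> a value, several keys -> a tuple
    groups = {}
    for item in iterable:
        # setdefault creates the sublist the first time a key value is seen
        groups.setdefault(key_func(item), []).append(item)
    return list(groups.values())

sample = [
    {'name': 'd', 'type': 'old', 'color': 'gold'},
    {'name': 'b', 'type': 'new', 'color': 'blue'},
    {'name': 'e', 'type': 'old', 'color': 'red'},
    {'name': 'f', 'type': 'old', 'color': 'red'},
]
by_type = group_by_keys_unsorted(sample, ('type',))
by_type_color = group_by_keys_unsorted(sample, ('type', 'color'))
print([[d['name'] for d in group] for group in by_type])
print([[d['name'] for d in group] for group in by_type_color])
```

No sorting is involved, so the key values only need to be hashable, not comparable.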

Related

How to convert excel data to json in python?

My data is below
food ID  name  ingredients  ingredient ID  amount  unit
1        rice  red          R1             10      g
1        soup  blue         B1             20      g
1        soup  yellow       Y1             30      g
and I want to convert it like this
{
    'data': [
        {
            'name': 'rice',
            'ingredients': [
                {
                    'name': 'red',
                    'ingredient_id': 'R1',
                    'amount': 10,
                    'unit': 'g',
                }
            ]
        },
        {
            'name': 'soup',
            'ingredients': [
                {
                    'name': 'blue',
                    'ingredient_id': 'B1',
                    'amount': 20,
                    'unit': 'g',
                },
                {
                    'name': 'yellow',
                    'ingredient_id': 'Y1',
                    'amount': 30,
                    'unit': 'g',
                }
            ]
        }
    ]
}
How can I do it? Do I need a library such as pandas?
Yes, you can reshape the data with a small custom function in Python.
The following code uses pandas to build the structure you asked for:
import pandas as pd
data = [[1, 'rice', 'red', 'R1', 10, 'g'],
[1, 'soup', 'blue', 'B1', 20, 'g'],
[1, 'soup', 'yellow', 'Y1', 30, 'g'],
[1, 'apple', 'yellow', 'Y1', 30, 'g']]
df = pd.DataFrame(data, columns=['food ID', 'name', 'ingredients', 'ingredient ID', 'amount', 'unit'])
def convert_data_group(group):
    # Build one dictionary per ingredient row in this food's group
    ingredients = [{'name': row['ingredients'], 'ingredient_id': row['ingredient ID'], 'amount': row['amount'], 'unit': row['unit']} for _, row in group.iterrows()]
    return {'name': group.iloc[0]['name'], 'ingredients': ingredients}
unique_names = df['name'].unique().tolist()
result = []
for name in unique_names:
    group = df[df['name'] == name]
    result.append(convert_data_group(group))
final_result = {'data': result}
print(final_result)
Your final result will be:
{'data': [{'name': 'rice', 'ingredients': [{'name': 'red', 'ingredient_id': 'R1', 'amount': 10, 'unit': 'g'}]}, {'name': 'soup', 'ingredients': [{'name': 'blue', 'ingredient_id': 'B1', 'amount': 20, 'unit': 'g'}, {'name': 'yellow', 'ingredient_id': 'Y1', 'amount': 30, 'unit': 'g'}]}, {'name': 'apple', 'ingredients': [{'name': 'yellow', 'ingredient_id': 'Y1', 'amount': 30, 'unit': 'g'}]}]}
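pandas is not strictly required for the reshaping step itself. Assuming the spreadsheet rows have already been read into tuples (reading the Excel file is outside this sketch), a plain dictionary can do the grouping. The names rows and rows_to_nested are illustrative, and the top-level 'data' key follows the format requested in the question.

```python
import json

# Assumed input: one tuple per spreadsheet row, in the column order
# (food ID, name, ingredients, ingredient ID, amount, unit).
rows = [
    (1, 'rice', 'red', 'R1', 10, 'g'),
    (1, 'soup', 'blue', 'B1', 20, 'g'),
    (1, 'soup', 'yellow', 'Y1', 30, 'g'),
]

def rows_to_nested(rows):
    """Collect ingredient rows under their food name, keeping row order."""
    foods = {}
    for _food_id, name, ingredient, ingredient_id, amount, unit in rows:
        # Create the food entry the first time its name is seen
        entry = foods.setdefault(name, {'name': name, 'ingredients': []})
        entry['ingredients'].append({
            'name': ingredient,
            'ingredient_id': ingredient_id,
            'amount': amount,
            'unit': unit,
        })
    return {'data': list(foods.values())}

nested = rows_to_nested(rows)
print(json.dumps(nested, indent=2))
```

json.dumps() produces real JSON with double quotes, which is what another program would expect to read.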

Merge two lists of dictionaries with a common key

I have two lists of dictionaries:
x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started'},
{'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'}]
I would like to merge them so that y becomes:
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'},
{'Name': 'AA', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'}]
The code below does not give the desired result:
for i in range(len(x)):
    for k, v in x[i]:
        y[i][k] = v
Note: both lists contain the same number of dictionaries, and each dictionary in x has a matching "Name" in y.
Here's a simpler way that does the same as what I think you're trying to do in your code:
>>> for d1, d2 in zip(x, y):
...     d2.update(d1)
...
>>> y
[{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'}, {'Name': 'AA', 'Alias': 'blue', 'Status': 'Started', 'State': 'All good'}]
If the dictionaries match up pairwise, see @Iguananaut's solution.
If not, and you have to match on the Name field, I'd suggest building an intermediate dict {Name: row} for each list, then iterating over the union of the names and merging whatever each side has. This tolerates a Name that is missing from either list.
x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'SG', 'Alias': 'blue', 'Status': 'Started'},
{'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'}]
prepare_x = {row['Name']: row for row in x}
prepare_y = {row['Name']: row for row in y}
result = [{**prepare_x.get(key, {}), **prepare_y.get(key, {})}
for key in (prepare_x.keys() | prepare_y.keys())]
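One detail of the snippet above: prepare_x.keys() | prepare_y.keys() is a set, so the order of result is not guaranteed to follow x or y. If order matters, a sketch like the following keeps the order in which each Name first appears (assuming Python 3.7+ dictionary ordering). merge_by_name is a hypothetical helper name, and the sample y is changed to include one missing and one extra Name.

```python
def merge_by_name(x, y, key='Name'):
    """Merge two lists of dicts on `key`; rows present on one side only are kept."""
    merged = {}
    for row in list(x) + list(y):
        # Rows sharing a key value are folded into one dict; y wins on clashes
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())

x = [{'Name': 'SG', 'State': 'All good'}, {'Name': 'AA', 'State': 'All good'}]
y = [{'Name': 'AA', 'Alias': 'blue', 'Status': 'Started'},
     {'Name': 'ZZ', 'Alias': 'red', 'Status': 'Stopped'}]
result = merge_by_name(x, y)
print(result)
```

The merged dictionaries are new objects, so x and y themselves are left unchanged.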

How to move a key value in a dictionary list one level up in python

Using Python 3, I'm trying to move one key-value pair in a list of dictionaries up one level.
I have a variable called product that has the following:
[
    {'color': 'red',
     'shape': 'round',
     'extra': {'price': 'large',
               'onsale': 'yes',
               'instock': 'yes'}
    },
    {'color': 'blue',
     'shape': 'square',
     'extra': {'price': 'small',
               'onsale': 'no',
               'instock': 'yes'}
    }
]
I'd like to move the "instock" key-value pair out of extra and up one level, so that it sits alongside color, shape and extra, like this:
[
    {'color': 'red',
     'shape': 'round',
     'extra': {'price': 'large',
               'onsale': 'yes'},
     'instock': 'yes'
    },
    {'color': 'blue',
     'shape': 'square',
     'extra': {'price': 'small',
               'onsale': 'no'},
     'instock': 'yes'
    }
]
I tried playing with the following code that I found here:
result = {}
for i in products:
    if i["href"] not in result:
        result[i["selection_id"]] = {'selection_id': i["selection_id"], 'other_data': i["other_data"], 'value_dict': []}
    result[i["selection_id"]]["value_dict"].append({'value': i["value"], "value_name": i["value_name"]})
It didn't work for me.
Any help or additional literature that I can find online would be greatly appreciated!
Very simple: iterate through the list. For each dict, copy "extra"."instock" up one level and delete the original:
for outer_dict in product:
    outer_dict["instock"] = outer_dict["extra"]["instock"]
    del outer_dict["extra"]["instock"]
for outer_dict in product:
    print(outer_dict)
Output:
{'color': 'red', 'shape': 'round', 'extra': {'price': 'large', 'onsale': 'yes'}, 'instock': 'yes'}
{'color': 'blue', 'shape': 'square', 'extra': {'price': 'small', 'onsale': 'no'}, 'instock': 'yes'}
lst = [
{'color': 'red',
'shape': 'round',
'extra': {'price': 'large',
'onsale': 'yes',
'instock': 'yes'}
},
{'color': 'blue',
'shape': 'square',
'extra': {'price': 'small',
'onsale': 'no',
'instock': 'yes'}
}
]
for d in lst:
    d['instock'] = d['extra'].pop('instock')
# pretty print on screen:
from pprint import pprint
pprint(lst)
Prints:
[{'color': 'red',
  'extra': {'onsale': 'yes', 'price': 'large'},
  'instock': 'yes',
  'shape': 'round'},
 {'color': 'blue',
  'extra': {'onsale': 'no', 'price': 'small'},
  'instock': 'yes',
  'shape': 'square'}]
If some dictionaries might have no instock key, pass a default value to pop() ('no' in this case) to avoid a KeyError:
d['instock'] = d['extra'].pop('instock', 'no')
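Building on the pop() approach above, a small helper can move the key only when it is actually present, so entries without an 'extra' dictionary or without 'instock' are left untouched rather than given a default. This is a sketch under that assumption; lift_key and its parameter names are hypothetical, and the sample data is shortened for illustration.

```python
def lift_key(records, parent, key):
    """Move record[parent][key] up to record[key], in place, when it exists."""
    for record in records:
        inner = record.get(parent)
        if isinstance(inner, dict) and key in inner:
            # pop() removes the pair from the inner dict and returns its value
            record[key] = inner.pop(key)
    return records

products = [
    {'color': 'red', 'extra': {'price': 'large', 'instock': 'yes'}},
    {'color': 'blue', 'extra': {'price': 'small'}},  # no 'instock'
    {'color': 'green'},                              # no 'extra'
]
lift_key(products, 'extra', 'instock')
print(products)
```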
products = [
{'color': 'red',
'shape': 'round',
'extra': {'price': 'large',
'onsale': 'yes',
'instock': 'yes'}
},
{'color': 'blue',
'shape': 'square',
'extra': {'price': 'small',
'onsale': 'no',
'instock': 'yes'}
}
]
result_list = []
for item in products:
    result = {}  # start a fresh dict for every product
    for key, values in item.items():
        if isinstance(values, dict):
            for inner_key, inner_value in values.items():
                # remove this condition if you want all of the inner items to level up
                if inner_key == "instock":
                    result[inner_key] = inner_value
        else:
            result[key] = values
    result_list.append(result)
print(result_list)
Output:
[{'color': 'red', 'shape': 'round', 'instock': 'yes'}, {'color': 'blue', 'shape': 'square', 'instock': 'yes'}]
The comment marks the condition to remove if you want the other inner keys moved up a level as well. Note that this builds new dictionaries and leaves out the 'extra' key.

Combine two lists of dictionaries by value of key in dictionaries

I have two lists whose elements are dictionaries.
list1 = [
{'id': 1, 'color': 'purple', 'size': 10},
{'id': 2, 'color': 'red', 'size': 25},
{'id': 3, 'color': 'orange', 'size': 1},
{'id': 4, 'color': 'black', 'size': 100},
{'id': 5, 'color': 'green', 'size': 33}
]
list2 = [
{'id': 2, 'width': 22, 'age': 22.3},
{'id': 5, 'width': 9, 'age': 1.7}
]
I want a third list with the same length as the larger list. Wherever a dictionary in the smaller list has an id that matches a dictionary in the larger list, the two dictionaries should be merged, so that the final output looks like:
list3 = [
{'id': 1, 'color': 'purple', 'size': 10},
{'id': 2, 'color': 'red', 'size': 25, 'width': 22, 'age': 22.3},
{'id': 3, 'color': 'orange', 'size': 1},
{'id': 4, 'color': 'black', 'size': 100},
{'id': 5, 'color': 'green', 'size': 33, 'width': 9, 'age': 1.7}
]
Ideally, this would be done without looping over both lists.
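Try this list comprehension: for each dictionary in list1, an inner comprehension collects the list2 entries with the same id, next() takes the first match (or an empty dict if there is none), and dictionary unpacking merges the two: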
list3 = [{**i, **next(iter([x for x in list2 if x['id'] == i['id']]), {})} for i in list1]
And now:
print(list3)
Is:
[{'id': 1, 'color': 'purple', 'size': 10}, {'id': 2, 'color': 'red', 'size': 25, 'width': 22, 'age': 22.3}, {'id': 3, 'color': 'orange', 'size': 1}, {'id': 4, 'color': 'black', 'size': 100}, {'id': 5, 'color': 'green', 'size': 33, 'width': 9, 'age': 1.7}]
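The one-liner above rescans list2 for every element of list1. A sketch of an alternative that reads each list only once is to index list2 by id first. The name by_id is illustrative, the lists are shortened, and the approach assumes each id appears at most once in list2.

```python
list1 = [
    {'id': 1, 'color': 'purple', 'size': 10},
    {'id': 2, 'color': 'red', 'size': 25},
    {'id': 3, 'color': 'orange', 'size': 1},
]
list2 = [
    {'id': 2, 'width': 22, 'age': 22.3},
]

# Build the lookup table once (assumes ids are unique within list2)
by_id = {d['id']: d for d in list2}

# Merge each list1 entry with its match, or with {} when there is none
list3 = [{**d, **by_id.get(d['id'], {})} for d in list1]
print(list3)
```

Each lookup in by_id is a constant-time dictionary access, so the total work grows with len(list1) + len(list2) rather than their product.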
Alternatively, use a defaultdict keyed by id:
from collections import defaultdict
list1 = [
{'id': 1, 'color': 'purple', 'size': 10},
{'id': 2, 'color': 'red', 'size': 25},
{'id': 3, 'color': 'orange', 'size': 1},
{'id': 4, 'color': 'black', 'size': 100},
{'id': 5, 'color': 'green', 'size': 33}
]
list2 = [
{'id': 2, 'width': 22, 'age': 22.3},
{'id': 5, 'width': 9, 'age': 1.7}
]
dict1 = defaultdict(dict)
for l in (list1, list2):
    for elem in l:
        dict1[elem['id']].update(elem)
list3 = dict1.values()
print(list(list3))
Output:
[
{
'id': 1,'color': 'purple','size': 10
},
{
'id': 2,'color': 'red','size': 25,'width': 22,'age': 22.3
},
{
'id': 3,'color': 'orange','size': 1
},
{
'id': 4,'color': 'black','size': 100
},
{
'id': 5, 'color': 'green','size': 33, 'width': 9,'age': 1.7
}
]
The order of list3 follows the order in which each id was first inserted (dictionaries preserve insertion order since Python 3.7), which is not necessarily sorted by id. To sort explicitly, you can try this:
from operator import itemgetter
...
new_list = sorted(dict1.values(), key=itemgetter("id"))

Append Python List of Dictionaries by loops

I have two Python lists of dictionaries:
[{'index':'1','color':'red'},{'index':'2','color':'blue'},{'index':'3','color':'green'}]
and
[{'device':'1','name':'x'},{'device':'2','name':'y'},{'device':'3','name':'z'}]
How can I combine each dictionary of the second list with the whole first list, so as to get the following output?
[{'device':'1','name':'x'},{'index':'1','color':'red'},{'index':'2','color':'blue'},{'index':'3','color':'green'}]
[{'device':'2','name':'y'},{'index':'1','color':'red'},{'index':'2','color':'blue'},{'index':'3','color':'green'}]
[{'device':'3','name':'z'},{'index':'1','color':'red'},{'index':'2','color':'blue'},{'index':'3','color':'green'}]
I think that the following code answers your question:
indexes = [
{'index':'1','color':'red'},
{'index':'2','color':'blue'},
{'index':'3','color':'green'}
]
devices = [
{'device':'1','name':'x'},
{'device':'2','name':'y'},
{'device':'3','name':'z'}
]
new_lists = [[device] for device in devices]
for new_list in new_lists:
    new_list.extend(indexes)
I don't know where you wanted to save your result lists, so I printed them out:
d1 = [{'index':'1','color':'red'},{'index':'2','color':'blue'},{'index':'3','color':'green'}]
d2 = [{'device':'1','name':'x'},{'device':'2','name':'y'},{'device':'3','name':'z'}]
for item in d2:
    print([item] + d1)
The output:
[{'name': 'x', 'device': '1'}, {'index': '1', 'color': 'red'}, {'index': '2', 'color': 'blue'}, {'index': '3', 'color': 'green'}]
[{'name': 'y', 'device': '2'}, {'index': '1', 'color': 'red'}, {'index': '2', 'color': 'blue'}, {'index': '3', 'color': 'green'}]
[{'name': 'z', 'device': '3'}, {'index': '1', 'color': 'red'}, {'index': '2', 'color': 'blue'}, {'index': '3', 'color': 'green'}]
(Don't be confused by the order of keys within the individual dictionaries: before Python 3.7, dictionaries did not preserve insertion order.)
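Both answers reuse the same dictionary objects in every output list, so changing one dictionary later changes it in all of the lists. If independent copies are wanted, a sketch along these lines works. copy.deepcopy is from the standard library; combined is an illustrative name and the lists are shortened.

```python
import copy

indexes = [{'index': '1', 'color': 'red'}, {'index': '2', 'color': 'blue'}]
devices = [{'device': '1', 'name': 'x'}, {'device': '2', 'name': 'y'}]

# One list per device: the device first, then a fresh copy of every index dict
combined = [[device, *copy.deepcopy(indexes)] for device in devices]

combined[0][1]['color'] = 'pink'  # only changes the first list's copy
print(combined[1][1]['color'])
print(indexes[0]['color'])
```

Note that the device dictionaries themselves are still shared with devices; only the index dictionaries are copied here.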
