How to combine dups from dictionary with Python [duplicate] - python

I have this list of dictionaries:
"ingredients": [
{
"unit_of_measurement": {"name": "Pound (Lb)", "id": 13},
"quantity": "1/2",
"ingredient": {"name": "Balsamic Vinegar", "id": 12},
},
{
"unit_of_measurement": {"name": "Pound (Lb)", "id": 13},
"quantity": "1/2",
"ingredient": {"name": "Balsamic Vinegar", "id": 12},
},
{
"unit_of_measurement": {"name": "Tablespoon", "id": 15},
"ingredient": {"name": "Basil Leaves", "id": 14},
"quantity": "3",
},
]
I want to be able to find the duplicates of ingredients (by either name or id). If there are duplicates and have the same unit_of_measurement, combine them into one dictionary and add the quantity accordingly. So the above data should return:
[
{
"unit_of_measurement": {"name": "Pound (Lb)", "id": 13},
"quantity": "1",
"ingredient": {"name": "Balsamic Vinegar", "id": 12},
},
{
"unit_of_measurement": {"name": "Tablespoon", "id": 15},
"ingredient": {"name": "Basil Leaves", "id": 14},
"quantity": "3",
},
]
How do I go about it?

Assuming you have a dictionary represented like this:
data = {
"ingredients": [
{
"unit_of_measurement": {"name": "Pound (Lb)", "id": 13},
"quantity": "1/2",
"ingredient": {"name": "Balsamic Vinegar", "id": 12},
},
{
"unit_of_measurement": {"name": "Pound (Lb)", "id": 13},
"quantity": "1/2",
"ingredient": {"name": "Balsamic Vinegar", "id": 12},
},
{
"unit_of_measurement": {"name": "Tablespoon", "id": 15},
"ingredient": {"name": "Basil Leaves", "id": 14},
"quantity": "3",
},
]
}
What you could do is use a collections.defaultdict of lists to group the ingredients by a (name, id) grouping key:
from collections import defaultdict
ingredient_groups = defaultdict(list)
for ingredient in data["ingredients"]:
key = tuple(ingredient["ingredient"].items())
ingredient_groups[key].append(ingredient)
Then you could go through the grouped values of this defaultdict, and calculate the sum of the fraction quantities using fractions.Fractions. For unit_of_measurement and ingredient, we could probably just use the first grouped values.
from fractions import Fraction
result = [
{
"unit_of_measurement": value[0]["unit_of_measurement"],
"quantity": str(sum(Fraction(ingredient["quantity"]) for ingredient in value)),
"ingredient": value[0]["ingredient"],
}
for value in ingredient_groups.values()
]
Which will then give you this result:
[{'ingredient': {'id': 12, 'name': 'Balsamic Vinegar'},
'quantity': '1',
'unit_of_measurement': {'id': 13, 'name': 'Pound (Lb)'}},
{'ingredient': {'id': 14, 'name': 'Basil Leaves'},
'quantity': '3',
'unit_of_measurement': {'id': 15, 'name': 'Tablespoon'}}]
You'll probably need to amend the above to account for ingredients with different units or measurements, but this should get you started.

Related

ordering a dictionary by count of items across a number of key value lists

hopefully he the title is not too confusing, I have a dictionary (sample below) whereby im trying to sort the dictionary by the number of list (dictionary items) across a number of key values beneath a parent. Hopefully the example makes more sense then my description?
{
"data": {
"London": {
"SHOP 1": [
{
"kittens": 10,
"type": "fluffy"
},
{
"puppies": 11,
"type": "squidgy"
}
],
"SHOP 2": [
{
"kittens": 15,
"type": "fluffy"
},
{
"puppies": 3,
"type": "squidgy"
},
{
"fishes": 132,
"type": "floaty"
}
]
},
"Manchester": {
"SHOP 1": [
{
"kittens": 10,
"type": "fluffy"
},
{
"puppies": 11,
"type": "squidgy"
}
],
"SHOP 2": [
{
"kittens": 15,
"type": "fluffy"
},
{
"puppies": 3,
"type": "squidgy"
},
{
"fishes": 132,
"type": "floaty"
}
],
"SHOP 3": [
{
"kittens": 15,
"type": "fluffy"
},
{
"puppies": 3,
"type": "squidgy"
},
]
},
"Edinburgh": {
"SHOP 1": [
{
"kittens": 10,
"type": "fluffy"
},
{
"puppies": 11,
"type": "squidgy"
}
],
"SHOP 2": [
{
"kittens": 15,
"type": "fluffy"
},
],
"SHOP 3": [
{
"puppies": 3,
"type": "squidgy"
},
]
}
}
}
Summary
# London 2 shops, 5 item dictionaries total
# Machester 3 shops, 7 item dictionaries total
# Edinburgh 3 shops, 4 item dictionaries total
Desired sorting would be by total items across the shops, so ordered Manchester, London, Edinburgh
id usually use somethign like the below to sort, but im not sure how to do this oen with it being counting the number of items across a number of keys?
{k: v for k, v in sorted(x.items(), key=lambda item: item[1])}
You need to reverse sort based on the total number of items for each location, which you can generate as:
sum(len(i) for i in s.values())
where s is the shop dictionary for each location.
Putting this into a sorted expression:
dict(sorted(d['data'].items(), key=lambda t:sum(len(i) for i in t[1].values()), reverse=True))
gives:
{
'Manchester': {
'SHOP 1': [{'kittens': 10, 'type': 'fluffy'}, {'puppies': 11, 'type': 'squidgy'}],
'SHOP 2': [{'kittens': 15, 'type': 'fluffy'}, {'puppies': 3, 'type': 'squidgy'}, {'fishes': 132, 'type': 'floaty'}],
'SHOP 3': [{'kittens': 15, 'type': 'fluffy'}, {'puppies': 3, 'type': 'squidgy'}]
},
'London': {
'SHOP 1': [{'kittens': 10, 'type': 'fluffy'}, {'puppies': 11, 'type': 'squidgy'}],
'SHOP 2': [{'kittens': 15, 'type': 'fluffy'}, {'puppies': 3, 'type': 'squidgy'}, {'fishes': 132, 'type': 'floaty'}]
},
'Edinburgh': {
'SHOP 1': [{'kittens': 10, 'type': 'fluffy'}, {'puppies': 11, 'type': 'squidgy'}],
'SHOP 2': [{'kittens': 15, 'type': 'fluffy'}], 'SHOP 3': [{'puppies': 3, 'type': 'squidgy'}]
}
}
No need to make things complex:
adict = adict['data']
result = []
for capital, value in adict.items():
shop_count = len(value)
items = sum([len(obj) for obj in value.values()])
result.append((capital, shop_count, items))
for capital, shop_count, items in sorted(result, key=lambda x: x[2], reverse=True):
print(f'{capital} {shop_count} shops, {items} item dictionaries total')
Output:
Manchester 3 shops, 7 item dictionaries total
London 2 shops, 5 item dictionaries total
Edinburgh 3 shops, 4 item dictionaries total

To delete dictionaries with duplicate values for keys in two dictionary lists

I want to remove a dictionary from the l2 list that has duplicate values for the ["name"] key in two dictionary lists.
How do I do this?
l = [
{
"id": 1,
"name": "John"
},
{
"id": 2,
"name": "Tom"
}
]
l2 = [
{
"name": "John",
"gender": "male",
"country": "USA"
},
{
"name": "Alex",
"gender": "male"
"country": "Canada"
},
{
"name": "Sofía",
"gender": "female"
"country": "Mexico"
},
]
Results sought
[
{
"name": "Alex",
"gender": "male"
"country": "Canada"
},
{
"name": "Sofía",
"gender": "female"
"country": "Mexico"
},
]
Try:
>>> [d for d in l2 if d["name"] not in [d1["name"] for d1 in l]]
[{'name': 'Alex', 'gender': 'male', 'country': 'Canada'},
{'name': 'Sofía', 'gender': 'female', 'country': 'Mexico'}]

Python Json data Group it by same last name

Newbie here. I have a Json data that have full name, age, Country and Department. By using python, how can i generate a new Json format with last name as a key and the Json data contain the total number of people that have same last name, list of age and list of departments?
Json as data
{
"John Jane": {
"age": 30,
"Country": "Denmark",
"Department": "Marketing"
},
"Gennie Jane": {
"age": 45,
"Country": "New Zealand",
"Department": "Finance"
},
"Mark Michael": {
"age": 55,
"Country": "Australia",
"Department": "HR"
},
"Jenny Jane": {
"age": 45,
"Country": "United States",
"Department": "IT"
},
"Jane Michael": {
"age": 27,
"Country": "United States",
"Department": "HR"
},
"Scofield Michael": {
"age": 37,
"Country": "England",
"Department": "HR"
}
}
Expected Result:
{
"Michael": {
"count": 3, // number of people that have same last name,
"age": {
"age1": 55,
"age2": 27,
"age3": 37
},
"Country": {
"Country1":"Australia",
"Country2":"United States",
"Country3":"England"
},
"Department": {
"Department1": "HR",
"Department2": "HR",
"Department3": "HR"
},
...
...
...
}
}
In my point of view, using dict for 'age', 'Country' or 'Department' is not necessary and more complicate, using list should be better.
import json
text = """{
"John Jane": {
"age": 30,
"Country": "Denmark",
"Department": "Marketing"
},
"Gennie Jane": {
"age": 45,
"Country": "New Zealand",
"Department": "Finance"
},
"Mark Michael": {
"age": 55,
"Country": "Australia",
"Department": "HR"
},
"Jenny Jane": {
"age": 45,
"Country": "United States",
"Department": "IT"
},
"Jane Michael": {
"age": 27,
"Country": "United States",
"Department": "HR"
},
"Scofield Michael": {
"age": 37,
"Country": "England",
"Department": "HR"
}
}"""
dictionary = json.loads(text)
result = {}
for key, value in dictionary.items():
last_name = key.split()[1]
if last_name in result:
result[last_name]['count'] += 1
result[last_name]['age'].append(value['age'])
result[last_name]['Country'].append(value['Country'])
result[last_name]['Department'].append(value['Department'])
else:
result[last_name] = {'count':1, 'age':[value['age']], 'Country':[value['Country']], 'Department':[value['Department']]}
print(result)
{'Jane': {'count': 3, 'age': [30, 45, 45], 'Country': ['Denmark', 'New Zealand', 'United States'], 'Department': ['Marketing', 'Finance', 'IT']}, 'Michael': {'count': 3, 'age': [55, 27, 37], 'Country': ['Australia', 'United States', 'England'], 'Department': ['HR', 'HR', 'HR']}}

Remove specific key from nested list of dictionary

I have specific format of list containing complex dictionary & containing again list of dictionaries (Nested Format), e.g.
And Requirement is to remove question_id from all associate dictionaries.
options = [
{
"value": 1,
"label": "Paints",
"question_id": "207",
"question": "Which Paint Brand?",
"question_type_id": 2,
"options": [
{
"value": 2,
"label": "Glidden",
"question": "Is it Glidden Paint?",
"question_id": 1,
"options": [{"question_id": 1,"value": 10000, "label": "No"}, {"question_id": 1,"value": 10001, "label": "Yes"}],
},
{
"value": 1,
"label": "Valspar",
"question": "Is it Valspar Paint?",
"question_id": 1,
"options": [{"question_id": 1,"value": 10000, "label": "No"}, {"question_id": 1,"value": 10001, "label": "Yes"}],
},
{
"value": 3,
"label": "DuPont",
"question": "Is it DuPont Paint?",
"question_id": 1,
"options": [{"question_id": 1,"value": 10000, "label": "No"}, {"question_id": 1,"value": 10001, "label": "Yes"}],
},
],
},
{
"value": 4,
"label": "Rods",
"question": "Which Rods Brand?",
"question_id": 2,
"options": [
{"value": 3, "label": "Trabucco"},
{"value": 5, "label": "Yuki"},
{"value": 1, "label": "Shimano"},
{"value": 4, "label": "Daiwa"},
{"value": 2, "label": "Temple Reef"},
],
},
{
"value": 3,
"label": "Metal Sheets",
"question": "Which Metal Sheets Brand?",
"question_id": 2,
"options": [
{"value": 2, "label": "Nippon Steel Sumitomo Metal Corporation"},
{"value": 3, "label": "Hebei Iron and Steel Group"},
{"value": 1, "label": "ArcelorMittal"},
],
},
{
"value": 2,
"label": "Door Knobs Locks",
"question": "Which Door Knobs Locks Brand?",
"question_id": 2,
"options": [
{
"value": 1,
"label": "ASSA-Abloy",
"question": "Is it ASSA-Abloy Door Knobs Locks?",
"question_type_id": 1,
"options": [{"value": 10000, "label": "No"}, {"value": 10001, "label": "Yes"}],
},
{
"value": 4,
"label": "RR Brink",
"question": "Is it RR Brink Door Knobs Locks?",
"question_type_id": 1,
"options": [{"value": 10000, "label": "No"}, {"value": 10001, "label": "Yes"}],
},
{
"value": 3,
"label": "Medeco",
"question": "Is it Medeco Door Knobs Locks?",
"question_type_id": 1,
"options": [{"value": 10000, "label": "No"}, {"value": 10001, "label": "Yes"}],
},
{
"value": 2,
"label": "Evva",
"question": "Is it Evva Door Knobs Locks?",
"question_type_id": 1,
"options": [{"value": 10000, "label": "No"}, {"value": 10001, "label": "Yes"}],
},
],
},
]
For this I have written a code & trying to run it recursively.
from collections import MutableMapping
def delete_keys_from_dict(dictionary_list, keys):
keys_set = set(keys) # Just an optimization for the "if key in keys" lookup.
# modified_list=[]
for index, dictionary in enumerate(dictionary_list):
modified_dict = {}
for key, value in dictionary.items():
if key not in keys_set:
if isinstance(value, list):
modified_dict[key] = delete_keys_from_dict(value, keys_set)
else:
if isinstance(value, MutableMapping):
modified_dict[key] = delete_keys_from_dict(value, keys_set)
else:
modified_dict[key] = value
# or copy.deepcopy(value) if a copy is desired for non-dicts.
dictionary_list[index] = modified_dict
return dictionary_list
It's returning incorrect list & which is not preserving the existing list data.
May i know, Where am i going wrong or missing something somewhere?
I think something like this should do what you want.
obj may be any object, and this recurses into lists and dicts.
def delete_keys(obj, keys):
if isinstance(obj, list):
return [
delete_keys(item, keys)
for item in obj
]
if isinstance(obj, dict):
return {
key: delete_keys(value, keys)
for (key, value) in obj.items()
if key not in keys
}
return obj # Nothing to do for this value
e.g.
from pprint import pprint
options = [
{
"value": 1,
"label": "Paints",
"question_id": "207",
"question": "Which Paint Brand?",
"question_type_id": 2,
"options": [
{
"value": 2,
"label": "Glidden",
"question": "Is it Glidden Paint?",
"question_id": 1,
"options": [{"question_id": 1,"value": 10000, "label": "No"}, {"question_id": 1,"value": 10001, "label": "Yes"}],
}
],
},
{
"value": 4,
"label": "Rods",
"question": "Which Rods Brand?",
"question_id": 2,
"options": [
{"value": 3, "label": "Trabucco"},
{"value": 5, "label": "Yuki"},
{"value": 1, "label": "Shimano"},
{"value": 4, "label": "Daiwa"},
{"value": 2, "label": "Temple Reef"},
],
},
]
pprint(delete_keys(options, {"question_id"}))
outputs
[{'label': 'Paints',
'options': [{'label': 'Glidden',
'options': [{'label': 'No', 'value': 10000},
{'label': 'Yes', 'value': 10001}],
'question': 'Is it Glidden Paint?',
'value': 2}],
'question': 'Which Paint Brand?',
'question_type_id': 2,
'value': 1},
{'label': 'Rods',
'options': [{'label': 'Trabucco', 'value': 3},
{'label': 'Yuki', 'value': 5},
{'label': 'Shimano', 'value': 1},
{'label': 'Daiwa', 'value': 4},
{'label': 'Temple Reef', 'value': 2}],
'question': 'Which Rods Brand?',
'value': 4}]

Adding new pairs to a json file

I have a json file I need to add pairs to, I convert it into a dict, but now I need to put my new values in a specific place.
This is some of the json file I convert:
"rootObject": {
"id": "6ff0010c-00fe-485b-b695-4ddd6aca4dcd",
"type": "IDO_GEAR",
"children": [
{
"id": "1dd94d1a-e52d-40b3-a82b-6db02a8fbbab",
"type": "IDO_SYSTEM_LOADCASE",
"children": [],
"childList": "SYSTEMLOADCASE",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "1dd94d1a-e52d-40b3-a82b-6db02a8fbbab"
},
{
"name": "IDCO_DESIGNATION",
"value": "Lastfall 1"
},
{
"name": "IDSLC_TIME_PORTION",
"value": 100
},
{
"name": "IDSLC_DISTANCE_PORTION",
"value": 100
},
{
"name": "IDSLC_OPERATING_TIME_IN_HOURS",
"value": 1
},
{
"name": "IDSLC_OPERATING_TIME_IN_SECONDS",
"value": 3600
},
{
"name": "IDSLC_OPERATING_REVOLUTIONS",
"value": 1
},
{
"name": "IDSLC_OPERATING_DISTANCE",
"value": 1
},
{
"name": "IDSLC_ACCELERATION",
"value": 9.81
},
{
"name": "IDSLC_EPSILON_X",
"value": 0
},
{
"name": "IDSLC_EPSILON_Y",
"value": 0
},
{
"name": "IDSLC_EPSILON_Z",
"value": 0
},
{
"name": "IDSLC_CALCULATION_WITH_OWN_WEIGHT",
"value": "CO_CALCULATION_WITHOUT_OWN_WEIGHT"
},
{
"name": "IDSLC_CALCULATION_WITH_TEMPERATURE",
"value": "CO_CALCULATION_WITH_TEMPERATURE"
},
{
"name": "IDSLC_FLAG_FOR_LOADCASE_CALCULATION",
"value": "LB_CALCULATE_LOADCASE"
},
{
"name": "IDSLC_STATUS_OF_LOADCASE_CALCULATION",
"value": false
}
I want to add somthing like ENTRY_ONE and ENTRY_TWO like this:
"rootObject": {
"id": "6ff0010c-00fe-485b-b695-4ddd6aca4dcd",
"type": "IDO_GEAR",
"children": [
{
"id": "1dd94d1a-e52d-40b3-a82b-6db02a8fbbab",
"type": "IDO_SYSTEM_LOADCASE",
"children": [],
"childList": "SYSTEMLOADCASE",
"properties": [
{
"name": "IDCO_IDENTIFICATION",
"value": "1dd94d1a-e52d-40b3-a82b-6db02a8fbbab"
},
{
"name": "IDCO_DESIGNATION",
"value": "Lastfall 1"
},
{
"name": "IDSLC_TIME_PORTION",
"value": 100
},
{
"name": "IDSLC_DISTANCE_PORTION",
"value": 100
},
{
"name": "ENTRY_ONE",
"value": 100
},
{
"name": "ENTRY_TWO",
"value": 55
},
{
"name": "IDSLC_OPERATING_TIME_IN_HOURS",
"value": 1
},
{
"name": "IDSLC_OPERATING_TIME_IN_SECONDS",
"value": 3600
},
{
"name": "IDSLC_OPERATING_REVOLUTIONS",
"value": 1
},
{
"name": "IDSLC_OPERATING_DISTANCE",
"value": 1
},
{
"name": "IDSLC_ACCELERATION",
"value": 9.81
},
{
"name": "IDSLC_EPSILON_X",
"value": 0
},
{
"name": "IDSLC_EPSILON_Y",
"value": 0
},
{
"name": "IDSLC_EPSILON_Z",
"value": 0
},
{
"name": "IDSLC_CALCULATION_WITH_OWN_WEIGHT",
"value": "CO_CALCULATION_WITHOUT_OWN_WEIGHT"
},
{
"name": "IDSLC_CALCULATION_WITH_TEMPERATURE",
"value": "CO_CALCULATION_WITH_TEMPERATURE"
},
{
"name": "IDSLC_FLAG_FOR_LOADCASE_CALCULATION",
"value": "LB_CALCULATE_LOADCASE"
},
{
"name": "IDSLC_STATUS_OF_LOADCASE_CALCULATION",
"value": false
}
How can I add the entries so that they are under the IDCO_IDENTIFICATION tag, and not under the rootObject?
The way I see your json file, it WOULD be under rootObject as EVERYTHING is under that key. There's quite a few closing brackets and braces missing.
So I can only assume you are meaning you want it directly under IDCO_IDENTIFICATION (which is nested under rootObject). But that doesn't match what you have as your example output either. You have the new ENTRY_ONE and ENTRY_TWO within the properties, within the children, within the rootObject, not "under" IDCO_IDENTIFICATION. So I'm going to follow what you are asking for from your example output.
import json
with open('C:/test.json') as f:
data = json.load(f)
new_elements = [{"name":"ENTRY_ONE", "value":100},
{"name":"ENTRY_TWO", "value":55}]
for each in new_elements:
data['rootObject']['children'][0]['properties'].append(each)
with open('C:/test.json', 'w') as f:
json.dump(data, f)
Output:
import pprint
pprint.pprint(data)
{'rootObject': {'children': [{'childList': 'SYSTEMLOADCASE',
'children': [],
'id': '1dd94d1a-e52d-40b3-a82b-6db02a8fbbab',
'properties': [{'name': 'IDCO_IDENTIFICATION',
'value': '1dd94d1a-e52d-40b3-a82b-6db02a8fbbab'},
{'name': 'IDCO_DESIGNATION',
'value': 'Lastfall 1'},
{'name': 'IDSLC_TIME_PORTION',
'value': 100},
{'name': 'IDSLC_DISTANCE_PORTION',
'value': 100},
{'name': 'IDSLC_OPERATING_TIME_IN_HOURS',
'value': 1},
{'name': 'IDSLC_OPERATING_TIME_IN_SECONDS',
'value': 3600},
{'name': 'IDSLC_OPERATING_REVOLUTIONS',
'value': 1},
{'name': 'IDSLC_OPERATING_DISTANCE',
'value': 1},
{'name': 'IDSLC_ACCELERATION',
'value': 9.81},
{'name': 'IDSLC_EPSILON_X',
'value': 0},
{'name': 'IDSLC_EPSILON_Y',
'value': 0},
{'name': 'IDSLC_EPSILON_Z',
'value': 0},
{'name': 'IDSLC_CALCULATION_WITH_OWN_WEIGHT',
'value': 'CO_CALCULATION_WITHOUT_OWN_WEIGHT'},
{'name': 'IDSLC_CALCULATION_WITH_TEMPERATURE',
'value': 'CO_CALCULATION_WITH_TEMPERATURE'},
{'name': 'IDSLC_FLAG_FOR_LOADCASE_CALCULATION',
'value': 'LB_CALCULATE_LOADCASE'},
{'name': 'IDSLC_STATUS_OF_LOADCASE_CALCULATION',
'value': False},
{'name': 'ENTRY_ONE',
'value': 100},
{'name': 'ENTRY_TWO',
'value': 55}],
'type': 'IDO_SYSTEM_LOADCASE'}],
'id': '6ff0010c-00fe-485b-b695-4ddd6aca4dcd',
'type': 'IDO_GEAR'}}

Categories