Remove dictionaries from a list of dictionaries under certain conditions - python

I have a very large list of dictionaries that looks like this (I show a simplified version):
list_of_dicts:
[{'ID': 1234,
'Name': 'Bobby',
'Animal': 'Dog',
'About': [{'ID': 5678, 'Food': 'Dog Food'}]},
{'ID': 5678, 'Food': 'Dog Food'},
{'ID': 91011,
'Name': 'Jack',
'Animal': 'Bird',
'About': [{'ID': 1996, 'Food': 'Seeds'}]},
{'ID': 1996, 'Food': 'Seeds'},
{'ID': 2007,
'Name': 'Bean',
'Animal': 'Cat',
'About': [{'ID': 2008, 'Food': 'Fish'}]},
{'ID': 2008, 'Food': 'Fish'}]
I'd like to remove the top-level dictionaries whose IDs already appear nested inside an 'About' entry. For example, ID 2008 already appears in a nested 'About' value, so I'd like to remove the top-level dictionary with that ID.
I have some code that does this, and for this small example it works. However, the amount of data I have is much larger, and remove() does not seem to delete all of the entries unless I run the loop a couple of times.
Any suggestions on how I can do this better?
My code:
nested_ids = [5678, 1996, 2008]
for i in list_of_dicts:
    if i['ID'] in nested_ids:
        list_of_dicts.remove(i)
Desired output:
[{'ID': 1234,
'Name': 'Bobby',
'Animal': 'Dog',
'About': [{'ID': 5678, 'Food': 'Dog Food'}]},
{'ID': 91011,
'Name': 'Jack',
'Animal': 'Bird',
'About': [{'ID': 1996, 'Food': 'Seeds'}]},
{'ID': 2007,
'Name': 'Bean',
'Animal': 'Cat',
'About': [{'ID': 2008, 'Food': 'Fish'}]}]

You can use a list comprehension:
cleaned_list = [d for d in list_of_dicts if d['ID'] not in nested_ids]
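If you'd rather not maintain nested_ids by hand, you can collect the nested IDs from the 'About' entries first. This is just a sketch, assuming every nested dict has an 'ID' key and that top-level dicts may or may not have an 'About' key:

# gather every ID that appears inside an 'About' list
nested_ids = {about['ID']
              for d in list_of_dicts
              for about in d.get('About', [])}
# keep only the top-level dicts whose ID is not nested anywhere
cleaned_list = [d for d in list_of_dicts if d['ID'] not in nested_ids]

Using a set here also makes the membership test O(1) per element, which matters for a very large list.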

This happens because you are modifying the list while iterating over it. To avoid that, copy the entries you want to keep into a new list, as follows:
filtered_dicts = []
nested_ids = [5678, 1996, 2008]
for curr in list_of_dicts:
    if curr['ID'] not in nested_ids:
        filtered_dicts.append(curr)

The problem is that when you remove a member of a list, the indexes of everything after it shift down by one, so the iteration skips the element that slides into the removed item's slot. One fix is to process the list from the back, so removals don't affect the positions you haven't visited yet.
So all you need to do is iterate over the list in reverse order:
for i in list_of_dicts[::-1]:
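For completeness, a sketch of the full reversed loop (reusing nested_ids from the question); list_of_dicts[::-1] is a reversed copy, so remove() can safely modify the original while you iterate:

for i in list_of_dicts[::-1]:
    if i['ID'] in nested_ids:
        list_of_dicts.remove(i)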

Related

How to convert a list of dicts to a dict when some keys are repeated

I have list of dictionaries, similar to this:
results=[{'year':2020,'id':'321abc','color':'blue'},
{'year':2020,'id':'412nbg','color':'brown'},
{'year':2021,'id':'klp54','color':'yellow'}...]
I want to reorganize it to be one dictionary instead of a list of dictionaries, with the year as the key and all of the ids and colors as values. I saw this post, which has a similar problem; however, the keys there were unique (names in the referenced post), while mine repeat (years in my example).
So in the end maybe it will be a nested dictionary, something like this:
results = {2020: {'id': {...}, 'color': {...}}, 2021: {'id': {...}, 'color': {...}}, ...}
(where each year has many ids and colors)
How can I do this ?
With itertools.groupby and a few list and dictionary comprehensions this is straightforward. Remember that you need to sort first, or groupby will not work the way you want; you will get repeated groups.
results = [{'year': 2020, 'id': '321abc', 'color': 'blue'},
{'year': 2020, 'id': '412nbg', 'color': 'brown'},
{'year': 2021, 'id': 'klp54', 'color': 'yellow'}]
from itertools import groupby
from operator import itemgetter
year = itemgetter('year')
r = sorted(results, key=year)
# [{'year': 2020, 'id': '321abc', 'color': 'blue'},
# {'year': 2020, 'id': '412nbg', 'color': 'brown'},
# {'year': 2021, 'id': 'klp54', 'color': 'yellow'}]
g = groupby(r, key=year)
# <itertools.groupby object at 0x7f9f2a232138>
{k: [{'id': x['id'], 'color': x['color']} for x in v]
for k, v in g}
# {2020: [{'id': '321abc', 'color': 'blue'},
# {'id': '412nbg', 'color': 'brown'}],
# 2021: [{'id': 'klp54', 'color': 'yellow'}]}
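If you'd rather avoid the sort, a plain loop with dict.setdefault builds the same grouping in a single pass. This is just a sketch, assuming every entry has 'year', 'id' and 'color' keys:

grouped = {}
for row in results:
    grouped.setdefault(row['year'], []).append({'id': row['id'], 'color': row['color']})
# {2020: [{'id': '321abc', 'color': 'blue'}, {'id': '412nbg', 'color': 'brown'}],
#  2021: [{'id': 'klp54', 'color': 'yellow'}]}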

Adding key value pair to a dictionary list

I'm trying to add a third child to the below dictionary:
person = {'Name':'Jane', 'Age':32, 'Allergies':
[{'Allergen':'Dust'},
{'Allergen':'Feathers'},
{'Allergen':'Strawberries'}],
'Children':
[{'Name':'Ben', 'Age': 6}, {'Name':'Elly', 'Age': 8}]}
print(person)
{'Name': 'Jane', 'Age': 32, 'Allergies': [{'Allergen': 'Dust'}, {'Allergen': 'Feathers'}, {'Allergen': 'Strawberries'}], 'Children': [{'Name': 'Ben', 'Age': 6}, {'Name': 'Elly', 'Age': 8}]}
When I try person.update({'Children': [{'Name': 'Hanna', 'Age': 0}]}), it replaces all of the children with just that one. Nothing else I have tried works either. Any suggestions?
dict.update replaces the whole value stored under a key, so updating 'Children' swaps out the entire list. Since 'Allergies' and 'Children' are lists, use the list's append method to add to that specific list instead:
person["Allergies"].append({"Allergen": "gluten"})
# or
person["Children"].append({"name":"Hannah", "age": 0})

Create a List of Dictionaries using Comprehension

I currently have a flat list of strings that I am trying to turn into dictionary objects stored within a list.
In attempting to create this list of dictionaries, I repeatedly end up with one big dictionary instead of a separate dictionary per item.
My code:
clothes_dict = [{clothes_list[i]: clothes_list[i + 1] for i in range(0, len(clothes_list), 2)}]
The error (All items being merged into one dictionary):
clothes_dict = {list: 1} [{'name': 'Tom', 'age': 10}, {'name': 'Mark', 'age': 5}, {'name': 'Pam', 'age': 7}]
0 = {dict: 2} {'name': 'Tom', 'age': 10}, {dict: 2} {'name': 'Mark', 'age': 5}, {'name': 'Pam', 'age': 7}
Target Output (All Items being created into separate dictionaries within the single list):
clothes_dict = {list: 3} [{'name': 'Tom', 'age': 10}, {'name': 'Mark', 'age': 5}, {'name': 'Pam', 'age': 7}]
0 = {dict: 2} {'name': 'Tom', 'age': 10}
1 = {dict: 2} {'name': 'Mark', 'age': 5}
2 = {dict: 2} {'name': 'Pam', 'age': 7}
I am attempting to make each entry within the list a new dictionary, in the same form as the target output shown above.
clothes_dict = [{clothes_list[i]: clothes_list[i + 1]} for i in range(0, len(clothes_list), 2)]
You misplaced the closing curly brace '}' in your comprehension by putting it at the very end, which means you were performing a dictionary comprehension rather than a list comprehension whose items are dictionaries.
Your code creates a list with a single dictionary:
clothes_dict = [{clothes_list[i]: clothes_list[i + 1] for i in range(0, len(clothes_list), 2)}]
If you (for some reason) want a list of dictionaries with single entries:
clothes_dict = [{clothes_list[i]: clothes_list[i + 1]} for i in range(0, len(clothes_list), 2)]
However, it seems to me that this may be a bit of an XY problem - in what case is a list of single-entry dictionaries the required format? Why not use a list of tuples for example?
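If a list of (key, value) pairs would do, here is a sketch of the tuple alternative; the sample clothes_list is hypothetical, just a flat sequence of alternating keys and values:

clothes_list = ['name', 'Tom', 'age', 10]   # hypothetical sample data
clothes_pairs = list(zip(clothes_list[::2], clothes_list[1::2]))
# [('name', 'Tom'), ('age', 10)]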

What is the most efficient way to create nested dictionaries in Python?

I currently have over 10k elements in my list of dictionaries, which looks like this:
cars = [{'model': 'Ford', 'year': 2010},
{'model': 'BMW', 'year': 2019},
...]
And I have a second dictionary:
car_owners = [{'model': 'BMW', 'name': 'Sam', 'age': 34},
{'model': 'BMW', 'name': 'Taylor', 'age': 34},
.....]
However, I want to join the two together into something like:
combined = [{'model': 'BMW',
             'year': 2019,
             'owners': [{'name': 'Sam', 'age': 34}, ...]
            }]
What is the best way to combine them? For the moment I am using a for loop, but I feel like there are more efficient ways of doing this.
** This is just a fake example of the data; the real data is a lot more complex, but this gives the idea of what I want to achieve.
Iterate over the first list, building a dict keyed by model; then, for each entry in the second list, look up the same model key and update the matching entry if it is found:
cars = [{'model': 'Ford', 'year': 2010}, {'model': 'BMW', 'year': 2019}]
car_owners = [{'model': 'BMW', 'name': 'Sam', 'age': 34}, {'model': 'Ford', 'name': 'Taylor', 'age': 34}]
dd = {x['model']:x for x in cars}
for item in car_owners:
    key = item['model']
    if key in dd:
        del item['model']
        dd[key].update({'car_owners': item})
    else:
        dd[key] = item
print(list(dd.values()))
OUTPUT:
[{'model': 'BMW', 'year': 2019, 'car_owners': {'name': 'Sam', 'age': 34}},
 {'model': 'Ford', 'year': 2010, 'car_owners': {'name': 'Taylor', 'age': 34}}]
Really, what you want performance-wise is a dictionary with the model as the key. That way you have O(1) lookup and can quickly get the requested element, instead of looping through the whole list each time to find the car with model x.
If you're starting off with lists, I'd first create dictionaries, and then everything is O(1) from there on out.
models_to_cars = {car['model']: car for car in cars}
models_to_owners = {}
for car_owner in car_owners:
    models_to_owners.setdefault(car_owner['model'], []).append(car_owner)

combined = [{
    **car,
    'owners': models_to_owners.get(model, [])
} for model, car in models_to_cars.items()]
Then you'd have
combined = [{'model': 'BMW',
             'year': 2019,
             'owners': [{'name': 'Sam', 'age': 34}, ...]
            }]
as you wanted
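If you mostly look things up by model afterwards, it can be worth keeping the two model-keyed dicts around instead of (or as well as) the combined list. A small usage sketch, reusing the names defined above:

bmw = models_to_cars['BMW']                    # {'model': 'BMW', 'year': 2019}
bmw_owners = models_to_owners.get('BMW', [])   # list of owner dicts for that model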

How to categorize a list of dictionaries based on the value of a key in Python efficiently?

I have a list of dictionaries in Python which I want to categorize based on the value of a key that exists in all of the dictionaries, and then process each category separately. I don't know what the values are; I just know that this special key exists. Here's the list:
dictList = [
{'name': 'name1', 'type': 'type1', 'id': '14464'},
{'name': 'name2', 'type': 'type1', 'id': '26464'},
{'name': 'name3', 'type': 'type3', 'id': '36464'},
{'name': 'name4', 'type': 'type5', 'id': '43464'},
{'name': 'name5', 'type': 'type2', 'id': '68885'}
]
This is the code I currently use:
while len(dictList):
    category = [l for l in dictList if l['type'] == dictList[0]['type']]
    processingMethod(category)
    for item in category:
        dictList.remove(item)
Iterating over the above list this way gives the following result:
Iteration 1:
category = [
{'name': 'name1', 'type': 'type1', 'id': '14464'},
{'name': 'name2', 'type': 'type1', 'id': '26464'},
]
Iteration 2:
category = [
{'name': 'name3', 'type': 'type3', 'id': '36464'}
]
Iteration 3:
category = [
{'name': 'name4', 'type': 'type5', 'id': '43464'}
]
Iteration 4:
category = [
{'name': 'name5', 'type': 'type2', 'id': '68885'}
]
Each time, I get a category, process it, and finally remove the processed items so that I iterate only over what remains, until there is nothing left. Any idea how to make this better?
Your code can be rewritten using itertools.groupby (note that groupby only groups consecutive items, so sort dictList by the same key first if equal types are not guaranteed to be adjacent):
import itertools

for _, category in itertools.groupby(dictList, key=lambda item: item['type']):
    processingMethod(list(category))
Or, if processingMethod can work with an iterable:
for _, category in itertools.groupby(dictList, key=lambda item: item['type']):
    processingMethod(category)
If l['type'] is hashable for each l in dictList, here's a possible, somewhat-elegant solution:
bins = {}
for l in dictList:
    if l['type'] in bins:
        bins[l['type']].append(l)
    else:
        bins[l['type']] = [l]

for category in bins.values():
    processingMethod(category)
The idea is that first, we'll sort all the ls into bins, using l['type'] as the key; second, we'll process each bin.
If l['type'] isn't guaranteed to be hashable for each l in dictList, the approach is essentially the same, but we'll have to use a list of tuples instead of the dict, which means this is a bit less efficient:
bins = []
for l in dictList:
    for bin in bins:
        if bin[0] == l['type']:
            bin[1].append(l)
            break
    else:
        bins.append((l['type'], [l]))

for _, category in bins:
    processingMethod(category)
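For what it's worth, collections.defaultdict makes the hashable-key version a bit shorter; this is an equivalent sketch of the same binning idea, not a different algorithm:

from collections import defaultdict

bins = defaultdict(list)
for l in dictList:
    bins[l['type']].append(l)

for category in bins.values():
    processingMethod(category)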
