Extracting duplicate from array of dictionaries

Hi, I have a list of dicts that looks like this:
books = [
{'Serial Number': '3333', 'size':'500', 'Book':'The Hobbit'},
{'Serial Number': '2222', 'size':'100', 'Book':'Lord of the Rings'},
{'Serial Number': '1111', 'size':'200', 'Book':'39 Steps'},
{'Serial Number': '3333', 'size':'600', 'Book':'100 Dalmations'},
{'Serial Number': '2222', 'size':'800', 'Book':'Woman in Black'},
{'Serial Number': '6666', 'size':'1000', 'Book':'The Hunt for Red October'},
]
I need to create a separate array of dicts that looks like this based on duplicate serial numbers:
duplicates = [
'3333', [{'Book':'The Hobbit'}, {'Book':'100 Dalmations'}],
'2222', [{'Book':'Lord of the Rings'}, {'Book':'Woman in Black'}]
]
Is there an easy way to do this using a built-in function? If not, what's the best way to achieve this?

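The most Pythonic way I can think of:
from collections import defaultdict
res = defaultdict(list)
for d in books:
    # pop() removes 'Serial Number' from each dict, so the dicts in books are modified
    res[d.pop('Serial Number')].append(d)
# keep only the serial numbers that appear more than once
print({k: v for k, v in res.items() if len(v) > 1})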
Output:
{'2222': [{'Book': 'Lord of the Rings', 'size': '100'},
{'Book': 'Woman in Black', 'size': '800'}],
'3333': [{'Book': 'The Hobbit', 'size': '500'},
{'Book': '100 Dalmations', 'size': '600'}]}
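If the original list has to stay untouched and only the 'Book' field is wanted, as in the question, a non-mutating variant could look like the sketch below. The names by_serial and duplicates are only illustrative, and the result is a dict keyed by serial number rather than the flat list shown in the question.

```python
from collections import defaultdict

books = [
    {'Serial Number': '3333', 'size': '500', 'Book': 'The Hobbit'},
    {'Serial Number': '2222', 'size': '100', 'Book': 'Lord of the Rings'},
    {'Serial Number': '1111', 'size': '200', 'Book': '39 Steps'},
    {'Serial Number': '3333', 'size': '600', 'Book': '100 Dalmations'},
    {'Serial Number': '2222', 'size': '800', 'Book': 'Woman in Black'},
    {'Serial Number': '6666', 'size': '1000', 'Book': 'The Hunt for Red October'},
]

# Group only the 'Book' field under each serial number, without pop()
by_serial = defaultdict(list)
for d in books:
    by_serial[d['Serial Number']].append({'Book': d['Book']})

# Keep only the serial numbers that occur more than once
duplicates = {k: v for k, v in by_serial.items() if len(v) > 1}
print(duplicates)
```

Because nothing is popped, books still contains every key afterwards.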

Related

convert values of list of dictionaries with different keys to new list of dicts

I have this list of dicts:
[{'name': 'aly', 'age': '104'},
{'name': 'Not A name', 'age': '99'}]
I want the name value to be the key and the age value to be the value of new dict.
Expected output:
['aly' : '104', 'Not A name': '99']

If you want the output to be a single dict, you can use a dict comprehension (persons is the list of dicts from the question):
output = {p["name"]: p["age"] for p in persons}
# {'aly': '104', 'Not A name': '99'}
If you want the output to be a list of dicts, you can use a list comprehension:
output = [{p["name"]: p["age"]} for p in persons]
# [{'aly': '104'}, {'Not A name': '99'}]

You can initialize a new dict, iterate through the list, and add each entry to it:
lst = [{'name': 'aly', 'age': '104'}, {'name': 'Not A name', 'age': '99'}]
newdict = {}
for item in lst:
    newdict[item['name']] = item['age']

This will help you:
d = [
{'name': 'aly', 'age': '104'},
{'name': 'Not A name', 'age': '99'}
]
dict([i.values() for i in d])
# Result
{'aly': '104', 'Not A name': '99'}
# If you want a list containing that dictionary, use this
[dict([i.values() for i in d])]
# Result
[{'aly': '104', 'Not A name': '99'}]
Just a side note:
Your expected output looks like a list (because of the [ ]), but the items inside it are key: value pairs, which is not valid in a list.
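One caveat on the values()-based approach above: it relies on every dict listing 'name' before 'age'. The sketch below uses a made-up list, people, whose second dict has its keys in the other order, to show the difference from explicit key access.

```python
people = [
    {'name': 'aly', 'age': '104'},
    {'age': '99', 'name': 'Not A name'},  # same keys, different order
]

# Relies on value order: the second pair comes out reversed
by_values = dict([list(p.values()) for p in people])

# Explicit key access does not depend on key order
by_keys = {p['name']: p['age'] for p in people}

print(by_values)  # {'aly': '104', '99': 'Not A name'}
print(by_keys)    # {'aly': '104', 'Not A name': '99'}
```

If the dicts might come from different sources, looking up 'name' and 'age' by key is the safer choice.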

Here is an easy way to build the new list of dicts (d is the original list):
res = list(map(lambda data: {data['name']: data['age']}, d))
print(res)

What is the most efficient way to create nested dictionaries in Python?

I currently have a list of over 10k dictionaries that looks like this:
cars = [{'model': 'Ford', 'year': 2010},
{'model': 'BMW', 'year': 2019},
...]
And I have a second list:
car_owners = [{'model': 'BMW', 'name': 'Sam', 'age': 34},
{'model': 'BMW', 'name': 'Taylor', 'age': 34},
.....]
However, I want to join the two together into something like:
combined = [{'model': 'BMW',
             'year': 2019,
             'owners': [{'name': 'Sam', 'age': 34}, ...]
}]
What is the best way to combine them? For the moment I am using a For loop but I feel like there are more efficient ways of dealing with this.
** This is just a fake example of data, the one I have is a lot more complex but this helps give the idea of what I want to achieve

Iterate over the first list, building a dict that maps each model to its car. Then, for each entry in the second list, look up the same key (model) and update the matching car if it is found:
cars = [{'model': 'Ford', 'year': 2010}, {'model': 'BMW', 'year': 2019}]
car_owners = [{'model': 'BMW', 'name': 'Sam', 'age': 34}, {'model': 'Ford', 'name': 'Taylor', 'age': 34}]
dd = {x['model']: x for x in cars}
for item in car_owners:
    key = item['model']
    if key in dd:
        del item['model']
        # note: update() replaces any owner stored earlier for the same model
        dd[key].update({'car_owners': item})
    else:
        dd[key] = item
print(list(dd.values()))
OUTPUT (on Python 3.7+, where dicts keep insertion order):
[{'model': 'Ford', 'year': 2010, 'car_owners': {'name': 'Taylor', 'age': 34}},
 {'model': 'BMW', 'year': 2019, 'car_owners': {'name': 'Sam', 'age': 34}}]

Really, what you want performance-wise is dictionaries keyed by model. That way you have O(1) lookup and can quickly get the requested element (instead of looping each time to find the car with model x).
If you're starting off with lists, I'd first create the dictionaries; every lookup is O(1) from there on.
models_to_cars = {car['model']: car for car in cars}
models_to_owners = {}
for car_owner in car_owners:
    # group owners by model; each owner dict still carries its 'model' key
    models_to_owners.setdefault(car_owner['model'], []).append(car_owner)
combined = [{
    **car,
    'owners': models_to_owners.get(model, [])
} for model, car in models_to_cars.items()]
Then you'd have
combined = [{'model': 'BMW',
'year': 2019,
'owners': [{'name': 'Sam', 'age': 34}, ...]
}]
as you wanted
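For reference, here is a self-contained sketch of the same idea that also drops the 'model' key from each owner, so the result matches the shape asked for in the question. It swaps setdefault for collections.defaultdict, and the names owners_by_model and details are only illustrative.

```python
from collections import defaultdict

cars = [{'model': 'Ford', 'year': 2010}, {'model': 'BMW', 'year': 2019}]
car_owners = [
    {'model': 'BMW', 'name': 'Sam', 'age': 34},
    {'model': 'BMW', 'name': 'Taylor', 'age': 34},
]

# One pass over the owners: group them by model, dropping the 'model' key
owners_by_model = defaultdict(list)
for owner in car_owners:
    details = {k: v for k, v in owner.items() if k != 'model'}
    owners_by_model[owner['model']].append(details)

# One pass over the cars: attach the (possibly empty) list of owners
combined = [
    {**car, 'owners': owners_by_model.get(car['model'], [])}
    for car in cars
]
print(combined)
```

Each list is traversed once, so the work grows linearly with the number of cars plus owners, and neither input list is modified.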

How to sort a list which holds nested lists, which further holds dictionaries. the aim is to sort the main list by a dictionary VALUE

I am looking to sort a list similar to this using the itemgetter function. I am trying to arrange the list in ascending order by the value stored under the dictionary key "Special number".
Because the nested lists contain dictionaries, I am finding this difficult to do.
from operator import itemgetter
lists = [
[{'time': str, 'ask price': str},
{'ticker': 'BB','Special number': 10}],
[{'time': str , 'price': str},
{'ticker': 'AA', 'Special number': 5}
]
]
I tried to use:
gg = lists.sort(key=itemgetter((1)['special number']))
print(gg)
Many Thanks!

I don't think it is necessary to use itemgetter() here. If the dictionary that has the key 'Special number' is always at index 1 of each inner list, then it is enough to do the following:
sorted_list = sorted(lists, key=lambda x: x[1]['Special number'])
print(sorted_list)
output
[[{'price': <class 'str'>, 'time': <class 'str'>},
{'Special number': 5, 'ticker': 'AA'}],
[{'ask price': <class 'str'>, 'time': <class 'str'>},
{'Special number': 10, 'ticker': 'BB'}]]
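The attempt in the question fails for two separate reasons: itemgetter(1) returns a callable, so it cannot be subscripted with a key, and list.sort() sorts in place and returns None, so gg would be None even with a valid key function. If you still want to use itemgetter, a sketch could look like this (get_inner and get_number are just illustrative names):

```python
from operator import itemgetter

lists = [
    [{'time': str, 'ask price': str}, {'ticker': 'BB', 'Special number': 10}],
    [{'time': str, 'price': str}, {'ticker': 'AA', 'Special number': 5}],
]

# itemgetter(1) picks the second element of each inner list;
# itemgetter('Special number') then reads the value from that dict
get_inner = itemgetter(1)
get_number = itemgetter('Special number')

# list.sort() sorts in place and returns None, so do not assign its result
lists.sort(key=lambda row: get_number(get_inner(row)))

print([row[1]['ticker'] for row in lists])  # ['AA', 'BB']
```

Note that the key is case-sensitive: 'special number' in the question does not match 'Special number' in the data.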

Creating a list of dictionaries from separate lists

I honestly expected this to have been asked previously, but after 30 minutes of searching I haven't had any luck.
Say we have multiple lists, each of the same length, each one containing a different type of data about something. We would like to turn this into a list of dictionaries with the data type as the key.
input:
data = [['tom', 'jim', 'mark'], ['Toronto', 'New York', 'Paris'], [1990,2000,2000]]
data_types = ['name', 'place', 'year']
output:
travels = [{'name':'tom', 'place': 'Toronto', 'year':1990},
{'name':'jim', 'place': 'New York', 'year':2000},
{'name':'mark', 'place': 'Paris', 'year':2000}]
This is fairly easy to do with index-based iteration:
travels = []
for d_index in range(len(data[0])):
    travel = {}
    for dt_index in range(len(data_types)):
        travel[data_types[dt_index]] = data[dt_index][d_index]
    travels.append(travel)
But this is 2017! There has to be a more concise way to do this! We have map, flatmap, reduce, list comprehensions, numpy, lodash, zip. Except I can't seem to compose these cleanly into this particular transformation. Any ideas?

You can use a list comprehension with zip after transposing your dataset:
>>> [dict(zip(data_types, x)) for x in zip(*data)]
[{'place': 'Toronto', 'name': 'tom', 'year': 1990},
{'place': 'New York', 'name': 'jim', 'year': 2000},
{'place': 'Paris', 'name': 'mark', 'year': 2000}]
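To see why this works, it can help to split the one-liner into its two steps. The sketch below uses the data from the question; rows is just an illustrative name for the intermediate result.

```python
data = [['tom', 'jim', 'mark'], ['Toronto', 'New York', 'Paris'], [1990, 2000, 2000]]
data_types = ['name', 'place', 'year']

# Step 1: zip(*data) transposes, giving one tuple per record
# instead of one list per field
rows = list(zip(*data))
print(rows)

# Step 2: pair each field name with the matching value in the record
travels = [dict(zip(data_types, row)) for row in rows]
print(travels)
```

zip stops at the shortest input, so if the lists are not all the same length the extra values are silently dropped.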

convert array to dict

I want to convert a list to a list of dictionaries:
products=[['1','product 1'],['2','product 2']]
arr=[]
vals={}
for product in products:
    vals['id']=product[0]
    vals['name']=product[1]
    arr.append(vals)
print(arr)
The result is
[{'id': '2', 'name': 'product 2'}, {'id': '2', 'name': 'product 2'}]
But I want something thing like that:
[{'id': '1', 'name': 'product 1'}, {'id': '2', 'name': 'product 2'}]

What you need to do is create a new dictionary on each iteration of the loop.
products=[['1','product 1'],['2','product 2']]
arr=[]
for product in products:
    vals = {}  # a fresh dict for every product
    vals['id']=product[0]
    vals['name']=product[1]
    arr.append(vals)
print(arr)
When you append an object such as a dictionary to a list, Python does not make a copy first; it appends a reference to that exact object. So if you add dict1 to a list and then change dict1, the list's contents change too. For that reason, you should create a new dictionary each time, as above.
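A small sketch makes the aliasing visible. It reproduces the original mistake with a single dict, here called shared, and then checks that both list entries are the very same object (the names shared and aliased are only illustrative).

```python
shared = {}
aliased = []
for product in [['1', 'product 1'], ['2', 'product 2']]:
    shared['id'] = product[0]  # overwrites the same dict every time
    aliased.append(shared)     # appends a reference, not a copy

print(aliased[0] is aliased[1])  # True: both entries are one object
print(aliased)                   # [{'id': '2'}, {'id': '2'}]
```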

For simplicity's sake, you could also make this a one-liner:
products=[['1','product 1'],['2','product 2']]
arr= [{"id":item[0], "name":item[1]} for item in products]
Which yields:
[{'id': '1', 'name': 'product 1'}, {'id': '2', 'name': 'product 2'}]

products=[['1','product 1'],['2','product 2']]
arr=[{'id':a[0], 'name': a[1]} for a in products]
print(arr)
This would also work.

products=[['1','product 1'],['2','product 2']]
arr=[]
for product in products:
    vals = {}
    for i, n in enumerate(['id', 'name']):  # add further field names (....) to make it more dynamic
        vals[n]=product[i]
    arr.append(vals)
Or just use [0] and [1], as stated in the previous post.
