How do I convert a list of dicts to a dict? Below is the list of dicts:
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
to
data = {'John Doe': {'name': 'John Doe', 'age': 37, 'sex': 'M'},
'Lisa Simpson': {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
'Bill Clinton': {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}}
A possible solution using names as the new keys:
new_dict = {}
for item in data:
    name = item['name']
    new_dict[name] = item
With Python 3.x (and 2.7) you can also use a dict comprehension for the same approach, which is more concise:
new_dict = {item['name']:item for item in data}
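One thing to watch with this approach: if two entries share a name, the later one silently replaces the earlier one. Here is a minimal sketch of how that plays out, plus one way to keep every entry by grouping them into lists. The sample `people` list and the names `by_name` and `grouped` are made up for illustration and are not part of the question.

```python
# Hypothetical sample data with a repeated name
people = [
    {'name': 'John Doe', 'age': 37},
    {'name': 'Lisa Simpson', 'age': 17},
    {'name': 'John Doe', 'age': 52},
]

# Plain dict comprehension: the last 'John Doe' entry overwrites the first
by_name = {item['name']: item for item in people}

# Grouping variant: every entry is kept, collected in a list per name
grouped = {}
for item in people:
    grouped.setdefault(item['name'], []).append(item)

print(by_name['John Doe']['age'])
print(len(grouped['John Doe']))
```

If names are guaranteed to be unique, the plain comprehension is enough.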
As suggested in a comment by Paul McGuire, if you don't want the name in the inner dict, you can do:
new_dict = {}
for item in data:
    name = item.pop('name')
    new_dict[name] = item
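Note that `pop` modifies the dicts inside the original list. If the original data should stay untouched, a possible variation is to build new inner dicts that leave out the name. This is only a sketch of that idea, using the data from the question:

```python
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
        {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
        {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]

# Build a fresh inner dict for each entry, skipping the 'name' key,
# so the dicts in `data` are not modified
new_dict = {
    item['name']: {k: v for k, v in item.items() if k != 'name'}
    for item in data
}

print(new_dict['John Doe'])
print(data[0])
```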
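With Python 3.3 and above, you can use ChainMap to merge the dicts into a single dict (for a key that appears in several dicts, the value from the first dict is kept):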
A ChainMap groups multiple dicts or other mappings together to create
a single, updateable view. If no maps are specified, a single empty
dictionary is provided so that a new chain always has at least one
mapping.
from collections import ChainMap
data = dict(ChainMap(*data))
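To make the merging behaviour concrete, here is a small sketch with the data from the question. Because all three dicts share the same keys, only the first dict's values end up in the result, so this does not produce the name-keyed dict asked for. The second example with disjoint keys is hypothetical.

```python
from collections import ChainMap

data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
        {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
        {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]

# Lookups search the mappings in order, so the first dict wins for shared keys
merged = dict(ChainMap(*data))

# With disjoint keys (hypothetical example) every entry is kept
combined = dict(ChainMap({'a': 1}, {'b': 2}))

print(merged)
print(combined)
```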
If the dicts don't share keys, you could use:
dict((key,d[key]) for d in data for key in d)
It is probably better in your case to generate a dict with lists as values:
newdict={}
for k,v in [(key,d[key]) for d in data for key in d]:
    if k not in newdict: newdict[k]=[v]
    else: newdict[k].append(v)
This yields:
>>> newdict
{'age': [37, 17, 57], 'name': ['John Doe', 'Lisa Simpson', 'Bill Clinton'], 'sex': ['M', 'F', 'M']}
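The same dict of lists can probably be built a little more directly with `collections.defaultdict`, which removes the need for the membership check. A minimal sketch (the name `columns` is only for illustration):

```python
from collections import defaultdict

data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
        {'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
        {'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]

# Missing keys start out as an empty list automatically
columns = defaultdict(list)
for d in data:
    for key, value in d.items():
        columns[key].append(value)

# Convert back to a plain dict once the grouping is done
columns = dict(columns)
print(columns['age'])
```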
Try this approach:
{key: val for k in data for key, val in k.items()}
Let's not overcomplicate this:
simple_dictionary = dict(data[0])
Perhaps you want the name to be the key? You don't really specify, since your second example is invalid and not really meaningful.
Note that my example removes the key "name" from the value, which may be desirable (or perhaps not).
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
newdata = {}
for entry in data:
    name = entry.pop('name')  # remove and return the name field to use as a key
    newdata[name] = entry
print(newdata)
##{'Bill Clinton': {'age': 57, 'sex': 'M'},
## 'John Doe': {'age': 37, 'sex': 'M'},
## 'Lisa Simpson': {'age': 17, 'sex': 'F'}}
print(newdata['John Doe']['age'])
## 37
import pandas as pd
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
print(pd.DataFrame(data).to_dict())
My 5 cents, since I didn't like any of the answers:
from functools import reduce
collection = [{'hello': 1}, {'world': 2}]
answer = reduce(lambda aggr, new: aggr.update(new) or aggr, collection, {})
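The `or aggr` part is there because `dict.update` returns `None`, so the lambda has to hand the accumulator back to `reduce` itself. As a rough sketch, the one-liner should be equivalent to a plain loop (the name `merged` is only for illustration):

```python
from functools import reduce

collection = [{'hello': 1}, {'world': 2}]

# update() returns None, so `None or aggr` evaluates to the accumulator
answer = reduce(lambda aggr, new: aggr.update(new) or aggr, collection, {})

# The same merge written as an explicit loop
merged = {}
for d in collection:
    merged.update(d)

print(answer)
print(merged)
```

As with the other merging approaches, this flattens all the dicts into one, so later values replace earlier ones for shared keys.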
Just in case you wanted a functional alternative (also assuming the names are wanted as the new keys), you could do
from toolz.curried import *
data = [{'name': 'John Doe', 'age': 37, 'sex': 'M'},
{'name': 'Lisa Simpson', 'age': 17, 'sex': 'F'},
{'name': 'Bill Clinton', 'age': 57, 'sex': 'M'}]
newdata = pipe(data,
               map(lambda x: {x['name']: dissoc(x, 'name')}),
               lambda x: merge(*x)
)
print(newdata)
Related
I have 2 lists
list1 = ["ben", "tim", "john", "wally"]
list2 = [18,12,34,55]
The output I'm looking for is this:
[{'Name': 'ben', 'Age': 18, 'Name': 'tim', 'Age': 12, 'Name': 'john', 'Age': 34, 'Name': 'wally', 'Age': 55}]
As mentioned in the comments, you can't have duplicate keys in a dictionary; even your output snippet would just return [{'Name': 'wally', 'Age': 55}]
However, {k: v for k, v in zip(list1, list2)} will return
{'ben': 18, 'tim': 12, 'john': 34, 'wally': 55}
And [{'Name': n, 'Age': a} for n, a in zip(list1, list2)] will return
[{'Name': 'ben', 'Age': 18},
{'Name': 'tim', 'Age': 12},
{'Name': 'john', 'Age': 34},
{'Name': 'wally', 'Age': 55}]
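For the name-to-age mapping, `dict(zip(list1, list2))` should give the same result as the comprehension without the explicit loop variables. A small sketch using the two lists from the question (the names `ages_by_name` and `records` are only for illustration):

```python
list1 = ["ben", "tim", "john", "wally"]
list2 = [18, 12, 34, 55]

# dict() accepts an iterable of (key, value) pairs, which is what zip yields
ages_by_name = dict(zip(list1, list2))

# One dict per person, as in the list-comprehension version
records = [{'Name': n, 'Age': a} for n, a in zip(list1, list2)]

print(ages_by_name)
print(records[0])
```

Keep in mind that `zip` stops at the shorter list, so extra items in either list are dropped without an error.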
I am trying to get data from this list, for example the age. How do I get it, and how can I count how many times it appears?
my_list = [
{'Name': 'Michael','Age': 29, 'Gender': 'Male', 'City':'Wisconsin'},
{'Name': 'James','Age': 29, 'Gender': 'Male', 'City':'Tokyo'},
{'Name': 'Diesel','Age': 29, 'Gender': 'Male', 'City':'Shanghai'}]
A list comprehension is what you need here:
my_list = [
{'Name': 'Michael', 'Age': 29, 'Gender': 'Male', 'City': 'Wisconsin'},
{'Name': 'James', 'Age': 29, 'Gender': 'Male', 'City': 'Tokyo'},
{'Name': 'Diesel', 'Age': 29, 'Gender': 'Male', 'City': 'Shanghai'}
]
ages = [person["Age"] for person in my_list]
# All the values having "Age" as key
print(ages)
>>> [29, 29, 29]
# The number of times the key "Age" is present
print(len(ages))
>>> 3
Using collections.Counter:
from collections import Counter
Counter(k['Age'] for k in my_list if k.get('Age'))
Counter({29: 3})
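One caveat with filtering on `k.get('Age')`: it also skips entries whose age is a falsy value such as 0. A possible variation, sketched below, tests for the key itself instead. The last two entries in `my_list` are hypothetical and were added only to show the difference.

```python
from collections import Counter

my_list = [
    {'Name': 'Michael', 'Age': 29, 'Gender': 'Male', 'City': 'Wisconsin'},
    {'Name': 'James', 'Age': 29, 'Gender': 'Male', 'City': 'Tokyo'},
    {'Name': 'Diesel', 'Age': 29, 'Gender': 'Male', 'City': 'Shanghai'},
    {'Name': 'Newborn', 'Age': 0},   # hypothetical: falsy age
    {'Name': 'Unknown'},             # hypothetical: no 'Age' key at all
]

# Count every age that is present, including 0
age_counts = Counter(person['Age'] for person in my_list if 'Age' in person)

print(age_counts[29])
print(age_counts.most_common(1))
```

`most_common(1)` returns the most frequent age together with its count.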
I have dicts in a list, and some of the dicts are identical. I want to find the duplicated ones and add them to a new list or dictionary along with how many duplicates each one has.
import itertools
myListCombined = list()
for a, b in itertools.combinations(myList, 2):
    is_equal = set(a.items()) - set(b.items())
    if len(is_equal) == 0:
        a.update(count=2)
        myListCombined.append(a)
    else:
        a.update(count=1)
        b.update(count=1)
        myListCombined.append(a)
        myListCombined.append(b)
myListCombined = [i for n, i in enumerate(myListCombined) if i not in myListCombined[n + 1:]]
This code kind of works, but only when a dict is duplicated exactly twice; a.update(count=2) won't work when there are more duplicates.
I'm also removing the duplicated dicts in the last line after separating them, but I'm not sure it is going to work well.
Input:
[{'name': 'Mary', 'age': 25, 'salary': 1000},
{'name': 'John', 'age': 25, 'salary': 2000},
{'name': 'George', 'age': 30, 'salary': 2500},
{'name': 'John', 'age': 25, 'salary': 2000},
{'name': 'John', 'age': 25, 'salary': 2000}]
Desired Output:
[{'name': 'Mary', 'age': 25, 'salary': 1000, 'count':1},
{'name': 'John', 'age': 25, 'salary': 2000, 'count': 3},
{'name': 'George', 'age': 30, 'salary': 2500, 'count': 1}]
You could try the following, which first converts each dictionary to a frozenset of key,value tuples (so that they are hashable as required by collections.Counter).
import collections
a = [{'a':1}, {'a':1},{'b':2}]
print(collections.Counter(map(lambda x: frozenset(x.items()),a)))
Edit to reflect your desired input/output:
from copy import deepcopy
def count_duplicate_dicts(list_of_dicts):
    cpy = deepcopy(list_of_dicts)
    for d in list_of_dicts:
        d['count'] = cpy.count(d)
    return list_of_dicts
x = [{'a':1},{'a':1}, {'c':3}]
print(count_duplicate_dicts(x))
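The function above adds a count to every dict but leaves the duplicates in the list. If each distinct dict should appear only once, as in the desired output, one possible approach is to combine the frozenset idea with a first-seen pass. This is a sketch, assuming all values are hashable; the function name `count_unique_dicts` is made up for illustration.

```python
def count_unique_dicts(list_of_dicts):
    # First pass: count each distinct dict via a hashable frozenset of its items
    counts = {}
    for d in list_of_dicts:
        key = frozenset(d.items())
        counts[key] = counts.get(key, 0) + 1

    # Second pass: keep the first occurrence of each dict, in original order
    result = []
    seen = set()
    for d in list_of_dicts:
        key = frozenset(d.items())
        if key not in seen:
            seen.add(key)
            # dict(d, count=...) builds a new dict, so the input is not modified
            result.append(dict(d, count=counts[key]))
    return result


records = [{'name': 'Mary', 'age': 25, 'salary': 1000},
           {'name': 'John', 'age': 25, 'salary': 2000},
           {'name': 'George', 'age': 30, 'salary': 2500},
           {'name': 'John', 'age': 25, 'salary': 2000},
           {'name': 'John', 'age': 25, 'salary': 2000}]

unique_with_counts = count_unique_dicts(records)
print(unique_with_counts)
```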
If your data is well structured, the values are simple types such as numbers and strings, and you plan to do further data analysis, I would suggest pandas, which provides a rich set of functions. Here is sample code for your case:
In [32]: data = [{'name': 'Mary', 'age': 25, 'salary': 1000},
...: {'name': 'John', 'age': 25, 'salary': 2000},
...: {'name': 'George', 'age': 30, 'salary': 2500},
...: {'name': 'John', 'age': 25, 'salary': 2000},
...: {'name': 'John', 'age': 25, 'salary': 2000}]
...:
...: df = pd.DataFrame(data)
...: df['counts'] = 1
...: df = df.groupby(df.columns.tolist()[:-1]).sum().reset_index(drop=False)
...:
In [33]: df
Out[33]:
age name salary counts
0 25 John 2000 3
1 25 Mary 1000 1
2 30 George 2500 1
In [34]: df.to_dict(orient='records')
Out[34]:
[{'age': 25, 'counts': 3, 'name': 'John', 'salary': 2000},
{'age': 25, 'counts': 1, 'name': 'Mary', 'salary': 1000},
{'age': 30, 'counts': 1, 'name': 'George', 'salary': 2500}]
The logic is:
(1) First build the DataFrame from your data.
(2) Add a counts column of ones; groupby on the original columns then sums it for each group, which gives the number of duplicates.
(3) To convert back to a list of dicts, call DataFrame.to_dict(orient='records').
Pandas is a big package that takes some time to learn, but it is worth knowing. It is powerful and can make your data analysis faster and more elegant.
Thanks.
You can try this:
import collections
d = [{'name': 'Mary', 'age': 25, 'salary': 1000},
{'name': 'John', 'age': 25, 'salary': 2000},
{'name': 'George', 'age': 30, 'salary': 2500},
{'name': 'John', 'age': 25, 'salary': 2000},
{'name': 'John', 'age': 25, 'salary': 2000}]
count = dict(collections.Counter([i["name"] for i in d]))
a = list(set(map(tuple, [i.items() for i in d])))
final_dict = [dict(list(i)+[("count", count[dict(i)["name"]])]) for i in a]
Output:
[{'salary': 2000, 'count': 3, 'age': 25, 'name': 'John'}, {'salary': 2500, 'count': 1, 'age': 30, 'name': 'George'}, {'salary': 1000, 'count': 1, 'age': 25, 'name': 'Mary'}]
You can take the count values using collections.Counter and then rebuild the dicts after adding the count value from the Counter to each frozenset:
from collections import Counter
l = [dict(d | {('count', c)}) for d, c in Counter(frozenset(d.items())
                                                  for d in myList).items()]
print(l)
# [{'salary': 1000, 'name': 'Mary', 'age': 25, 'count': 1},
# {'name': 'John', 'salary': 2000, 'age': 25, 'count': 3},
# {'salary': 2500, 'name': 'George', 'age': 30, 'count': 1}]
mydict = {'a': {'name': 'Marco', 'gender': 'm', 'age': 38, 'info': 'teacher musician'},
'b': {'name': 'Daniela', 'gender': 'f', 'age': 28, 'info': 'student music'},
'c': {'name': 'Maria', 'gender': 'f', 'age': 25, 'info': 'doctor dance whatever'}}
How do I select the records with an age below 30 and with the word 'music' in 'info'?
The results should be like:
newdict = {'b': {'name': 'Daniela', 'gender': 'f', 'age': 28, 'info': 'student music'}}
Simplest way is to use a dict-comp:
mydict = {'a': {'name': 'Marco', 'gender': 'm', 'age': 38, 'info': 'teacher musician'},
'b': {'name': 'Daniela', 'gender': 'f', 'age': 28, 'info': 'student music'},
'c': {'name': 'Maria', 'gender': 'f', 'age': 25, 'info': 'doctor dance whatever'}}
new_dict = {k:v for k,v in mydict.items() if v['age'] < 30 and 'music' in v['info'].split()}
# {'b': {'info': 'student music', 'gender': 'f', 'age': 28, 'name': 'Daniela'}}
You can use the following comprehension :
>>> {d:k for d,k in mydict.items() if k['age']<30 and 'music' in k['info']}
{'b': {'info': 'student music', 'gender': 'f', 'age': 28, 'name': 'Daniela'}}
mydict.items() gives you a tuple containing the key and value of a dictionary entry on each iteration, so you can keep the items that meet the conditions.
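The two comprehensions in this thread differ in one detail: `'music' in v['info']` is a substring test, so it also matches 'musician', while `'music' in v['info'].split()` only matches the whole word. With the data from the question both give the same result, because Marco is filtered out by age. The sketch below adds a hypothetical entry 'd' to show where they diverge.

```python
mydict = {'a': {'name': 'Marco', 'gender': 'm', 'age': 38, 'info': 'teacher musician'},
          'b': {'name': 'Daniela', 'gender': 'f', 'age': 28, 'info': 'student music'},
          'c': {'name': 'Maria', 'gender': 'f', 'age': 25, 'info': 'doctor dance whatever'},
          # Hypothetical entry: under 30, but 'music' only appears inside 'musician'
          'd': {'name': 'Luca', 'gender': 'm', 'age': 22, 'info': 'teacher musician'}}

# Substring test: 'music' is found inside 'musician' as well
substring_match = {k: v for k, v in mydict.items()
                   if v['age'] < 30 and 'music' in v['info']}

# Whole-word test: split the info string into words first
word_match = {k: v for k, v in mydict.items()
              if v['age'] < 30 and 'music' in v['info'].split()}

print(sorted(substring_match))
print(sorted(word_match))
```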
I have a list of dictionaries in python like this;
l = [{'name': 'John', 'age': 23},
{'name': 'Steve', 'age': 35},
{'name': 'Helen'},
{'name': 'George'},
{'name': 'Jessica', 'age': 23}]
What I am trying to achieve is to reorder the elements of l so that each entry containing the key age moves to the end of the list, like this:
End result:
l = [{'name': 'Helen'},
{'name': 'George'},
{'name': 'Jessica', 'age': 23},
{'name': 'John', 'age': 23},
{'name': 'Steve', 'age': 35}]
How can I do this?
You can sort the list:
l.sort(key=lambda d: 'age' in d)
The key returns either True or False, based on the presence of the 'age' key; True is sorted after False. Python's sort is stable, leaving the rest of the relative ordering intact.
Demo:
>>> from pprint import pprint
>>> l = [{'name': 'John', 'age': 23},
... {'name': 'Steve', 'age': 35},
... {'name': 'Helen'},
... {'name': 'George'},
... {'name': 'Jessica', 'age': 23}]
>>> l.sort(key=lambda d: 'age' in d)
>>> pprint(l)
[{'name': 'Helen'},
{'name': 'George'},
{'age': 23, 'name': 'John'},
{'age': 35, 'name': 'Steve'},
{'age': 23, 'name': 'Jessica'}]
If you also wanted to sort by age, then retrieve the age value and return a suitable stable sentinel for those entries that do not have an age, but which will be sorted first. float('-inf') will always be sorted before any other number, for example:
l.sort(key=lambda d: d.get('age', float('-inf')))
Again, entries without an age are left in their original relative order:
>>> l.sort(key=lambda d: d.get('age', float('-inf')))
>>> pprint(l)
[{'name': 'Helen'},
{'name': 'George'},
{'age': 23, 'name': 'John'},
{'age': 23, 'name': 'Jessica'},
{'age': 35, 'name': 'Steve'}]
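An alternative to the `float('-inf')` sentinel is a tuple key that combines both ideas: the presence of 'age' first, then the age itself. The sketch below also uses `sorted` rather than `list.sort`, so the original list is left as it was. The name `ordered` is only for illustration, and the default of 0 is arbitrary since it is only compared among entries without an age.

```python
l = [{'name': 'John', 'age': 23},
     {'name': 'Steve', 'age': 35},
     {'name': 'Helen'},
     {'name': 'George'},
     {'name': 'Jessica', 'age': 23}]

# Tuples compare element by element: entries without 'age' (False) come first,
# then entries with an age are ordered by that age; ties keep their input order
ordered = sorted(l, key=lambda d: ('age' in d, d.get('age', 0)))

print([d['name'] for d in ordered])
```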