I wrote some Python code to pull data from SQL Server and I'm now trying to merge the data. I tried to pull the data into a DataFrame and work with it there, but wasn't able to do that.
The data is currently set up like this:
[ { a : { 1 : ( x,y,z,...) }},
{ a : { 2 : ( x,y,z,...) }},
{ a : { 3 : ( x,y,z,...) }} ]
This is where I want to get to:
[ { a : { 1 : ( x,y,z,...) , 2 : (x,y,...) , 3 : (x,y,z,...) } } ]
Use a nested dictionary structure via collections.defaultdict.
Note that this implementation does not preserve duplicate inner keys: if two dictionaries share outer key 'a' and inner key 1, the last one takes precedence.
from collections import defaultdict
lst = [ { 'a' : { 1 : ( 3, 4, 5 ) }},
{ 'a' : { 2 : ( 6, 7, 8 ) }},
{ 'a' : { 3 : ( 1, 2, 3 ) }},
{ 'c' : { 4 : ( 5, 9, 8 ) }},
{ 'b' : { 1 : ( 6, 6, 8 ) }},
{ 'c' : { 3 : ( 2, 5, 7 ) }}]
d = defaultdict(dict)
for item in lst:
    key = next(iter(item))  # each item has exactly one outer key
    d[key].update(item[key])
# defaultdict(dict,
# {'a': {1: (3, 4, 5), 2: (6, 7, 8), 3: (1, 2, 3)},
# 'b': {1: (6, 6, 8)},
# 'c': {3: (2, 5, 7), 4: (5, 9, 8)}})
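If silently overwriting a repeated inner key is not acceptable, one option is to check for a collision before each update. This is only a minimal sketch, assuming the same shape of input as above; merge_strict is a hypothetical helper name, not something from the standard library.

```python
from collections import defaultdict

def merge_strict(items):
    """Merge {outer: {inner: value}} dicts, raising on a repeated (outer, inner) pair."""
    merged = defaultdict(dict)
    for item in items:
        for outer, inner in item.items():
            # Inner keys that already exist under this outer key.
            clash = merged[outer].keys() & inner.keys()
            if clash:
                raise KeyError(f"duplicate inner keys under {outer!r}: {sorted(clash)}")
            merged[outer].update(inner)
    return dict(merged)

lst = [{'a': {1: (3, 4, 5)}},
       {'a': {2: (6, 7, 8)}},
       {'c': {4: (5, 9, 8)}}]
result = merge_strict(lst)
print(result)
```

Raising is just one policy; collecting the clashing values into a list would be another, depending on what the merged data is used for.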
How's this?
data = [{'a': {1: 4}, 'b': {7: 8}},
{'a': {2: 5}, 'b': {9: 10}},
{'a': {3: 6}}]
all_keys = set().union(*data)
result = {}
for key in all_keys:
    result[key] = {}
    for d in data:
        if key in d:
            result[key].update(d[key])
print(result) # {'b': {7: 8, 9: 10}, 'a': {1: 4, 2: 5, 3: 6}}
You can make use of the reduce function and dict.update to transform the data. Assuming that 'a' is your only key, you can do this:
a = [
{'a': {1: (1, 2, 3)}},
{'a': {2: (4, 5, 6)}},
{'a': {3: (7, 8, 9)}}
]
from functools import reduce  # reduce is not a builtin in Python 3
def update(d, c):
    d['a'].update(c['a'])
    return d
print(reduce(update, a, {'a': {}}))  # Prints {'a': {1: (1, 2, 3), 2: (4, 5, 6), 3: (7, 8, 9)}}
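If 'a' is not the only outer key, the same reduce idea might be generalized by looping over whatever keys each item carries. A rough sketch under that assumption; merge_into and the sample data are made up for illustration.

```python
from functools import reduce

def merge_into(acc, item):
    """Fold one {outer: {inner: value}} dict into the accumulator."""
    for outer, inner in item.items():
        # Create the inner dict on first sight of an outer key, then merge.
        acc.setdefault(outer, {}).update(inner)
    return acc

data = [
    {'a': {1: (1, 2, 3)}},
    {'b': {1: (0, 0, 0)}},
    {'a': {2: (4, 5, 6)}},
]
merged = reduce(merge_into, data, {})
print(merged)
```

As in the single-key version, a later inner key overwrites an earlier one with the same outer key.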
I have a question about Python.
Imagine a list with dictionaries in it.
How can we sort it by a value in each dictionary?
Imagine this list:
lst = [
{
"a" : 3,
"b" : 2
},
{
"a" : 1,
"b" : 4
},
{
"a" : 2,
"b" : 3
}
]
How can we sort this list by the value of "a" in each dictionary?
I mean, I want this list at the end:
lst = [
{
"a" : 1,
"b" : 4
},
{
"a" : 2,
"b" : 3
},
{
"a" : 3,
"b" : 2
}
]
You could provide a lambda key to sorted:
>>> lst = [
... {
... "a" : 3,
... "b" : 2
... },
... {
... "a" : 1,
... "b" : 4
... },
... {
... "a" : 2,
... "b" : 3
... }
... ]
>>> sorted(lst, key=lambda d: d["a"])
[{'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 3, 'b': 2}]
One approach, use the key argument with itemgetter:
from operator import itemgetter
lst = [{"a": 3, "b": 2}, {"a": 1, "b": 4}, {"a": 2, "b": 3}]
res = sorted(lst, key=itemgetter("a"))
print(res)
Output
[{'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 3, 'b': 2}]
From the documentation on itemgetter:
Return a callable object that fetches item from its operand using the operand’s __getitem__() method. If multiple items are specified, returns a tuple of lookup values. For example:
After f = itemgetter(2), the call f(r) returns r[2].
After g = itemgetter(2, 5, 3), the call g(r) returns (r[2], r[5], r[3]).
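Because itemgetter returns a tuple when given several items, it can also serve as a multi-key sort key. A small sketch with made-up rows, sorting by "a" and breaking ties on "b", plus a descending sort on "a" alone:

```python
from operator import itemgetter

rows = [{"a": 2, "b": 3}, {"a": 1, "b": 4}, {"a": 2, "b": 1}]

# The key is the tuple (row["a"], row["b"]), so ties on "a" are ordered by "b".
by_a_then_b = sorted(rows, key=itemgetter("a", "b"))

# reverse=True sorts descending; rows with equal "a" keep their original order.
by_a_desc = sorted(rows, key=itemgetter("a"), reverse=True)

print(by_a_then_b)
print(by_a_desc)
```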
first post!
I am trying to write a function that creates dictionaries in a loop from a DataFrame.
Assume these two simple DataFrames already exist:
data1 = {'A':[1, 2, 3, 4], 'B':[5, 6, 7, 8]}
df1 = pd.DataFrame(data1)
and
data2 = {'C':[9, 10], 'D':[11, 12], 'E':[13, 14] }
df2 = pd.DataFrame(data2)
I want to be able to create a function like this:
def create_dict(df):
where the end result for df1 is:
dict1 = { 'A' : 1, 'B' : 5}
dict2 = { 'A' : 2, 'B' : 6}
dict3 = { 'A' : 3, 'B' : 7}
dict4 = { 'A' : 4, 'B' : 8}
and the end result for df2 is:
dict1 = { 'C' : 9, 'D' : 11, 'E' : 13}
dict2 = { 'C' : 10, 'D' : 12, 'E' : 14}
I was looking at dictionary comprehension to handle this, but I'm obviously not sure how to handle that problem. Thanks!
Use pandas.DataFrame.to_dict with orient="records":
df1.to_dict(orient="records")
Output:
[{'A': 1, 'B': 5}, {'A': 2, 'B': 6}, {'A': 3, 'B': 7}, {'A': 4, 'B': 8}]
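To make the shape of the "records" result concrete, here is a rough pure-Python sketch that builds the same list of per-row dictionaries straight from a column dict like data2. It does not use pandas, and records_from_columns is a hypothetical name chosen for illustration.

```python
def records_from_columns(columns):
    """Turn {'col': [v0, v1, ...]} into [{'col': v0, ...}, {'col': v1, ...}]."""
    names = list(columns)
    # zip(*values) walks the columns in parallel, yielding one tuple per row.
    return [dict(zip(names, row)) for row in zip(*columns.values())]

data2 = {'C': [9, 10], 'D': [11, 12], 'E': [13, 14]}
records = records_from_columns(data2)
print(records)
```

Note that zip stops at the shortest column, so this sketch assumes all columns have the same length.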
Consider the following:
>>> # list of length n
>>> idx = ['a', 'b', 'c', 'd']
>>> # list of length n
>>> l_1 = [1, 2, 3, 4]
>>> # list of length n
>>> l_2 = [5, 6, 7, 8]
>>> # first key
>>> key_1 = 'mkt_o'
>>> # second key
>>> key_2 = 'mkt_c'
How do I zip this mess to look like this?
{
'a': {'mkt_o': 1, 'mkt_c': 5},
'b': {'mkt_o': 2, 'mkt_c': 6},
'c': {'mkt_o': 3, 'mkt_c': 7},
'd': {'mkt_o': 4, 'mkt_c': 8},
...
}
The closest I've got is something like this:
>>> dict(zip(idx, zip(l_1, l_2)))
{'a': (1, 5), 'b': (2, 6), 'c': (3, 7), 'd': (4, 8)}
Which of course has tuples as values instead of dictionaries, and
>>> dict(zip(('mkt_o', 'mkt_c'), (1,2)))
{'mkt_o': 1, 'mkt_c': 2}
Which seems like it might be promising, but again, fails to meet requirements.
{k : {key_1 : v1, key_2 : v2} for k,v1,v2 in zip(idx, l_1, l_2)}
Solution 1: You may use zip twice (actually thrice) with dictionary comprehension to achieve this as:
idx = ['a', 'b', 'c', 'd']
l_1 = [1, 2, 3, 4]
l_2 = [5, 6, 7, 8]
keys = ['mkt_o', 'mkt_c'] # your keys in another list
new_dict = {k: dict(zip(keys, v)) for k, v in zip(idx, zip(l_1, l_2))}
Solution 2: You may also use zip with a list comprehension as:
new_dict = dict(zip(idx, [{key_1: i, key_2: j} for i, j in zip(l_1, l_2)]))
Solution 3: using dictionary comprehension on top of zip as shared in DYZ's answer:
new_dict = {k : {key_1 : v1, key_2 : v2} for k,v1,v2 in zip(idx, l_1, l_2)}
All the above solutions will return new_dict as:
{
'a': {'mkt_o': 1, 'mkt_c': 5},
'b': {'mkt_o': 2, 'mkt_c': 6},
'c': {'mkt_o': 3, 'mkt_c': 7},
'd': {'mkt_o': 4, 'mkt_c': 8}
}
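The first solution extends naturally to more than two value lists, since zip(*columns) yields one row per index entry. A sketch under that assumption; the third key 'mkt_h' and its values are invented purely for illustration.

```python
idx = ['a', 'b', 'c', 'd']
keys = ['mkt_o', 'mkt_c', 'mkt_h']  # 'mkt_h' is a made-up third key
columns = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

# zip(*columns) transposes the value lists into rows: (1, 5, 9), (2, 6, 10), ...
new_dict = {k: dict(zip(keys, row)) for k, row in zip(idx, zip(*columns))}
print(new_dict['a'])
```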
You're working with dicts, lists, indices, keys and would like to transpose the data. It might make sense to work with pandas (DataFrame, .T and .to_dict):
>>> import pandas as pd
>>> idx = ['a', 'b', 'c', 'd']
>>> l_1 = [1, 2, 3, 4]
>>> l_2 = [5, 6, 7, 8]
>>> key_1 = 'mkt_o'
>>> key_2 = 'mkt_c'
>>> pd.DataFrame([l_1, l_2], index=[key_1, key_2], columns = idx)
       a  b  c  d
mkt_o  1  2  3  4
mkt_c  5  6  7  8
>>> pd.DataFrame([l_1, l_2], index=[key_1, key_2], columns = idx).T
   mkt_o  mkt_c
a      1      5
b      2      6
c      3      7
d      4      8
>>> pd.DataFrame([l_1, l_2], index=[key_1, key_2], columns = idx).to_dict()
{'a': {'mkt_o': 1, 'mkt_c': 5},
'b': {'mkt_o': 2, 'mkt_c': 6},
'c': {'mkt_o': 3, 'mkt_c': 7},
'd': {'mkt_o': 4, 'mkt_c': 8}
}
It can also be done with dict, zip, map and repeat from itertools:
>>> from itertools import repeat
>>> dict(zip(idx, map(dict, zip(zip(repeat(key_1), l_1), zip(repeat(key_2), l_2)))))
{'a': {'mkt_c': 5, 'mkt_o': 1}, 'c': {'mkt_c': 7, 'mkt_o': 3}, 'b': {'mkt_c': 6, 'mkt_o': 2}, 'd': {'mkt_c': 8, 'mkt_o': 4}}
I have a list of dictionaries as follows.
[{'a' : 1, 'b' : 2, 'c' : 2},
{'a' : 2, 'b' : 3, 'c' : 3},
{'a' : 3, 'b' : 5, 'c' : 6},
{'a' : 4, 'b' : 7, 'c' : 8},
{'a' : 1, 'b' : 8, 'c' : 9},
{'a' : 2, 'b' : 0, 'c' : 0},
{'a' : 5, 'b' : 1, 'c' : 3},
{'a' : 7, 'b' : 4, 'c' : 5}]
I want to create a dictionary of lists from above list which should be as follows.
{1 : [{'a' : 1, 'b' : 2, 'c' : 2}, {'a' : 1, 'b' : 8, 'c' : 9}],
2 : [{'a' : 2, 'b' : 3, 'c' : 3}, {'a' : 2, 'b' : 0, 'c' : 0}],
3 : [{'a' : 3, 'b' : 5, 'c' : 6}],
4 : [{'a' : 4, 'b' : 7, 'c' : 8}],
5 : [{'a' : 5, 'b' : 1, 'c' : 3}],
7 : [{'a' : 7, 'b' : 4, 'c' : 5}]}
Basically, I want to pick one of the keys in the dictionaries, say 'a', and create a new dictionary whose keys are the values of that key (1, 2, 3, 4, 5, 7). The value for each new key should be a list of all the dictionaries that have that value for key 'a'.
I know the simplest approach is to iterate over the list and build the required dictionary. I am just curious whether there is another way of doing it.
A collections.defaultdict is an efficient way to do this, grouping the list in a single pass:
from collections import defaultdict
l = [{'a': 1, 'b': 2, 'c': 2},
{'a': 2, 'b': 3, 'c': 3},
{'a': 3, 'b': 5, 'c': 6},
{'a': 4, 'b': 7, 'c': 8},
{'a': 1, 'b': 8, 'c': 9},
{'a': 2, 'b': 0, 'c': 0},
{'a': 5, 'b': 1, 'c': 3},
{'a': 7, 'b': 4, 'c': 5}]
dct = defaultdict(list)
for d in l:
    dct[d["a"]].append(d)
from pprint import pprint as pp
pp(dict(dct))
Output:
{1: [{'a': 1, 'b': 2, 'c': 2}, {'a': 1, 'b': 8, 'c': 9}],
2: [{'a': 2, 'b': 3, 'c': 3}, {'a': 2, 'b': 0, 'c': 0}],
3: [{'a': 3, 'b': 5, 'c': 6}],
4: [{'a': 4, 'b': 7, 'c': 8}],
5: [{'a': 5, 'b': 1, 'c': 3}],
7: [{'a': 7, 'b': 4, 'c': 5}]}
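Since the question asks whether there is another way, one alternative is itertools.groupby over a list sorted on the same key. This is only a sketch with a shortened sample; because of the sort it is O(n log n) rather than the single pass of the defaultdict loop.

```python
from itertools import groupby
from operator import itemgetter

rows = [{'a': 1, 'b': 2, 'c': 2},
        {'a': 2, 'b': 3, 'c': 3},
        {'a': 1, 'b': 8, 'c': 9}]

key = itemgetter('a')
# groupby only groups adjacent items, so the rows must be sorted by the key first.
grouped = {k: list(g) for k, g in groupby(sorted(rows, key=key), key=key)}
print(grouped)
```

The sort is stable, so within each group the dictionaries keep their original relative order.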
A normal dictionary with the setdefault method can also be used.
Code:
data=[{'a' : 1, 'b' : 2, 'c' : 2},
{'a' : 2, 'b' : 3, 'c' : 3},
{'a' : 3, 'b' : 5, 'c' : 6},
{'a' : 4, 'b' : 7, 'c' : 8},
{'a' : 1, 'b' : 8, 'c' : 9},
{'a' : 2, 'b' : 0, 'c' : 0},
{'a' : 5, 'b' : 1, 'c' : 3},
{'a' : 7, 'b' : 4, 'c' : 5}]
dictionary_list={}
for row in data:
    dictionary_list.setdefault(row["a"], []).append(row)
print(dictionary_list)
Output:
{1: [{'a': 1, 'b': 2, 'c': 2}, {'a': 1, 'b': 8, 'c': 9}],
2: [{'a': 2, 'b': 3, 'c': 3}, {'a': 2, 'b': 0, 'c': 0}],
3: [{'a': 3, 'b': 5, 'c': 6}],
4: [{'a': 4, 'b': 7, 'c': 8}],
5: [{'a': 5, 'b': 1, 'c': 3}],
7: [{'a': 7, 'b': 4, 'c': 5}]}
You can do it in the following way:
mylist = [
{'a' : 1, 'b' : 2, 'c' : 2},
{'a' : 2, 'b' : 3, 'c' : 3},
{'a' : 3, 'b' : 5, 'c' : 6},
{'a' : 4, 'b' : 7, 'c' : 8},
{'a' : 1, 'b' : 8, 'c' : 9},
{'a' : 2, 'b' : 0, 'c' : 0},
{'a' : 5, 'b' : 1, 'c' : 3},
{'a' : 7, 'b' : 4, 'c' : 5}
]
def get_dict(mylist, required_key):
    result_dict = {}
    for mydict in mylist:
        result_dict.setdefault(mydict[required_key], [])
        result_dict[mydict[required_key]].append(mydict)
    return result_dict
result_dict = get_dict(mylist, required_key = 'a')
print(result_dict)
Given a list of dictionaries in python like this:
dict_list = [
{'a' : 1, 'b' : 2},
{'c' : 2, 'd' : 3},
{'x' : 4, 'y' : 5, 'z': 0}
]
What is the best way to loop through all the values while emulating the obvious:
for i in dict_list:
    for x in i.values():
        print(x)
But ideally avoiding the nested for loops. I'm sure there must be a better way but I can't find it.
To loop through all the values, use itertools.chain.from_iterable:
from itertools import chain
dict_list = [
{'a' : 1, 'b' : 2},
{'c' : 2, 'd' : 3},
{'x' : 4, 'y' : 5, 'z': 0}
]
for item in chain.from_iterable(i.values() for i in dict_list):
    print(item)
Outputs:
1
2
2
3
4
5
0
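If the flattened values are needed as a collection rather than printed one by one, the same chain can be passed to list or sum. A short sketch; the flat comprehension shown next to it still nests two for clauses but gives the same result in one expression.

```python
from itertools import chain

dict_list = [
    {'a': 1, 'b': 2},
    {'c': 2, 'd': 3},
    {'x': 4, 'y': 5, 'z': 0},
]

# Flatten every dictionary's values into one list.
values = list(chain.from_iterable(d.values() for d in dict_list))

# Equivalent comprehension with two for clauses.
same_values = [v for d in dict_list for v in d.values()]

total = sum(values)
print(values, total)
```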