Add nested dictionaries on matching keys - python

I have a nested dictionary, such as:
{'A1': {'T1': [1, 3.0, 3, 4.0], 'T2': [2, 2.0]}, 'A2': {'T1': [1, 0.0, 3, 5.0], 'T2': [2, 3.0]}}
What I want to do is add the sub-dictionaries of A1 and A2 on matching keys (T1 + T1, T2 + T2), ignoring the first entry of each list, to obtain this:
[3.0, 5.0, 9.0]
where 3.0 + 0.0 = 3.0, 2.0 + 3.0 = 5.0 and 5.0 + 4.0 = 9.0.
How can I do this? I tried a for loop, but it turned into a big mess.

One way is to build a collections.Counter from each inner dictionary in a list comprehension and sum the resulting Counter objects:
from collections import Counter
d = {'A1': {'T1': 3.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}
# Use a list rather than a generator so it can be summed more than once
l = [Counter(i) for i in d.values()]
sum(l, Counter())
# Counter({'T2': 5.0, 'T1': 3.0})
For sum to work here, an empty Counter() is passed as the start argument; the default start value of 0 cannot be added to a Counter.
To get only the values, you can do:
sum(l, Counter()).values()
# dict_values([3.0, 5.0])
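One caveat with this approach: adding Counter objects keeps only keys whose running total is strictly positive, so zero or negative values can be dropped along the way. The sketch below uses made-up data (not from the question) to contrast that behaviour with a plain defaultdict accumulation, which keeps every key:

```python
from collections import Counter, defaultdict

# Hypothetical data with negative values, chosen only to show the difference.
d = {'A1': {'T1': 3.0, 'T2': -2.0}, 'A2': {'T1': -3.0, 'T2': 5.0}}

# Counter addition discards any key whose running total is not strictly positive.
counter_total = sum((Counter(sub) for sub in d.values()), Counter())

# A defaultdict keeps every key, whatever the sign of its total.
plain_total = defaultdict(float)
for sub in d.values():
    for key, value in sub.items():
        plain_total[key] += value

print(dict(counter_total))  # {'T2': 5.0}  (the -2.0 was dropped, 'T1' vanished)
print(dict(plain_total))    # {'T1': 0.0, 'T2': 3.0}
```

If all values are known to be positive, as in the question, both approaches give the same totals.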

You could use a list comprehension with zip:
d = {'A1': {'T1': 3.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}
[sum(e) for e in zip(*(e.values() for e in d.values()))]
Output:
[3.0, 5.0]
This relies on dictionaries preserving insertion order (true in CPython 3.6, guaranteed by the language from Python 3.7) and on every inner dictionary listing its keys in the same order.
Also, you can use two for loops:
r = {}
for dv in d.values():
    for k, v in dv.items():
        r.setdefault(k, []).append(v)
result = [sum(v) for v in r.values()]
print(result)
Output:
[3.0, 5.0]
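After your edit, with d being the list-valued dictionary from the question, you could use:
from itertools import zip_longest
# element-wise sums of the T1 lists and of the T2 lists
sum_t1, sum_t2 = list(list(map(sum, zip(*t))) for t in zip(*[e.values() for e in d.values()]))
# interleave the two results, skipping the first entry of each list
[i for t in zip_longest(sum_t1[1:], sum_t2[1:]) for i in t if i is not None]
Output:
[3.0, 5.0, 6, 9.0]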
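The element-wise sums can also be kept under their keys, which avoids depending on the order of the inner dictionaries. This is only a sketch; it assumes that lists stored under the same key have the same length (zip would silently truncate otherwise), and the name totals is made up for illustration:

```python
d = {'A1': {'T1': [1, 3.0, 3, 4.0], 'T2': [2, 2.0]},
     'A2': {'T1': [1, 0.0, 3, 5.0], 'T2': [2, 3.0]}}

totals = {}
for sub in d.values():
    for key, values in sub.items():
        if key in totals:
            # add this list to the running element-wise sum for the key
            totals[key] = [a + b for a, b in zip(totals[key], values)]
        else:
            # first time the key is seen: start from a copy of its list
            totals[key] = list(values)

print(totals)  # {'T1': [2, 3.0, 6, 9.0], 'T2': [4, 5.0]}
```

Dropping the first entry of each list, as the question asks, is then a matter of slicing, for example totals['T1'][1:].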

Related

How to group a dictionary by the first character of their key-values and sort them in ascending order?

I'd like to group the dictionary entries by the first character of their keys, find the minimum and maximum value in each group, and sort the groups in ascending order of the maximum value found.
dict = {'1,1': [1.0, 2.0], '3,1': [5.0, 8.0], '2,2': [3.0, 9.0], '2,1': [3.0, 11.0]}
After grouping, finding the minimum and maximum values, and sorting in ascending order of the maximum values, the dictionary should be:
dict = {'1': [1.0, 2.0], '3': [5.0, 8.0], '2': [3.0, 11.0]}
First, concatenate the lists grouped by k[0], then take the minimum and maximum of each concatenated list:
dct = {'1,1': [1.0, 2.0], '3,1': [5.0, 8.0], '2,2': [3.0, 9.0], '2,1': [3.0, 11.0]}
output = {}
for k, v in dct.items():
    output[k[0]] = output.get(k[0], []) + v
output = {k: [min(v), max(v)] for k, v in output.items()}
print(output) # {'1': [1.0, 2.0], '3': [5.0, 8.0], '2': [3.0, 11.0]}
Alternatively, if you are willing to use defaultdict:
from collections import defaultdict # put this at the beginning of the script
output = defaultdict(list)
for k, v in dct.items():
    output[k[0]] += v
output = {k: [min(v), max(v)] for k, v in output.items()}
This works, but maybe someone has a more elegant answer:
dictionnary = {'1,1': [1.0, 2.0], '3,1': [5.0, 8.0], '2,2': [3.0, 9.0], '2,1': [3.0, 11.0]}
a = [i[0] for i in dictionnary.keys()]
b = dict.fromkeys(a)
for i in b:
    b[i] = []
    for j in dictionnary:
        if j[0] == i:
            if b[i]:
                if dictionnary[j][0]<b[i][0]:
                    b[i][0] = dictionnary[j][0]
                if dictionnary[j][1]>b[i][1]:
                    b[i][1] = dictionnary[j][1]
            else:
                b[i] = list(dictionnary[j])  # copy, so the input lists are not modified
b
Output:
{'1': [1.0, 2.0], '3': [5.0, 8.0], '2': [3.0, 11.0]}
Also, you shouldn't use dict as a variable name, because that overwrites the Python builtin dict.
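The snippets above group the entries and compute the minimum and maximum, but the requested ascending order by maximum comes out right only because of the order of the input. A minimal sketch that sorts explicitly is shown below. It assumes Python 3.7+ (so that a dict keeps its insertion order) and takes the group label to be the part of the key before the comma, which is an assumption beyond the "first character" wording of the question:

```python
from collections import defaultdict

dct = {'1,1': [1.0, 2.0], '3,1': [5.0, 8.0], '2,2': [3.0, 9.0], '2,1': [3.0, 11.0]}

# Collect all values that share the same key prefix.
grouped = defaultdict(list)
for key, values in dct.items():
    grouped[key.split(',')[0]].extend(values)

# Reduce each group to [min, max], then order the groups by their max.
min_max = {k: [min(v), max(v)] for k, v in grouped.items()}
ordered = dict(sorted(min_max.items(), key=lambda item: item[1][1]))

print(ordered)  # {'1': [1.0, 2.0], '3': [5.0, 8.0], '2': [3.0, 11.0]}
```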

How to make multiindex dataframe from a nested dictionary keys and lists of values?

I have checked the advice here: Nested dictionary to multiindex dataframe where dictionary keys are column labels
However, I couldn't get it to work for my problem.
I would like to turn a dictionary into a multi-indexed DataFrame, where 'a', 'b' and 'c' are the names of the index levels, their values (12, 0.8, 1.8, bla1, bla2, bla3, bla4) are the index labels, and the values from the lists are assigned to those labels as in the picture of the table below.
My dictionary:
dictionary = {
    "{'a': 12.0, 'b': 0.8, 'c': ' bla1'}": [200, 0.0, '0.0'],
    "{'a': 12.0, 'b': 0.8, 'c': ' bla2'}": [37, 44, '0.6'],
    "{'a': 12.0, 'b': 1.8, 'c': ' bla3'}": [100, 2.0, '1.0'],
    "{'a': 12.0, 'b': 1.8, 'c': ' bla4'}": [400, 3.0, '1.0']
}
The result DataFrame I would like to get:
The code below does not build a MultiIndex; it places every value under the previous one in a new row:
df_a = pd.DataFrame.from_dict(dictionary, orient="index").stack().to_frame()
df_b = pd.DataFrame(df_a[0].values.tolist(), index=df_a.index)
Use ast.literal_eval to convert each string into a dictionary and build the index from there:
import pandas as pd
from ast import literal_eval
dictionary = {
    "{'a': 12.0, 'b': 0.8, 'c': ' bla1'}": [200, 0.0, '0.0'],
    "{'a': 12.0, 'b': 0.8, 'c': ' bla2'}": [37, 44, '0.6'],
    "{'a': 12.0, 'b': 1.8, 'c': ' bla3'}": [100, 2.0, '1.0'],
    "{'a': 12.0, 'b': 1.8, 'c': ' bla4'}": [400, 3.0, '1.0']
}
keys, data = zip(*dictionary.items())
index = pd.MultiIndex.from_frame(pd.DataFrame([literal_eval(i) for i in keys]))
res = pd.DataFrame(data=list(data), index=index)
print(res)
Output:
                 0     1    2
a    b   c
12.0 0.8  bla1  200   0.0  0.0
          bla2   37  44.0  0.6
     1.8  bla3  100   2.0  1.0
          bla4  400   3.0  1.0
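The key-parsing step in the answer above can be checked on its own with the standard library. The sketch below shows what ast.literal_eval returns for one of the string keys; the name index_tuple is made up for illustration:

```python
from ast import literal_eval

key = "{'a': 12.0, 'b': 0.8, 'c': ' bla1'}"

# literal_eval safely evaluates a string holding a Python literal (here a dict).
parsed = literal_eval(key)
print(parsed)  # {'a': 12.0, 'b': 0.8, 'c': ' bla1'}

# One tuple per key, in level order, is the shape a MultiIndex is built from.
index_tuple = tuple(parsed[name] for name in ('a', 'b', 'c'))
print(index_tuple)  # (12.0, 0.8, ' bla1')
```

Note that the 'c' values keep their leading space, because it is part of the string inside the original key.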

Binning a list in groups python

I have a list:
l = [2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 12.0,96.0, 192.0, 480.0, 360.0, 504.0, 300.0]
I want to group the elements of the list into bins of width 10 (i.e. 0-10, 10-20, 20-30, 30-40, etc.).
For example, the output that I'm looking for is:
[ [2,4,5,6,7,8,10],[12],[96],[192],[300],[360],[480],[504] ]
I tried using:
list(zip(*[iter(l)] * 10))
But I am getting the wrong answer.
Use itertools.groupby to group the values after integer-dividing (//) them by 10. groupby only merges consecutive items with equal keys, so pass sorted(l) instead of l if the groups should come out in ascending order:
from itertools import groupby
l = [2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 12.0,96.0, 192.0, 480.0, 360.0, 504.0, 300.0]
groups = []
for _, g in groupby(l, lambda x: (x-1)//10):
    groups.append(list(g)) # Store group iterator as a list
print(groups)
Output:
[[2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0], [12.0], [96.0], [192.0], [480.0], [360.0], [504.0], [300.0]]
A defaultdict might not be bad for this. It does not produce the result in one pass, but you can sort the keys to keep everything in order. The integer division by 10 bins everything for you:
from collections import defaultdict
groups = defaultdict(list)
for i in l:
    groups[int((i-1)//10)].append(i)
groups_list = [groups[k] for k in sorted(groups)]  # bins in ascending key order
groups_list
[[2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0], [12.0], [96.0], [192.0], [300.0], [360.0], [480.0], [504.0]]
Even though an answer has been accepted, here is another way. Note that it groups the values by their number of digits rather than into bins of width 10:
l = [2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 12.0,96.0, 192.0, 480.0, 360.0, 504.0, 300.0]
l1 = [int(k) for k in l]
l2 = list(list([k for k in l1 if len(str(k))==j]) for j in range(1,len(str(max(l1))) +1))
Output:
l2 = [[2, 4, 5, 6, 7, 8], [10, 12, 96], [192, 480, 360, 504, 300]]
The list can also be split into sublists using a dictionary: the key is (value - 1) // 10, and values that map to the same key are appended to the same list:
gd = {}
for i in l:
    k = int((i-1)//10)
    if k in gd:
        gd[k].append(i)
    else:
        gd[k] = [i]
print(gd.values())
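You can loop over your list l and build new lists using an if condition. Note that this hard-codes the first bin and only works when every value above 10 ends up in a bin of its own, as in this data:
smaller_list = []
larger_list = []
for element in sorted(l):
    if element <= 10:
        smaller_list.append(element)
    else:
        larger_list.append([element])
# first bin, followed by the single-element bins in ascending order
desired_result_list = [smaller_list] + larger_list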
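The binning idea from these answers can be wrapped in a small reusable function. The sketch below is one possible version, not taken from any of the answers: the name bin_by_width is made up, and it assumes the bins are the intervals (0, 10], (10, 20], and so on, so that 10 lands in the first bin as in the expected output. It sorts first because groupby only merges consecutive items:

```python
import math
from itertools import groupby

def bin_by_width(values, width=10):
    """Group values into bins (0, width], (width, 2*width], ... in ascending order."""
    def bin_index(x):
        # ceil(x / width) - 1 is the index of the half-open bin containing x
        return math.ceil(x / width) - 1
    return [list(group) for _, group in groupby(sorted(values), key=bin_index)]

l = [2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0, 12.0, 96.0, 192.0, 480.0, 360.0, 504.0, 300.0]
print(bin_by_width(l))
# [[2.0, 4.0, 5.0, 6.0, 7.0, 8.0, 10.0], [12.0], [96.0], [192.0], [300.0], [360.0], [480.0], [504.0]]
```

Empty bins are simply absent from the result, which matches the output requested in the question.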

I want to replace dictionary's value

I want to replace a dictionary's values. I have a dictionary named dct:
dct={'A': {'a1': [[10.0, 5.0], [7.0, 7.0], [1.0, 5.0], [20.0, 30.0]],
           'a2': [[50.0, 50.0], [55.0, 60.0]],
           'a3': [[40.0, 100.0], [100.0, 200.0], [100.0, 140.0], [200.0, 190.0]],
           'a4': [[50.0, 70.0], [140.0, 130.0], [160.0, 150.0], [200.0, 180.0]],
           'a5': [[100.0, 110.0], [180.0, 210.0], [60.0, 50.0], [200.0, 190.0]] }}
If the length of a child value such as [[10.0, 5.0], [7.0, 7.0], [1.0, 5.0], [20.0, 30.0]] or [[50.0, 50.0], [55.0, 60.0]] is divisible by 4, I want to replace that child value with 5. If its length is divisible by 2, I want to replace it with 4.
So I wrote this code:
for ky, vl in dct.items():
    for k,v in vl.items():
        if len(v) %4 == 0:
            element[ky] = 5
        elif len(v) %2 == 0:
            element[ky] = 4
        else:
            continue
print(element)
But print(element) shows {'A': {'a5': 5}}, so it contains only the last value. I really cannot understand why this happens. How can I fix this? What is wrong with my code?
Your code does not perform the task as written. The code below takes a different reading of the question: it works on the individual numbers rather than on list lengths.
It replaces each number with 5 when that number is divisible by 4, and with 4 when it is divisible by 2 but not by 4:
dct = {'A': {'a1': [[10.0, 5.0], [7.0, 7.0], [1.0, 5.0], [20.0, 30.0]],
             'a2': [[50.0, 50.0], [55.0, 60.0]],
             'a3': [[40.0, 100.0], [100.0, 200.0], [100.0, 140.0], [200.0, 190.0]],
             'a4': [[50.0, 70.0], [140.0, 130.0], [160.0, 150.0], [200.0, 180.0]],
             'a5': [[100.0, 110.0], [180.0, 210.0], [60.0, 50.0], [200.0, 190.0]] }}
print (dct)
for k,v in dct.items():
    for ky,vl in v.items():
        for each_elem in (range(0,len(vl))):
            if vl[each_elem][0] % 4 == 0:
                vl[each_elem][0] = 5
            else:
                if vl[each_elem][0] % 2 == 0:
                    vl[each_elem][0] = 4
            if vl[each_elem][1] % 4 == 0:
                vl[each_elem][1] = 5
            else:
                if vl[each_elem][1] % 2 == 0:
                    vl[each_elem][1] = 4
print ("\n")
print (dct)
This gives the output below:
{'A': {'a1': [[10.0, 5.0], [7.0, 7.0], [1.0, 5.0], [20.0, 30.0]], 'a3': [[40.0, 100.0], [100.0, 200.0], [100.0, 140.0], [200.0, 190.0]], 'a2': [[50.0, 50.0], [55.0, 60.0]], 'a5': [[100.0, 110.0], [180.0, 210.0], [60.0, 50.0], [200.0, 190.0]], 'a4': [[50.0, 70.0], [140.0, 130.0], [160.0, 150.0], [200.0, 180.0]]}}
{'A': {'a1': [[4, 5.0], [7.0, 7.0], [1.0, 5.0], [5, 4]], 'a3': [[5, 5], [5, 5], [5, 5], [5, 4]], 'a2': [[4, 4], [55.0, 5]], 'a5': [[5, 4], [5, 4], [5, 4], [5, 4]], 'a4': [[4, 4], [5, 4], [5, 4], [5, 5]]}}
Hope this answer works for you. Have a good time ahead :)
The problem is that you are inserting the main dict's keys into the new dict. The original dict has two levels, so you have to build a nested sub-dict for each outer key and then insert that sub-dict into the main dict at the end:
Try this code:
dct={'A': {'a1': [[10.0, 5.0], [7.0, 7.0], [1.0, 5.0], [20.0, 30.0]], 'a2': [[50.0, 50.0], [55.0, 60.0]], 'a3': [[40.0, 100.0], [100.0, 200.0], [100.0, 140.0], [200.0, 190.0]], 'a4': [[50.0, 70.0], [140.0, 130.0], [160.0, 150.0], [200.0, 180.0]], 'a5': [[100.0, 110.0], [180.0, 210.0], [60.0, 50.0], [200.0, 190.0]] }}
element={}
for ky, vl in dct.items():
    sub_dict={}
    for k, v in vl.items():
        if len(v) % 4 == 0:
            sub_dict[k] = 5
        elif len(v) % 2 == 0:
            sub_dict[k] = 4
        else:
            continue
    element[ky]=sub_dict
print(element)
Output:
{'A': {'a1': 5, 'a2': 4, 'a3': 5, 'a5': 5, 'a4': 5}}
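The same length-based replacement can be written as a nested dict comprehension. This is only a sketch of an alternative formulation; the helper name replacement_code is made up for illustration, and entries whose length is odd are skipped, as in the loop above:

```python
dct = {'A': {'a1': [[10.0, 5.0], [7.0, 7.0], [1.0, 5.0], [20.0, 30.0]],
             'a2': [[50.0, 50.0], [55.0, 60.0]],
             'a3': [[40.0, 100.0], [100.0, 200.0], [100.0, 140.0], [200.0, 190.0]],
             'a4': [[50.0, 70.0], [140.0, 130.0], [160.0, 150.0], [200.0, 180.0]],
             'a5': [[100.0, 110.0], [180.0, 210.0], [60.0, 50.0], [200.0, 190.0]]}}

def replacement_code(child):
    """Return 5 if len(child) is divisible by 4, 4 if divisible by 2, else None."""
    if len(child) % 4 == 0:
        return 5
    if len(child) % 2 == 0:
        return 4
    return None

# Rebuild both levels; keys whose length is odd are left out.
element = {
    outer_key: {
        inner_key: replacement_code(child)
        for inner_key, child in inner.items()
        if replacement_code(child) is not None
    }
    for outer_key, inner in dct.items()
}

print(element)  # {'A': {'a1': 5, 'a2': 4, 'a3': 5, 'a4': 5, 'a5': 5}}
```

On Python 3.7+ the keys come out in insertion order, which is why a4 precedes a5 here.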

add list value of a key in a dictionary python

I have the following dictionary:
centroid = {'A': [1.0, 1.0], 'B': [2.0, 1.0]}
Using the above dictionary, I am creating two different dictionaries and appending them to a list:
for key in centroid:
    cluster_list.append(dict(zip(key, centroid.get(key))))
However, when I check cluster_list I get the following data:
[{'A': 1.0}, {'B': 2.0}]
instead of
[{'A': [1.0, 1.0]}, {'B': [2.0, 1.0]}].
How can I fix this?
zip pairs the characters of the key with the elements of the list and stops at the shorter input, so zip('A', [1.0, 1.0]) yields only ('A', 1.0). You can use a list comprehension instead:
For Python 2:
cluster_list = [{k: v} for k, v in centroid.iteritems()]
# [{'A': [1.0, 1.0]}, {'B': [2.0, 1.0]}]
For Python 3:
cluster_list = [{k: v} for k, v in centroid.items()]
You can also use starmap from the itertools module.
In [1]: from itertools import starmap
In [2]: list(starmap(lambda k,v: {k:v}, centroid.items()))
Out[2]: [{'B': [2.0, 1.0]}, {'A': [1.0, 1.0]}]
Before Python 3.7, dictionaries do not guarantee iteration order, so the order of the resulting list is not guaranteed either.
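To see why the original loop loses the second coordinate, it can help to run the zip call on its own. A small sketch, using the dictionary from the question (the name truncated is made up for illustration):

```python
centroid = {'A': [1.0, 1.0], 'B': [2.0, 1.0]}

# zip pairs the single character 'A' with the first list element, then stops
# because the string has no more characters.
truncated = dict(zip('A', centroid['A']))
print(truncated)  # {'A': 1.0}

# Building one single-item dict per key keeps the whole list.
cluster_list = [{key: value} for key, value in centroid.items()]
print(cluster_list)  # [{'A': [1.0, 1.0]}, {'B': [2.0, 1.0]}]
```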
