Updating Nested dictionary with new information in table/dictionary using update - python

Given the following dictionary:
dict1 = {'AA':['THISISSCARY'],
'BB':['AREYOUAFRAID'],
'CC':['DONOTWORRY']}
I'd like to update the values in the dictionary given the information in the following table:
Table = pd.DataFrame({'KEY':['AA','AA','BB','CC'],
'POSITION':[2,4,9,3],
'oldval':['I','I','A','O'],
'newval':['X','X','U','I']})
that looks like this:
  KEY  POSITION oldval newval
0  AA         2      I      X
1  AA         4      I      X
2  BB         9      A      U
3  CC         3      O      I
The end result should look like this:
dict1 = {'AA':['THXSXSSCARY'],
'BB':['AREYOUAFRUID'],
'CC':['DONITWORRY']}
Essentially, I use KEY and POSITION to locate a character in the dictionary value; if that character matches oldval, I replace it with newval.
I've been looking at the update function, for which I'd convert my table to a dictionary, but I'm unsure how to apply it to my example.

First craft a nested Series/dictionary to map the key/position/newval, then use a dictionary comprehension:
s = (Table.groupby('KEY')
          .apply(lambda d: d.set_index('POSITION')['newval'].to_dict())
     )
out = {k: [''.join(s.get(k, {}).get(i, x) for i, x in enumerate(v[0]))]
       for k, v in dict1.items()
       }
Output:
{'AA': ['THXSXSSCARY'],
'BB': ['AREYOUAFRUID'],
'CC': ['DONITWORRY']}
Intermediate s:
KEY
AA {2: 'X', 4: 'X'}
BB {9: 'U'}
CC {3: 'I'}
dtype: object

You can use:
dict_df=Table.to_dict('records')
print(dict_df)
'''
[{'KEY': 'AA', 'POSITION': 2, 'oldval': 'I', 'newval': 'X'}, {'KEY': 'AA', 'POSITION': 4, 'oldval': 'I', 'newval': 'X'}, {'KEY': 'BB', 'POSITION': 9, 'oldval': 'A', 'newval': 'U'}, {'KEY': 'CC', 'POSITION': 3, 'oldval': 'O', 'newval': 'I'}]
'''
for i in list(dict1.keys()):
    for j in dict_df:
        if i == j['KEY']:
            mask = list(dict1[i][0])
            mask[j['POSITION']] = j['newval']
            dict1[i] = ["".join(mask)]
print(dict1)
# {'AA': ['THXSXSSCARY'], 'BB': ['AREYOUAFRUID'], 'CC': ['DONITWORRY']}
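The question also asks that a character be replaced only if it matches oldval, which the loops above do not check. A minimal pure-Python sketch of that check follows, assuming the rows are available in the list-of-dicts form printed above; the function name apply_replacements is hypothetical, not part of pandas.

```python
def apply_replacements(strings, records):
    """Return a new dict with matching characters replaced.

    `strings` maps key -> one-element list holding the string (like dict1).
    `records` is a list of row dicts, as produced by Table.to_dict('records').
    """
    result = {}
    for key, (text,) in strings.items():
        chars = list(text)
        for row in records:
            if row['KEY'] != key:
                continue
            pos = row['POSITION']
            # Replace only when the position exists and holds the expected old character.
            if 0 <= pos < len(chars) and chars[pos] == row['oldval']:
                chars[pos] = row['newval']
        result[key] = [''.join(chars)]
    return result


dict1 = {'AA': ['THISISSCARY'],
         'BB': ['AREYOUAFRAID'],
         'CC': ['DONOTWORRY']}
records = [
    {'KEY': 'AA', 'POSITION': 2, 'oldval': 'I', 'newval': 'X'},
    {'KEY': 'AA', 'POSITION': 4, 'oldval': 'I', 'newval': 'X'},
    {'KEY': 'BB', 'POSITION': 9, 'oldval': 'A', 'newval': 'U'},
    {'KEY': 'CC', 'POSITION': 3, 'oldval': 'O', 'newval': 'I'},
]
updated = apply_replacements(dict1, records)
print(updated)
# {'AA': ['THXSXSSCARY'], 'BB': ['AREYOUAFRUID'], 'CC': ['DONITWORRY']}
```

Unlike the loop above, this sketch leaves dict1 untouched and silently skips rows whose oldval does not match; raising an error instead may be preferable, depending on how strict the update should be.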

Related

Flat map list without losing mapping?

I have an existing dict that maps single values to lists.
I want to reverse this dictionary and map every list entry to its original key.
The list entries are unique.
Given:
d = {1: ['a', 'b'], 2: ['c']}
Result:
{'a': 1, 'b': 1, 'c': 2}
How can this be done?
Here's an option:
new_dict = {v: k for k, l in d.items() for v in l}
# {'a': 1, 'b': 1, 'c': 2}
You can use a list comprehension to produce a tuple with the key-value pair, then, flatten the new list and pass to the built-in dictionary function:
d = { 1: ['a', 'b'], 2: ['c'] }
new_d = dict([c for h in [[(i, a) for i in b] for a, b in d.items()] for c in h])
Output:
{'a': 1, 'b': 1, 'c': 2}
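Both answers rely on the list entries being unique; if an entry appeared under two keys, the later key would silently overwrite the earlier one. A small sketch that makes this assumption explicit is shown below. The function name invert_unique is hypothetical.

```python
def invert_unique(mapping):
    """Map each list entry back to its original key; raise if an entry repeats."""
    inverted = {}
    for key, entries in mapping.items():
        for entry in entries:
            if entry in inverted:
                raise ValueError(
                    f"{entry!r} appears under both {inverted[entry]!r} and {key!r}"
                )
            inverted[entry] = key
    return inverted


d = {1: ['a', 'b'], 2: ['c']}
print(invert_unique(d))
# {'a': 1, 'b': 1, 'c': 2}
```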

Python dict group and sum multiple values [duplicate]

I have a set of data in list-of-dicts format, like below:
data = [
{'name': 'A', 'tea':5, 'coffee':6},
{'name': 'A', 'tea':2, 'coffee':3},
{'name': 'B', 'tea':7, 'coffee':1},
{'name': 'B', 'tea':9, 'coffee':4},
]
I'm trying to group by 'name' and sum 'tea' and 'coffee' separately.
The final grouped data must be in this format:
grouped_data = [
{'name': 'A', 'tea':7, 'coffee':9},
{'name': 'B', 'tea':16, 'coffee':5},
]
I tried some steps:
from collections import Counter
c = Counter()
for v in data:
    c[v['name']] += v['tea']
my_data = [{'name': name, 'tea': tea} for name, tea in c.items()]
for e in my_data:
    print(e)
The above step returned the following output:
{'name': 'A', 'tea': 7}
{'name': 'B', 'tea': 16}
I can only sum the key 'tea' this way; I can't get the sum for the key 'coffee'. Can you please help me get the data into the grouped_data format?
Using pandas:
import pandas as pd
df = pd.DataFrame(data)
df
   coffee name  tea
0       6    A    5
1       3    A    2
2       1    B    7
3       4    B    9
g = df.groupby('name', as_index=False).sum()
g
  name  coffee  tea
0    A       9    7
1    B       5   16
And, the final step, df.to_dict:
d = g.to_dict('records')
d
[{'coffee': 9, 'name': 'A', 'tea': 7}, {'coffee': 5, 'name': 'B', 'tea': 16}]
You can try this:
data = [
{'name': 'A', 'tea':5, 'coffee':6},
{'name': 'A', 'tea':2, 'coffee':3},
{'name': 'B', 'tea':7, 'coffee':1},
{'name': 'B', 'tea':9, 'coffee':4},
]
import itertools
final_data = [(a, list(b)) for a, b in itertools.groupby([i.items() for i in data], key=lambda x:dict(x)["name"])]
new_final_data = [{i[0][0]:sum(c[-1] for c in i if isinstance(c[-1], int)) if i[0][0] != "name" else i[0][-1] for i in zip(*b)} for a, b in final_data]
Output:
[{'tea': 7, 'coffee': 9, 'name': 'A'}, {'tea': 16, 'coffee': 5, 'name': 'B'}]
Using pandas, this is pretty easy to do:
import pandas as pd
data = [
{'name': 'A', 'tea':5, 'coffee':6},
{'name': 'A', 'tea':2, 'coffee':3},
{'name': 'B', 'tea':7, 'coffee':1},
{'name': 'B', 'tea':9, 'coffee':4},
]
df = pd.DataFrame(data)
gb = df.groupby(['name']).sum()
gb
      coffee  tea
name
A          9    7
B          5   16
Here's one way to get it into your dict format:
grouped_data = []
for idx in gb.index:
    d = {'name': idx}
    d = {**d, **{col: gb.loc[idx, col] for col in gb}}
    grouped_data.append(d)
grouped_data
Out[15]: [{'coffee': 9, 'name': 'A', 'tea': 7}, {'coffee': 5, 'name': 'B', 'tea': 16}]
But COLDSPEED got the native pandas solution with the as_index=False config...
import pandas as pd
df = pd.DataFrame(data)
# as_index=False keeps 'name' as a column so it appears in each record
df2 = df.groupby('name', as_index=False).sum()
df2.to_dict('records')
Here is a function I wrote; you pass in the key you want to group by:
def group_sum(key, list_of_dicts):
    d = {}
    for dct in list_of_dicts:
        if dct[key] not in d:
            d[dct[key]] = {}
        for k, v in dct.items():
            if k != key:
                if k not in d[dct[key]]:
                    d[dct[key]][k] = v
                else:
                    d[dct[key]][k] += v
    final_list = []
    for k, v in d.items():
        temp_d = {key: k}
        for k2, v2 in v.items():
            temp_d[k2] = v2
        final_list.append(temp_d)
    return final_list
data = [
{'name': 'A', 'tea':5, 'coffee':6},
{'name': 'A', 'tea':2, 'coffee':3},
{'name': 'B', 'tea':7, 'coffee':1},
{'name': 'B', 'tea':9, 'coffee':4},
]
grouped_data = group_sum("name",data)
print(grouped_data)
Result:
[{'coffee': 5, 'name': 'B', 'tea': 16}, {'coffee': 9, 'name': 'A', 'tea': 7}]
I'd guess this is slower than pandas when summing thousands of dicts, but I haven't measured it. The order shown above is arbitrary: insertion order is only preserved if you use an OrderedDict or a Python version whose dicts keep insertion order (guaranteed from 3.7, an implementation detail of CPython 3.6).
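Since the question started from collections.Counter, a standard-library variant can keep one Counter per group. This is only a sketch, assuming every non-key field is numeric; the function name group_and_sum is hypothetical.

```python
from collections import Counter, defaultdict

data = [
    {'name': 'A', 'tea': 5, 'coffee': 6},
    {'name': 'A', 'tea': 2, 'coffee': 3},
    {'name': 'B', 'tea': 7, 'coffee': 1},
    {'name': 'B', 'tea': 9, 'coffee': 4},
]


def group_and_sum(rows, key):
    """Group rows by `key` and sum every other field within each group."""
    totals = defaultdict(Counter)
    for row in rows:
        # Counter.update adds the given counts to the running totals.
        totals[row[key]].update({k: v for k, v in row.items() if k != key})
    return [{key: group, **sums} for group, sums in totals.items()]


grouped_data = group_and_sum(data, 'name')
print(grouped_data)
# [{'name': 'A', 'tea': 7, 'coffee': 9}, {'name': 'B', 'tea': 16, 'coffee': 5}]
```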

How to "sort" a dictionary by number of occurrences of a key?

I have a dictionary of values that gives the number of occurrences of a value in a list. How can I return a new dictionary that divides the former dictionary into separate dictionaries based on the value?
In other words, I want to sort this dictionary:
>>> a = {'A':2, 'B':3, 'C':4, 'D':2, 'E':3}
to this one.
b = {2: {'A', 'D'}, 3: {'B', 'E'}, 4: {'C'}}
How do I approach the problem?
from collections import defaultdict
a = {'A': 2, 'B': 3, 'C': 4, 'D': 2, 'E': 3}
b = defaultdict(set)
for k, v in a.items():
    b[v].add(k)
This is what you'll get:
defaultdict(<class 'set'>, {2: {'D', 'A'}, 3: {'B', 'E'}, 4: {'C'}})
You can convert b to a normal dict afterwards with b = dict(b).
If you are a Python beginner like me, you may want to try this:
a = {'A': 2, 'B': 3, 'C': 4, 'D': 2, 'E': 3}
b = {}
for key in a:
    lst = []
    new_key = a[key]
    if new_key not in b:
        lst.append(key)
        b[new_key] = lst
    else:
        b[new_key].append(key)
print(b)
It relies on lists being mutable: appending to b[new_key] modifies the list stored in the dictionary in place. Note that the values here are lists rather than the sets shown in the question.
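If the values should be sets, as in the question, and you would rather end up with a plain dict without importing defaultdict, dict.setdefault is one option. A minimal sketch:

```python
a = {'A': 2, 'B': 3, 'C': 4, 'D': 2, 'E': 3}

b = {}
for key, count in a.items():
    # setdefault returns the existing set for this count, or stores and returns a new one.
    b.setdefault(count, set()).add(key)

print(b)
```

The trade-off is that a new empty set is built on every iteration even when it is not needed, which defaultdict avoids.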

all possible combinations of dicts based on values inside dicts

I want to generate all possible ways of using dicts, based on the values in them. To explain in code, I have:
a = {'name' : 'a', 'items': 3}
b = {'name' : 'b', 'items': 4}
c = {'name' : 'c', 'items': 5}
I want to be able to pick (say) exactly 7 items from these dicts, and all the possible ways I could do it in.
So:
x = itertools.product(range(a['items']), range(b['items']), range(c['items']))
y = filter(lambda i: sum(i)==7, x)  # itertools.ifilter in Python 2
would give me:
(0, 3, 4)
(1, 2, 4)
(1, 3, 3)
...
What I'd really like is:
({'name' : 'a', 'picked': 0}, {'name': 'b', 'picked': 3}, {'name': 'c', 'picked': 4})
({'name' : 'a', 'picked': 1}, {'name': 'b', 'picked': 2}, {'name': 'c', 'picked': 4})
({'name' : 'a', 'picked': 1}, {'name': 'b', 'picked': 3}, {'name': 'c', 'picked': 3})
....
Any ideas on how to do this, cleanly?
Here it is:
import itertools
import operator
a = {'name' : 'a', 'items': 3}
b = {'name' : 'b', 'items': 4}
c = {'name' : 'c', 'items': 5}
dcts = [a,b,c]
x = itertools.product(range(a['items']), range(b['items']), range(c['items']))
y = filter(lambda i: sum(i)==7, x)  # itertools.ifilter in Python 2
z = (tuple([[dct, operator.setitem(dct, 'picked', vval)][0]
            for dct,vval in zip(dcts, val)]) for val in y)
for zz in z:
    print(zz)
Note that this reuses and mutates the original dicts. If you need a new dict instance on every iteration, change the z line to:
z = (tuple([[dct, operator.setitem(dct, 'picked', vval)][0]
            for dct,vval in zip(map(dict,dcts), val)]) for val in y)
An easy way is to generate new dicts:
names = [x['name'] for x in [a,b,c]]
ziped = map(lambda x: zip(names, x), y)
# In Python 3, map is lazy; wrap in list(...) to materialize the result
maped = map(lambda el: [{'name': name, 'picked': count} for name, count in el],
            ziped)
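The same idea can be written as a single Python 3 generator that builds fresh dicts and leaves the inputs untouched. This is a sketch only; the function name picks is hypothetical, and it keeps the question's use of range(items), which excludes the upper bound.

```python
import itertools

a = {'name': 'a', 'items': 3}
b = {'name': 'b', 'items': 4}
c = {'name': 'c', 'items': 5}


def picks(dicts, total):
    """Yield tuples of new {'name', 'picked'} dicts whose picked counts sum to `total`."""
    ranges = [range(d['items']) for d in dicts]
    for counts in itertools.product(*ranges):
        if sum(counts) == total:
            yield tuple({'name': d['name'], 'picked': n}
                        for d, n in zip(dicts, counts))


for combo in picks([a, b, c], 7):
    print(combo)
# first line printed:
# ({'name': 'a', 'picked': 0}, {'name': 'b', 'picked': 3}, {'name': 'c', 'picked': 4})
```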

How do I map values to values with a common key in Python

In the dictionaries below I want to check whether the value in aa matches the value in bb and produce a mapping of the keys of aa to the keys of bb. Do I need to rearrange the dictionaries? I import the data from a tab separated file, so I am not attached to dictionaries. Note that aa is about 100 times bigger than bb (100k lines for aa), but this is to be run infrequently and offline.
Input:
aa = {1: 'a', 3: 'c', 2 : 'b', 4 : 'd'}
bb = {'apple': 'a', 'pear': 'b', 'mango' : 'g'}
Desired output (or any similar data structure):
dd = {1 : 'apple', 2 : 'pear'}
aa = {1:'a', 3:'c', 2:'b', 4:'d'}
bb = {'apple':'a', 'pear':'b', 'mango': 'g'}
bb_rev = dict((value, key)
              for key, value in bb.items())  # bb.iteritems() in Python 2
dd = dict((key, bb_rev[value])
          for key, value in aa.items()  # aa.iteritems() in Python 2
          if value in bb_rev)
print(dd)
You can do something like this:
>>> aa = {1: 'a', 3: 'c', 2 : 'b', 4 : 'd'}
>>> bb = {'apple': 'a', 'pear': 'b', 'mango' : 'g'}
>>> tmp = {v: k for k, v in bb.items()}
>>> dd = {k: tmp[v] for k, v in aa.items() if v in tmp}
>>> dd
{1: 'apple', 2: 'pear'}
but note that this will only work if each value of the aa dictionary appears as a value of the bb dictionary either once or not at all.
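If several keys of bb can share the same value, one way around that restriction is to reverse bb into lists of keys. The sketch below adds an 'apricot' entry purely for illustration; it is not part of the original data.

```python
from collections import defaultdict

aa = {1: 'a', 3: 'c', 2: 'b', 4: 'd'}
# 'apricot' is a made-up extra entry that shares the value 'a' with 'apple'.
bb = {'apple': 'a', 'pear': 'b', 'mango': 'g', 'apricot': 'a'}

# Reverse bb, collecting every key that maps to a given value.
by_value = defaultdict(list)
for name, value in bb.items():
    by_value[value].append(name)

dd = {key: by_value[value] for key, value in aa.items() if value in by_value}
print(dd)
# {1: ['apple', 'apricot'], 2: ['pear']}
```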
