Get the value from a nested dictionary (Python)

I want to get a list of the values from a nested dictionary.
d = {2.5: {2005: 0.3}, 2.6: {2005: 0.4}, 5.5: {2010: 0.8}, 7.5: {2010: 0.95}}
def get_values_from_nested_dict(dic):
    list_of_values = dic.values()
    l = []
    for i in list_of_values:
        a = i.values()
        l.append(a)
    return l
d1 = get_values_from_nested_dict(d)
print(d1)
My results:
[dict_values([0.3]), dict_values([0.4]), dict_values([0.8]), dict_values([0.95])]
But I want the list to be:
[0.3,0.4,0.8,0.95]

You could simply use a double comprehension (equivalent to a nested loop) over the dict's values:
d = {2.5: {2005: 0.3}, 2.6: {2005: 0.4}, 5.5: {2010: 0.8}, 7.5: {2010: 0.95}}
[y for x in d.values() for y in x.values()]
# [0.3, 0.4, 0.8, 0.95]

You need to iterate again through the values of the internal dictionary and append each of them to the output variable.
def get_values_from_nested_dict(dic):
    l = []
    for outer_value in dic.values():
        for value in outer_value.values():
            l.append(value)
    return l
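The nested loop above can also be written with itertools.chain.from_iterable, which concatenates the inner value views one after another. This is only a sketch of an alternative spelling; the helper name flatten_values is made up for illustration and does not come from the answers.

```python
from itertools import chain

d = {2.5: {2005: 0.3}, 2.6: {2005: 0.4}, 5.5: {2010: 0.8}, 7.5: {2010: 0.95}}

def flatten_values(dic):
    # Hypothetical helper: chain the values of every inner dict into one list.
    return list(chain.from_iterable(inner.values() for inner in dic.values()))

flat = flatten_values(d)
print(flat)  # [0.3, 0.4, 0.8, 0.95]
```

Like the loop, this collects every value of every inner dictionary, not just the first one.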

You can do it like this. Note that this takes only the first value of each inner dictionary:
In [97]: d
Out[97]: {2.5: {2005: 0.3}, 2.6: {2005: 0.4}, 5.5: {2010: 0.8}, 7.5: {2010: 0.95}}
In [98]: list(map(lambda x:list(x.values())[0], d.values()))
Out[98]: [0.3, 0.4, 0.8, 0.95]
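To see where the map/lambda version and the comprehension differ, here is a small sketch with made-up data in which one inner dictionary holds two values (the extra 2006 entry is an assumption added for illustration, not part of the question):

```python
# Made-up data: the first inner dict has two values instead of one.
d = {2.5: {2005: 0.3, 2006: 0.35}, 2.6: {2005: 0.4}}

# Takes only the first value of each inner dict.
first_only = list(map(lambda x: list(x.values())[0], d.values()))

# Collects every value of every inner dict.
all_values = [y for x in d.values() for y in x.values()]

print(first_only)  # [0.3, 0.4]
print(all_values)  # [0.3, 0.35, 0.4]
```

With the data from the question, where each inner dictionary has exactly one entry, both give the same list.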

Related

Fill a python dictionary with values from a pandas dataFrame

This is my dictionary, called "reviews":
reviews= {1: {'like', 'the', 'acting'},
2: {'hate', 'plot', 'story'}}
And this is my "lexicon" dataFrame:
import pandas as pd
lexicon = {'word': ['like', 'movie', 'hate'],
'neg': [0.0005, 0.0014, 0.0029],
'pos': [0.0025, 0.0019, 0.0002]
}
lexicon = pd.DataFrame(lexicon, columns = ['word', 'neg','pos'])
print (lexicon)
I need to fill my "reviews" dictionary with the neg and pos values from the "lexicon" dataFrame.
If there is no value in the lexicon, then I want to put 0.5
To finally get this outcome:
reviews= {1: {'like': [0.0005, 0.0025], 'the': [0.5, 0.5], 'acting': [0.5, 0.5]},
2: {'plot': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'story': [0.5, 0.5]}}
You can use df.reindex here.
df_ = lexicon.set_index("word").agg(list, axis=1)
out = {k: df_.reindex(v, fill_value=[0.5, 0.5]).to_dict() for k, v in reviews.items()}
# {1: {'the': [0.5, 0.5], 'like': [0.0005, 0.0025], 'acting': [0.5, 0.5]},
# 2: {'story': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'plot': [0.5, 0.5]}}
Create a dictionary from lexicon, then use a nested dictionary comprehension that looks up each word with dict.get, which supplies a default value when there is no match:
d = lexicon.set_index('word').agg(list, axis=1).to_dict()
print (d)
{'like': [0.0005, 0.0025], 'movie': [0.0014, 0.0019], 'hate': [0.0029, 0.0002]}
out = {k: {x: d.get(x, [0.5,0.5]) for x in v} for k, v in reviews.items()}
print (out)
{1: {'like': [0.0005, 0.0025], 'the': [0.5, 0.5], 'acting': [0.5, 0.5]},
2: {'story': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'plot': [0.5, 0.5]}}
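To show what the dict.get lookup does on its own, here is a minimal sketch of the same idea without pandas. It assumes the lexicon columns are available as plain lists; the names words, neg, pos, lookup and default are made up for illustration.

```python
# Assumption: the lexicon columns as plain lists instead of a DataFrame.
words = ['like', 'movie', 'hate']
neg = [0.0005, 0.0014, 0.0029]
pos = [0.0025, 0.0019, 0.0002]

reviews = {1: {'like', 'the', 'acting'},
           2: {'hate', 'plot', 'story'}}

# word -> [neg, pos]
lookup = {w: [n, p] for w, n, p in zip(words, neg, pos)}
default = [0.5, 0.5]

# For each review, map every word to its scores, or to a copy of the default.
out = {review_id: {w: lookup.get(w, list(default)) for w in review_words}
       for review_id, review_words in reviews.items()}

print(out[1]['like'])  # [0.0005, 0.0025]
print(out[1]['the'])   # [0.5, 0.5]
```

Because the reviews are sets, the order of the words inside each inner dictionary is not fixed. Copying the default with list(default) keeps the entries from sharing one list object.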

Add nested dictionaries on matching keys

I have a nested dictionary, such as:
{'A1': {'T1': [1, 3.0, 3, 4.0], 'T2': [2, 2.0]}, 'A2': {'T1': [1, 0.0, 3, 5.0], 'T2': [2, 3.0]}}
What I want to do is sum the sub-dictionaries key by key (T1 of A1 with T1 of A2, T2 of A1 with T2 of A2), ignoring the first entry of each list, to obtain this:
[3.0, 5.0, 9.0] <<<< output
That is, for positions 1, 2 and 3: 3.0 + 0.0 = 3.0, 2.0 + 3.0 = 5.0 and 5.0 + 4.0 = 9.0.
How can I do this? I tried a for loop, but I ended up with a big mess.
One way is to build a collections.Counter for each inner dictionary in a list comprehension, and sum the resulting Counter objects:
from collections import Counter
d = {'A1': {'T1': 3.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}
# a list rather than a generator, so it can be summed more than once
l = [Counter(i) for i in d.values()]
sum(l, Counter())
# Counter({'T2': 5.0, 'T1': 3.0})
For sum to work here, I've passed an empty Counter() as the start argument; otherwise sum would start from the integer 0, and adding 0 to a Counter raises a TypeError.
To get only the values, you can do:
sum(l, Counter()).values()
# dict_values([3.0, 5.0])
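One detail worth knowing about the Counter approach: adding Counter objects with + drops any key whose total is zero or negative, while Counter.update keeps it. The sketch below uses made-up data in which T1 sums to zero to show the difference; the variable names added and updated are just for illustration.

```python
from collections import Counter

# Made-up data: the T1 values add up to zero.
d = {'A1': {'T1': 0.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}

# Summing with + discards keys whose total is not positive.
added = sum((Counter(inner) for inner in d.values()), Counter())

# update() adds the values in place and keeps zero totals.
updated = Counter()
for inner in d.values():
    updated.update(inner)

print(dict(added))    # {'T2': 5.0}
print(dict(updated))  # {'T1': 0.0, 'T2': 5.0}
```

If zero or negative sums can occur in your data and should be kept, the update loop may be the safer choice.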
You could use a list comprehension with zip:
d = {'A1': {'T1': 3.0, 'T2': 2.0}, 'A2': {'T1': 0.0, 'T2': 3.0}}
[sum(e) for e in zip(*(e.values() for e in d.values()))]
output:
[3.0, 5.0]
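This relies on every inner dictionary having the same keys in the same order. Dictionaries preserve insertion order in CPython 3.6 (as an implementation detail) and in every implementation from Python 3.7 on.
Alternatively, you can use two for loops:
r = {}
for dv in d.values():
    for k, v in dv.items():
        r.setdefault(k, []).append(v)
result = [sum(v) for v in r.values()]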
print(result)
output:
[3.0, 5.0]
After your edit, you could use:
from itertools import zip_longest
sum_t1, sum_t2 = list(list(map(sum, zip(*t))) for t in zip(*[e.values() for e in d.values()]))
[i for t in zip_longest(sum_t1[1:], sum_t2[1:]) for i in t if i is not None]
output:
[3.0, 5.0, 6, 9.0]
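The output above still contains 6, which is the sum of the two 3 entries, so it does not match the [3.0, 5.0, 9.0] asked for in the question. One possible reading of the question, and this is an assumption rather than something the asker confirmed, is that each list alternates a position and a value (position 1 holds 3.0, position 3 holds 4.0, and so on). Under that assumption, a sketch that sums the values per position could look like this:

```python
from collections import defaultdict

d = {'A1': {'T1': [1, 3.0, 3, 4.0], 'T2': [2, 2.0]},
     'A2': {'T1': [1, 0.0, 3, 5.0], 'T2': [2, 3.0]}}

totals = defaultdict(float)
for inner in d.values():
    for pairs in inner.values():
        # Assumption: the list is [position, value, position, value, ...].
        for position, value in zip(pairs[0::2], pairs[1::2]):
            totals[position] += value

# Report the sums in position order.
result = [totals[position] for position in sorted(totals)]
print(result)  # [3.0, 5.0, 9.0]
```

If the lists are laid out differently, the slicing in the inner loop would need to change accordingly.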

Convert redundant array to dict (or JSON)?

Suppose I have an array:
[['a', 10, 1, 0.1],
['a', 10, 2, 0.2],
['a', 20, 2, 0.3],
['b', 10, 1, 0.4],
['b', 20, 2, 0.5]]
And I want a dict (or JSON):
{
'a': {
10: {1: 0.1, 2: 0.2},
20: {2: 0.3}
},
'b': {
10: {1: 0.4},
20: {2: 0.5}
}
}
Is there any good way or some library for this task?
In this example the array is just 4-column, but my original array is more complicated (7-column).
Currently I implement this naively:
import pandas as pd
df = pd.DataFrame(array)
grouped1 = df.groupby('column1')
for column1 in grouped1.groups:
    group1 = grouped1.get_group(column1)
    grouped2 = group1.groupby('column2')
    for column2 in grouped2.groups:
        group2 = grouped2.get_group(column2)
        ...
And the defaultdict way:
d = defaultdict(lambda x: defaultdict(lambda y: defaultdict ... ))
for row in array:
    d[row[0]][row[1]][row[2]... = row[-1]
But I think neither is smart.
I would suggest this rather simple solution:
from functools import reduce
data = [['a', 10, 1, 0.1],
['a', 10, 2, 0.2],
['a', 20, 2, 0.3],
['b', 10, 1, 0.4],
['b', 20, 2, 0.5]]
result = dict()
for row in data:
    reduce(lambda v, k: v.setdefault(k, {}), row[:-2], result)[row[-2]] = row[-1]
print(result)
{'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
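The reduce one-liner packs several steps into one expression. As a rough sketch, the same walk can be spelled out with an explicit loop: follow (or create) one nested dictionary per leading column, then store the last column under the second-to-last one. The helper name nest_rows is made up for illustration.

```python
def nest_rows(rows):
    # Hypothetical helper: build nested dicts from rows of keys plus a final value.
    result = {}
    for row in rows:
        # All but the last two columns form the path; then one key and one value.
        *path, last_key, value = row
        node = result
        for key in path:
            # Descend into the existing dict for this key, or create it.
            node = node.setdefault(key, {})
        node[last_key] = value
    return result

data = [['a', 10, 1, 0.1],
        ['a', 10, 2, 0.2],
        ['a', 20, 2, 0.3],
        ['b', 10, 1, 0.4],
        ['b', 20, 2, 0.5]]

nested = nest_rows(data)
print(nested)
# {'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
```

This works for any number of columns as long as each row has at least two. If the result is later serialized with json.dumps, integer keys such as 10 come out as the strings "10", because JSON object keys are always strings.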
An actual recursive solution would be something like this:
def add_to_group(keys: list, group: dict):
    if len(keys) == 2:
        group[keys[0]] = keys[1]
    else:
        add_to_group(keys[1:], group.setdefault(keys[0], dict()))
result = dict()
for row in data:
    add_to_group(row, result)
print(result)
Introduction
Here is a recursive solution. The base case is when you have a list of 2-element lists (or tuples), in which case, the dict will do what we want:
>>> dict([(1, 0.1), (2, 0.2)])
{1: 0.1, 2: 0.2}
For other cases, we will remove the first column and recurse down until we get to the base case.
The code:
from itertools import groupby
def rows2dict(rows):
    if len(rows[0]) == 2:
        # e.g. [(1, 0.1), (2, 0.2)] ==> {1: 0.1, 2: 0.2}
        return dict(rows)
    else:
        dict_object = dict()
        for column1, grouped_rows in groupby(rows, lambda x: x[0]):
            rows_without_first_column = [x[1:] for x in grouped_rows]
            dict_object[column1] = rows2dict(rows_without_first_column)
        return dict_object
if __name__ == '__main__':
    rows = [['a', 10, 1, 0.1],
            ['a', 10, 2, 0.2],
            ['a', 20, 2, 0.3],
            ['b', 10, 1, 0.4],
            ['b', 20, 2, 0.5]]
    dict_object = rows2dict(rows)
    print(dict_object)
Output
{'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
Notes
We use itertools.groupby to group rows that share the same first column. Note that groupby only groups consecutive rows, so rows with equal keys must be adjacent at every level, as they are in the example.
For each group of rows, we remove the first column and recurse down.
This solution assumes that the rows variable has 2 or more columns. The result is unpredictable for rows which have 0 or 1 column.
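Since itertools.groupby only merges neighbouring rows, unsorted input would split one key into several groups, and later groups would overwrite earlier ones. A possible variation, sketched here under the assumption that the values in each column can be compared with each other, sorts by the first column at every level before grouping. The name rows_to_dict_sorted is made up for illustration.

```python
from itertools import groupby

def rows_to_dict_sorted(rows):
    # Hypothetical variant of the recursive approach that sorts before grouping.
    if len(rows[0]) == 2:
        return dict(rows)
    result = {}
    # Sort so that rows sharing a first column become adjacent.
    ordered = sorted(rows, key=lambda row: row[0])
    for key, group in groupby(ordered, key=lambda row: row[0]):
        result[key] = rows_to_dict_sorted([row[1:] for row in group])
    return result

# The same rows as in the example, deliberately shuffled.
unsorted_rows = [['b', 20, 2, 0.5],
                 ['a', 10, 1, 0.1],
                 ['b', 10, 1, 0.4],
                 ['a', 20, 2, 0.3],
                 ['a', 10, 2, 0.2]]

nested_sorted = rows_to_dict_sorted(unsorted_rows)
print(nested_sorted)
# {'a': {10: {1: 0.1, 2: 0.2}, 20: {2: 0.3}}, 'b': {10: {1: 0.4}, 20: {2: 0.5}}}
```

Sorting costs extra time and changes the key order of the result, so it is only worth doing when the input order is not already guaranteed.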

add list value of a key in a dictionary python

I have the following dictionary:
centroid = {'A': [1.0, 1.0], 'B': [2.0, 1.0]}
Using the above dictionary I am creating two different dictionaries and appending them to a list:
for key in centroid:
    cluster_list.append(dict(zip(key, centroid.get(key))))
However, when I check my cluster_list I get the following data:
[{'A': 1.0}, {'B': 2.0}]
instead of
[{'A': [1.0, 1.0]}, {'B': [2.0, 1.0]}].
How can I fix this?
You can use a list comprehension:
For Python 2:
cluster_list = [{k: v} for k, v in centroid.iteritems()]
# [{'A': [1.0, 1.0]}, {'B': [2.0, 1.0]}]
For Python 3:
cluster_list = [{k: v} for k, v in centroid.items()]
You can also use starmap from itertools module.
In [1]: from itertools import starmap
In [2]: list(starmap(lambda k,v: {k:v}, centroid.items()))
Out[2]: [{'B': [2.0, 1.0]}, {'A': [1.0, 1.0]}]
Note that on Python versions before 3.7, dictionaries do not guarantee iteration order, so the order of the resulting list may vary.
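It may help to see why the original loop produced {'A': 1.0}. zip pairs items from two iterables, and a one-character string such as 'A' is an iterable with a single item, so only the first element of the list gets paired. The sketch below reproduces that and then builds the intended list; the variable names pairs and cluster_list are chosen for illustration.

```python
centroid = {'A': [1.0, 1.0], 'B': [2.0, 1.0]}

# zip stops at the shorter iterable: the string 'A' has only one character.
key = 'A'
pairs = list(zip(key, centroid[key]))
print(pairs)  # [('A', 1.0)]

# Building each single-entry dict directly keeps the whole list as the value.
cluster_list = [{k: v} for k, v in centroid.items()]
print(cluster_list)  # [{'A': [1.0, 1.0]}, {'B': [2.0, 1.0]}]
```

With a longer key such as 'AB', zip would pair 'A' with the first number and 'B' with the second, which is equally unintended.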

Python: Count elements on Counter output

I have a nested list as:
List1 = [[A,B,A,A],[C,C,B,B],[A,C,B,B]]..... so on
I used the Counter class to count the elements in each nested list:
for i,j in enumerate(List1):
    print(Counter(j))
I got following output as:
Counter({'A': 3, 'B': 1})
Counter({'C': 2, 'B': 2})
Counter({'B': 2, 'A': 1, 'C': 1})
....
I want to calculate percentage of A in Counter output:
A = number of A's / total number of elements
For example:
Counter({'A': 3, 'B': 1})
Would yield:
A = 3/4 = 0.75
I am not able to calculate A. Can anyone kindly help me with this?
The following would give you a list of dictionaries holding both the counts and the percentages for each entry:
from collections import Counter
List1 = [['A','B','A','A'],['C','C','B','B'],['A','C','B','B']]
counts = [Counter(x) for x in List1]
percentages = [{k : (v, v / float(len(l1))) for k,v in cc.items()} for l1, cc in zip(List1, counts)]
print(percentages)
Giving the following output:
[{'A': (3, 0.75), 'B': (1, 0.25)}, {'C': (2, 0.5), 'B': (2, 0.5)}, {'A': (1, 0.25), 'C': (1, 0.25), 'B': (2, 0.5)}]
For just the percentages:
List1 = [['A','B','A','A'],['C','C','B','B'],['A','C','B','B']]
counts = [Counter(x) for x in List1]
percentages = [{k : v / float(len(l1)) for k,v in cc.items()} for l1, cc in zip(List1, counts)]
print(percentages)
Giving:
[{'A': 0.75, 'B': 0.25}, {'C': 0.5, 'B': 0.5}, {'A': 0.25, 'C': 0.25, 'B': 0.5}]
A one-line alternative using list.count:
In [1]: l = [['A','B','A','A'],['C','C','B','B'],['A','C','B','B']]
In [2]: [{i: x.count(i)/float(len(x)) for i in x} for x in l]
Out[2]:
[{'A': 0.75, 'B': 0.25},
{'B': 0.5, 'C': 0.5},
{'A': 0.25, 'B': 0.5, 'C': 0.25}]
>>> for sublist in List1:
    c = Counter(sublist)
    print(c['A'] / sum(c.values()))
0.75
0.0
0.25
All values at once:
>>> for sublist in List1:
    c = Counter(sublist)
    s = sum(c.values())
    print(c['A'] / s, c['B'] / s, c['C'] / s)
0.75 0.25 0.0
0.0 0.5 0.5
0.25 0.5 0.25
If you want to get a list of all items in a sublist with their respective percentages, you need to iterate the counter:
>>> for sublist in List1:
    c = Counter(sublist)
    s = sum(c.values())
    for elem, count in c.items():
        print(elem, count / s)
    print()
A 0.75
B 0.25
B 0.5
C 0.5
A 0.25
B 0.5
C 0.25
Or use a dictionary comprehension:
>>> for sublist in List1:
    c = Counter(sublist)
    s = sum(c.values())
    print({ elem: count / s for elem, count in c.items() })
{'A': 0.75, 'B': 0.25}
{'B': 0.5, 'C': 0.5}
{'A': 0.25, 'B': 0.5, 'C': 0.25}
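The loops above divide by the number of elements in each sublist, which fails with a ZeroDivisionError if a sublist is empty. A small sketch that wraps the calculation in a function and guards against that case is shown here; the name fraction_of and the choice to return 0.0 for an empty list are assumptions made for illustration. It relies on Python 3 division.

```python
from collections import Counter

def fraction_of(items, target):
    # Hypothetical helper: share of `target` among `items`, 0.0 if `items` is empty.
    counts = Counter(items)
    total = sum(counts.values())
    return counts[target] / total if total else 0.0

# The example data plus an empty sublist to exercise the guard.
List1 = [['A', 'B', 'A', 'A'], ['C', 'C', 'B', 'B'], ['A', 'C', 'B', 'B'], []]

fractions = [fraction_of(sublist, 'A') for sublist in List1]
print(fractions)  # [0.75, 0.0, 0.25, 0.0]
```

A Counter returns 0 for a missing key, so sublists without any 'A' need no special handling.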
You can use a list comprehension and the join method to turn your list of lists of characters into a list of strings.
>>> List1 = [['A', 'B', 'A', 'A'],['C', 'C', 'B', 'B'],['A', 'C', 'B', 'B']]
>>> [''.join(x) for x in List1]
['ABAA', 'CCBB', 'ACBB']
Then join the list again into a single string.
>>> ''.join(['ABAA', 'CCBB', 'ACBB'])
'ABAACCBBACBB'
And count the 'A' symbol, or any other.
>>> 'ABAACCBBACBB'.count('A')
4
This could be a one-liner solution:
>>> ''.join(''.join(x) for x in List1).count('A')
4
A string of symbols is an iterable type, just like a list. A list of strings is often more convenient than a list of lists of characters.
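The join approach gives the count of 'A' across all sublists together, and it only works when every item is a single-character string. As a hedged sketch, the same overall count can be obtained for any hashable items by chaining the sublists into one Counter, which also makes it easy to turn the count into a fraction. The names overall, total and share are made up for illustration.

```python
from collections import Counter
from itertools import chain

List1 = [['A', 'B', 'A', 'A'], ['C', 'C', 'B', 'B'], ['A', 'C', 'B', 'B']]

# Count every item across all sublists at once.
overall = Counter(chain.from_iterable(List1))
total = sum(overall.values())

print(overall['A'], total)  # 4 12

# Fraction of 'A' over the whole nested list (4 out of 12).
share = overall['A'] / total
```

This answers a slightly different question from the per-sublist percentages: it treats the nested list as one flat collection.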
