Python: Summing values nested inside different dictionaries in a nested dictionary

I have a nested dictionary called "high_low_teams_in_profile" which looks like this:
{
m_profile1:
{
team_size1:
{
low: 1,
high: 1
},
team_size2:
{
low: 1,
high: 1
}
},
m_profile2:
{
team_size1:
{
low: 1,
high: 1
},
team_size2:
{
low: 1,
high: 1
}
}
}
And I want to get {m_profile1: 4, m_profile2: 4}
What is the most elegant way to do it in Python?
Right now I have the following:
new_num_teams_in_profile = {}
for profile in high_low_teams_in_profile:
    new_num_teams_in_profile[profile] = dict((team_size, sum(high_low_teams_in_profile[profile][team_size].values())) for team_size in high_low_teams_in_profile[profile])
new_num_teams_in_profile = dict((profile, sum(new_num_teams_in_profile[profile].values())) for profile in new_num_teams_in_profile)

I'm not sure if I'd say it's the most Pythonic, but it's the most functional:
p = high_low_teams_in_profile
{prof: sum(p[prof][team][hl]
           for team in p[prof]
           for hl in p[prof][team])
 for prof in p}
The argument of sum is a generator expression, and the outer {prof: sum(...) for prof in p} is a dictionary comprehension.

While this may not be the most Pythonic, the following code should work and is more readable than your original version. Note the iteritems() method, which gives access to both the keys and values of the dict, while itervalues(), as the name suggests, iterates only over the values. Both are Python 2 methods; in Python 3 the equivalents are items() and values().
final = {}
for key, sizes in high_low_teams_in_profile.iteritems():
    total = 0
    for value in sizes.itervalues():
        s = sum(value.itervalues())
        total += s
    final[key] = total
print final
In addition, you could use the following. It takes fewer lines but is slightly harder to read.
final = {}
for key, sizes in high_low_teams_in_profile.iteritems():
    total = sum(sum(value.itervalues()) for value in sizes.itervalues())
    final[key] = total
print final
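The loops above rely on Python 2 methods and the print statement. A minimal sketch of the same idea for Python 3, assuming every innermost value is a number; the function name sum_per_profile is made up for illustration:

```python
def sum_per_profile(nested):
    """Sum every innermost value under each top-level key."""
    return {
        profile: sum(sum(counts.values()) for counts in sizes.values())
        for profile, sizes in nested.items()
    }


high_low_teams_in_profile = {
    "m_profile1": {
        "team_size1": {"low": 1, "high": 1},
        "team_size2": {"low": 1, "high": 1},
    },
    "m_profile2": {
        "team_size1": {"low": 1, "high": 1},
        "team_size2": {"low": 1, "high": 1},
    },
}

print(sum_per_profile(high_low_teams_in_profile))
# {'m_profile1': 4, 'm_profile2': 4}
```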

Related

How to find the depth of a dictionary that contains a list of dictionaries?

I would like to know the depth of a dict that contains a list of dicts. I wrote some simple code, but the problem is that it increments the depth counter at each step.
This is the input that I have as an example:
respons = {
"root":{
"Flow":[{
"Name":"BSB1",
"Output":[{
"Name":"BSB2",
"Output":[{
"Name":"BSB5",
"Output":[{
"Name":"BSB6",
"Output":[{
"Name":"BSB8",
"Output":[]
}]
},
{
"Name":"BSB7",
"Output":[]
}]
}]
},
{
"Name":"BSB3",
"Output":[{
"Name":"BSB4",
"Output":[]
}]
}]
}]
}
}
def calculate_depth(flow, depth):
    depth += 1
    md = []
    if flow['Output']:
        for o in flow['Output']:
            print(o['BusinessUnit'])
            md.append(calculate_depth(o, depth))
        print(max(md))
        print(md)
        return max(md)
    else:
        return depth
print(calculate_depth(respons['root']['Flow'][0], 0))
Normally I want the depth of the longest branch of this dict, not to go through all of the branches and increment at each step.
EDIT
The desired outcome for this structure is: 5
Why?
It is the longest branch: BSB1 => BSB2 => BSB5 => BSB6 => BSB8
What the depth of this structure is remains debatable. Your code (and the way you indent the data structure) seems to suggest that you don't want to count the intermediate lists as adding a level to a path. Yet if you wanted to access deep data you would write
respons['root']['Flow'][0]['Output'][0]['Output'][0]
#                      ^^^          ^^^          ^^^ ...not a level?
And taking this to the leaves of this tree: is the deepest [] a level?
Here is code that only counts dicts as adding to the level, and only when they are not empty:
def calculate_depth(thing):
    if isinstance(thing, list) and len(thing):
        return 0 + max(calculate_depth(item) for item in thing)
    if isinstance(thing, dict) and len(thing):
        return 1 + max(calculate_depth(item) for item in thing.values())
    return 0
This prints 5 for the example data:
print(calculate_depth(respons['root']['Flow'][0]))
Adapt to your need.
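One possible adaptation, sketched under the assumption that every node is a dict with a "Name" and an "Output" list (as in the question's data) and that only those nodes should count as levels. The name flow_depth and the shortened sample tree are hypothetical:

```python
def flow_depth(node):
    """Count the nodes on the longest chain that follows 'Output' lists."""
    children = node.get("Output", [])
    # max(..., default=0) handles a leaf whose 'Output' list is empty.
    return 1 + max((flow_depth(child) for child in children), default=0)


# Hypothetical, shortened flow: A -> B -> D is the longest branch.
sample = {
    "Name": "A",
    "Output": [
        {"Name": "B", "Output": [{"Name": "D", "Output": []}]},
        {"Name": "C", "Output": []},
    ],
}

print(flow_depth(sample))  # 3
```

Unlike the general version, this ignores any other keys a node might carry, so it only fits data shaped like the example.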

How to combine every nth dict element in python list?

Input:
list1 = [
{
"dict_a":"dict_a_values"
},
{
"dict_b":"dict_b_values"
},
{
"dict_c":"dict_c_values"
},
{
"dict_d":"dict_d_values"
}
]
Assuming n=2, every two elements have to be combined together.
Output:
list1 = [
{
"dict_a":"dict_a_values",
"dict_c":"dict_c_values"
},
{
"dict_b":"dict_b_values",
"dict_d":"dict_d_values"
}
]
Ideally, it'd be nicer if the output could look like something as follows with an extra layer of nesting:
[
{"dict_combined_ac": {
"dict_a":"dict_a_values",
"dict_c":"dict_c_values"
}},
{"dict_combined_bd": {
"dict_b":"dict_b_values",
"dict_d":"dict_d_values"
}}
]
But since this is really difficult to implement, I'd be more than satisfied with an output looking something similar to the first example. Thanks in advance!
What I've tried so far:
[ ''.join(x) for x in zip(list1[0::2], list1[1::2]) ]
However, I know this doesn't work because I'm working with dict elements and not str elements, and when wrapping the lists with str(), every two letters are combined instead. I'm also unsure of how I can adjust this to work for every n elements instead of just 2.
Given the original list, as in the question, the following should generate the required output:
result_list = list()
n = 2  # combine every n-th element (the step between combined indices)
seen_idx = set()
for i in range(len(list1)):  # iterate over all indices
    if i not in seen_idx:
        curr_idx_list = list()  # current partition
        for j in range(i, len(list1), n):  # generate indices for a combination partition
            seen_idx.add(j)  # keep record of seen indices
            curr_idx_list.append(j)  # store indices for current partition
        # At this point we have indices of a partition, now combine
        temp_dict = dict()  # temporary dictionary where we store combined values
        for j in curr_idx_list:  # iterate over indices of current partition
            temp_dict.update(list1[j])
        result_list.append(temp_dict)  # add to result list
print(result_list, '\n')
# Bonus: change result list into list of nested dictionaries
new_res_list = list()
for elem in result_list:  # for each (combined) dictionary in the list, we make new keys
    key_names = list(elem.keys())
    key_names = [e.split('_')[1] for e in key_names]
    new_key = 'dict_combined_' + ''.join(key_names)
    temp_dict = {new_key: elem}
    new_res_list.append(temp_dict)
print(new_res_list, '\n')
The output is as follows:
[{'dict_a': 'dict_a_values', 'dict_c': 'dict_c_values'}, {'dict_b': 'dict_b_values', 'dict_d': 'dict_d_values'}]
[{'dict_combined_ac': {'dict_a': 'dict_a_values', 'dict_c': 'dict_c_values'}}, {'dict_combined_bd': {'dict_b': 'dict_b_values', 'dict_d': 'dict_d_values'}}]
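The index bookkeeping above can also be expressed with extended slicing, since list1[start::n] already selects every n-th element. This is a shorter sketch of the same grouping, not the answer's own code; the function name combine_every_nth is made up for illustration:

```python
def combine_every_nth(dicts, n):
    """Merge dicts[0::n], dicts[1::n], ... into one dict per starting offset."""
    combined = []
    for start in range(min(n, len(dicts))):
        merged = {}
        for d in dicts[start::n]:  # every n-th dict, beginning at `start`
            merged.update(d)       # later keys overwrite earlier duplicates
        combined.append(merged)
    return combined


list1 = [
    {"dict_a": "dict_a_values"},
    {"dict_b": "dict_b_values"},
    {"dict_c": "dict_c_values"},
    {"dict_d": "dict_d_values"},
]

print(combine_every_nth(list1, 2))
# [{'dict_a': 'dict_a_values', 'dict_c': 'dict_c_values'},
#  {'dict_b': 'dict_b_values', 'dict_d': 'dict_d_values'}]
```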

How to compare differently nested dictionaries and lists in Python and find intersection?

I now have two (more or less complex) lists / dictionaries. The first one contains image names and the image pixel colors in hex. So it looks like this:
{
0: {'hex': ['#c3d6db', '#c7ccc0', '#9a8f6a', '#8a8e3e'], 'filename': 'imag0'},
1: {'hex': ['#705b3c', '#6a5639', '#442f1e', '#4a3d28'], 'filename': 'img-xyz'},
…
}
So in this case I would have 2 images 2 x 2 px.
The second dictionary contains a lot of hex-values as keys and an id as value. It looks like:
{'#b0a7aa': '9976', '#595f5b': '19367', '#9a8f6a': '24095'…}
Now what I would like to do is to look if there is a color-value from my images (first list) that matches with one of the second list. If so, then I would like to know the filename from the first list and the value, the id, of the matched key in the second list.
How could I achieve this?
Use dictionary view objects to produce an intersection between your hex lists and the hex-id dictionary:
for entry in images.values():
for key in hexidmap.keys() & entry['hex']:
print('{} {} {}'.format(entry['filename'], key, hexidmap[key]))
& produces the intersection between the key set and your list of hex values.
The above assumes you are using Python 3; if you are using Python 2 instead, use dict.viewkeys() instead of .keys().
Demo:
>>> images = {
... 0: {'hex': ['#c3d6db', '#c7ccc0', '#9a8f6a', '#8a8e3e'], 'filename': 'imag0'},
... 1: {'hex': ['#705b3c', '#6a5639', '#442f1e', '#4a3d28'], 'filename': 'img-xyz'},
... }
>>> hexidmap = {'#b0a7aa': '9976', '#595f5b': '19367', '#9a8f6a': '24095'}
>>> for entry in images.values():
... for key in hexidmap.keys() & entry['hex']:
... print('{} {} {}'.format(entry['filename'], key, hexidmap[key]))
...
imag0 #9a8f6a 24095
Here d1 is the image dictionary and d2 is the hex-to-id dictionary (Python 2):
for index in d1:
    print [(d1[index]["filename"], d2[i], i) for i in d1[index]["hex"] if i in d2]
>>> [('imag0', '24095', '#9a8f6a')]
[]
dicta = {
0: {'hex': ['#c3d6db', '#c7ccc0', '#9a8f6a', '#8a8e3e'], 'filename': 'imag0'},
1: {'hex': ['#705b3c', '#6a5639', '#442f1e', '#4a3d28'], 'filename': 'img-xyz'},
}
dictb = {'#c3d6db': '9976', '#595f5b': '19367', '#9a8f6a': '24095'}
intersection = {}
for o in dicta.values():
    intersect = list(set(o['hex']) & set(dictb.keys()))
    intersection[o['filename']] = intersect if intersect else "No intersection"
print(intersection)
>>> {'imag0': ['#c3d6db', '#9a8f6a'], 'img-xyz': 'No intersection'}
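The question asks for both the filename and the id of each matched color. A minimal Python 3 sketch that collects these into one dictionary, using the same key-view intersection as the first answer; the function name match_colors and the result layout are assumptions:

```python
def match_colors(images, hex_to_id):
    """Map each filename to {hex: id} for the colors found in hex_to_id."""
    matches = {}
    for entry in images.values():
        # A dict key view supports set intersection with any iterable.
        common = hex_to_id.keys() & entry["hex"]
        if common:
            matches[entry["filename"]] = {h: hex_to_id[h] for h in sorted(common)}
    return matches


images = {
    0: {"hex": ["#c3d6db", "#c7ccc0", "#9a8f6a", "#8a8e3e"], "filename": "imag0"},
    1: {"hex": ["#705b3c", "#6a5639", "#442f1e", "#4a3d28"], "filename": "img-xyz"},
}
hex_to_id = {"#b0a7aa": "9976", "#595f5b": "19367", "#9a8f6a": "24095"}

print(match_colors(images, hex_to_id))
# {'imag0': {'#9a8f6a': '24095'}}
```

Images without any matching color are simply left out of the result.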

Accessing the values of a key

I have a dictionary like:
Data = {
"weight_factors" : {
"parameter1" : 10,
"parameter2" : 30,
"parameter3" : 30
},
"other_info" : {
}
}
I want to get the sum of all values that are under the key "weight_factors":
sum = Data["weight_factors"]["parameter1"] +
Data["weight_factors"]["parameter2"] +
Data["weight_factors"]["parameter3"]
Currently, in order to avoid entering Data["weight_factors"] repeatedly, I use the following commands:
d = Data["weight_factors"]
d["parameter1"] + d["parameter2"] + d["parameter3"]
But, I guess there should be an operator that does the same thing, without storing Data["weight_factors"] as an intermediate variable. I was wondering if such a command or an operator exists.
Data["weight_factors"]<unknown operator>(["parameter1"] +
["parameter2"] +
...
["parametern"])<unknown operator>
EDIT:
In the example given above, it was just a sum operation. But it could for example be:
Data["weight_factors"]["parameter1"] * Data["weight_factors"]["parameter2"] + Data["weight_factors"]["parameter3"]
But I do not want to enter Data["weight_factors"] repeatedly. That's what I am searching for... I don't know whether such an operator exists. (In MATLAB, there exists such a thing for cell structures.)
No, that kind of operator does not exist for the built-in dict type.
I suppose you could make your own dict type that inherited from dict and overloaded an operator:
class MyDict(dict):
    def __add__(self, other):
        """Overload the + operator."""
        ...
but that is somewhat inefficient and not very good for readability.
If you just want to sum the values, you can use sum and dict.values (dict.itervalues if you are using Python 2.x):
>>> Data = {
... "weight_factors" : {
... "parameter1" : 10,
... "parameter2" : 30,
... "parameter3" : 30
... },
... "other_info" : {
... }
... }
>>> sum(Data["weight_factors"].values())
70
>>>
Otherwise, I would just use what you have now:
d = Data["weight_factors"]
myvar = d["parameter1"] * d["parameter2"] + d["parameter3"]
It is about as clean and efficient as you can get.
For a general solution to repeatedly get the same item from a mapping or index, I suggest the operator module's itemgetter:
>>> import operator
>>> Data = {
...     "weight_factors" : {
...         "parameter1" : 10,
...         "parameter2" : 30,
...         "parameter3" : 30
...     },
...     "other_info" : {
...     }
... }
Now create our easy getter:
>>> get = operator.itemgetter('weight_factors')
And call it on the object whenever you want your sub-dict:
>>> get(Data)['parameter1']
returns:
10
and
>>> sum(get(Data).values())
returns
70
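itemgetter also accepts several keys at once and then returns a tuple, which covers the non-sum case from the question's edit without repeating Data["weight_factors"]. A small sketch; the variable names p1, p2, p3 and result are made up for illustration:

```python
from operator import itemgetter

Data = {
    "weight_factors": {
        "parameter1": 10,
        "parameter2": 30,
        "parameter3": 30,
    },
    "other_info": {},
}

# itemgetter with several keys returns a tuple of the looked-up values.
get_params = itemgetter("parameter1", "parameter2", "parameter3")
p1, p2, p3 = get_params(Data["weight_factors"])

result = p1 * p2 + p3
print(result)  # 330
```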
If this is just "how do I access a dict's values easily and repeatedly?" you should just assign them like this, and you can reuse them again and again.
In Python 2:
vals = Data['weight_factors'].values()
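In Python 3, values() returns a view object rather than a list. A view can be iterated more than once, but it tracks later changes to the dict and cannot be indexed, so materialize it in a list if you want a fixed snapshot: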
vals = list(Data['weight_factors'].values())
and then you can do whatever you want with it:
sum(vals)
max(vals)
min(vals)
etc...

Printing a particular subset of keys in a dictionary

I have a dictionary in Python where the keys are pathnames. For example:
dict["/A"] = 0
dict["/A/B"] = 1
dict["/A/C"] = 1
dict["/X"] = 10
dict["/X/Y"] = 11
I was wondering, what's a good way to print all "subpaths" given any key.
For example, given a function called "print_dict_path" that does this, something like
print_dict_path("/A")
or
print_dict_path("/A/B")
would print out something like:
"B" = 1
"C" = 1
The only method I can think of is something like using regex and going through the entire dictionary, but I'm not sure if that's the best method (nor am I that well versed in regex).
Thanks for any help.
One possibility without using regex is to just use startswith
top_path = '/A/B'
for p in d.iterkeys():
    if p.startswith(top_path):
        print d[p]
You can use str.find:
def print_dict_path(prefix, d):
    for k in d:
        if k.find(prefix) == 0:
            print "\"{0}\" = {1}".format(k, d[k])
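Both prefix checks above (startswith and find(...) == 0) also match the prefix key itself and any sibling key that merely shares leading characters. A Python 3 sketch that appends the path separator before comparing; the function name iter_subpaths and the extra "/AB" key are hypothetical additions for illustration:

```python
def iter_subpaths(paths, prefix):
    """Yield (key, value) pairs for keys strictly below `prefix`."""
    # Appending "/" keeps "/A" from matching an unrelated key such as "/AB".
    base = prefix.rstrip("/") + "/"
    for key, value in paths.items():
        if key.startswith(base):
            yield key, value


paths = {"/A": 0, "/A/B": 1, "/A/C": 1, "/AB": 5, "/X": 10, "/X/Y": 11}

for key, value in iter_subpaths(paths, "/A"):
    print('"{}" = {}'.format(key[len("/A/"):], value))
# "B" = 1
# "C" = 1
```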
Well, you'll definitely have to loop through the entire dict.
def filter_dict_path(d, sub):
    for key, val in d.iteritems():
        if key.startswith(sub):  ## or do you want `sub in key` ?
            yield key, val
print dict(filter_dict_path(old_dict, sub))
You could speed this up by using the appropriate data structure: a Tree.
Is your dictionary structure fixed? It would be nicer to do this using nested dictionaries:
{
    "A": {
        "value": 0,
        "dirs": {
            "B": {
                "value": 1
            },
            "C": {
                "value": 1
            }
        }
    },
    "X": {
        "value": 10,
        "dirs": {
            "Y": {
                "value": 11
            }
        }
    }
}
The underlying data structure here is a tree, but Python doesn't have that built in.
This removes one level of indenting, which may make the code in the body of the for loop more readable in some cases
top_path = '/A/B'
for p in (p for p in d.iterkeys() if p.startswith(top_path)):
    print d[p]
If you find performance to be a problem, consider using a trie instead of the dictionary
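As a rough illustration of the tree idea, here is a Python 3 sketch that builds a trie over path segments from the flat dictionary, using the "value"/"dirs" node layout suggested in the nested-dictionary answer. The function names build_tree and children are hypothetical, and this is one possible layout rather than a standard one:

```python
def split_path(path):
    """Split "/A/B" into ["A", "B"], ignoring empty segments."""
    return [part for part in path.split("/") if part]


def build_tree(paths):
    """Turn {"/A/B": 1, ...} into nested {"value": ..., "dirs": {...}} nodes."""
    root = {"value": None, "dirs": {}}
    for path, value in paths.items():
        node = root
        for part in split_path(path):
            node = node["dirs"].setdefault(part, {"value": None, "dirs": {}})
        node["value"] = value
    return root


def children(tree, path):
    """Return {name: value} for the direct children of `path`."""
    node = tree
    for part in split_path(path):
        node = node["dirs"][part]  # raises KeyError for an unknown path
    return {name: child["value"] for name, child in node["dirs"].items()}


paths = {"/A": 0, "/A/B": 1, "/A/C": 1, "/X": 10, "/X/Y": 11}
tree = build_tree(paths)

print(children(tree, "/A"))  # {'B': 1, 'C': 1}
```

A lookup then walks only the segments of the requested path instead of scanning every key, at the cost of building the tree once up front.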
