combining two python lists and aggregating the values by key - python

I have two lists:
a = ['a','b','a']
b = [200,300,300]
When I print them like so:
print(dict(zip(a,b)))
I get:
{'a': 300, 'b': 300}
How would I aggregate the values by key so that I get
{'a': 500, 'b': 300}?

result = {}
for k, v in zip(['a','b','a'], [200,300,300]):
    result[k] = result.get(k, 0) + v  # a missing key counts as 0
print(result)

from collections import Counter
a = ['a','b','a']
b = [200,300,300]
c = Counter()
for i, j in zip(a, b):
    c[i] += j
print(c)

I suppose a clear way (per Python's Zen) to achieve your goal is:
from __future__ import print_function
a = ['a','b','a']
b = [200,300,300]
d = dict()
for place, key in enumerate(a):
    try:
        d[key] += b[place]
    except KeyError:
        d[key] = b[place]
print(d)
Which gives your expected output:
{'a': 500, 'b': 300}

You just need to iterate over the zipped keys and values and accumulate them in a dictionary.
a = ['a','b','a']
b = [200,300,300]
combined_dict = {}
for key, val in zip(a,b):
    if key in combined_dict:
        combined_dict[key] += val
    else:
        combined_dict[key] = val
print(combined_dict)
=> {'a': 500, 'b': 300}

Another way, without using the zip function, is to loop over the indices:
aggregateDict = {}
a = ['a', 'b', 'a']
b = [200, 300, 200]
for i in range(len(a)):
    aggregateDict[a[i]] = aggregateDict.get(a[i], 0) + b[i]
print(aggregateDict)
The output will be
{'a': 400, 'b': 300}
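The answers above all follow the same pattern: walk the two lists in parallel and add each value to a running total for its key. As a rough sketch, the pattern can be wrapped in a small function using collections.defaultdict; the function name aggregate is made up for this illustration and is not from any of the answers.

```python
from collections import defaultdict

def aggregate(keys, values):
    """Sum the values that share a key and return a plain dict."""
    totals = defaultdict(int)  # a missing key starts at 0
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)

print(aggregate(['a', 'b', 'a'], [200, 300, 300]))
# {'a': 500, 'b': 300}
```

Compared with dict.get(key, 0) or try/except KeyError, defaultdict(int) only moves the "start at zero" logic into the container; the result is the same.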

Related

How to keep only the key-value-pair with the global max value over several dictionaries?

Input:
A = [{'A':0.4, 'B':0.8, 'C':1.0},
     {'A':0.5, 'B':0.9, 'C':0.0}]
Desired output:
A = [{'C':1.0}, {'A':0.5, 'B':0.9}]
Note: I want to iterate through the list and, for each key, keep it only in the dictionary where it has the maximum value. The result should still be a list of dictionaries.
A more concise approach, based on @Ioannis's answer:
max_dict = {k: max([d[k] for d in A]) for k in A[0]}
result = [{k: v for k, v in i.items() if v == max_dict[k]} for i in A]
print(result)
output:
[{'C': 1.0}, {'A': 0.5, 'B': 0.9}]
Detailed approach:
You can create a dictionary with max values and check with it for each list item. You can implement it using 2 for loops.
A = [{'A': 0.4, 'B': 0.8, 'C': 1.0},
     {'A': 0.5, 'B': 0.9, 'C': 0.0}]
max_dict = {}
for a_dict in A:
    for key, value in a_dict.items():
        if key not in max_dict or max_dict[key] < value:
            max_dict[key] = value
result = []
for a_dict in A:
    temp_dict = {}
    for key, value in a_dict.items():
        if value == max_dict[key]:
            temp_dict[key] = value
    result.append(temp_dict)
print(result)
output:
[{'C': 1.0}, {'A': 0.5, 'B': 0.9}]
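One thing the two-loop approach does not spell out is what happens on ties. The sketch below wraps the same logic in a function (the name keep_global_max is invented here) and shows that a key whose maximum occurs in several dicts is kept in each of them. Whether that is the desired behaviour is an assumption; the question does not say.

```python
def keep_global_max(dicts):
    """Keep, in each dict, only the keys whose value equals the maximum
    for that key across all dicts. Ties survive in every dict that has them."""
    max_per_key = {}
    for d in dicts:
        for key, value in d.items():
            if key not in max_per_key or value > max_per_key[key]:
                max_per_key[key] = value
    return [{k: v for k, v in d.items() if v == max_per_key[k]} for d in dicts]

A = [{'A': 0.4, 'B': 0.8, 'C': 1.0},
     {'A': 0.5, 'B': 0.9, 'C': 0.0}]
print(keep_global_max(A))
# [{'C': 1.0}, {'A': 0.5, 'B': 0.9}]

# With a tie on 'x', the key is kept in both dicts:
print(keep_global_max([{'x': 1}, {'x': 1, 'y': 2}]))
# [{'x': 1}, {'x': 1, 'y': 2}]
```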
Ioannis's answer should work, but here is a more step-by-step answer in case that didn't make sense.
B = {}
for i in range(len(A)):
    d = A[i]
    for key in d:
        if key not in B:
            B[key] = i
B now holds, for each key, the index of the first dict that contains it. Next, we update it so that it stores the index of the dict holding each key's max:
for i in range(len(A)):
    d = A[i]
    for key in d:
        if A[B[key]][key] < d[key]:
            # B[key] is the index of the dict holding the current max,
            # A[B[key]] is the dictionary at that index, and
            # A[B[key]][key] is the current max value for this key
            B[key] = i
Now we have all the max indexes. So we can build a new list, say C, with the same number of dicts as A, and add each key to the dict at its max index:
C = []
for i in range(len(A)):
    C.append({})
for key in B:
    ind = B[key]
    C[ind][key] = A[ind][key]
Output (tested on 3.9.4): [{'C': 1.0}, {'A': 0.5, 'B': 0.9}]
B = {k: max([d[k] for d in A]) for k in A[0]}
If B needs to be within a list, then [B].

Merge random number of dicts in list

The task is to create a list of a random number of dicts (from 2 to 10).
Each dict has a random number of keys; the keys should be letters and the values should be numbers (0-100). Example: [{'a': 5, 'b': 7, 'g': 11}, {'a': 3, 'c': 35, 'g': 42}]
Then take the previously generated list of dicts and create one common dict:
if several dicts have the same key, take the max value and rename the key with the number of the dict that holds the max value;
if a key is in only one dict, take it as is.
Example: {'a_1': 5, 'b': 7, 'c': 35, 'g_2': 42}
I've written the following code:
from random import randint, choice
from string import ascii_lowercase
final_dict, indexes_dict = {}, {}
rand_list = [{choice(ascii_lowercase): randint(0, 100) for i in range(len(ascii_lowercase))} for j in range(randint(2, 10))]
for dictionary in rand_list:
    for key, value in dictionary.items():
        if key not in final_dict:
            final_dict.update({key: value}) # add first occurrence
        else:
            if value < final_dict.get(key):
                #TODO indexes_dict.update({:})
                continue
            else:
                final_dict.update({key: value})
                #TODO indexes_dict.update({:})
for key in indexes_dict:
    final_dict[key + '_' + str(indexes_dict[key])] = final_dict.pop(key)
print(final_dict)
I only need to add some logic to keep track of the indexes of the final_dict values (I created a separate dict for that).
I'm wondering whether there is a more Pythonic way to solve such tasks.
This approach seems completely reasonable.
Personally, however, I would probably go about it this way:
final_dict, tmp_dict = {}, {}
# Transform from list of dicts into dict of lists.
for dictionary in rand_list:
    for k, v in dictionary.items():
        tmp_dict.setdefault(k, []).append(v)
# Now choose only the biggest one
for k, v in tmp_dict.items():
    if len(v) > 1:
        final_dict[k+"_"+str(v.index(max(v))+1)] = max(v)
    else:
        final_dict[k] = v[0]
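One caveat with the dict-of-lists approach: v.index(max(v)) is the position among the dicts that contain the key, which equals the dict number only when no earlier dict lacks that key. A possible variation, sketched below, stores (value, dict_number) pairs instead. The function name merge_max and the tie rule (earliest dict wins) are choices made for this sketch, not part of the original answer.

```python
def merge_max(dicts):
    """Merge dicts, keeping the max value per key. Keys seen in more than
    one dict get a '_<n>' suffix, where n is the 1-based dict number."""
    seen = {}  # key -> list of (value, dict_number)
    for number, d in enumerate(dicts, 1):
        for key, value in d.items():
            seen.setdefault(key, []).append((value, number))
    merged = {}
    for key, pairs in seen.items():
        if len(pairs) == 1:
            merged[key] = pairs[0][0]
        else:
            # max by value only; on ties the earliest dict wins
            value, number = max(pairs, key=lambda p: p[0])
            merged["{}_{}".format(key, number)] = value
    return merged

# 'a' is missing from the second dict, so its max lives in dict 3, not 2
print(merge_max([{'a': 5, 'b': 7}, {'c': 1}, {'a': 9, 'b': 2}]))
# {'a_3': 9, 'b_1': 7, 'c': 1}
```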
You will need some auxiliary data structure to keep track of unrepeated keys. This uses collections.defaultdict and enumerate to aid the task:
from collections import defaultdict
def merge(dicts):
    helper = defaultdict(lambda: [-1, -1, 0]) # key -> max, index_max, count
    for i, d in enumerate(dicts, 1): # start indexing at 1
        for k, v in d.items():
            helper[k][2] += 1 # always increase count
            if v > helper[k][0]:
                helper[k][:2] = [v, i] # update max and index_max
    # build result from helper data structure
    result = {}
    for k, (max_, index, count) in helper.items():
        key = k if count == 1 else "{}_{}".format(k, index)
        result[key] = max_
    return result
>>> merge([{'a': 5, 'b': 7, 'g': 11}, {'a': 3, 'c': 35, 'g': 42}])
{'a_1': 5, 'b': 7, 'g_2': 42, 'c': 35}

Count the occurrences of the same value for each key in a list of dictionaries (Python)

I have multiple dicts in a list. I want to count the number of occurrences of a certain value in the list of dicts.
Here's the list of dicts:
a = [{"a":"data1","b":"Nill","c":"data3","d":"Nill"},{"a":"dat1","b":"dat2","c":"dat3","d":"Nill"},{"a":"sa1","b":"sa2","c":"sa3","d":"Nill"}]
Here, I want to count the occurrences of Nill for each key. How can I do that?
Here's the code i tried:
from collections import Counter
a = [{"a":"data1","b":"Nill","c":"data3","d":"Nill"},{"a":"dat1","b":"dat2","c":"dat3","d":"Nill"},{"a":"sa1","b":"sa2","c":"sa3","d":"Nill"}]
s = 0
for i in a:
    d = (a[s])
    #print(d)
    q = 0
    for z in d:
        print(z)
        z1=d[z]
        #print(z)
        if z1 == "Nill":
            q = q+1
        co = {z:q}
print(co)
Expected Output:
The count of Nill values in the list of dict
{a:0,b:1,c:0,d:3}
Try this:
a = [{"a":"data1","b":"Nill","c":"data3","d":"Nill"},{"a":"dat1","b":"dat2","c":"dat3","d":"Nill"},{"a":"sa1","b":"sa2","c":"sa3","d":"Nill"}]
result_dict = {'a' : 0, 'b' : 0,'c' :0, 'd' : 0}
for i in a:
    for key, value in i.items():
        if value == "Nill":
            result_dict[key] += 1
print(result_dict)
You can use Counter directly by counting a boolean expression, taking advantage of the fact that the counter will count True as 1:
from collections import Counter
a = [{"a":"data1","b":"Nill","c":"data3","d":"Nill"},{"a":"dat1","b":"dat2","c":"dat3","d":"Nill"},{"a":"sa1","b":"sa2","c":"sa3","d":"Nill"}]
c = Counter()
for d in a:
    c.update({k: v == 'Nill' for k, v in d.items()})
# c => Counter({'a': 0, 'b': 1, 'c': 0, 'd': 3})
EDIT:
To match the output required:
import pandas as pd
df = pd.DataFrame(a)
occ = {k: list(v.values()).count('Nill') for k,v in df.to_dict().items()}
Like this?
a = [{"a":"data1","b":"Nill","c":"data3","d":"Nill"},{"a":"dat1","b":"dat2","c":"dat3","d":"Nill"},{"a":"sa1","b":"sa2","c":"sa3","d":"Nill"}]
result = {}
for sub_list in a: # loop through the list
    for key, val in sub_list.items(): # loop through the dictionary
        result[key] = result.get(key, 0) # if key not in dictionary, add it
        if val == 'Nill': # if finding 'Nill', increment that value
            result[key] += 1
for key, val in result.items(): # show result
    print(key, val)
Try this:
from collections import defaultdict
c = defaultdict(int, {i:0 for i in a[0].keys()})
for i in a:
    for k, v in i.items():
        if v == 'Nill':
            c[k] += 1
dict(c) will be your desired output.
{'a': 0, 'b': 1, 'c': 0, 'd': 3}
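The answers above share one detail that is easy to miss: keys that never hold the target value must still show up with a count of 0. A compact sketch of that idea, assuming a helper named count_value (a name invented here), seeds every key with 0 and then overlays the Counter totals:

```python
from collections import Counter

def count_value(dicts, target):
    """Count, per key, how many dicts map that key to `target`."""
    # Start every key at 0 so keys that never match still appear.
    counts = {key: 0 for d in dicts for key in d}
    # Counter only sees the keys whose value equals target.
    counts.update(Counter(k for d in dicts for k, v in d.items() if v == target))
    return counts

a = [{"a": "data1", "b": "Nill", "c": "data3", "d": "Nill"},
     {"a": "dat1", "b": "dat2", "c": "dat3", "d": "Nill"},
     {"a": "sa1", "b": "sa2", "c": "sa3", "d": "Nill"}]
print(count_value(a, "Nill"))
# {'a': 0, 'b': 1, 'c': 0, 'd': 3}
```

Unlike the defaultdict answer, this does not assume that the first dict contains every key.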

Using Python's max to return two equally large values

I'm using Python's max function to find the largest integer in a dictionary called count, and the corresponding key (not quite sure if I'm saying it properly; my code probably explains itself better than I'm explaining it). The dictionary count is along the lines of {'a': 100, 'b': 210}, and so on.
number = count[max(count.items(), key=operator.itemgetter(1))[0]]
highest = max(count, key=count.get)
What would I do if there were two equal largest values in there? If I had {'a': 120, 'b': 120, 'c': 100}, this would only find the first of a and b, not both.
The idea is to find the max value and then get all keys corresponding to that value:
count = {'a': 120, 'b': 120, 'c': 100}
highest = max(count.values())
print([k for k, v in count.items() if v == highest])
Same idea as Asterisk, but without iterating over the list twice. A bit more verbose.
count = {'a': 120, 'b': 120, 'c': 100}
answers = []
highest = -1
def f(x):
    global highest, answers
    if count[x] > highest:
        highest = count[x]
        answers = [x]
    elif count[x] == highest:
        answers.append(x)
for key in count:  # a plain loop; map() is lazy in Python 3 and would never call f
    f(key)
print(answers)
Fast single pass:
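a = {'a': 120, 'b': 120, 'c': 100}
z = [0, []]  # [max value so far, keys with that value]
while a:
    key, value = a.popitem()  # note: this empties a
    if value > z[0]:
        z = [value, [key]]
    elif value == z[0]:
        z[1].append(key)
print(z)
#output (key order follows popitem(), which is last-in first-out on Python 3.7+):
[120, ['b', 'a']]
And an amusing way with defaultdict:
import collections
a = {'a': 120, 'b': 120, 'c': 100}  # rebuilt, since popitem() above emptied it
b = collections.defaultdict(list)
for key, value in a.items():
    b[value].append(key)
print(max(b.items()))
#output:
(120, ['a', 'b'])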
This could be a way (probably not the most efficient):
value = max(count.values())
print(list(filter(lambda key: count[key] == value, count)))
Sometimes the simplest solution may be the best:
max_value = 0
max_keys = []
for k, v in count.items():
    if v >= max_value:
        if v > max_value:
            max_value = v
            max_keys = [k]
        else:
            max_keys.append(k)
print(max_keys)
The code above is slightly faster than a two-pass solution like:
highest = max(count.values())
print([k for k, v in count.items() if v == highest])
Of course it's longer, but on the other hand it's very clear and easy to read.
To print a list without brackets, use:
' '.join(map(str, mylist))
or, more verbosely:
' '.join(str(x) for x in mylist)
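The single-pass loops above start from 0 (or -1), which is fine for counts but would miss the maximum if every value were negative. A minimal sketch that avoids that assumption by starting from None is shown below; the function name keys_with_max is made up for this example.

```python
def keys_with_max(mapping):
    """Return (max_value, keys) in one pass; also works for negative values."""
    best = None
    keys = []
    for key, value in mapping.items():
        if best is None or value > best:
            best, keys = value, [key]   # new maximum: restart the key list
        elif value == best:
            keys.append(key)            # tie: remember this key as well
    return best, keys

print(keys_with_max({'a': 120, 'b': 120, 'c': 100}))  # (120, ['a', 'b'])
print(keys_with_max({'x': -3, 'y': -1, 'z': -1}))     # (-1, ['y', 'z'])
print(keys_with_max({}))                              # (None, [])
```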

Modify all values in a dictionary

Code goes below:
d = {'a':0, 'b':0, 'c':0, 'd':0} #at the beginning, all the values are 0.
s = 'cbad' #a string
indices = map(s.index, d.keys()) #get every key's index in s, i.e., a-2, b-1, c-0, d-3
#then set the values to keys' index
d = dict(zip(d.keys(), indices)) #this is how I do it, any better way?
print(d) #{'a':2, 'b':1, 'c':0, 'd':3}
Any other way to do that?
PS. the code above is just a simple one to demonstrate my question.
Something like this might make your code more readable:
dict([(x,y) for y,x in enumerate('cbad')])
But you should give more details about what you really want to do. Your code will probably fail if the characters in s do not match the keys of d. So d is just a container for the keys, and its values are not important. Why not start with a list in that case?
Use the update() method of dict:
d.update((k, s.index(k)) for k in d)
What about
d = {'a':0, 'b':0, 'c':0, 'd':0}
s = 'cbad'
for k in d:
    d[k] = s.index(k)
? It's not functional programming anymore, but it should be more performant and more Pythonic, perhaps :-).
EDIT: A functional variant using Python dict comprehensions (needs Python 2.7+ or 3+):
d.update({k: s.index(k) for k in d})
or even
{k: s.index(k) for k in d}
if a new dict is okay!
for k in d:
    d[k] = s.index(k)
Or, if you don't already know the letters in the string:
d = {}
for i in range(len(s)):
    d[s[i]] = i
Another one-liner:
dict([(k,s.index(k)) for (k,v) in d.items()])
You don't need to pass a list of tuples to dict. Instead, you can use a dictionary comprehension with enumerate:
s = 'cbad'
d = {v: k for k, v in enumerate(s)}
If you need to process the intermediary steps, including initial setting of values, you can use:
d = dict.fromkeys('abcd', 0)
s = 'cbad'
indices = {v: k for k, v in enumerate(s)}
d = {k: indices[k] for k in d} # dictionary comprehension
d = dict(zip(d, map(indices.get, d))) # dict + zip alternative
print(d)
# {'a': 2, 'b': 1, 'c': 0, 'd': 3}
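Two details are worth keeping in mind with the enumerate-based lookup: a key that does not occur in s raises KeyError (just as s.index raises ValueError), and if s has repeated characters the comprehension keeps the last position while s.index returns the first. The sketch below is one way to handle both; the function name index_values and the default of -1 (mirroring str.find) are assumptions made for this illustration.

```python
def index_values(d, s, default=-1):
    """Return a new dict mapping each key of d to its first index in s,
    or to `default` when the key does not occur in s."""
    positions = {}
    for i, ch in enumerate(s):
        positions.setdefault(ch, i)  # keep the first occurrence, like str.index
    return {k: positions.get(k, default) for k in d}

d = {'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 0}
print(index_values(d, 'cbad'))
# {'a': 2, 'b': 1, 'c': 0, 'd': 3, 'e': -1}
```

Building positions once also avoids rescanning s for every key, which may matter for long strings.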
You chose the right way, but I think there is no need to create the dict and then modify it if you can do both at the same time:
keys = ['a','b','c','d']
strK = 'bcad'
res = dict(zip(keys, (strK.index(i) for i in keys)))
Dict comprehension for python 2.7 and above
{key : indice for key, indice in zip(d.keys(), map(s.index, d.keys()))}
>>> d = {'a':0, 'b':0, 'c':0, 'd':0}
>>> s = 'cbad'
>>> for x in d:
...     d[x] = s.find(x)
...
>>> d
{'a': 2, 'c': 0, 'b': 1, 'd': 3}
