Python: summarizing data from list using index from another list

I have two lists:
L1 = ['A','B','A','C','A']
L2 = [1, 4, 6, 1, 3]
I want to create a dictionary which has the following output:
DictOutSum = {'A':10, 'B':4, 'C':1}
DictOutCount = {'A':3, 'B':1, 'C':1}
i.e. lists L1 and L2 have the same number of elements, and their elements correspond one to one. I want to sum the numbers in L2 for each unique element of L1 and store the result in a dictionary (DictOutSum). I also want a second dictionary that stores how many times each unique element of L1 occurs (DictOutCount).
I don't even have an idea where to start for this other than to use a for loop.

Pure python implementation:
>>> dict_sum = dict.fromkeys(L1, 0)
>>> dict_count = dict.fromkeys(L1, 0)
>>> for k,n in zip(L1, L2):
...     dict_sum[k] += n
...     dict_count[k] += 1
...
>>> dict_sum
{'A': 10, 'B': 4, 'C': 1}
>>> dict_count
{'A': 3, 'B': 1, 'C': 1}
Fancy one-liner implementations:
>>> from collections import Counter
>>> Counter(L1) # dict_count
Counter({'A': 3, 'B': 1, 'C': 1})
>>> sum((Counter({k:v}) for k,v in zip(L1, L2)), Counter()) # dict_sum
Counter({'A': 10, 'B': 4, 'C': 1})

You should use the zip builtin function
import collections
DictOutSum = collections.defaultdict(int)
DictOutCount = collections.defaultdict(int)
for l1, l2 in zip(L1, L2):
    DictOutSum[l1] += l2
    DictOutCount[l1] += 1
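The same loop can be wrapped in a small helper. This is only a minimal sketch: the function name `summarize` is hypothetical (it does not appear in the answer above), and it assumes both lists have the same length.

```python
from collections import defaultdict

def summarize(keys, values):
    """Return (sums, counts) per key; `summarize` is a hypothetical helper name."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for key, value in zip(keys, values):
        sums[key] += value
        counts[key] += 1
    # Convert to plain dicts so that missing keys raise KeyError again
    return dict(sums), dict(counts)

L1 = ['A', 'B', 'A', 'C', 'A']
L2 = [1, 4, 6, 1, 3]
DictOutSum, DictOutCount = summarize(L1, L2)
print(DictOutSum)    # {'A': 10, 'B': 4, 'C': 1}
print(DictOutCount)  # {'A': 3, 'B': 1, 'C': 1}
```

Returning plain dicts is a design choice here: a defaultdict would silently create an entry of 0 on any lookup of an unknown key.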

>>> L1 = ['A','B','A','C','A']
>>> L2 = [1, 4, 6, 1, 3]
>>>
>>> DictOutCount = {v:0 for v in L1}
>>> DictOutSum = {v:0 for v in L1}
>>> for v1,v2 in zip(L1,L2):
...     DictOutCount[v1] += 1
...     DictOutSum[v1] += v2
...
>>>
>>> DictOutCount
{'A': 3, 'C': 1, 'B': 1}
>>> DictOutSum
{'A': 10, 'C': 1, 'B': 4}
>>>

The mega elementary way
L1 = ['A','B','A','C','A']
L2 = [1, 4, 6, 1, 3]
# Carries the information
myDict = {}
# Build the dictionary
for x in range(0,len(L1)):
    # Initialize the dictionary IF the key doesn't exist
    if L1[x] not in myDict:
        myDict[L1[x]] = {}
        myDict[L1[x]]['sum'] = 0
        myDict[L1[x]]['count'] = 0
    # Collect the information you need
    myDict[L1[x]][x] = L2[x]
    myDict[L1[x]]['sum'] += L2[x]
    myDict[L1[x]]['count'] += 1
# Build the other two dictionaries
DictOutSum = {}
DictOutCount = {}
# Literally feed the data
for element in myDict:
    DictOutSum[element] = myDict[element]['sum']
    DictOutCount[element] = myDict[element]['count']
print(DictOutSum)
# {'A': 10, 'B': 4, 'C': 1}
print(DictOutCount)
# {'A': 3, 'B': 1, 'C': 1}

For DictOutCount, use collections.Counter:
import collections
DictOutCount = collections.Counter(L1)
print(DictOutCount)
Counter({'A': 3, 'C': 1, 'B': 1})
For DictOutSum:
DictOutSum = dict()
for k, v in zip(L1, L2):
    DictOutSum[k] = DictOutSum.get(k, 0) + v
print(DictOutSum)
# Output
{'A': 10, 'C': 1, 'B': 4}
Previous answer, for DictOutSum:
import itertools
import operator
import functools
DictOutSum = dict()
# zip replaces itertools.izip, which exists only in Python 2
for name, group in itertools.groupby(sorted(zip(L1, L2)), operator.itemgetter(0)):
    DictOutSum[name] = functools.reduce(operator.add, map(operator.itemgetter(1), group))
print(DictOutSum)
{'A': 10, 'C': 1, 'B': 4}
The main steps are:
use zip (itertools.izip in Python 2) to make an iterator that aggregates elements from each of L1 and L2
use itertools.groupby to make an iterator that returns consecutive keys and groups from the iterable (sorting before that)
use functools.reduce for cumulative addition
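The steps above can be sketched in Python 3 as follows. Note that this sketch swaps functools.reduce for the built-in sum, which performs the same cumulative addition; it is an illustration, not the answer's original code.

```python
from itertools import groupby
from operator import itemgetter

L1 = ['A', 'B', 'A', 'C', 'A']
L2 = [1, 4, 6, 1, 3]

# Pair up the elements, then sort so that equal keys become adjacent;
# groupby only merges consecutive items with the same key
pairs = sorted(zip(L1, L2))

# Sum the second item of every pair within each group
DictOutSum = {
    name: sum(value for _, value in group)
    for name, group in groupby(pairs, key=itemgetter(0))
}
print(DictOutSum)  # {'A': 10, 'B': 4, 'C': 1}
```

Sorting makes this O(n log n), so the plain loop with dict.get shown earlier is usually the simpler choice.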

Related

how to convert a list to a default dictionary

I want to convert a list to a default dictionary to have a default value of 0 in case the key doesn't have any value (from the list).
list : order = ['a',1,'b',2,'c']
what I did using ZIP :
it = iter(order)
res_dict = dict(zip(it,it))
print(res_dict)
but it excludes c as a key, as the list doesn't have the next index after c.
Result I got : {'a': 1, 'b': 2}
Result i want : {'a': 1, 'b': 2, 'c': 0}
You might want to consider using itertools.zip_longest().
For example:
import itertools
l = ['a',1,'b',2,'c']
d = dict(itertools.zip_longest(l[::2], l[1::2], fillvalue=0))
print(d)
Output:
{'a': 1, 'b': 2, 'c': 0}
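A variation that stays closer to the code in the question, offered as a sketch: keep the single shared iterator and only swap zip for zip_longest. When the iterator runs out halfway through a pair, the missing value is filled with 0.

```python
from itertools import zip_longest

order = ['a', 1, 'b', 2, 'c']

# Both arguments are the same iterator, so each pair consumes two items
it = iter(order)
res_dict = dict(zip_longest(it, it, fillvalue=0))
print(res_dict)  # {'a': 1, 'b': 2, 'c': 0}
```

This avoids building the two sliced copies of the list, which may matter for long inputs.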
This works:
d = dict()
for i in range(len(order)):
    if i%2==0 and i+1<len(order):
        d[order[i]] = order[i+1]
    elif i%2==0:
        # the last key has no value after it
        d[order[i]] = 0
Result:
{'a': 1, 'b': 2, 'c': 0}
To solve the problem of two consecutive keys you can use a custom function to split the list
def splitdefault(o):
    i = 0
    while i < len(o):
        # there is a next element to check
        if i + 1 < len(o):
            # the next element is int
            if isinstance(o[i + 1], int):
                yield o[i], o[i + 1]
                i += 2
            else:
                yield o[i], 0
                i += 1
        # i is the last element
        else:
            yield o[i], 0
            i += 1
order = ["a", 1, "b", 2, "c", "d", 3, "e"]
for g in splitdefault(order):
    print(g)
res_dict = dict(splitdefault(order))
print(res_dict)
The final print produces
{'a': 1, 'b': 2, 'c': 0, 'd': 3, 'e': 0}
Cheers!

Nesting dictionary algorithm

Suppose I have the following dictionary:
{'a': 0, 'b': 1, 'c': 2, 'c.1': 3, 'd': 4, 'd.1': 5, 'd.1.2': 6}
I wish to write an algorithm which outputs the following:
{
    "a": 0,
    "b": 1,
    "c": {
        "c": 2,
        "c.1": 3
    },
    "d": {
        "d": 4,
        "d.1": {
            "d.1": 5,
            "d.1.2": 6
        }
    }
}
Note how the names are repeated inside the dictionary. And some have variable level of nesting (eg. "d").
I was wondering how you would go about doing this, or if there is a python library for this? I know you'd have to use recursion for something like this, but my recursion skills are quite poor. Any thoughts would be highly appreciated.
You can use a recursive function for this or just a loop. The tricky part is wrapping existing values into dictionaries if further child nodes have to be added below them.
def nested(d):
    res = {}
    for key, val in d.items():
        t = res
        # descend deeper into the nested dict
        for x in [key[:i] for i, c in enumerate(key) if c == "."]:
            if x in t and not isinstance(t[x], dict):
                # wrap leaf value into another dict
                t[x] = {x: t[x]}
            t = t.setdefault(x, {})
        # add actual key to nested dict
        if key in t:
            # already exists, go one level deeper
            t[key][key] = val
        else:
            t[key] = val
    return res
Your example:
d = {'a': 0, 'b': 1, 'c': 2, 'c.1': 3, 'd': 4, 'd.1': 5, 'd.1.2': 6}
print(nested(d))
# {'a': 0,
# 'b': 1,
# 'c': {'c': 2, 'c.1': 3},
# 'd': {'d': 4, 'd.1': {'d.1': 5, 'd.1.2': 6}}}
As for how you would go about doing this:
sort the dictionary items
group the result by index 0 of the keys (first item in the tuples)
iterate over the groups
if there is more than one item in a group, make a key for the group and add the group items as the values.
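One possible reading of these steps, as a minimal sketch. It assumes that "grouping" means grouping by the dot-separated prefix of each key and that deeper levels are handled by recursion; the function name `nest` and its `depth` parameter are hypothetical and not part of the answer above.

```python
from itertools import groupby

def nest(items, depth=1):
    """Nest (key, value) pairs by the first `depth` dot-separated key parts."""
    def prefix(item):
        return ".".join(item[0].split(".")[:depth])

    result = {}
    # Sort by prefix so that groupby sees each group as one consecutive run
    for name, group in groupby(sorted(items, key=prefix), key=prefix):
        group = list(group)
        if len(group) == 1:
            # A lone item stays a plain leaf under its own key
            key, value = group[0]
            result[key] = value
        else:
            # The item whose key equals the prefix is repeated inside the group
            leaves = {k: v for k, v in group if k == name}
            rest = [(k, v) for k, v in group if k != name]
            result[name] = {**leaves, **nest(rest, depth + 1)}
    return result

data = {'a': 0, 'b': 1, 'c': 2, 'c.1': 3, 'd': 4, 'd.1': 5, 'd.1.2': 6}
print(nest(data.items()))
# {'a': 0, 'b': 1, 'c': {'c': 2, 'c.1': 3}, 'd': {'d': 4, 'd.1': {'d.1': 5, 'd.1.2': 6}}}
```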
Slightly shorter recursion approach with collections.defaultdict:
from collections import defaultdict
data = {'a': 0, 'b': 1, 'c': 2, 'c.1': 3, 'd': 4, 'd.1': 5, 'd.1.2': 6}
def group(d, p = []):
    _d, r = defaultdict(list), {}
    for n, [a, *b], c in d:
        _d[a].append((n, b, c))
    for a, b in _d.items():
        if (k:=[i for i in b if i[1]]):
            r['.'.join(p+[a])] = {**{i[0]:i[-1] for i in b if not i[1]}, **group(k, p+[a])}
        else:
            r[b[0][0]] = b[0][-1]
    return r
print(group([(a, a.split('.'), b) for a, b in data.items()]))
Output:
{'a': 0, 'b': 1, 'c': {'c': 2, 'c.1': 3}, 'd': {'d': 4, 'd.1': {'d.1': 5, 'd.1.2': 6}}}

Add two dictionaries in python and subtract result from another

I have three dictionaries:
X = {'a':2, 'b':3,'e':4}
Y = {'c':3, 'b':4,'a':5, 'd':7}
Z = {'c':8, 'b':7,'a':9, 'e':10,'f':10}
I want to add the elements of X and Y key by key (treating a key missing from one dict as 0) and then subtract the result from Z, i.e. Z-(X+Y)
How can I do that ?
expected result:
res = {'a':2,'b':0,'c':5,'d':7,'e':6,'f':10}
What I tried:
from collections import Counter
xy = Counter(X) + Counter(Y)
res = Counter(Z) - xy
which return:
Counter({'c': 5, 'a': 2, 'e': 6, 'f': 10})
as you can see b and d are missing from my attempt
Your expected result is actually an operation of symmetric difference in terms of sets, but since collections.Counter doesn't support such an operation, you can emulate it with:
xy = Counter(X) + Counter(Y)
z = Counter(Z)
res = z - xy | xy - z
res becomes:
Counter({'f': 10, 'd': 7, 'e': 6, 'c': 5, 'a': 2})
But if you do want keys with value of 0, which Counter would hide from its output, you would have to iterate through a union of the keys of the 3 dicts:
{k: res.get(k, 0) for k in {*X, *Y, *Z}}
This returns:
{'a': 2, 'd': 7, 'e': 6, 'b': 0, 'f': 10, 'c': 5}
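The same result can be computed without Counter. This sketch assumes, as the expected output suggests (for example 'd': 7 rather than -7), that the absolute difference is what is wanted; keys are sorted only to make the printed order predictable.

```python
X = {'a': 2, 'b': 3, 'e': 4}
Y = {'c': 3, 'b': 4, 'a': 5, 'd': 7}
Z = {'c': 8, 'b': 7, 'a': 9, 'e': 10, 'f': 10}

# For every key in any of the dicts, take |Z - (X + Y)|, using 0 for missing keys
res = {
    k: abs(Z.get(k, 0) - (X.get(k, 0) + Y.get(k, 0)))
    for k in sorted(X.keys() | Y.keys() | Z.keys())
}
print(res)  # {'a': 2, 'b': 0, 'c': 5, 'd': 7, 'e': 6, 'f': 10}
```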

How to take a linear combination of several dictionaries in Python?

Here's some code to take a linear combination of two dictionaries:
def linearcombination(a1,d1,a2,d2):
    return {k:a1*d1.get(k,0)+a2*d2.get(k,0) for k in {**d1,**d2}.keys()}
choosy1={"a":1,"b":2,"c":3}
choosy2={"a":1,"d":1}
choosy=linearcombination(1,choosy1,10,choosy2)
choosy is:
{'a': 11, 'c': 3, 'd': 10, 'b': 2}
How can I generalise it to allow linear combinations of arbitrary numbers of dictionaries?
Solution using sum in a dict comprehension over a set of keys:
from itertools import chain
def linear_combination_of_dicts(dicts, weights):
    return {
        k: sum( w * d.get(k, 0) for d, w in zip(dicts, weights) )
        for k in set(chain.from_iterable(dicts))
    }
Example:
>>> dicts = [{'a': 1, 'b': 2, 'c': 3}, {'a': 1, 'd': 1}]
>>> weights = [1, 10]
>>> linear_combination_of_dicts(dicts, weights)
{'c': 3, 'd': 10, 'a': 11, 'b': 2}
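An alternative sketch that accumulates in a single pass over (weight, dict) pairs, which may be convenient when the number of dictionaries is not known in advance. The function name `combine` and the third example dictionary are hypothetical additions for illustration.

```python
from collections import defaultdict

def combine(weighted_dicts):
    """Sum weight * value per key over an iterable of (weight, dict) pairs."""
    total = defaultdict(int)
    for weight, d in weighted_dicts:
        for key, value in d.items():
            total[key] += weight * value
    return dict(total)

choosy = combine([
    (1, {'a': 1, 'b': 2, 'c': 3}),
    (10, {'a': 1, 'd': 1}),
    (-1, {'b': 2}),  # hypothetical third dictionary with a negative weight
])
print(choosy)  # {'a': 11, 'b': 0, 'c': 3, 'd': 10}
```

Unlike the comprehension above, this touches each dictionary entry exactly once instead of calling get on every dictionary for every key.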
Here's an approach with pandas to handle dict key alignment:
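import pandas as pd
def lc(coeffs, dicts):
    # Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
    return (pd.concat([pd.Series(d).fillna(0)*a for a,d in zip(coeffs,dicts)])
              .groupby(level=0).sum()
              .to_dict()
           )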
lc([1,10], [choosy1, choosy2])
# {'a': 11, 'b': 2, 'c': 3, 'd': 10}

How to assign certain scores from a list to values in multiple lists and get the sum for each value in python?

Could you explain how to assign certain scores from a list to values in multiple lists and get the total score for each value?
score = [1,2,3,4,5] (assign a score based on the position in the list)
l_1 = [a,b,c,d,e]
assign a=1, b=2, c=3, d=4, e=5
l_2 = [c,a,d,e,b]
assign c=1, a=2, d=3, e=4, b=5
I am trying to get the result like
{'e':9, 'b': 7, 'd':7, 'c': 4, 'a': 3}
Thank you!
You can zip the values of score to each list, which gives you a tuple of (key, value) for each letter-score combination. Make each zipped object a dict. Then use a dict comprehension to add the values for each key together.
d_1 = dict(zip(l_1, score))
d_2 = dict(zip(l_2, score))
{k: v + d_2[k] for k, v in d_1.items()}
# {'a': 3, 'b': 7, 'c': 4, 'd': 7, 'e': 9}
You'd better use the zip function:
dic = {'a':0, 'b': 0, 'c':0, 'd': 0, 'e': 0}
def score(dic, *args):
    for lst in args:
        for k, v in zip(lst, range(len(lst))):
            dic[k] += v+1
    return dic
l_1 = ['a','b','c','d','e']
l_2 = ['c','a','d','e','b']
score(dic, l_1, l_2)
Instead of storing your lists in separate variables, you should put them in a list of lists so that you can iterate through it and calculate the sums of the scores according to each key's indices in the sub-lists:
score = [1, 2, 3, 4, 5]
lists = [
    ['a','b','c','d','e'],
    ['c','a','d','e','b']
]
d = {}
for l in lists:
    for i, k in enumerate(l):
        d[k] = d.get(k, 0) + score[i]
d would become:
{'a': 3, 'b': 7, 'c': 4, 'd': 7, 'e': 9}
from collections import defaultdict
score = [1,2,3,4,5] # note: 0 no need to use this list if there is no scenario like [5,6,9,10,4]
l_1 = ['a','b','c','d','e']
l_2 = ['c','a','d','e','b']
score_dict = defaultdict(int)
'''
for note: 0
if your score is always consecutive
like score = [2,3,4,5,6] or [5,6,7,8,9]...
you don't need a separate list of scores; you can set
start = score_of_char_at_first_position_ie_at_zero-th_index
like start = 2, or start = 5
else use this function
def add2ScoreDict(lst):
    for pos_score, char in zip(score, lst):
        score_dict[char] += pos_score
'''
def add2ScoreDict(lst):
    for pos, char in enumerate(lst, start=1):
        score_dict[char] += pos
# note: 1
add2ScoreDict( l_1)
add2ScoreDict( l_2)
#print(score_dict) # defaultdict(<class 'int'>, {'a': 3, 'b': 7, 'c': 4, 'd': 7, 'e': 9})
score_dict = dict(sorted(score_dict.items(), reverse = True, key=lambda x: x[1]))
print(score_dict) # {'e': 9, 'b': 7, 'd': 7, 'c': 4, 'a': 3}
Edit 1:
If you have multiple lists, put them in list_of_list = [l_1, l_2] so that you don't have to call add2ScoreDict yourself again and again.
# for note: 1
for lst in list_of_list:
    add2ScoreDict(lst)
You could zip both lists with score into a single list l3, then use a dictionary comprehension with filter to construct your dictionary. The key is index 1 of each tuple in l3, and the value is the sum of the index 0 entries of all tuples in l3 whose index 1 matches that key.
score = [1,2,3,4,5]
l_1 = ['a', 'b', 'c', 'd', 'e']
l_2 = ['c', 'a', 'd', 'e', 'b']
l3 = [*zip(score, l_1), *zip(score,l_2)]
d = {i[1]: sum([j[0] for j in list(filter(lambda x: x[1] ==i[1], l3))]) for i in l3}
{'a': 3, 'b': 7, 'c': 4, 'd': 7, 'e': 9}
Expanded Explanation:
d = {}
for i in l3:
    f = list(filter(lambda x: x[1] == i[1], l3))
    vals = []
    for j in f:
        vals.append(j[0])
    total_vals = sum(vals)
    d[i[1]] = total_vals
The simplest way is probably to use a Counter from the Python standard library.
from collections import Counter
tally = Counter()
scores = [1, 2, 3, 4, 5]
def add_scores(letters):
    for letter, score in zip(letters, scores):
        tally[letter] += score
L1 = ['a', 'b', 'c', 'd', 'e']
add_scores(L1)
L2 = ['c', 'a', 'd', 'e', 'b']
add_scores(L2)
print(tally)
$ python tally.py
Counter({'e': 9, 'b': 7, 'd': 7, 'c': 4, 'a': 3})
zip is used to pair letters and scores, a for loop to iterate over them and a Counter to collect the results. A Counter is actually a dictionary, so you can write things like
tally['a']
to get the score for letter a or
for letter, score in tally.items():
    print('Letter %s scored %s' % (letter, score))
to print the results, just as you would with a normal dictionary.
Finally, small ells and letter O's can be troublesome as variable names because they are hard to distinguish from ones and zeros. The Python style guide (often referred to as PEP8) recommends avoiding them.
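Because the tally is a Counter, its most_common method can also produce the highest-first ordering shown in the question. A minimal sketch, with the tally rebuilt inline so that the snippet runs on its own; the name `ranked` is hypothetical:

```python
from collections import Counter

scores = [1, 2, 3, 4, 5]
tally = Counter()
for letters in (['a', 'b', 'c', 'd', 'e'], ['c', 'a', 'd', 'e', 'b']):
    for letter, score in zip(letters, scores):
        tally[letter] += score

# most_common() returns (letter, total) pairs sorted from highest to lowest;
# ties keep the order in which the letters were first seen
ranked = dict(tally.most_common())
print(ranked)  # {'e': 9, 'b': 7, 'd': 7, 'c': 4, 'a': 3}
```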
