I'd like to merge a list of dictionaries with lists as values. Given
arr[0] = {'number':[1,2,3,4], 'alphabet':['a','b','c']}
arr[1] = {'number':[3,4], 'alphabet':['d','e']}
arr[2] = {'number':[6,7], 'alphabet':['e','f']}
the result I want would be
merge_arr = {'number':[1,2,3,4,3,4,6,7,], 'alphabet':['a','b','c','d','e','e','f']}
Could you recommend any compact code?
If you know these are the only keys in the dict, you can hard code it. If it isn't so simple, show a complicated example.
from pprint import pprint
arr = [
{
'number':[1,2,3,4],
'alphabet':['a','b','c']
},
{
'number':[3,4],
'alphabet':['d','e']
},
{
'number':[6,7],
'alphabet':['e','f']
}
]
merged_arr = {
'number': [],
'alphabet': []
}
for d in arr:
    merged_arr['number'].extend(d['number'])
    merged_arr['alphabet'].extend(d['alphabet'])
pprint(merged_arr)
Output:
{'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f'],
'number': [1, 2, 3, 4, 3, 4, 6, 7]}
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},{'number':[3,4], 'alphabet':['d','e']},{'number':[6,7], 'alphabet':['e','f']}]
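merged = {}  # avoid naming a variable `dict`, which shadows the built-in
for k in arr[0].keys():
    # sum(..., []) concatenates the lists found under key k
    merged[k] = sum([d[k] for d in arr], [])
print(merged)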
output:
{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
Here is code that uses defaultdict to more easily collect the items. You could leave the result as a defaultdict but this version converts that to a regular dictionary. This code will work with any keys, and the keys in the various dictionaries can differ, as long as the values are lists. Therefore this answer is more general than the other answers given so far.
from collections import defaultdict
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},
{'number':[3,4], 'alphabet':['d','e']},
{'number':[6,7], 'alphabet':['e','f']},
]
merge_arr_default = defaultdict(list)
for adict in arr:
    for key, value in adict.items():
        merge_arr_default[key].extend(value)
merge_arr = dict(merge_arr_default)
print(merge_arr)
The printed result is
{'number': [1, 2, 3, 4, 3, 4, 6, 7], 'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f']}
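To illustrate the claim that the keys may differ between dictionaries, here is a minimal sketch that wraps the same loop in a function. The name `merge_list_dicts` and the sample data with a `'symbol'` key are made up for this illustration and are not part of the answer above.

```python
from collections import defaultdict


def merge_list_dicts(dicts):
    """Concatenate the list values of several dicts, key by key."""
    merged = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            merged[key].extend(value)
    return dict(merged)


# Hypothetical input: the dictionaries do not share the same keys.
uneven = [
    {'number': [1, 2], 'alphabet': ['a']},
    {'number': [3]},
    {'symbol': ['#']},
]
merged_uneven = merge_list_dicts(uneven)
print(merged_uneven)
# {'number': [1, 2, 3], 'alphabet': ['a'], 'symbol': ['#']}
```

Because `extend` copies the items into a fresh list, the input dictionaries are left unchanged.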
EDIT: As noted by @pault, the solution below has quadratic complexity and is therefore not recommended for large lists. There are more efficient ways to go about it.
However if you’re looking for compactness and relative simplicity, keep reading.
If you want a more functional form, this two-liner will do:
from functools import reduce  # reduce is not a built-in in Python 3
arr = [{'number':[1,2,3,4], 'alphabet':['a','b','c']},{'number':[3,4], 'alphabet':['d','e']},{'number':[6,7], 'alphabet':['e','f']}]
keys = ['number', 'alphabet']
merge_arr = {key: reduce(list.__add__, [d[key] for d in arr]) for key in keys}
print(merge_arr)
Outputs:
{'alphabet': ['a', 'b', 'c', 'd', 'e', 'e', 'f'], 'number': [1, 2, 3, 4, 3, 4, 6, 7]}
This won't merge recursively.
If you want it to work with arbitrary keys, not present in each dict, use:
keys = {k for d in arr for k in d}
merge_arr = {key: reduce(list.__add__, [d.get(key, []) for d in arr]) for key in keys}
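As for the quadratic cost mentioned in the EDIT, one possible linear-time variant replaces the repeated `list.__add__` with `itertools.chain.from_iterable`. This is only a sketch, assuming every value is a list; the names `all_keys` and `merged_linear` are chosen for this example.

```python
from itertools import chain

arr = [
    {'number': [1, 2, 3, 4], 'alphabet': ['a', 'b', 'c']},
    {'number': [3, 4], 'alphabet': ['d', 'e']},
    {'number': [6, 7], 'alphabet': ['e', 'f']},
]

# Collect every key that appears in any of the dictionaries.
all_keys = {k for d in arr for k in d}

# chain.from_iterable walks each list once, so no intermediate
# concatenated lists are built along the way.
merged_linear = {
    key: list(chain.from_iterable(d.get(key, []) for d in arr))
    for key in all_keys
}
print(merged_linear)
```

The key order of the printed dictionary may vary between runs because `all_keys` is a set; the lists themselves keep the order of `arr`.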
Related
I am trying to calculate a “score” for each key in a dictionary. The values for the key values are in a different list. Simplified example:
I have:
Key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
My_dict = {'player1': ['a', 'd', 'c'], 'player2': ['b', 'a', 'd']}
I want:
Scores = {'player1': 8, 'player2': 7}
You can create it using a dict comprehension:
Key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
My_dict = {'player1': ['a', 'd', 'c'], 'player2': ['b', 'a', 'd']}
scores = {player: sum(Key_values[mark] for mark in marks) for player, marks in My_dict.items()}
print(scores)
# {'player1': 8, 'player2': 7}
Try this:
>>> Key_values = {"a" : 1, "b" : 2, "c": 3, "d" : 4}
>>> My_dict = {"player1":["a", "d", "c"], "player2":["b", "a", "d"]}
>>> Scores= {k: sum(Key_values.get(v_el, 0) for v_el in v) for k,v in My_dict.items()}
>>> Scores
{'player1': 8, 'player2': 7}
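The `.get(v_el, 0)` call above behaves differently from plain indexing when a mark has no entry in `Key_values`. The following sketch shows the difference; the mark `'z'` is invented for this example and does not appear in the question.

```python
Key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
# Hypothetical data: 'z' has no entry in Key_values.
My_dict = {'player1': ['a', 'd', 'z'], 'player2': ['b', 'a', 'd']}

# .get(mark, 0) silently counts unknown marks as 0.
lenient = {p: sum(Key_values.get(m, 0) for m in marks) for p, marks in My_dict.items()}
print(lenient)  # {'player1': 5, 'player2': 7}

# Plain indexing raises KeyError on the first unknown mark.
try:
    strict = {p: sum(Key_values[m] for m in marks) for p, marks in My_dict.items()}
except KeyError as exc:
    strict = None
    print('unknown mark:', exc)  # unknown mark: 'z'
```

Which behaviour is preferable depends on whether an unknown mark should count as zero or be treated as a data error.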
Try this:
score = {}
key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
my_dict = {'player1': ['a', 'c', 'd'], 'player2': ['b', 'a', 'd']}
scr = 0
for i in my_dict.keys():  # to get all keys from my_dict
    for j in my_dict[i]:  # iterate the value list for key.
        scr += key_values[j]
    score[i] = scr
    scr = 0
print(score)
Try this (updated the syntax from the question: key-value pairs are enclosed within curly braces):
Key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
My_dict = {'player1': ['a', 'd', 'c'], 'player2': ['b', 'a', 'd']}
Scores = dict()
for key, value in My_dict.items():
    total = 0
    for val in value:
        total += Key_values[val]
    Scores[key] = total
print(Scores)
# {'player1': 8, 'player2': 7}
You can do it with dict.get and map; this is probably among the fastest of the approaches posted so far, although that is worth measuring on your own data.
Key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
My_dict = {'player1': ['a', 'd', 'c'], 'player2': ['b', 'a', 'd']}
new_dict = {key:sum(map(Key_values.get,My_dict[key])) for key in My_dict}
print(new_dict)
Output:
{'player1': 8, 'player2': 7}
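The speed claim is easy to check locally. Here is a rough sketch using `timeit` from the standard library; the function names `with_map` and `with_generator` are made up for the comparison, and the timings will depend on the machine and Python version, so no numbers are shown.

```python
import timeit

Key_values = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
My_dict = {'player1': ['a', 'd', 'c'], 'player2': ['b', 'a', 'd']}


def with_map():
    # map + dict.get, as in the answer above
    return {key: sum(map(Key_values.get, My_dict[key])) for key in My_dict}


def with_generator():
    # generator expression with plain indexing
    return {p: sum(Key_values[m] for m in marks) for p, marks in My_dict.items()}


# Both variants should agree before their timings are compared.
assert with_map() == with_generator()

for fn in (with_map, with_generator):
    seconds = timeit.timeit(fn, number=100_000)
    print(f'{fn.__name__}: {seconds:.3f}s for 100,000 calls')
```

With inputs this small the difference is likely to be negligible; repeating the measurement with realistic data sizes gives a more meaningful answer.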
My problem is that I have a nested list:
l = [
['a','apple',1],
['b', 'banana', 0],
['a', 'artichoke', 'antenna'],
['b', 'brocolli', 'baton'],
['c', None, 22]
]
and I want to merge the lists that share a common index value (their first element), without sorting the resulting list.
My preferred output:
[
['a','apple', 1, 'artichoke', 'antenna'],
['b', 'banana', 0, 'brocolli', 'baton'],
['c', None, 22]
]
I found the solution from here and here
But the output I'm getting is somehow sorted. My current output is:
[['c', None, 22], [1, 'antenna', 'apple', 'artichoke', 'a'], [0, 'b', 'banana', 'brocolli', 'baton']]
My code goes:
len_l = len(l)
i = 0
while i < (len_l - 1):
    for j in range(i + 1, len_l):
        # i,j iterate over all pairs of l's elements including new
        # elements from merged pairs. We use len_l because len(l)
        # may change as we iterate
        i_set = set(l[i])
        j_set = set(l[j])
        if len(i_set.intersection(j_set)) > 0:
            # Remove these two from list
            l.pop(j)
            l.pop(i)
            # Merge them and append to the orig. list
            ij_union = list(i_set.union(j_set))
            l.append(ij_union)
            # len(l) has changed
            len_l -= 1
            # adjust 'i' because elements shifted
            i -= 1
            # abort inner loop, continue with next l[i]
            break
    i += 1
print(l)
I would appreciate any help here, and I'm also open to suggestions on how to do this in an easier way, because honestly I haven't used the union() or intersection() methods before.
Thanks.
You can use a dictionary with the first element of each list as the key and extend a list each time as they're encountered in the list-of-lists, eg:
data = [
['a','apple',1],
['b', 'banana', 0],
['a', 'artichoke', 'antenna'],
['b', 'brocolli', 'baton'],
['c', None, 22]
]
Then we:
d = {}
for k, *vals in data:
    d.setdefault(k, []).extend(vals)
Optionally, you can use d = collections.OrderedDict() here if you need to guarantee that the keys stay in the order seen in the list (on Python 3.7+, a plain dict already preserves insertion order).
Which gives you a d of:
{'a': ['apple', 1, 'artichoke', 'antenna'],
'b': ['banana', 0, 'brocolli', 'baton'],
'c': [None, 22]}
If you then want to unpack back to a list of lists (although it's probably more useful as a dict), you can do:
new_data = [[k, *v] for k, v in d.items()]
To get:
[['a', 'apple', 1, 'artichoke', 'antenna'],
['b', 'banana', 0, 'brocolli', 'baton'],
['c', None, 22]]
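Putting the two steps together, the approach could be sketched as a single helper. The function name `merge_by_first` is hypothetical, and the sketch assumes Python 3.7+ so that a plain dict keeps the keys in first-seen order.

```python
def merge_by_first(rows):
    """Merge rows that share the same first element, keeping first-seen order."""
    grouped = {}
    for key, *rest in rows:
        # Create the list on first sight of the key, then append the remaining items.
        grouped.setdefault(key, []).extend(rest)
    # Rebuild rows with the key back in front.
    return [[key, *rest] for key, rest in grouped.items()]


data = [
    ['a', 'apple', 1],
    ['b', 'banana', 0],
    ['a', 'artichoke', 'antenna'],
    ['b', 'brocolli', 'baton'],
    ['c', None, 22],
]
merged = merge_by_first(data)
print(merged)
# [['a', 'apple', 1, 'artichoke', 'antenna'], ['b', 'banana', 0, 'brocolli', 'baton'], ['c', None, 22]]
```

Unlike the set-based code in the question, nothing here goes through a `set`, so the items within each row keep their original order as well.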
I'm pulling data from the database; assume I have something like this:
Product Name    Quantity
a               3
a               5
b               2
c               7
I want to sum the Quantity by Product Name, so this is what I want:
product = {'a':8, 'b':2, 'c':7 }
Here's what I'm trying to do after fetching the data from the database:
for row in result:
    product[row['product_name']] += row['quantity']
but this will give me: 'a'=5 only, not 8.
Option 1: pandas
This is one way, assuming you begin with a pandas dataframe df. This solution has O(n log n) complexity.
product = df.groupby('Product Name')['Quantity'].sum().to_dict()
# {'a': 8, 'b': 2, 'c': 7}
The idea is you can perform a groupby operation, which produces a series indexed by "Product Name". Then use the to_dict() method to convert to a dictionary.
Option 2: collections.Counter
If you begin with a list or iterator of results, and wish to use a for loop, you can use collections.Counter for O(n) complexity.
from collections import Counter
result = [['a', 3],
['a', 5],
['b', 2],
['c', 7]]
product = Counter()
for row in result:
    product[row[0]] += row[1]
print(product)
# Counter({'a': 8, 'c': 7, 'b': 2})
Option 3: itertools.groupby
You can also use a dictionary comprehension with itertools.groupby. This requires sorting beforehand.
from itertools import groupby
res = {i: sum(list(zip(*j))[1]) for i, j in groupby(sorted(result), key=lambda x: x[0])}
# {'a': 8, 'b': 2, 'c': 7}
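To see why the sort matters: `itertools.groupby` only groups consecutive items with equal keys. The sketch below uses a reordered copy of the data (the order is an assumption made for the demonstration) so that the two `'a'` rows are not adjacent.

```python
from itertools import groupby
from operator import itemgetter

# Same rows as above, but the two 'a' rows are no longer next to each other.
result = [['a', 3], ['b', 2], ['a', 5], ['c', 7]]

# Without sorting, 'a' forms two separate groups and the later one
# overwrites the earlier one in the dict comprehension.
unsorted_totals = {
    key: sum(qty for _, qty in rows)
    for key, rows in groupby(result, key=itemgetter(0))
}
print(unsorted_totals)  # {'a': 5, 'b': 2, 'c': 7}

# Sorting by the key first makes equal keys adjacent.
sorted_totals = {
    key: sum(qty for _, qty in rows)
    for key, rows in groupby(sorted(result, key=itemgetter(0)), key=itemgetter(0))
}
print(sorted_totals)  # {'a': 8, 'b': 2, 'c': 7}
```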
If you insist on using loops, you can do this:
# fake data to make the script runnable
result = [
{'product_name': 'a', 'quantity': 3},
{'product_name': 'a', 'quantity': 5},
{'product_name': 'b', 'quantity': 2},
{'product_name': 'c', 'quantity': 7}
]
# solution with defaultdict and loops
from collections import defaultdict
d = defaultdict(int)
for row in result:
    d[row['product_name']] += row['quantity']
print(dict(d))
The output:
{'a': 8, 'b': 2, 'c': 7}
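Since you mention pandas:
# Series.sum(level=0) was removed in pandas 2.0; groupby(level=0).sum() is the equivalent
df.set_index('ProductName').Quantity.groupby(level=0).sum().to_dict()
Out[20]: {'a': 8, 'b': 2, 'c': 7}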
Use tuples to store the rows.
Edit:
It's not clear whether the data mentioned is really a DataFrame.
If it is, then li = [tuple(x) for x in df.to_records(index=False)]
li = [('a', 3), ('a', 5), ('b', 2), ('c', 7)]
d = dict()
for key, val in li:
    val_old = 0
    if key in d:
        val_old = d[key]
    d[key] = val + val_old
print(d)
Output
{'a': 8, 'b': 2, 'c': 7}
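The explicit `if key in d` check can be folded into a single `dict.get` call with a default. A minimal sketch of that variant follows, using the same `li` data; the name `totals` is chosen for this example.

```python
li = [('a', 3), ('a', 5), ('b', 2), ('c', 7)]

totals = {}
for key, val in li:
    # get() returns 0 the first time a key is seen, so no separate check is needed.
    totals[key] = totals.get(key, 0) + val

print(totals)  # {'a': 8, 'b': 2, 'c': 7}
```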
Let's assume I start with this dictionary:
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,1], [1,7,9], [6,2,3]],
'c': [['a'], [4,5]],
}
How can I append values to 'a', yet still be able to add a new key (say 'd') if I need to? What I tried is:
plus_min_dict = {}
plus_min_dict[key] = reference_dataset[key][line_number]
but it only kept one value per key, apparently overwriting the previous value. I want to update or append, yet still be able to create a new key if it doesn't exist.
Edit: To clarify let's assume this is my initial dictionary:
mydict = {
'a': [[2,4]],}
I do other calculations with another dictionary; let's assume it's:
second_dict = {
'a': [ [5,6]],
'b': [[1,1], [1,7,9]],
'c': [['a'], [4,5]],
}
These calculations showed me that I'm interested in [5,6] of 'a' and [1,7,9] of 'b', so I want mydict to become:
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,7,9]],
}
If I understand your question correctly, you want to append a new value to your dictionary if the key already exists. If so, I would use a defaultdict, for a simple reason: with a defaultdict you can use the += operator to create an entry (if it does not exist) or add to it (if it exists):
from collections import defaultdict
# Your dictionaries
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,1], [1,7,9], [6,2,3]],
'c': [['a'], [4,5]],
}
plus_min_dict = {'a': [[3,3]]}
# d is a DefaultDict containing mydict values
d=defaultdict(list,mydict)
# d_new is a DefaultDict containing plus_min_dict dict
d_new = defaultdict(list, plus_min_dict)
# Add all key,values of d in d_new
for k, v in d.items():
    d_new[k] += v
print(d_new)
Results :
defaultdict(<class 'list'>, {'c': [['a'], [4, 5]], 'a': [[3, 3], [2, 4], [5, 6]], 'b': [[1, 1], [1, 7, 9], [6, 2, 3]]})
Use an if/else check inside a loop:
mydict = {'a': [[2,4]],}
second_dict = {
'a': [ [5,6]],
'b': [[1,1], [1,7,9]],
'c': [['a'], [4,5]]}
missing_values = {
'a': [5,6],
'b': [1,7,9]}
for key, value in missing_values.items():
    if key in mydict:
        mydict[key].append(value)
    else:
        mydict[key] = [value]
print(mydict)
Result:
{'a': [[2, 4], [5, 6]], 'b': [[1, 7, 9]]}
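The if/else branch can also be written with `dict.setdefault`, which returns the existing list or inserts an empty one first. A minimal sketch with the same `mydict` and `missing_values` data:

```python
mydict = {'a': [[2, 4]]}
missing_values = {
    'a': [5, 6],
    'b': [1, 7, 9],
}

for key, value in missing_values.items():
    # setdefault returns mydict[key] if present, otherwise stores and returns [].
    mydict.setdefault(key, []).append(value)

print(mydict)  # {'a': [[2, 4], [5, 6]], 'b': [[1, 7, 9]]}
```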
To append an item into 'a', you can do this:
mydict['a'] += ['test_item']
Or:
mydict['a'].append('test_item')
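The two forms are only interchangeable when the right-hand side of `+=` is a one-element list. When the item being added is itself a list, as in the question, they give different results. A short sketch with made-up values:

```python
mydict = {'a': [[2, 4]]}

# append adds its argument as a single new item.
mydict['a'].append([5, 6])
print(mydict['a'])  # [[2, 4], [5, 6]]

# += extends the list with each element of the right-hand side.
mydict['a'] += [7, 8]
print(mydict['a'])  # [[2, 4], [5, 6], 7, 8]

# To add a sublist with +=, wrap it in another list.
mydict['a'] += [[9, 10]]
print(mydict['a'])  # [[2, 4], [5, 6], 7, 8, [9, 10]]
```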
You can just append:
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,1], [1,7,9], [6,2,3]],
'c': [['a'], [4,5]],
}
mydict['a'].append([7,8])
mydict['d'] = [0,1]
print(mydict)
I have a list of lists of data:
[[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379], ...]
and a list of keys:
['a', 'b', 'c', 'd', 'e']
I want to combine them to a dictionary of lists so it looks like:
{'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84], ...}
I can do this using loops but I am looking for a pythonic way.
It is quite simple:
In [1]: keys = ['a','b','c']
In [2]: values = [[1,2,3],[4,5,6],[7,8,9]]
In [7]: dict(zip(keys, zip(*values)))
Out[7]: {'a': (1, 4, 7), 'b': (2, 5, 8), 'c': (3, 6, 9)}
If you need lists as values:
In [8]: dict(zip(keys, [list(t) for t in zip(*values)]))
Out[8]: {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
or:
In [9]: dict(zip(keys, map(list, zip(*values))))
Out[9]: {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
Use:
{k: [d[i] for d in data] for i, k in enumerate(keys)}
Example:
>>> data=[[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
>>> keys = ["a", "b", "c"]
>>> {k: [d[i] for d in data] for i, k in enumerate(keys)}
{'c': [230.42, 231.42], 'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84]}
Your question has everything in a list so if you want a list of dicts:
l1= [[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
l2 = ['a', 'b', 'c', 'd', 'e',"f"] # added f to match length of sublists
print([{a:list(b)} for a,b in zip(l2,zip(*l1))])
[{'a': [1422029700000, 1422029800000]}, {'b': [230.84, 231.84]}, {'c': [230.42, 231.42]}, {'d': [230.31, 231.31]}, {'e': [230.32, 231.32]}, {'f': [378, 379]}]
If you actually want a dict use a dict comprehension with zip:
print({a:list(b) for a,b in zip(l2,zip(*l1))})
{'f': [378, 379], 'e': [230.32, 231.32], 'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84], 'c': [230.42, 231.42], 'd': [230.31, 231.31]}
Your example also has a list of keys that is shorter than your sublists, so zipping will silently drop values from the sublists; you may want to address that.
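The truncation can be made visible with a small sketch. It uses the five keys from the question against the six-column rows; the `zip_longest` variant is just one possible way to surface the unmatched column, not something the question asked for.

```python
from itertools import zip_longest

l1 = [[1422029700000, 230.84, 230.42, 230.31, 230.32, 378],
      [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
keys = ['a', 'b', 'c', 'd', 'e']  # five keys, but each row has six values

columns = list(zip(*l1))  # transpose rows into columns

# zip stops at the shorter input, so the sixth column is dropped.
truncated = {k: list(col) for k, col in zip(keys, columns)}
print(len(columns), len(truncated))  # 6 5

# A simple guard that reports the mismatch instead of hiding it.
if len(keys) != len(columns):
    print(f'{len(columns) - len(keys)} column(s) have no key')

# zip_longest pads the shorter input with None, so the extra column
# ends up under the key None rather than disappearing.
padded = {k: list(col) for k, col in zip_longest(keys, columns)}
print(padded[None])  # [378, 379]
```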
If you are using Python 2, you can use itertools.izip:
from itertools import izip
print({a:list(b) for a,b in izip(l2,zip(*l1))})