I have a list of lists of data:
[[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379], ...]
and a list of keys:
['a', 'b', 'c', 'd', 'e']
I want to combine them into a dictionary of lists so it looks like:
{'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84], ...}
I can do this using loops, but I am looking for a Pythonic way.
It is quite simple:
In [1]: keys = ['a','b','c']
In [2]: values = [[1,2,3],[4,5,6],[7,8,9]]
In [7]: dict(zip(keys, zip(*values)))
Out[7]: {'a': (1, 4, 7), 'b': (2, 5, 8), 'c': (3, 6, 9)}
If you need lists as values:
In [8]: dict(zip(keys, [list(t) for t in zip(*values)]))
Out[8]: {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
or:
In [9]: dict(zip(keys, map(list, zip(*values))))
Out[9]: {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
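To see why this works, it may help to look at the intermediate step: zip(*values) unpacks the rows as separate arguments and regroups them by position, which transposes rows into columns. A minimal sketch, assuming Python 3 and reusing the keys and values from the example above (the names columns and result are only for illustration):

```python
keys = ['a', 'b', 'c']
values = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*values) is equivalent to zip([1, 2, 3], [4, 5, 6], [7, 8, 9]):
# it yields one tuple per column position.
columns = list(zip(*values))
print(columns)  # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]

# Pairing each key with its column gives the dictionary of lists.
result = {k: list(col) for k, col in zip(keys, columns)}
print(result)  # {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]}
```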
Use:
{k: [d[i] for d in data] for i, k in enumerate(keys)}
Example:
>>> data=[[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
>>> keys = ["a", "b", "c"]
>>> {k: [d[i] for d in data] for i, k in enumerate(keys)}
{'c': [230.42, 231.42], 'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84]}
Your question has everything in a list so if you want a list of dicts:
l1= [[1422029700000, 230.84, 230.42, 230.31, 230.32, 378], [1422029800000, 231.84, 231.42, 231.31, 231.32, 379]]
l2 = ['a', 'b', 'c', 'd', 'e',"f"] # added f to match length of sublists
print([{a:list(b)} for a,b in zip(l2,zip(*l1))])
[{'a': [1422029700000, 1422029800000]}, {'b': [230.84, 231.84]}, {'c': [230.42, 231.42]}, {'d': [230.31, 231.31]}, {'e': [230.32, 231.32]}, {'f': [378, 379]}]
If you actually want a dict use a dict comprehension with zip:
print({a:list(b) for a,b in zip(l2,zip(*l1))})
{'f': [378, 379], 'e': [230.32, 231.32], 'a': [1422029700000, 1422029800000], 'b': [230.84, 231.84], 'c': [230.42, 231.42], 'd': [230.31, 231.31]}
Your example also has a list of keys shorter than your sublists, so zipping will silently drop the extra values from each sublist; you may want to address that.
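This truncation is easy to miss because zip stops at the shortest input without raising an error. A minimal sketch of one way to guard against it, assuming Python 3 and a hypothetical helper named columns_by_key that is not part of the original answer:

```python
def columns_by_key(keys, rows):
    """Map each key to the column of values at the same position.

    Raises ValueError instead of silently dropping data when a row
    does not have exactly one value per key.
    """
    for row in rows:
        if len(row) != len(keys):
            raise ValueError(
                "expected %d values per row, got %d" % (len(keys), len(row))
            )
    return {k: list(col) for k, col in zip(keys, zip(*rows))}


# Shortened sample rows (three values each) for illustration.
rows = [[1422029700000, 230.84, 230.42], [1422029800000, 231.84, 231.42]]
print(columns_by_key(['a', 'b', 'c'], rows))

try:
    columns_by_key(['a', 'b'], rows)  # one key too few
except ValueError as exc:
    print("error:", exc)
```

Whether a mismatch should raise, pad with a default, or be ignored depends on the data; raising is simply the most conservative choice.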
If you are using Python 2 you can use itertools.izip:
from itertools import izip
print({a:list(b) for a,b in izip(l2,zip(*l1))})
Consider the following:
>>> # list of length n
>>> idx = ['a', 'b', 'c', 'd']
>>> # list of length n
>>> l_1 = [1, 2, 3, 4]
>>> # list of length n
>>> l_2 = [5, 6, 7, 8]
>>> # first key
>>> key_1 = 'mkt_o'
>>> # second key
>>> key_2 = 'mkt_c'
How do I zip this mess to look like this?
{
'a': {'mkt_o': 1, 'mkt_c': 5},
'b': {'mkt_o': 2, 'mkt_c': 6},
'c': {'mkt_o': 3, 'mkt_c': 7},
'd': {'mkt_o': 4, 'mkt_c': 8},
...
}
The closest I've got is something like this:
>>> dict(zip(idx, zip(l_1, l_2)))
{'a': (1, 5), 'b': (2, 6), 'c': (3, 7), 'd': (4, 8)}
Which of course has tuples as values instead of dictionaries, and
>>> dict(zip(('mkt_o', 'mkt_c'), (1,2)))
{'mkt_o': 1, 'mkt_c': 2}
Which seems like it might be promising, but again, fails to meet requirements.
{k : {key_1 : v1, key_2 : v2} for k,v1,v2 in zip(idx, l_1, l_2)}
Solution 1: You can use zip twice (three times, in fact) with a dictionary comprehension:
idx = ['a', 'b', 'c', 'd']
l_1 = [1, 2, 3, 4]
l_2 = [5, 6, 7, 8]
keys = ['mkt_o', 'mkt_c'] # your keys in another list
new_dict = {k: dict(zip(keys, v)) for k, v in zip(idx, zip(l_1, l_2))}
Solution 2: You can also use zip with a list comprehension:
new_dict = dict(zip(idx, [{key_1: i, key_2: j} for i, j in zip(l_1, l_2)]))
Solution 3: Use a dictionary comprehension on top of zip, as shared in DYZ's answer:
new_dict = {k : {key_1 : v1, key_2 : v2} for k,v1,v2 in zip(idx, l_1, l_2)}
All the above solutions will return new_dict as:
{
'a': {'mkt_o': 1, 'mkt_c': 5},
'b': {'mkt_o': 2, 'mkt_c': 6},
'c': {'mkt_o': 3, 'mkt_c': 7},
'd': {'mkt_o': 4, 'mkt_c': 8}
}
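Of the three, the first form extends most directly to more than two value lists, because the inner keys and the value lists are both handled by zip. A minimal sketch, assuming Python 3; the third list l_3 and the inner key 'mkt_h' are made up for illustration and do not appear in the question:

```python
idx = ['a', 'b', 'c', 'd']
l_1 = [1, 2, 3, 4]
l_2 = [5, 6, 7, 8]
l_3 = [9, 10, 11, 12]               # hypothetical extra list
keys = ['mkt_o', 'mkt_c', 'mkt_h']  # 'mkt_h' is a made-up inner key

# zip(l_1, l_2, l_3) yields one tuple of values per outer key;
# dict(zip(keys, row)) labels each value with its inner key.
new_dict = {k: dict(zip(keys, row)) for k, row in zip(idx, zip(l_1, l_2, l_3))}

print(new_dict['a'])  # {'mkt_o': 1, 'mkt_c': 5, 'mkt_h': 9}
```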
You're working with dicts, lists, indices, keys and would like to transpose the data. It might make sense to work with pandas (DataFrame, .T and .to_dict):
>>> import pandas as pd
>>> idx = ['a', 'b', 'c', 'd']
>>> l_1 = [1, 2, 3, 4]
>>> l_2 = [5, 6, 7, 8]
>>> key_1 = 'mkt_o'
>>> key_2 = 'mkt_c'
>>> pd.DataFrame([l_1, l_2], index=[key_1, key_2], columns = idx)
       a  b  c  d
mkt_o  1  2  3  4
mkt_c  5  6  7  8
>>> pd.DataFrame([l_1, l_2], index=[key_1, key_2], columns = idx).T
   mkt_o  mkt_c
a      1      5
b      2      6
c      3      7
d      4      8
>>> pd.DataFrame([l_1, l_2], index=[key_1, key_2], columns = idx).to_dict()
{'a': {'mkt_o': 1, 'mkt_c': 5},
'b': {'mkt_o': 2, 'mkt_c': 6},
'c': {'mkt_o': 3, 'mkt_c': 7},
'd': {'mkt_o': 4, 'mkt_c': 8}
}
It can also be done with dict, zip, map and repeat from itertools:
>>> from itertools import repeat
>>> dict(zip(idx, map(dict, zip(zip(repeat(key_1), l_1), zip(repeat(key_2), l_2)))))
{'a': {'mkt_c': 5, 'mkt_o': 1}, 'c': {'mkt_c': 7, 'mkt_o': 3}, 'b': {'mkt_c': 6, 'mkt_o': 2}, 'd': {'mkt_c': 8, 'mkt_o': 4}}
I'm pulling data from the database; assume I have something like this:
Product Name  Quantity
a             3
a             5
b             2
c             7
I want to sum the Quantity by Product Name, so this is what I want:
product = {'a':8, 'b':2, 'c':7 }
Here's what I'm trying to do after fetching the data from the database:
for row in result:
    product[row['product_name']] += row['quantity']
but this gives me 'a' = 5 only, not 8.
Option 1: pandas
This is one way, assuming you begin with a pandas dataframe df. This solution has O(n log n) complexity.
product = df.groupby('Product Name')['Quantity'].sum().to_dict()
# {'a': 8, 'b': 2, 'c': 7}
The idea is you can perform a groupby operation, which produces a series indexed by "Product Name". Then use the to_dict() method to convert to a dictionary.
Option 2: collections.Counter
If you begin with a list or iterator of results, and wish to use a for loop, you can use collections.Counter for O(n) complexity.
from collections import Counter
result = [['a', 3],
          ['a', 5],
          ['b', 2],
          ['c', 7]]
product = Counter()
for row in result:
    product[row[0]] += row[1]
print(product)
# Counter({'a': 8, 'c': 7, 'b': 2})
Option 3: itertools.groupby
You can also use a dictionary comprehension with itertools.groupby. This requires sorting beforehand.
from itertools import groupby
res = {i: sum(list(zip(*j))[1]) for i, j in groupby(sorted(result), key=lambda x: x[0])}
# {'a': 8, 'b': 2, 'c': 7}
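The sorting step matters because groupby only merges consecutive rows with the same key. A minimal sketch, assuming Python 3; the helper name totals and the deliberately shuffled rows are introduced here for illustration:

```python
from itertools import groupby

# Same rows as above, but deliberately not sorted by product name.
result = [['a', 3], ['b', 2], ['a', 5], ['c', 7]]

def totals(rows):
    # Sum the quantity (index 1) within each run of equal names (index 0).
    return {name: sum(qty for _, qty in group)
            for name, group in groupby(rows, key=lambda row: row[0])}

unsorted_totals = totals(result)        # 'a' appears as two separate runs
sorted_totals = totals(sorted(result))  # one run per name

print(unsorted_totals)  # {'a': 5, 'b': 2, 'c': 7}
print(sorted_totals)    # {'a': 8, 'b': 2, 'c': 7}
```

Without sorting, the second run of 'a' overwrites the first in the resulting dictionary, so part of the quantity is lost.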
If you insist on using loops, you can do this:
# fake data to make the script runnable
result = [
    {'product_name': 'a', 'quantity': 3},
    {'product_name': 'a', 'quantity': 5},
    {'product_name': 'b', 'quantity': 2},
    {'product_name': 'c', 'quantity': 7}
]
# solution with defaultdict and loops
from collections import defaultdict
d = defaultdict(int)
for row in result:
    d[row['product_name']] += row['quantity']
print(dict(d))
The output:
{'a': 8, 'b': 2, 'c': 7}
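Since you mention pandas:
df.set_index('ProductName').Quantity.sum(level=0).to_dict()
Out[20]: {'a': 8, 'b': 2, 'c': 7}
Note that the level argument of sum has been deprecated and then removed in recent pandas versions; there, .groupby(level=0).sum() is the equivalent spelling.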
Use tuples to store the rows.
Edit:
It is not clear whether the data mentioned is really a DataFrame.
If it is, then li = [tuple(x) for x in df.to_records(index=False)]
li = [('a', 3), ('a', 5), ('b', 2), ('c', 7)]
d = dict()
for key, val in li:
    val_old = 0
    if key in d:
        val_old = d[key]
    d[key] = val + val_old
print(d)
Output
{'a': 8, 'b': 2, 'c': 7}
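The explicit membership test can be folded into dict.get, which returns a default when the key is missing. A minimal sketch, assuming Python 3 and rows that behave like dictionaries with 'product_name' and 'quantity' keys, as in the question (the sample rows are made up):

```python
# Rows shaped like the ones in the question (hypothetical sample data).
result = [
    {'product_name': 'a', 'quantity': 3},
    {'product_name': 'a', 'quantity': 5},
    {'product_name': 'b', 'quantity': 2},
    {'product_name': 'c', 'quantity': 7},
]

product = {}
for row in result:
    name = row['product_name']
    # get() returns 0 for a name not seen yet, so the addition cannot fail.
    product[name] = product.get(name, 0) + row['quantity']

print(product)  # {'a': 8, 'b': 2, 'c': 7}
```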
Let's assume I start with this dictionary:
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,1], [1,7,9], [6,2,3]],
'c': [['a'], [4,5]],
}
How can I append values to 'a', yet still be able to add a new key, say 'd', if I need to? What I tried is:
plus_min_dict = {}
plus_min_dict[key] = reference_dataset[key][line_number]
but it only gave one value per key; apparently each assignment destroyed the previous value. I want to update or append, yet still be able to create a new key if it doesn't exist.
Edit: To clarify, let's assume this is my initial dictionary:
mydict = {
'a': [[2,4]],}
I do other calculations with another dictionary; let's assume it is:
second_dict = {
'a': [ [5,6]],
'b': [[1,1], [1,7,9]],
'c': [['a'], [4,5]],
}
These calculations showed me that I am interested in [5,6] of 'a' and [1,7,9] of 'b', so I want mydict to become:
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,7,9]],
}
If I understand your question correctly, you want to append a new value to your dictionary if the key already exists, and create the key otherwise. If so, I would use a defaultdict, for a simple reason: with a defaultdict you can use the += operator to create the entry (if it does not exist) or extend it (if it exists):
from collections import defaultdict
# Your dictionaries
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,1], [1,7,9], [6,2,3]],
'c': [['a'], [4,5]],
}
plus_min_dict = {'a': [[3,3]]}
# d is a defaultdict containing mydict's values
d = defaultdict(list, mydict)
# d_new is a defaultdict containing plus_min_dict's values
d_new = defaultdict(list, plus_min_dict)
# Add all key/value pairs of d to d_new
for k, v in d.items():
    d_new[k] += v
print(d_new)
Results :
defaultdict(<class 'list'>, {'c': [['a'], [4, 5]], 'a': [[3, 3], [2, 4], [5, 6]], 'b': [[1, 1], [1, 7, 9], [6, 2, 3]]})
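One caveat with this approach: defaultdict(list, some_dict) copies the keys but reuses the same list objects, so += also modifies the dictionary the defaultdict was built from. A minimal sketch, assuming Python 3; the names shared and merged and the shortened sample data are introduced here for illustration:

```python
from collections import defaultdict

mydict = {'a': [[2, 4], [5, 6]], 'b': [[1, 1]]}
plus_min_dict = {'a': [[3, 3]]}

# defaultdict(list, plus_min_dict) copies the keys, but the value lists are
# the very same objects, so += on them also changes plus_min_dict.
shared = defaultdict(list, plus_min_dict)
for k, v in mydict.items():
    shared[k] += v
print(plus_min_dict)  # {'a': [[3, 3], [2, 4], [5, 6]]}

# Copying each outer list first keeps the source dictionary untouched.
plus_min_dict = {'a': [[3, 3]]}
merged = defaultdict(list, {k: list(v) for k, v in plus_min_dict.items()})
for k, v in mydict.items():
    merged[k] += v
print(plus_min_dict)  # {'a': [[3, 3]]}
print(dict(merged))   # {'a': [[3, 3], [2, 4], [5, 6]], 'b': [[1, 1]]}
```

The copies here are shallow: the inner lists such as [2, 4] are still shared, which only matters if they are modified in place later.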
Use an if/else inside a loop:
mydict = {'a': [[2,4]],}
second_dict = {
'a': [ [5,6]],
'b': [[1,1], [1,7,9]],
'c': [['a'], [4,5]]}
missing_values = {
'a': [5,6],
'b': [1,7,9]}
for key, value in missing_values.items():
    if key in mydict:
        mydict[key].append(value)
    else:
        mydict[key] = [value]
print(mydict)
Result:
{'a': [[2, 4], [5, 6]], 'b': [[1, 7, 9]]}
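The same append-or-create logic can be written in one line with dict.setdefault. A minimal sketch, assuming Python 3 and the same mydict and missing_values as in the loop above:

```python
mydict = {'a': [[2, 4]]}
missing_values = {'a': [5, 6], 'b': [1, 7, 9]}

for key, value in missing_values.items():
    # setdefault returns the existing list for key, or inserts and returns
    # a new empty list when the key is absent; append then works either way.
    mydict.setdefault(key, []).append(value)

print(mydict)  # {'a': [[2, 4], [5, 6]], 'b': [[1, 7, 9]]}
```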
To append an item into 'a', you can do this:
mydict['a'] += ['test_item']
Or:
mydict['a'].append('test_item')
You can just append:
mydict = {
'a': [[2,4], [5,6]],
'b': [[1,1], [1,7,9], [6,2,3]],
'c': [['a'], [4,5]],
}
mydict['a'].append([7,8])
mydict['d'] = [0,1]
print(mydict)
I have found many threads on sorting by values, but none of them seems to work for me...
I have a dictionary of lists that contain tuples. Each list has a different number of tuples. I want to sort the dictionary by how many tuples each list contains.
>>> to_format
{"one":[(1,3),(1,4)],"two":[(1,2),(1,2),(1,3)],"three":[(1,1)]}
>>> for key in some_sort(to_format):
...     print key,
two one three
Is this possible?
>>> d = {"one": [(1,3),(1,4)], "two": [(1,2),(1,2),(1,3)], "three": [(1,1)]}
>>> for k in sorted(d, key=lambda k: len(d[k]), reverse=True):
...     print k,
two one three
Here is a universal solution that works on Python 2 & Python 3:
>>> print(' '.join(sorted(d, key=lambda k: len(d[k]), reverse=True)))
two one three
import operator

# named d rather than dict, to avoid shadowing the built-in type
d = {'a': [9,2,3,4,5], 'b': [1,2,3,4, 5, 6], 'c': [], 'd': [1,2,3,4], 'e': [1,2]}
dict_temp = {'a': 'hello', 'b': 'bye', 'c': '', 'd': 'aa', 'e': 'zz'}

def sort_by_values_len(data):
    # map each key to the length of its value
    dict_len = {key: len(value) for key, value in data.items()}
    # sort the (key, length) pairs by length, longest first
    sorted_key_list = sorted(dict_len.items(), key=operator.itemgetter(1), reverse=True)
    # build a list of single-item dicts in that order
    sorted_dict = [{item[0]: data[item[0]]} for item in sorted_key_list]
    return sorted_dict

print(sort_by_values_len(d))
output:
[{'b': [1, 2, 3, 4, 5, 6]}, {'a': [9, 2, 3, 4, 5]}, {'d': [1, 2, 3, 4]}, {'e': [1, 2]}, {'c': []}]
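If a single dictionary is preferred over a list of one-item dictionaries, the sorted pairs can be passed straight back to dict. A minimal sketch, assuming Python 3.7 or later, where dictionaries are guaranteed to keep insertion order; the name by_len is introduced here for illustration:

```python
d = {"one": [(1, 3), (1, 4)], "two": [(1, 2), (1, 2), (1, 3)], "three": [(1, 1)]}

# Sort the (key, list) pairs by list length, longest first, then rebuild a dict.
# This relies on dicts preserving insertion order (Python 3.7+).
by_len = dict(sorted(d.items(), key=lambda kv: len(kv[1]), reverse=True))

print(list(by_len))  # ['two', 'one', 'three']
```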
I wanted to create a dictionary of dictionaries in Python:
Suppose I already have a list which contains the keys:
keys = ['a', 'b', 'c', 'd', 'e']
value = [1, 2, 3, 4, 5]
Suppose I have a data field with numeric values (20 of them).
I want to define a dictionary which stores 4 different dictionaries, each mapping the given keys to the corresponding values:
for i in range(0, 3)
    for j in range(0, 4)
        dictionary[i] = { 'keys[j]' : value[j] }
So basically, it should be like:
dictionary[0] = {'a' : 1, 'b' : 2, 'c' : 3, 'd': 4, 'e':5}
dictionary[1] = {'a' : 1, 'b' : 2, 'c' : 3, 'd': 4, 'e':5}
dictionary[2] = {'a' : 1, 'b' : 2, 'c' : 3, 'd': 4, 'e':5}
dictionary[3] = {'a' : 1, 'b' : 2, 'c' : 3, 'd': 4, 'e':5}
What is the best way to achieve this?
Use a list comprehension and dict(zip(keys,value)) will return the dict for you.
>>> keys = ['a', 'b', 'c', 'd', 'e']
>>> value = [1, 2, 3, 4, 5]
>>> dictionary = [dict(zip(keys,value)) for _ in xrange(4)]
>>> from pprint import pprint
>>> pprint(dictionary)
[{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5},
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5},
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5},
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}]
If you want a dict of dicts then use a dict comprehension:
>>> keys = ['a', 'b', 'c', 'd', 'e']
>>> value = [1, 2, 3, 4, 5]
>>> dictionary = {i: dict(zip(keys,value)) for i in xrange(4)}
>>> pprint(dictionary)
{0: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5},
1: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5},
2: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5},
3: {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}}
An alternative that only zips once (Python 2, where zip returns a list):
from itertools import repeat
map(dict, repeat(zip(keys, value), 4))
Or, maybe, just use dict.copy and construct the dict once:
[d.copy() for d in repeat(dict(zip(keys, value)), 4)]
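Under Python 3 the zip-once idea needs one adjustment, because zip returns a one-shot iterator there: repeating the same iterator would fill only the first dictionary. A minimal sketch, assuming Python 3; the names pairs and dicts are introduced here for illustration:

```python
from itertools import repeat

keys = ['a', 'b', 'c', 'd', 'e']
value = [1, 2, 3, 4, 5]

# In Python 3, zip returns a one-shot iterator, so materialize the pairs
# once before reusing them for every dictionary.
pairs = list(zip(keys, value))
dicts = [dict(p) for p in repeat(pairs, 4)]

# Each dictionary is a separate object: changing one leaves the rest alone.
dicts[0]['a'] = 100
print(dicts[0]['a'], dicts[1]['a'])  # 100 1
```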
For a list of dictionaries:
dictionary = [dict(zip(keys,value)) for i in xrange(4)]
If you really wanted a dictionary of dictionaries like you said:
dictionary = dict((i,dict(zip(keys,value))) for i in xrange(4))
I suppose that would let you use pop or other dict methods that a list does not offer.
BTW: if this is really a data/number crunching application, I'd suggest moving on to numpy and/or pandas, which are great modules.
Edit re: OP comments,
if you want indices for the type of data you are talking about:
# dict keys must be tuples and not lists
[(i,j) for i in xrange(4) for j in xrange(3)]
# same can come from itertools.product
from itertools import product
list(product(xrange(4), xrange(3)))
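Those index pairs can then serve directly as dictionary keys. A minimal sketch, assuming Python 3 (so range replaces xrange) and reusing keys and value from the question; the name grid is introduced here for illustration:

```python
from itertools import product

keys = ['a', 'b', 'c', 'd', 'e']
value = [1, 2, 3, 4, 5]

# One independent dictionary per (i, j) index pair; tuples are hashable,
# so they can serve as dictionary keys where lists cannot.
grid = {(i, j): dict(zip(keys, value)) for i, j in product(range(4), range(3))}

print(len(grid))     # 12
print(grid[(0, 0)])  # {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
```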