Most pythonic way to initialize a dict - python

Suppose I have a dict like this
d = {
    1: [1,4,7],
    2: [2,5,8],
    0: [3,6,9]
}
It can be constructed by
d = {}
for i in range(1,10):
    key = i % 3
    if key not in d: d[key] = []
    d[key].append(i)
I used the line if key not in d: d[key] = [] to check whether the key already exists in the dict and, if not, to initialize it with an empty list.
Is there a more pythonic way to achieve this?

This is probably best handled with a defaultdict, which will automatically create any key-value mapping that is accessed if it doesn't already exist. You pass a callable to the defaultdict constructor that will be used to initialize the value. For example:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d
defaultdict(<class 'list'>, {})
>>> d[3]
[]
>>> d
defaultdict(<class 'list'>, {3: []})
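One detail that is easy to miss, sketched here: only subscript access (d[key]) calls the factory and inserts the key, while membership tests and .get() leave the mapping untouched. The key names below are made up for illustration.

```python
from collections import defaultdict

d = defaultdict(list)

# Membership tests and .get() do not call the factory or insert anything.
has_a = "a" in d
got_b = d.get("b")
keys_before = list(d)

# Subscript access on a missing key calls list() and stores the result.
d["c"].append(1)
keys_after = list(d)

# Any zero-argument callable works as the factory, for example a lambda.
counts = defaultdict(lambda: 0)
counts["x"] += 1

print(has_a, got_b, keys_before, keys_after, counts["x"])
```

So a stray lookup such as d[3] at the prompt is enough to leave an empty list behind, which is worth keeping in mind when the dict is later iterated or serialized.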

Using a comprehension:
>>> {n%3: list(range(n, n+7, 3)) for n in range(1,4)}
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
Using dict.setdefault():
>>> d = {}
>>> for i in range(1, 10):
...     d.setdefault(i%3, []).append(i)
...
>>> d
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
Using defaultdict:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i in range(1, 10):
...     d[i%3].append(i)
...
>>> d
defaultdict(<class 'list'>, {0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]})

from collections import defaultdict
d = defaultdict(list)
for i in range(1,10):
    key = i % 3
    d[key].append(i)
print(d)
out:
defaultdict(<class 'list'>, {0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]})
When each key is encountered for the first time, it is not already in
the mapping; so an entry is automatically created using the
default_factory function which returns an empty list. The
list.append() operation then attaches the value to the new list. When
keys are encountered again, the look-up proceeds normally (returning
the list for that key) and the list.append() operation adds another
value to the list. This technique is simpler and faster than an
equivalent technique using dict.setdefault():
>>> d = {}
>>> for k, v in s:  # s is any iterable of (key, value) pairs
...     d.setdefault(k, []).append(v)
...
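The "simpler and faster" claim can be checked locally. The following is only a rough sketch: the sample data s and the two helper function names are assumptions made for this example, and timings depend on the machine and Python version, so no numbers are shown.

```python
import timeit
from collections import defaultdict

# Hypothetical sample data: (key, value) pairs standing in for `s` above.
s = [(i % 3, i) for i in range(1, 10)]

def with_defaultdict(pairs):
    d = defaultdict(list)
    for k, v in pairs:
        d[k].append(v)
    return d

def with_setdefault(pairs):
    d = {}
    for k, v in pairs:
        # A new empty list is created on every call, even when the key exists.
        d.setdefault(k, []).append(v)
    return d

# Both approaches build the same mapping.
same = dict(with_defaultdict(s)) == with_setdefault(s)

t_default = timeit.timeit(lambda: with_defaultdict(s), number=10_000)
t_setdefault = timeit.timeit(lambda: with_setdefault(s), number=10_000)
print(f"defaultdict: {t_default:.4f}s  setdefault: {t_setdefault:.4f}s")
```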

You can use list slice notation, [start:stop:step]:
d = {}
for i in range(3):
    d[i] = list(range(1,10))[(i+2)%3::3]
This gives:
{0: [3, 6, 9],
 1: [1, 4, 7],
 2: [2, 5, 8]}

Since you haven't shown any input or variable parts, you might just initialize it with the literal you already have:
d = {1: [1,4,7],
     2: [2,5,8],
     0: [3,6,9]}
If you have variable input, you can use collections.defaultdict with list as the factory. Because this operation is very common, several external libraries provide functions for it:
iteration_utilities.groupedby
toolz.groupby
For example:
>>> from iteration_utilities import groupedby
>>> groupedby(range(1, 10), lambda x: x % 3)
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
or:
>>> from toolz import groupby
>>> groupby(lambda x: x % 3, range(1, 10))
{0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
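If pulling in an external library is not an option, the same grouping can be sketched with the standard library alone. The helper name group_by is hypothetical; its argument order mirrors toolz.groupby only for familiarity.

```python
from collections import defaultdict

def group_by(key, iterable):
    """Group the items of `iterable` into lists keyed by key(item)."""
    groups = defaultdict(list)
    for item in iterable:
        groups[key(item)].append(item)
    # Return a plain dict so later lookups of missing keys raise KeyError.
    return dict(groups)

result = group_by(lambda x: x % 3, range(1, 10))
print(result)
```

Converting back to a plain dict at the end is a design choice: callers then cannot create empty groups by accident just by looking up a key.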

Related

Dict from two lists including multiple values for keys

Is there a possibility to create a dict from two lists with same key occurring multiple times without iterating over the whole dataset?
Minimal example:
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]
# hoped for result:
dictionary = dict(???)
dictionary = {1 : [1,8], 2:[2,4], 3:[3,5], 4:[6], 5:[7]}
When using zip the key-value-pair is inserted overwriting the old one:
dictionary = dict(zip(keys,values))
dictionary = {1: 8, 2: 4, 3: 5, 4: 6, 5: 7}
I would be happy with a Multidict as well.
This is one approach that doesn't require two for loops:
from collections import defaultdict
h = defaultdict(list)
for k, v in zip(keys, values):
    h[k].append(v)
print(h)
# defaultdict(<class 'list'>, {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]})
print(dict(h))
# {1: [1, 8], 2: [2, 4], 3: [3, 5], 4: [6], 5: [7]}
This is the only one-liner I could do.
dictionary = {k: [values[i] for i in [j for j, x in enumerate(keys) if x == k]] for k in set(keys)}
It is far from readable. Remember that clear code is always better than pseudo-clever code ;)
Here is an example that I think is easy to follow logically. Unfortunately it does not use zip as you would prefer, nor does it avoid iterating, because a task like this has to involve iterating in some form.
# Your data
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]
# Make result dict
result = {}
for x in range(1, max(keys)+1):
    result[x] = []
# Populate result dict
for index, num in enumerate(keys):
    result[num].append(values[index])
# Print result
print(result)
If you know the range of values in the keys array, you could make this faster by providing the results dictionary as a literal with integer keys and empty list values.
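When pre-populating the result with empty lists, there is one common trap worth a short sketch: dict.fromkeys reuses a single list object for every key. A comprehension over the keys avoids that and also works when the keys are not contiguous integers. The variable name shared is used only for this illustration.

```python
keys = [1, 2, 3, 2, 3, 4, 5, 1]
values = [1, 2, 3, 4, 5, 6, 7, 8]

# Pitfall: every key maps to the *same* list object.
shared = dict.fromkeys(keys, [])
shared[1].append("x")
print(shared[2])  # the append through key 1 is visible under key 2 as well

# One fresh list per distinct key.
result = {k: [] for k in keys}
for k, v in zip(keys, values):
    result[k].append(v)
print(result)
```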

Fastest way to generate dictionary from two lists

I have two lists. For example:
keys = [1, 2, 3, 2, 4, 2, 1]
and
values = [1, 2, 3, 4, 5, 6, 7]
I want to create a dictionary of lists out of them as shown below:
dict = {1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
What is the fastest way to do it and what generates the efficiency gains, both by using any module and also by not importing any additional module?
Using collections.defaultdict
from collections import defaultdict
d_dict = defaultdict(list)
for k,v in zip(keys, values):
    d_dict[k].append(v)
dict(d_dict)
#{1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
You can use the following:
d = {}
for k, v in zip(keys, values):
    d.setdefault(k, []).append(v)
print(d)
which outputs:
{1: [1, 7], 2: [2, 4, 6], 3: [3], 4: [5]}
You could use defaultdict for your case. If a key does not exist, the defaultdict calls the factory (list in this case) and, instead of raising KeyError, stores and returns a new empty list, to which the value is then appended:
from collections import defaultdict
ld = defaultdict(list)
for k,v in zip(keys, values):
    ld[k].append(v)
print(ld)
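To make the mechanism concrete, here is a minimal sketch of roughly what happens on a missing key, written as a dict subclass with a __missing__ hook. The class name ListDefaultDict is made up for illustration; in real code collections.defaultdict is the better choice.

```python
class ListDefaultDict(dict):
    """Hypothetical stand-in for defaultdict(list), for illustration only."""

    def __missing__(self, key):
        # dict.__getitem__ calls this hook when `key` is absent.
        value = self[key] = []
        return value

keys = [1, 2, 3, 2, 4, 2, 1]
values = [1, 2, 3, 4, 5, 6, 7]

ld = ListDefaultDict()
for k, v in zip(keys, values):
    ld[k].append(v)
print(ld)
```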

How to generate a list with repeating key from a dictionary?

I have a dictionary
a_dict = {1: 1, 4: 2, 5: 3, 6: 4}
I want to create a list such that the dict key appears value number of times:
a_list = [1, 4, 4, 5, 5, 5, 6, 6, 6, 6]
My current code is like this:
a_list = []
for key in a_dict.keys():
    for value in a_dict.values():
I do not know what to do next.
This can be done in a concise way using a list comprehension with nested for loops:
>>> d = {1: 1, 4: 2, 5: 3, 6: 4}
>>> [k for k, v in d.items() for _ in range(v)]
[1, 4, 4, 5, 5, 5, 6, 6, 6, 6]
However, please note that before Python 3.7 dict did not guarantee any ordering, so on older versions the order of keys in the resulting list is arbitrary. From Python 3.7 on, it follows insertion order.
May I ask for which purpose you want to use the resulting list? Maybe there is a better way of solving the actual problem.
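On the ordering caveat: a small sketch, assuming Python 3.7 or later, where dicts keep insertion order (earlier versions made no such guarantee). Sorting the items first makes the order of the result independent of how the dict was built. The dict below is deliberately filled out of order.

```python
d = {6: 4, 1: 1, 5: 3, 4: 2}  # deliberately inserted out of order

# Iterating the dict follows insertion order on Python 3.7+.
as_inserted = [k for k, v in d.items() for _ in range(v)]

# Sorting the items first makes the order explicit on any version.
by_key = [k for k, v in sorted(d.items()) for _ in range(v)]

print(as_inserted)
print(by_key)
```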
How about this?
a = {1: 1, 4: 2, 5: 3, 6: 4}
result = []  # avoid naming this `list`, which would shadow the built-in
for key, value in a.items():
    result.extend([key] * value)
print(result)
A rather ugly list comprehension:
[vals for tuplei in d.items() for vals in [tuplei[0]] * tuplei[1]]
yields
[1, 4, 4, 5, 5, 5, 6, 6, 6, 6]
Slightly more readable (resulting in the same output):
[vals for (keyi, vali) in d.items() for vals in [keyi] * vali]
An itertools solution:
import itertools
list(itertools.chain.from_iterable([[k]*v for k, v in d.items()]))
will also give
[1, 4, 4, 5, 5, 5, 6, 6, 6, 6]
Short explanation:
[[k]*v for k, v in d.items()]
creates
[[1], [4, 4], [5, 5, 5], [6, 6, 6, 6]]
which is then flattened.
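Two closely related variants, sketched under the assumption that the values are non-negative integer counts: itertools.repeat avoids building the intermediate sublists, and collections.Counter.elements() expresses the same idea directly.

```python
from collections import Counter
from itertools import chain, repeat

d = {1: 1, 4: 2, 5: 3, 6: 4}

# chain + repeat: lazily yields each key `v` times without building sublists.
via_repeat = list(chain.from_iterable(repeat(k, v) for k, v in d.items()))

# Counter.elements() repeats each key as many times as its count.
via_counter = list(Counter(d).elements())

print(via_repeat)
print(via_counter)
```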
You are not missing much!
a_dict = {1: 1, 4: 2, 5: 3, 6: 4}
a_list = []
for key, value in a_dict.items():
    a_list.extend([key]*value)
print(a_list)
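counts = {1: 1, 4: 2, 5: 3, 6: 4}  # renamed from `dict` to avoid shadowing the built-in
result = []                        # likewise renamed from `list`
for key, value in counts.items():
    i = 0
    while i < value:
        result.append(key)
        i += 1
print(result)
This should do the trick.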

Find intersection of dictionary values which are lists

I have a list of dictionaries that all have the same keys but different values:
[{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},
{1:[2,3,4], 2:[6,7], 3:[1,3,5]},
...]
I would like to get intersection as dictionary under same keys like this:
{1:[2,3,4], 2:[6,7], 3:[1,3,5]}
Give this a try?
dicts = [{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},{1:[2,3,4], 2:[6,7], 3:[1,3,5]}]
result = { k: set(dicts[0][k]).intersection(*(d[k] for d in dicts[1:])) for k in dicts[0].keys() }
print(result)
# Output:
# {1: {2, 3, 4}, 2: {6, 7}, 3: {1, 3, 5}}
If you want lists instead of sets as the output value type, just throw a list(...) around the set intersection.
For a list of dictionaries, reduce the whole list like this:
>>> from functools import reduce
>>> d = [{1:[1,2,3,4,5], 2:[6,7,8], 3:[1,3,5,7,9]},{1:[2,3,4], 2:[6,7], 3:[1,3,5]}]
>>> reduce(lambda x, y: {k: sorted(list(set(x[k])&set(y[k]))) for k in x.keys()}, d)
{1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}
I would probably do something along these lines:
# `dictionaries` is the list of dicts from the question.
# Take the first dict and convert the values to `set`.
output = {k: set(v) for k, v in dictionaries[0].items()}
# For the rest of the dicts, update the set at a given key by intersecting it with each of the other lists that have the same key.
for d in dictionaries[1:]:
    for k, v in output.items():
        output[k] = v.intersection(d[k])
There are different variations on this same theme, but I find this one to be about as simple to read as it gets (and since code is read more often than it is written, I consider that a win :-)
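For reuse, the same idea can be wrapped in a function. This is only a sketch: the name intersect_values is hypothetical, and it assumes a non-empty list, that every dict has the keys of the first one, and that the list elements are hashable. Unlike the set-based versions, it returns lists and keeps the element order of the first dict.

```python
def intersect_values(dicts):
    """Intersect list values key by key, keeping the first dict's element order."""
    first, *rest = dicts
    result = {}
    for key, items in first.items():
        # Elements present under `key` in every dict.
        common = set(items).intersection(*(d[key] for d in rest))
        result[key] = [x for x in items if x in common]
    return result

dictionaries = [
    {1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
    {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]},
]
output = intersect_values(dictionaries)
print(output)
```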
In Python 2, you can use dict.viewkeys and dict.viewitems (in Python 3, keys() and items() already return such views):
In [103]: dict.viewkeys?
Docstring: D.viewkeys() -> a set-like object providing a view on D's keys
dict.viewitems?
Docstring: D.viewitems() -> a set-like object providing a view on D's items
a = [{1: [1, 2, 3, 4, 5], 2: [6, 7, 8], 3: [1, 3, 5, 7, 9]},
{1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}]
# Note: `x and y` simply returns y here, so use `&` to actually intersect.
In [100]: {k: sorted(set(a[0][k]) & set(a[1][k])) for k in a[0].viewkeys() & a[1].viewkeys()}
Out[100]: {1: [2, 3, 4], 2: [6, 7], 3: [1, 3, 5]}

Fast dictionary population with list of keys

d = {} # or d = defaultdict(int)
list_of_lists = [[9, 7, 5, 3, 1], [2, 1, 3, 2, 5, 3, 7], [3, 5, 8, 1]]
for lst in list_of_lists:
    for key in lst:
        try:
            d[key] += 1
        except KeyError:  # first time this key is seen
            d[key] = 1
Is there a way to perform this operation without the for-loops?
Using a collections.Counter() object and a generator expression:
from collections import Counter
d = Counter(i for nested in list_of_lists for i in nested)
or replacing the generator expression with itertools.chain.from_iterable():
from itertools import chain
d = Counter(chain.from_iterable(list_of_lists))
Demo:
>>> from collections import Counter
>>> from itertools import chain
>>> list_of_lists = [[9, 7, 5, 3, 1], [2, 1, 3, 2, 5, 3, 7], [3, 5, 8, 1]]
>>> Counter(i for nested in list_of_lists for i in nested)
Counter({3: 4, 1: 3, 5: 3, 2: 2, 7: 2, 8: 1, 9: 1})
>>> Counter(chain.from_iterable(list_of_lists))
Counter({3: 4, 1: 3, 5: 3, 2: 2, 7: 2, 8: 1, 9: 1})
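A few Counter behaviours that are useful once the counts exist, shown as a short sketch (the variable names are only for illustration): missing keys read as 0 without being inserted, most_common returns the highest counts first, and dict() gives back a plain dictionary.

```python
from collections import Counter
from itertools import chain

list_of_lists = [[9, 7, 5, 3, 1], [2, 1, 3, 2, 5, 3, 7], [3, 5, 8, 1]]
counts = Counter(chain.from_iterable(list_of_lists))

top = counts.most_common(1)   # the single most frequent key with its count
missing = counts[42]          # 0 for an unseen key; 42 is not inserted
plain = dict(counts)          # plain dict if the Counter API is not wanted

print(top, missing, plain[1])
```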
My understanding is that you want to count the frequency of each integer in your list of lists.
You can do this with numpy.bincount, which works on non-negative integers. The actual counting is very fast, as the core of NumPy is written in C. Some work is needed to get the data into dictionary format; you could potentially just use the numpy.array it returns. Most of this code only converts between formats, which you can drop if your application allows.
list_of_lists = [[9, 7, 5, 3, 1], [2, 1, 3, 2, 5, 3, 7], [3, 5, 8, 1]]
import numpy as np
x = sum(list_of_lists, []) #convert your list of lists to a flat list
y = np.bincount(x) #count frequency of each element
#convert to dict
d = {}
ctr = 0
while ctr < len(y):
    d[ctr] = y[ctr]
    ctr += 1
If you are allergic to Counter (the right answer BTW), you can use setdefault:
d={}
for key in (e for sl in list_of_lists for e in sl):
    d[key] = d.setdefault(key,0) + 1
