How to construct a defaultdict from a dictionary? - python

If I have d=dict(zip(range(1,10),range(50,61))) how can I build a collections.defaultdict out of the dict?
The only argument defaultdict seems to take is the factory function, will I have to initialize and then go through the original d and update the defaultdict?

Read the docs:
The first argument provides the initial value for the default_factory
attribute; it defaults to None. All remaining arguments are treated
the same as if they were passed to the dict constructor, including
keyword arguments.
from collections import defaultdict
d=defaultdict(int, zip(range(1,10),range(50,61)))
Or given a dictionary d:
from collections import defaultdict
d=dict(zip(range(1,10),range(50,61)))
my_default_dict = defaultdict(int,d)
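As a small sketch of the documentation quote above, the remaining arguments can mix a mapping with keyword arguments, just as with dict(). The variable name dd and the keys used here are made up for illustration:

```python
from collections import defaultdict

# Positional mapping plus a keyword argument, as dict() would accept them.
dd = defaultdict(int, {"a": 1}, b=2)

print(dd["a"])        # existing key taken from the mapping
print(dd["b"])        # existing key taken from the keyword argument
print(dd["missing"])  # absent key: int() is called, which gives 0
```

Note that looking up the missing key also inserts it into dd.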

You can construct a defaultdict from dict, by passing the dict as the second argument.
from collections import defaultdict
d1 = {'foo': 17}
d2 = defaultdict(int, d1)
print(d2['foo'])  # prints 17
print(d2['bar'])  # prints 0 (int() returns 0 for a missing key)

You can create a defaultdict with a dictionary by using a callable.
from collections import defaultdict
def dict_():
    return {'foo': 1}
defaultdict_with_dict = defaultdict(dict_)
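Note that this does something different from converting an existing dict: the defaultdict starts empty, and the callable supplies the value for each missing key. A minimal sketch, with the keys "x" and "y" chosen only for illustration:

```python
from collections import defaultdict

def dict_():
    # Called once per missing key, so every key gets its own new dict.
    return {"foo": 1}

defaultdict_with_dict = defaultdict(dict_)

a = defaultdict_with_dict["x"]
b = defaultdict_with_dict["y"]
a["foo"] = 99  # mutating one default value does not affect the other
print(a, b)
```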

Related

Cast dict to defaultdict

The following code uses dict unpacking inside {} to combine two defaultdicts.
from collections import defaultdict
aa=defaultdict(str)
bb=defaultdict(str)
aa['foo']+= '1'
bb['bar']+= '2'
cc = {**aa,**bb}
type(cc)
But, as we see if we run this, unpacking into {} returns a plain dict, not a defaultdict.
Is there a way to cast a dict back to a defaultdict?
You can use unpacking directly in a call to defaultdict. defaultdict is a subclass of dict, and will pass those arguments to its parent to create a dictionary as though they had been passed to dict. This works only when every key is a string, because the keys are passed as keyword arguments.
cc = defaultdict(str, **aa, **bb)
# defaultdict(<class 'str'>, {'bar': '2', 'foo': '1'})
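One caveat worth checking: keyword unpacking into a call requires string keys, so the call above fails for other key types. The sketch below uses made-up integer keys to show the failure and a variant that builds a plain dict first:

```python
from collections import defaultdict

aa = defaultdict(str, {1: "one"})
bb = defaultdict(str, {2: "two"})

try:
    cc = defaultdict(str, **aa, **bb)  # keyword names must be strings
except TypeError as exc:
    print("keyword unpacking failed:", exc)

# Merging into a plain dict first works for any hashable keys.
cc = defaultdict(str, {**aa, **bb})
print(cc[1], cc[2], repr(cc[3]))
```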
You can do it the long way. The benefit of this method is you don't need to re-specify the type of defaultdict:
def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z
cc = merge_two_dicts(aa, bb)
Unpacking in a single expression works but is inefficient:
n = 500000
d1 = defaultdict(int)
d1.update({i: i for i in range(n)})
d2 = defaultdict(int)
d2.update({i+n:i+n for i in range(n)})
%timeit defaultdict(int, {**d1, **d2}) # 150 ms per loop
%timeit merge_two_dicts(d1, d2) # 90.9 ms per loop
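The claim that the copy-and-update approach keeps the defaultdict type can be checked with a short sketch. It relies on defaultdict.copy() returning another defaultdict with the same default_factory:

```python
from collections import defaultdict

def merge_two_dicts(x, y):
    z = x.copy()   # defaultdict.copy() returns another defaultdict
    z.update(y)
    return z

aa = defaultdict(str)
bb = defaultdict(str)
aa["foo"] += "1"
bb["bar"] += "2"

cc = merge_two_dicts(aa, bb)
print(type(cc).__name__, cc.default_factory)
```

The factory of the result comes from the first argument, so the order matters if the two inputs use different factories.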
The defaultdict constructor can take two arguments, where the first is the function to use for the default, and the second a mapping (dict). It copies the keys/values from the dict passed in.
>>> d = defaultdict(list, {'a': [1,2,3]})
>>> d['a']
[1, 2, 3]

Nested dictionary with defaults

Is there a way to make a nested dictionary such that I can say mydict[x][y][z] += 1, where mydict[x][y][z] did not previously exist, and defaults to 0 (and would be 1 after incrementing)?
I looked into answers to a similar question in which you can say mydict[x][y][z] = 1 using defaultdict from the collections class (Declaring a multi dimensional dictionary in python), but this does not allow you to assume a default value and then increment.
Yes, you can do this with the collections module:
from collections import defaultdict, Counter
d = defaultdict(lambda: defaultdict(lambda: Counter()))
d['A']['B']['C'] += 1
# defaultdict(<function __main__.<lambda>>,
# {'A': defaultdict(<function __main__.<lambda>.<locals>.<lambda>>,
# {'B': Counter({'C': 1})})})
Note this is also possible via only using nested defaultdict:
d = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
However, given Counter was created for the specific purpose of incrementing integers, this would be the method I prefer.
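For inspection, the nested structure can be converted back to ordinary dicts. This is only a sketch; to_plain is a hypothetical helper, not part of collections:

```python
from collections import defaultdict

d = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
d["x"]["y"]["z"] += 1
d["x"]["y"]["z"] += 1

def to_plain(obj):
    # Recursively convert nested defaultdicts into ordinary dicts.
    if isinstance(obj, dict):
        return {k: to_plain(v) for k, v in obj.items()}
    return obj

print(to_plain(d))
```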

Usage of setdefault instead of defaultdict

I need to create a structure like this :
D = {i:{j:{k:0,l:1,m:2}},a:{b:{c:0,d:4}}}
So this can be done using defaultdict:
D = defaultdict(lambda: defaultdict(Counter))
How do I use setdefault here?
EDIT:
Is it possible to combine setdefault and defaultdict?
To build a multi-level dictionary with setdefault() you'd need to repeatedly access the keys like this:
>>> from collections import Counter
>>> d = {}
>>> d.setdefault("i", {}).setdefault("j", Counter())
Counter()
>>> d
{'i': {'j': Counter()}}
To generalize the usage for new keys you could use a function:
def get_counter(d, i, j):
    return d.setdefault(i, {}).setdefault(j, Counter())
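A short usage sketch of get_counter, building part of the structure from the question. The key names are taken from the question as strings and the counts are made up:

```python
from collections import Counter

def get_counter(d, i, j):
    return d.setdefault(i, {}).setdefault(j, Counter())

D = {}
get_counter(D, "i", "j")["k"] += 1
get_counter(D, "i", "j")["k"] += 1
get_counter(D, "a", "b")["d"] += 4
print(D)
```

One trade-off compared with defaultdict: the default arguments ({} and Counter()) are constructed on every call, even when the keys already exist.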

Using dict.fromkeys(), assign each value to an empty dictionary

I've hit a bit of a problem with creating empty dictionaries within dictionaries while using fromkeys(); they all link to the same one.
Here's a quick bit of code to demonstrate what I mean:
a = dict.fromkeys( range( 3 ), {} )
for key in a:
    a[key][0] = key
The output I want is a[0][0]=0, a[1][0]=1, a[2][0]=2, yet they all equal 2 since it's editing the same dictionary 3 times.
If I define the dictionary like a = {0: {}, 1: {}, 2: {}}, it works, but that's not very practical if you need to build it from a bigger list.
With fromkeys, I've tried {}, dict(), dict.copy() and b={}; b.copy(). How would I go about doing this?
The problem is that {} is a single value to fromkeys, and not a factory. Therefore you get the single mutable dict, not individual copies of it.
defaultdict is one way to create a dict that has a builtin factory.
from collections import defaultdict as dd
from pprint import pprint as pp
a = dd(dict)
for key in range(3):
    a[key][0] = key
pp(a)
If you want something more strictly evaluated, you will need to use a dict comprehension or map.
a = {key: {} for key in range(3)}
But then, if you're going to do that, you may as well do it all in one step:
a = {key: {0: key} for key in range(3)}
Just iterate over keys and insert a dict for each key:
{k: {0: k} for k in keys}
Here, keys is an iterable of hashable values such as range(3) in your example.
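The difference between the shared value from fromkeys and the per-key dicts from a comprehension can be seen in a minimal sketch (the names shared and separate are made up for illustration):

```python
# fromkeys stores the very same dict object under every key.
shared = dict.fromkeys(range(3), {})
print(shared[0] is shared[1])

# A comprehension evaluates {} once per key, giving independent dicts.
keys = range(3)
separate = {k: {} for k in keys}
for k in separate:
    separate[k][0] = k
print(separate)
```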

Multi-level defaultdict with variable depth?

I have a large list like:
[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4
I want to build a multi-level dict like:
A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4
I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=3, but this works only if I hardcode those insert statements.
While parsing the list, I don't know beforehand how many []'s I need to call table[key1][key2][...].
You can do it without even defining a class:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()
nest[0][1][2][3][4][5] = 6
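When the number of keys is only known at parse time, one option is to walk the key path in a loop. The sketch below assumes each parsed row has already been turned into a list of keys and a value; set_path and rows are hypothetical names:

```python
from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)

def set_path(table, keys, value):
    # Walk (and auto-create) every level except the last, then assign.
    node = table
    for key in keys[:-1]:
        node = node[key]
    node[keys[-1]] = value

table = nested_dict()
rows = [
    (["A", "B1", "C1"], 1),
    (["A", "B1", "C2"], 2),
    (["A", "B2"], 3),
    (["D", "E", "F", "G"], 4),
]
for keys, value in rows:
    set_path(table, keys, value)

print(table["A"]["B1"]["C2"], table["D"]["E"]["F"]["G"])
```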
Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:
from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value
root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)
Outputs:
1
3
None
You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.
# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)
Usage:
# Create instance
mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0
# Print
import prettyprint
prettyprint.pp(mydict)
Output:
{
"a": 1,
"b": {
"a": 2
},
"c": {},
"d": {
"a": {
"b": 0
}
}
}
I'd do it with a subclass of dict that defines __missing__:
>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}
This is awkward to do with a bare defaultdict, because defaultdict expects the factory function at initialization time, and at that point the factory has to refer by name to something that builds the same kind of defaultdict. The above construct does the same thing that defaultdict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.
This is equivalent to the above, but avoids lambda notation. Perhaps easier to read?
def dict_factory():
    return defaultdict(dict_factory)
your_dict = dict_factory()
Also, from the comments: to add values from an existing dict, you can simply call update on the nested level:
your_dict[0][1][2].update({"some_key":"some_value"})
Dan O'Huiginn posted a very nice solution on his journal in 2010:
http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html
>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
You may achieve this with a recursive defaultdict.
from collections import defaultdict
def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()
It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid using a one-liner lambda version, which is bugged due to Python's late binding closures, and implement this with a def instead.
The accepted answer, using a lambda, has a flaw where instances must rely on the nested_dict name existing in an outer scope. If for whatever reason the factory name can not be resolved (e.g. it was rebound or deleted) then pre-existing instances will also become subtly broken:
>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
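For comparison, a sketch of the same experiment with the closure-based tree() from this answer. Here the factory the_tree is held in the closure, so existing instances keep working after the outer name is deleted:

```python
from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

nest = tree()
nest[0][1] = 2
del tree            # remove the public name
nest[8][9] = 10     # still works: the_tree lives on in the closure
print(nest[8][9])
```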
To add to @Hugo's answer, to impose a maximum depth:
l=lambda x:defaultdict(lambda:l(x-1)) if x>0 else defaultdict(dict)
arr = l(2)
A slightly different possibility that allows regular dictionary initialization:
from collections import defaultdict
def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)
Example:
>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
You could use a NestedDict.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6
The result as a dictionary:
>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}
To install ndicts
pip install ndicts
