Recursive collections.defaultdict initialization

Given a Python dict, which I call datainit, my goal is to create a recursive collections.defaultdict, which I call data, such that data is initialised with datainit and can be extended with any path of missing keys, as illustrated below:
from collections import defaultdict
datainit = {'number': 1}
data = something_with(defaultdict(), datainit)
data['A']['B']['C']=3
#At this stage, I want:
#data['A']['B']['C'] ==3
#data['number'] == 1
#and nothing else.
The normal way to do that starting with an empty dict is, for instance:
nested_dict = lambda: defaultdict(nested_dict)
data = nested_dict()
Trying:
nested_dict = lambda: defaultdict(nested_dict, datainit)
data = nested_dict()
Will logically result in my datainit being duplicated for each missing key:
>>> datainit={'number':1}
>>> nested_dict = lambda: defaultdict(nested_dict, datainit)
>>> data=nested_dict()
>>> data
defaultdict(<function <lambda> at 0x7f58e5323758>, {'number': 1})
>>> data['A']['B']['C']=2
>>> data
defaultdict(<function <lambda> at 0x7f58e5323758>, {'A': defaultdict(<function <lambda> at 0x7f58e5323758>, {'B': defaultdict(<function <lambda> at 0x7f58e5323758>, {'C': 2, 'number': 1}), 'number': 1}),
'number': 1})
All this makes sense, but how do I start with an initial dict and then use an empty dict for each missing key?
What should my something_with(defaultdict(), datainit) be?
Probably obvious, but I cannot see it!

You have two tiers; the top level defaultdict, which must have the number key, and a series of arbitrarily nested dictionaries, which must not. Your mistake is to try to treat these as one, and to try to treat 'number' as something the factory for missing values should handle.
Just set the number key in the top dictionary. There is just one such key, with a value, and it should not be handled by the defaultdict() factory. The factory is there to provide a default value for arbitrary missing keys; number is not an arbitrary key.
from collections import defaultdict
def topleveldict():
    nested_dict = lambda *a, **kw: defaultdict(nested_dict, *a, **kw)
    return nested_dict(number=1) # or nested_dict({'number': 1})
data = topleveldict()
The topleveldict() function is only needed if you plan to create the structure in multiple places in your codebase. If there is just one such object or just one place you create these, then just inline the code from that function:
nested_dict = lambda *a, **kw: defaultdict(nested_dict, *a, **kw)
data = nested_dict(number=1) # or nested_dict({'number': 1})
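As a quick check of the idea above, here is a minimal sketch that uses a def instead of a lambda and seeds only the top level with datainit. The names datainit and data come from the question; everything else is illustrative.

```python
from collections import defaultdict

def nested_dict(*args, **kwargs):
    # Every missing key gets a fresh, empty nested_dict();
    # only the instance we create explicitly receives initial data.
    return defaultdict(nested_dict, *args, **kwargs)

datainit = {'number': 1}
data = nested_dict(datainit)      # top level is seeded once
data['A']['B']['C'] = 3           # missing levels are created empty

print(data['number'])             # 1
print(data['A']['B']['C'])        # 3
print('number' in data['A'])      # False: datainit is not duplicated
```

The initial dict is passed only in the explicit call, so the factory itself always produces empty children.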

Related

Can someone explain what this does "defaultdict(lambda:0)" [duplicate]

In someone else's code I read the following two lines:
x = defaultdict(lambda: 0)
y = defaultdict(lambda: defaultdict(lambda: 0))
As the argument of defaultdict is a default factory, I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed. Am I correct?
And what about y? It seems that the default factory will create a defaultdict with default 0. But what does that mean concretely? I tried to play around with it in Python shell, but couldn't figure out what it is exactly.

I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed.
That's right. This is more idiomatically written
x = defaultdict(int)
In the case of y, when you do y["ham"]["spam"], the key "ham" is inserted in y if it does not exist. The value associated with it becomes a defaultdict in which "spam" is automatically inserted with a value of 0.
I.e., y is a kind of "two-tiered" defaultdict. If "ham" not in y, then evaluating y["ham"]["spam"] is like doing
y["ham"] = {}
y["ham"]["spam"] = 0
in terms of ordinary dict.

You are correct for what the first one does. As for y, it will create a defaultdict with default 0 when a key doesn't exist in y, so you can think of this as a nested dictionary. Consider the following example:
from collections import defaultdict
y = defaultdict(lambda: defaultdict(lambda: 0))
print(y['k1']['k2'])   # 0
print(dict(y['k1']))   # {'k2': 0}
To create an equivalent nested dictionary structure without defaultdict you would need to create an inner dict for y['k1'] and then set y['k1']['k2'] to 0, but defaultdict does all of this behind the scenes when it encounters keys it hasn't seen:
y = {}
y['k1'] = {}
y['k1']['k2'] = 0
The following function may help for playing around with this on an interpreter to better your understanding:
def to_dict(d):
    if isinstance(d, defaultdict):
        return dict((k, to_dict(v)) for k, v in d.items())
    return d
This will return the dict equivalent of a nested defaultdict, which is a lot easier to read, for example:
>>> y = defaultdict(lambda: defaultdict(lambda: 0))
>>> y['a']['b'] = 5
>>> y
defaultdict(<function <lambda> at 0xb7ea93e4>, {'a': defaultdict(<function <lambda> at 0xb7ea9374>, {'b': 5})})
>>> to_dict(y)
{'a': {'b': 5}}

defaultdict takes a zero-argument callable to its constructor, which is called when the key is not found, as you correctly explained.
lambda: 0 will of course always return zero, but the preferred method to do that is defaultdict(int), which will do the same thing.
As for the second part, the author would like to create a new defaultdict(int), or a nested dictionary, whenever a key is not found in the top-level dictionary.

The other answers are good; I am adding this one to give a bit more information:
"defaultdict requires an argument that is callable. The return value of that callable is the default value that the dictionary returns when you try to access the dictionary with a key that does not exist."
Here's an example:
>>> SAMPLE = {'Age': 28, 'Salary': 2000}
>>> SAMPLE = defaultdict(lambda: 0, SAMPLE)
>>> SAMPLE
defaultdict(<function <lambda> at 0x0000000002BF7C88>, {'Salary': 2000, 'Age': 28})
>>> SAMPLE['Age']
28
>>> SAMPLE['Phone']  # you get 0 for a key that does not exist in SAMPLE
0

y = defaultdict(lambda:defaultdict(lambda:0))
will be helpful if you try this y['a']['b'] += 1
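To make the y['a']['b'] += 1 use case concrete, here is a small sketch that counts which word follows which in a sentence. The sample text and the name pair_counts are made up for illustration.

```python
from collections import defaultdict

# Outer keys: a word; inner keys: the word that follows it; values: counts.
pair_counts = defaultdict(lambda: defaultdict(int))

words = "the cat saw the dog and the cat".split()
for first, second in zip(words, words[1:]):
    # Neither level needs to exist beforehand.
    pair_counts[first][second] += 1

print(pair_counts['the']['cat'])  # 2
print(pair_counts['the']['dog'])  # 1
```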

Why does dict.update(key=val) not use the string referenced by key?

As the title suggests, I am trying to update a dictionary using the update() method, as in the following code block:
for key, val in my_dict.items():
    new_dict.update(key=val)
If my_dict = {'a': 1, 'b': 2} I would expect the result to be that new_dict = {'a': 1, 'b': 2} (assuming of course that new_dict is already defined). However, when executed, I instead get new_dict = {'key': 2}.
What am I doing wrong?

Keyword arguments always use the fixed identifier as the key. Use keyword expansion instead.
new_dict.update(**{key: val})
Or if new_dict really is a dict, just pass the dict itself.
new_dict.update({key: val})

Here is a sketch of the update method, so you can see why it behaves the way it does (it is not the real source code, just an illustration):
def update(self, other_dict=None, **kwargs):
    if other_dict is not None:
        for k, v in other_dict.items():
            self[k] = v
    for k, v in kwargs.items():
        self[k] = v
So if you call new_dict.update(key=val), kwargs will be equal to {"key": val}.
You need to pass your arguments inside a dictionary if you want to set the new keys dynamically.

update accepts keyword arguments, a dictionary, or an iterable of key/value pairs. You can just pass your dictionary as the first argument:
new_dict.update(my_dict)
update is designed to work with several keys at once. If you just want to set a single value, assign it directly:
new_dict[key] = value
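The difference between the keyword form and the mapping form can be seen side by side in this short sketch. The names wrong and right are only for illustration; my_dict is the dictionary from the question.

```python
my_dict = {'a': 1, 'b': 2}

# Keyword form: the literal name "key" becomes the dictionary key,
# so each iteration overwrites the same entry.
wrong = {}
for key, val in my_dict.items():
    wrong.update(key=val)

# Mapping form: the value held by the variable `key` is used.
right = {}
for key, val in my_dict.items():
    right.update({key: val})

print(wrong)  # {'key': 2}
print(right)  # {'a': 1, 'b': 2}
```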

Dictionaries in Python 3 [duplicate]

How do I add a key to an existing dictionary? It doesn't have an .add() method.

You create a new key/value pair on a dictionary by assigning a value to that key:
d = {'key': 'value'}
print(d) # {'key': 'value'}
d['mynewkey'] = 'mynewvalue'
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
If the key doesn't exist, it's added and points to that value. If it exists, the current value it points to is overwritten.

I feel like consolidating info about Python dictionaries:
Creating an empty dictionary
data = {}
# OR
data = dict()
Creating a dictionary with initial values
data = {'a': 1, 'b': 2, 'c': 3}
# OR
data = dict(a=1, b=2, c=3)
# OR
data = {k: v for k, v in (('a', 1), ('b',2), ('c',3))}
Inserting/Updating a single value
data['a'] = 1 # Updates if 'a' exists, else adds 'a'
# OR
data.update({'a': 1})
# OR
data.update(dict(a=1))
# OR
data.update(a=1)
Inserting/Updating multiple values
data.update({'c':3,'d':4}) # Updates 'c' and adds 'd'
Python 3.9+:
The update operator |= now works for dictionaries:
data |= {'c':3,'d':4}
Creating a merged dictionary without modifying originals
data3 = {}
data3.update(data) # Modifies data3, not data
data3.update(data2) # Modifies data3, not data2
Python 3.5+:
This uses a new feature called dictionary unpacking.
data = {**data1, **data2, **data3}
Python 3.9+:
The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
Deleting items in dictionary
del data[key] # Removes specific element in a dictionary
data.pop(key) # Removes the key & returns the value
data.clear() # Clears entire dictionary
Check if a key is already in dictionary
key in data
Iterate through pairs in a dictionary
for key in data: ...                # Iterates just through the keys, ignoring the values
for key, value in data.items(): ... # Iterates through the pairs
for key in data.keys(): ...         # Iterates just through the keys, ignoring the values
for value in data.values(): ...     # Iterates just through the values, ignoring the keys
Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
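One detail the merge recipes above leave implicit is which value wins when both dictionaries share a key. The following sketch, with made-up names defaults and overrides, suggests that the right-hand side takes precedence and the originals stay untouched.

```python
defaults = {'host': 'localhost', 'port': 8000}
overrides = {'port': 9000, 'debug': True}

# Dictionary unpacking (Python 3.5+): later entries win on duplicate keys.
merged = {**defaults, **overrides}

# The copy-then-update form gives the same result.
copy_then_update = dict(defaults)
copy_then_update.update(overrides)

print(merged)    # {'host': 'localhost', 'port': 9000, 'debug': True}
print(defaults)  # {'host': 'localhost', 'port': 8000}
```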

To add multiple keys simultaneously, use dict.update():
>>> x = {1:2}
>>> print(x)
{1: 2}
>>> d = {3:4, 5:6, 7:8}
>>> x.update(d)
>>> print(x)
{1: 2, 3: 4, 5: 6, 7: 8}
For adding a single key, the accepted answer has less computational overhead.

"Is it possible to add a key to a Python dictionary after it has been created? It doesn't seem to have an .add() method."
Yes it is possible, and it does have a method that implements this, but you don't want to use it directly.
To demonstrate how and how not to use it, let's create an empty dict with the dict literal, {}:
my_dict = {}
Best Practice 1: Subscript notation
To update this dict with a single new key and value, you can use the subscript notation (see Mappings in the Python data model documentation) that provides for item assignment:
my_dict['new key'] = 'new value'
my_dict is now:
{'new key': 'new value'}
Best Practice 2: The update method - 2 ways
We can also update the dict with multiple values efficiently using the update method. We may be unnecessarily creating an extra dict here, so we hope our dict has already been created and came from or was used for another purpose:
my_dict.update({'key 2': 'value 2', 'key 3': 'value 3'})
my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value'}
Another efficient way of doing this with the update method is with keyword arguments. Since they have to be legitimate Python identifiers, you can't have spaces or special symbols or start the name with a number. Many consider this a more readable way to create keys for a dict, and here we certainly avoid creating an extra unnecessary dict:
my_dict.update(foo='bar', foo2='baz')
and my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value',
'foo': 'bar', 'foo2': 'baz'}
So now we have covered three Pythonic ways of updating a dict.
Magic method, __setitem__, and why it should be avoided
There's another way of updating a dict that you shouldn't use, which uses the __setitem__ method. Here's an example of how one might use the __setitem__ method to add a key-value pair to a dict, and a demonstration of the poor performance of using it (the session uses xrange, so it was run on Python 2; on Python 3, use range):
>>> d = {}
>>> d.__setitem__('foo', 'bar')
>>> d
{'foo': 'bar'}
>>> def f():
...     d = {}
...     for i in xrange(100):
...         d['foo'] = i
...
>>> def g():
...     d = {}
...     for i in xrange(100):
...         d.__setitem__('foo', i)
...
>>> import timeit
>>> number = 100
>>> min(timeit.repeat(f, number=number))
0.0020880699157714844
>>> min(timeit.repeat(g, number=number))
0.005071878433227539
So we see that using the subscript notation is actually much faster than using __setitem__. Doing the Pythonic thing, that is, using the language in the way it was intended to be used, usually is both more readable and computationally efficient.

dictionary[key] = value

If you want to add a dictionary within a dictionary you can do it this way.
Example: Add a new entry to your dictionary & sub dictionary
dictionary = {}
dictionary["new key"] = "some new entry" # add new dictionary entry
dictionary["dictionary_within_a_dictionary"] = {} # this is required by python
dictionary["dictionary_within_a_dictionary"]["sub_dict"] = {"other" : "dictionary"}
print (dictionary)
Output:
{'new key': 'some new entry', 'dictionary_within_a_dictionary': {'sub_dict': {'other': 'dictionary'}}}
NOTE: Python requires that you first add the sub-dictionary with
dictionary["dictionary_within_a_dictionary"] = {}
before adding entries to it.

The conventional syntax is d[key] = value, but if your keyboard is missing the square bracket keys you could also do:
d.__setitem__(key, value)
In fact, defining __getitem__ and __setitem__ methods is how you can make your own class support the square bracket syntax. See Dive Into Python, Classes That Act Like Dictionaries.

You can create one:
class myDict(dict):
    # No __init__ is needed: assigning to self inside __init__ would only
    # rebind a local name, and dict already initialises itself.
    def add(self, key, value):
        self[key] = value
## example
myd = myDict()
myd.add('apples',6)
myd.add('bananas',3)
print(myd)
Gives:
{'apples': 6, 'bananas': 3}

This popular question addresses functional methods of merging dictionaries a and b.
Here are some of the more straightforward methods (tested in Python 3)...
c = dict( a, **b ) ## see also https://stackoverflow.com/q/2255878
c = dict( list(a.items()) + list(b.items()) )
c = dict( i for d in [a,b] for i in d.items() )
Note: The first method above only works if the keys in b are strings.
To add or modify a single element, the b dictionary would contain only that one element...
c = dict( a, **{'d':'dog'} ) ## returns a dictionary based on 'a'
This is equivalent to...
def functional_dict_add( dictionary, key, value ):
    temp = dictionary.copy()
    temp[key] = value
    return temp
c = functional_dict_add( a, 'd', 'dog' )
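The note about string keys can be checked with a short sketch. It assumes CPython 3, where passing a non-string key through ** raises TypeError; the variable keyword_merge_ok is only for illustration.

```python
a = {'x': 1}
b = {2: 'two'}  # a non-string key

# dict(a, **b) passes b's keys as keyword arguments, which must be strings.
try:
    dict(a, **b)
    keyword_merge_ok = True
except TypeError:
    keyword_merge_ok = False

# The item-based method has no such restriction.
merged = dict(list(a.items()) + list(b.items()))

print(keyword_merge_ok)  # False
print(merged)            # {'x': 1, 2: 'two'}
```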

Let's pretend you want to live in the immutable world and do not want to modify the original but want to create a new dict that is the result of adding a new key to the original.
In Python 3.5+ you can do:
params = {'a': 1, 'b': 2}
new_params = {**params, **{'c': 3}}
The Python 2 equivalent is:
params = {'a': 1, 'b': 2}
new_params = dict(params, **{'c': 3})
After either of these:
params is still equal to {'a': 1, 'b': 2}
and
new_params is equal to {'a': 1, 'b': 2, 'c': 3}
There will be times when you don't want to modify the original (you only want the result of adding to the original). I find this a refreshing alternative to the following:
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params['c'] = 3
or
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params.update({'c': 3})
Reference: What does `**` mean in the expression `dict(d1, **d2)`?

There is also the strangely named, oddly behaved, and yet still handy dict.setdefault().
This
value = my_dict.setdefault(key, default)
basically just does this:
try:
    value = my_dict[key]
except KeyError: # key not found
    value = my_dict[key] = default
E.g.,
>>> mydict = {'a':1, 'b':2, 'c':3}
>>> mydict.setdefault('d', 4)
4 # returns new value at mydict['d']
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # a new key/value pair was indeed added
# but see what happens when trying it on an existing key...
>>> mydict.setdefault('a', 111)
1 # old value was returned
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # existing key was ignored
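A common place where setdefault earns its keep is grouping values under a key. This is a small sketch with a made-up word list; by_letter is a hypothetical name.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_letter = {}
for word in words:
    # setdefault returns the existing list, or stores and returns a new one.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

Note that the default argument (here a new list) is built on every call, even when the key already exists.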

This question has already been answered ad nauseam, but since my comment gained a lot of traction, here it is as an answer:
Adding new keys without updating the existing dict
If you are here trying to figure out how to add a key and return a new dictionary (without modifying the existing one), you can do this using the techniques below.
Python >= 3.5
new_dict = {**mydict, 'new_key': new_val}
Python < 3.5
new_dict = dict(mydict, new_key=new_val)
Note that with the dict(mydict, new_key=new_val) approach, your key must be a valid Python identifier.

If you're not joining two dictionaries, but adding new key-value pairs to a dictionary, then using the subscript notation seems like the best way.
import timeit
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary.update({"aaa": 123123, "asd": 233})')
>> 0.49582505226135254
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary["aaa"] = 123123; dictionary["asd"] = 233;')
>> 0.20782899856567383
However, if you'd like to add, for example, thousands of new key-value pairs, you should consider using the update() method.

Here's another way that I didn't see here:
>>> foo = dict(a=1,b=2)
>>> foo
{'a': 1, 'b': 2}
>>> goo = dict(c=3,**foo)
>>> goo
{'c': 3, 'a': 1, 'b': 2}
You can use the dictionary constructor and implicit expansion to reconstruct a dictionary. Moreover, interestingly, this method can be used to control the positional order during dictionary construction (post Python 3.6). In fact, insertion order is guaranteed for Python 3.7 and above!
>>> foo = dict(a=1,b=2,c=3,d=4)
>>> new_dict = {k: v for k, v in list(foo.items())[:2]}
>>> new_dict
{'a': 1, 'b': 2}
>>> new_dict.update(newvalue=99)
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99}
>>> new_dict.update({k: v for k, v in list(foo.items())[2:]})
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99, 'c': 3, 'd': 4}
The above uses a dictionary comprehension.

First, check whether the key already exists:
>>> a = {1: 2, 3: 4}
>>> a.get(1)
2
>>> print(a.get(5))
None
Then you can add the new key and value.
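One caveat with checking via get(): it cannot tell a missing key apart from a key whose value is None. A sketch of the difference, using the in operator for the membership test:

```python
a = {1: 2, 3: None}

print(a.get(3) is None)  # True, although 3 is a key
print(3 in a)            # True
print(5 in a)            # False

# Add the key only if it is really absent.
if 5 not in a:
    a[5] = 6

print(a)  # {1: 2, 3: None, 5: 6}
```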

Add a dictionary (key, value) class.
class myDict(dict):
    def add(self, key, value):
        # self[key] = value  # would add the key, overwriting any existing value
        if key in self:
            print('key', key, 'already used')  # report if the key is already used
        self.setdefault(key, value)  # if the key exists, do nothing
## example
myd = myDict()
name = "fred"
myd.add('apples',6)
print('\n', myd)
myd.add('bananas',3)
print('\n', myd)
myd.add('jack', 7)
print('\n', myd)
myd.add(name, myd)
print('\n', myd)
myd.add('apples', 23)
print('\n', myd)
myd.add(name, 2)
print(myd)

I think it would also be useful to point out Python's collections module that consists of many useful dictionary subclasses and wrappers that simplify the addition and modification of data types in a dictionary, specifically defaultdict:
dict subclass that calls a factory function to supply missing values
This is particularly useful if you are working with dictionaries that always consist of the same data types or structures, for example a dictionary of lists.
>>> from collections import defaultdict
>>> example = defaultdict(int)
>>> example['key'] += 1
>>> example
defaultdict(<class 'int'>, {'key': 1})
If the key does not yet exist, defaultdict calls the factory (here int, which returns 0) and stores the result as the initial value for that key (often used inside loops). This operation therefore does two things: it adds a new key to a dictionary (as per question), and assigns the value if the key doesn't yet exist. With the standard dictionary, this would have raised an error as the += operation is trying to access a value that doesn't yet exist:
>>> example = dict()
>>> example['key'] += 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'key'
Without the use of defaultdict, the amount of code to add a new element would be much greater and perhaps looks something like:
# This type of code would often be inside a loop
if 'key' not in example:
example['key'] = 0 # add key and initial value to dict; could also be a list
example['key'] += 1 # this is implementing a counter
defaultdict can also be used with complex data types such as list and set:
>>> example = defaultdict(list)
>>> example['key'].append(1)
>>> example
defaultdict(<class 'list'>, {'key': [1]})
Adding an element automatically initialises the list.
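One side effect worth knowing: merely reading a missing key with square brackets inserts it, while in and get() do not. A short sketch with defaultdict(int) as a character counter:

```python
from collections import defaultdict

counts = defaultdict(int)
for ch in 'banana':
    counts[ch] += 1          # missing characters start from int() == 0

print(dict(counts))          # {'b': 1, 'a': 3, 'n': 2}

print('z' in counts)         # False
print(counts.get('z'))       # None: get() does not call the factory
print(counts['z'])           # 0: the lookup inserts 'z'
print('z' in counts)         # True
```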

Adding keys to dictionary without using add
# Inserting/Updating single value
# subscript notation method
d['mynewkey'] = 'mynewvalue' # Updates if 'mynewkey' exists, else adds 'mynewkey'
# OR
d.update({'mynewkey': 'mynewvalue'})
# OR
d.update(dict(mynewkey='mynewvalue'))
# OR
d.update(mynewkey='mynewvalue')
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
# To add/update multiple keys simultaneously, use d.update():
x = {3:4, 5:6, 7:8}
d.update(x)
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue', 3: 4, 5: 6, 7: 8}
# Python 3.9+: the update operator |= works for dictionaries:
d |= {'c':3,'d':4}
# Python 3.5+: building a new dictionary using dictionary unpacking.
data1 = {4:6, 9:10, 17:20}
data2 = {20:30, 32:48, 90:100}
data3 = { 38:"value", 99:"notvalid"}
d = {**data1, **data2, **data3}
# Python 3.9+: the merge operator | works for dictionaries:
data = data1 | {'c':3,'d':4}
# Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))

dico["new key"] = "value"

Multi-level defaultdict with variable depth?

I have a large list like:
[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4
I want to build a multi-level dict like:
A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4
I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=2, but this works only if I hardcode those insert statements.
While parsing the list, I don't know beforehand how many []'s I need in order to call table[key1][key2][...].

You can do it without even defining a class:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()
nest[0][1][2][3][4][5] = 6
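For the variable-depth part of the question, the chain of subscripts can be replaced by a loop over a key path. This is a sketch; set_path and the rows list are hypothetical names, and the data mirrors the example from the question.

```python
from collections import defaultdict

def nested_dict():
    return defaultdict(nested_dict)

def set_path(table, keys, value):
    # Walk down to the parent of the last key, creating levels as needed,
    # then assign the value at the final key.
    node = table
    for key in keys[:-1]:
        node = node[key]
    node[keys[-1]] = value

rows = [
    (('A', 'B1', 'C1'), 1),
    (('A', 'B1', 'C2'), 2),
    (('A', 'B2'), 3),
    (('D', 'E', 'F', 'G'), 4),
]

table = nested_dict()
for keys, value in rows:
    set_path(table, keys, value)

print(table['A']['B1']['C2'])      # 2
print(table['D']['E']['F']['G'])   # 4
```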

Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:
from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value
root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)
Outputs:
1
3
None
You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.

I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.
# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)
Usage:
# Create instance
mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0
# Print
import prettyprint
prettyprint.pp(mydict)
Output:
{
    "a": 1,
    "b": {
        "a": 2
    },
    "c": {},
    "d": {
        "a": {
            "b": 0
        }
    }
}

I'd do it with a subclass of dict that defines __missing__:
>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}
defaultdict expects its factory function at initialization time, so a bare defaultdict cannot describe "more of the same defaultdict" without a named factory that refers to itself. The construct above does the same thing that defaultdict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.

This is equivalent to the above, but avoids lambda notation. Perhaps easier to read?
def dict_factory():
    return defaultdict(dict_factory)
your_dict = dict_factory()
Also, from the comments: if you'd like to add values from an existing dict, you can simply call
your_dict[0][1][2].update({"some_key":"some_value"})

Dan O'Huiginn posted a very nice solution on his journal in 2010:
http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html
>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}

You may achieve this with a recursive defaultdict.
from collections import defaultdict
def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()
It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid using a one-liner lambda version, which is bugged due to Python's late binding closures, and implement this with a def instead.
The accepted answer, using a lambda, has a flaw where instances must rely on the nested_dict name existing in an outer scope. If for whatever reason the factory name can not be resolved (e.g. it was rebound or deleted) then pre-existing instances will also become subtly broken:
>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
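For comparison, the same experiment with the closure-based version should keep working after the outer name is deleted, because existing instances only refer to the inner the_tree function. A sketch:

```python
from collections import defaultdict

def tree():
    def the_tree():
        # the_tree is looked up in the enclosing function scope,
        # not in the module namespace.
        return defaultdict(the_tree)
    return the_tree()

nest = tree()
nest[0][1][2] = 3

del tree            # remove the outer name
nest[8][9] = 10     # existing instances are unaffected

print(nest[8][9])   # 10
```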

To add to @Hugo's answer, to have a max depth:
l=lambda x:defaultdict(lambda:l(x-1)) if x>0 else defaultdict(dict)
arr = l(2)

A slightly different possibility that allows regular dictionary initialization:
from collections import defaultdict
def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)
Example:
>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
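Note that superdict only copies the top level of the initial dictionary, so nested plain dicts inside it stay plain. If the nested levels should auto-extend as well, one possible sketch is a recursive conversion; deep_superdict is a hypothetical helper, not part of the answer above.

```python
from collections import defaultdict

def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)

def deep_superdict(mapping):
    # Hypothetical helper: convert nested plain dicts recursively.
    result = superdict()
    for key, value in mapping.items():
        result[key] = deep_superdict(value) if isinstance(value, dict) else value
    return result

shallow = superdict({"a": {"x": 1}})
try:
    shallow["a"]["y"]["z"] = 2      # shallow["a"] is still a plain dict
    shallow_extends = True
except KeyError:
    shallow_extends = False

deep = deep_superdict({"a": {"x": 1}})
deep["a"]["y"]["z"] = 2             # nested level auto-extends

print(shallow_extends)              # False
print(deep["a"]["y"]["z"])          # 2
```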

You could use a NestedDict.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6
The result as a dictionary:
>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}
To install ndicts
pip install ndicts

How can I add new keys to a dictionary?

How do I add a key to an existing dictionary? It doesn't have an .add() method.

You create a new key/value pair on a dictionary by assigning a value to that key
d = {'key': 'value'}
print(d) # {'key': 'value'}
d['mynewkey'] = 'mynewvalue'
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
If the key doesn't exist, it's added and points to that value. If it exists, the current value it points to is overwritten.

I feel like consolidating info about Python dictionaries:
Creating an empty dictionary
data = {}
# OR
data = dict()
Creating a dictionary with initial values
data = {'a': 1, 'b': 2, 'c': 3}
# OR
data = dict(a=1, b=2, c=3)
# OR
data = {k: v for k, v in (('a', 1), ('b',2), ('c',3))}
Inserting/Updating a single value
data['a'] = 1 # Updates if 'a' exists, else adds 'a'
# OR
data.update({'a': 1})
# OR
data.update(dict(a=1))
# OR
data.update(a=1)
Inserting/Updating multiple values
data.update({'c':3,'d':4}) # Updates 'c' and adds 'd'
Python 3.9+:
The update operator |= now works for dictionaries:
data |= {'c':3,'d':4}
Creating a merged dictionary without modifying originals
data3 = {}
data3.update(data) # Modifies data3, not data
data3.update(data2) # Modifies data3, not data2
Python 3.5+:
This uses a new feature called dictionary unpacking.
data = {**data1, **data2, **data3}
Python 3.9+:
The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
Deleting items in dictionary
del data[key] # Removes specific element in a dictionary
data.pop(key) # Removes the key & returns the value
data.clear() # Clears entire dictionary
Check if a key is already in dictionary
key in data
Iterate through pairs in a dictionary
for key in data: # Iterates just through the keys, ignoring the values
for key, value in d.items(): # Iterates through the pairs
for key in d.keys(): # Iterates just through key, ignoring the values
for value in d.values(): # Iterates just through value, ignoring the keys
Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))

To add multiple keys simultaneously, use dict.update():
>>> x = {1:2}
>>> print(x)
{1: 2}
>>> d = {3:4, 5:6, 7:8}
>>> x.update(d)
>>> print(x)
{1: 2, 3: 4, 5: 6, 7: 8}
For adding a single key, the accepted answer has less computational overhead.

"Is it possible to add a key to a Python dictionary after it has been created? It doesn't seem to have an .add() method."
Yes it is possible, and it does have a method that implements this, but you don't want to use it directly.
To demonstrate how and how not to use it, let's create an empty dict with the dict literal, {}:
my_dict = {}
Best Practice 1: Subscript notation
To update this dict with a single new key and value, you can use the subscript notation (see Mappings here) that provides for item assignment:
my_dict['new key'] = 'new value'
my_dict is now:
{'new key': 'new value'}
Best Practice 2: The update method - 2 ways
We can also update the dict with multiple values efficiently as well using the update method. We may be unnecessarily creating an extra dict here, so we hope our dict has already been created and came from or was used for another purpose:
my_dict.update({'key 2': 'value 2', 'key 3': 'value 3'})
my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value'}
Another efficient way of doing this with the update method is with keyword arguments, but since they have to be legitimate python words, you can't have spaces or special symbols or start the name with a number, but many consider this a more readable way to create keys for a dict, and here we certainly avoid creating an extra unnecessary dict:
my_dict.update(foo='bar', foo2='baz')
and my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value',
'foo': 'bar', 'foo2': 'baz'}
So now we have covered three Pythonic ways of updating a dict.
Magic method, __setitem__, and why it should be avoided
There's another way of updating a dict that you shouldn't use, which uses the __setitem__ method. Here's an example of how one might use the __setitem__ method to add a key-value pair to a dict, and a demonstration of the poor performance of using it:
>>> d = {}
>>> d.__setitem__('foo', 'bar')
>>> d
{'foo': 'bar'}
>>> def f():
... d = {}
... for i in xrange(100):
... d['foo'] = i
...
>>> def g():
... d = {}
... for i in xrange(100):
... d.__setitem__('foo', i)
...
>>> import timeit
>>> number = 100
>>> min(timeit.repeat(f, number=number))
0.0020880699157714844
>>> min(timeit.repeat(g, number=number))
0.005071878433227539
So we see that using the subscript notation is actually much faster than using __setitem__. Doing the Pythonic thing, that is, using the language in the way it was intended to be used, usually is both more readable and computationally efficient.

dictionary[key] = value

If you want to add a dictionary within a dictionary you can do it this way.
Example: Add a new entry to your dictionary & sub dictionary
dictionary = {}
dictionary["new key"] = "some new entry" # add new dictionary entry
dictionary["dictionary_within_a_dictionary"] = {} # this is required by python
dictionary["dictionary_within_a_dictionary"]["sub_dict"] = {"other" : "dictionary"}
print (dictionary)
Output:
{'new key': 'some new entry', 'dictionary_within_a_dictionary': {'sub_dict': {'other': 'dictionarly'}}}
NOTE: Python requires that you first add a sub
dictionary["dictionary_within_a_dictionary"] = {}
before adding entries.

The conventional syntax is d[key] = value, but if your keyboard is missing the square bracket keys you could also do:
d.__setitem__(key, value)
In fact, defining __getitem__ and __setitem__ methods is how you can make your own class support the square bracket syntax. See Dive Into Python, Classes That Act Like Dictionaries.

You can create one:
class myDict(dict):
    def __init__(self):
        super().__init__()  # initialise the underlying dict; rebinding self would have no effect
    def add(self, key, value):
        self[key] = value
## example
myd = myDict()
myd.add('apples',6)
myd.add('bananas',3)
print(myd)
Gives:
{'apples': 6, 'bananas': 3}

This popular question addresses functional methods of merging dictionaries a and b.
Here are some of the more straightforward methods (tested in Python 3)...
c = dict( a, **b ) ## see also https://stackoverflow.com/q/2255878
c = dict( list(a.items()) + list(b.items()) )
c = dict( i for d in [a,b] for i in d.items() )
Note: The first method above only works if the keys in b are strings.
To add or modify a single element, the b dictionary would contain only that one element...
c = dict( a, **{'d':'dog'} ) ## returns a dictionary based on 'a'
This is equivalent to...
def functional_dict_add( dictionary, key, value ):
    temp = dictionary.copy()
    temp[key] = value
    return temp
c = functional_dict_add( a, 'd', 'dog' )
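A quick sketch of the string-key restriction mentioned in the note, using made-up sample dicts. On Python 3, passing a non-string key through ** raises a TypeError, while the generator-based method accepts any hashable key.

```python
a = {'x': 1}
b = {'y': 2}

c = dict(a, **b)  # fine: every key in b is a string

# A non-string key cannot be passed as a keyword argument.
try:
    dict(a, **{1: 'one'})
    keyword_merge_failed = False
except TypeError:
    keyword_merge_failed = True

# The items()-based method has no such restriction.
merged = dict(i for d in [a, {1: 'one'}] for i in d.items())

print(c)       # {'x': 1, 'y': 2}
print(merged)  # {'x': 1, 1: 'one'}
print(a)       # {'x': 1}  (the original is untouched)
```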

Let's pretend you want to live in the immutable world and do not want to modify the original but want to create a new dict that is the result of adding a new key to the original.
In Python 3.5+ you can do:
params = {'a': 1, 'b': 2}
new_params = {**params, **{'c': 3}}
The Python 2 equivalent is:
params = {'a': 1, 'b': 2}
new_params = dict(params, **{'c': 3})
After either of these:
params is still equal to {'a': 1, 'b': 2}
and
new_params is equal to {'a': 1, 'b': 2, 'c': 3}
There will be times when you don't want to modify the original (you only want the result of adding to the original). I find this a refreshing alternative to the following:
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params['c'] = 3
or
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params.update({'c': 3})
Reference: What does `**` mean in the expression `dict(d1, **d2)`?

There is also the strangely named, oddly behaved, and yet still handy dict.setdefault().
This
value = my_dict.setdefault(key, default)
basically just does this:
try:
    value = my_dict[key]
except KeyError: # key not found
    value = my_dict[key] = default
E.g.,
>>> mydict = {'a':1, 'b':2, 'c':3}
>>> mydict.setdefault('d', 4)
4 # returns new value at mydict['d']
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # a new key/value pair was indeed added
# but see what happens when trying it on an existing key...
>>> mydict.setdefault('a', 111)
1 # old value was returned
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # existing key was ignored
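Where setdefault() tends to earn its keep is grouping, because it returns the stored value either way. A small sketch with made-up data:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = {}
for word in words:
    # The first time a letter is seen, setdefault stores the new list and
    # returns it; afterwards it returns the list that is already there.
    by_letter.setdefault(word[0], []).append(word)

print(by_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

Note that the default argument (here the empty list) is built on every call, even when the key already exists and the default is thrown away.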

This question has already been answered ad nauseam, but since my comment gained a lot of traction, here it is as an answer:
Adding new keys without updating the existing dict
If you are here trying to figure out how to add a key and return a new dictionary (without modifying the existing one), you can do this using the techniques below
Python >= 3.5
new_dict = {**mydict, 'new_key': new_val}
Python < 3.5
new_dict = dict(mydict, new_key=new_val)
Note that with this approach, your key will need to follow the rules of valid identifier names in Python.
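One consequence of the keyword form that is easy to trip over: the name on the left of = is taken literally, so a key held in a variable is not looked up. A sketch, where the variable name key and the sample values are mine:

```python
mydict = {'a': 1}
key = 'color'  # hypothetical key held in a variable

wrong = dict(mydict, key='red')            # adds the literal key 'key'
right = {**mydict, key: 'red'}             # uses the value of the variable
also_right = dict(mydict, **{key: 'red'})  # works because the key is a string

print(wrong)   # {'a': 1, 'key': 'red'}
print(right)   # {'a': 1, 'color': 'red'}
print(mydict)  # {'a': 1}  (unchanged in every case)
```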

If you're not joining two dictionaries, but adding new key-value pairs to a dictionary, then using the subscript notation seems like the best way.
import timeit
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary.update({"aaa": 123123, "asd": 233})')
>> 0.49582505226135254
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary["aaa"] = 123123; dictionary["asd"] = 233;')
>> 0.20782899856567383
However, if you'd like to add, for example, thousands of new key-value pairs, you should consider using the update() method.

Here's another way that I didn't see here:
>>> foo = dict(a=1,b=2)
>>> foo
{'a': 1, 'b': 2}
>>> goo = dict(c=3,**foo)
>>> goo
{'c': 3, 'a': 1, 'b': 2}
You can use the dictionary constructor and implicit expansion to reconstruct a dictionary. Moreover, interestingly, this method can be used to control the positional order during dictionary construction (post Python 3.6). In fact, insertion order is guaranteed for Python 3.7 and above!
>>> foo = dict(a=1,b=2,c=3,d=4)
>>> new_dict = {k: v for k, v in list(foo.items())[:2]}
>>> new_dict
{'a': 1, 'b': 2}
>>> new_dict.update(newvalue=99)
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99}
>>> new_dict.update({k: v for k, v in list(foo.items())[2:]})
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99, 'c': 3, 'd': 4}
>>>
The above is using dictionary comprehension.

First, check whether the key already exists (get returns None for a missing key):
a={1:2,3:4}
a.get(1)
2
a.get(5)
None
Then you can add the new key and value.
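One caveat with get(): it cannot tell a missing key from a key whose stored value is None. A sketch of the same check with the in operator, using a hypothetical helper name add_if_absent:

```python
def add_if_absent(d, key, value):
    """Hypothetical helper: add key only if it is not already present."""
    if key in d:
        return False
    d[key] = value
    return True


a = {1: 2, 3: None}
added_new = add_if_absent(a, 5, 6)       # True: key 5 was missing
added_existing = add_if_absent(a, 3, 4)  # False: key 3 exists, although a.get(3) is None
print(a)  # {1: 2, 3: None, 5: 6}
```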

Add a dictionary (key,value) class.
class myDict(dict):
    def __init__(self):
        super().__init__()  # initialise the underlying dict
    def add(self, key, value):
        #self[key] = value # add new key and value, overwriting any existing same key
        if key in self:
            print('key', key, 'already used') # report if key already used
        self.setdefault(key, value) # if key exists, do nothing
## example
myd = myDict()
name = "fred"
myd.add('apples',6)
print('\n', myd)
myd.add('bananas',3)
print('\n', myd)
myd.add('jack', 7)
print('\n', myd)
myd.add(name, myd)
print('\n', myd)
myd.add('apples', 23)
print('\n', myd)
myd.add(name, 2)
print(myd)

I think it would also be useful to point out Python's collections module that consists of many useful dictionary subclasses and wrappers that simplify the addition and modification of data types in a dictionary, specifically defaultdict:
dict subclass that calls a factory function to supply missing values
This is particularly useful if you are working with dictionaries that always consist of the same data types or structures, for example a dictionary of lists.
>>> from collections import defaultdict
>>> example = defaultdict(int)
>>> example['key'] += 1
>>> example
defaultdict(<class 'int'>, {'key': 1})
If the key does not yet exist, defaultdict calls the factory it was given (in our case int, which returns 0) and stores the result as the initial value in the dictionary (often used inside loops). This operation therefore does two things: it adds a new key to a dictionary (as per question), and assigns the value if the key doesn't yet exist. With the standard dictionary, this would have raised an error as the += operation is trying to access a value that doesn't yet exist:
>>> example = dict()
>>> example['key'] += 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'key'
Without the use of defaultdict, the amount of code to add a new element would be much greater and perhaps looks something like:
# This type of code would often be inside a loop
if 'key' not in example:
    example['key'] = 0 # add key and initial value to dict; could also be a list
example['key'] += 1 # this is implementing a counter
defaultdict can also be used with complex data types such as list and set:
>>> example = defaultdict(list)
>>> example['key'].append(1)
>>> example
defaultdict(<class 'list'>, {'key': [1]})
Adding an element automatically initialises the list.
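The same idea extends to arbitrary depth if the factory returns another defaultdict. Here is a sketch that reuses the names datainit and data from the question at the top of the page: the initial keys are set once on the top level, and only missing keys go through the factory.

```python
from collections import defaultdict

def nested_dict():
    # Each missing key produces another nested_dict, to any depth.
    return defaultdict(nested_dict)

datainit = {'number': 1}

data = nested_dict()
data.update(datainit)    # initial keys live on the top level only
data['A']['B']['C'] = 3  # intermediate levels are created on demand

print(data['number'])         # 1
print(data['A']['B']['C'])    # 3
print('number' in data['A'])  # False
```

Keep in mind that merely reading a missing key (for example data['typo']) also creates it, which is the usual trade-off with defaultdict.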

Adding keys to dictionary without using add
d = {'key': 'value'} # starting dictionary for the examples below
# Inserting/Updating single value
# subscript notation method
d['mynewkey'] = 'mynewvalue' # Updates if 'mynewkey' exists, else adds 'mynewkey'
# OR
d.update({'mynewkey': 'mynewvalue'})
# OR
d.update(dict(mynewkey='mynewvalue'))
# OR
d.update(mynewkey='mynewvalue')
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
# To add/update multiple keys simultaneously, use d.update():
x = {3:4, 5:6, 7:8}
d.update(x)
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue', 3: 4, 5: 6, 7: 8}
# The update operator |= works for dictionaries (Python 3.9+):
d |= {'c':3,'d':4}
# Assigning new key value pair using dictionary unpacking.
data1 = {4:6, 9:10, 17:20}
data2 = {20:30, 32:48, 90:100}
data3 = { 38:"value", 99:"notvalid"}
d = {**data1, **data2, **data3}
# The merge operator | works for dictionaries (Python 3.9+):
data = data1 | {'c':3,'d':4}
# Create a dictionary from two lists (sample lists; any two equal-length sequences work)
list_with_keys = ['a', 'b', 'c']
list_with_values = [1, 2, 3]
data = dict(zip(list_with_keys, list_with_values))

dico["new key"] = "value"
