I have a dictionary where key is string and value is list.
Now while adding a value associated with given key, I always have to check if there is any list yet, otherwise I have to initialize as empty list somewhat like following snippet:
if not k in myDict:
myDict[k] = []
myDict[k].append(v)
I am wondering if there is any way to combine these steps into single one in python 3.7.
There are at least three ways:
Use dict.setdefault
>>> data = {}
>>> data.setdefault('foo', []).append(42)
>>> data
{'foo': [42]}
Use defaultdict, which unlike .setdefault, takes a callable:
>>> from collections import defaultdict
>>> data = defaultdict(list)
>>> data
defaultdict(<class 'list'>, {})
>>> data['foo'].append(42)
>>> data
defaultdict(<class 'list'>, {'foo': [42]})
Finally, subclass dict and implement __missing__:
>>> class MyDict(dict):
... def __missing__(self, key):
... self[key] = value = []
... return value
...
>>> data = MyDict()
>>> data['foo'].append(42)
>>> data
{'foo': [42]}
Note, you can think of the last one as the most flexible, you have access to the actual key that's missing when you deal with it. defaultdict is a class factory, and it generates a subclass of dict as well. But, the callable is not passed any arguments, nevertheless, it is sufficient for most needs.
Further, note that the defaultdict and __missing__ approaches will keep the default behavior, this may be undesirable after you create your data structure, you probably want a KeyError usually, or at least, you don't want mydict[key] to add a key anymore.
In both cases, you can just create a regular dict from the dict subclasses, e.g. dict(data). This should generally be very fast, even for large dict objects, especially if it is a one-time cost. For defaultdict, you can also set the default_factory to None and the old behavior returns:
>>> data = defaultdict(list)
>>> data
defaultdict(<class 'list'>, {})
>>> data['foo']
[]
>>> data['bar']
[]
>>> data
defaultdict(<class 'list'>, {'foo': [], 'bar': []})
>>> data.default_factory = None
>>> data['baz']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'baz'
>>>
You can use dict.setdefault:
>>> d = {}
>>> d.setdefault('k', []).append(1)
>>> d
{'k': [1]}
>>> d.setdefault('k', []).append(2)
>>> d
{'k': [1, 2]}
Help on method_descriptor in dict:
dict.setdefault = setdefault(...)
D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D
How do I add a key to an existing dictionary? It doesn't have an .add() method.
You create a new key/value pair on a dictionary by assigning a value to that key
d = {'key': 'value'}
print(d) # {'key': 'value'}
d['mynewkey'] = 'mynewvalue'
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
If the key doesn't exist, it's added and points to that value. If it exists, the current value it points to is overwritten.
I feel like consolidating info about Python dictionaries:
Creating an empty dictionary
data = {}
# OR
data = dict()
Creating a dictionary with initial values
data = {'a': 1, 'b': 2, 'c': 3}
# OR
data = dict(a=1, b=2, c=3)
# OR
data = {k: v for k, v in (('a', 1), ('b',2), ('c',3))}
Inserting/Updating a single value
data['a'] = 1 # Updates if 'a' exists, else adds 'a'
# OR
data.update({'a': 1})
# OR
data.update(dict(a=1))
# OR
data.update(a=1)
Inserting/Updating multiple values
data.update({'c':3,'d':4}) # Updates 'c' and adds 'd'
Python 3.9+:
The update operator |= now works for dictionaries:
data |= {'c':3,'d':4}
Creating a merged dictionary without modifying originals
data3 = {}
data3.update(data) # Modifies data3, not data
data3.update(data2) # Modifies data3, not data2
Python 3.5+:
This uses a new feature called dictionary unpacking.
data = {**data1, **data2, **data3}
Python 3.9+:
The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
Deleting items in dictionary
del data[key] # Removes specific element in a dictionary
data.pop(key) # Removes the key & returns the value
data.clear() # Clears entire dictionary
Check if a key is already in dictionary
key in data
Iterate through pairs in a dictionary
for key in data: # Iterates just through the keys, ignoring the values
for key, value in d.items(): # Iterates through the pairs
for key in d.keys(): # Iterates just through key, ignoring the values
for value in d.values(): # Iterates just through value, ignoring the keys
Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
To add multiple keys simultaneously, use dict.update():
>>> x = {1:2}
>>> print(x)
{1: 2}
>>> d = {3:4, 5:6, 7:8}
>>> x.update(d)
>>> print(x)
{1: 2, 3: 4, 5: 6, 7: 8}
For adding a single key, the accepted answer has less computational overhead.
"Is it possible to add a key to a Python dictionary after it has been created? It doesn't seem to have an .add() method."
Yes it is possible, and it does have a method that implements this, but you don't want to use it directly.
To demonstrate how and how not to use it, let's create an empty dict with the dict literal, {}:
my_dict = {}
Best Practice 1: Subscript notation
To update this dict with a single new key and value, you can use the subscript notation (see Mappings here) that provides for item assignment:
my_dict['new key'] = 'new value'
my_dict is now:
{'new key': 'new value'}
Best Practice 2: The update method - 2 ways
We can also update the dict with multiple values efficiently as well using the update method. We may be unnecessarily creating an extra dict here, so we hope our dict has already been created and came from or was used for another purpose:
my_dict.update({'key 2': 'value 2', 'key 3': 'value 3'})
my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value'}
Another efficient way of doing this with the update method is with keyword arguments, but since they have to be legitimate python words, you can't have spaces or special symbols or start the name with a number, but many consider this a more readable way to create keys for a dict, and here we certainly avoid creating an extra unnecessary dict:
my_dict.update(foo='bar', foo2='baz')
and my_dict is now:
{'key 2': 'value 2', 'key 3': 'value 3', 'new key': 'new value',
'foo': 'bar', 'foo2': 'baz'}
So now we have covered three Pythonic ways of updating a dict.
Magic method, __setitem__, and why it should be avoided
There's another way of updating a dict that you shouldn't use, which uses the __setitem__ method. Here's an example of how one might use the __setitem__ method to add a key-value pair to a dict, and a demonstration of the poor performance of using it:
>>> d = {}
>>> d.__setitem__('foo', 'bar')
>>> d
{'foo': 'bar'}
>>> def f():
... d = {}
... for i in xrange(100):
... d['foo'] = i
...
>>> def g():
... d = {}
... for i in xrange(100):
... d.__setitem__('foo', i)
...
>>> import timeit
>>> number = 100
>>> min(timeit.repeat(f, number=number))
0.0020880699157714844
>>> min(timeit.repeat(g, number=number))
0.005071878433227539
So we see that using the subscript notation is actually much faster than using __setitem__. Doing the Pythonic thing, that is, using the language in the way it was intended to be used, usually is both more readable and computationally efficient.
dictionary[key] = value
If you want to add a dictionary within a dictionary you can do it this way.
Example: Add a new entry to your dictionary & sub dictionary
dictionary = {}
dictionary["new key"] = "some new entry" # add new dictionary entry
dictionary["dictionary_within_a_dictionary"] = {} # this is required by python
dictionary["dictionary_within_a_dictionary"]["sub_dict"] = {"other" : "dictionary"}
print (dictionary)
Output:
{'new key': 'some new entry', 'dictionary_within_a_dictionary': {'sub_dict': {'other': 'dictionarly'}}}
NOTE: Python requires that you first add a sub
dictionary["dictionary_within_a_dictionary"] = {}
before adding entries.
The conventional syntax is d[key] = value, but if your keyboard is missing the square bracket keys you could also do:
d.__setitem__(key, value)
In fact, defining __getitem__ and __setitem__ methods is how you can make your own class support the square bracket syntax. See Dive Into Python, Classes That Act Like Dictionaries.
You can create one:
class myDict(dict):
def __init__(self):
self = dict()
def add(self, key, value):
self[key] = value
## example
myd = myDict()
myd.add('apples',6)
myd.add('bananas',3)
print(myd)
Gives:
>>>
{'apples': 6, 'bananas': 3}
This popular question addresses functional methods of merging dictionaries a and b.
Here are some of the more straightforward methods (tested in Python 3)...
c = dict( a, **b ) ## see also https://stackoverflow.com/q/2255878
c = dict( list(a.items()) + list(b.items()) )
c = dict( i for d in [a,b] for i in d.items() )
Note: The first method above only works if the keys in b are strings.
To add or modify a single element, the b dictionary would contain only that one element...
c = dict( a, **{'d':'dog'} ) ## returns a dictionary based on 'a'
This is equivalent to...
def functional_dict_add( dictionary, key, value ):
temp = dictionary.copy()
temp[key] = value
return temp
c = functional_dict_add( a, 'd', 'dog' )
Let's pretend you want to live in the immutable world and do not want to modify the original but want to create a new dict that is the result of adding a new key to the original.
In Python 3.5+ you can do:
params = {'a': 1, 'b': 2}
new_params = {**params, **{'c': 3}}
The Python 2 equivalent is:
params = {'a': 1, 'b': 2}
new_params = dict(params, **{'c': 3})
After either of these:
params is still equal to {'a': 1, 'b': 2}
and
new_params is equal to {'a': 1, 'b': 2, 'c': 3}
There will be times when you don't want to modify the original (you only want the result of adding to the original). I find this a refreshing alternative to the following:
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params['c'] = 3
or
params = {'a': 1, 'b': 2}
new_params = params.copy()
new_params.update({'c': 3})
Reference: What does `**` mean in the expression `dict(d1, **d2)`?
There is also the strangely named, oddly behaved, and yet still handy dict.setdefault().
This
value = my_dict.setdefault(key, default)
basically just does this:
try:
value = my_dict[key]
except KeyError: # key not found
value = my_dict[key] = default
E.g.,
>>> mydict = {'a':1, 'b':2, 'c':3}
>>> mydict.setdefault('d', 4)
4 # returns new value at mydict['d']
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # a new key/value pair was indeed added
# but see what happens when trying it on an existing key...
>>> mydict.setdefault('a', 111)
1 # old value was returned
>>> print(mydict)
{'a':1, 'b':2, 'c':3, 'd':4} # existing key was ignored
This question has already been answered ad nauseam, but since my
comment
gained a lot of traction, here it is as an answer:
Adding new keys without updating the existing dict
If you are here trying to figure out how to add a key and return a new dictionary (without modifying the existing one), you can do this using the techniques below
Python >= 3.5
new_dict = {**mydict, 'new_key': new_val}
Python < 3.5
new_dict = dict(mydict, new_key=new_val)
Note that with this approach, your key will need to follow the rules of valid identifier names in Python.
If you're not joining two dictionaries, but adding new key-value pairs to a dictionary, then using the subscript notation seems like the best way.
import timeit
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary.update({"aaa": 123123, "asd": 233})')
>> 0.49582505226135254
timeit.timeit('dictionary = {"karga": 1, "darga": 2}; dictionary["aaa"] = 123123; dictionary["asd"] = 233;')
>> 0.20782899856567383
However, if you'd like to add, for example, thousands of new key-value pairs, you should consider using the update() method.
Here's another way that I didn't see here:
>>> foo = dict(a=1,b=2)
>>> foo
{'a': 1, 'b': 2}
>>> goo = dict(c=3,**foo)
>>> goo
{'c': 3, 'a': 1, 'b': 2}
You can use the dictionary constructor and implicit expansion to reconstruct a dictionary. Moreover, interestingly, this method can be used to control the positional order during dictionary construction (post Python 3.6). In fact, insertion order is guaranteed for Python 3.7 and above!
>>> foo = dict(a=1,b=2,c=3,d=4)
>>> new_dict = {k: v for k, v in list(foo.items())[:2]}
>>> new_dict
{'a': 1, 'b': 2}
>>> new_dict.update(newvalue=99)
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99}
>>> new_dict.update({k: v for k, v in list(foo.items())[2:]})
>>> new_dict
{'a': 1, 'b': 2, 'newvalue': 99, 'c': 3, 'd': 4}
>>>
The above is using dictionary comprehension.
First to check whether the key already exists:
a={1:2,3:4}
a.get(1)
2
a.get(5)
None
Then you can add the new key and value.
Add a dictionary (key,value) class.
class myDict(dict):
def __init__(self):
self = dict()
def add(self, key, value):
#self[key] = value # add new key and value overwriting any exiting same key
if self.get(key)!=None:
print('key', key, 'already used') # report if key already used
self.setdefault(key, value) # if key exit do nothing
## example
myd = myDict()
name = "fred"
myd.add('apples',6)
print('\n', myd)
myd.add('bananas',3)
print('\n', myd)
myd.add('jack', 7)
print('\n', myd)
myd.add(name, myd)
print('\n', myd)
myd.add('apples', 23)
print('\n', myd)
myd.add(name, 2)
print(myd)
I think it would also be useful to point out Python's collections module that consists of many useful dictionary subclasses and wrappers that simplify the addition and modification of data types in a dictionary, specifically defaultdict:
dict subclass that calls a factory function to supply missing values
This is particularly useful if you are working with dictionaries that always consist of the same data types or structures, for example a dictionary of lists.
>>> from collections import defaultdict
>>> example = defaultdict(int)
>>> example['key'] += 1
>>> example['key']
defaultdict(<class 'int'>, {'key': 1})
If the key does not yet exist, defaultdict assigns the value given (in our case 10) as the initial value to the dictionary (often used inside loops). This operation therefore does two things: it adds a new key to a dictionary (as per question), and assigns the value if the key doesn't yet exist. With the standard dictionary, this would have raised an error as the += operation is trying to access a value that doesn't yet exist:
>>> example = dict()
>>> example['key'] += 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'key'
Without the use of defaultdict, the amount of code to add a new element would be much greater and perhaps looks something like:
# This type of code would often be inside a loop
if 'key' not in example:
example['key'] = 0 # add key and initial value to dict; could also be a list
example['key'] += 1 # this is implementing a counter
defaultdict can also be used with complex data types such as list and set:
>>> example = defaultdict(list)
>>> example['key'].append(1)
>>> example
defaultdict(<class 'list'>, {'key': [1]})
Adding an element automatically initialises the list.
Adding keys to dictionary without using add
# Inserting/Updating single value
# subscript notation method
d['mynewkey'] = 'mynewvalue' # Updates if 'a' exists, else adds 'a'
# OR
d.update({'mynewkey': 'mynewvalue'})
# OR
d.update(dict('mynewkey'='mynewvalue'))
# OR
d.update('mynewkey'='mynewvalue')
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue'}
# To add/update multiple keys simultaneously, use d.update():
x = {3:4, 5:6, 7:8}
d.update(x)
print(d) # {'key': 'value', 'mynewkey': 'mynewvalue', 3: 4, 5: 6, 7: 8}
# update operator |= now works for dictionaries:
d |= {'c':3,'d':4}
# Assigning new key value pair using dictionary unpacking.
data1 = {4:6, 9:10, 17:20}
data2 = {20:30, 32:48, 90:100}
data3 = { 38:"value", 99:"notvalid"}
d = {**data1, **data2, **data3}
# The merge operator | now works for dictionaries:
data = data1 | {'c':3,'d':4}
# Create a dictionary from two lists
data = dict(zip(list_with_keys, list_with_values))
dico["new key"] = "value"
I know you can use setdefault(key, value) to set default value for a given key, but is there a way to set default values of all keys to some value after creating a dict ?
Put it another way, I want the dict to return the specified default value for every key I didn't yet set.
You can replace your old dictionary with a defaultdict:
>>> from collections import defaultdict
>>> d = {'foo': 123, 'bar': 456}
>>> d['baz']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'baz'
>>> d = defaultdict(lambda: -1, d)
>>> d['baz']
-1
The "trick" here is that a defaultdict can be initialized with another dict. This means
that you preserve the existing values in your normal dict:
>>> d['foo']
123
Use defaultdict
from collections import defaultdict
a = {}
a = defaultdict(lambda:0,a)
a["anything"] # => 0
This is very useful for case like this,where default values for every key is set as 0:
results ={ 'pre-access' : {'count': 4, 'pass_count': 2},'no-access' : {'count': 55, 'pass_count': 19}
for k,v in results.iteritems():
a['count'] += v['count']
a['pass_count'] += v['pass_count']
In case you actually mean what you seem to ask, I'll provide this alternative answer.
You say you want the dict to return a specified value, you do not say you want to set that value at the same time, like defaultdict does. This will do so:
class DictWithDefault(dict):
def __init__(self, default, **kwargs):
self.default = default
super(DictWithDefault, self).__init__(**kwargs)
def __getitem__(self, key):
if key in self:
return super(DictWithDefault, self).__getitem__(key)
return self.default
Use like this:
d = DictWIthDefault(99, x=5, y=3)
print d["x"] # 5
print d[42] # 99
42 in d # False
d[42] = 3
42 in d # True
Alternatively, you can use a standard dict like this:
d = {3: 9, 4: 2}
default = 99
print d.get(3, default) # 9
print d.get(42, default) # 99
defaultdict can do something like that for you.
Example:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d
defaultdict(<class 'list'>, {})
>>> d['new'].append(10)
>>> d
defaultdict(<class 'list'>, {'new': [10]})
Is this what you want:
>>> d={'a':1,'b':2,'c':3}
>>> default_val=99
>>> for k in d:
... d[k]=default_val
...
>>> d
{'a': 99, 'b': 99, 'c': 99}
>>>
>>> d={'a':1,'b':2,'c':3}
>>> from collections import defaultdict
>>> d=defaultdict(lambda:99,d)
>>> d
defaultdict(<function <lambda> at 0x03D21630>, {'a': 1, 'c': 3, 'b': 2})
>>> d[3]
99
Not after creating it, no. But you could use a defaultdict in the first place, which sets default values when you initialize it.
You can use the following class. Just change zero to any default value you like. The solution was tested in Python 2.7.
class cDefaultDict(dict):
# dictionary that returns zero for missing keys
# keys with zero values are not stored
def __missing__(self,key):
return 0
def __setitem__(self, key, value):
if value==0:
if key in self: # returns zero anyway, so no need to store it
del self[key]
else:
dict.__setitem__(self, key, value)
I have a large list like:
[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4
I want to build a multi-level dict like:
A
--B1
-----C1=1
-----C2=1
--B2=3
D
--E
----F
------G=4
I know that if I use recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=2, but this works only if I hardcode those insert statement.
While parsing the list, I don't how many []'s I need beforehand to call table[key1][key2][...].
You can do it without even defining a class:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()
nest[0][1][2][3][4][5] = 6
Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:
from collections import defaultdict
class Tree(defaultdict):
def __init__(self, value=None):
super(Tree, self).__init__(Tree)
self.value = value
root = Tree()
root.value = 1
root['a']['b'].value = 3
print root.value
print root['a']['b'].value
print root['c']['d']['f'].value
Outputs:
1
3
None
You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.
# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)
Usage:
# Create instance
mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0
# Print
import prettyprint
prettyprint.pp(mydict)
Output:
{
"a": 1,
"b": {
"a": 1
},
"c": {},
"d": {
"a": {
"b": 0
}
}
}
I'd do it with a subclass of dict that defines __missing__:
>>> class NestedDict(dict):
... def __missing__(self, key):
... self[key] = NestedDict()
... return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}
You can't do it directly with defaultdict because defaultdict expects the factory function at initialization time, but at initialization time, there's no way to describe the same defaultdict. The above construct does the same thing that default dict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.
This is equivalent to the above, but avoiding lambda notation. Perhaps easier to read ?
def dict_factory():
return defaultdict(dict_factory)
your_dict = dict_factory()
Also -- from the comments -- if you'd like to update from an existing dict, you can simply call
your_dict[0][1][2].update({"some_key":"some_value"})
In order to add values to the dict.
Dan O'Huiginn posted a very nice solution on his journal in 2010:
http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html
>>> class NestedDict(dict):
... def __getitem__(self, key):
... if key in self: return self.get(key)
... return self.setdefault(key, NestedDict())
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
You may achieve this with a recursive defaultdict.
from collections import defaultdict
def tree():
def the_tree():
return defaultdict(the_tree)
return the_tree()
It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid using a one-liner lambda version, which is bugged due to Python's late binding closures, and implement this with a def instead.
The accepted answer, using a lambda, has a flaw where instances must rely on the nested_dict name existing in an outer scope. If for whatever reason the factory name can not be resolved (e.g. it was rebound or deleted) then pre-existing instances will also become subtly broken:
>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
To add to #Hugo To have a max depth:
l=lambda x:defaultdict(lambda:l(x-1)) if x>0 else defaultdict(dict)
arr = l(2)
A slightly different possibility that allows regular dictionary initialization:
from collections import defaultdict
def superdict(arg=()):
update = lambda obj, arg: obj.update(arg) or obj
return update(defaultdict(superdict), arg)
Example:
>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
You could use a NestedDict.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6
The result as a dictionary:
>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}
To install ndicts
pip install ndicts