I have a large list like:
[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4
I want to build a multi-level dict like:
A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4
I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1 and table[A][B2]=3, but this works only if I hardcode those insert statements.
While parsing the list, I don't know how many []'s I need beforehand to call table[key1][key2][...].
You can do it without even defining a class:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()
nest[0][1][2][3][4][5] = 6
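Since the number of keys is only known at parse time, here is a minimal sketch (my addition, assuming each entry has already been parsed into a list of keys plus a value) that walks the nested defaultdict with functools.reduce:

from functools import reduce

def set_path(root, keys, value):
    # each lookup on the way down auto-creates the next level
    reduce(lambda d, k: d[k], keys[:-1], root)[keys[-1]] = value

table = nested_dict()
set_path(table, ['A', 'B1', 'C1'], 1)
set_path(table, ['D', 'E', 'F', 'G'], 4)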
Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:
from collections import defaultdict
class Tree(defaultdict):
def __init__(self, value=None):
super(Tree, self).__init__(Tree)
self.value = value
root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)
Outputs:
1
3
None
You could do something similar by writing the input as JSON and using json.loads (or json.load for a file) to read it in as a structure of nested dictionaries.
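A quick illustration of that suggestion (json.loads for strings, json.load for file objects), using the question's data:

import json

data = json.loads('{"A": {"B1": {"C1": 1, "C2": 2}, "B2": 3}}')
print(data['A']['B1']['C2'])  # 2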
I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.
# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)
Usage:
# Create instance
mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0

# Print (the standard json module renders the nested defaultdicts as plain JSON)
import json
print(json.dumps(mydict, indent=4))
Output:
{
    "a": 1,
    "b": {
        "a": 2
    },
    "c": {},
    "d": {
        "a": {
            "b": 0
        }
    }
}
I'd do it with a subclass of dict that defines __missing__:
>>> class NestedDict(dict):
... def __missing__(self, key):
... self[key] = NestedDict()
... return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}
You can't do it directly with defaultdict, because defaultdict expects its factory function at initialization time, and within that single expression there is no name yet that refers to the dictionary being created. The construct above does the same thing defaultdict does, but since it is a named class (NestedDict), it can create new instances of itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__, as sketched below.
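A possible version of that defaultdict subclass (my sketch of the approach the answer mentions, not code from the answer itself):

from collections import defaultdict

class NestedDefaultDict(defaultdict):
    def __init__(self):
        # pass the class itself as the default factory
        super().__init__(NestedDefaultDict)

table = NestedDefaultDict()
table['A']['B1']['C1'] = 1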
This is equivalent to the lambda-based nested_dict above, but avoids lambda notation. Perhaps easier to read?

from collections import defaultdict

def dict_factory():
    return defaultdict(dict_factory)

your_dict = dict_factory()
Also, from the comments: if you'd like to update from an existing dict, you can simply call

your_dict[0][1][2].update({"some_key": "some_value"})

to add values to the dict.
Dan O'Huiginn posted a very nice solution on his journal in 2010:
http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html
>>> class NestedDict(dict):
... def __getitem__(self, key):
... if key in self: return self.get(key)
... return self.setdefault(key, NestedDict())
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
You may achieve this with a recursive defaultdict.
from collections import defaultdict
def tree():
def the_tree():
return defaultdict(the_tree)
return the_tree()
It is important to protect the default factory name, the_tree here, in a closure (a "private" local function scope). Avoid the one-liner lambda version, which is subtly broken because of Python's late-binding closures; implement it with a def instead.
The accepted answer, using a lambda, has a flaw: instances rely on the name nested_dict existing in an outer scope. If for whatever reason the factory name cannot be resolved (e.g. it was rebound or deleted), then pre-existing instances also become subtly broken:
>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
To add to @Hugo's answer, to have a max depth:

from collections import defaultdict

l = lambda x: defaultdict(lambda: l(x - 1)) if x > 0 else defaultdict(dict)
arr = l(2)
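For illustration (my example, not part of the original answer): with arr = l(2), lookups auto-create dictionaries for the first three levels, and the fourth level is a plain dict that no longer auto-creates:

arr['a']['b']['c']['d'] = 1   # three auto-created levels, then a plain dict assignment
arr['w']['x']['y']['z']       # KeyError: 'z' -- the bottom level is a plain dict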
A slightly different possibility that allows regular dictionary initialization:
from collections import defaultdict

def superdict(arg=()):
    obj = defaultdict(superdict)
    obj.update(arg)
    return obj
Example:
>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
You could use a NestedDict.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6
The result as a dictionary:
>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}
To install ndicts:
pip install ndicts
I'm doing this switchboard thing in Python where I need to keep track of who's talking to whom, so if Alice --> Bob, then that implies that Bob --> Alice.
Yes, I could populate two hash maps, but I'm wondering if anyone has an idea to do it with one.
Or suggest another data structure.
There are no multiple conversations. Let's say this is for a customer service call center, so when Alice dials into the switchboard, she's only going to talk to Bob. His replies also go only to her.
You can create your own dictionary type by subclassing dict and adding the logic that you want. Here's a basic example:
class TwoWayDict(dict):
def __setitem__(self, key, value):
# Remove any previous connections with these values
if key in self:
del self[key]
if value in self:
del self[value]
dict.__setitem__(self, key, value)
dict.__setitem__(self, value, key)
def __delitem__(self, key):
dict.__delitem__(self, self[key])
dict.__delitem__(self, key)
def __len__(self):
"""Returns the number of connections"""
return dict.__len__(self) // 2
And it works like so:
>>> d = TwoWayDict()
>>> d['foo'] = 'bar'
>>> d['foo']
'bar'
>>> d['bar']
'foo'
>>> len(d)
1
>>> del d['foo']
>>> d['bar']
Traceback (most recent call last):
File "<stdin>", line 7, in <module>
KeyError: 'bar'
I'm sure I didn't cover all the cases, but that should get you started.
In your special case you can store both in one dictionary:
relation = {}
relation['Alice'] = 'Bob'
relation['Bob'] = 'Alice'
This works because what you are describing is a symmetric relation: A -> B implies B -> A.
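For completeness, a minimal sketch of keeping the symmetric relation in sync (the connect/disconnect helper names are mine, not from the answer):

relation = {}

def connect(a, b):
    relation[a] = b
    relation[b] = a

def disconnect(a):
    # remove both directions of the pairing
    b = relation.pop(a)
    del relation[b]

connect('Alice', 'Bob')
assert relation['Bob'] == 'Alice'
disconnect('Alice')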
I know it's an older question, but I want to mention another great solution to this problem, namely the Python package bidict. It's extremely straightforward to use:

from bidict import bidict

d = bidict(Bob="Alice")
print(d["Bob"])
print(d.inv["Alice"])
I would just populate a second hash, with
reverse_map = dict((reversed(item) for item in forward_map.items()))
Two hash maps is actually probably the fastest-performing solution, assuming you can spare the memory. I would wrap them in a single class; the burden on the programmer is ensuring that the two hash maps stay in sync.
A less verbose way, still using reversed:
dict(map(reversed, my_dict.items()))
You have two separate issues.
You have a "Conversation" object. It refers to two Persons. Since a Person can have multiple conversations, you have a many-to-many relationship.
You have a Map from Person to a list of Conversations. A Conversation holds a pair of Persons.
Do something like this
from collections import defaultdict

switchboard = defaultdict(list)

# assuming a Conversation class whose instances expose the two parties as p1 and p2
x = Conversation("Alice", "Bob")
y = Conversation("Alice", "Charlie")

for c in (x, y):
    switchboard[c.p1].append(c)
    switchboard[c.p2].append(c)
No, there is really no way to do this without creating two dictionaries. How would it be possible to implement this with just one dictionary while continuing to offer comparable performance?
You are better off creating a custom type that encapsulates two dictionaries and exposes the functionality you want.
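A rough sketch of such a wrapper (the Bimap name and its methods are illustrative, not an established API):

class Bimap:
    """Encapsulates a forward and a reverse dict and keeps them in sync."""
    def __init__(self):
        self._fwd = {}
        self._rev = {}

    def add(self, key, value):
        self._fwd[key] = value
        self._rev[value] = key

    def by_key(self, key):
        return self._fwd[key]

    def by_value(self, value):
        return self._rev[value]

    def remove(self, key):
        value = self._fwd.pop(key)
        del self._rev[value]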
You may be able to use a DoubleDict as shown in recipe 578224 on the Python Cookbook.
Another possible solution is to implement a subclass of dict that holds the original dictionary and keeps track of a reversed version of it. Keeping two separate dicts can be useful if keys and values overlap.

class TwoWayDict(dict):
    def __init__(self, my_dict):
        dict.__init__(self, my_dict)
        self.rev_dict = {v: k for k, v in my_dict.items()}

    def __setitem__(self, key, value):
        dict.__setitem__(self, key, value)
        self.rev_dict.__setitem__(value, key)

    def pop(self, key):
        self.rev_dict.pop(self[key])
        dict.pop(self, key)

    # The above is just an idea; other methods
    # should also be overridden.
Example:
>>> d = {'a' : 1, 'b' : 2} # suppose we need to use d and its reversed version
>>> twd = TwoWayDict(d) # create a two-way dict
>>> twd
{'a': 1, 'b': 2}
>>> twd.rev_dict
{1: 'a', 2: 'b'}
>>> twd['a']
1
>>> twd.rev_dict[2]
'b'
>>> twd['c'] = 3 # we add to twd and reversed version also changes
>>> twd
{'a': 1, 'b': 2, 'c': 3}
>>> twd.rev_dict
{1: 'a', 2: 'b', 3: 'c'}
>>> twd.pop('a') # we pop elements from twd and reversed version changes
>>> twd
{'b': 2, 'c': 3}
>>> twd.rev_dict
{2: 'b', 3: 'c'}
There's the collections-extended library on pypi: https://pypi.python.org/pypi/collections-extended/0.6.0
Using the bijection class is as easy as:
RESPONSE_TYPES = bijection({
0x03 : 'module_info',
0x09 : 'network_status_response',
0x10 : 'trust_center_device_update'
})
>>> RESPONSE_TYPES[0x03]
'module_info'
>>> RESPONSE_TYPES.inverse['network_status_response']
0x09
I like the suggestion of bidict in one of the comments.
pip install bidict
Usage:
# This normalization method should save a lot, since strings such as
# aDaD and yXyX share the same smallest-grammar form.
# To map back to the original alphabet, use the returned trans bidict.
from bidict import bidict

def normalize_string(s, nv=None):
    if nv is None:
        nv = ord('a')
    trans = bidict()
r = ''
for c in s:
if c not in trans.inverse:
a = chr(nv)
nv += 1
trans[a] = c
else:
a = trans.inverse[c]
r += a
return r, trans
def translate_string(s, trans):
res = ''
for c in s:
res += trans[c]
return res
if __name__ == "__main__":
s = "bnhnbiodfjos"
n, tr = normalize_string(s)
print(n)
print(tr)
print(translate_string(n, tr))
There isn't much documentation about bidict, but I've got all the features I need from it working correctly.
Prints:
abcbadefghei
bidict({'a': 'b', 'b': 'n', 'c': 'h', 'd': 'i', 'e': 'o', 'f': 'd', 'g': 'f', 'h': 'j', 'i': 's'})
bnhnbiodfjos
The kjbuckets C extension module provides a "graph" data structure which I believe gives you what you want.
Here's one more two-way dictionary implementation, extending Python's dict class, in case you didn't like any of the other ones:
class DoubleD(dict):
""" Access and delete dictionary elements by key or value. """
def __getitem__(self, key):
if key not in self:
inv_dict = {v:k for k,v in self.items()}
return inv_dict[key]
return dict.__getitem__(self, key)
def __delitem__(self, key):
if key not in self:
inv_dict = {v:k for k,v in self.items()}
dict.__delitem__(self, inv_dict[key])
else:
dict.__delitem__(self, key)
Use it like a normal Python dictionary, except for construction:
dd = DoubleD()
dd['foo'] = 'bar'
A way I like to do this kind of thing is something like:
{my_dict[key]: key for key in my_dict.keys()}
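The same reversal is usually written over items(), which avoids the repeated lookups:

{value: key for key, value in my_dict.items()}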
When a missing key is queried in a defaultdict object, the key is automatically added to the dictionary:
from collections import defaultdict
d = defaultdict(int)
res = d[5]
print(d)
# defaultdict(<class 'int'>, {5: 0})
# we want this dictionary to remain empty
However, often we want to only add keys when they are assigned explicitly or implicitly:
d[8] = 1 # we want this key added
d[3] += 1 # we want this key added
One use case is simple counting, to avoid the higher overhead of collections.Counter, but this feature may also be desirable generally.
Counter example [pardon the pun]
This is the functionality I want:
from collections import Counter
c = Counter()
res = c[5] # 0
print(c) # Counter()
c[8] = 1 # key added successfully
c[3] += 1 # key added successfully
But Counter is significantly slower than defaultdict(int); the performance hit I see is usually ~2x.
In addition, Counter obviously only covers the defaultdict(int) case, while defaultdict can take list, set, etc.
Is there a way to implement the above behaviour efficiently; for instance, by subclassing defaultdict?
Benchmarking example
%timeit DwD(lst) # 72 ms
%timeit dd(lst) # 44 ms
%timeit counter_func(lst) # 98 ms
%timeit af(lst) # 72 ms
Test code:
import numpy as np
from collections import defaultdict, Counter, UserDict
class DefaultDict(defaultdict):
def get_and_forget(self, key):
_sentinel = object()
value = self.get(key, _sentinel)
if value is _sentinel:
return self.default_factory()
return value
class DictWithDefaults(dict):
__slots__ = ['_factory'] # avoid using extra memory
def __init__(self, factory, *args, **kwargs):
self._factory = factory
super().__init__(*args, **kwargs)
def __missing__(self, key):
return self._factory()
lst = np.random.randint(0, 10, 100000)
def DwD(lst):
d = DictWithDefaults(int)
for i in lst:
d[i] += 1
return d
def dd(lst):
d = defaultdict(int)
for i in lst:
d[i] += 1
return d
def counter_func(lst):
d = Counter()
for i in lst:
d[i] += 1
return d
def af(lst):
d = DefaultDict(int)
for i in lst:
d[i] += 1
return d
Note Regarding Bounty Comment:
@Aran-Fey's solution has been updated since the bounty was offered, so please disregard the bounty comment.
Rather than messing about with collections.defaultdict to make it do what we want, it seems easier to implement our own:
class DefaultDict(dict):
def __init__(self, default_factory, **kwargs):
super().__init__(**kwargs)
self.default_factory = default_factory
def __getitem__(self, key):
try:
return super().__getitem__(key)
except KeyError:
return self.default_factory()
This works the way you want:
d = DefaultDict(int)
res = d[5]
d[8] = 1
d[3] += 1
print(d) # {8: 1, 3: 1}
However, it can behave unexpectedly for mutable types:
d = DefaultDict(list)
d[5].append('foobar')
print(d) # output: {}
This is probably the reason why defaultdict remembers the value when a nonexistent key is accessed.
Another option is to extend defaultdict and add a new method that looks up a value without remembering it:
from collections import defaultdict
class DefaultDict(defaultdict):
def get_and_forget(self, key):
return self.get(key, self.default_factory())
Note that the get_and_forget method calls the default_factory() every time, regardless of whether the key already exists in the dict or not. If this is undesirable, you can implement it with a sentinel value instead:
class DefaultDict(defaultdict):
def get_and_forget(self, key):
_sentinel = object()
value = self.get(key, _sentinel)
if value is _sentinel:
return self.default_factory()
return value
This has better support for mutable types, because it allows you to choose whether the value should be added to the dict or not.
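For example (my usage sketch for the class above), the caller decides when to store the default:

d = DefaultDict(list)
value = d.get_and_forget('x')   # returns [] without inserting 'x'
value.append(1)
d['x'] = value                  # stored only by this explicit assignment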
If you just want a dict that returns a default value when you access a non-existing key then you could simply subclass dict and implement __missing__:
object.__missing__(self, key)
Called by dict.__getitem__() to implement self[key] for dict subclasses when key is not in the dictionary.
That would look like this:
class DictWithDefaults(dict):
# not necessary, just a memory optimization
__slots__ = ['_factory']
def __init__(self, factory, *args, **kwargs):
self._factory = factory
super().__init__(*args, **kwargs)
def __missing__(self, key):
return self._factory()
In this case I used a defaultdict-like approach so you have to pass in a factory that should provide the default value when called:
>>> dwd = DictWithDefaults(int)
>>> dwd[0] # key does not exist
0
>>> dwd # key still doesn't exist
{}
>>> dwd[0] = 10
>>> dwd
{0: 10}
When you do assignments (explicitly or implicitly) the value will be added to the dictionary:
>>> dwd = DictWithDefaults(int)
>>> dwd[0] += 1
>>> dwd
{0: 1}
>>> dwd = DictWithDefaults(list)
>>> dwd[0] += [1]
>>> dwd
{0: [1]}
You wondered how collections.Counter is doing it and as of CPython 3.6.5 it also uses __missing__:
class Counter(dict):
...
def __missing__(self, key):
'The count of elements not in the Counter is zero.'
# Needed so that self[missing_item] does not raise KeyError
return 0
...
Better performance?!
You mentioned that speed is of concern, so you could make that class a C extension class (assuming you use CPython), for example using Cython (I'm using the Jupyter magic commands to create the extension class):
%load_ext cython
%%cython
cdef class DictWithDefaultsCython(dict):
cdef object _factory
def __init__(self, factory, *args, **kwargs):
self._factory = factory
super().__init__(*args, **kwargs)
def __missing__(self, key):
return self._factory()
Benchmark
Based on your benchmark:
from collections import Counter, defaultdict
def d_py(lst):
d = DictWithDefaults(int)
for i in lst:
d[i] += 1
return d
def d_cy(lst):
d = DictWithDefaultsCython(int)
for i in lst:
d[i] += 1
return d
def d_dd(lst):
    d = defaultdict(int)
    for i in lst:
        d[i] += 1
    return d

def d_c(lst):
    # Counter with a manual for-loop (the d_c referenced in the function list below)
    d = Counter()
    for i in lst:
        d[i] += 1
    return d
Given that this is just counting it would be an (unforgivable) oversight to not include a benchmark simply using the Counter initializer.
I have recently written a small benchmarking tool that I think might come in handy here (but you could do it using %timeit as well):
from simple_benchmark import benchmark
import random
sizes = [2**i for i in range(2, 20)]
unique_lists = {i: list(range(i)) for i in sizes}
identical_lists = {i: [0]*i for i in sizes}
mixed = {i: [random.randint(0, i // 2) for j in range(i)] for i in sizes}
functions = [d_py, d_cy, d_dd, d_c, Counter]
b_unique = benchmark(functions, unique_lists, 'list size')
b_identical = benchmark(functions, identical_lists, 'list size')
b_mixed = benchmark(functions, mixed, 'list size')
With this result:
import matplotlib.pyplot as plt
f, (ax1, ax2, ax3) = plt.subplots(1, 3, sharey=True)
ax1.set_title('unique elements')
ax2.set_title('identical elements')
ax3.set_title('mixed elements')
b_unique.plot(ax=ax1)
b_identical.plot(ax=ax2)
b_mixed.plot(ax=ax3)
Note that the plots use a log-log scale for better visibility of differences.
For long iterables, Counter(iterable) was by far the fastest. DictWithDefaultsCython and defaultdict were roughly equal (with DictWithDefaultsCython slightly faster most of the time, even if that's not visible here), followed by DictWithDefaults and then Counter with the manual for-loop. Funny how Counter is both the fastest and the slowest.
Implicitly add the returned value if it is modified
Something I glossed over is the fact that this differs considerably from defaultdict because of the desired "just return the default, don't save it" behavior with mutable types:
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd[0].append(10)
>>> dd
defaultdict(list, {0: [10]})
>>> dwd = DictWithDefaults(list)
>>> dwd[0].append(10)
>>> dwd
{}
That means you actually need to set the element when you want the modified value to be visible in the dictionary.
However, this somewhat intrigued me, so I want to share a way you could make that work (if desired). It's just a quick test and only works for append calls, using a proxy. Please don't use it in production code (from my point of view it has entertainment value only):
from wrapt import ObjectProxy
class DictWithDefaultsFunky(dict):
__slots__ = ['_factory'] # avoid using extra memory
def __init__(self, factory, *args, **kwargs):
self._factory = factory
super().__init__(*args, **kwargs)
def __missing__(self, key):
ret = self._factory()
dict_ = self
class AppendTrigger(ObjectProxy):
def append(self, val):
self.__wrapped__.append(val)
dict_[key] = ret
return AppendTrigger(ret)
That's a dictionary that returns a proxy object (instead of the real default) and it overloads a method that, if called, adds the return value to the dictionary. And it "works":
>>> d = DictWithDefaultsFunky(list)
>>> a = d[10]
>>> d
{}
>>> a.append(1)
>>> d
{10: [1]}
But it does have a few pitfalls (that could be solved but it's just a proof-of-concept so I won't attempt it here):
>>> d = DictWithDefaultsFunky(list)
>>> a = d[10]
>>> b = d[10]
>>> d
{}
>>> a.append(1)
>>> d
{10: [1]}
>>> b.append(10)
>>> d # oops, that overwrote the previously stored value ...
{10: [10]}
If you really want something like that you probably need to implement a class that really tracks changes within the values (and not just append calls).
If you want to avoid implicit assignments
In case you don't like the fact that += or similar operations add the value to the dictionary (as opposed to the previous example, which even tried to add the value in a very implicit fashion), then you should probably implement it as a regular method instead of a special method.
For example:
class SpecialDict(dict):
    __slots__ = ['_factory']

    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        super().__init__(*args, **kwargs)  # was missing; without it, initial contents are silently dropped

    def get_or_default_from_factory(self, key):
        try:
            return self[key]
        except KeyError:
            return self._factory()
>>> sd = SpecialDict(int)
>>> sd.get_or_default_from_factory(0)
0
>>> sd
{}
>>> sd[0] = sd.get_or_default_from_factory(0) + 1
>>> sd
{0: 1}
This is similar to the behavior of Aran-Fey's answer, but instead of get with a sentinel it uses a try/except approach.
Your bounty message says Aran-Fey's answer "does not work with mutable types". (For future readers, the bounty message is "The current answer is good, but it does not work with mutable types. If the existing answer can be adapted, or another option solution put forward, to suit this purpose, this would be ideal.")
The thing is, it does work for mutable types:
>>> d = DefaultDict(list)
>>> d[0] += [1]
>>> d[0]
[1]
>>> d[1]
[]
>>> 1 in d
False
What doesn't work is something like d[1].append(2):
>>> d[1].append(2)
>>> d[1]
[]
That's because this doesn't involve a store operation on the dict. The only dict operation involved is an item retrieval.
There is no difference between what the dict object sees in d[1] or d[1].append(2). The dict is not involved in the append operation. Without nasty, fragile stack inspection or something similar, there is no way for the dict to store the list only for d[1].append(2).
So that's hopeless. What should you do instead?
Well, one option is to use a regular collections.defaultdict, and just not use [] when you don't want to store defaults. You can use in or get:
if key in d:
value = d[key]
else:
...
or
value = d.get(key, sentinel)
Alternatively, you can turn off the default factory when you don't want it. This is frequently reasonable when you have separate "build" and "read" phases, and you don't want the default factory during the read phase:
d = collections.defaultdict(list)
for thing in whatever:
d[thing].append(other_thing)
# turn off default factory
d.default_factory = None
use(d)
Is there any way to dynamically create missing keys when I want to set a value in a subdictionary?
Essentially, I want to create any missing keys and set my value:

self.portdict[switchname][str(neighbor['name'])]['local']['ports'] = []

Currently I'm doing it like this, but it's messy:
if not switchname in self.portdict:
self.portdict[switchname] = {}
if not str(neighbor['name']) in self.portdict[switchname]:
self.portdict[switchname][str(neighbor['name'])] = {}
if not 'local' in self.portdict[switchname][str(neighbor['name'])]:
self.portdict[switchname][str(neighbor['name'])]['local'] = {}
if not 'ports' in self.portdict[switchname][str(neighbor['name'])]['local']:
self.portdict[switchname][str(neighbor['name'])]['local']['ports'] = []
Is there any way to do this in one or two lines instead?
This is easier to do without recursion:
def set_by_path(dct, path, value):
ipath = iter(path)
p_last = next(ipath)
try:
while True:
p_next = next(ipath)
dct = dct.setdefault(p_last, {})
p_last = p_next
except StopIteration:
dct[p_last] = value
And a test case:
d = {}
set_by_path(d, ['foo', 'bar', 'baz'], 'qux')
print(d)  # {'foo': {'bar': {'baz': 'qux'}}}
If you want to have it so you don't need a function, you can use the following defaultdict factory which allows you to nest things arbitrarily deeply:
from collections import defaultdict
defaultdict_factory = lambda : defaultdict(defaultdict_factory)
d = defaultdict_factory()
d['foo']['bar']['baz'] = 'qux'
print(d)
Use collections.defaultdict:

from collections import defaultdict

self.portdict = defaultdict(lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(list))))
I've run into a similar problem in the past. I found that defaultdict was the right answer for me, but writing the super-long definitions (like the ones in @o11c's answer or @Apero's answer) was no good. Here's what I came up with instead:
from collections import defaultdict
from functools import partial
def NestedDefaultDict(levels, baseFn):
def NDD(lvl):
return partial(defaultdict, NDD(lvl-1)) if lvl > 0 else baseFn
return defaultdict(NDD(levels-1))
This creates a dictionary with levels of nested dictionaries. So if you have levels=3, then you need 3 keys to access the bottom-level value. The second argument is a function which is used to create the bottom-level values. Something like list or lambda: 0 or even dict would work well.
Here's an example of using the "automatic" keys with 4 levels, and list as the default function:
>>> x = NestedDefaultDict(4, list)
>>> x[1][2][3][4].append('hello')
>>> x
defaultdict(<functools.partial object at 0x10b5c22b8>, {1: defaultdict(<functools.partial object at 0x10b5c2260>, {2: defaultdict(<functools.partial object at 0x10b5c2208>, {3: defaultdict(<type 'list'>, {4: ['hello']})})})})
I think that's basically what you'd want for the case in your question. Your 4 "levels" are switch-name, neighbor-name, local, and ports, and it looks like you want a list at the bottom level to store your ports.
Another example using 2 levels and lambda: 0 as the default:
>>> y = NestedDefaultDict(2, lambda: 0)
>>> y['foo']['bar'] += 7
>>> y['foo']['baz'] += 10
>>> y['foo']['bar'] += 1
>>> y
defaultdict(<functools.partial object at 0x1021f1310>, {'foo': defaultdict(<function <lambda> at 0x1021f3938>, {'baz': 10, 'bar': 8})})
Have a close look at collections.defaultdict:
from collections import defaultdict
foo = defaultdict(dict)
foo['bar'] = defaultdict(dict)
foo['bar']['baz'] = defaultdict(dict)
foo['bar']['baz']['aaa'] = 1
foo['bor'] = 0
foo['bir'] = defaultdict(list)
foo['bir']['biz'].append(1)
foo['bir']['biz'].append(2)
print(foo)

defaultdict(<class 'dict'>, {'bar': defaultdict(<class 'dict'>, {'baz': defaultdict(<class 'dict'>, {'aaa': 1})}), 'bor': 0, 'bir': defaultdict(<class 'list'>, {'biz': [1, 2]})})
I use a dict as a short-term cache. I want to get a value from the dictionary, and if the dictionary didn't already have that key, set it, e.g.:
val = cache.get('the-key', calculate_value('the-key'))
cache['the-key'] = val
In the case where 'the-key' was already in cache, the second line is not necessary. Is there a better, shorter, more expressive idiom for this?
yes, use:
val = cache.setdefault('the-key', calculate_value('the-key'))
An example in the shell:
>>> cache = {'a': 1, 'b': 2}
>>> cache.setdefault('a', 0)
1
>>> cache.setdefault('b', 0)
2
>>> cache.setdefault('c', 0)
0
>>> cache
{'a': 1, 'b': 2, 'c': 0}
See: http://docs.python.org/release/2.5.2/lib/typesmapping.html
Readability matters!
if 'the-key' not in cache:
cache['the-key'] = calculate_value('the-key')
val = cache['the-key']
If you really prefer an one-liner:
val = cache['the-key'] if 'the-key' in cache else cache.setdefault('the-key', calculate_value('the-key'))
Another option is to define __missing__ in the cache class:
class Cache(dict):
def __missing__(self, key):
return self.setdefault(key, calculate_value(key))
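Usage would then look like this (calculate_value as in the question); the value is computed and stored on the first access only:

cache = Cache()
val = cache['the-key']   # __missing__ runs calculate_value and stores the result
val = cache['the-key']   # now an ordinary dict hit; calculate_value is not called again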
Have a look at the Python Decorator Library, and more specifically at Memoize, which acts as a cache. That way you can just decorate calculate_value with the Memoize decorator.
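The idea is roughly the following (a hand-rolled sketch, not the library's exact code):

import functools

def memoize(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        # compute once per distinct argument tuple, then serve from the cache
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def calculate_value(key):
    ...  # expensive computation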
The approach with

cache.setdefault('the-key', calculate_value('the-key'))

is only fine if calculate_value is not costly, because it is evaluated on every call. So if you have to read from a DB, open a file or network connection, or do anything else "expensive", then use the following structure:
try:
val = cache['the-key']
except KeyError:
val = calculate_value('the-key')
cache['the-key'] = val
You might want to take a look at (the entire page at) "Code Like a Pythonista" http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#dictionary-get-method
It covers the setdefault() technique described above, and the defaultdict technique is also very handy for making dictionaries of sets or lists, for example.
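For example, building a dictionary of sets without any key-existence checks:

from collections import defaultdict

groups = defaultdict(set)
groups['evens'].add(2)
groups['evens'].add(4)
groups['odds'].add(3)
# defaultdict(<class 'set'>, {'evens': {2, 4}, 'odds': {3}})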
You can also use defaultdict to do something similar:
>>> from collections import defaultdict
>>> d = defaultdict(int) # will default values to 0
>>> d["a"] = 1
>>> d["a"]
1
>>> d["b"]
0
>>>
You can assign any default you want by supplying your own factory function and itertools.repeat:
>>> from itertools import repeat
>>> def constant_factory(value):
...     return repeat(value).__next__
...
>>> default_value = "default"
>>> d = defaultdict(constant_factory(default_value))
>>> d["a"]
'default'
>>> d["b"] = 5
>>> d["b"]
5
>>> d.keys()
dict_keys(['a', 'b'])
use setdefault method,
if the key is already not present then setdefault creates the new key with the value provided in the second argument, in case the key is already present then it returns the value of that key.
val = cache.setdefault('the-key',value)
Use get to extract the value, or None if the key is missing.
Since None is falsy, combining it with or lets you chain another operation (setdefault). Note that this only works if genuine cached values are never falsy (a cached 0 or '' would trigger recomputation):
def get_or_add(cache, key, value_factory):
return cache.get(key) or cache.setdefault(key, value_factory())
Usage (to keep the computation lazy, the method expects a function as the third parameter):
get_or_add(cache, 'the-key', lambda: calculate_value('the-key'))
I want to override Python's dictionary access mechanism (__getitem__) to be able to return a default value. The functionality I am looking for is something like:

a = dict()
a.setdefault_value(None)
print(a[100])  # this would print None

Any hints?
Thanks
There is already a collections.defaultdict:
from collections import defaultdict
a = defaultdict(lambda: None)
print(a[100])
There is a defaultdict built-in starting with Python 2.6. The constructor takes a function which will be called when a value is not found. This gives more flexibility than simply returning None.
from collections import defaultdict
a = defaultdict(lambda: None)
print(a[100])  # gives None
The lambda is just a quick way to define a one-line function with no name. This code is equivalent:
def nonegetter():
return None
a = defaultdict(nonegetter)
print(a[100])  # gives None
This is a very useful pattern which gives you a hash showing the count of each unique object. Using a normal dict, you would need special cases to avoid KeyError.
counts = defaultdict(int)
for obj in mylist:
counts[obj] += 1
Use a defaultdict (http://docs.python.org/library/collections.html#collections.defaultdict):
import collections
a = collections.defaultdict(lambda:None)
where the argument to the defaultdict constructor is a function which returns the default value.
Note that if you access an unset entry, it actually sets it to the default:
>>> print(a[100])
None
>>> a
defaultdict(<function <lambda> at 0x38faf0>, {100: None})
If you really want to not use the defaultdict builtin, you need to define your own subclass of dict, like so:
class MyDefaultDict(dict):
    def setdefault_value(self, default):
        self.__default = default

    def __getitem__(self, key):
        try:
            # use dict's own lookup; calling self[key] here would recurse forever
            return dict.__getitem__(self, key)
        except KeyError:  # dict lookups raise KeyError, not IndexError
            return self.__default
I wasn't aware of defaultdict, and that's probably the best way to go. If you are opposed for some reason, I've written a small wrapper function for this purpose in the past. It has slightly different functionality that may or may not be better for you.

def makeDictGet(d, defaultVal):
    return lambda key: d[key] if key in d else defaultVal
And using it...
>>> d1 = {'a':1,'b':2}
>>> d1Get = makeDictGet(d1, 0)
>>> d1Get('a')
1
>>> d1Get(5)
0
>>> d1['newAddition'] = 'justAddedThisOne' #changing dict after the fact is fine
>>> d1Get('newAddition')
'justAddedThisOne'
>>> del d1['a']
>>> d1Get('a')
0
>>> d1GetDefaultNone = makeDictGet(d1, None) #having more than one such function is fine
>>> print(d1GetDefaultNone('notpresent'))
None
>>> d1Get('notpresent')
0
>>> f = makeDictGet({'k1': 'val1', 'pi': 3.14, 'e': 2.718}, False)  # just pass a dict literal as the arg if you're OK with being unable to change it or access it directly
>>> f('e')
2.718
>>> f('bad')
False