I am trying to understand some code and I found the following script:
defaultdict(lambda: defaultdict(lambda: 0))
I am not familiar with defaultdict or with lambda functions. I suspect that this is equivalent to initializing a dictionary whose values are also dictionaries. Am I right?
A lambda expression defines a function in-place. Arguments go before the :, and the result of the function goes after it. For example:
>>> inc = lambda x: x+1
>>> inc(3)
4
>>> add = lambda x, y: x + y
>>> add(19, 23)
42
>>> zero = lambda: 0
>>> zero()
0
defaultdict is a dict that creates a default value any time you access it with a non-existent key. It does this by calling the function you pass to it. A common use case is to create a counter by having a defaultdict that automatically creates zero values that can then be incremented:
>>> foo = defaultdict(lambda: 0)
>>> foo["bar"]
0
>>> foo["bar"] += 1
>>> foo["bar"]
1
Since the function used by a defaultdict can be anything, we can nest them by giving an outer dict a function that returns an inner defaultdict:
>>> foo = defaultdict(lambda: defaultdict(lambda: 0))
>>> foo["bar"]["baz"]
0
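As a minimal sketch of where this nesting is useful, the following counts how often one word follows another. The pairs list is made-up sample data, not something from the question.

```python
from collections import defaultdict

# Hypothetical sample data: (first word, second word) pairs.
pairs = [("a", "b"), ("a", "b"), ("a", "c"), ("x", "y")]

counts = defaultdict(lambda: defaultdict(lambda: 0))
for first, second in pairs:
    # Neither the outer nor the inner key has to exist beforehand.
    counts[first][second] += 1

print(counts["a"]["b"])  # 2
print(counts["a"]["c"])  # 1
print(counts["x"]["y"])  # 1
```

Without the nested factory, each increment would need two existence checks, one per level.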
In someone else's code I read the following two lines:
x = defaultdict(lambda: 0)
y = defaultdict(lambda: defaultdict(lambda: 0))
As the argument of defaultdict is a default factory, I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed. Am I correct?
And what about y? It seems that the default factory will create a defaultdict with default 0. But what does that mean concretely? I tried to play around with it in Python shell, but couldn't figure out what it is exactly.
I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed.
That's right. This is more idiomatically written
x = defaultdict(int)
In the case of y, when you do y["ham"]["spam"], the key "ham" is inserted in y if it does not exist. The value associated with it becomes a defaultdict in which "spam" is automatically inserted with a value of 0.
I.e., y is a kind of "two-tiered" defaultdict. If "ham" not in y, then evaluating y["ham"]["spam"] is like doing
y["ham"] = {}
y["ham"]["spam"] = 0
in terms of ordinary dict.
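A short sketch of that side effect, using the same key names: merely reading y["ham"]["spam"] is enough to create both keys. One difference from the plain-dict version is that the inner value is itself a defaultdict rather than a plain dict.

```python
from collections import defaultdict

y = defaultdict(lambda: defaultdict(lambda: 0))

print("ham" in y)            # False: nothing has been accessed yet
value = y["ham"]["spam"]     # a plain read, no assignment
print(value)                 # 0
print("ham" in y)            # True: the outer key was created
print("spam" in y["ham"])    # True: so was the inner key
print(type(y["ham"]).__name__)  # defaultdict
```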
You are correct about what the first one does. As for y, it will create a defaultdict with default 0 when a key doesn't exist in y, so you can think of this as a nested dictionary. Consider the following example:
y = defaultdict(lambda: defaultdict(lambda: 0))
print(y['k1']['k2'])  # 0
print(dict(y['k1']))  # {'k2': 0}
To create an equivalent nested dictionary structure without defaultdict you would need to create an inner dict for y['k1'] and then set y['k1']['k2'] to 0, but defaultdict does all of this behind the scenes when it encounters keys it hasn't seen:
y = {}
y['k1'] = {}
y['k1']['k2'] = 0
The following function may help for playing around with this on an interpreter to better your understanding:
def to_dict(d):
    if isinstance(d, defaultdict):
        return dict((k, to_dict(v)) for k, v in d.items())
    return d
This will return the dict equivalent of a nested defaultdict, which is a lot easier to read, for example:
>>> y = defaultdict(lambda: defaultdict(lambda: 0))
>>> y['a']['b'] = 5
>>> y
defaultdict(<function <lambda> at 0xb7ea93e4>, {'a': defaultdict(<function <lambda> at 0xb7ea9374>, {'b': 5})})
>>> to_dict(y)
{'a': {'b': 5}}
defaultdict takes a zero-argument callable to its constructor, which is called when the key is not found, as you correctly explained.
lambda: 0 will of course always return zero, but the preferred method to do that is defaultdict(int), which will do the same thing.
As for the second part, the author would like to create a new defaultdict(int), or a nested dictionary, whenever a key is not found in the top-level dictionary.
The existing answers are good, but here is some additional information:
"defaultdict requires an argument that is callable. The return value of that callable is the default value that the dictionary returns when you try to access it with a key that does not exist."
Here's an example
SAMPLE= {'Age':28, 'Salary':2000}
SAMPLE = defaultdict(lambda:0,SAMPLE)
>>> SAMPLE
defaultdict(<function <lambda> at 0x0000000002BF7C88>, {'Salary': 2000, 'Age': 28})
>>> SAMPLE['Age']    # existing key, returns the stored value
28
>>> SAMPLE['Phone']  # non-existent key, so the default factory supplies 0
0
y = defaultdict(lambda:defaultdict(lambda:0))
is helpful when you want to write something like y['a']['b'] += 1 without creating the inner dictionary first.
I have this current code
lst = [1,2,3,4]
c = dict((el,0) for el in lst)
for key in lst:
    c[key] += increase_val(key)
Is there a more Pythonic way to do it, such as using map? This code works, but I would prefer a one-liner or a better way of writing it.
In my opinion, that is a very clean, readable way of updating the dictionary in the way you wanted.
However, if you are looking for a one-liner, here's one:
new_dict = {x: y + increase_val(x) for x, y in old_dict.items()}
What's different is that this creates a new dictionary instead of updating the original one. If you want to mutate the dictionary in place, I think the plain old for-loop would be the most readable alternative.
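To make the distinction concrete, here is a small sketch. increase_val is a hypothetical stand-in for the function from the question, and old_dict is sample data.

```python
def increase_val(key):
    # Hypothetical stand-in for the function from the question.
    return key * 10

old_dict = {1: 0, 2: 0, 3: 0, 4: 0}

# The comprehension builds a separate dictionary...
new_dict = {x: y + increase_val(x) for x, y in old_dict.items()}
print(new_dict)  # {1: 10, 2: 20, 3: 30, 4: 40}
print(old_dict)  # {1: 0, 2: 0, 3: 0, 4: 0}

# ...while the loop updates the existing one in place.
for key in old_dict:
    old_dict[key] += increase_val(key)
print(old_dict == new_dict)  # True
```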
In your case there is no need for the c = dict((el,0) for el in lst) statement, because it only creates a dictionary where the value of each key is 0.
The following for loop then adds the increment to 0 (e.g. 0 + 100 = 100), so the addition is not needed either.
You can write code like:
lst = [1,2,3,4]
c = {}
for key in lst:
    c[key] = increase_val(key)
collections.Counter()
Use collections.Counter() to avoid the extra pass over the list that creates the dictionary, because a missing key in a Counter counts as 0.
This requires import collections.
Demo:
>>> lst = [1,2,3,4]
>>> data = collections.Counter()
>>> for key in lst:
...     data[key] += increase_val(key)
...
collections.defaultdict()
We can also use collections.defaultdict. Just use data = collections.defaultdict(int) in the code above. Here the default value is zero.
But if we want the default to be some other constant, such as 100, we can use a lambda function that returns 100.
Demo:
>>> data = {}
>>> data["any"]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'any'
We get a KeyError because there is no "any" key in the dictionary.
>>> data1 = collections.defaultdict(lambda:0, data)
>>> data1["any"]
0
>>> data1 = collections.defaultdict(lambda:100, data)
>>> data1["any"]
100
What exactly does the TYPE lambda do when used with defaultdict? I have this example, and it works fine with int, list, and a lambda as the argument:
d = defaultdict(int)
d['one'] = lambda x:x*x
d['one'](2)
4
d = defaultdict(list)
d['one'] = lambda x:x*x
d['one'](2)
4
d = defaultdict(lambda: None)
d['one'] = lambda x:x*x
d['one'](2)
4
I get the same result each time. So what is the main reason to initialize with a lambda, as in defaultdict(lambda: None)? It looks like defaultdict does not care what TYPE of argument is passed in.
Your example only makes sense when you access keys that are not explicitly added to the dictionary:
>>> d = defaultdict(int)
>>> d['one']
0
>>> d = defaultdict(list)
>>> d['one']
[]
>>> d = defaultdict(lambda: None)
>>> d['one'] is None
True
As you can see, using a default dict will give every key you try to access a default value. That default value is taken by calling the function you pass to the constructor. So passing int will set int() as the default value (which is 0); passing list will set list() as the default value (which is an empty list []); and passing lambda: None will set (lambda: None)() as the default value (which is None).
That’s what the default dictionary does. Nothing else.
The idea is that this way, you can set up defaults which you don’t need to manually set up the first time you want to access the key. So for example something like this:
d = {}
for item in some_source_for_items:
    if item['key'] not in d:
        d[item['key']] = []
    d[item['key']].append(item)
which just sets up a new empty list for every dictionary item when it is accessed, can be reduced to just this:
d = defaultdict(list)
for item in some_source_for_items:
    d[item['key']].append(item)
And the defaultdict will make sure to initialize the list correctly.
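Here is a runnable sketch of that grouping pattern. The contents of some_source_for_items are invented for illustration; only the 'key' field matters.

```python
from collections import defaultdict

# Hypothetical items; only the 'key' field is used for grouping.
some_source_for_items = [
    {"key": "fruit", "name": "apple"},
    {"key": "veg", "name": "carrot"},
    {"key": "fruit", "name": "pear"},
]

d = defaultdict(list)
for item in some_source_for_items:
    # A fresh empty list is created the first time each key is seen.
    d[item["key"]].append(item)

print([item["name"] for item in d["fruit"]])  # ['apple', 'pear']
print(sorted(d))                              # ['fruit', 'veg']
```

Because the factory is called once per missing key, every group gets its own list object rather than sharing one.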
You are not using the default value factory. You won't see a difference if all you do is assign to keys, rather than try and retrieve a key that isn't in the dictionary yet.
The default value factory (the first argument to defaultdict()) is not a type declaration. It is instead called whenever you try and access a key that isn't in the dictionary yet:
>>> from collections import defaultdict
>>> def demo_factory():
... print('Called the factory for a missing key')
... return 'Default value'
...
>>> d = defaultdict(demo_factory)
>>> list(d) # list the keys
[]
>>> d['foo']
Called the factory for a missing key
'Default value'
>>> list(d)
['foo']
>>> d['foo']
'Default value'
>>> d['bar'] = 'spam' # assignment is not the same thing
>>> list(d)
['foo', 'bar']
>>> d['bar']
'spam'
Only the first time when I tried to access the key 'foo' was the factory called to produce a default value, which is then stored in the dictionary for future access.
So for each of your different examples, what varies between them is what default value will be produced for each. You never access this functionality, because you directly assigned to the 'one' key.
Had you accessed a non-existing key you'd have created an integer with value 0, an empty list or None, respectively.
I use a dict as a short-term cache. I want to get a value from the dictionary, and if the dictionary didn't already have that key, set it, e.g.:
val = cache.get('the-key', calculate_value('the-key'))
cache['the-key'] = val
In the case where 'the-key' was already in cache, the second line is not necessary. Is there a better, shorter, more expressive idiom for this?
Yes, use:
val = cache.setdefault('the-key', calculate_value('the-key'))
An example in the shell:
>>> cache = {'a': 1, 'b': 2}
>>> cache.setdefault('a', 0)
1
>>> cache.setdefault('b', 0)
2
>>> cache.setdefault('c', 0)
0
>>> cache
{'a': 1, 'c': 0, 'b': 2}
See: http://docs.python.org/release/2.5.2/lib/typesmapping.html
Readability matters!
if 'the-key' not in cache:
    cache['the-key'] = calculate_value('the-key')
val = cache['the-key']
If you really prefer a one-liner:
val = cache['the-key'] if 'the-key' in cache else cache.setdefault('the-key', calculate_value('the-key'))
Another option is to define __missing__ in the cache class:
class Cache(dict):
    def __missing__(self, key):
        return self.setdefault(key, calculate_value(key))
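A quick sketch of how such a class behaves. calculate_value here is a hypothetical stand-in that records its calls, so you can see that it only runs on a miss.

```python
calls = []

def calculate_value(key):
    # Hypothetical expensive computation; records each call.
    calls.append(key)
    return len(key)

class Cache(dict):
    def __missing__(self, key):
        # Invoked by dict's item lookup only when the key is absent.
        return self.setdefault(key, calculate_value(key))

cache = Cache()
print(cache["the-key"])    # 7, computed on the first access
print(cache["the-key"])    # 7, served from the dict this time
print(calls)               # ['the-key']
print(cache.get("other"))  # None: get() does not trigger __missing__
```

Note that only subscript access (cache[key]) goes through __missing__; get() and the in operator do not.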
Have a look at the Python Decorator Library, and more specifically Memoize, which acts as a cache. That way you can just decorate calculate_value with the Memoize decorator.
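If you are on Python 3, a comparable effect is available from the standard library via functools.lru_cache. To be clear, this is a different tool from the Memoize recipe mentioned above, and calculate_value below is a hypothetical stand-in.

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=None)  # unbounded cache keyed on the arguments
def calculate_value(key):
    # Hypothetical expensive computation; records each real invocation.
    calls.append(key)
    return key.upper()

print(calculate_value("the-key"))  # THE-KEY
print(calculate_value("the-key"))  # THE-KEY, returned from the cache
print(calls)                       # ['the-key']
```

The arguments must be hashable for this to work, just as dictionary keys must be.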
The approach with
cache.setdefault('the-key',calculate_value('the-key'))
is great if calculate_value is not costly, because it will be evaluated every time, even when the key is already in the cache. So if you have to read from a DB, open a file or network connection, or do anything "expensive", then use the following structure:
try:
    val = cache['the-key']
except KeyError:
    val = calculate_value('the-key')
    cache['the-key'] = val
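The difference can be seen with a small sketch that counts calls. calculate_value is a hypothetical stand-in for the expensive function.

```python
calls = []

def calculate_value(key):
    # Hypothetical expensive computation; records each call.
    calls.append(key)
    return key.upper()

# setdefault: the argument is evaluated before setdefault even runs.
cache = {}
cache.setdefault("the-key", calculate_value("the-key"))
cache.setdefault("the-key", calculate_value("the-key"))
print(len(calls))  # 2

# try/except: the function only runs on a cache miss.
calls.clear()
lazy_cache = {}
for _ in range(2):
    try:
        val = lazy_cache["the-key"]
    except KeyError:
        val = calculate_value("the-key")
        lazy_cache["the-key"] = val
print(len(calls))  # 1
```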
You might want to take a look at (the entire page at) "Code Like a Pythonista" http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#dictionary-get-method
It covers the setdefault() technique described above, and the defaultdict technique is also very handy for making dictionaries of sets or arrays for example.
You can also use defaultdict to do something similar:
>>> from collections import defaultdict
>>> d = defaultdict(int) # will default values to 0
>>> d["a"] = 1
>>> d["a"]
1
>>> d["b"]
0
>>>
You can assign any default you want by supplying your own factory function and itertools.repeat:
>>> from itertools import repeat
>>> def constant_factory(value):
...     return repeat(value).next  # Python 2; on Python 3 use repeat(value).__next__
...
>>> default_value = "default"
>>> d = defaultdict(constant_factory(default_value))
>>> d["a"]
'default'
>>> d["b"] = 5
>>> d["b"]
5
>>> d.keys()
['a', 'b']
Use the setdefault method.
If the key is not already present, setdefault creates the new key with the value provided in the second argument; if the key is already present, it returns the existing value of that key.
val = cache.setdefault('the-key',value)
Use get to extract the value, or to get None if the key is missing.
Combining None with or lets you chain another operation (setdefault):
def get_or_add(cache, key, value_factory):
    # Caveat: a cached falsy value (0, '', None) also falls through the `or`,
    # so value_factory() is called again even though the key exists.
    return cache.get(key) or cache.setdefault(key, value_factory())
Usage:
To keep it lazy, the function expects a callable as the third parameter:
get_or_add(cache, 'the-key', lambda: calculate_value('the-key'))
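One thing to keep in mind with get ... or is that a stored value of 0, '' or None is treated like a miss. A possible variant, sketched below with a hypothetical calculate_value that deliberately returns 0, uses a membership test instead.

```python
def get_or_add(cache, key, value_factory):
    # Membership test instead of `or`, so cached falsy values count as hits.
    if key not in cache:
        cache[key] = value_factory()
    return cache[key]

calls = []

def calculate_value(key):
    # Hypothetical computation that happens to return a falsy value.
    calls.append(key)
    return 0

cache = {}
get_or_add(cache, "the-key", lambda: calculate_value("the-key"))
get_or_add(cache, "the-key", lambda: calculate_value("the-key"))
print(cache)  # {'the-key': 0}
print(calls)  # ['the-key']
```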
I want to rewrite Python's dictionary access mechanism "getitem" to be able to return default values.
The functionality I am looking for is something like
a = dict()
a.setdefault_value(None)
print(a[100])  # this would print None
Any hints?
Thanks
There is already a collections.defaultdict:
from collections import defaultdict
a = defaultdict(lambda:None)
print(a[100])
There is a defaultdict class in the collections module, available since Python 2.5. The constructor takes a function which will be called when a key is not found. This gives more flexibility than simply returning None.
from collections import defaultdict
a = defaultdict(lambda: None)
print(a[100])  # gives None
The lambda is just a quick way to define a one-line function with no name. This code is equivalent:
def nonegetter():
    return None
a = defaultdict(nonegetter)
print(a[100])  # gives None
This is a very useful pattern which gives you a hash showing the count of each unique object. Using a normal dict, you would need special cases to avoid KeyError.
counts = defaultdict(int)
for obj in mylist:
    counts[obj] += 1
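For reference, here is the counting pattern run on made-up data, next to collections.Counter, which the standard library provides for exactly this job. mylist is a hypothetical sample.

```python
from collections import Counter, defaultdict

mylist = ["a", "b", "a", "c", "a"]  # made-up sample data

counts = defaultdict(int)
for obj in mylist:
    counts[obj] += 1  # missing keys start at int() == 0

print(dict(counts))                           # {'a': 3, 'b': 1, 'c': 1}
print(dict(Counter(mylist)) == dict(counts))  # True
```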
use a defaultdict (http://docs.python.org/library/collections.html#collections.defaultdict)
import collections
a = collections.defaultdict(lambda:None)
where the argument to the defaultdict constructor is a function which returns the default value.
Note that if you access an unset entry, it actually sets it to the default:
>>> print(a[100])
None
>>> a
defaultdict(<function <lambda> at 0x38faf0>, {100: None})
If you really do not want to use defaultdict, you need to define your own subclass of dict, like so:
class MyDefaultDict(dict):
    def setdefault_value(self, default):
        self.__default = default
    def __getitem__(self, key):
        try:
            # Delegate to dict directly; self[key] here would recurse forever.
            return dict.__getitem__(self, key)
        except KeyError:  # dicts raise KeyError, not IndexError
            return self.__default
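Another way to sketch such a subclass is with the __missing__ hook, which dict calls on a failed lookup. This is an alternative to overriding __getitem__, not the code from the answer above; the class-level _default attribute is an assumption added so that lookups work before setdefault_value is called.

```python
class MyDefaultDict(dict):
    """Sketch: returns a fixed default for missing keys without storing it."""

    _default = None  # assumed fallback until setdefault_value() is called

    def setdefault_value(self, default):
        self._default = default

    def __missing__(self, key):
        # Called by dict's item lookup only when the key is absent.
        return self._default

a = MyDefaultDict()
a.setdefault_value(None)
print(a[100])    # None
print(100 in a)  # False: unlike defaultdict, the key is not inserted

a.setdefault_value(0)
a["x"] = 5
print(a["x"])    # 5
print(a["y"])    # 0
```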
I wasn't aware of defaultdict, and that's probably the best way to go. If you are opposed to it for some reason, I've written a small wrapper function for this purpose in the past. It has slightly different functionality that may or may not be better for you.
def makeDictGet(d, defaultVal):
    return lambda key: d[key] if key in d else defaultVal
And using it...
>>> d1 = {'a':1,'b':2}
>>> d1Get = makeDictGet(d1, 0)
>>> d1Get('a')
1
>>> d1Get(5)
0
>>> d1['newAddition'] = 'justAddedThisOne' #changing dict after the fact is fine
>>> d1Get('newAddition')
'justAddedThisOne'
>>> del d1['a']
>>> d1Get('a')
0
>>> d1GetDefaultNone = makeDictGet(d1, None) #having more than one such function is fine
>>> print(d1GetDefaultNone('notpresent'))
None
>>> d1Get('notpresent')
0
>>> f = makeDictGet({'k1':'val1','pi':3.14,'e':2.718},False) #just put new dict as arg if youre ok with being unable to change it or access directly
>>> f('e')
2.718
>>> f('bad')
False