Understanding the use of defaultdict in Python [duplicate] - python

I am starting to learn Python and have run across a piece of code that I'm hoping one of you can help me understand.
from collections import defaultdict
dd_dict = defaultdict(dict)
dd_dict["Joel"]["City"] = "Seattle"
result:
{"Joel": {"City": "Seattle"}}
The part I am having a problem with is the third line. Could someone please explain to me what is happening here?

The third line inserts a dictionary inside a dictionary. By passing dict as the default factory to defaultdict, you tell Python to initialize every missing dd_dict key with a new empty dict. The above code is equivalent to
dd_dict["Joel"] = {}
dd_dict["Joel"]["City"] = "Seattle"
If dd_dict were a plain dict and you skipped the first of these two lines, the second would raise a KeyError. A defaultdict avoids such errors by creating a default value the first time a missing key is accessed.
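A minimal sketch of that contrast, assuming Python 3. The names plain, nested and missing_key are only for illustration:

```python
from collections import defaultdict

plain = {}
try:
    plain["Joel"]["City"] = "Seattle"  # plain dict: "Joel" is missing
except KeyError as exc:
    missing_key = exc.args[0]  # the key that could not be found

nested = defaultdict(dict)
nested["Joel"]["City"] = "Seattle"  # the inner dict is created on first access
```

The plain dict fails on the lookup of "Joel" before the assignment to "City" is ever attempted.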

From the documentation of defaultdict:
If default_factory is not None, it is called without arguments to provide a default value for the given key, this value is inserted in the dictionary for the key, and returned.
Since "Joel" doesn't exist as a key yet, the dd_dict["Joel"] part creates an empty dictionary as the value for the key "Joel". The following part, ["City"] = "Seattle", is just like adding a normal key-value pair to a dictionary, in this case the dd_dict["Joel"] dictionary.

The first argument provides the initial value for the default_factory
attribute; it defaults to None. If default_factory is not None, it is
called without arguments to provide a default value for the given key,
this value is inserted in the dictionary for the key, and returned.
dd_dict = defaultdict(dict)
dd_dict["Joel"]["City"] = "Seattle"
In your case, when you evaluate dd_dict["Joel"], there is no such key in dd_dict; a plain dict would raise a KeyError here. defaultdict instead implements the __missing__(key) hook: when the key cannot be found, it calls default_factory without arguments to provide a default value for the given key.
So dd_dict["Joel"] gives you an empty dict {}, and you then add the item ["City"] = "Seattle" to that empty dict, something like:
{}["City"] = "Seattle"
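To make the __missing__ hook concrete, here is a rough sketch of a dict subclass that behaves like a stripped-down defaultdict. DefaultingDict is a made-up name, and this is not how collections.defaultdict is actually implemented, only an approximation of the behaviour quoted from the documentation:

```python
class DefaultingDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None):
        super().__init__()
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self[key] = self.default_factory()  # store, then return
        return value


d = DefaultingDict(dict)
d["Joel"]["City"] = "Seattle"
```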

When a key is looked up with [] and is missing, the __missing__ method is called if the class defines one.
A regular dict does not define it, so a KeyError is raised.
For a defaultdict, the factory you passed as a parameter is called, and its result is stored under the key and returned.
If you made a defaultdict(list) and tried to access a missing key, you would get an empty list back.
Example:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d['missing']
[]
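A common use of defaultdict(list) is grouping. The sketch below uses made-up sample data (pairs) to show the idea:

```python
from collections import defaultdict

# Made-up (category, item) pairs for illustration.
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]

groups = defaultdict(list)
for category, item in pairs:
    groups[category].append(item)  # a new list is created per unseen category
```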

When you access a key of a defaultdict that does not exist, you get whatever the function you supplied returns.
In your case you supplied dict, therefore you get a new empty dictionary:
>>> dict()
{}
>>> from collections import defaultdict
>>> dd_dict = defaultdict(dict)
>>> dd_dict['Joel']
{}
Now you add your key-value pair to this dictionary:
>>> dd_dict["Joel"]["City"] = "Seattle"
>>> dd_dict
defaultdict(<class 'dict'>, {'Joel': {'City': 'Seattle'}})

defaultdict(dict) returns a dictionary object that will return an empty dictionary value if you index into it with a key that doesn't yet exist:
>>> from collections import defaultdict
>>> dd_dict = defaultdict(dict)
>>> dd_dict
defaultdict(<class 'dict'>, {})
>>> dd_dict["Joel"]
{}
>>> dd_dict["anything"]
{}
>>> dd_dict[99]
{}
So the third line creates a key-value pair ("Joel", {}) in dd_dict, then sets the ("City", "Seattle") key-value pair on the empty dictionary.
It's equivalent to:
>>> dd_dict = defaultdict(dict)
>>> dd_dict["Joel"] = {}
>>> dd_dict
defaultdict(<class 'dict'>, {'Joel': {}})
>>> dd_dict["Joel"]["City"] = "Seattle"
>>> dd_dict
defaultdict(<class 'dict'>, {'Joel': {'City': 'Seattle'}})
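One side effect worth knowing: merely indexing a missing key inserts it. A small sketch (variable names are illustrative) showing that membership tests and .get() do not trigger the factory, while [] does:

```python
from collections import defaultdict

dd_dict = defaultdict(dict)

# Neither of these goes through __getitem__, so nothing is inserted.
has_joel = "Joel" in dd_dict
maybe_joel = dd_dict.get("Joel")
size_before = len(dd_dict)

# Indexing does call the factory and stores the result.
_ = dd_dict["Joel"]
size_after = len(dd_dict)
```

So if you only want to check for a key, use in or .get() rather than indexing.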

Related

(STUPIDITY WARNING) Python - nested dictionary creation (and code cleanup?) [duplicate]

I have a dictionary of zoo animals. I want to add each animal to it as a nested dictionary, but I get a KeyError because that particular species has not been added to the dictionary yet.
def add_to_world(self, species, name, zone = 'retreat'):
    self.object_attr[species][name] = {'zone' : zone}
Is there a shortcut for checking whether that species is in the dictionary and creating it if it is not, or do I have to do it the long way and manually check whether that species has been added?
def add_to_world(self, species, name, zone = 'retreat'):
    self.object_attr.setdefault(species, {})[name] = {'zone' : zone}
Here's an example of using defaultdict with a dictionary as a value.
>>> from collections import defaultdict
>>> d = defaultdict(dict)
>>> d["species"]["name"] = {"zone": "1"}
>>> d
defaultdict(<type 'dict'>, {'species': {'name': {'zone': '1'}}})
>>>
If you want further nesting you'll need to make a function to return defaultdict(dict).
def nested_defaultdict():
    return defaultdict(dict)
# Then you can use a dictionary nested to 3 levels
d2 = defaultdict(nested_defaultdict)
d2["species"]["name"]["zone"] = 1
Autovivification of dictionary values can be performed by collections.defaultdict.
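For comparison, here is a sketch of the two approaches from these answers side by side. The standalone functions and the sample animals are invented for illustration; the asker's version is a method on a class that is not shown:

```python
from collections import defaultdict


def add_with_setdefault(store, species, name, zone="retreat"):
    # Works on a plain dict: create the inner dict only if it is missing.
    store.setdefault(species, {})[name] = {"zone": zone}


def add_with_defaultdict(store, species, name, zone="retreat"):
    # Relies on store being a defaultdict(dict).
    store[species][name] = {"zone": zone}


plain_store = {}
auto_store = defaultdict(dict)
for store, add in ((plain_store, add_with_setdefault),
                   (auto_store, add_with_defaultdict)):
    add(store, "lion", "Leo")
    add(store, "lion", "Nala", zone="savanna")
```

Both stores end up with the same contents; setdefault keeps the container a plain dict, which may matter if other code should still get a KeyError on unknown species.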

LEFT JOIN dictionaries in python based on value

#Input
dict_1 = {"conn": {"ts":15,"uid":"ABC","orig_h":"10.10.210.250"}}
dict_2 = {"conn": {"ts":15,"uid":"ABC","orig_h":"10.10.210.252"}}
#Mapper can be modified as required
mapper = {"10.10.210.250":"black","192.168.2.1":"black"}
I receive each dict in a loop. In each iteration I need to check the dict against the mapper and append a flag when its orig_h value (dict_1["conn"]["orig_h"]) matches a key in the mapper, such as "10.10.210.250". I am free to define the mapper however I need.
So the desired result would be:
dict_1 = {"conn": {"ts":15,"uid":"ABC","orig_h":"10.10.210.250", "class":"black"}}
dict_2 will remain unchanged since there is no matching value in mapper.
This is kinda what I want, but it works only if orig_h is an int
import collections
result = collections.defaultdict(dict)
for d in dict_1:
    result[d[int('orig_h')]].update(d)
for d in mapper:
    result[d[int('orig_h')]].update(d)
Not much explaining to be done; if the ip is in the mapper dictionary (if mapper has a key which is that ip) then set the desired attribute of the dict to the value of the key in the mapper dict ('black' here).
def update_dict(dic, mapper):
    ip = dic['conn']['orig_h']
    if ip in mapper:
        dic['conn']['class'] = mapper[ip]
which works exactly as desired:
>>> update_dict(dict_1, mapper)
>>> dict_1
{'conn': {'ts': 15, 'uid': 'ABC', 'orig_h': '10.10.210.250', 'class': 'black'}}
>>> update_dict(dict_2, mapper)
>>> dict_2
{'conn': {'ts': 15, 'uid': 'ABC', 'orig_h': '10.10.210.252'}}
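Extracting the conn value for simplicity (an unguarded lookup raises a KeyError when orig_h is not in the mapper, so check first):
conn_data = dict_1['conn']
if conn_data['orig_h'] in mapper:
    conn_data['class'] = mapper[conn_data['orig_h']]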
A two-liner: first pair each dictionary with its class, keeping only those whose 'orig_h' is among the mapper dictionary's keys; then build a new dictionary inside a list comprehension that adds 'class' to each 'conn' dictionary.
l=[(i,mapper[i['conn']['orig_h']]) for i in (dict_1,dict_2) if i['conn']['orig_h'] in mapper]
print([{'conn':dict(a['conn'],**{'class':b})} for a,b in l])
BTW this answer chooses the dictionaries automatically
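If you would rather not modify the incoming dicts, a non-mutating variant is possible. This is only a sketch: with_class is a hypothetical helper name, and the records list stands in for whatever your loop yields:

```python
def with_class(record, mapper):
    """Return a copy of record with 'class' added when orig_h is in mapper."""
    conn = dict(record["conn"])  # shallow copy so the input is left untouched
    label = mapper.get(conn["orig_h"])  # None when there is no match
    if label is not None:
        conn["class"] = label
    return {**record, "conn": conn}


mapper = {"10.10.210.250": "black", "192.168.2.1": "black"}
records = [
    {"conn": {"ts": 15, "uid": "ABC", "orig_h": "10.10.210.250"}},
    {"conn": {"ts": 15, "uid": "ABC", "orig_h": "10.10.210.252"}},
]
joined = [with_class(r, mapper) for r in records]
```

Like a left join, every input record appears in the output, and only the matching ones gain the extra field.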

How do I append a value to dict key? (AttributeError: 'str' object has no attribute 'append')

Say I have a dictionary with one key (and a value):
dict = {'key': '500'}
Now I want to add a new value '1000' to the same key. However,
dict['key'].append('1000')
just gives me "AttributeError: 'str' object has no attribute 'append'".
If I do
dict['key'] = '1000'
it replaces the previous value.
I'm guessing I have to create a list as a value and somehow append that list as the key's value but I'm not sure how I would go about this. Thanks for any help!
I suggest the usage of a defaultdict that instantiates an empty list when a key is missing.
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d['key'].append(500)
>>> d
defaultdict(<type 'list'>, {'key': [500]})
>>> d['key'].append(1000)
>>> d
defaultdict(<type 'list'>, {'key': [500, 1000]})
I don't recommend having strings/integers as values and then switching to lists once you want to append to a field. Keep it consistent.
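If you prefer to stay with a plain dict, a sketch of the same idea using setdefault (the key names here are just examples):

```python
d = {"key": ["500"]}  # store a list from the start

d["key"].append("1000")
d.setdefault("other", []).append("42")  # creates the list if the key is new
```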

Python - Add to a dictionary using a string

[Python 3.4.2]
I know this question sounds ridiculous, but I can't figure out where I'm messing up. I'm trying to add keys and values to a dictionary by using strings instead of quoted text. So instead of this,
dict['key'] = value
this:
dict[key] = value
When I run the command above, I get this error:
TypeError: 'str' object does not support item assignment
I think Python is thinking that I'm trying to create a string, not add to a dictionary. I'm guessing I'm using the wrong syntax. This is what I'm trying to do:
dict[string_for_key][string_for_value] = string_for_deeper_value
I want this^ command to do this:
dict = {string_for_key: string_for_value: string_for_deeper_value}
I'm getting this error:
TypeError: 'str' object does not support item assignment
I should probably give some more context. I'm:
creating one dictionary
creating a copy of it (because I need to edit the dictionary while iterating through it)
iterating through the first dictionary while running some queries
trying to assign a query's result as a value for each "key: value" in the dictionary.
Here's a picture to show what I mean:
key: value: query_as_new_value
-----EDIT-----
Sorry, I should have clarified: the dictionary's name is not actually 'dict'; I called it 'dict' in my question to show that it was a dictionary.
-----EDIT-----
I'll just post the whole process I'm writing in my script. The error occurs during the last command of the function. Commented out at the very bottom are some other things I've tried.
from collections import defaultdict
global query_line, pericope_p, pericope_f, pericope_e, pericope_g
def _pre_query(self, typ):
    with open(self) as f:
        i = 1
        for line in f:
            if i == query_line:
                break
            i += 1
    target = repr(line.strip())
    ###skipping some code
    pericope_dict_post[self][typ] = line.strip()
    #^Outputs error TypeError: 'str' object does not support item assignment
    return
pericope_dict_pre = {'pericope-p.txt': 'pericope_p',
                     'pericope-f.txt': 'pericope_f',
                     'pericope-e.txt': 'pericope_e',
                     'pericope-g.txt': 'pericope_g'}
pericope_dict_post = defaultdict(dict)
#pericope_dict_post = defaultdict(list)
#pericope_dict_post = {}
for key, value in pericope_dict_pre.items():
    pericope_dict_post[key] = value
    #^Works
    #pericope_dict_post.update({key: value})
    #^Also works
    #pericope_dict_post.append(key)
    #^AttributeError: 'dict' object has no attribute 'append'
    #pericope_dict_post[key].append(value)
    #^AttributeError: 'dict' object has no attribute 'append'
    _pre_query(key, value)
-----FINAL EDIT-----
Matthias helped me figure it out, although acushner had the solution too. I was trying to make the dictionary three "levels" deep, but Python dictionaries cannot work this way. Instead, I needed to create a nested dictionary. To use an illustration, I was trying to do {key: value: value} when I needed to do {key: {key: value}}.
To apply this to my code, I need to create the [second] dictionary with all three strings at once. So instead of this:
my_dict[key] = value1
my_dict[key][value1] = value2
I need to do this:
my_dict[key][value1] = value2
Thanks a ton for all your help guys!
You could create a dictionary that expands by itself (Python 3 required).
class AutoTree(dict):
    """Dictionary with unlimited levels"""
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value
Use it like this.
data = AutoTree()
data['a']['b'] = 'foo'
print(data)
Result
{'a': {'b': 'foo'}}
Now I'm going to explain your problem with the message TypeError: 'str' object does not support item assignment.
This code will work
from collections import defaultdict
data = defaultdict(dict)
data['a']['b'] = 'c'
data['a'] doesn't exist, so the default value dict is used. Now data['a'] is a dict and this dictionary gets a new value with the key 'b' and the value 'c'.
This code won't work
from collections import defaultdict
data = defaultdict(dict)
data['a'] = 'c'
data['a']['b'] = 'c'
The value of data['a'] is defined as the string 'c'. Now you can only perform string operations with data['a']. You can't use it as a dictionary now and that's why data['a']['b'] = 'c' fails.
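The failing case can be reproduced and inspected in a few lines. This is a sketch; the replacement value "query result" is a placeholder, not something from the original script:

```python
from collections import defaultdict

data = defaultdict(dict)
data["a"] = "c"  # the value is now a str, not a dict

try:
    data["a"]["b"] = "c"  # item assignment on a str
except TypeError as exc:
    message = str(exc)

# One possible fix: assign the nested dict in a single step.
data["a"] = {"c": "query result"}
```

The default factory never runs here because the key 'a' already exists; defaultdict only steps in for missing keys.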
First, do not use dict as your variable name, as it shadows the built-in of the same name.
Second, all you want is a nested dictionary, no?
from collections import defaultdict
d = defaultdict(dict)
d[string_for_key][string_for_value] = 'snth'
Another way, as @Matthias suggested, is to create a bottomless dictionary:
dd = lambda: defaultdict(dd)
d = dd()
d[string_for_key][string_for_value] = 'snth'
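A bottomless defaultdict prints with a noisy repr, so you may want to turn it back into ordinary dicts once it is filled. A sketch, where tree and to_plain are hypothetical helper names:

```python
from collections import defaultdict


def tree():
    # Same idea as the lambda version, written as a named function.
    return defaultdict(tree)


def to_plain(node):
    """Recursively convert nested defaultdicts into ordinary dicts."""
    if isinstance(node, defaultdict):
        return {key: to_plain(value) for key, value in node.items()}
    return node


d = tree()
d["a"]["b"]["c"] = "snth"
plain = to_plain(d)
```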
You can do something like this:
>>> my_dict = {}
>>> key = 'a' # if key is not defined before it will raise NameError
>>> my_dict[key] = [1]
>>> my_dict[key].append(2)
>>> my_dict
{'a': [1, 2]}
Note: dict is a built-in, so don't use it as a variable name.

Initializing a dictionary in python with a key value and no corresponding values

I was wondering if there was a way to initialize a dictionary in python with keys but no corresponding values until I set them. Such as:
Definition = {'apple': , 'ball': }
and then later I can set them:
Definition[key] = something
I only want to initialize the keys; I won't know the corresponding values until I set them later. Basically, I know which keys I want, and I will add the values as they are found. Thanks.
Use the fromkeys function to initialize a dictionary with any default value. In your case, you will initialize with None since you don't have a default value in mind.
empty_dict = dict.fromkeys(['apple','ball'])
this will initialize empty_dict as:
empty_dict = {'apple': None, 'ball': None}
As an alternative, if you wanted to initialize the dictionary with some default value other than None, you can do:
default_value = 'xyz'
nonempty_dict = dict.fromkeys(['apple','ball'],default_value)
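One caveat with fromkeys: the default value is a single object shared by every key, which matters if it is mutable. A short sketch of the pitfall and a comprehension-based alternative (variable names are illustrative):

```python
keys = ["apple", "ball"]

shared = dict.fromkeys(keys, [])
shared["apple"].append(1)  # also visible under "ball": same list object

separate = {k: [] for k in keys}  # a fresh list per key
separate["apple"].append(1)
```

With immutable defaults such as None or 'xyz' the sharing is harmless.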
You could initialize them to None.
You could use a defaultdict. It lets you read and set dictionary values without worrying about whether the key already exists. If you access a key that has not been initialized yet, it returns (and stores) the value your factory produces (None in the example below).
from collections import defaultdict
your_dict = defaultdict(lambda : None)
It would be good to know what your purpose is, why you want to initialize the keys in the first place. I am not sure you need to do that at all.
1) If you want to count the number of occurrences of keys, you can just do:
Definition = {}
# ...
Definition[key] = Definition.get(key, 0) + 1
2) If you want to get None (or some other value) later for keys that you did not encounter, again you can just use the get() method:
Definition.get(key) # returns None if key not stored
Definition.get(key, default_other_than_none)
3) For all other purposes, you can just use a list of the expected keys, and check if the keys found later match those.
For example, if you only want to store values for those keys:
expected_keys = ['apple', 'banana']
# ...
if key_found in expected_keys:
Definition[key_found] = value
Or if you want to make sure all expected keys were found:
assert(all(key in Definition for key in expected_keys))
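If you want to know which expected keys are still missing rather than just asserting, a set difference is one option. A sketch with made-up sample data:

```python
expected_keys = {"apple", "banana"}
Definition = {"apple": 3}  # sample data for illustration

# difference() accepts any iterable; iterating a dict yields its keys.
missing = expected_keys.difference(Definition)
```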
You can initialize the values as empty strings and fill them in later as they are found.
dictionary = {'one':'','two':''}
dictionary['one']=1
dictionary['two']=2
Comprehension could be also convenient in this case:
# from a list
keys = ["k1", "k2"]
d = {k:None for k in keys}
# or from another dict
d1 = {"k1" : 1, "k2" : 2}
d2 = {k:None for k in d1.keys()}
d2
# {'k1': None, 'k2': None}
q = input("Apple")
w = input("Ball")
Definition = {'apple': q, 'ball': w}
Based on the clarifying comment by @user2989027, I think a good solution is the following:
definition = ['apple', 'ball']
data = {'orange':1, 'pear':2, 'apple':3, 'ball':4}
my_data = {}
for k in definition:
    try:
        my_data[k] = data[k]
    except KeyError:
        pass
print(my_data)
I tried not to do anything fancy here. I set up my data and an empty dictionary. I then loop through a list of strings that represent potential keys in my data dictionary. I copy each value from data to my_data, but allow for the case where data may not have the key that I want.
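The same filtering can also be written as a dictionary comprehension. This is just an alternative sketch; "kiwi" is added to the key list here only to show a key that is absent from data:

```python
definition = ["apple", "ball", "kiwi"]  # "kiwi" is deliberately absent from data
data = {"orange": 1, "pear": 2, "apple": 3, "ball": 4}

# Keep only the wanted keys that actually exist in data.
my_data = {k: data[k] for k in definition if k in data}
```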
