(STUPIDITY WARNING) Python - nested dictionary creation (and code cleanup?) [duplicate] - python

I have a dictionary of zoo animals, keyed by species. I want to add each animal as a nested dictionary under its species, but I get a KeyError when that species has not been added to the dictionary yet.
def add_to_world(self, species, name, zone = 'retreat'):
    self.object_attr[species][name] = {'zone' : zone}
Is there a shortcut for checking whether that species is in the dictionary and creating it if it is not, or do I have to do it the long way and manually check whether the species has been added?

def add_to_world(self, species, name, zone = 'retreat'):
    self.object_attr.setdefault(species, {})[name] = {'zone' : zone}
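As a rough sketch of how the setdefault version behaves, assume a hypothetical Zoo class whose object_attr starts out as an empty dict. The class name and the sample animals are made up for illustration; only add_to_world and object_attr come from the question.

```python
class Zoo:
    """Hypothetical container; only object_attr comes from the question."""

    def __init__(self):
        self.object_attr = {}

    def add_to_world(self, species, name, zone='retreat'):
        # setdefault returns the existing inner dict for species,
        # or inserts and returns a new empty one if it is missing.
        self.object_attr.setdefault(species, {})[name] = {'zone': zone}


zoo = Zoo()
zoo.add_to_world('lion', 'Leo')
zoo.add_to_world('lion', 'Nala', zone='savanna')
zoo.add_to_world('otter', 'Ollie')
print(zoo.object_attr)
```

Both lions end up under the same 'lion' key, and no KeyError is raised for a species seen for the first time.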

Here's an example of using defaultdict with a dictionary as a value.
>>> from collections import defaultdict
>>> d = defaultdict(dict)
>>> d["species"]["name"] = {"zone": "1"}
>>> d
defaultdict(<type 'dict'>, {'species': {'name': {'zone': '1'}}})
>>>
If you want further nesting you'll need to make a function to return defaultdict(dict).
def nested_defaultdict():
    return defaultdict(dict)
# Then you can use a dictionary nested to 3 levels
d2 = defaultdict(nested_defaultdict)
d2["species"]["name"]["zone"] = 1

Autovivification of dictionary values can be performed by collections.defaultdict.

Related

LEFT JOIN dictionaries in python based on value

#Input
dict_1 = {"conn": {"ts":15,"uid":"ABC","orig_h":"10.10.210.250"}}
dict_2 = {"conn": {"ts":15,"uid":"ABC","orig_h":"10.10.210.252"}}
#Mapper can be modified as required
mapper = {"10.10.210.250":"black","192.168.2.1":"black"}
I receive each dict in a loop. In each iteration I need to check the dict against the mapper and append a flag when the dict's orig_h value matches a key in mapper (for example, 10.10.210.250). I am free to define the mapper however I need.
So the desired result would be:
dict_1 = {"conn": {"ts":15,"uid":"ABC","orig_h":"10.10.210.250", "class":"black"}}
dict_2 will remain unchanged since there is no matching value in mapper.
This is kinda what I want, but it works only if orig_h is an int
import collections
result = collections.defaultdict(dict)
for d in dict_1:
    result[d[int('orig_h')]].update(d)
for d in mapper:
    result[d[int('orig_h')]].update(d)
Not much explaining to be done: if the IP is a key in the mapper dictionary, set the desired attribute of the dict to the corresponding mapper value ('black' here).
def update_dict(dic, mapper):
    ip = dic['conn']['orig_h']
    if ip in mapper:
        dic['conn']['class'] = mapper[ip]
which works exactly as desired:
>>> update_dict(dict_1, mapper)
>>> dict_1
{'conn': {'ts': 15, 'uid': 'ABC', 'orig_h': '10.10.210.250', 'class': 'black'}}
>>> update_dict(dict_2, mapper)
>>> dict_2
{'conn': {'ts': 15, 'uid': 'ABC', 'orig_h': '10.10.210.252'}}
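Extracting the conn value for simplicity, and checking membership first so an unmapped IP does not raise a KeyError:
conn_data = dict_1['conn']
if conn_data['orig_h'] in mapper:
    conn_data['class'] = mapper[conn_data['orig_h']]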
A two-liner: the list comprehension keeps each dictionary whose 'orig_h' is a key in mapper, paired with its class. The second line then builds a new dictionary for each pair, adding 'class' to the dictionary stored under the 'conn' key.
l=[(i,mapper[i['conn']['orig_h']]) for i in (dict_1,dict_2) if i['conn']['orig_h'] in mapper]
print([{'conn':dict(a['conn'],**{'class':b})} for a,b in l])
BTW, this answer picks out the matching dictionaries automatically; dictionaries without a match are left out of the result.
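Since the question mentions handling each dict inside a loop, here is a hedged sketch of a non-mutating variant that keeps unmatched records unchanged, the way a left join would. The names with_class, records and flagged are hypothetical; the sample data is taken from the question.

```python
import copy

mapper = {"10.10.210.250": "black", "192.168.2.1": "black"}

records = [
    {"conn": {"ts": 15, "uid": "ABC", "orig_h": "10.10.210.250"}},
    {"conn": {"ts": 15, "uid": "ABC", "orig_h": "10.10.210.252"}},
]


def with_class(record, mapper):
    """Return a copy of record, adding 'class' when orig_h is in mapper."""
    result = copy.deepcopy(record)  # leave the caller's dict untouched
    ip = result["conn"]["orig_h"]
    if ip in mapper:
        result["conn"]["class"] = mapper[ip]
    return result


flagged = [with_class(r, mapper) for r in records]
for r in flagged:
    print(r)
```

The first record gains "class": "black", the second comes back as it was, and the originals in records are not modified.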

Understanding the use of defaultdict in Python [duplicate]

I am starting to learn Python and have run across a piece of code that I'm hoping one of you can help me understand.
from collections import defaultdict
dd_dict = defaultdict(dict)
dd_dict["Joel"]["City"] = "Seattle"
result:
{"Joel": {"City": "Seattle"}}
The part I am having a problem with is the third line. Could someone please explain to me what is happening here?
The third line inserts a dictionary inside a dictionary. By passing dict as the default factory of the defaultdict, you are telling Python to initialize every new dd_dict value with an empty dict. The above code is equivalent to
dd_dict["Joel"] = {}
dd_dict["Joel"]["City"] = "Seattle"
With a plain dict, that second line on its own (without the first) would raise a KeyError. So defaultdicts are a way of avoiding such errors by supplying a default value for missing keys.
From the documentation of defaultdict:
If default_factory is not None, it is called without arguments to provide a default value for the given key, this value is inserted in the dictionary for the key, and returned.
Since "Joel" doesn't exist as a key yet, the dd_dict["Joel"] part creates an empty dictionary as the value for the key "Joel". The following part, ["City"] = "Seattle", is just like adding a normal key-value pair to a dictionary, in this case the dd_dict["Joel"] dictionary.
The first argument provides the initial value for the default_factory
attribute; it defaults to None. If default_factory is not None, it is
called without arguments to provide a default value for the given key,
this value is inserted in the dictionary for the key, and returned.
dd_dict = defaultdict(dict)
dd_dict["Joel"]["City"] = "Seattle"
In your case, when you evaluate dd_dict["Joel"], there is no such key in dd_dict. A plain dict would raise a KeyError here, but defaultdict implements the __missing__(key) hook: when the key cannot be found, it calls default_factory without arguments to provide a default value for the given key.
So dd_dict["Joel"] gives you an empty dict {}, and you then add the item ["City"] = "Seattle" to that empty dict, something like:
{}["City"] = "Seattle"
When a key is looked up and is missing, the __missing__ method is called if the class defines one.
A regular dict does not define it, so a KeyError is raised.
For a defaultdict, the factory you passed as a parameter is called, and its result is stored under the key and returned.
If you made a defaultdict(list) and tried to access a missing key, you would get an empty list back.
Example:
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> d['missing']
[]
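To make the __missing__ mechanism concrete, here is a simplified imitation of defaultdict written as a dict subclass. FactoryDict is a hypothetical name, and this is only a sketch of the idea, not how the standard library implements it.

```python
class FactoryDict(dict):
    """Hypothetical, simplified imitation of defaultdict for illustration."""

    def __init__(self, default_factory=None):
        super().__init__()
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()  # called with no arguments
        self[key] = value               # stored under the missing key
        return value


d = FactoryDict(list)
d['missing'].append(1)
print(d)
```

Without a factory, FactoryDict()['x'] still raises a KeyError, just like a regular dict.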
When you access a key of a defaultdict that does not exist, you get whatever the function you supplied returns.
In your case you supplied dict, therefore you get a new empty dictionary:
>>> dict()
{}
>>> from collections import defaultdict
>>> dd_dict = defaultdict(dict)
>>> dd_dict['Joel']
{}
Now you add your key-value pair to this dictionary:
>>> dd_dict["Joel"]["City"] = "Seattle"
>>> dd_dict
defaultdict(<class 'dict'>, {'Joel': {'City': 'Seattle'}})
defaultdict(dict) returns a dictionary object that will return an empty dictionary value if you index into it with a key that doesn't yet exist:
>>> from collections import defaultdict
>>> dd_dict = defaultdict(dict)
>>> dd_dict
defaultdict(<class 'dict'>, {})
>>> dd_dict["Joel"]
{}
>>> dd_dict["anything"]
{}
>>> dd_dict[99]
{}
So the third line creates a key-value pair ("Joel", {}) in dd_dict, then sets the ("City", "Seattle") key-value pair on the empty dictionary.
It's equivalent to:
>>> dd_dict = defaultdict(dict)
>>> dd_dict["Joel"] = {}
>>> dd_dict
defaultdict(<class 'dict'>, {'Joel': {}})
>>> dd_dict["Joel"]["City"] = "Seattle"
>>> dd_dict
defaultdict(<class 'dict'>, {'Joel': {'City': 'Seattle'}})
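One consequence worth seeing in code: merely indexing a missing key stores a new entry, while membership tests and get() do not call the factory. A small sketch (the variable names has_joel, got, keys_before and keys_after are only for illustration):

```python
from collections import defaultdict

dd_dict = defaultdict(dict)

# Membership tests and .get() do not call the factory, so nothing is stored.
has_joel = "Joel" in dd_dict
got = dd_dict.get("Joel")
keys_before = list(dd_dict)

# Indexing a missing key does call the factory and stores the result.
dd_dict["Joel"]
keys_after = list(dd_dict)

print(has_joel, got, keys_before, keys_after)
```

So if you only want to check for a key without creating it, use in or get() rather than square brackets.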

dynamic manipulation via dictionary keys

I need to build a dictionary dynamically in Python. The input contains keys whose structure I do not know in advance, as in this example:
'properties[props][defaultValue]': ''
'properties[props][dt_precision]': ''
'properties[props][dt_table]': ''
'properties[props][dtfield]': ''
I need to convert to a dictionary like this example:
properties['props']['dt_table'] = 1
properties['props']['dt_table'] = 2
I don't know the real information, but I know that the format is like this:
variable[index] = value
variable[index][index_1] = value
variable[index][index_1][index_2] = value
variable[index][index_1][index_2][index_3] = value
My problem is: how can I build a dictionary with arbitrarily many layers of keys? In other words, how can I add a deep hierarchy of subkeys dynamically?
In javascript I use references like this:
f=var['key'];
f['key'] = {};
f = f['key'];
f['key'] = 120;
Which allows me to construct:
var['key']['key'] = 120
but the equivalent in python does not work.
Naive approach
The simplest approach involves creating a new dictionary on each sub-level by hand:
var = {}
var['key'] = {}
var['key']['key'] = 120
print(var['key']['key'])
print(var)
Which gives following output:
120
{'key': {'key': 120}}
Autovivification
You can automate it by using defaultdict, as suggested by @martineau in the comments:
from collections import defaultdict
def tree():
    return defaultdict(tree)
v2 = tree()
v2['key']['key'] = 120
print(v2['key']['key'])
print(v2)
With output:
120
defaultdict(<function tree at 0x1ae7d88>, {'key': defaultdict(<function tree at 0x1ae7d88>, {'key': 120})})
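For the bracketed input keys from the question, one possible approach is to split each flat key into its parts and walk down with setdefault, which mirrors the reference-following idiom shown in the JavaScript snippet. The helper name set_path and the regular expression are assumptions made for this sketch; it expects every intermediate level to be a dict.

```python
import re


def set_path(root, flat_key, value):
    """Store value in nested dicts under a key like 'a[b][c]'."""
    # 'properties[props][dt_table]' -> ['properties', 'props', 'dt_table']
    parts = re.findall(r"[^\[\]\s]+", flat_key)
    node = root
    for part in parts[:-1]:
        # Walk down, creating intermediate dicts as needed.
        node = node.setdefault(part, {})
    node[parts[-1]] = value


data = {}
set_path(data, "properties[props][defaultValue]", "")
set_path(data, "properties[props][dt_table]", 1)
set_path(data, "variable[index][index_1] [index_2]", 120)
print(data)
```

Whitespace between bracket groups is ignored, and the result is a plain nested dict rather than a defaultdict, so later lookups of missing keys still raise a KeyError.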

How can you return a default value instead of a key error when accessing a multi-dimensional dictionary in python?

I'm trying to get a value out of a multi-dimensional dictionary, which looks for example like this:
count = {'animals': {'dogs': {'chihuahua': 23}}}
So if I want to know how many chihuahuas I have, I print count['animals']['dogs']['chihuahua']
But I want to access count['vehicles']['cars']['vw golf'] too, and get 0 instead of a KeyError.
Currently I'm doing this:
if 'vehicles' not in count:
    count['vehicles'] = {}
if 'cars' not in count['vehicles']:
    count['vehicles']['cars'] = {}
if 'vw golf' not in count['vehicles']['cars']:
    count['vehicles']['cars']['vw golf'] = 0
How can I do this better?
I'm thinking of some type of class which inherits from dict, but that's just an idea.
You can just do:
return count.get('vehicles', {}).get('cars', {}).get('vw golf', 0)
Basically, each get returns an empty dictionary if the key is not found, and the last get returns the count or 0.
This works as long as every intermediate value is a dictionary. It will not raise a KeyError, but you may have to tweak it if a level can hold another data type.
Demo
>>> count = {'animals': {'dogs': {'chihuahua': 23}}}
>>> count.get('vehicles', {}).get('cars', {}).get('vw golf', 0)
0
>>> count = {'vehicles': {'cars': {'vw golf': 100}}}
>>> count.get('vehicles', {}).get('cars', {}).get('vw golf', 0)
100
>>>
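The chained get calls can be generalized to any depth with a small helper. deep_get is a hypothetical name, and this sketch also returns the default when an intermediate value is not a dictionary.

```python
def deep_get(mapping, keys, default=None):
    """Follow keys through nested dicts; return default if any step fails."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


count = {'animals': {'dogs': {'chihuahua': 23}}}
print(deep_get(count, ['animals', 'dogs', 'chihuahua'], 0))
print(deep_get(count, ['vehicles', 'cars', 'vw golf'], 0))
```

Unlike the setdefault or defaultdict approaches, this never adds keys to count.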
Use a combination of collections.defaultdict and collections.Counter:
from collections import Counter
from collections import defaultdict
counts = defaultdict(lambda: defaultdict(Counter))
Usage:
>>> counts['animals']['dogs']['chihuahua'] = 23
>>> counts['vehicles']['cars']['vw golf'] = 100
>>>
>>> counts['animals']['dogs']['chihuahua']
23
>>> # No fancy cars yet, Counter defaults to 0
... counts['vehicles']['cars']['porsche']
0
>>>
>>> # No bikes yet, empty counter
... counts['vehicles']['bikes']
Counter()
The lambda in the construction of the defaultdict is needed because defaultdict expects a factory. So lambda: defaultdict(Counter) basically creates a function that will return defaultdict(Counter) when called - which is what's required to create the multi-dimensional dictionary you described:
A dictionary whose values default to a dictionary whose values default to an instance of Counter.
The advantage of this solution is that you don't have to keep track of which categories you already defined. You can simply assign two new categories and a new count in one go, and use the same syntax to add a new count for existing categories:
>>> counts['food']['fruit']['bananas'] = 42
>>> counts['food']['fruit']['apples'] = 3
(This assumes that you'll always want exactly three dimensions to your data structure, the first two being category dictionaries and the third being a Counter where the actual counts of things will be stored).
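Because the innermost level is a Counter, the same structure also supports incrementing counts that do not exist yet. A short sketch, where the sightings list is made-up sample data:

```python
from collections import Counter, defaultdict

counts = defaultdict(lambda: defaultdict(Counter))

sightings = [
    ('animals', 'dogs', 'chihuahua'),
    ('animals', 'dogs', 'chihuahua'),
    ('vehicles', 'cars', 'vw golf'),
]
for category, kind, item in sightings:
    # Missing levels are created on the way down; Counter starts at 0.
    counts[category][kind][item] += 1

print(counts['animals']['dogs']['chihuahua'])
print(counts['vehicles']['cars']['vw golf'])
```

Note that reading a missing category or kind still creates an empty entry at that level, since both outer levels are defaultdicts.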

create a new dict based on old dict

I have the following dict:
abc = {"type":"insecure","id":"1","name":"peter"}
What I want is a new dict based on the old one, in which the key "type" is gone and the key "id" is renamed to "identity". The new dict will look as follows:
xyz = {"identity":"1","name":"peter"}
The solution that I came up with is as follows:
abc = {"type":"insecure","id":"1","name":"peter"}
xyz = {}
black_list_values = set(("type","id"))
for k in abc:
    if k not in black_list_values:
        xyz[k] = abc[k]
xyz["identity"] = abc["id"]
I was wondering whether this is the fastest and most efficient way to do that. Right now, "abc" has only three values. If "abc" were much bigger, with many values, would my solution still be fast and efficient?
You can use a dict-comprehension:
abc = {"type":"insecure","id":"1","name":"peter"}
black_list = {"type"}
rename = {"id":"identity"}  # use a mapping dictionary in case you want to rename multiple items
dic = {rename.get(key, key): val for key, val in abc.items() if key not in black_list}
print(dic)
output:
{'identity': '1', 'name': 'peter'}
You want to create a new dictionary anyway. You can iterate over keys/values in a dict comprehension, which is more compact, but functionally the same:
abc = {"type":"insecure","id":"1","name":"peter"}
black_list_values = set(("type","id"))
xyz = {k:v for k,v in abc.items() if k not in black_list_values}
xyz["identity"] = abc["id"]
Without iterating through the original dict:
abc = {"type":"insecure","id":"1","name":"peter"}
xyz = abc.copy()
xyz.pop('type')
xyz['identity'] = xyz.pop('id')
If all the keys are pre-known and it's a short list of keys, then the obvious solution is just
xyz = {"identity":abc["id"],"name":abc["name"]}
Another simple suggestion would be to use the dict() function:
abc = {"type":"insecure","id":"1","name":"peter"}
xyz = dict(abc)
Then perform the replacement in any way you see fit =-)
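If the same drop-and-rename step is needed in several places, the comprehension approach can be wrapped in a small function. remap and its drop and rename parameters are hypothetical names for this sketch. It makes a single pass over the source, so the work grows linearly with the number of keys.

```python
def remap(source, drop=(), rename=None):
    """Return a new dict without the drop keys, renaming per rename."""
    rename = rename or {}
    return {
        rename.get(key, key): value
        for key, value in source.items()
        if key not in drop
    }


abc = {"type": "insecure", "id": "1", "name": "peter"}
xyz = remap(abc, drop={"type"}, rename={"id": "identity"})
print(xyz)
```

Passing drop as a set keeps each membership check constant time on average, which matters only when the list of keys to drop is long.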
