Dictionary object to decision tree in Pydot - python

I have a dictionary object as such:
menu = {'dinner':{'chicken':'good','beef':'average','vegetarian':{'tofu':'good','salad':{'caeser':'bad','italian':'average'}},'pork':'bad'}}
I'm trying to create a graph (decision tree) using pydot with this 'menu' data.
'Dinner' would be the top node and its values (chicken, beef, etc.) are below it. Referring to the link, the graph function takes two parameters: a source and a node.
It would look something like this:
Except 'king' would be 'dinner' and 'lord' would be 'chicken', 'beef', etc.
My question is: how do I access a key inside a value? To create a tree from this data, I feel like I need a loop that checks whether there is a value for a specific key and plots it. I'm not sure how to get at the values of an arbitrary dictionary object (one that isn't necessarily called 'dinner' and doesn't necessarily have the same number of elements).
Any suggestions on how to graph it?

Using a recursive function
You might want to consider using a recursive function (like visit in my code below) so that you can process an arbitrarily nested dictionary. The function takes a parent parameter to keep track of the incoming node. Note also that isinstance is used to check whether the value for a key is itself a dictionary; in that case, visit is called recursively.
import pydot
menu = {'dinner':
            {'chicken':'good',
             'beef':'average',
             'vegetarian':{
                   'tofu':'good',
                   'salad':{
                            'caeser':'bad',
                            'italian':'average'}
                   },
             'pork':'bad'}
        }
def draw(parent_name, child_name):
    edge = pydot.Edge(parent_name, child_name)
    graph.add_edge(edge)
def visit(node, parent=None):
    for k,v in node.items(): # On Python 2, node.iteritems() avoids building a list
        if isinstance(v, dict):
            # We start with the root node whose parent is None
            # we don't want to graph the None node
            if parent:
                draw(parent, k)
            visit(v, k)
        else:
            draw(parent, k)
            # drawing the label using a distinct name
            draw(k, k+'_'+v)
graph = pydot.Dot(graph_type='graph')
visit(menu)
graph.write_png('example1_graph.png')
Resulting tree structure
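If you want to check the traversal without pydot installed, here is a minimal sketch that collects the same edges into a plain list instead of drawing them. The name collect_edges is my own (it is not part of pydot), and the leaf-label naming simply copies the k+'_'+v convention from the code above.

```python
def collect_edges(node, parent=None):
    """Return (parent, child) pairs for a nested dict, depth first."""
    edges = []
    for key, value in node.items():
        # The top-level key has no parent, so no edge is recorded for it
        if parent is not None:
            edges.append((parent, key))
        if isinstance(value, dict):
            # Nested dictionary: recurse with the current key as the parent
            edges.extend(collect_edges(value, key))
        else:
            # Leaf: give the label node a distinct name, e.g. 'chicken_good'
            edges.append((key, key + '_' + value))
    return edges

menu = {'dinner': {'chicken': 'good',
                   'beef': 'average',
                   'vegetarian': {'tofu': 'good',
                                  'salad': {'caeser': 'bad',
                                            'italian': 'average'}},
                   'pork': 'bad'}}

edges = collect_edges(menu)
for source, target in edges:
    print(source, '->', target)
```

Each pair could then be handed to pydot.Edge(source, target), as in the draw function above.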

Your question isn't entirely clear to me, but the way of accessing a dictionary key's value in Python is simply:
dictionary[key]
That gives you that key's value. If the key is not in the dictionary, Python raises a KeyError, so if you are working with dictionaries and you're not sure whether the key you're requesting is present, you have two options.
If-statement (preferred):
if key in dictionary:
    return dictionary[key]
Try-except:
try:
    return dictionary[key]
except KeyError:
    pass
If you don't know the keys in your dictionary and you need to get them, you can simply call dictionary.keys(), which returns all of the keys in the dictionary (a list in Python 2, a view object in Python 3).
The value stored under a key can be any object, including another dictionary. Thus, to find out the value of "tofu", for example, we'd do the following:
menu['dinner']['vegetarian']['tofu']
# returns value 'good'
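Putting the membership check and the chained lookup together, a rough sketch of a helper that follows a list of keys and gives up gracefully when one is missing might look like this. The name lookup and its default parameter are my own invention for illustration, not something built into Python.

```python
def lookup(tree, path, default=None):
    """Follow the keys in path through nested dicts; return default if any is missing."""
    current = tree
    for key in path:
        # Stop as soon as we hit a non-dict value or a key that is absent
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

menu = {'dinner': {'chicken': 'good',
                   'beef': 'average',
                   'vegetarian': {'tofu': 'good',
                                  'salad': {'caeser': 'bad',
                                            'italian': 'average'}},
                   'pork': 'bad'}}

tofu = lookup(menu, ['dinner', 'vegetarian', 'tofu'])
fish = lookup(menu, ['dinner', 'fish'], default='not on the menu')
print(tofu, fish)
```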

Related

Multi-level defaultdict with variable depth and with list and int type

I am trying to create a multi-level dict with variable depth and with list and int type.
Data structure is like below
A
--B1
-----C1=1
-----C2=[1]
--B2=[3]
D
--E
----F
------G=4
In the case of above data structure, the last value can be an int or list.
If the above data structure held only ints, I could easily build it with the code below:
from collections import defaultdict
f = lambda: defaultdict(f)
d = f()
d['A']['B1']['C1'] = 1
But since the last value can be either a list or an int, it becomes a bit problematic for me.
Now, I can insert data into a list in two ways:
d['A']['B1']['C2']= [1]
d['A']['B1']['C2'].append([2])
But when I use only the append method, it causes an error.
The error is:
AttributeError: 'collections.defaultdict' object has no attribute 'append'
So is there any way to use only the append method for a list?
There's no way you can use your current defaultdict-based structure to make d['A']['B1']['C2'].append(1) work properly if the 'C2' key doesn't already exist, since the data structure can't tell that the unknown key should correspond to a list rather than another layer of dictionary. It doesn't know what method you're going to call on the value it returns, so it can't know it shouldn't return a dictionary (like it did when it first looked up 'A' and 'B1').
This isn't an issue for bare integers, since for those you're assigning directly to a new key (and all the earlier levels are dictionaries). When you're assigning, the data structure isn't creating the value, you are, so you can use any type you want.
Now, if your keys are distinctive in some way, so that given a key like 'C2' you can know for sure that it should correspond to a list, you may have a chance. You can write your own dict subclass, defining a __missing__ method to handle lookups of keys that don't exist yet in your own special way:
class Tree(dict):
    def __missing__(self, key):
        if key_corresponds_to_list(key): # magic from somewhere
            result = self[key] = []
        else:
            result = self[key] = Tree()
        return result
    # you might also want a custom __repr__
Here's an example run with a magic key function that makes any even-length key default to a list, while an odd-length key defaults to a dict:
>>> def key_corresponds_to_list(key):
...     return len(key) % 2 == 0
...
>>> t = Tree()
>>> t["A"]["B"]["C2"].append(1) # the default value for C2 is a list because it's even length
>>> t
{'A': {'B': {'C2': [1]}}}
>>> t["A"]["B"]["C10"]["D"] = 2 # C10's another layer of dict, since its length is odd
>>> t
{'A': {'B': {'C10': {'D': 2}, 'C2': [1]}}} # it didn't matter what length D was though
You probably won't actually want to use a global function to control the class like this; I just did that as an example. If you go with this approach, I'd suggest putting the logic directly into the __missing__ method (or maybe passing a function as a parameter, like defaultdict does with its factory function).
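For what it's worth, the "pass a function as a parameter" variant could look roughly like the sketch below. The constructor argument is_list_key is a name I made up for this example, and the even-length rule is the same toy rule as above, not anything you'd use in real code.

```python
class Tree(dict):
    """Nested dict whose missing keys default to a list or to another Tree."""

    def __init__(self, is_list_key):
        super().__init__()
        # Hypothetical predicate: returns True when a key should hold a list
        self.is_list_key = is_list_key

    def __missing__(self, key):
        if self.is_list_key(key):
            result = self[key] = []
        else:
            # Child trees share the same predicate as their parent
            result = self[key] = Tree(self.is_list_key)
        return result

# Toy rule from the example above: even-length keys hold lists
t = Tree(lambda key: len(key) % 2 == 0)
t["A"]["B"]["C2"].append(1)
t["A"]["B"]["C10"]["D"] = 2
print(t)
```

Passing the predicate in keeps the class free of global state, at the cost of every child node carrying a reference to it.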

Modify multiple keys of dictionary by a mapping dictionary

I have two dicts: an original one, and one that maps the original's nested keys to new names. For instance:
original dict:
built_dict={'China':{'deportivo-cuenca-u20':{'danny':'test1'}},
'Germany':{'ajax-amsterdam-youth':{'lance':'test2'}}}
mapping dict:
club_team_dict={'deportivo-cuenca-u20':'deportivo','ajax-amsterdam-youth':'ajax'}
It works well if I use the following code to rename the nested keys of the original dict:
def club2team(built_dict,club_team_dict):
    for row in built_dict:
        # print test_dict[row]
        for sub_row in built_dict[row]:
            for key in club_team_dict:
                # the key of club_team_dict must be a subset of test_dict,or you have to check it and then replace it
                if sub_row==key:
                    built_dict[row][club_team_dict[sub_row]] = built_dict[row].pop(sub_row)
    return built_dict
and the result:
{'Germany': {'ajax': {'lance': 'test2'}}, 'China': {'deportivo': {'danny': 'test1'}}}
So far so good. However, suppose several keys map to the same new key. For example, my original dict is:
built_dict={'China':{'deportivo-cuenca-u20':{'danny':'test1'}},
'Germany':{'ajax-amsterdam-youth':{'lance':'test2'},
'ajax-amsterdam':{'tony':'test3'}}}
and the mapping dict has more than one key mapping to the same value:
club_team_dict={'deportivo-cuenca-u20':'deportivo',
'ajax-amsterdam-youth':'ajax',
'ajax-amsterdam':'ajax'}
As you can see, both 'ajax-amsterdam-youth' and 'ajax-amsterdam' map to 'ajax'. The trouble is that when I run the same code, the dict's size changes during the iteration:
RuntimeError: dictionary changed size during iteration
I want to get a result with a nested list for each key, like this:
{'Germany': {'ajax':[{'lance': 'test2'},
                     {'tony' : 'test3'}]},
 'China': {'deportivo': [{'danny': 'test1'}]}}
Well, I have found a solution for this. The code:
def club2team(built_dict,club_team_dict):
    for row in built_dict:
        # print test_dict[row]
        # .keys() returns a separate list in Python 2; on Python 3 wrap it in list(...)
        for sub_row in built_dict[row].keys():
            for key in club_team_dict:
                # the key of club_team_dict must be a subset of test_dict,or you have to check it and then replace it
                if sub_row==key:
                    # built_dict[row][club_team_dict[sub_row]] = built_dict[row].pop(sub_row)
                    built_dict[row].setdefault(club_team_dict[sub_row],[]).append(built_dict[row].pop(sub_row))
    return built_dict
Pay attention to the line for sub_row in built_dict[row].keys(): and to the setdefault() method. I used to believe that in Python 2.7 iterating over a dict was the same as iterating over its keys(), but this shows a difference: keys() returns a separate list, so the dict can be modified inside the loop. If you have a better solution, please show me; it would be appreciated. Thank you.
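One alternative, sketched here under my own assumptions, is to build a new dict instead of changing the original while looping over it, which sidesteps the RuntimeError on both Python 2 and 3. The function name rename_clubs is made up for this example, and I am assuming that a club missing from the mapping should simply keep its own name.

```python
def rename_clubs(built_dict, club_team_dict):
    """Return a new nested dict with club keys replaced by team names.

    Clubs that map to the same team are grouped into one list.
    """
    result = {}
    for country, clubs in built_dict.items():
        teams = result.setdefault(country, {})
        for club, players in clubs.items():
            # Assumption: an unmapped club keeps its original name
            team = club_team_dict.get(club, club)
            teams.setdefault(team, []).append(players)
    return result

built_dict = {'China': {'deportivo-cuenca-u20': {'danny': 'test1'}},
              'Germany': {'ajax-amsterdam-youth': {'lance': 'test2'},
                          'ajax-amsterdam': {'tony': 'test3'}}}
club_team_dict = {'deportivo-cuenca-u20': 'deportivo',
                  'ajax-amsterdam-youth': 'ajax',
                  'ajax-amsterdam': 'ajax'}

renamed = rename_clubs(built_dict, club_team_dict)
print(renamed)
```

Because the lookup goes straight through club_team_dict.get, the inner loop over the mapping's keys is no longer needed, and built_dict itself is left untouched.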

python call function in get method of dictionary instead default value

I need some help understanding a few things about Python and the dictionary get method.
Let's suppose that we have a list of dictionaries and we need to do some data transformation (e.g. get all names from all dictionaries by the key 'name'). I also want to call a specific function func(data) if the key 'name' was not found in a given dict.
def func(data):
    # do smth with data that doesn't contain 'name' key in dict
    return some_data
def retrieve_data(value):
    return ', '.join([v.get('name', func(v)) for v in value])
This approach works rather well, but as far as I can see, func (called from retrieve_data) runs every time, even when the key 'name' is present in the dictionary.
If you want to avoid calling func if the dictionary contains the value, you can use this:
def retrieve_data(value):
    return ', '.join([v['name'] if 'name' in v else func(v) for v in value])
func is called every time in your example because Python evaluates the arguments, including func(v), before get itself is called.
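To see the difference for yourself, here is a small sketch that counts how often func runs in each version. The sample records and the 'unknown' return value are invented purely for the demonstration.

```python
calls = []

def func(data):
    # Record every invocation so we can count them afterwards
    calls.append(data)
    return 'unknown'

records = [{'name': 'Ann'}, {'id': 7}]

# Eager: func(v) is evaluated for every record before get() is called
eager = ', '.join([v.get('name', func(v)) for v in records])
eager_calls = len(calls)

del calls[:]  # reset the call log

# Lazy: the conditional expression only evaluates func(v) when 'name' is missing
lazy = ', '.join([v['name'] if 'name' in v else func(v) for v in records])
lazy_calls = len(calls)

print(eager, eager_calls)
print(lazy, lazy_calls)
```

Both versions produce the same string; only the number of calls to func differs.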

Prevent evaluating default function in dictionary.get or dictionary.setdefault for existing keys

I'd like to keep track of key-value pairs I've already processed in a dictionary (or something else if it's better), where the key is some input and the value is the output of some complex function/calculation. The main purpose is to avoid repeating the same work when I want the value for a key that has been seen before. I've tried using setdefault and get to solve this problem, but the function I call ends up being executed regardless of whether the key exists in the dictionary.
Sample code:
def complex_function(some_key):
    """
    Complex calculations using some_key
    """
    return some_value
# Get my_key's value in my_dict. If my_key has not been seen yet,
# calculate its value and set it to my_dict[my_key]
my_value = my_dict.setdefault(my_key, complex_function(my_key))
complex_function ends up being carried out regardless of whether my_key is in my_dict. I've also tried using my_dict.get(my_key, complex_function(my_key)) with the same result. For now, this is my workaround:
if my_key not in my_dict:
    my_dict[my_key] = complex_function(my_key)
my_value = my_dict[my_key]
Here are my questions. First, is using a dictionary for this purpose the right approach? Second, am I using setdefault correctly? And third, is my current fix a good solution to the problem? (I end up calling my_dict[my_key] twice if my_key doesn't exist)
So I went ahead and took Vincent's suggestion of using a decorator.
Here's what the new fix looks like:
import functools
@functools.lru_cache(maxsize=16)
def complex_function(some_input):
    """
    Complex calculations using some_input
    """
    return some_value
my_value = complex_function(some_input)
From what I understand so far, lru_cache uses a dictionary to cache the results. The key in this dictionary refers to argument(s) to the decorated function (some_input) and the value refers to the return value of the decorated function (some_value). So, if the function gets called with an argument that's previously been passed before, it would simply return the value referenced in the decorator's dictionary instead of running the function. If the argument hasn't been seen, the function proceeds as normal, and in addition, the decorator creates a new key-value pair in its dictionary.
I set the maxsize to 16 for now as I don't expect some_input to represent more than 10 unique values. One thing to note is that the arguments to the decorated function must be hashable (which in practice usually means immutable), as the decorator uses them as keys for its dictionary.
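As a quick sanity check of that understanding, the sketch below counts how many times the function body actually runs and reads the decorator's own statistics through cache_info(). The doubling calculation and the call_count variable are stand-ins I made up for the real work.

```python
import functools

call_count = 0  # incremented only when the function body really runs

@functools.lru_cache(maxsize=16)
def complex_function(some_input):
    """Stand-in for an expensive calculation."""
    global call_count
    call_count += 1
    return some_input * 2

first = complex_function(21)   # cache miss: the body runs
second = complex_function(21)  # cache hit: the body is skipped

info = complex_function.cache_info()
print(first, second, call_count, info)
```

If a cached result ever goes stale, complex_function.cache_clear() empties the cache.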
original_dict = {"a" : "apple", "b" : "banana", "c" : "cat"}
keys = original_dict.keys()
new_dict = {}
For every key that you access, store its value in new_dict:
new_dict[key] = value
To check whether you have already accessed a key, run the following code:
# if new_key has not been accessed yet
if new_key not in new_dict:
    # read the value of new_key from original_dict and write it to new_dict
    new_dict[new_key] = original_dict[new_key]
I hope this helps.
Your current solution is fine. You are creating slightly more work, but significantly reducing the computational workload when the key is already present.
However, defaultdict is almost what you need here. By modifying it a little bit we can make it work exactly as you want.
from collections import defaultdict
class DefaultKeyDict(defaultdict):
    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory(key)
        return value
d = DefaultKeyDict(lambda key: key * 2)
assert d[1] == 2
print(d)

Python: How to traverse a List[Dict{List[Dict{}]}]

I was just wondering if there is a simple way to do this. I have a particular structure that is parsed from a file and the output is a list of a dict of a list of a dict. Currently, I just have a bit of code that looks something like this:
for i in xrange(len(data)):
    for j, k in data[i].iteritems():
        for l in xrange(len(data[i]['data'])):
            for m, n in data[i]['data'][l].iteritems():
                dostuff()
I just wanted to know if there was a function that would traverse a structure and internally figure out whether each entry was a list or a dict and if it is a dict, traverse into that dict and so on. I've only been using Python for about a month or so, so I am by no means an expert or even an intermediate user of the language. Thanks in advance for the answers.
EDIT: Even if it's possible to simplify my code at all, it would help.
You never need to iterate through xrange(len(data)). You iterate either through data (for a list) or data.items() (or values()) (for a dict).
Your code should look like this:
for elem in data:
    for val in elem.itervalues():
        for item in val['data']:
            dostuff()
which is quite a bit shorter.
Will, if you're looking to descend an arbitrary structure of array/hash thingies, then you can create a function that does so by checking the type of each value.
def traverse_it(it):
    if (isinstance(it, list)):
        for item in it:
            traverse_it(item)
    elif (isinstance(it, dict)):
        for key in it.keys():
            traverse_it(it[key])
    else:
        do_something_with_real_value(it)
Note that the average object-oriented guru will tell you not to do this, and instead to create a class tree where one class is based on an array and another on a dict, each with a method of the same name (i.e., a virtual function) that processes its own contents. In other words, if/else trees based on types are "bad"; functions that can be called on an object to deal with its contents in its own way are "good".
I think this is what you're trying to do. There is no need to use xrange() to pull out the index from the list since for iterates over each value of the list. In my example below d1 is therefore a reference to the current data[i].
for d1 in data: # iterate over outer list, d1 is a dictionary
    for x in d1: # iterate over keys in d1 (the x var is unused)
        for d2 in d1['data']: # iterate over the list
            # iterate over (key,value) pairs in inner most dict
            for k,v in d2.iteritems():
                dostuff()
You're also using the name l twice (intentionally or not), but beware of how the scoping works.
Well, this question is quite old. However, out of curiosity, I'd like to offer an answer that I just tried.
Suppose the dictionary looks like this: dict1 = { 'a':5,'b': [1,2,{'a':100,'b':100}], 'dict 2' : {'a':3,'b':5}}
Solution:
dict1 = { 'a':5,'b': [1,2,{'a':100,'b':100}], 'dict 2' : {'a':3,'b':5}}
def recurse(obj):
    # obj instead of dict, so the built-in dict type is not shadowed
    if isinstance(obj, dict):
        for key in obj:
            recurse(obj[key])
    elif isinstance(obj, list):
        for element in obj:
            if isinstance(element, dict):
                recurse(element)
            else:
                print(element)
    else:
        print(obj)
recurse(dict1)
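If you'd rather collect the values than print them, the same idea can be written as a generator. This is only a sketch for Python 3 (it relies on yield from), and the name iter_leaves is my own. Unlike the function above, it also descends into lists nested inside lists.

```python
def iter_leaves(obj):
    """Yield every non-container value found in nested dicts and lists."""
    if isinstance(obj, dict):
        for value in obj.values():
            yield from iter_leaves(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_leaves(item)
    else:
        # Anything that is neither a dict nor a list counts as a leaf
        yield obj

dict1 = {'a': 5, 'b': [1, 2, {'a': 100, 'b': 100}], 'dict 2': {'a': 3, 'b': 5}}
leaves = list(iter_leaves(dict1))
print(leaves)
```

The caller can then loop over the leaves, filter them, or sum them without changing the traversal itself.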
