With four dictionaries grandpa, dad, son_1 and son_2:
grandpa = {'name': 'grandpa', 'parents': []}
dad = {'name': 'dad', 'parents': ['grandpa']}
son_1 = {'name': 'son_1', 'parents': ['dad']}
son_2 = {'name': 'son_2', 'parents': ['dad']}
relatives = [son_1, grandpa, dad, son_2]
I want to write a function that sorts all these relatives in a "reverse" order.
So instead of parents there would be children list used. The oldest grandpa would be on the top level of result dictionary, the dad would be below with its children list storing the son_1 and son_2:
def sortRelatives(relatives):
# returns a resulted dictionary:
# logic
result = sortRelatives(relatives)
print result
Which would print:
result = {'name': 'grandpa',
'children': [
{'name': 'dad',
'children': [{'name': 'son_1', 'children': [] },
{'name': 'son_2', 'children': [] }] }
]
}
How to make sortRelatives function perform a such sorting?
What I think is a viable yet simple solution is to first build a children dictionary that will map person names to their children. Then we can use that new data structure to build the output:
from collections import defaultdict
def children(relatives):
children = defaultdict(list)
for person in relatives:
for parent in person['parents']:
children[parent].append(person)
return children
Another tool we can use is a function that would find the root of our genealogy:
def genealogy_root(relatives):
for person in relatives:
if not person['parents']:
return person
raise TypeError("This doesn't look like a valid genealogy.")
That will help us locating the person that has no parent, and will therefore be the root of our genealogy-tree. Now that we have all the necessary tools we just need to build the output:
def build_genealogy(relatives):
relatives_children = children(relatives)
def sub_genealogy(current_person):
name = current_person['name']
return dict(
name=name,
children=[sub_genealogy(child) for child in relatives_children[name]]
)
root = genealogy_root(relatives)
return sub_genealogy(root)
result = build_genealogy(relatives)
print(result)
Which outputs:
{
'name': 'grandpa', 'children': [
{'name': 'dad', 'children': [
{'name': 'son_1', 'children': []},
{'name': 'son_2', 'children': []}
]}
]
}
Note that as I said in the comments, this is only working because there are no name duplicates. If several persons share the same name, you will have to have a better data-structure as input. For example, you may want to have something like:
grandpa = {'name': 'grandpa', 'parents': []}
dad = {'name': 'dad', 'parents': [grandpa]}
son_1 = {'name': 'son_1', 'parents': [dad]}
son_2 = {'name': 'son_2', 'parents': [dad]}
relatives = [grandpa, dad, son_1, son_2]
Related
I've a flat dict with entities. Each entity can have a parent. I'd like to recursively build each entity, considering the parent values.
Logic:
Each entity inherits defaults from its parent (e.g. is_mammal)
Each entity can overwrite the defaults of its parent (e.g. age)
Each entity can add new attributes (e.g. hobby)
I'm struggling to get it done. Help is appreciated, thanks!
entities = {
'human': {
'is_mammal': True,
'age': None,
},
'man': {
'parent': 'human',
'gender': 'male',
},
'john': {
'parent': 'man',
'age': 20,
'hobby': 'football',
}
};
def get_character(key):
# ... recursive magic with entities ...
return entity
john = get_character('john')
print(john)
Expected output:
{
'is_mammal': True, # inherited from human
'gender': 'male' # inherited from man
'parent': 'man',
'age': 20, # overwritten
'hobby': 'football', # added
}
def get_character(entities, key):
try:
entity = get_character(entities, entities[key]['parent'])
except KeyError:
entity = {}
entity.update(entities[key])
return entity
This solution is using recursion and a Python quirk where mutables (here it's a dictionary {}), are shared among function calls. See the discussion below for why this is somewhat surprising, though useful for accumulating recursion results.
def get_character(d, key, entity = {}):
if d.get(key) is None:
return entity
return get_character(d, d.get(key).get('parent'), d.get(key) | entity)
get_character(entities, 'john')
{'is_mammal': True,
'age': 20,
'parent': 'man',
'gender': 'male',
'hobby': 'football'}
I have a list of dictionaries in python and I would like to override old value with duplicate value. Please let me know how can I do.
{'message': [{'name': 'raghav', 'id': 10}, {'name': 'raghav', 'id': 11}]}
Output should be:
{'message': [ {'name': 'raghav', 'id': 11}]}
I don't know what you mean by "override old value with duplicate value". If you mean just picking the second dict from the list, you could:
print({k: [v[1]] for (k, v) in data.items()})
If the idea is to update the "name" with a newer value of "id" as you move along the list, then maybe:
def merge_records(data):
records = data['message']
users = {}
for record in records:
name = record['name']
id_ = record['id']
users[name] = id_
new_records = []
for name, id_ in users.items():
new_records.append({'name': name, 'id': id_})
return {'message': new_records}
But, if you have any control over how the data is represented, you might reconsider. You probably want a different data structure.
Here you go:
d = {'message': [{'name': 'raghav', 'id': 10}, {'name': 'raghav', 'id': 11}]}
#loop over outer dictionary
for key, value in d.items():
d[key] = [dict([t for k in value for t in k.items()])]
print(d)
Edit:
As per your requirement:
d = {'message': [ {'name': 'raghav', 'id': 11}, {'name': 'krish', 'id': 20}, {'name': 'anu', 'id': 30}]}
for key, value in d.items():
print [dict((k1,v1)) for k1,v1 in dict([tuple(i.items()) for i in value for val in i.items()]).items()]
I have python nested dictionary with unfixed order and update the parent count and size value based on child count and size value. But both values are available in last nested dict.
Input Dict-
[{'name': 'stack', 'children':
[{'name': 'flow', 'children':
[{'name': 'lldp', 'children':
[{'name': 'sourc', 'children':
[{'name': 'lldque.jrc', 'count': '11', 'size': '37'}]}]},
{'name': 'arp', 'children':
[{'name': 'src', 'children':
[{'name': 'arpred.cec', 'count': '37', 'size': '67'}]}]}]}]}]
Output Dict should be as below-
[{'name':'stack','count':'4','size':'6','children':
[{'name':'flow','count':'4','size':'6','children':
[{'name':'lldp','count':'1','size':'2','children':
[{'name':'sourc','count':'1','size':'2','children':
[{'name':'lldque.jrc','count':'1','size':'2'}]}]},
{'name':'arp','count':'3','size':'4','children':
[{'name':'src','count':'3','size':'4','children':
[{'name':'arpred.cec','count':'3','size':'4'}]}]}]}]}]
currently I am using below code;
def populateParentDict(ddict):
listl = []
if type(ddict) == list:
if 'children' in ddict[0]:
return populateParentDict(ddict[0])
else:
listl.append(ddict[0]['Buggy'])
listl.append(ddict[0]['Clean'])
elif 'children' in ddict:
return populateParentDict(ddict['children'])
return listl
Simple Python question, but I'm scratching my head over the answer!
I have an array of strings of arbitrary length called path, like this:
path = ['country', 'city', 'items']
I also have a dictionary, data, and a string, unwanted_property. I know that the dictionary is of arbitrary depth and is dictionaries all the way down, with the exception of the items property, which is always an array.
[CLARIFICATION: The point of this question is that I don't know what the contents of path will be. They could be anything. I also don't know what the dictionary will look like. I need to walk down the dictionary as far as the path indicates, and then delete the unwanted properties from there, without knowing in advance what the path looks like, or how long it will be.]
I want to retrieve the parts of the data object (if any) that matches the path, and then delete the unwanted_property from each.
So in the example above, I would like to retrieve:
data['country']['city']['items']
and then delete unwanted_property from each of the items in the array. I want to amend the original data, not a copy. (CLARIFICATION: By this I mean, I'd like to end up with the original dict, just minus the unwanted properties.)
How can I do this in code?
I've got this far:
path = ['country', 'city', 'items']
data = {
'country': {
'city': {
'items': [
{
'name': '114th Street',
'unwanted_property': 'foo',
},
{
'name': '8th Avenue',
'unwanted_property': 'foo',
},
]
}
}
}
for p in path:
if p == 'items':
data = [i for i in data[p]]
else:
data = data[p]
if isinstance(data, list):
for d in data:
del d['unwanted_property']
else:
del data['unwanted_property']
The problem is that this doesn't amend the original data. It also relies on items always being the last string in the path, which may not always be the case.
CLARIFICATION: I mean that I'd like to end up with:
{
'country': {
'city': {
'items': [
{
'name': '114th Street'
},
{
'name': '8th Avenue'
},
]
}
}
}
Whereas what I have available in data is only [{'name': '114th Street'}, {'name': '8th Avenue'}].
I feel like I need something like XPath for the dictionary.
The problem you are overwriting the original data reference. Change your processing code to
temp = data
for p in path:
temp = temp[p]
if isinstance(temp, list):
for d in temp:
del d['unwanted_property']
else:
del temp['unwanted_property']
In this version, you set temp to point to the same object that data was referring to. temp is not a copy, so any changes you make to it will be visible in the original object. Then you step temp along itself, while data remains a reference to the root dictionary. When you find the path you are looking for, any changes made via temp will be visible in data.
I also removed the line data = [i for i in data[p]]. It creates an unnecessary copy of the list that you never need, since you are not modifying the references stored in the list, just the contents of the references.
The fact that path is not pre-determined (besides the fact that items is going to be a list) means that you may end up getting a KeyError in the first loop if the path does not exist in your dictionary. You can handle that gracefully be doing something more like:
try:
temp = data
for p in path:
temp = temp[p]
except KeyError:
print('Path {} not in data'.format(path))
else:
if isinstance(temp, list):
for d in temp:
del d['unwanted_property']
else:
del temp['unwanted_property']
The problem you are facing is that you are re-assigning the data variable to an undesired value. In the body of your for loop you are setting data to the next level down on the tree, for instance given your example data will have the following values (in order), up to when it leaves the for loop:
data == {'country': {'city': {'items': [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]}}}
data == {'city': {'items': [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]}}
data == {'items': [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]}
data == [{'name': '114th Street', 'unwanted_property': 'foo',}, {'name': '8th Avenue', 'unwanted_property': 'foo',},]
Then when you delete the items from your dictionaries at the end you are left with data being a list of those dictionaries as you have lost the higher parts of the structure. Thus if you make a backup reference for your data you can get the correct output, for example:
path = ['country', 'city', 'items']
data = {
'country': {
'city': {
'items': [
{
'name': '114th Street',
'unwanted_property': 'foo',
},
{
'name': '8th Avenue',
'unwanted_property': 'foo',
},
]
}
}
}
data_ref = data
for p in path:
if p == 'items':
data = [i for i in data[p]]
else:
data = data[p]
if isinstance(data, list):
for d in data:
del d['unwanted_property']
else:
del data['unwanted_property']
data = data_ref
def delKey(your_dict,path):
if len(path) == 1:
for item in your_dict:
del item[path[0]]
return
delKey( your_dict[path[0]],path[1:])
data
{'country': {'city': {'items': [{'name': '114th Street', 'unwanted_property': 'foo'}, {'name': '8th Avenue', 'unwanted_property': 'foo'}]}}}
path
['country', 'city', 'items', 'unwanted_property']
delKey(data,path)
data
{'country': {'city': {'items': [{'name': '114th Street'}, {'name': '8th Avenue'}]}}}
You need to remove the key unwanted_property.
names_list = []
def remove_key_from_items(data):
for d in data:
if d != 'items':
remove_key_from_items(data[d])
else:
for item in data[d]:
unwanted_prop = item.pop('unwanted_property', None)
names_list.append(item)
This will remove the key. The second parameter None is returned if the key unwanted_property does not exist.
EDIT:
You can use pop even without the second parameter. It will raise KeyError if the key does not exist.
EDIT 2: Updated to recursively go into depth of data dict until it finds the items key, where it pops the unwanted_property as desired and append into the names_list list to get the desired output.
Using operator.itemgetter you can compose a function to return the final key's value.
import operator, functools
def compose(*functions):
'''returns a callable composed of the functions
compose(f, g, h, k) -> f(g(h(k())))
'''
def compose2(f, g):
return lambda x: f(g(x))
return functools.reduce(compose2, functions, lambda x: x)
get_items = compose(*[operator.itemgetter(key) for key in path[::-1]])
Then use it like this:
path = ['country', 'city', 'items']
unwanted_property = 'unwanted_property'
for thing in get_items(data):
del thing[unwanted_property]
Of course if the path contains non-existent keys it will throw a KeyError - you probably should account for that:
path = ['country', 'foo', 'items']
get_items = compose(*[operator.itemgetter(key) for key in path[::-1]])
try:
for thing in get_items(data):
del thing[unwanted_property]
except KeyError as e:
print('missing key:', e)
You can try this:
path = ['country', 'city', 'items']
previous_data = data[path[0]]
previous_key = path[0]
for i in path:
previous_data = previous_data[i]
previous_key = i
if isinstance(previous_data, list):
for c, b in enumerate(previous_data):
if "unwanted_property" in b:
del previous_data[c]["unwanted_property"]
current_dict = {}
previous_data_dict = {}
for i, a in enumerate(path):
if i == 0:
current_dict[a] = data[a]
previous_data_dict = data[a]
else:
if a == previous_key:
current_dict[a] = previous_data
else:
current_dict[a] = previous_data_dict[a]
previous_data_dict = previous_data_dict[a]
data = current_dict
print(data)
Output:
{'country': {'city': {'items': [{'name': '114th Street'}, {'name': '8th Avenue'}]}}, 'items': [{'name': '114th Street'}, {'name': '8th Avenue'}], 'city': {'items': [{'name': '114th Street'}, {'name': '8th Avenue'}]}}
I have an array of dicts retrieved from a web API. Each dict has a name, description, 'parent', and children key. The children key has an array of dicts as it value. For the sake of clarity, here is a dummy example:
[
{'name': 'top_parent', 'description': None, 'parent': None,
'children': [{'name': 'child_one'},
{'name': 'child_two'}]},
{'name': 'child_one', 'description': None, 'parent': 'top_parent',
'children': []},
{'name': 'child_two', 'description': None, 'parent': 'top_parent',
'children': [{'name': 'grand_child'}]},
{'name': 'grand_child', 'description': None, 'parent': 'child_two',
'children': []}
]
Every item in in the array. An item could be the top-most parent, and thus not exist in any of the children arrays. An item could be both a child and a parent. Or an item could only be a child (have no children of its own).
So, in a tree structure, you'd have something like this:
top_parent
child_one
child_two
grand_child
In this contrived and simplified example top_parent is a parent but not a child; child_one is a child but not a parent; child_two is a parent and a child; and grand_child is a child but not a parent. This covers every possible state.
What I want is to be able to iterate over the array of dicts 1 time and generate a nested dict that properly represents the tree structure (however, it 1 time is impossible, the most efficient way possible). So, in this example, I would get a dict that looked like this:
{
'top_parent': {
'child_one': {},
'child_two': {
'grand_child': {}
}
}
}
Strictly speaking, it is not necessary to have item's without children to not be keys, but that is preferable.
Fourth edit, showing three versions, cleaned up a bit. First version works top-down and returns None, as you requested, but essentially loops through the top level array 3 times. The next version only loops through it once, but returns empty dicts instead of None.
The final version works bottom up and is very clean. It can return empty dicts with a single loop, or None with additional looping:
from collections import defaultdict
my_array = [
{'name': 'top_parent', 'description': None, 'parent': None,
'children': [{'name': 'child_one'},
{'name': 'child_two'}]},
{'name': 'child_one', 'description': None, 'parent': 'top_parent',
'children': []},
{'name': 'child_two', 'description': None, 'parent': 'top_parent',
'children': [{'name': 'grand_child'}]},
{'name': 'grand_child', 'description': None, 'parent': 'child_two',
'children': []}
]
def build_nest_None(my_array):
childmap = [(d['name'], set(x['name'] for x in d['children']) or None)
for d in my_array]
all_dicts = dict((name, kids and {}) for (name, kids) in childmap)
results = all_dicts.copy()
for (name, kids) in ((x, y) for x, y in childmap if y is not None):
all_dicts[name].update((kid, results.pop(kid)) for kid in kids)
return results
def build_nest_empty(my_array):
all_children = set()
all_dicts = defaultdict(dict)
for d in my_array:
children = set(x['name'] for x in d['children'])
all_dicts[d['name']].update((x, all_dicts[x]) for x in children)
all_children.update(children)
top_name, = set(all_dicts) - all_children
return {top_name: all_dicts[top_name]}
def build_bottom_up(my_array, use_None=False):
all_dicts = defaultdict(dict)
for d in my_array:
name = d['name']
all_dicts[d['parent']][name] = all_dicts[name]
if use_None:
for d in all_dicts.values():
for x, y in d.items():
if not y:
d[x] = None
return all_dicts[None]
print(build_nest_None(my_array))
print(build_nest_empty(my_array))
print(build_bottom_up(my_array, True))
print(build_bottom_up(my_array))
Results in:
{'top_parent': {'child_one': None, 'child_two': {'grand_child': None}}}
{'top_parent': {'child_one': {}, 'child_two': {'grand_child': {}}}}
{'top_parent': {'child_one': None, 'child_two': {'grand_child': None}}}
{'top_parent': {'child_one': {}, 'child_two': {'grand_child': {}}}}
You can keep a lazy mapping from names to nodes and then rebuild the hierarchy by processing just the parent link (I'm assuming data is correct, so if A is marked as the parent of B iff B is listed among the children of A).
nmap = {}
for n in nodes:
name = n["name"]
parent = n["parent"]
try:
# Was this node built before?
me = nmap[name]
except KeyError:
# No... create it now
if n["children"]:
nmap[name] = me = {}
else:
me = None
if parent:
try:
nmap[parent][name] = me
except KeyError:
# My parent will follow later
nmap[parent] = {name: me}
else:
root = me
The children property of the input is used only to know if the element should be stored as a None in its parent (because has no children) or if it should be a dictionary because it will have children at the end of the rebuild process. Storing nodes without children as empty dictionaries would simplify the code a bit by avoiding the need of this special case.
Using collections.defaultdict the code can also be simplified for the creation of new nodes
import collections
nmap = collections.defaultdict(dict)
for n in nodes:
name = n["name"]
parent = n["parent"]
me = nmap[name]
if parent:
nmap[parent][name] = me
else:
root = me
This algorithm is O(N) assuming constant-time dictionary access and makes only one pass on the input and requires O(N) space for the name->node map (the space requirement is O(Nc) for the original nochildren->None version where Nc is the number of nodes with children).
My stab at it:
persons = [\
{'name': 'top_parent', 'description': None, 'parent': None,\
'children': [{'name': 'child_one'},\
{'name': 'child_two'}]},\
{'name': 'grand_child', 'description': None, 'parent': 'child_two',\
'children': []},\
{'name': 'child_two', 'description': None, 'parent': 'top_parent',\
'children': [{'name': 'grand_child'}]},\
{'name': 'child_one', 'description': None, 'parent': 'top_parent',\
'children': []},\
]
def findParent(name,parent,tree,found = False):
if tree == {}:
return False
if parent in tree:
tree[parent][name] = {}
return True
else:
for p in tree:
found = findParent(name,parent,tree[p],False) or found
return found
tree = {}
outOfOrder = []
for person in persons:
if person['parent'] == None:
tree[person['name']] = {}
else:
if not findParent(person['name'],person['parent'],tree):
outOfOrder.append(person)
for person in outOfOrder:
if not findParent(person['name'],person['parent'],tree):
print 'parent of ' + person['name'] + ' not found
print tree
results in:
{'top_parent': {'child_two': {'grand_child': {}}, 'child_one': {}}}
It also picks up any children whose parent has not been added yet, and then reconciles this at the end.