Python - Recursive function to serialise family tree information into JSON

I am having difficulty creating a function that will produce a family tree in JSON format.
An example of a two parent, two offspring tree can be seen here:
{
"children": [
{
"id": 409,
"name": "Joe Bloggs",
"no_parent": "true"
},
{
"children": [
{
"children": [],
"id": 411,
"name": "Alice Bloggs"
},
{
"children": [],
"id": 412,
"name": "John Bloggs"
}
],
"hidden": "true",
"id": "empty_node_id_9",
"name": "",
"no_parent": "true"
},
{
"children": [],
"id": 410,
"name": "Sarah Smith",
"no_parent": "true"
}
],
"hidden": "true",
"id": "year0",
"name": ""
}
Joe Bloggs is married to Sarah Smith, with children Alice Bloggs and John Bloggs. The empty nodes exist purely to handle vertices in the tree-map diagram (see jsfiddle below).
The above example should help explain the syntax. A more complex tree can be found on this jsfiddle: http://jsfiddle.net/cyril123/0vbtvoon/22/
The JSON associated with the jsfiddle can be found on lines 34 to 101.
I am having difficulty writing a function that recursively produces the JSON for a family tree. I begin with a person object that represents the oldest member of the family. The function then checks for marriages, children and so on, and continues until the tree is complete, returning the JSON.
My code involves a person class as well as an associated marriage class. I have appropriate methods such as an id getter for each person, a get_marriage() function, get_children() methods etc. I am wondering what the best way to go about this is.
My attempt at a recursive function can be found below. The methods/functions involved etc are not detailed but their purpose should be self-explanatory. Many thanks.
def root_nodes(people, first_node=False): #begin by passing in oldest family member and first_node=True
    global obj, current_obj, people_used
    if obj is not None: print(len(str(obj)))
    if type(people) != list:
        people = [people]
    for x in people:
        if x in rootPeople and first_node: #handles the beginning of the JSON with an empty 'root' starting node.
            first_node = False
            obj = {'name': "", 'id': 'year0', 'hidden': 'true', 'children': root_nodes(people)}
            return obj
        else:
            marriage_info = get_marriage(x)
            if marriage_info is None: #if person is not married
                current_obj = {'name': x.get_name(), 'id': x.get_id(), 'children': []}
                people_used.append(x)
            else:
                partners = marriage_info.get_members()
                husband, wife = partners[0].get_name(), partners[1].get_name()
                husband_id, wife_id = marriage_info.husband.get_id(), marriage_info.wife.get_id()
                marriage_year = marriage_info.year
                children = marriage_info.get_children()
                people_used.append(partners[0])
                people_used.append(partners[1])
                if partners[0].get_parents() == ['None', 'None'] or partners[1].get_parents() == ['None', 'None']:
                    if partners[0].get_parents() == ['None', 'None'] and partners[1].get_parents() == ['None', 'None']:
                        current_obj = {'name': str(husband), 'id': husband_id, 'no_parent': 'true'}, {'name': '', 'id': 'empty_node_id_' + empty_node(), 'no_parent': 'true', 'hidden': 'true', 'children': root_nodes(children)}, {'name': str(wife), 'id': wife_id, 'no_parent': 'true', 'children': []}
                    if partners[0].get_parents() == ['None', 'None'] and partners[1].get_parents() != ['None', 'None']:
                        current_obj = {'name': str(husband), 'id': husband_id, 'no_parent': 'true'}, {'name': '', 'id': 'empty_node_id_' + empty_node(), 'no_parent': 'true', 'hidden': 'true', 'children': root_nodes(children)}, {'name': str(wife), 'id': wife_id, 'children': []}
                    if partners[0].get_parents() != ['None', 'None'] and partners[1].get_parents() == ['None', 'None']:
                        current_obj = {'name': str(husband), 'id': husband_id}, {'name': '', 'id': 'empty_node_id_' + empty_node(), 'no_parent': 'true', 'hidden': 'true', 'children': root_nodes(children)}, {'name': str(wife), 'id': wife_id, 'no_parent': 'true', 'children': []}
                else:
                    if not any((True for x in partners[0].get_parents() if x in people_used)):
                        current_obj = {'name': str(husband), 'id': husband_id, 'no_parent' : 'true'}, {'name': '', 'id': 'empty_node_id_' + empty_node(), 'no_parent': 'true', 'hidden': 'true', 'children': root_nodes(children)}, {'name': str(wife), 'id': wife_id, 'children': []}
                    elif not any((True for x in partners[1].get_parents() if x in people_used)):
                        current_obj = {'name': str(husband), 'id': husband_id}, {'name': '', 'id': 'empty_node_id_' + empty_node(), 'no_parent': 'true', 'hidden': 'true', 'children': root_nodes(children)}, {'name': str(wife), 'id': wife_id, 'no_parent': 'true', 'children': []}
                    else:
                        current_obj = {'name': str(husband), 'id': husband_id}, {'name': '', 'id': 'empty_node_id_' + empty_node(), 'no_parent': 'true', 'hidden': 'true', 'children': root_nodes(children)}, {'name': str(wife), 'id': wife_id, 'children': []}
                return current_obj
            if obj is None:
                obj = current_obj
            else:
                obj = obj, current_obj
            if people.index(x) == len(people)-1:
                return obj
Even though the function above is badly written, it almost works. The only case where it fails is when one child is married: the other children are then missing from the JSON. This is because the function returns from inside the for loop without moving on to the next iteration. Any suggestions on how to fix this would be appreciated.
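One possible way around the early return is to collect every node in a local list and return that list only after the loop has finished, instead of relying on global state. The sketch below is not the asker's code: the Person and Marriage classes, the marriage attribute, the is_root flag and the id counter are all hypothetical stand-ins, and the no_parent rule is simplified to "only the descendant we arrived through has a parent in the tree".

```python
import itertools
import json

_hidden_ids = itertools.count(1)  # hypothetical source of ids for hidden nodes


class Person:
    """Hypothetical stand-in for the question's person class."""
    def __init__(self, pid, name):
        self.pid, self.name, self.marriage = pid, name, None


class Marriage:
    """Hypothetical stand-in for the question's marriage class."""
    def __init__(self, husband, wife, children=()):
        self.husband, self.wife, self.children = husband, wife, list(children)
        husband.marriage = wife.marriage = self


def partner_node(partner, has_parent_in_tree):
    node = {"name": partner.name, "id": partner.pid}
    if not has_parent_in_tree:
        node["no_parent"] = "true"
    return node


def build_nodes(people, is_root=False):
    """Return one list of nodes for all of `people`; nothing returns mid-loop."""
    nodes = []
    for person in people:
        marriage = person.marriage
        if marriage is None:
            nodes.append({"name": person.name, "id": person.pid, "children": []})
            continue
        hidden = {
            "name": "", "id": "empty_node_id_%d" % next(_hidden_ids),
            "no_parent": "true", "hidden": "true",
            "children": build_nodes(marriage.children),
        }
        # Below the root, only the person we arrived through has a parent in the tree.
        husband = partner_node(marriage.husband, not is_root and marriage.husband is person)
        wife = partner_node(marriage.wife, not is_root and marriage.wife is person)
        wife["children"] = []
        nodes.extend([husband, hidden, wife])
    return nodes


def family_tree(oldest):
    return {"name": "", "id": "year0", "hidden": "true",
            "children": build_nodes([oldest], is_root=True)}


joe, sarah = Person(409, "Joe Bloggs"), Person(410, "Sarah Smith")
alice, john = Person(411, "Alice Bloggs"), Person(412, "John Bloggs")
bob = Person(413, "Bob Jones")  # made-up spouse for Alice
Marriage(joe, sarah, [alice, john])
Marriage(bob, alice)

tree = family_tree(joe)
print(json.dumps(tree, indent=2))
```

Because each call appends to its own list, a married child (Alice here) no longer cuts the loop short, and her sibling John still appears after her. Cases such as remarriage or a person reachable through two branches are not handled in this sketch.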

Related

list with a tuple of dicts instead of list of dicts

I have a list of dicts:
[{'id': 14786,
'sku': '0663370-ZWA',
'sizes': ['38', '40', '42', '44', '46'],
'color': 'zwart'},
{'id': 14787,
'sku': '0663371-ZWA',
'sizes': ['38', '40', '42', '44', '46'],
'color': 'zwart'}]
I want to place it in a data structure for updating attributes with the WooCommerce API:
list_of_update_items = []
for index in lst_of_dcts:
    lst_of_attributes_items=[]
    attributes = {'id': 1, 'name': 'kleur', 'position': 0,'options': index['color'], 'variations': 'false','visible': 'true'},{'id': 6, 'options': index['sizes'], 'variations': 'true','visible': 'true'}
    # make a list of attributes_items to use as value in the parent dict
    lst_of_attributes_items.append(attributes)
    # make a dict with id as key and the attribute list as value
    update = {'id': index['id'], 'attributes': lst_of_attributes_items}
    list_of_update_items.append(update)
    # clear the list of attributes_items for the next iteration
    lst_of_attributes_items=[]
# wrap in the top_dict
data = {'update': list_of_update_items}
data results in
{'update': [{'id': 14786,
'attributes': [({'id': 1,
'name': 'kleur',
'position': 0,
'options': 'zwart',
'variations': 'false',
'visible': 'true'},
{'id': 6,
'options': ['38', '40', '42', '44', '46'],
'variations': 'true',
'visible': 'true'})]},
{'id': 14787,
'attributes': [({'id': 1,
'name': 'kleur',
'position': 0,
'options': 'zwart',
'variations': 'false',
'visible': 'true'},
{'id': 6,
'options': ['38', '40', '42', '44', '46'],
'variations': 'true',
'visible': 'true'})]}]}
This puts a tuple of dicts in the list of 'attributes' values. What I need is a list of dicts. What causes the tuple?

You use an intermediate list:
lst_of_attributes_items=[]
to which you append a tuple of dicts:
attributes = {'id': 1, 'name': 'kleur', 'position': 0,'options': index['color'], 'variations': 'false','visible': 'true'},{'id': 6, 'options': index['sizes'], 'variations': 'true','visible': 'true'}
lst_of_attributes_items.append(attributes)
Notice,
attributes = {...}, {...}
Creates a tuple.
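As a quick check of this point, a comma between two expressions is enough to build a tuple; the parentheses are optional. A minimal sketch with made-up values:

```python
# A bare comma builds a tuple; square brackets build a list.
as_tuple = {'id': 1}, {'id': 6}
as_list = [{'id': 1}, {'id': 6}]

print(type(as_tuple).__name__)  # tuple
print(type(as_list).__name__)   # list

# Appending the tuple to a list nests it as one single element.
wrapper = []
wrapper.append(as_tuple)
print(len(wrapper))  # 1
```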
The list lst_of_attributes_items, which holds a single item (a tuple of two dicts), is then set as the 'attributes' value of your update dict:
update = {'id': index['id'], 'attributes': lst_of_attributes_items}
The intermediate list serves no purpose. Just make attributes a list instead of a tuple (since you don't want tuples, don't use tuples), then put it directly in the update dict. The whole thing can be simplified to:
lst_of_dcts = [{'id': 14786,
'sku': '0663370-ZWA',
'sizes': ['38', '40', '42', '44', '46'],
'color': 'zwart'},
{'id': 14787,
'sku': '0663371-ZWA',
'sizes': ['38', '40', '42', '44', '46'],
'color': 'zwart'}]
list_of_update_items = []
for index in lst_of_dcts:
    attributes = [
        {'id': 1, 'name': 'kleur', 'position': 0,'options': index['color'], 'variations': 'false','visible': 'true'},
        {'id': 6, 'options': index['sizes'], 'variations': 'true','visible': 'true'}
    ]
    list_of_update_items.append({'id': index['id'], 'attributes': attributes})
data = {'update': list_of_update_items}
Indeed, you could do the whole thing with a list comprehension (although, I probably wouldn't, since it isn't as readable to me):
update_items = [
{
"id": index["id"],
"attributes": [
{
"id": 1,
"name": "kleur",
"position": 0,
"options": index["color"],
"variations": "false",
"visible": "true",
},
{
"id": 6,
"options": index["sizes"],
"variations": "true",
"visible": "true",
},
],
}
for index in lst_of_dcts
]
Although, using a helper function makes it pretty nice:
def update_from_index(index):
    return {
        "id": index["id"],
        "attributes": [
            {
                "id": 1,
                "name": "kleur",
                "position": 0,
                "options": index["color"],
                "variations": "false",
                "visible": "true",
            },
            {
                "id": 6,
                "options": index["sizes"],
                "variations": "true",
                "visible": "true",
            },
        ],
    }
update_items = [update_from_index(index) for index in lst_of_dcts]

create new json data from values in a dictionary and nested dictionary

I have two data structures (a list containing one dictionary, and a dictionary) and I wish to create new JSON data with values from both, as follows.
dic_a = [{'name': 'puskas',
'description': 'puskas is the command center for football',
'size': '251-1K',
'revenue': '$50M-$100M',
'industryTags': ['football federation']}]
dic_b = {'page': 1,
'total': 14,
'results': [{'id': 'i01',
'name': {'fullName': 'luka modric',
'givenName': 'luka',
'familyName': 'modric'},
'role': 'leadership',
'subRole': 'ceo',
'title': 'CEO',
'company': {'name': 'puskas'},
'email': 'luka#puskas.com',
'verified': True},
{'id': 'i02',
'name': {'fullName': 'gucci mane',
'givenName': 'gucci',
'familyName': 'mane'},
'role': 'leadership',
'subRole': 'founder',
'title': 'Co-founder, CTO',
'company': {'name': 'puskas'},
'email': 'gucchi.mane#puskas.com',
'verified': True},
{'id': 'i03',
'name': {'fullName': 'tom ford',
'givenName': 'tom',
'familyName': 'ford'},
'role': 'leadership',
'subRole': 'founder',
'title': 'founder',
'company': {'name': 'puskas'},
'email': 'tomford#puskas.com',
'verified': True}]}
I want to take selected values from dic_b, append them to dic_a, then convert the result to JSON and return it as c.
I have tried a few snippets based on syntax I found here, but they don't work. I am expecting the JSON result to look like this:
json_c = [{'name': 'puskas',
'description': 'puskas is the command center for football',
'size': '251-1K',
'revenue': '$50M-$100M',
'industryTags': ['football federation'],
'leads': [{'id': 'i01',
'name': 'luka modric',
'title': 'CEO',
'company': {'name': 'puskas'},
'email': 'luka#puskas.co',
'verified': True},
{'id': 'i02',
'name': 'gucci mane',
'title': 'Co-founder, CTO',
'company': {'name': 'gucci'},
'email': 'gucchi.mane#gucci.com',
'verified': True},
{'id': 'i03',
'name': 'tom ford',
'title': 'founder',
'company': {'name': 'xyz'},
'email': 'tomford#xyz.co',
'verified': True}]}]

Such problems can be solved easily with jmespath:
import jmespath
import json
c = dic_a  # note: c is the same list object as dic_a, so dic_a is modified as well
c[0]['leads'] = jmespath.search('results[].{id:id, name:name.fullName,title:title ,company:company,email:email,verified:verified }',dic_b)
json_string = json.dumps(c, indent=4, ensure_ascii=False)
print(json_string)
# [
# {
# "name": "puskas",
# "description": "puskas is the command center for football",
# "size": "251-1K",
# "revenue": "$50M-$100M",
# "industryTags": [
# "football federation"
# ],
# "leads": [
# {
# "id": "i01",
# "name": "luka modric",
# "title": "CEO",
# "company": {
# "name": "puskas"
# },
# "email": "luka#puskas.com",
# "verified": true
# },
# {
# "id": "i02",
# "name": "gucci mane",
# "title": "Co-founder, CTO",
# "company": {
# "name": "puskas"
# },
# "email": "gucchi.mane#puskas.com",
# "verified": true
# },
# {
# "id": "i03",
# "name": "tom ford",
# "title": "founder",
# "company": {
# "name": "puskas"
# },
# "email": "tomford#puskas.com",
# "verified": true
# }
# ]
# }
# ]
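If adding a dependency is not an option, the same projection can be written with a plain comprehension. This is only a sketch: to_lead and the wanted tuple are names made up here, the input data is a trimmed copy of the question's, and deepcopy is used so that dic_a itself is left untouched.

```python
import copy
import json

dic_a = [{'name': 'puskas', 'industryTags': ['football federation']}]
dic_b = {'results': [{'id': 'i01',
                      'name': {'fullName': 'luka modric', 'givenName': 'luka'},
                      'role': 'leadership',
                      'title': 'CEO',
                      'company': {'name': 'puskas'},
                      'email': 'luka#puskas.com',
                      'verified': True}]}


def to_lead(result):
    """Keep only the wanted keys and flatten name.fullName into name."""
    wanted = ('id', 'title', 'company', 'email', 'verified')
    lead = {key: result.get(key) for key in wanted}
    lead['name'] = result['name']['fullName']
    return lead


c = copy.deepcopy(dic_a)  # work on a copy instead of aliasing dic_a
c[0]['leads'] = [to_lead(result) for result in dic_b['results']]
print(json.dumps(c, indent=4, ensure_ascii=False))
```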

How to extract the key from a dictionary and make it the primary key

My dictionary is below:
{ 'Owner': [ { 'id': '1', 'name': 'John', 'contactEmail': 'john#nif.com', 'role': 'Owner' }, { 'id': '2', 'contactName': 'Work', 'contactEmail': 'work#nif.com', 'role': 'Owner' } ], 'Manager': [ { 'id': '1', 'name': 'John', 'contactEmail': 'john#nif.com', 'role': 'Manager' } ] }
Move each id out of its record so that it becomes the key of that record
Put the resulting dictionary under a new key called 'employee'
If the same id appears under two different role keys, merge the roles into one list
For example, id=1 is present as both Owner and Manager, so the output has role: ['Owner', 'Manager']
Expected output:
{ 'employee': { '1': { 'email': 'john#nif.com', 'name': 'John', 'role': [ 'Owner', 'Manager' ] }, '2': { 'email': 'work#nif.com', 'name': 'Work', 'role': [ 'Owner' ] } } }
My attempt so far (event is the dictionary above):
emp = {}
for key,val in event.items():
    for each in val:
        # [{'employee': key, **val} for key, val in event.items()] if event else []
        emp['employee'] = each['id']
        emp['name'] = each['name']
I would like to do this using native Python only.

Here's an attempt without using a third-party library:
myd = {
'Owner': [
{ 'id': '1', 'name': 'John', 'contactEmail': 'john#nif.com', 'role': 'Owner' },
{ 'id': '2', 'contactName': 'Work', 'contactEmail': 'work#nif.com', 'role': 'Owner' }
],
'Manager': [ { 'id': '1', 'name': 'John', 'contactEmail': 'john#nif.com', 'role': 'Manager' } ]
}
empl_dict = {}
for employees in myd.values():
    for emp in employees:
        emp_id = emp.pop('id')
        emp_role = emp.pop('role')
        empl_dict[emp_id] = empl_dict.get(emp_id, {})
        empl_dict[emp_id].update(emp)
        empl_dict[emp_id]['role'] = empl_dict[emp_id].get('role', [])
        empl_dict[emp_id]['role'].append(emp_role)
all_employees = {'employee': empl_dict}
print(all_employees)
results in:
{'employee': {'1': {'name': 'John', 'contactEmail': 'john#nif.com', 'role': ['Owner', 'Manager']}, '2': {'contactName': 'Work', 'contactEmail': 'work#nif.com', 'role': ['Owner']}}}

You can use pandas to achieve this.
Convert the input to a pandas DataFrame, then group by contactEmail and aggregate the results in the required manner (a is the input dictionary):
import pandas as pd
df = pd.concat([pd.DataFrame(v).assign(key=k) for k,v in a.items()])
res = df.groupby('contactEmail').agg({'role':list,'name':'first'}).reset_index().T.to_dict()
{'employee':res}
out:
{'employee': {0: {'contactEmail': 'john#nif.com',
'role': ['Owner', 'Manager'],
'name': 'John'},
1: {'contactEmail': 'work#nif.com', 'role': ['Owner'], 'name': nan}}}
Edit: if you want to achieve this in plain Python:
for OM in a.keys():
    for ids in a[OM]:
        ids['role'] = [OM]
total_recs = sum(list(a.values()),[])
res = {}
for rec in total_recs:
    ID = rec['id']
    if ID not in res.keys():
        rec.pop('id')
        res[ID] = rec
    else:
        rec.pop('id')
        res[ID]['role'].extend(rec['role'])
{'employee':res}
Out:
{'employee': {'1': {'name': 'John',
'contactEmail': 'john#nif.com',
'role': ['Owner', 'Manager']},
'2': {'contactName': 'Work',
'contactEmail': 'work#nif.com',
'role': ['Owner']}}}
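Both plain-Python answers above pop keys from the input records, so the original dictionary is changed, and they keep the contactEmail and contactName keys rather than the email and name keys of the expected output. A non-mutating sketch that matches the expected shape could look like the following; merge_by_id is a made-up name, and treating contactName as an alternative spelling of name is an assumption based on the expected output.

```python
myd = {
    'Owner': [
        {'id': '1', 'name': 'John', 'contactEmail': 'john#nif.com', 'role': 'Owner'},
        {'id': '2', 'contactName': 'Work', 'contactEmail': 'work#nif.com', 'role': 'Owner'},
    ],
    'Manager': [
        {'id': '1', 'name': 'John', 'contactEmail': 'john#nif.com', 'role': 'Manager'},
    ],
}


def merge_by_id(roles_to_people):
    """Build {'employee': {id: record}} without modifying the input."""
    employees = {}
    for role, people in roles_to_people.items():
        for person in people:
            record = employees.setdefault(person['id'], {'role': []})
            # Assumption: 'contactName' is used when 'name' is absent.
            name = person.get('name', person.get('contactName'))
            if name is not None:
                record['name'] = name
            record['email'] = person.get('contactEmail')
            if role not in record['role']:
                record['role'].append(role)
    return {'employee': employees}


result = merge_by_id(myd)
print(result)
```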

How can I implement this recursion in python?

Let's say that I have a list of dictionaries like this:
dict1 = [{
'Name': 'Team1',
'id': '1',
'Members': [
{
'type': 'user',
'id': '11'
},
{
'type': 'user',
'id': '12'
}
]
},
{
'Name': 'Team2',
'id': '2',
'Members': [
{
'type': 'group',
'id': '1'
},
{
'type': 'user',
'id': '21'
}
]
},
{
'Name': 'Team3',
'id': '3',
'Members': [
{
'type': 'group',
'id': '2'
}
]
}]
and I want to get an output that can replace all the groups and nested groups with all distinct users.
In this case the output should look like this:
dict2 = [{
'Name': 'Team1',
'id': '1',
'Members': [
{
'type': 'user',
'id': '11'
},
{
'type': 'user',
'id': '12'
}
]
},
{
'Name': 'Team2',
'id': '2',
'Members': [
{
'type': 'user',
'id': '11'
},
{
'type': 'user',
'id': '12'
},
{
'type': 'user',
'id': '21'
}
]
},
{
'Name': 'Team3',
'id': '3',
'Members': [
{
'type': 'user',
'id': '11'
},
{
'type': 'user',
'id': '12'
},
{
'type': 'user',
'id': '21'
}
]
}]
Now let's assume that I have a large dataset to perform these actions on. (approx 20k individual groups)
What would be the best way to code this? I am attempting recursion, but I am not sure about how to search through the dictionary and lists in this manner such that it doesn't end up using too much memory

I do not think you need recursion; a loop is enough.
You can evaluate each Members list, fetch the users of every member of type group, and make the result unique. Then you can replace the value of Members with the distinct users.
You might have a dictionary for groups like:
group_dict = {
'1': [
{'type': 'user', 'id': '11'},
{'type': 'user', 'id': '12'}
],
'2': [
{'type': 'user', 'id': '11'},
{'type': 'user', 'id': '12'},
{'type': 'user', 'id': '21'}
],
'3': [
{'type': 'group', 'id': '1'},
{'type': 'group', 'id': '2'},
{'type': 'group', 'id': '3'} # recursive
]
...
}
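A lookup table like this does not have to be written by hand. A minimal sketch, assuming every element of the input list is a group and that its 'id' is what members of type group refer to (dict1 here is a trimmed copy of the question's data):

```python
dict1 = [
    {'Name': 'Team1', 'id': '1',
     'Members': [{'type': 'user', 'id': '11'}, {'type': 'user', 'id': '12'}]},
    {'Name': 'Team2', 'id': '2',
     'Members': [{'type': 'group', 'id': '1'}, {'type': 'user', 'id': '21'}]},
]

# Map each group id to its raw (not yet expanded) member list.
group_dict = {team['id']: team['Members'] for team in dict1}
print(sorted(group_dict))
```

Building the table once gives constant-time lookups by group id, which matters more than the choice between recursion and iteration when there are around 20k groups.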
You can try:
def users_in_group(group_id):
    users = []
    groups_to_fetch = []
    for user_or_group in group_dict[group_id]:
        if user_or_group['type'] == 'group':
            groups_to_fetch.append(user_or_group)
        else: # 'user' type
            users.append(user_or_group)
    groups_fetched = {group_id} # not to loop forever; the starting group is already expanded
    while groups_to_fetch:
        group = groups_to_fetch.pop()
        if group['id'] not in groups_fetched:
            groups_fetched.add(group['id'])
            for user_or_group in group_dict[group['id']]:
                if user_or_group['type'] == 'group':
                    # queue only groups that have not been expanded yet;
                    # an already fetched group must not be treated as a user
                    if user_or_group['id'] not in groups_fetched:
                        groups_to_fetch.append(user_or_group)
                else: # 'user' type
                    users.append(user_or_group)
    return users
def distinct_users_in(members):
    distinct_users = []
    def add(user):
        if user['id'] not in user_id_set:
            distinct_users.append(user)
            user_id_set.add(user['id'])
    user_id_set = set()
    for member in members:
        if member['type'] == 'group':
            for user in users_in_group(member['id']):
                add(user)
        else: # 'user'
            user = member
            add(user)
    return distinct_users
dict2 = dict1 # or `copy.deepcopy`
for element in dict2:
    element['Members'] = distinct_users_in(element['Members'])
Each Members value is replaced by the distinct users returned by distinct_users_in.
That function takes Members and, for each member of type group, fetches the users of that group; a member of type user is itself a user. As users are appended to distinct_users, their ids are recorded in a set to guarantee uniqueness.
users_in_group uses two collections: groups_to_fetch and groups_fetched. The former is a stack used to fetch, iteratively, all groups nested in a group. The latter prevents an already fetched group from being fetched again; without it, the loop could run forever.
Finally, if your data are already in memory, this approach should work without exhausting memory.

Sort and return all of nested dictionaries based on specified key value

I am trying to re-arrange the contents of a nested dictionary by checking the value of a specified key.
dict_entries = {
'entries': {
'AzP746r3Nl': {
'uniqueID': 'AzP746r3Nl',
'index': 2,
'data': {'comment': 'First Plastique Mat.',
'created': '17/01/19 10:18',
'project': 'EMZ',
'name': 'plastique_varA',
'version': '1'},
'name': 'plastique_varA',
'text': 'plastique test',
'thumbnail': '/Desktop/mat/plastique_varA/plastique_varA.jpg',
'type': 'matEntry'
},
'Q2tch2xm6h': {
'uniqueID': 'Q2tch2xm6h',
'index': 0,
'data': {'comment': 'Camino from John Inds.',
'created': '03/01/19 12:08',
'project': 'EMZ',
'name': 'camino_H10a',
'version': '1'},
'name': 'camino_H10a',
'text': 'John Inds : Camino',
'thumbnail': '/Desktop/chips/camino_H10a/camino_H10a.jpg',
'type': 'ChipEntry'
},
'ZeqCFCmHqp': {
'uniqueID': 'ZeqCFCmHqp',
'index': 1,
'data': {'comment': 'Prototype Bleu.',
'created': '03/01/19 14:07',
'project': 'EMZ',
'name': 'bleu_P23y',
'version': '1'},
'name': 'bleu_P23y',
'text': 'Bleu : Prototype',
'thumbnail': '/Desktop/chips/bleu_P23y/bleu_P23y.jpg',
'type': 'ChipEntry'
}
}
}
In the nested dictionary above, I am trying to sort by the name key and by the created key (one function for each), and once the entries have been sorted, the index values should be updated accordingly as well.
So far, I am able to query the values of the said key(s):
for item in dict_entries.get('entries').values():
    # The key that I am targeting
    tar_key = item['name']
but this only gives me the value of the name key, and I am unsure of my next step: sorting by the value of the name key while capturing and re-arranging all the contents of the nested dictionaries.
This is my desired output (if checking by name):
{'entries': {
'ZeqCFCmHqp': {
'uniqueID': 'ZeqCFCmHqp',
'index': 1,
'data': {'comment': 'Prototype Bleu.',
'created': '03/01/19 14:07',
'project': 'EMZ',
'name': 'bleu_P23y',
'version': '1'},
'name': 'bleu_P23y',
'text': 'Bleu : Prototype',
'thumbnail': '/Desktop/chips/bleu_P23y/bleu_P23y.jpg',
'type': 'ChipEntry'
},
'Q2tch2xm6h': {
'uniqueID': 'Q2tch2xm6h',
'index': 0,
'data': {'comment': 'Camino from John Inds.',
'created': '03/01/19 12:08',
'project': 'EMZ',
'name': 'camino_H10a',
'version': '1'},
'name': 'camino_H10a',
'text': 'John Inds : Camino',
'thumbnail': '/Desktop/chips/camino_H10a/camino_H10a.jpg',
'type': 'ChipEntry'
},
'AzP746r3Nl': {
'uniqueID': 'AzP746r3Nl',
'index': 2,
'data': {'comment': 'First Plastique Mat.',
'created': '17/01/19 10:18',
'project': 'EMZ',
'name': 'plastique_varA',
'version': '1'},
'name': 'plastique_varA',
'text': 'plastique test',
'thumbnail': '/Desktop/mat/plastique_varA/plastique_varA.jpg',
'type': 'matEntry'
}
}
}
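One way to approach this is to sort the (key, entry) pairs and rebuild the dictionary in that order, since dictionaries preserve insertion order from Python 3.7 onwards. The sketch below is only an illustration: sort_entries, by_name and by_created are made-up names, the sample is a trimmed copy of the question's data, the date format is assumed to be day/month/two-digit year, and index is rewritten to the new position as the text of the question asks (the desired output shown above keeps the old index values).

```python
from datetime import datetime


def sort_entries(dict_entries, key_func):
    """Return a new dict with entries ordered by key_func and re-indexed."""
    ordered = sorted(dict_entries['entries'].items(),
                     key=lambda pair: key_func(pair[1]))
    entries = {}
    for position, (unique_id, entry) in enumerate(ordered):
        # Shallow copy of the entry with its index set to the new position.
        entries[unique_id] = dict(entry, index=position)
    return {'entries': entries}


def by_name(entry):
    return entry['name']


def by_created(entry):
    # Assumption: 'created' looks like '17/01/19 10:18' (dd/mm/yy HH:MM).
    return datetime.strptime(entry['data']['created'], '%d/%m/%y %H:%M')


sample = {'entries': {
    'AzP746r3Nl': {'name': 'plastique_varA', 'index': 2,
                   'data': {'created': '17/01/19 10:18'}},
    'Q2tch2xm6h': {'name': 'camino_H10a', 'index': 0,
                   'data': {'created': '03/01/19 12:08'}},
    'ZeqCFCmHqp': {'name': 'bleu_P23y', 'index': 1,
                   'data': {'created': '03/01/19 14:07'}},
}}

sorted_by_name = sort_entries(sample, by_name)
sorted_by_created = sort_entries(sample, by_created)
print(list(sorted_by_name['entries']))
print(list(sorted_by_created['entries']))
```

Parsing created into a datetime matters here: comparing the raw strings would order entries by day of the month first, which gives the wrong result across months.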
