building json tree from list of file paths - python

I have a list of file paths and need to organize them into a tree structure like the following:
{
  "label": "VP Accounting",
  "children": [
    {
      "label": "iWay",
      "children": [
        {
          "label": "Universidad de Especialidades del Espíritu Santo"
        },
        {
          "label": "Marmara University"
        },
        {
          "label": "Baghdad College of Pharmacy"
        }
      ]
    },
    {
      "label": "KDB",
      "children": [
        {
          "label": "Latvian University of Agriculture"
        },
        {
          "label": "Dublin Institute of Technology"
        }
      ]
    },
What I have done so far is the following:
output = {}
current = {}
for path in paths:
    current = output
    for segment in path.split("/"):
        if segment != '':
            if segment not in current:
                current[segment] = {}
            current = current[segment]
The output is a tree-like structure, but I cannot work out how to add the keys ["label", "children"].

The idea is to keep the dictionary that your code builds, but only as an auxiliary lookup structure (called helper below, and simplified to a flat dictionary keyed by the path so far), and to build the desired structure, with its children lists, in parallel.
Note that the top level should really be a list, as it is not guaranteed that all entries start with the same folder ("segment").
Here is your code adapted to create the children lists and add the labels:
output = []
root = { "children": output }
helper = {}
for path in paths:
    current = root
    subpath = ""
    for segment in path.split("/"):
        if segment == "":
            continue  # skip empty segments (leading slash), as the question's code does
        if "children" not in current:
            current["children"] = []
        subpath += "/" + segment
        if subpath not in helper:
            # first time this path is seen: create the node and attach it to its parent
            helper[subpath] = { "label": segment }
            current["children"].append(helper[subpath])
        current = helper[subpath]
print(output)
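As an alternative sketch (not part of the answer above), the nested dictionary from the question can also be kept as is and converted in a second, recursive pass. The function names build_nested and to_label_tree and the two sample paths are made up for illustration; the labels are borrowed from the question.

```python
import json

def build_nested(paths):
    """Build the dict-of-dicts from the question: one key per path segment."""
    nested = {}
    for path in paths:
        current = nested
        for segment in path.split("/"):
            if segment:  # skip empty segments from leading or doubled slashes
                current = current.setdefault(segment, {})
    return nested

def to_label_tree(nested):
    """Convert {name: subtree} into a list of {"label", "children"} nodes."""
    nodes = []
    for name, subtree in nested.items():
        node = {"label": name}
        if subtree:  # only nodes with descendants get a "children" key
            node["children"] = to_label_tree(subtree)
        nodes.append(node)
    return nodes

# Hypothetical sample paths, reusing labels from the question
paths = [
    "/VP Accounting/iWay/Marmara University",
    "/VP Accounting/KDB/Dublin Institute of Technology",
]
tree = to_label_tree(build_nested(paths))
print(json.dumps(tree, indent=2))
```

The top level is again a list, so paths that start in different folders simply become separate entries.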

Related

convert list array to json

I'm working on taking a JSON feed and filtering out only the items I want from my list. I'm appending the items I'd like to keep to each list identifier. However, when I convert to JSON the output is incorrect. You can see the ACTUAL OUTPUT example below. The target output below is what I'm actually expecting. I've tried orienting the list with index and records, but no luck.
#TARGET OUTPUT
{
  "id":"1",
  "Name":"xxx",
  "Image":"https://xxx.xxx.png",
},
{
  "id":"2",
  "Name":"xx2",
  "Image":"https://xx2.xxx.png",
}
#ACTUAL OUTPUT
{
  "id": ["1","2",]
},
{
  "image":["https://xxx.xxx.png","https://xx2.xxx.png"]
},
{
  "name":["xxx", "xx2"]
},
#CODE
# JSON feed
{
  "document": {
    "id": "1",
    "image": "https://xxx.xxx.png",
    "name": "xxx",
  },
},
{
  "document": {
    "id": "2",
    "image": "https://xx2.xxx.png",
    "name": "xx2",
  },
},
import json

# create list array
list = {'id':[], 'Name': [], 'Image': []}
links = {'id': [], 'Image': []}
# loop through and append items
def getData(hits):
    for item in filter(None, hits):
        item = item['document']
        list['id'].append(item['id'])
        links['id'].append(item['id'])
        links['Image'].append(item['image'])
        list['Image'].append(item['image'])
        list['Name'].append(item['name'])
# get first page
pageNum = 1
data = getDataPerPage(pageNum)
try:
    itemsNo = data['found']
    getData(data['hits'])
    while itemsNo > 24:
        itemsNo -= 24
        pageNum += 1
        data = getDataPerPage(pageNum)
        getData(data['hits'])
except:
    print("broken")
# save list to json
with open('./output/data_chart.json', 'w') as f:
    f.write(json.dumps(list))
When you receive multiple JSON objects, they come in the form of a list (that is, between []). You could:
1. convert the JSON string to a Python list of dictionaries using json.loads()
2. filter using the dictionaries
3. dump the result into a JSON string using json.dumps()
import json

input = """[
{"document":
{"id": "1","image": "https://xxx.xxx.png","name": "xxx"}},
{"document":
{"id": "2","image": "https://xx2.xxx.png","name": "xx2"}}
]"""
input_dic = json.loads(input)
tmp = []
for item in input_dic:
    # append the dict itself; calling json.dumps() here as well would double-encode it
    tmp.append(item["document"])
output = json.dumps(tmp)
print(output)
Hope I got your question.
It's not 100% clear what you have or what you want, but with a few assumptions (input is list of dict, desired output is list of dict):
json_obj = [
    {
        "document": {
            "id": "1",
            "image": "https://xxx.xxx.png",
            "name": "xxx",
        },
    },
    {
        "document": {
            "id": "2",
            "image": "https://xx2.xxx.png",
            "name": "xx2",
        },
    },
]
desired_output = [x["document"] for x in json_obj]
print(desired_output)
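If the data has already been collected into a dictionary of lists, as the question's getData() does, it can still be turned into one object per item afterwards. The following is a minimal sketch under that assumption; the names columns and records are made up for illustration.

```python
import json

# Column-oriented data, shaped like the dictionary the question's loop fills
columns = {
    "id": ["1", "2"],
    "Name": ["xxx", "xx2"],
    "Image": ["https://xxx.xxx.png", "https://xx2.xxx.png"],
}

# zip(*columns.values()) yields one tuple per item; pair each tuple with the keys
records = [dict(zip(columns, row)) for row in zip(*columns.values())]

print(json.dumps(records, indent=2))
```

This relies on all lists having the same length; zip() silently stops at the shortest one.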

Decision tree in JSON - return leaf to root path for a given leaf

I have a decision tree in JSON format (about 11k nodes) and need a function that returns the path from a given leaf to the root.
Before coding this from scratch, do you know whether any existing Python code for JSON trees returns such a path?
For example, if "Predict: 59.0" is my leaf, when I run getMyPath("Predict: 59.0") I'd like to receive something like below:
{
  "name": "Root",
  "children": [
    {
      "name": "x<= 0.09",
      "children": [
        {
          "name": "y<= 281.0",
          "children": [
            {
              "name": "z<= 217.75400000000002",
              "children": [
                {
                  "name": "z<= -0.01",
                  "children": [
                    {
                      "name": "z<= -64.83",
                      "children": [
                        {
                          "name": "Predict: 59.0"
Thanks, Michal
Nope, you'll have to write a recursive function to look through each and every path of the decision tree structure.
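A minimal sketch of such a recursive function, assuming every node is a dict with a "name" key and an optional "children" list as in the question. The name path_to_leaf and the small sample tree are made up for illustration, and the function returns the first match it finds if several leaves share a name.

```python
def path_to_leaf(node, target, name_key="name", children_key="children"):
    """Return the list of names from the root down to `target`, or None."""
    if node.get(name_key) == target:
        return [node[name_key]]
    for child in node.get(children_key, []):
        sub_path = path_to_leaf(child, target, name_key, children_key)
        if sub_path is not None:
            return [node[name_key]] + sub_path
    return None  # target is not in this subtree

# Small hypothetical tree in the same shape as the question's JSON
tree = {
    "name": "Root",
    "children": [
        {"name": "x<= 0.09", "children": [
            {"name": "y<= 281.0", "children": [{"name": "Predict: 59.0"}]},
            {"name": "y> 281.0", "children": [{"name": "Predict: 12.0"}]},
        ]},
    ],
}

root_to_leaf = path_to_leaf(tree, "Predict: 59.0")
leaf_to_root = root_to_leaf[::-1]  # reverse to read from the leaf up to the root
print(leaf_to_root)
```

The recursion depth equals the depth of the tree, not the number of nodes, so 11k nodes should only be a problem for Python's default recursion limit if the tree is extremely deep.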

convert file path list to tree

There is a Python list of file paths like the one below:
file_path_list = ["test/dir1/log.txt", "test/dir1/dir2/server.txt", "test/manage/img.txt"]
I want to convert it to a tree. The expected result is below:
tree_data = [
    {
        "path": "test",
        "children": [
            {
                "path": "dir1",
                "children": [
                    {
                        "path": "log.txt"
                    },
                    {
                        "path": "dir2",
                        "children": [
                            {
                                "path": "server.txt"
                            }
                        ]
                    }
                ]
            },
            {
                "path": "manage",
                "children": [
                    {
                        "path": "img.txt",
                    }
                ]
            }
        ]
    }
]
What's the best way to convert?
Update: my code is below, but I don't think it is very good.
def list2tree(file_path):
    """Convert list to tree."""
    tree_data = [{
        "path": "root",
        "children": []
    }]
    for f in file_path:
        node_path = tree_data[0]
        pathes = f.split("/")
        for i, p in enumerate(pathes):
            length = len(node_path["children"])
            if not length or node_path["children"][length - 1]["path"] != p:
                # create new node
                new_node = {
                    "path": p,
                }
                if i != len(pathes) - 1:  # middle path
                    new_node["children"] = list()
                node_path["children"].append(new_node)
                node_path = new_node
            else:
                node_path = node_path["children"][length - 1]
    return tree_data
I don't think this is the best way. Any ideas? Thank you very much!
One way is to split the strings at '/' and put the parts into a defaultdict of defaultdicts; see the question "defaultdict of defaultdict, nested".
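A minimal sketch of that idea, assuming a second pass is acceptable to turn the nested defaultdict into the list-of-nodes shape from the question. The helper names make_tree and to_nodes are made up for illustration.

```python
from collections import defaultdict

def make_tree():
    """A defaultdict whose missing keys are themselves such defaultdicts."""
    return defaultdict(make_tree)

def to_nodes(tree):
    """Convert the nested defaultdict into a list of {"path", "children"} nodes."""
    nodes = []
    for name, subtree in tree.items():
        node = {"path": name}
        if subtree:  # entries without descendants are treated as files
            node["children"] = to_nodes(subtree)
        nodes.append(node)
    return nodes

file_path_list = ["test/dir1/log.txt", "test/dir1/dir2/server.txt", "test/manage/img.txt"]

root = make_tree()
for file_path in file_path_list:
    node = root
    for part in file_path.split("/"):
        node = node[part]  # the entry is created on first access

tree_data = to_nodes(root)
print(tree_data)
```

Unlike the code in the question, this does not depend on the input being sorted, because each directory is looked up by name rather than compared with the last child added.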

Build a graph recursively - breadth first

I want to build a "graph" in the following style:
{
  "name": "cersei",
  "children": [
    {
      "name": "baratheon",
      "children": [
        {
          "name": "cersei",
          "children": []
        },
        {
          "name": "baratheon",
          "children": [],
        }
      ],
    },
    {
      "name": "joffrey",
      "children": [
        {
          "name": "robert",
          "children": []
        },
        {
          "name": "cersei",
          "children": []
        }
      ]
    }
  ]
}
But I build this depth-first. That means the first element of "children" is fully built before the second element of "children" is built. This is the recursive function:
def recurse(dicts, depth):
    if depth >= 0:
        dicts["children"] = []
        child_elements = []  # do something to get your child elements
        for child in child_elements:
            if depth >= 0:
                child_dict = dict(name=word[0])
                dicts["children"].append(child_dict)
                recurse(child_dict, depth-1)
How can I change the code so that it builds the whole "level" first and appends the children's children later? My problem is that I don't know how to reference the level-1 dictionary, because it is a dictionary nested inside another dictionary...
Kind regards and thanks for your help, FFoDWindow.
--------------------****UPDATE****---------------------------
I solved the issue myself. Actually it was pretty simple. I only had to save the temporarily built "children" elements in an extra list. Now the tree is built breadth-first. Here is my recursive function:
def recurse(input_dict, level_list, depth):
    next_level_list = []
    for dictionary in level_list:  # iterate over the nodes of the current level
        child_elements = [...]  # get the data for your children
        for child_element in child_elements:
            dictionary["children"].append(dict(name=child_element, children=[]))
            next_level_list.append(dictionary["children"][-1])
    if depth >= 0:
        recurse(input_dict, next_level_list, depth-1)
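The same level-by-level construction can also be written without recursion by using a queue. This is a minimal sketch, not the poster's code: build_breadth_first, get_children, and the relations lookup table are hypothetical stand-ins for however the child elements are actually obtained.

```python
from collections import deque

def build_breadth_first(root_name, get_children, max_depth):
    """Build a {"name", "children"} tree one level at a time."""
    root = {"name": root_name, "children": []}
    queue = deque([(root, 0)])  # pairs of (node, depth of that node)
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:
            continue  # nodes on the last level are not expanded
        for child_name in get_children(node["name"]):
            child = {"name": child_name, "children": []}
            node["children"].append(child)
            queue.append((child, depth + 1))
    return root

# Hypothetical lookup table standing in for "get your child elements"
relations = {
    "cersei": ["baratheon", "joffrey"],
    "baratheon": ["cersei", "baratheon"],
    "joffrey": ["robert", "cersei"],
}

tree = build_breadth_first("cersei", lambda name: relations.get(name, []), max_depth=2)
print(tree)
```

Because the queue is first-in, first-out, every node of one level is expanded before any node of the next level, which is the breadth-first order the question asks for.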

How to create a tree in NetworkX and display it in D3.js

I have to write a piece of code that takes some data from MySQL and uses it to make a graph in D3.js. My D3.js code works when I give it static data in this format:
{
"name": " ",
"children": [{
"name": "HSC",
"size": 0.20,
"children": [
[{
"name": "MPP",
"size": 15,
"children": [{
"name": "CMP",
"size": 8,
"children": [{
"name": "MEP",
"size": 6
}, {
"name": "GMP",
"size": 10,
"children": [{
"name": "early PM",
"size": 2,
"children": [{
"name": "early PM",
"size": 2,
"children": []
}]
}]
}]
}]
}]
},
{
"name": "AML",
"size": 1,
"children": [{
"name": "AML t(8,21)",
"size": 30
}, {
"name": "AML t(11q23)",
"size": 10
}, {
"name": "AML inv(16)",
"size": 8
}, {
"name": "AML t(15,17)",
"size": 11
}]
}
], "size": 1
}
So I need Python code that can build a JSON structure like the one above using NetworkX.
My code so far:
import networkx as nx

n = 1  # the number of children for each node
depth = 1  # number of levels, starting from 0
G = nx.Graph()
G.add_node(1)  # initialize root
ulim = 0
for level, row in enumerate(rows):  # each row contains a name and a size
    print(row)  # loop over each level
    nl = n**level  # number of nodes on a given level
    llim = ulim + 1  # index of first node on a given level
    ulim = ulim + nl  # index of last node on a given level
    for i in range(nl):  # loop over nodes (parents) on a given level
        parent = llim + i
        offset = ulim + i * n + 1  # index pointing to node just before first child
        for j in range(n):  # loop over children for a given node (parent)
            child = offset + j
            G.add_node(child)
            G.add_edge(parent, child)
Since you are using NetworkX, you can call the json_graph.tree_data() function. Note that it expects a directed tree, so build the graph as nx.DiGraph rather than nx.Graph.
See the documentation here: http://networkx.lanl.gov/reference/generated/networkx.readwrite.json_graph.tree_data.html#networkx.readwrite.json_graph.tree_data
which has the example
>>> import networkx as nx
>>> from networkx.readwrite import json_graph
>>> G = nx.DiGraph([(1,2)])
>>> data = json_graph.tree_data(G,root=1)
