Manipulating data structures in Python - python

I have data in JSON format:
data = {"outfit":{"shirt":"red,"pants":{"jeans":"blue","trousers":"khaki"}}}
I'm attempting to plot this data into a decision tree using InfoVis, because it looks pretty and interactive. The problem is that their graph takes JSON data in this format:
data = {id:"nodeOutfit",
name:"outfit",
data:{},
children:[{
id:"nodeShirt",
name:"shirt",
data:{},
children:[{
id:"nodeRed",
name:"red",
data:{},
children:[]
}],
}, {
id:"nodePants",
name:"pants",
data:{},
children:[{
id:"nodeJeans",
name:"jeans",
data:{},
children:[{
id:"nodeBlue",
name:"blue",
data:{},
children[]
},{
id:"nodeTrousers",
name:"trousers",
data:{},
children:[{
id:"nodeKhaki",
name:"khaki",
data:{},
children:[]
}
}
Note the addition of 'id', 'data' and 'children' to every key and value and calling every key and value 'name'. I feel like I have to write a recursive function to add these extra values. Is there an easy way to do this?
Here's what I want to do but I'm not sure if it's the right way. Loop through all the keys and values and replace them with the appropriate:
for name, list in data.iteritems():
for dict in list:
for key, value in dict.items():
#Need something here which changes the value for each key and values
#Not sure about the syntax to change "outfit" to name:"outfit" as well as
#adding id:"nodeOutfit", data:{}, and 'children' before the value
Let me know if I'm way off.
Here is their example http://philogb.github.com/jit/static/v20/Jit/Examples/Spacetree/example1.html
And here's the data http://philogb.github.com/jit/static/v20/Jit/Examples/Spacetree/example1.code.html

A simple recursive solution:
data = {"outfit":{"shirt":"red","pants":{"jeans":"blue","trousers":"khaki"}}}
import json
from collections import OrderedDict
def node(name, children):
n = OrderedDict()
n['id'] = 'node' + name.capitalize()
n['name'] = name
n['data'] = {}
n['children'] = children
return n
def convert(d):
if type(d) == dict:
return [node(k, convert(v)) for k, v in d.items()]
else:
return [node(d, [])]
print(json.dumps(convert(data), indent=True))
note that convert returns a list, not a dict, as data could also have more then one key then just 'outfit'.
output:
[
{
"id": "nodeOutfit",
"name": "outfit",
"data": {},
"children": [
{
"id": "nodeShirt",
"name": "shirt",
"data": {},
"children": [
{
"id": "nodeRed",
"name": "red",
"data": {},
"children": []
}
]
},
{
"id": "nodePants",
"name": "pants",
"data": {},
"children": [
{
"id": "nodeJeans",
"name": "jeans",
"data": {},
"children": [
{
"id": "nodeBlue",
"name": "blue",
"data": {},
"children": []
}
]
},
{
"id": "nodeTrousers",
"name": "trousers",
"data": {},
"children": [
{
"id": "nodeKhaki",
"name": "khaki",
"data": {},
"children": []
}
]
}
]
}
]
}
]

Related

Yaql expression. How can I make two list into json object

I m trying to form json object from two list vn1","vn2","vn3"] and [6,4,5] using below yaql expression
yaql> dict(data=>dict(["name","id"].zip(["vn1","vn2","vn3"],[6,4,5])))
{
"data": {
"name": "vn1",
"id": "vn2"
}
}
I would like below output
{
"data": [
{
"name": "vn1",
"id": 6
},
{
"name": "vn2",
"id": 4
},
{
"name": "vn3",
"id": 5
}
] }
Your json object resembles a dictionary of lists:
# dictionary of empty list
json_dict = { 'data': [] }
label = ['name', 'id']
v = ["vn1","vn2","vn3"]
k = [6,4,5]
# for each value
for i in range(len(v)):
# add each list entry as dictionarys keys, values
json_dict['data'].append({ label[0] : v[i], label[1]: k[i] })
check it matches your needed json object:
json_object = { "data": [ { "name": "vn1", "id": 6 }, { "name": "vn2", "id": 4 }, { "name": "vn3", "id": 5 } ] }
>>> json_dict == json_object
True

Find a value in a list of dictionaries

I have the following list:
{
"id":1,
"name":"John",
"status":2,
"custom_attributes":[
{
"attribute_code":"address",
"value":"st"
},
{
"attribute_code":"city",
"value":"st"
},
{
"attribute_code":"job",
"value":"test"
}]
}
I need to get the value from the attribute_code that is equal city
I've tried this code:
if list["custom_attributes"]["attribute_code"] == "city" in list:
var = list["value"]
But this gives me the following error:
TypeError: list indices must be integers or slices, not str
What i'm doing wrong here? I've read this solution and this solution but din't understood how to access each value.
Another solution, using next():
dct = {
"id": 1,
"name": "John",
"status": 2,
"custom_attributes": [
{"attribute_code": "address", "value": "st"},
{"attribute_code": "city", "value": "st"},
{"attribute_code": "job", "value": "test"},
],
}
val = next(d["value"] for d in dct["custom_attributes"] if d["attribute_code"] == "city")
print(val)
Prints:
st
Your data is a dict not a list.
You need to scan the attributes according the criteria you mentioned.
See below:
data = {
"id": 1,
"name": "John",
"status": 2,
"custom_attributes": [
{
"attribute_code": "address",
"value": "st"
},
{
"attribute_code": "city",
"value": "st"
},
{
"attribute_code": "job",
"value": "test"
}]
}
for attr in data['custom_attributes']:
if attr['attribute_code'] == 'city':
print(attr['value'])
break
output
st

Is there more effective way to get result (O(n+m) rather than O(n*m))?

Origin data as below show, every item has a type mark, such as interests, family, behaviors, etc and I want to group by this type field.
return_data = [
{
"id": "112",
"name": "name_112",
"type": "interests",
},
{
"id": "113",
"name": "name_113",
"type": "interests",
},
{
"id": "114",
"name": "name_114",
"type": "interests",
},
{
"id": "115",
"name": "name_115",
"type": "behaviors",
},
{
"id": "116",
"name": "name_116",
"type": "family",
},
{
"id": "117",
"name": "name_117",
"type": "interests",
},
...
]
And expected ouput data format like:
output_data = [
{"interests":[
{
"id": "112",
"name": "name_112"
},
{
"id": "113",
"name": "name_113"
},
...
]
},
{
"behaviors": [
{
"id": "115",
"name": "name_115"
},
...
]
},
{
"family": [
{
"id": "116",
"name": "name_116"
},
...
]
},
...
]
And here is my trial:
type_list = []
for item in return_data:
if item['type'] not in type_list:
type_list.append(item['type'])
interests_list = []
for type in type_list:
temp_list = []
for item in return_data:
if item['type'] == type:
temp_list.append({"id": item['id'], "name": item['name']})
interests_list.append({type: temp_list})
Obviously my trial is low efficient as it is O(n*m), but I cannot find the more effective way to solve the problem.
Is there more effective way to get the result? any commentary is great welcome, thanks.
Use a defaultdict to store a list of items for each type:
from collections import defaultdict
# group by type
temp_dict = defaultdict(list)
for item in return_data:
temp_dict[item["type"]].append({"id": item["id"], "name": item["name"]})
# convert back into a list with the desired format
output_data = [{k: v} for k, v in temp_dict.items()]
Output:
[
{
'behaviors': [
{'name': 'name_115', 'id': '115'}
]
},
{
'family': [
{'name': 'name_116', 'id': '116'}
]
},
{
'interests': [
{'name': 'name_112', 'id': '112'},
{'name': 'name_113', 'id': '113'},
{'name': 'name_114', 'id': '114'},
{'name': 'name_117', 'id': '117'}
]
},
...
]
If you don't want to import defaultdict, you could use a vanilla dictionary with setdefault:
# temp_dict = {}
temp_dict.setdefault(item["type"], []).append(...)
Behaves in exactly the same way, if a little less efficient.
please see Python dictionary for map.
for item in return_data:
typeMap[item['type']] = typeMap[item['type']] + delimiter + item['name']

Group and sort JSON array of dictionaries by repeatable keys in Python

I have a json that is a list of dictionaries that looks like this:
I am getting it from MySQL with pymysql
[{
"id": "123",
"name": "test",
"group": "test_group"
},
{
"id": "123",
"name": "test",
"group": "test2_group"
},
{
"id": "456",
"name": "test2",
"group": "test_group2"
},
{
"id": "456",
"name": "test2",
"group": "test_group3"
}]
I need to group it so each "name" will have just one dict and it will contain a list of all groups that under this name.
something like this :
[{
"id": "123",
"name": "test",
"group": ["test2_group", "test_group"]
},
{
"id": "456",
"name": "test2",
"group": ["test_group2", "test_group3"]
}]
I would like to get some help,
Thanks !
You can use itertools.groupby for grouping of data.
Although I don't guarantee solution below to be shortest way but it should do the work.
# Your input data
data = []
from itertools import groupby
res = []
key_func = lambda k: k['id']
for k, g in groupby(sorted(data, key=key_func), key=key_func):
obj = { 'id': k, 'name': '', 'group': []}
for group in g:
if not obj['name']:
obj['name'] = group['name']
obj['group'].append(group['group'])
res.append(obj)
print(res)
It should print the data in required format.

Python Parsing multiple JSON objects into one object

I currently have a json file that looks like this....
{
"data": [
{
"tag": "cashandequivalents",
"value": 10027000000.0
},
{
"tag": "shortterminvestments",
"value": 101000000.0
},
{
"tag": "accountsreceivable",
"value": 4635000000.0
},
{
"tag": "netinventory",
"value": 1386000000.0
}...
but what I am trying to get to is this
{
"cashandequivalents": 10027000000.0,
"shortterminvestments":101000000.0 ,
"accountsreceivable":4635000000.0,
"netinventory":1386000000.0
}
I just don't know how to go about this.
Maybe there is an easier way, but this seems the most logical to me because the next step is writer.writerow to csv
So eventually the csv will look like
cashandequivalents | shortterminvestments | accountsreceivable | netinventory
100027000000 101000000000 46350000000 13860000000
########### ############ ########### ...........
(writer.writeheader will be done outside of the loop so I am only writing the values, not the "tags")
Thanks
A naive solution:
import json
json_data = {
"data": [
{
"tag": "cashandequivalents",
"value": 10027000000.0
},
{
"tag": "shortterminvestments",
"value": 101000000.0
},
{
"tag": "accountsreceivable",
"value": 4635000000.0
},
{
"tag": "netinventory",
"value": 1386000000.0
}
]
}
result = dict()
for entry in json_data['data']:
result[entry['tag']] = entry['value']
print json.dumps(result, indent=4)
Output
{
"shortterminvestments": 101000000.0,
"netinventory": 1386000000.0,
"accountsreceivable": 4635000000.0,
"cashandequivalents": 10027000000.0
}
The easiest and cleanest way to do this is with a dictionary comprehension.
d = {
"data": [
{
"tag": "cashandequivalents",
"value": 10027000000.0
},
{
"tag": "shortterminvestments",
"value": 101000000.0
},
{
"tag": "accountsreceivable",
"value": 4635000000.0
},
{
"tag": "netinventory",
"value": 1386000000.0
}
]
}
newDict = {i['tag']: i['value'] for i in d['data']}
# {'netinventory': 1386000000.0, 'shortterminvestments': 101000000.0, 'accountsreceivable': 4635000000.0, 'cashandequivalents': 10027000000.0}
This iterates through the list that is contained within the "data" key of your original dictionary and creates a new one inline with the key being the tag value of each and the value being the value for each during the iterations.

Categories