How to extract data from complex JSON object? - python

I am trying to extract data from the json file I got from a get request.
{
"data": [
{
"type": "Projects",
"id": "102777c7-50a7-592d-1b65-621d5850a5bb",
"attributes": {
"name": "Hydroelectric Project Updated from Postman",
"projectid": "001"
},
"relationships": {
"Accounts": "Account1"
"Notes": "Note1"
}
},
{
"type": "Projects",
"id": "102c7131-d797-c085-d248-621d5820494f",
"attributes": {
"name": "Ana Hydroelectric Project",
"projectid": "002"
},
"relationships": {
"Accounts": "Account1"
"Notes": "Note1"
}
},
{
"type": "Projects",
"id": "1041f300-5acf-4bd9-2ec4-621d58bbe6bc",
"attributes": {
"name": "Methane Capture Project",
"projectid": "003"
},
"relationships": {
"Accounts": "Account1"
"Notes": "Note1"
}
}
]
}
I have an empty dictionary that stores projectid as Key.
projectids = {
001:"",
002:"",
003:"",
004:"",
}
I was looking for a way to find "projectid" inside "attributes" and the corresponding value for "id" and populate the dictionary projectids with the key(['attributes']['projectid']) and values(id):
{
"001": "102777c7-50a7-592d-1b65-621d5850a5bb",
"002": "102c7131-d797-c085-d248-621d5820494f",
"003": "1041f300-5acf-4bd9-2ec4-621d58bbe6bc",
"004": ""
}

You can try this, assuming data is your variable for the response from the GET request
# this solution will populate for all project ids
projectids = {}
for item in data['data']:
projectids[item['attributes']['projectid']] = item['id']
Output:
{
'001': '102777c7-50a7-592d-1b65-621d5850a5bb',
'002': '102c7131-d797-c085-d248-621d5820494f',
'003': '1041f300-5acf-4bd9-2ec4-621d58bbe6bc'
}
if you're trying to match with already existing projectids in a dict then try
# this solution will search for only pre-specified project ids
projectids = {
"001": "",
"002": "",
"003": "",
"004": "",
}
for idx in projectids.keys():
# find the index of matching dict from data['data']
# will return None if match is not found
matching_index = next((i for i, item in enumerate(data['data']) if
item["attributes"]["projectid"] == idx), None)
if matching_index is not None:
projectids[idx] = data['data'][matching_index]['id']

If data is your input data from the question, then:
projectids = {f"{i:>03}": "" for i in range(1, 5)}
out = {
**projectids,
**{d["attributes"]["projectid"]: d["id"] for d in data["data"]},
}
print(out)
Prints:
{
"001": "102777c7-50a7-592d-1b65-621d5850a5bb",
"002": "102c7131-d797-c085-d248-621d5820494f",
"003": "1041f300-5acf-4bd9-2ec4-621d58bbe6bc",
"004": "",
}

Simply try this:
json_data = {
"data": [
{
"type": "Projects",
"id": "102777c7-50a7-592d-1b65-621d5850a5bb",
"attributes": {
"name": "Hydroelectric Project Updated from Postman",
"projectid": "001"
},
"relationships": {
"Accounts": "Account1",
"Notes": "Note1"
}
},
{
"type": "Projects",
"id": "102c7131-d797-c085-d248-621d5820494f",
"attributes": {
"name": "Ana Hydroelectric Project",
"projectid": "002"
},
"relationships": {
"Accounts": "Account1",
"Notes": "Note1"
}
},
{
"type": "Projects",
"id": "1041f300-5acf-4bd9-2ec4-621d58bbe6bc",
"attributes": {
"name": "Methane Capture Project",
"projectid": "003"
},
"relationships": {
"Accounts": "Account1",
"Notes": "Note1"
}
}
]
}
Just asumme the above json data and try the following code:
project_ids = {item['attributes']['projectid']:item['id'] for item in json_data['data']}
expected output:
{'001': '102777c7-50a7-592d-1b65-621d5850a5bb',
'002': '102c7131-d797-c085-d248-621d5820494f',
'003': '1041f300-5acf-4bd9-2ec4-621d58bbe6bc'}

Related

A Python script that can navigate a .json and export a .csv based on a search term

I want to take a .json of "_PRESET..." items and their "code-state"s with "actions" that contain other "code-state"s, "appearance"s, and "switch"s and turn it into .csv produced from the actions under a given "_PRESET...", including the "code-state"s and the "actions" listed under their individual entries.
This would allow a user to enter the "_PRESET..." name and receive a 3-column .csv file containing each action's "type", name, and "value". There are of course ways to export the entire .json easily, but I can't fathom a way to navigate it like is needed.
enters "_PRESET_Config_A" for
input.json:
{
"abc_data": {
"_PRESET_Config_A": {
"properties": {
"category": "configuration",
"name": "_PRESET_Config_A",
"collection": null,
"description": ""
},
"actions": {
"EN-R9": {
"type": "code_state",
"value": "on"
}
}
},
"PN4FP": {
"properties": {
"category": "uncategorized",
"name": "PN4FP",
"collection": null,
"description": ""
},
"actions": {
"E_xxxxxx_Default": {
"type": "appearance",
"value": "M_Red"
}
}
},
"HEDIS": {
"properties": {
"category": "uncategorized",
"name": "HEDIS",
"collection": null,
"description": ""
},
"actions": {
"E_xxxxxx_Default": {
"type": "appearance",
"value": "M_Purple"
}
}
},
"_PRESET_Config_B": {
"properties": {
"category": "configuration",
"name": "_PRESET_Config_A",
"collection": null,
"description": ""
},
"actions": {
"HEDIS": {
"type": "code_state",
"value": "on"
}
}
},
"EN-R9": {
"properties": {
"category": "uncategorized",
"name": "EN-R9",
"collection": null,
"description": ""
},
"actions": {
"PN4FP": {
"type": "code_state",
"value": "on"
},
"switch_StorageBin": {
"type": "switch",
"value": "00_w_Storage_Bin_R9"
}
}
}
}
}
Desired output.csv
type,name,value
code_state,EN-R9,on
code_state,PN4FP,on
appearance,E_xxxxxx_Default,M_Red
switch,switch_StorageBin,00_w_Storage_Bin_R9

Python how to pick 3rd occurence in nested json array

I am working with one of my requirement
My requirement: I need to pick and print only 3rd "id" from "syrap" list from the nested json file. I am not getting desired output. Any help will be appreciated.
Test file:
{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters":
{ "process": "abc",
"mix": "0303",
"syrap":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devil's Food" }
]
},
"rate": 0.55,
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" },
{ "id": "5007", "type": "Powdered Sugar" },
{ "id": "5006", "type": "Chocolate with Sprinkles" },
{ "id": "5003", "type": "Chocolate" },
{ "id": "5004", "type": "Maple" }
]
}
Expected output in a csv:
0001,donut,abc,0303,1003
My code:
import requests
import json
import csv
f = open('testdata.json')
data = json.load(f)
f.close()
f = csv.writer(open('testout.csv', 'wb+'))
for item in data:
f.writerow([item['id'], item[type], item['batters'][0]['process'],
item['batters'][0]['mix'],
item['batters'][0]['syrap'][0]['id'],
item['batters'][0]['syrap'][1]['id'],
item['batters'][0]['syrap'][2]['id'])
Here is some sample code showing how you can iterate through json content parsed as a dictionary:
import json
json_str = '''{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters":
{ "process": "abc",
"mix": "0303",
"syrap":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devil's Food" }
]
},
"rate": 0.55,
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" },
{ "id": "5007", "type": "Powdered Sugar" },
{ "id": "5006", "type": "Chocolate with Sprinkles" },
{ "id": "5003", "type": "Chocolate" },
{ "id": "5004", "type": "Maple" }
]
}
'''
jsondict = json.loads(json_str)
syrap_node = jsondict['batters']['syrap']
for item in syrap_node:
print (f'id:{item["id"]} type: {item["type"]}')
Simply, data[“batters”][“syrap”][2][“id”]
Much better way to achieve this would be
f = open('testout.csv', 'wb+')
with f:
fnames = ['id','type','process','mix','syrap']
writer = csv.DictWriter(f, fieldnames=fnames)
writer.writeheader()
for item in data:
print item
writer.writerow({'id' : item['id'], 'type': item['type'],
'process' : item['batters']['process'],
'mix': item['batters']['mix'],
'syrap': item['batters']['syrap'][2]['id']})
You need to make sure that data is actually a list. if it is not a list, don't use for loop.
simply,
writer.writerow({'id' : data['id'], 'type': data['type'],
'process' : data['batters']['process'],
'mix': data['batters']['mix'],
'syrap': data['batters']['syrap'][2]['id']})

issue in Elastic Search Term Aggregation

In elastic search aggregation query I need to get all the movies watched by the user who watches the movie "Frozen". This is how my Result source
{
"_index": "user",
"_type": "user",
"_id": "ovUowmUBREWOv-CU-4RT",
"_version": 4,
"_score": 1,
"_source": {
"movies": [
"Angry birds 1",
"PINNOCCHIO",
"Frozen",
"Hotel Transylvania 3"
],
"user_id": 86
}
}
This is the query I'm using.
{
"query": {
"match": {
"movies": "Frozen"
}
},
"size": 0,
"aggregations": {
"movies_like_Frozen": {
"terms": {
"field": "movies",
"min_doc_count": 1
}
}
}
}
The result I got in the bucket is correct, but the movie names are splits by white space like this
"buckets": [
{
"key": "3",
"doc_count": 2
},
{
"key": "hotel",
"doc_count": 2
},
{
"key": "transylvania",
"doc_count": 2
},
{
"key": "1",
"doc_count": 1
},
{
"key": "angry",
"doc_count": 1
},
{
"key": "birds",
"doc_count": 1
}
]
How can I get buckets with "Angry birds 1", "Hotel Transylvania 3" as result.
Please help.
In elasticsearch 6.x, every text field is analyzed implicitly. To override this, you need to create a mapping for text type fields as not_analyzed in an index, then insert documents in it.
In your case,
{
"mappings": {
"user": {
"properties": {
"movies": {
"type": "text",
"index": "not_analyzed",
"fields": {
"keyword": {
"type": "text",
"index": "not_analyzed"
}
}
},
"user_id": {
"type": "long"
}
}
}
}
}
Hope it works.

Create dynamic json object in python

I have a dictionary which is contain multiple keys and values and the values also contain the key, value pair. I am not getting how to create dynamic json using this dictionary in python. Here's the dictionary:
image_dict = {"IMAGE_1":{"img0":"IMAGE_2","img1":"IMAGE_3","img2":"IMAGE_4"},"IMAGE_2":{"img0":"IMAGE_1", "img1" : "IMAGE_3"},"IMAGE_3":{"img0":"IMAGE_1", "img1":"IMAGE_2"},"IMAGE_4":{"img0":"IMAGE_1"}}
My expected result like this :
{
"data": [
{
"image": {
"imageId": {
"id": "IMAGE_1"
},
"link": {
"target": {
"id": "IMAGE_2"
},
"target": {
"id": "IMAGE_3"
},
"target": {
"id": "IMAGE_4"
}
}
},
"updateData": "link"
},
{
"image": {
"imageId": {
"id": "IMAGE_2"
},
"link": {
"target": {
"id": "IMAGE_1"
},
"target": {
"id": "IMAGE_3"
}
}
},
"updateData": "link"
},
{
"image": {
"imageId": {
"id": "IMAGE_3"
},
"link": {
"target": {
"id": "IMAGE_1"
},
"target": {
"id": "IMAGE_2"
}
}
},
"updateData": "link"
} ,
{
"image": {
"imageId": {
"id": "IMAGE_4"
},
"link": {
"target": {
"id": "IMAGE_1"
}
}
},
"updateData": "link"
}
]
}
I tried to solve it but I didn't get expected result.
result = {"data":[]}
for k,v in sorted(image_dict.items()):
for a in sorted(v.values()):
result["data"].append({"image":{"imageId":{"id": k},
"link":{"target":{"id": a}}},"updateData": "link"})
print(json.dumps(result, indent=4))
In Python dictionaries you can't have 2 values with the same key. So you can't have multiple targets all called "target". So you can index them. Also I don't know what this question has to do with dynamic objects but here's the code I got working:
import re
dict_res = {}
ind = 0
for image in image_dict:
lin_ind = 0
sub_dict = {'image' + str(ind): {'imageId': {image}, 'link': {}}}
for sub in image_dict[image].values():
sub_dict['image' + str(ind)]['link'].update({'target' + str(lin_ind): {'id': sub}})
lin_ind += 1
dict_res.update(sub_dict)
ind += 1
dict_res = re.sub('target\d', 'target', re.sub('image\d', 'image', str(dict_res)))
print dict_res

How to parse JSON in a Django View

I post some JSON to a view. I want to now parse the data and add it to my database.
I need to get the properties name and theme and iterate over the array pages. My JSON is as follows:
{
"name": "xaAX",
"logo": "",
"theme": "b",
"fullSiteLink": "http://www.hello.com",
"pages": [
{
"id": "1364484811734",
"name": "Page Name",
"type": "basic",
"components": {
"img": "",
"text": ""
}
},
{
"name": "Twitter",
"type": "twitter",
"components": {
"twitter": {
"twitter-username": "zzzz"
}
}
}
]
}
Here is what I have so far:
def smartpage_create_ajax(request):
if request.POST:
# get stuff and loop over each page?
return HttpResponse('done')
python provides json to encode/decode json
import json
json_dict = json.loads(request.POST['your_json_data'])
json_dict['pages']
[
{
"id": "1364484811734",
"name": "Page Name",
"type": "basic",
"components": {
"img": "",
"text": ""
}
},
{
"name": "Twitter",
"type": "twitter",
"components": {
"twitter": {
"twitter-username": "zzzz"
}
}
},
}
]

Categories