navigating json table in python - python

I am trying to access the team name (the "description" key under "outcomes") and the "american" price value:
print(bv_json['outcomes'][0]['description'])
The parts of the JSON that I need are marked with trailing ########### near the end of the data posted below. I get an error about needing an integer index rather than a string, and I am also struggling to navigate through the keys.
Thanks
[
{
"path": [
{
"id": "2958468",
"link": "/basketball/nba",
"description": "NBA",
"type": "LEAGUE",
"sportCode": "BASK",
"order": 1,
"leaf": true,
"current": true
},
{
"id": "227",
"link": "/basketball",
"description": "Basketball",
"type": "SPORT",
"sportCode": "BASK",
"order": 1,
"leaf": false,
"current": false
}
],
"events": [
{
"id": "8801181",
"description": "L.A. Clippers # Utah Jazz",
"type": "GAMEEVENT",
"link": "/basketball/nba/l-a-clippers-utah-jazz-202106082215",
"status": "O",
"sport": "BASK",
"startTime": 1623204900000,
"live": true,
"awayTeamFirst": true,
"denySameGame": "NO",
"teaserAllowed": true,
"competitionId": "2958468",
"notes": "Best of 7 - Game 1",
"numMarkets": 34,
"lastModified": 1623212024024,
"competitors": [
{
"id": "8801181-285",
"name": "Utah Jazz",
"home": true
},
{
"id": "8801181-310",
"name": "L.A. Clippers",
"home": false
}
],
"displayGroups": [
{
"id": "100-97",
"description": "Game Lines",
"defaultType": true,
"alternateType": false,
"markets": [
{
"id": "157658380",
"descriptionKey": "Head To Head",
"description": "Moneyline",
"key": "2W-12",
"marketTypeId": "3059",
"status": "O",
"singleOnly": false,
"notes": "",
"period": {
"id": "341",
"description": "Live Game",
"abbreviation": "G",
"live": true,
"main": true
},
"outcomes": [
{
"id": "849253180",
"description": "L.A. Clippers",##############
"status": "O",
"type": "A",
"competitorId": "8801181-310",
"price": {
"id": "7927852247",
"american": "+125",#########################
"decimal": "2.250",
"fractional": "5/4",
"malay": "-0.80",
"indonesian": "1.25",
"hongkong": "1.25"

It looks like your data structure is
[{[{[{[{[{{}}]}]}]}]}]
That is a list containing a dictionary, which holds a list of dictionaries, each of which holds further lists of dictionaries; in other words, it is deeply nested and confusing.
To make it easy on yourself, I think defining some variables will help.
Let's access the first-level list item, the dictionary that contains 'path'. This dict contains all the other lists of dictionaries.
full_dict = bvjson[0] # step into a list
Looking at the data, we know that 'outcomes' sits inside the 'events' list of dicts, so let's define that variable to make it easier to step into on the way to our final answer.
events = full_dict['events'] # access dictionary value by key
Now we have access to events, which is a list of dictionaries of lists of dictionaries.
Because events is a list, we first step into its first item with [0]. Inside that event, 'outcomes' lives two steps into the 'displayGroups' value, so let's get 'displayGroups' into something usable.
display = events[0]['displayGroups'][0]
# events[0] is the first event dictionary; 'displayGroups' is a key in it
# and holds a list of dictionaries, so we use [0] again to step
# into that list and reach the dict.
# Note - if a list has multiple items, [0] only accesses the first one.
Stepping in further:
markets = display['markets'][0]
outcomes = markets['outcomes'][0]
You finally have easy access to the first dict in the 'outcomes' list!
description = outcomes['description']
price = outcomes['price']['american']
So remember, anytime you get a confusing nested json like this, stepping in to each value can help you figure out how to get what you want and if you need to access via index (if it's a list) or via key (if it's a dictionary).
Think of all of this as just a way to diagnose and figure out why you aren't getting the values you are requesting - it will be different for each case, and different logic will be required for handling getting multiple values out of each list or dict - but this is a good start and way to get your mind around it.
Here is your data properly enclosed:
bvjson = [
{
"path": [
{
"id": "2958468",
"link": "/basketball/nba",
"description": "NBA",
"type": "LEAGUE",
"sportCode": "BASK",
"order": 1,
"leaf": True,
"current": True
},
{
"id": "227",
"link": "/basketball",
"description": "Basketball",
"type": "SPORT",
"sportCode": "BASK",
"order": 1,
"leaf": False,
"current": False
}
],
"events": [
{
"id": "8801181",
"description": "L.A. Clippers # Utah Jazz",
"type": "GAMEEVENT",
"link": "/basketball/nba/l-a-clippers-utah-jazz-202106082215",
"status": "O",
"sport": "BASK",
"startTime": 1623204900000,
"live": True,
"awayTeamFirst": True,
"denySameGame": "NO",
"teaserAllowed": True,
"competitionId": "2958468",
"notes": "Best of 7 - Game 1",
"numMarkets": 34,
"lastModified": 1623212024024,
"competitors": [
{
"id": "8801181-285",
"name": "Utah Jazz",
"home": True
},
{
"id": "8801181-310",
"name": "L.A. Clippers",
"home": False
}
],
"displayGroups": [
{
"id": "100-97",
"description": "Game Lines",
"defaultType": True,
"alternateType": False,
"markets": [
{
"id": "157658380",
"descriptionKey": "Head To Head",
"description": "Moneyline",
"key": "2W-12",
"marketTypeId": "3059",
"status": "O",
"singleOnly": False,
"notes": "",
"period": {
"id": "341",
"description": "Live Game",
"abbreviation": "G",
"live": True,
"main": True
},
"outcomes": [
{
"id": "849253180",
"description": "L.A. Clippers",##############
"status": "O",
"type": "A",
"competitorId": "8801181-310",
"price": {
"id": "7927852247",
"american": "+125",#########################
"decimal": "2.250",
"fractional": "5/4",
"malay": "-0.80",
"indonesian": "1.25",
"hongkong": "1.25"}
}
]
}
]
}
]
}
]
}
]
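To collect every team name and American price rather than only the first one, the same path can be walked with nested loops. The following is a minimal sketch, assuming a trimmed-down copy of the structure above that keeps only the keys on the path; the function name iter_outcomes is hypothetical.

```python
# Trimmed-down copy of the structure above; only the keys on the path are kept.
bvjson = [
    {
        "events": [
            {
                "description": "L.A. Clippers # Utah Jazz",
                "displayGroups": [
                    {
                        "markets": [
                            {
                                "description": "Moneyline",
                                "outcomes": [
                                    {
                                        "description": "L.A. Clippers",
                                        "price": {"american": "+125"},
                                    }
                                ],
                            }
                        ]
                    }
                ],
            }
        ]
    }
]


def iter_outcomes(data):
    """Yield (market, team, american price) for every outcome in the data."""
    for block in data:                                   # top-level list
        for event in block.get("events", []):            # list of event dicts
            for group in event.get("displayGroups", []):
                for market in group.get("markets", []):
                    for outcome in market.get("outcomes", []):
                        yield (
                            market["description"],
                            outcome["description"],
                            outcome["price"]["american"],
                        )


results = list(iter_outcomes(bvjson))
for market, team, american in results:
    print(market, team, american)
```

Each loop handles one list level, and each key lookup handles one dictionary level, so no [0] index is needed and nothing is skipped when a list holds more than one item.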

Related

Python/API Request - Extract data from API request with dynamic output

I'm working with API requests for the first time, and I'm wondering if the following is possible to do:
I have a function that receives API responses that look like this:
{
"DATA_Items": [{
"DATA": {
"DATA_data_meta": {
"ASSIGNER": "info#gmail.com",
"ID": "DATA-2021-43062"
},
"data_format": "IRE",
"data_type": "DATA",
"data_version": "4.0",
"description": {
"description_data": [{
"lang": "en",
"value": "during web page generation."
}]
},
"problemtype": {
"problemtype_data": [{
"description": [{
"lang": "en",
"value": "CWE-79"
}]
}]
},
"references": {
"reference_data": [{
"name": "https://blablabla/data",
"refsource": "CONFIRM",
"tags": ["Advisory"],
"url": "https://blablabl/data"
},
{
"name": "http://blablabla.com/files/166055/mail-7.0.1-Cross-Site-Scripting.html",
"refsource": "MISC",
"tags": ["Exploit",
"Advisory",
"Entry"
],
"url": "http://package.com/files/166055/mail-7.0.1-Cross-Site-Scripting.html"
}
]
}
},
"configurations": {
"DATA_data_version": "4.0",
"nodes": [{
"ID_match": [{
"ID23Uri": "ID:2.3:a:info:mail:*:*:*:*:*:*:*:*",
"ID_name": [],
"versionEndExcluding": "2.0.2",
"versionStartIncluding": "3.0.0",
"vulnerable": true
},
{
"ID23Uri": "ID:2.3:a:info:mail:*:*:*:*:*:*:*:*",
"ID_name": [],
"versionEndExcluding": "6.46",
"versionStartIncluding": "9.7.0",
"vulnerable": true
},
{
"ID23Uri": "ID:2.3:a:info:mail:*:*:*:*:*:*:*:*",
"ID_name": [],
"versionEndExcluding": "6.2.8",
"versionStartIncluding": "2.2.0",
"vulnerable": true
}
],
"children": [],
"operator": "OR"
}]
},
"impact": {
"baseMetricV2": {
"acInsufInfo": false,
"datasV2": {
"Impact": "NONE",
"accessComplexity": "MEDIUM",
"accessVector": "NETWORK",
"authentication": "NONE",
"availabilityImpact": "NONE",
"baseScore": 4.3,
"integrityImpact": "PARTIAL",
"vectorString": "AV:N/AC:M/Au:N/C:N/I:P/A:N",
"version": "2.0"
},
"exploitabilityScore": 8.6,
"impactScore": 2.9,
"obtainAllPrivilege": false,
"obtainOtherPrivilege": false,
"obtainUserPrivilege": false,
"severity": "MEDIUM",
"userInteractionRequired": true
},
"baseMetricV3": {
"Score": 2.8,
"impactScore": 1.7,
"sV3": {
"Complexity": "LOW",
"Vector": "NETWORK",
"availabilityImpact": "NONE",
"baseScore": 6.1,
"baseSeverity": "MEDIUM",
"confidentialityImpact": "LOW",
"integrityImpact": "LOW",
"privilegesRequired": "NONE",
"scope": "CHANGED",
"userInteraction": "REQUIRED",
"vectorString": "BASE",
"version": "3.1"
}
}
},
"lastModifiedDate": "2012-03-04T16:33Z",
"publishedDate": "2012-05-02T11:15Z"
}]
}
The part that I'm interested in is:
{"ID_match": [{"vulnerable": true
the vulnerable after ID_match is always the same.
However, the ID_match can be in multiple places of the API call, and I'm not sure about all the different possibilities. I do have code that loops over some of the ID_matches, which looks like this:
date = datetime.datetime.now() + datetime.timedelta(days=-1)
response = requests.get('https://somewebsite?dataMatchString={}&modStartDate={}-{:02d}-{:02d}T00:00:00:000%20CEST&modEndDate={}-{:02d}-{:02d}T00:00:00:000%20CEST'.format(
application.id, date.year, date.month, date.day - 1, date.year, date.month, date.day + 2)).json()
for check in response['result']['DATA_Items'][0]['configurations']['nodes'][0]['children'][0]['ID_match']:
    print(check)
for check2 in response['result']['DATA_Items'][0]['configurations']['nodes'][0]['children'][1]['ID_match']:
    print(check2)
for check3 in response['result']['DATA_Items'][0]['configurations']['nodes'][0]['ID_match']:
    print(check3)
When I do this, some API responses print the part that I want, but others are missed.
I was wondering whether it is possible to search for the path(s) where ID_match occurs, and then use them to get the value of vulnerable.
You could use a recursive function that traverses the entire object graph, looking at all nested dicts, and all list items, and returns those dicts that have a 'vulnerable': True entry.
def find_vulnerable_nodes(node):
    if isinstance(node, list):
        for item in node:
            yield from find_vulnerable_nodes(item)
    elif isinstance(node, dict):
        if node.get('vulnerable') == True:
            yield node
        else:
            for item in node.values():
                yield from find_vulnerable_nodes(item)
This way, the structure and nesting depth of the input data is irrelevant.
Usage:
data = requests.get('...').json()
for n in find_vulnerable_nodes(data):
    print(n)
or
vulnerable_nodes = list(find_vulnerable_nodes(data))
Result with your sample data:
{'ID23Uri': 'ID:2.3:a:info:mail:*:*:*:*:*:*:*:*', 'ID_name': [], 'versionEndExcluding': '2.0.2', 'versionStartIncluding': '3.0.0', 'vulnerable': True}
{'ID23Uri': 'ID:2.3:a:info:mail:*:*:*:*:*:*:*:*', 'ID_name': [], 'versionEndExcluding': '6.46', 'versionStartIncluding': '9.7.0', 'vulnerable': True}
{'ID23Uri': 'ID:2.3:a:info:mail:*:*:*:*:*:*:*:*', 'ID_name': [], 'versionEndExcluding': '6.2.8', 'versionStartIncluding': '2.2.0', 'vulnerable': True}
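If the location of each ID_match is also of interest, as the question asks, the same recursive idea can carry the path along. This is only a sketch: find_key_paths is a hypothetical helper, and the sample dictionary is a small stand-in that mirrors the nesting in the question rather than a real API response.

```python
def find_key_paths(node, key, path=()):
    """Yield (path, value) for every occurrence of `key` at any depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield path + (k,), v
            # Keep descending: the key may also occur deeper inside v.
            yield from find_key_paths(v, key, path + (k,))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_key_paths(item, key, path + (i,))


# Small stand-in for the API response; the nesting mirrors the question.
sample = {
    "DATA_Items": [{
        "configurations": {
            "nodes": [{
                "ID_match": [{"vulnerable": True}],
                "children": [{"ID_match": [{"vulnerable": True}]}],
            }]
        }
    }]
}

matches = list(find_key_paths(sample, "ID_match"))
for path, entries in matches:
    print(path, [entry.get("vulnerable") for entry in entries])
```

Each path is a tuple of dictionary keys and list indices, so it can be replayed step by step against the original object if needed.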

merge common fields in the Json in python

For a sample JSON like the one below:
JSON =[{
"ID": "00300000-0000-0000-0000-000000000000",
"CommonTags": [
"Sports"
],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Sports,Arts",
"Title": "Biodata",
"index": 1,
"Value": "Availabity"
},
{
"ID": "00300000-0000-0000-0000-000000000000",
"CommonTags": [
"Social Media"
],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Sports,Arts",
"Title": "Biodata",
"index": 5,
"Value": "Availabity"
},
{
"ID": "00300000-0000-0000-0000-000000000079",
"CommonTags": [
"Sports"
],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Environmental Science",
"Title": "Biodata",
"index": 1,
"Value": "Performace"
}
]
I want to merge the JSON fields CommonTags and index when the "Value" field is the same across objects.
my approach
mergedArr = []
def mergeCommon(value):
    for i, d in enumerate(mergedArr):
        if d['Value'] == value:
            return i
    return -1
for d in JSON:
    if (i := mergeCommon(d['Value'])) < 0:
        mergedArr.append(d)
    else:
        mergedArr[i]['CommonTags'].append(d['CommonTags'])
print(mergedArr)
The merged CommonTags come out as a list within a list, but the expected output is a single flat list containing all the elements.
I'm also not clear on how to collect the index values into a list.
MY OUTPUT
[{
"ID": "00300000-0000-0000-0000-000000000000",
"CommonTags": ["Sports", ["Social Media"]],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Sports,Arts",
"Title": "Biodata",
"index": 1,
"Value": "Availabity"
}, {
"ID": "00300000-0000-0000-0000-000000000079",
"CommonTags": ["Sports"],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Environmental Science",
"Title": "Biodata",
"index": 1,
"Value": "Performace"
}]
EXPECTED OUTPUT
[{
"ID": "00300000-0000-0000-0000-000000000000",
"CommonTags": ["Sports", "Social Media"],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Sports,Arts",
"Title": "Biodata",
"index": [1, 5],
"Value": "Availabity"
}, {
"ID": "00300000-0000-0000-0000-000000000079",
"CommonTags": ["Sports"],
"subID": "149f43d0-6fa9-44f3-b4ba-6fb7a320d0a4",
"Description": "Environmental Science",
"Title": "Biodata",
"index": 1,
"Value": "Performace"
}]
Please Guide me on this. Thanks
Replace
mergedArr[i]['CommonTags'].append(d['CommonTags'])
with
mergedArr[i]['CommonTags'].extend(d['CommonTags'])
if not isinstance(mergedArr[i]['index'], list):
    mergedArr[i]['index'] = [mergedArr[i]['index']]
mergedArr[i]['index'].append(d['index'])
then your code will produce the desired outcome.
list.extend adds the items of d['CommonTags'] to the existing "CommonTags" list instead of appending the whole list as a single nested element.
The if-condition turns mergedArr[i]['index'] into a list if it is not one already, so that the new index can be appended to it.
Changing the second append to extend will solve the issue, i.e. mergedArr[i]['CommonTags'].extend(d['CommonTags']). Extending the existing array is what you want, instead of appending a new array at the end.
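For larger inputs, the linear search through the merged list can be replaced by a dictionary keyed on "Value". The sketch below is one possible way to do that, not the asker's code: merge_by_value is a hypothetical name, the records are shortened copies of the sample, and copy.deepcopy is used so the input list is left unmodified.

```python
import copy

# Shortened copies of the sample records from the question.
records = [
    {"ID": "00300000-0000-0000-0000-000000000000", "CommonTags": ["Sports"],
     "index": 1, "Value": "Availabity"},
    {"ID": "00300000-0000-0000-0000-000000000000", "CommonTags": ["Social Media"],
     "index": 5, "Value": "Availabity"},
    {"ID": "00300000-0000-0000-0000-000000000079", "CommonTags": ["Sports"],
     "index": 1, "Value": "Performace"},
]


def merge_by_value(items):
    """Merge CommonTags and index of records that share the same "Value"."""
    merged = {}  # maps each "Value" to its merged record
    for item in items:
        key = item["Value"]
        if key not in merged:
            # Copy so that later extend/append calls do not touch the input.
            merged[key] = copy.deepcopy(item)
            continue
        target = merged[key]
        target["CommonTags"].extend(item["CommonTags"])
        if not isinstance(target["index"], list):
            target["index"] = [target["index"]]
        target["index"].append(item["index"])
    return list(merged.values())


merged = merge_by_value(records)
print(merged)
```

As in the expected output, "index" stays a plain number for records that were never merged and becomes a list only once a second record arrives.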

Ignore specific JSON keys when extracting data in Python

I'm extracting certain keys in several JSON files and then converting it to a CSV in Python. I'm able to define a key list when I run my code and get the information I need.
However, there are certain sub-keys that I want to ignore from the JSON file. For example, if we look at the following snippet:
JSON Sample
[
{
"callId": "abc123",
"errorCode": 0,
"apiVersion": 2,
"statusCode": 200,
"statusReason": "OK",
"time": "2020-12-14T12:00:32.744Z",
"registeredTimestamp": 1417731582000,
"UID": "_guid_abc123==",
"created": "2014-12-04T22:19:42.894Z",
"createdTimestamp": 1417731582000,
"data": {},
"preferences": {},
"emails": {
"verified": [],
"unverified": []
},
"identities": [
{
"provider": "facebook",
"providerUID": "123",
"allowsLogin": true,
"isLoginIdentity": true,
"isExpiredSession": true,
"lastUpdated": "2014-12-04T22:26:37.002Z",
"lastUpdatedTimestamp": 1417731997002,
"oldestDataUpdated": "2014-12-04T22:26:37.002Z",
"oldestDataUpdatedTimestamp": 1417731997002,
"firstName": "John",
"lastName": "Doe",
"nickname": "John Doe",
"profileURL": "https://www.facebook.com/John.Doe",
"age": 50,
"birthDay": 31,
"birthMonth": 12,
"birthYear": 1969,
"city": "City, State",
"education": [
{
"school": "High School Name",
"schoolType": "High School",
"degree": null,
"startYear": 0,
"fieldOfStudy": null,
"endYear": 0
}
],
"educationLevel": "High School",
"favorites": {
"music": [
{
"name": "Music 1",
"id": "123",
"category": "Musician/band"
},
{
"name": "Music 2",
"id": "123",
"category": "Musician/band"
}
],
"movies": [
{
"name": "Movie 1",
"id": "123",
"category": "Movie"
},
{
"name": "Movie 2",
"id": "123",
"category": "Movie"
}
],
"television": [
{
"name": "TV 1",
"id": "123",
"category": "Tv show"
}
]
},
"followersCount": 0,
"gender": "m",
"hometown": "City, State",
"languages": "English",
"likes": [
{
"name": "Like 1",
"id": "123",
"time": "2014-10-31T23:52:53.0000000Z",
"category": "TV",
"timestamp": "1414799573"
},
{
"name": "Like 2",
"id": "123",
"time": "2014-09-16T08:11:35.0000000Z",
"category": "Music",
"timestamp": "1410855095"
}
],
"locale": "en_US",
"name": "John Doe",
"photoURL": "https://graph.facebook.com/123/picture?type=large",
"timezone": "-8",
"thumbnailURL": "https://graph.facebook.com/123/picture?type=square",
"username": "john.doe",
"verified": "true",
"work": [
{
"companyID": null,
"isCurrent": null,
"endDate": null,
"company": "Company Name",
"industry": null,
"title": "Company Title",
"companySize": null,
"startDate": "2010-12-31T00:00:00"
}
]
}
],
"isActive": true,
"isLockedOut": false,
"isRegistered": true,
"isVerified": false,
"lastLogin": "2014-12-04T22:26:33.002Z",
"lastLoginTimestamp": 1417731993000,
"lastUpdated": "2014-12-04T22:19:42.769Z",
"lastUpdatedTimestamp": 1417731582769,
"loginProvider": "facebook",
"loginIDs": {
"emails": [],
"unverifiedEmails": []
},
"rbaPolicy": {
"riskPolicyLocked": false
},
"oldestDataUpdated": "2014-12-04T22:19:42.894Z",
"oldestDataUpdatedTimestamp": 1417731582894,
"registered": "2014-12-04T22:19:42.956Z",
"regSource": "",
"socialProviders": "facebook"
}
]
I want to extract data from created and identities but ignore identities.favorites and identities.likes as well as their data underneath it.
This is what I have so far, below. I defined the JSON keys that I want to extract in the key_list variable:
Current Code
import json, pandas
from flatten_json import flatten
# Enter the path to the JSON and the filename without appending '.json'
file_path = r'C:\Path\To\file_name'
# Open and load the JSON file
json_list = json.load(open(file_path + '.json', 'r', encoding='utf-8', errors='ignore'))
# Extract data from the defined key names
key_list = ['created', 'identities']
json_list = [{k:d[k] for k in key_list} for d in json_list]
# Flatten and convert to a data frame
json_list_flattened = (flatten(d, '.') for d in json_list)
df = pandas.DataFrame(json_list_flattened)
# Export to CSV in the same directory with the original file name
export_csv = df.to_csv (file_path + r'.csv', sep=',', encoding='utf-8', index=None, header=True)
Similar to the key_list, I suspect that I would make an ignore list and factor that in the json_list for loop that I have? Something like:
key_ignore = ['identities.favorites', 'identities.likes']
Then utilize the dict.pop() which looks like it will remove the unwanted sub-keys if it matches? Just not sure how to implement that correctly.
Expected Output
As a result, the code should extract data from the defined keys in key_list and ignore the sub keys defined in key_ignore, which is identities.favorites and identities.likes. Then the rest of the code will continue to convert it into a CSV:
created                   identities.0.provider  identities.0.providerUID          identities...
2014-12-04T19:23:05.191Z  site                   cb8168b0cf734b70ad541f0132763761  ...
If the keys are always there, you can delete them before flattening (here d is the list loaded from the JSON file, json_list in your code):
del d[0]['identities'][0]['likes']
del d[0]['identities'][0]['favorites']
Or, if you want to remove the columns from the dataframe after reading all the JSON data in, you can use
df.drop(df.filter(regex='identities.0.favorites|identities.0.likes').columns, axis=1, inplace=True)
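When the keys may be missing, or there are several records and several identities per record, the ignore list from the question can be applied recursively before flattening. This is a sketch under assumptions: drop_paths is a hypothetical helper, list indices are deliberately left out of the dotted path so that 'identities.likes' matches every item of the 'identities' list, and the record is a shortened version of the sample.

```python
def drop_paths(node, ignore, prefix=""):
    """Return a copy of `node` without keys whose dotted path is in `ignore`.

    List indices are skipped when building the path, so 'identities.likes'
    matches the 'likes' key of every item in the 'identities' list.
    """
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            if path in ignore:
                continue  # drop this key and everything underneath it
            result[key] = drop_paths(value, ignore, path)
        return result
    if isinstance(node, list):
        return [drop_paths(item, ignore, prefix) for item in node]
    return node


key_ignore = {"identities.favorites", "identities.likes"}

# Shortened version of one record from the sample JSON.
record = {
    "created": "2014-12-04T22:19:42.894Z",
    "identities": [
        {
            "provider": "facebook",
            "favorites": {"music": [{"name": "Music 1"}]},
            "likes": [{"name": "Like 1"}],
        }
    ],
}

cleaned = drop_paths(record, key_ignore)
print(cleaned)
```

In the question's script this would slot in between the key_list selection and the flatten step, for example as json_list = [drop_paths(d, key_ignore) for d in json_list].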

check if json element or object exists or not and proceed

Hi, I am trying to parse JSON data and I get this error every time I check the element:
if ['fields']['assignee'] in each:
TypeError: list indices must be integers or slices, not str
>>>
My json is this
{
"expand": "schema,names",
"startAt": 1,
"maxResults": 50,
"total": 7363,
"issues": [
{
"expand": "operations,versionedRepresentations,editmeta,changelog,renderedFields",
"id": "591838",
"self": "https://jira.mynet.com/rest/api/2/issue/591838",
"key": "TEST-8564",
"fields": {
"summary": "delete tables 31-03-2020 ",
"customfield_10006": 2.0,
"created": "2020-02-27T10:29:12.000+0100",
"description": "A LOT OF TEXT",
"assignee": null,
"labels": [
"DATA",
"Refined"
],
"status": {
"self": "https://jira.mynet.com/rest/api/2/status/10000",
"description": "",
"iconUrl": "https://jira.mynet.com/",
"name": "To Do",
"id": "10000",
"statusCategory": {
"self": "https://jira.mynet.com/rest/api/2/statuscategory/2",
"id": 2,
"key": "new",
"colorName": "blue-gray",
"name": "To Do"
}
}
}
}
]
}
The element in ['fields']['assignee'] is null in this example.
Sometimes it looks like this:
"assignee": {
"self": "https://mynet.com/rest/api/2/user?username=xxxxxx",
"name": "sij",
"key": "x",
"emailAddress": "xx#mynet.com",
"avatarUrls": {
"48x48": "https://mynet.com/secure/useravatar?ownerId=bdysdh&avatarId=16743",
"24x24": "https://mynet.com/secure/useravatar?size=small&ownerId=bdysdh&avatarId=16743",
"16x16": "https://mynet.com/secure/useravatar?size=xsmall&ownerId=bdysdh&avatarId=16743",
"32x32": "https://mynet.com/secure/useravatar?size=medium&ownerId=bdysdh&avatarId=16743"
},
"displayName": "Bruce Springsteen",
"active": true,
"timeZone": "Arctic/Longyearbyen"
},
I am trying to check if assignee is null and, if so, print null.
My code looks like this:
with open('C:\\TEMP\\testdata.json') as json_file:
    data = json.load(json_file)
    for each in data['issues']:
        if ['fields']['assignee'] in each:
            print (['fields']['assignee']['name'])
        else:
            print ('null')
I have tried to put in [0] between ['fields']['assignee']['name'] but nothing seems to help.
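Try with
if 'fields' in each and 'assignee' in each['fields']:
Note that you need the name of the key as a plain string, not surrounded by its own square brackets. This only checks that the key exists; in your sample "assignee" is present with a null value, so also confirm that each['fields']['assignee'] is not None before reading ['name'].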
Perhaps better (the "or {}" covers the case where "assignee" is present but null):
for each in data['issues']:
    print((each.get('fields', {}).get('assignee') or {}).get('name', 'null'))
And if you can't guarantee that 'issues' exists in data either:
for each in data.get('issues', []):
    <as before>
data.get('issues', []) returns an empty list if data['issues'] doesn't exist.
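One detail worth spelling out: dict.get only falls back to its default when the key is missing, not when the key is present with a null (None) value, which is exactly the situation in the sample issue. A minimal sketch of a helper that handles both cases follows; assignee_name is a hypothetical name, and the second and third issues are made-up stand-ins for the "assigned" and "key missing" cases.

```python
data = {
    "issues": [
        # Same shape as the sample issue: the key exists but is null.
        {"key": "TEST-8564", "fields": {"assignee": None}},
        # Hypothetical issues covering the other two cases.
        {"key": "TEST-0001", "fields": {"assignee": {"name": "sij"}}},
        {"key": "TEST-0002", "fields": {}},
    ]
}


def assignee_name(issue, default="null"):
    """Return the assignee's name, or `default` if it is missing or null."""
    fields = issue.get("fields") or {}       # `or {}` also covers None
    assignee = fields.get("assignee") or {}
    return assignee.get("name", default)


names = [assignee_name(issue) for issue in data.get("issues", [])]
print(names)
```

The "or {}" pattern treats every falsy value as missing, which is fine here because a real assignee is always a non-empty dictionary.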

Unable to pull data from json using python

I have the following json
{
"response": {
"message": null,
"exception": null,
"context": [
{
"headers": null,
"name": "aname",
"children": [
{
"type": "cluster-connectivity",
"name": "cluster-connectivity"
},
{
"type": "consistency-groups",
"name": "consistency-groups"
},
{
"type": "devices",
"name": "devices"
},
{
"type": "exports",
"name": "exports"
},
{
"type": "storage-elements",
"name": "storage-elements"
},
{
"type": "system-volumes",
"name": "system-volumes"
},
{
"type": "uninterruptible-power-supplies",
"name": "uninterruptible-power-supplies"
},
{
"type": "virtual-volumes",
"name": "virtual-volumes"
}
],
"parent": "/clusters",
"attributes": [
{
"value": "true",
"name": "allow-auto-join"
},
{
"value": "0",
"name": "auto-expel-count"
},
{
"value": "0",
"name": "auto-expel-period"
},
{
"value": "0",
"name": "auto-join-delay"
},
{
"value": "1",
"name": "cluster-id"
},
{
"value": "true",
"name": "connected"
},
{
"value": "synchronous",
"name": "default-cache-mode"
},
{
"value": "true",
"name": "default-caw-template"
},
{
"value": "blah",
"name": "default-director"
},
{
"value": [
"blah",
"blah"
],
"name": "director-names"
},
{
"value": [
],
"name": "health-indications"
},
{
"value": "ok",
"name": "health-state"
},
{
"value": "1",
"name": "island-id"
},
{
"value": "blah",
"name": "name"
},
{
"value": "ok",
"name": "operational-status"
},
{
"value": [
],
"name": "transition-indications"
},
{
"value": [
],
"name": "transition-progress"
}
],
"type": "cluster"
}
],
"custom-data": null
}
}
which I'm trying to parse using the json module in Python. I am only interested in getting the following information out of it.
Name Value
operational-status Value
health-state Value
Here is what i have tried.
in the below script data is the json returned from a webpage
json = json.loads(data)
healthstate= json['response']['context']['operational-status']
operationalstatus = json['response']['context']['health-status']
Unfortunately I think I must be missing something, as the above results in an error saying that indices must be integers, not strings.
If I try
healthstate= json['response'][0]
it errors saying index 0 is out of range.
Any help would be gratefully received.
json['response']['context'] is a list, so that object requires you to use integer indices.
Each item in that list is itself a dictionary again. In this case there is only one such item.
To get all "name": "health-state" dictionaries out of that structure you'd need to do a little more processing:
[attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
would give you a list of matching values for health-state in the first context.
Demo:
>>> [attr['value'] for attr in json['response']['context'][0]['attributes'] if attr['name'] == 'health-state']
[u'ok']
You have to follow the data structure. It's best to manipulate the data interactively and check what every item is. If it's a list, you'll have to index it positionally or iterate through it and check the values. If it's a dict, you'll have to index it by its keys. For example, here is a function that gets the context and then iterates through its attributes, checking for a particular name.
def get_attribute(data, attribute):
    for attrib in data['response']['context'][0]['attributes']:
        if attrib['name'] == attribute:
            return attrib['value']
    return 'Not Found'
>>> data = json.loads(s)
>>> get_attribute(data, 'operational-status')
u'ok'
>>> get_attribute(data, 'health-state')
u'ok'
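json['response']['context'] is a list, not a dict. The structure is not exactly what you think it is.
For example, the "operational-status" entry is the dictionary at index 14 of the attributes list in your sample, so its value can be read with the following (note that relying on the position is fragile if the order ever changes):
json['response']['context'][0]['attributes'][14]['value']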
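Since every attribute is a small {"name": ..., "value": ...} dictionary, another option is to turn the whole list into a single lookup table once and then read values by name. A minimal sketch, assuming a shortened copy of the response that keeps only three of the attributes:

```python
# Shortened copy of the response; only three attributes are kept.
data = {
    "response": {
        "context": [
            {
                "name": "aname",
                "attributes": [
                    {"value": "true", "name": "connected"},
                    {"value": "ok", "name": "health-state"},
                    {"value": "ok", "name": "operational-status"},
                ],
            }
        ]
    }
}

context = data["response"]["context"][0]   # 'context' is a list, so index it
# Build a name -> value mapping from the list of attribute dictionaries.
attributes = {attr["name"]: attr["value"] for attr in context["attributes"]}

print(context["name"])
print(attributes["operational-status"])
print(attributes["health-state"])
```

After the mapping is built, each lookup is a plain dictionary access, and attributes.get(...) can be used for names that may be absent.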
