Accessing nested JSON objects using Python

I am trying to interact with an API and running into issues accessing nested objects. Below is sample json output that I am working with.
{
"results": [
{
"task_id": "22774853-2b2c-49f4-b044-2d053141b635",
"params": {
"type": "host",
"target": "54.243.80.16",
"source": "malware_analysis"
},
"v": "2.0.2",
"status": "success",
"time": 227,
"data": {
"details": {
"as_owner": "Amazon.com, Inc.",
"asn": "14618",
"country": "US",
"detected_urls": [],
"resolutions": [
{
"hostname": "bumbleride.com",
"last_resolved": "2016-09-15 00:00:00"
},
{
"hostname": "chilitechnology.com",
"last_resolved": "2016-09-16 00:00:00"
}
],
"response_code": 1,
"verbose_msg": "IP address in dataset"
},
"match": true
}
}
]
}
The deepest level I can access is the data portion, which returns too much. Ideally I just want to access as_owner, asn, country, detected_urls and resolutions.
When I try to access details, response_code, etc., I get a KeyError. My JSON is nested more deeply than in the other questions I found, and applying their logic has not worked.
My current code snippet is below; any help is appreciated!
import requests
import json
headers = {
'Content-Type': 'application/json',
}
params = (
('wait', 'true'),
)
data = '{"target":{"one":{"type": "ip","target": "54.243.80.16", "sources": ["xxx","xxxxx"]}}}'
r = requests.post('https://fakewebsite:8000/api/services/intel/lookup/jobs', headers=headers, params=params, data=data, auth=('apikey', ''))
parsed_json = json.loads(r.text)
#results = parsed_json["results"]
for item in parsed_json["results"]:
    print(item['data'])

You just need to index correctly into the converted JSON. Then you can easily loop over a list of the keys you want to fetch, since they are all in the "details" dictionary.
import json
raw = '''\
{
"results": [
{
"task_id": "22774853-2b2c-49f4-b044-2d053141b635",
"params": {
"type": "host",
"target": "54.243.80.16",
"source": "malware_analysis"
},
"v": "2.0.2",
"status": "success",
"time": 227,
"data": {
"details": {
"as_owner": "Amazon.com, Inc.",
"asn": "14618",
"country": "US",
"detected_urls": [],
"resolutions": [
{
"hostname": "bumbleride.com",
"last_resolved": "2016-09-15 00:00:00"
},
{
"hostname": "chilitechnology.com",
"last_resolved": "2016-09-16 00:00:00"
}
],
"response_code": 1,
"verbose_msg": "IP address in dataset"
},
"match": true
}
}
]
}
'''
parsed_json = json.loads(raw)
wanted = ['as_owner', 'asn', 'country', 'detected_urls', 'resolutions']
for item in parsed_json["results"]:
    details = item['data']['details']
    for key in wanted:
        print(key, ':', json.dumps(details[key], indent=4))
    # Put a blank line at the end of the details for each item
    print()
Output:
as_owner : "Amazon.com, Inc."
asn : "14618"
country : "US"
detected_urls : []
resolutions : [
    {
        "hostname": "bumbleride.com",
        "last_resolved": "2016-09-15 00:00:00"
    },
    {
        "hostname": "chilitechnology.com",
        "last_resolved": "2016-09-16 00:00:00"
    }
]
BTW, when you fetch JSON data using requests there's no need to call json.loads yourself: the returned Response object has a .json() method that gives you the converted JSON directly, so you don't need its .text attribute.
Here's a more robust version of the main loop of the above code. It simply ignores any missing keys. I didn't post this code earlier because the extra if tests make it slightly less efficient, and I didn't know that keys could be missing.
for item in parsed_json["results"]:
    if 'data' not in item:
        continue
    data = item['data']
    if 'details' not in data:
        continue
    details = data['details']
    for key in wanted:
        if key in details:
            print(key, ':', json.dumps(details[key], indent=4))
    # Put a blank line at the end of the details for each item
    print()
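The missing-key checks can also be expressed with chained dict.get() calls. The following is only a sketch: pick_details is a hypothetical helper name, the data is a trimmed stand-in for the API response, and it assumes that "data" and "details", when present, are objects rather than null.

```python
import json

# Trimmed stand-in for the API response; the second item has no "data" key.
parsed_json = {
    "results": [
        {"data": {"details": {"as_owner": "Amazon.com, Inc.", "asn": "14618", "country": "US"}}},
        {"status": "error"},
    ]
}

wanted = ['as_owner', 'asn', 'country', 'detected_urls']

def pick_details(item, keys):
    """Return the wanted keys that are present in item['data']['details']."""
    # .get() with an empty-dict default lets the lookups chain without a KeyError
    details = item.get('data', {}).get('details', {})
    return {key: details[key] for key in keys if key in details}

picked = [pick_details(item, wanted) for item in parsed_json["results"]]
print(json.dumps(picked, indent=4))
```

Items without "data" or "details" come back as empty dicts, so the caller can decide whether to skip them.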

Related

Python : How to loop through data to access similar keys present inside nested dict

I am calling an API that returns a very large JSON response.
I want to access keys of the same name that appear inside the nested dicts.
I'm using the following lines to make the GET request and store the JSON data:
p25_st_devices = r'https://url_from_where_im_getting_data.com'
header_events = {
'Authorization': 'Basic random_keys'}
r2 = requests.get(p25_st_devices, headers= header_events)
r2_json = json.loads(r2.content)
A sample of the JSON is as follows:
{
"next": "value",
"self": "value",
"managedObjects": [
{
"creationTime": "2021-08-02T10:48:15.120Z",
"type": " c8y_MQTTdevice",
"lastUpdated": "2022-03-24T17:09:01.240+03:00",
"childAdditions": {
"self": "value",
"references": []
},
"name": "PS_MQTT1",
"assetParents": {
"self": "value",
"references": []
},
"self": "value",
"id": "338",
"Building": "value"
},
{
"creationTime": "2021-08-02T13:06:09.834Z",
"type": " c8y_MQTTdevice",
"lastUpdated": "2021-12-27T12:08:20.186+03:00",
"childAdditions": {
"self": "value",
"references": []
},
"name": "FS_MQTT2",
"assetParents": {
"self": "value",
"references": []
},
"self": "value",
"id": "339",
"c8y_IsDevice": {}
},
{
"creationTime": "2021-08-02T13:06:39.602Z",
"type": " c8y_MQTTdevice",
"lastUpdated": "2021-12-27T12:08:20.433+03:00",
"childAdditions": {
"self": "value",
"references": []
},
"name": "PS_MQTT3",
"assetParents": {
"self": "value",
"references": []
},
"self": "value",
"id": "340",
"c8y_IsDevice": {}
}
],
"statistics": {
"totalPages": 423,
"currentPage": 1,
"pageSize": 3
}
}
As I understand it, I can access a single name with r2_json['managedObjects'][0]['name'].
But how do I iterate over this JSON and store all the name values in a list?
EDIT 1:
I also want to collect the id of every entry in managedObjects whose name starts with PS_ and store those in a list.
The expected output would therefore be device_id = ['338','340']
Don't index only element [0] of the list; loop over the whole list:
all_names = []
for obj in r2_json['managedObjects']:  # obj rather than object, which would shadow the built-in
    all_names.append(obj['name'])
print(all_names)
Edit: updated the answer after the OP updated the question.
For your second question you can use startswith(). The code is almost the same.
device_id = []
for obj in r2_json['managedObjects']:
    if obj['name'].startswith("PS_"):
        device_id.append(obj['id'])  # append the id only when the name starts with "PS_"
print(device_id)
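Both loops can also be written as list comprehensions. This is a sketch using a trimmed stand-in for r2_json that keeps only the name and id fields from the sample in the question.

```python
# Trimmed stand-in for r2_json, keeping only the fields used here.
r2_json = {
    "managedObjects": [
        {"name": "PS_MQTT1", "id": "338"},
        {"name": "FS_MQTT2", "id": "339"},
        {"name": "PS_MQTT3", "id": "340"},
    ]
}

# Every name, in the order the objects appear
all_names = [obj["name"] for obj in r2_json["managedObjects"]]

# Only the ids of objects whose name starts with "PS_"
device_id = [
    obj["id"]
    for obj in r2_json["managedObjects"]
    if obj["name"].startswith("PS_")
]

print(all_names)   # ['PS_MQTT1', 'FS_MQTT2', 'PS_MQTT3']
print(device_id)   # ['338', '340']
```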

Python TypeError JSON parsing from api not working

I'm pretty new to Python, so forgive me here.
I'm trying to print/log JSON from a GET request. I need the IDs found along this path: items > track > artists > id.
Basically I'm just trying to get the artist IDs for each track in this JSON. I keep getting errors no matter how I format the lookup. I'm not sure how to index it, let alone run it in a loop to collect all the IDs in an array. I have done this with other JSON files before, but this one is nested more deeply and I get the error: TypeError: list indices must be integers or slices, not str
I figured out that it works when I add [0], like this:
artists = response["items"][0]["track"]["artists"]
But I want to loop and get the IDs for every item, and that option picks just one.
Here is the beginning of the JSON so you can see the layout.
{
"href": "https://api.spotify.com/v1/me/tracks?offset=0&limit=15&market=US",
"items": [
{
"added_at": "2021-12-15T22:26:25Z",
"track": {
"album": {
"album_type": "single",
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/6MlPT0WxdWnrYcpXT8GZF8"
},
"href": "https://api.spotify.com/v1/artists/6MlPT0WxdWnrYcpXT8GZF8",
"id": "6MlPT0WxdWnrYcpXT8GZF8",
"name": "PARKFORD",
"type": "artist",
"uri": "spotify:artist:6MlPT0WxdWnrYcpXT8GZF8"
}
],
"external_urls": {
"spotify": "https://open.spotify.com/album/4o2d8uBTyMfJeaJqSXn9tP"
},
"href": "https://api.spotify.com/v1/albums/4o2d8uBTyMfJeaJqSXn9tP",
"id": "4o2d8uBTyMfJeaJqSXn9tP",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab67616d0000b27332bcd9e1b2234c6cd6b2b2ec",
"width": 640
},
{
"height": 300,
"url": "https://i.scdn.co/image/ab67616d00001e0232bcd9e1b2234c6cd6b2b2ec",
"width": 300
},
{
"height": 64,
"url": "https://i.scdn.co/image/ab67616d0000485132bcd9e1b2234c6cd6b2b2ec",
"width": 64
}
],
"name": "There's Nothing in the Rain",
"release_date": "2021-11-25",
"release_date_precision": "day",
"total_tracks": 1,
"type": "album",
"uri": "spotify:album:4o2d8uBTyMfJeaJqSXn9tP"
},
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/6MlPT0WxdWnrYcpXT8GZF8"
},
"href": "https://api.spotify.com/v1/artists/6MlPT0WxdWnrYcpXT8GZF8",
"id": "6MlPT0WxdWnrYcpXT8GZF8",
"name": "PARKFORD",
"type": "artist",
"uri": "spotify:artist:6MlPT0WxdWnrYcpXT8GZF8"
Here is the code I have written:
uri = MY_LIKED_SONGS
headers = {
"Authorization": f'Bearer {tokens["access_token"]}',
"Content-Type": "application/json",
}
r = requests.get(uri, headers=headers)
response = r.json()
# return response
artist_ids = {}
data = json.dumps(response)
for artist_ids in data["items"]["track"]:
    logger.info(artist_ids["items"]["track"]["artists"]["id"])
    print(artist_ids["_source"]["items"][0]["track"]["id"])
tracks = response["items"]["track"][0]["artists"]["id"]
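json.dumps() converts a Python object into a JSON string, so data["items"] fails: a string cannot be indexed by key. There is no need for json.dumps() or json.loads() here, because r.json() has already parsed the response into Python dicts and lists. The "list indices must be integers or slices, not str" error comes from indexing a list with a key, as in response["items"]["track"]: "items" and "artists" are both lists, so loop over each of them. Following the items > track > artists > id path from the question:
for item in response["items"]:
    for artist in item["track"]["artists"]:
        print(artist["id"])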
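To collect the IDs into a list rather than printing them, a nested comprehension is one option. The sketch below assumes the items > track > artists > id path described in the question and uses a trimmed stand-in for the response; the second item's IDs are made up for illustration.

```python
# Trimmed stand-in for the parsed response (what r.json() returns).
# The IDs in the second item are hypothetical placeholders.
response = {
    "items": [
        {"track": {"artists": [{"id": "6MlPT0WxdWnrYcpXT8GZF8", "name": "PARKFORD"}]}},
        {"track": {"artists": [{"id": "artist-a"}, {"id": "artist-b"}]}},
    ]
}

# "items" and "artists" are both lists, so each gets its own loop
artist_ids = [
    artist["id"]
    for item in response["items"]
    for artist in item["track"]["artists"]
]
print(artist_ids)
```

A track with several artists contributes one entry per artist, so the list can be longer than the number of tracks.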

How to Fetch the value of any item from the JSON output in Python?

I have a Python function that fetches data in JSON format. I need to get the value of one item and store it in a variable so that I can use it in another function.
import requests
import json
import sys
def printResponse(r):
    print '{} {}\n'.format(json.dumps(r.json(),
                                      sort_keys=True,
                                      indent=4,
                                      separators=(',', ': ')), r)
r = requests.get('https://wiki.tourist.com/rest/api/content',
params={'title' : 'Release Notes for 1.1n1'},
auth=('ABCDE', '*******'))
printResponse(r)
getPageid = json.loads(r)
value = int(getPageid['results']['id'])
I am trying to get the value of the id item (160925) into the variable "value" so that I can use it in another function.
Below is the JSON output:
{
"_links": {
"base": "https://wiki.tourist.com",
"context": "",
"self": "https://wiki.tourist.com/rest/api/content?title=Notes+for+1.1u1"
},
"limit": 25,
"results": [
{
"_expandable": {
"ancestors": "",
"body": "",
"children": "/rest/api/content/160925/child",
"container": "",
"descendants": "/rest/api/content/160925/descendant",
"history": "/rest/api/content/160925/history",
"metadata": "",
"operations": "",
"space": "/rest/api/space/Next",
"version": ""
},
"_links": {
"self": "https://wiki.tourist.com/rest/api/content/160925412",
"tinyui": "/x/5IaXCQ",
"webui": "/display/Next/Notes+for+1.1u1"
},
"extensions": {
"position": "none"
},
"id": "160925",
"status": "current",
"title": "Notes for 1.1u1",
"type": "page"
}
],
"size": 1,
"start": 0
} <Response [200]>
The "results" key in the JSON response corresponds to a list, so you need to index into that list to get a dict.
For example, getPageid['results'][0]['id'] returns the string value of the "id" key for the first object in the "results" list. Note also that json.loads(r) will fail because r is a Response object rather than a string; use r.json() (or json.loads(r.text)) instead.
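As a sketch of that lookup, with a hard-coded string standing in for the response body (in the question it would come from the requests call) and a guard for an empty result list, which is an added assumption about what the API may return:

```python
import json

# Trimmed stand-in for the response body from the question.
body = '{"results": [{"id": "160925", "title": "Notes for 1.1u1", "type": "page"}], "size": 1}'

getPageid = json.loads(body)      # json.loads needs a str, not a Response object
results = getPageid["results"]    # a list of page dicts

if results:
    value = int(results[0]["id"])  # first match; the id is a string in the JSON
else:
    value = None                   # no page matched the title
print(value)
```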

Need help pulling values from JSON using python, getting TypeError

Here's an example of the JSON I'm pulling from a URL:
[
{
"externalModelId": "500A000000RQOwnIAH",
"resource": {
"account": {
"externalModelId": "001A000001EucpoIAB",
"resource": {
"accountName": "Foobar",
"accountNumber": 1234567,
},
"resourceReliability": "Fresh"
},
"caseNumber": 1234567,
"created": "2015-06-12T19:06:22.000Z",
"createdBy": {
"externalModelId": "005A0000005mhdXIAQ",
"resourceReliability": "Fresh"
},
"description": "Example description",
"hoursInCurrentStatus": 406,
"internalPriority": "3 (Normal)",
"lastModified": "2015-06-22T14:08:18.000Z",
"owner": {
"externalModelId": "005A0000001sKDzIAM",
"resourceReliability": "Fresh"
},
"product": {
"resource": {
"line": {
"externalModelId": 21118,
"resource": {
"name": null
},
"resourceReliability": "Fresh"
},
"version": {
"externalModelId": 21988,
"resource": {
"name": "1.2"
},
"resourceReliability": "Fresh"
}
},
"resourceReliability": "Fresh"
},
"resourceCount": 0,
"sbrs": [
"Value"
],
"sbt": 139,
"severity": "4 (Low)",
"status": "Status Example",
"subject": "Subject Example",
"tags": [
"br",
"fs"
],
"targetDate": "2015-07-15T17:46:48.000Z",
"type": "Feature"
},
"resourceReliability": "Fresh"
},
I'm interested in pulling the following values from it:
caseNumber
subject
severity
sbt
sbrs
status
The code I currently have is:
#!/usr/bin/env python
import sys
import requests
import json
import os
# Setup
username = "XXX"
password = "XXX"
accountid = "12345"
# Formulate the string and then capture the output
url = "http://XXX{0}XXX{1}XXXXX".format(accountid, filedtime)
r = requests.get(url, auth=(username, password))
parsed = json.loads(r.text)
parent = parsed['resource']
# Using json_string for testing
#json_string = json.dumps(parsed, indent=4, sort_keys=True)
#print json_string
for item in parent:
    print item['caseNumber']
    print item['subject']
    print item['severity']
    print item['sbt']
    print item['sbrs']
    print item['status']
The code outputs a TypeError:
Traceback (most recent call last):
  File "./newcase-notify.py", line 31, in <module>
    parent = parsed['resource']
TypeError: list indices must be integers, not str
I've tried specifying something like:
parent = parsed['resource'][0]['type']
but that doesn't work. I think I'm confused at this point. If I don't specify a parent and simply iterate through 'parsed' like:
for item in parsed:
    print item['caseNumber']
    print item['subject']
    print item['severity']
    print item['sbt']
    print item['sbrs']
    print item['status']
I get KeyErrors again.
My Question:
Given the information provided, how can I pull the above mentioned values from my JSON object?
I solved this by removing:
parent = parsed['resource']
and using:
for item in parsed:
    print item['resource']['caseNumber']
    print item['resource']['subject']
    print item['resource']['severity']
etc.
If you look at the top of your JSON you'll notice this:
[
{
That means an array with an object inside. You need to dereference that object from the array first, which is why you get the error saying list indices must be integers, not str. Once you do that it should work.
parent = parsed[0]['resource'] should fix you right up. Note that parent is then a dict for the first case only, so read the values from it directly (parent['caseNumber'], etc.) rather than looping over it.
Just to help you translate between the nomenclatures:
a JS Array corresponds to a Python list, and a JS Object corresponds to a Python dict.
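Putting the two answers together, one way to pull the requested fields from every case in the list is sketched below. The data is a trimmed stand-in for the parsed response, and using .get() (so a missing field becomes None instead of raising KeyError) is an assumption about how missing fields should be handled.

```python
# Trimmed stand-in for the parsed response: a list of case dicts,
# each with the case fields nested under "resource".
parsed = [
    {
        "externalModelId": "500A000000RQOwnIAH",
        "resource": {
            "caseNumber": 1234567,
            "subject": "Subject Example",
            "severity": "4 (Low)",
            "sbt": 139,
            "sbrs": ["Value"],
            "status": "Status Example",
        },
    }
]

wanted = ["caseNumber", "subject", "severity", "sbt", "sbrs", "status"]

cases = []
for item in parsed:               # the top level is a list, so iterate over it
    resource = item["resource"]   # each element is a dict holding "resource"
    cases.append({key: resource.get(key) for key in wanted})

for case in cases:
    print(case)
```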

output every attribute, value in an uneven JSON object

I have a very long and unevenly nested JSON object, and I want to output every attribute and value at the end points (leaves) of the object.
For instance, it could look like this:
data = {
"Response": {
"Version": "2.0",
"Detail": {
"TransactionID": "Ib410c-2",
"Timestamp": "04:00"
},
"Transaction": {
"Severity": "Info",
"ID": "2222",
"Text": "Success"
},
"Detail": {
"InquiryDetail": {
"Value": "804",
"CountryISOAlpha2Code": "US"
},
"Product": {
"ID": "PRD",
"Org": {
"Header": {
"valuer": "804"
},
"Location": {
"Address": [
{
"CountryISOAlpha2Code": "US",
"Address": [
{
"Text": {
"#Value": 2,
"$": "Hill St"
}
}
]
}
]
}
}
}
}
}
}
I want to output every leaf: either the final attribute name or the entire path to it, together with the value.
I know I just need to add something to this:
data = json.loads(inputFile)
small = repeat(data)
for attribute,value in small.iteritems():
    print attribute,value
You could use recursion:
def print_leaf_keyvalues(d):
    for key, value in d.iteritems():
        if hasattr(value, 'iteritems'):
            # recurse into nested dictionary
            print_leaf_keyvalues(value)
        else:
            print key, value
Demo on your sample data:
>>> print_leaf_keyvalues(data)
Version 2.0
valuer 804
Address [{'CountryISOAlpha2Code': 'US', 'Address': [{'Text': {'#Value': 2, '$': 'Hill St'}}]}]
ID PRD
CountryISOAlpha2Code US
Value 804
Text Success
Severity Info
ID 2222
However, this does not descend into the list value of Address. You can add an additional test for sequences and iterate over them, recursing into each element.
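A sketch of that extension, written for Python 3 (items() instead of iteritems()) and yielding the full path to each leaf, might look like the following. iter_leaves is a hypothetical name, and the data is a trimmed version of the sample in the question.

```python
def iter_leaves(node, path=()):
    """Yield (path, value) for every leaf under nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_leaves(value, path + (key,))
    elif isinstance(node, list):
        # list elements are addressed by their index in the path
        for index, value in enumerate(node):
            yield from iter_leaves(value, path + (index,))
    else:
        yield path, node

# Trimmed version of the sample data from the question.
data = {
    "Response": {
        "Version": "2.0",
        "Detail": {
            "Product": {
                "Org": {
                    "Location": {
                        "Address": [
                            {
                                "CountryISOAlpha2Code": "US",
                                "Address": [{"Text": {"#Value": 2, "$": "Hill St"}}],
                            }
                        ]
                    }
                }
            }
        },
    }
}

leaves = list(iter_leaves(data))
for path, value in leaves:
    print(".".join(str(part) for part in path), "=", value)
```

Empty dicts and empty lists produce no output in this version; if they should count as leaves, yield them explicitly before recursing.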
