How to extract a particular value from nested JSON values? - python

I am new to Python.
I have a small requirement: I want to extract only one value from a JSON document.
Please correct me if I am wrong.
The JSON input is:
{
"meta": {
"limit": 1,
"next": "/api/v1/ips/?username=sic1&api_key=689db0740ed73c2bf6402a7de0fcf2d7b57111ca&limit=1&objects=&offset=1",
"offset": 0,
"previous": null,
"total_count": 56714
},
"objects": [
{
"_id": "556f4c81dcddec0c41463529",
"bucket_list": [],
"campaign": [
{
"analyst": "prabhu",
"confidence": "medium",
"date": "2015-06-03 14:50:41.440000",
"name": "Combine"
}
],
"created": "2015-06-03 14:50:41.436000",
"ip": "85.26.162.70",
"locations": [],
"modified": "2015-06-18 09:50:51.612000",
"objects": [],
"relationships": [
{
"analyst": "prabhu",
"date": "2015-06-18 09:50:51.369000",
"rel_confidence": "unknown",
"rel_reason": "N/A",
"relationship": "Related_To",
"relationship_date": "2015-06-18 09:50:51.369000",
"type": "Indicator",
"value": "556f4c81dcddec0c4146353a"
}
],
"releasability": [],
"schema_version": 3,
"screenshots": [],
"sectors": [],
"source": [
{
"instances": [
{
"analyst": "prabhu",
"date": "2015-06-03 14:50:41.438000",
"method": "trawl",
"reference": "http://www.openbl.org/lists/base_30days.txt"
}
],
"name": "www.openbl.org"
}
],
"status": "New",
"tickets": [],
"type": "Address - ipv4-addr"
}
]
}
The code I used to get only the IPs from objects:
import requests
from pprint import pprint
import json
url = 'http://127.0.0.1:8080/api/v1/ips/'
params = {'api_key':'xxxxxx','username': 'abcd'}
r = requests.get(url, params=params, verify=False)
parsed = json.loads(r)
print (parsed['objects']['ip'])
The error I am receiving is:
Traceback (most recent call last):
File "testapi.py", line 9, in <module>
parsed = json.loads(r)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
I just want to get the IPs from that JSON input.
Thanks.

You are passing a requests Response object instead of a str to json.loads(). You need to change
parsed = json.loads(r)
to
parsed = json.loads(r.text)
Also, parsed['objects'] is a list, so you need to access its first element and then get the key ip:
>>> print(parsed['objects'][0]['ip'])
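If the API returns more than one entry in objects, indexing with [0] only gives the first IP. A minimal sketch of collecting all of them, assuming the response body looks like the JSON in the question (the response_text string below is a trimmed copy standing in for r.text):

```python
import json

# Trimmed copy of the response body from the question (assumed to be what r.text holds).
response_text = """
{
  "meta": {"limit": 1, "offset": 0, "total_count": 56714},
  "objects": [
    {"_id": "556f4c81dcddec0c41463529", "ip": "85.26.162.70", "status": "New"}
  ]
}
"""

parsed = json.loads(response_text)

# "objects" is a list, so collect the "ip" of every entry instead of only index 0.
ips = [obj["ip"] for obj in parsed["objects"] if "ip" in obj]
print(ips)
```

The `if "ip" in obj` guard is a precaution for entries without that key; drop it if every object is known to carry an ip.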

The problem is in this line: parsed = json.loads(r)
You're receiving the JSON response, but instead of feeding the JSON text to json.loads you're feeding it the Response object itself, <Response [200]>:
>>> r = requests.get('http://www.google.com')
>>> r
<Response [200]>
>>> type(r)
<class 'requests.models.Response'>
(Look closely at the error message: "expected string or buffer" means you're providing something that is NOT a string or buffer; here it is a Response object.)
This is the reason why str(r) didn't work: it just converted <Response [200]> to '<Response [200]>', which obviously is not JSON.
Change this line to parsed = json.loads(r.text).
>>> type(r.text)
<type 'unicode'>
and then parsed['objects'][0]['ip'] should give you the IP address :)
You can find more about the requests module in its documentation.
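The difference between the object and its text can be reproduced without a network call. This is only a sketch: FakeResponse is a hypothetical stand-in for requests.models.Response, keeping nothing but a text attribute. Note that on Python 3 the TypeError message is worded differently from the Python 2.7 one in the question.

```python
import json


class FakeResponse(object):
    """Hypothetical stand-in for requests.models.Response (illustration only)."""

    def __init__(self, text):
        self.text = text  # the body as a string, like Response.text


r = FakeResponse('{"objects": [{"ip": "85.26.162.70"}]}')

error = None
try:
    json.loads(r)  # wrong: r is an object, not a string
except TypeError as exc:
    error = str(exc)
    print("json.loads(r) failed:", error)

parsed = json.loads(r.text)  # right: pass the body text
ip = parsed["objects"][0]["ip"]
print(ip)
```

With a real Response, r.json() does the same decoding as json.loads(r.text) in one call.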

Related

Extracting data from JSON log

I am a beginner when it comes to programming. I'm trying to extract elements from a JSON log file, but I get an error and I don't know how to deal with it.
import json
with open("/Users/milosz/Desktop/logi.json") as f:
    data = json.load(f)
print(type(data['Objects']))
print(data)
for object in data ['Objects']:
    print(object)
Error:
File "/Users/milosz/PycharmProjects/JsonDataExtracter/Program/Python Exracter.py", line 4, in <module>
print(type(data['Objects']))
TypeError: list indices must be integers or slices, not str
Process finished with exit code 1
I am sending the log below.
{
"_id": "635bd4bfc594743ce9b1a5a3",
"dateStart": "2022-10-28T13:09:28.609Z",
"dateFinish": "2022-10-28T13:10:23.698Z",
"method": "customer.file.upsert",
"request": {
"Objects": [
{
"ERPId": "6915",
"B24Id": 403772,
"FileName": "B2B000202",
"FileContent": "JVBERi0xLjMNJeLjz9MN",
"B24EntityId": 3334
}
]
Following up on the guidance from @accdias, here is a code snippet that closes the gaps in your JSON snippet and demonstrates how to access the Objects section:
import json
json_string = """
{
"_id": "635bd4bfc594743ce9b1a5a3",
"dateStart": "2022-10-28T13:09:28.609Z",
"dateFinish": "2022-10-28T13:10:23.698Z",
"method": "customer.file.upsert",
"request": {
"Objects": [
{
"ERPId": "6915",
"B24Id": 403772,
"FileName": "B2B000202",
"FileContent": "JVBERi0xLjMNJeLjz9MN",
"B24EntityId": 3334
}
]
}
}
"""
json_dict = json.loads(json_string)
print(json_dict["request"]["Objects"])
Output:
[{'ERPId': '6915', 'B24Id': 403772, 'FileName': 'B2B000202', 'FileContent': 'JVBERi0xLjMNJeLjz9MN', 'B24EntityId': 3334}]
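The error in the question ("list indices must be integers or slices, not str") suggests that the top level of the real log file is a list of such records rather than a single object. That is an assumption, since only one record was posted. A sketch that handles both shapes, with log_text standing in for the file contents:

```python
import json

# Assumption: the log file is a JSON array of records like the one in the question.
log_text = """
[
  {
    "_id": "635bd4bfc594743ce9b1a5a3",
    "method": "customer.file.upsert",
    "request": {
      "Objects": [
        {"ERPId": "6915", "B24Id": 403772, "FileName": "B2B000202"}
      ]
    }
  }
]
"""

data = json.loads(log_text)

# Normalise: wrap a single record in a list so both shapes are handled alike.
records = data if isinstance(data, list) else [data]

file_names = []
for record in records:
    # .get() with defaults avoids a KeyError when "request" or "Objects" is missing.
    for obj in record.get("request", {}).get("Objects", []):
        file_names.append(obj["FileName"])

print(file_names)
```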

Python to parse nested JSON values that can be null sometimes

I'm trying to parse the following and pull out primary_ip as a variable. Sometimes primary_ip is "null". Here is an example of the JSON, code and the most recent error I am getting.
{
"count": 67,
"next": "https://master.netbox.dev/api/dcim/devices/?limit=50&offset=50",
"previous": null,
"results": [
{
"id": 28,
"url": "https://master.netbox.dev/api/dcim/devices/28/",
"name": "q2",
"display_name": "q2",
"device_type": {
"id": 20,
"url": "https://master.netbox.dev/api/dcim/device-types/20/",
"manufacturer": {
"id": 15,
"url": "https://master.netbox.dev/api/dcim/manufacturers/15/",
"name": "Zyxel",
"slug": "zyxel"
},
"model": "GS1900",
"slug": "gs1900",
"display_name": "Zyxel GS1900"
},
"device_role": {
"id": 4,
"url": "https://master.netbox.dev/api/dcim/device-roles/4/",
"name": "Access Switch",
"slug": "access-switch"
},
"primary_ip": {
"id": 301,
"url": "https://master.netbox.dev/api/ipam/ip-addresses/301/",
"family": 4,
"address": "172.31.254.241/24"
},
Example Python
import requests
import json
headers = {
'Authorization': 'Token 63d421a5f733dd2c5070083e80df8b4d466ae525',
'Accept': 'application/json; indent=4',
}
response = requests.get('https://master.netbox.dev/api/dcim/sites/', headers=headers)
j = response.json()
for results in j['results']:
x=results.get('name')
y=results.get('physical_address')
response2 = requests.get('https://master.netbox.dev/api/dcim/devices', headers=headers)
device = response2.json()
for result in device['results']:
x=result.get('name')
z=result.get('site')['name']
# if result.get('primary_ip') != None
y=result.get('primary_ip', {}).get('address')
print(x,y,z)
I get the following error when I run it:
ubuntu@ip-172-31-39-26:~$ python3 Netbox-python
Traceback (most recent call last):
File "Netbox-python", line 22, in <module>
y=result.get('primary_ip', {}).get('address')
AttributeError: 'NoneType' object has no attribute 'get'
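Which value is None? Is it primary_ip or is it address?
The default passed to .get() is only used when the key is missing. Here the key primary_ip is present but its value is null (None), so result.get('primary_ip', {}) returns None and the chained .get() fails. You could try the following:
y = (result.get('primary_ip') or {}).get('address', 'empty_address')
This will use empty_address whenever primary_ip is missing or None.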
Update:
I have just run your code and got the following output:
LC1 123.123.123.123/24 site1
q1 172.31.254.254/24 COD
q2 172.31.254.241/24 COD
After running this:
import requests
import json
headers = {
"Authorization": "Token 63d421a5f733dd2c5070083e80df8b4d466ae525",
"Accept": "application/json; indent=4",
}
response = requests.get("https://master.netbox.dev/api/dcim/sites/", headers=headers)
j = response.json()
for results in j["results"]:
    x = results.get("name")
    y = results.get("physical_address")
response2 = requests.get("https://master.netbox.dev/api/dcim/devices", headers=headers)
device = response2.json()
for result in device["results"]:
    x = result.get("name")
    z = result.get("site")["name"]
    # primary_ip is null for some devices, so check before reading address
    if result.get("primary_ip") is not None:
        y = result.get("primary_ip").get("address")
        print(x, y, z)
I am not sure of the expected output but the code doesn't throw any errors. From looking at the code, it seems there were a few indentation errors, where things didn't match up in terms of where they should have been indented.
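The null handling can be tried in isolation. A minimal sketch, assuming device records shaped like the ones in the question; the two records and the helper name primary_address are made up for illustration:

```python
# Two hypothetical device records: one with an address, one where primary_ip is null.
devices = [
    {"name": "q2", "primary_ip": {"address": "172.31.254.241/24"}},
    {"name": "no-ip", "primary_ip": None},
]


def primary_address(device, default="empty_address"):
    """Return the primary IP address, or `default` when it is missing or null."""
    # `or {}` replaces None with an empty dict, so the second .get() is always safe.
    return (device.get("primary_ip") or {}).get("address", default)


addresses = [primary_address(d) for d in devices]
print(addresses)
```

Unlike the explicit if check, this form always yields a value, which may be convenient when every device should still be printed.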

How to print out the exact field/string of the JSON output?

I'm trying to filter the result that I get from a GET request.
The output that I want is just summary, key and self.
But I'm getting a lot of JSON data.
I've tried googling how to do this and I'm getting nowhere.
Here is my code:
The commented lines are the things that I have tried.
import requests
import json
import re
import sys
url ="--------"
auth='i.g--t----------', 'X4------'
r = requests.get(url, auth=(auth))
data = r.json()
#print( json.dumps(data, indent=2) )
#res1 = " ".join(re.split("summary", data))
#print ("first string result: ", str(res1))
#json_str = json.dumps(data)
#resp = json.loads(json_str)
#print (resp['id'])
#resp_dict = json.loads(resp_str)
#resp_dict.get('name')
#print('dasdasd', json_str["summary"])
Example of the GET API output that I'm getting using this code: print( json.dumps(data, indent=2) )
{
"id": "65621",
"self": "https://bboxxltd.atlassian.net/rest/api/2/issue/65621",
"key": "CMS-5901",
"fields": {
"summary": "new starter: Edoardo Bologna",
"customfield_10700": [
{
"id": "2",
"name": "BBOXX Rwanda HQ",
"_links": {
"self": "https://bboxxltd.atlassian.net/rest/servicedeskapi/organization/2"
}
}
},
"inwardIssue": {
"id": "65862",
"key": "BMT-2890",
"self": "https://bboxxltd.atlassian.net/rest/api/2/issue/65862",
"fields": {
"summary": "ERP Databases access with Read Only",
"status": {
"self": "https://bboxxltd.atlassian.net/rest/api/2/status/10000",
"description": "",
"iconUrl": "https://bboxxltd.atlassian.net/",
"name": "To Do",
"id": "10000",
"statusCategory": {
"self": "https://bboxxltd.atlassian.net/rest/api/2/statuscategory/2",
"id": 2,
"key": "new",
"colorName": "blue-gray",
"name": "To Do"
}
},
"priority": {
"self": "https://bboxxltd.atlassian.net/rest/api/2/priority/4",
"iconUrl": "https://bboxxltd.atlassian.net/images/icons/priorities/low.svg",
"name": "Low",
My error is:
Traceback (most recent call last):
File "c:/Users/IanJayloG/Desktop/Python Files/Ex_Files_Learning_Python/Exercise Files/Test/Untitled-1.py", line 17, in <module>
print('dasdasd', data["summary"])
KeyError: 'summary'
PS C:\Users\IanJayloG\Desktop\Python Files\Ex_Files_Learning_Python\Exercise Files> & C:/Users/IanJayloG/AppData/Local/Programs/Python/Python37-32/python.exe "c:/Users/IanJayloG/Desktop/Python Files/Ex_Files_Learning_Python/Exercise Files/Test/Untitled-1.py"
Traceback (most recent call last):
File "c:/Users/IanJayloG/Desktop/Python Files/Ex_Files_Learning_Python/Exercise Files/Test/Untitled-1.py", line 17, in <module>
print('dasdasd', json_str["summary"])
TypeError: string indices must be integers
The problem behind your error message
print('dasdasd', json_str["summary"])
TypeError: string indices must be integers
is that you try to access the named field summary on a string (variable json_str), which does not work because strings don't have fields to access by name. If you use the indexing [] operator on a string, you can only provide integers or ranges to extract single characters or sequences from that string. This is obviously not what you're intending.
The keys self and key are at the top level of your JSON document, whereas summary is under fields, which is why data["summary"] raises KeyError: 'summary'. This should do it, without any extra transformation applied:
import requests
# url and auth as defined in the question
r = requests.get(url, auth=auth)
data = r.json()
data_summary = data['fields']['summary']
data_self = data['self']
data_key = data['key']
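The same lookups can be checked offline. In this sketch the data dictionary is a trimmed copy of the output shown in the question, standing in for what r.json() returns:

```python
# Trimmed copy of the issue document from the question.
data = {
    "id": "65621",
    "self": "https://bboxxltd.atlassian.net/rest/api/2/issue/65621",
    "key": "CMS-5901",
    "fields": {"summary": "new starter: Edoardo Bologna"},
}

wanted = {
    "summary": data["fields"]["summary"],  # nested under "fields"
    "key": data["key"],                    # top level
    "self": data["self"],                  # top level
}

for name, value in wanted.items():
    print("{}: {}".format(name, value))
```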

I have to send the content of a list to a string

I have to send the content of a list to a single string.
I've tried using a loop, but I could only print the result:
for item in output_list:
    for line in item['output'].split('\n'):
        print(line)
This is the list:
output_list =
{
"jsonrpc": "2.0",
"result": [
{},
{
"tablesLastChangeTime": 1483721367.4560423,
"tablesAgeOuts": 0,
"tablesInserts": 3,
"lldpNeighbors": [
{
"ttl": 120,
"neighborDevice": "HP830_LSW",
"neighborPort": "GigabitEthernet1/0/12",
"port": "Ethernet47"
},
{
"ttl": 120,
"neighborDevice": "HP_5500EI",
"neighborPort": "GigabitEthernet2/0/22",
"port": "Ethernet48"
},
{
"ttl": 120,
"neighborDevice": "HP_5500EI",
"neighborPort": "GigabitEthernet1/0/24",
"port": "Management1"
}
],
"tablesDeletes": 0,
"tablesDrops": 0
}
],
"id": "EapiExplorer-1"
}
I want to send the content of a list to a single string.
The json module contains a function called dumps, which takes your dictionary and returns a string.
import json
my_string = json.dumps(output_list)
There are a couple of methods here you could use. The first is to generically pretty-print:
import pprint
...
pprint.pprint(output_list)
The second is to output in json format, since your output_list looks like it belongs that way:
import json
...
print(json.dumps(output_list))
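If the goal is a single string variable rather than printed output, both routes can be sketched as follows. The structure is a trimmed copy of the one in the question, and the "port -> device" line format is only an illustration:

```python
import json

# Trimmed copy of the structure from the question.
output_list = {
    "jsonrpc": "2.0",
    "result": [
        {},
        {
            "lldpNeighbors": [
                {"neighborDevice": "HP830_LSW", "port": "Ethernet47"},
                {"neighborDevice": "HP_5500EI", "port": "Ethernet48"},
            ]
        },
    ],
}

# Option 1: the whole structure as one JSON-formatted string.
as_json = json.dumps(output_list, indent=2)

# Option 2: one line per neighbor, joined into a single string.
lines = []
for entry in output_list["result"]:
    for neighbor in entry.get("lldpNeighbors", []):
        lines.append("{} -> {}".format(neighbor["port"], neighbor["neighborDevice"]))
summary = "\n".join(lines)

print(summary)
```

Collecting the pieces in a list and calling "\n".join once is the usual way to turn a loop that prints into a loop that builds a string.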

How to parse complex json in python 2.7.5?

I am trying to list the names of my Puppet classes from a Puppet Enterprise 3.7 puppet master, using Puppet's REST API.
Here is my script:
#!/usr/bin/env python
import requests
import json
url='https://ppt-001.example.com:4433/classifier-api/v1/groups'
headers = {"Content-Type": "application/json"}
data={}
cacert='/etc/puppetlabs/puppet/ssl/certs/ca.pem'
key='/etc/puppetlabs/puppet/ssl/private_keys/ppt-001.example.com.pem'
cert='/etc/puppetlabs/puppet/ssl/certs/ppt-001.example.com.pem'
result = requests.get(url,
    data=data, #no data needed for this request
    headers=headers, #dict {"Content-Type":"application/json"}
    cert=(cert,key), #key/cert pair
    verify=cacert
    )
print json.dumps( result.json(), sort_keys=True, indent=4, separators=(',', ': '))
for i in result.json:
    print i
Here is the error message I get when I execute the script:
Traceback (most recent call last):
File "./add-group.py", line 42, in <module>
for i in result.json:
TypeError: 'instancemethod' object is not iterable
Here is a sample of the data I get back from the REST API:
[
{
"classes": {},
"environment": "production",
"environment_trumps": false,
"id": "00000000-0000-4000-8000-000000000000",
"name": "default",
"parent": "00000000-0000-4000-8000-000000000000",
"rule": [
"and",
[
"~",
"name",
".*"
]
],
"variables": {}
},
{
"classes": {
"puppet_enterprise": {
"certificate_authority_host": "ppt-001.example.com",
"console_host": "ppt-001.example.com",
"console_port": "443",
"database_host": "ppt-001.example.com",
"database_port": "5432",
"database_ssl": true,
"mcollective_middleware_hosts": [
"ppt-001.example.com"
],
"puppet_master_host": "ppt-001.example.com",
"puppetdb_database_name": "pe-puppetdb",
"puppetdb_database_user": "pe-puppetdb",
"puppetdb_host": "ppt-001.example.com",
"puppetdb_port": "8081"
}
},
"environment": "production",
"environment_trumps": false,
"id": "52c479fe-3278-4197-91ea-9127ba12474e",
"name": "PE Infrastructure",
"parent": "00000000-0000-4000-8000-000000000000",
"variables": {}
},
.
.
.
How should I go about accessing the name key and getting values like default and PE Infrastructure?
I have read the other answers here on SO saying that one should use json.loads(), and I have tried using parsed_json = json.loads(result.json()), but that results in this error message:
Traceback (most recent call last):
File "./add-group.py", line 38, in <module>
parsed_json = json.loads(result.json())
File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python2.7/json/decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
parsed_json = json.loads(result.json())
The first parameter of json.loads must be a string or buffer, as stated by the TypeError you're getting (TypeError: expected string or buffer).
Your variable result is an instance of Response, and the method .json() already returns the decoded data (for your sample, a list of dictionaries). Passing that to json.loads() is what raises the error, so just use result.json() directly. Your print json.dumps( result.json(), ... ) line is fine for pretty-printing, because json.dumps accepts lists and dictionaries. The other error, 'instancemethod' object is not iterable, comes from for i in result.json: which iterates over the method itself instead of calling it; write result.json() with parentheses.
After the change, to access something like the name attribute, you could do something like:
for item in result.json():
    try:
        print(item['name'])
    except KeyError:
        print("There is no 'name' attribute")
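To collect the names instead of printing them, a list comprehension over the decoded list works. This sketch uses a trimmed copy of the sample data as response_text in place of the live API call; the third entry, which has no name, is a made-up case to show the guard. It is written to run on both Python 2.7 and Python 3.

```python
import json

# Trimmed copy of the sample data; the last entry is a hypothetical one without "name".
response_text = """
[
  {"name": "default", "environment": "production"},
  {"name": "PE Infrastructure", "environment": "production"},
  {"environment": "production"}
]
"""

groups = json.loads(response_text)  # result.json() would return the same structure

# The top level is a list, so iterate over it and read "name" from each dict.
names = [group["name"] for group in groups if "name" in group]
print(names)
```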
