How to parse complex json in python 2.7.5? - python

I am trying to list the names of my Puppet classes from a Puppet Enterprise 3.7 puppet master, using Puppet's REST API.
Here is my script:
#!/usr/bin/env python
import requests
import json
url='https://ppt-001.example.com:4433/classifier-api/v1/groups'
headers = {"Content-Type": "application/json"}
data={}
cacert='/etc/puppetlabs/puppet/ssl/certs/ca.pem'
key='/etc/puppetlabs/puppet/ssl/private_keys/ppt-001.example.com.pem'
cert='/etc/puppetlabs/puppet/ssl/certs/ppt-001.example.com.pem'
result = requests.get(url,
    data=data, #no data needed for this request
    headers=headers, #dict {"Content-Type":"application/json"}
    cert=(cert,key), #key/cert pair
    verify=cacert
    )
print json.dumps( result.json(), sort_keys=True, indent=4, separators=(',', ': '))
for i in result.json:
    print i
Here is the error message I get when I execute the script:
Traceback (most recent call last):
File "./add-group.py", line 42, in <module>
for i in result.json:
TypeError: 'instancemethod' object is not iterable
Here is a sample of the data I get back from the REST API:
[
{
"classes": {},
"environment": "production",
"environment_trumps": false,
"id": "00000000-0000-4000-8000-000000000000",
"name": "default",
"parent": "00000000-0000-4000-8000-000000000000",
"rule": [
"and",
[
"~",
"name",
".*"
]
],
"variables": {}
},
{
"classes": {
"puppet_enterprise": {
"certificate_authority_host": "ppt-001.example.com",
"console_host": "ppt-001.example.com",
"console_port": "443",
"database_host": "ppt-001.example.com",
"database_port": "5432",
"database_ssl": true,
"mcollective_middleware_hosts": [
"ppt-001.example.com"
],
"puppet_master_host": "ppt-001.example.com",
"puppetdb_database_name": "pe-puppetdb",
"puppetdb_database_user": "pe-puppetdb",
"puppetdb_host": "ppt-001.example.com",
"puppetdb_port": "8081"
}
},
"environment": "production",
"environment_trumps": false,
"id": "52c479fe-3278-4197-91ea-9127ba12474e",
"name": "PE Infrastructure",
"parent": "00000000-0000-4000-8000-000000000000",
"variables": {}
},
.
.
.
How should I go about accessing the name key and getting values like default and PE Infrastructure?
I have read the other answers here on SO saying that one should use json.loads(), and I have tried parsed_json = json.loads(result.json()), but that results in this error message:
Traceback (most recent call last):
File "./add-group.py", line 38, in <module>
parsed_json = json.loads(result.json())
File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python2.7/json/decoder.py", line 365, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

parsed_json = json.loads(result.json())
The first parameter of json.loads() must be a string or buffer, as stated by the TypeError you're getting (TypeError: expected string or buffer).
Your variable result is an instance of Response, and its .json() method returns the already-decoded response body, which for this endpoint is a list of dictionaries. Since you're passing the result of .json() to json.loads(), you're getting that error. You could either just use result.json(), which already holds the parsed data, or call json.loads(result.text), where result.text is your JSON result as a string/unicode. The print json.dumps( result.json(), ... ) line is fine as it is, because json.dumps() takes a Python object. The other error (TypeError: 'instancemethod' object is not iterable) comes from the loop over result.json, where the parentheses are missing, so you are iterating over the method itself rather than over its return value.
After the change, to access something like the name attribute, you could do something like:
for item in result.json():
    try:
        print item['name']
    except KeyError:
        print "There is no 'name' attribute"
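The same idea can be sketched without a network call. The snippet below parses a trimmed copy of the sample response and collects the group names and the class names; the variable names (sample_text, group_names, class_names) are illustrative, and it assumes the real response has the same top-level shape, a list of group dictionaries. It is written so that it runs on Python 3 as well.

```python
import json

# Trimmed copy of the sample response from the question (assumption: the real
# response has the same shape, a list of group dictionaries).
sample_text = '''
[
  {"classes": {}, "environment": "production", "name": "default"},
  {"classes": {"puppet_enterprise": {"console_port": "443"}},
   "environment": "production", "name": "PE Infrastructure"}
]
'''

groups = json.loads(sample_text)  # plays the role of result.json() in the script

# .get() returns None instead of raising KeyError when "name" is missing.
group_names = [group.get("name") for group in groups]

# Class names are the keys of each group's "classes" dictionary.
class_names = sorted({cls for group in groups for cls in group.get("classes", {})})

print(group_names)
print(class_names)
```

Using .get() here is a design choice: it trades the explicit try/except for a None placeholder, which is convenient when only some groups lack a key.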

Related

How do I do multiple JSON entries with Python?

I'm trying to pull some data from a flight simulation JSON table. It's updated every 15 seconds and I've been trying to pull a value with print(obj['pilots']['flight_plans']['cid']). However, I'm getting the error:
Traceback (most recent call last):
File "main.py", line 18, in <module>
print(obj['pilots']['flight_plans']['cid'])
TypeError: list indices must be integers or slices, not str
My code is below
import json
from urllib.request import urlopen
import urllib
# initial setup
URL = "https://data.vatsim.net/v3/vatsim-data.json"
# json entries
response = urllib.request.urlopen(URL)
str_response = response.read().decode('utf-8')
obj = json.loads(str_response)
# result is connections
# print(obj["general"]["connected_clients"])
print(obj['pilots']['flight_plans']['cid'])
The print(obj["general"]["connected_clients"]) does work.
Investigate your obj with print(json.dumps(obj, indent=2)). You'll find that the pilots key is a list of dictionaries, each containing flight_plan (not plural) and cid keys. Here are the first few lines:
{
"general": {
"version": 3,
"reload": 1,
"update": "20220301062202",
"update_timestamp": "2022-03-01T06:22:02.245318Z",
"connected_clients": 292,
"unique_users": 282
},
"pilots": [
{
"cid": 1149936,
"name": "1149936",
"callsign": "URO504",
"server": "UK",
"pilot_rating": 0,
"latitude": -23.39706,
"longitude": -46.3709,
"altitude": 9061,
"groundspeed": 327,
"transponder": "0507",
"heading": 305,
"qnh_i_hg": 29.97,
"qnh_mb": 1015,
"flight_plan": {
"flight_rules": "I",
"aircraft": "A346",
...
For example, iterate over the list of pilots and print name/cid:
for pilot in obj['pilots']:
    print(pilot['name'],pilot['cid'])
Output:
1149936 1149936
Nick Aydin OTHH 1534423
Oguz Aydin 1429318
Marvin Steglich LSZR 1482019
Daniel Krol EPKK 1279199
... etc ...
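To reach values inside the nested flight_plan dictionary, the same loop can go one level deeper. This is only a sketch on a trimmed stand-in for the parsed feed: the second pilot entry is hypothetical and is there to illustrate the assumption that some pilots may carry no flight plan, and aircraft_by_cid is an illustrative helper name.

```python
# Trimmed stand-in for the parsed feed. The second pilot is hypothetical and
# only illustrates a pilot without a flight plan (an assumption).
obj = {
    "general": {"connected_clients": 292},
    "pilots": [
        {"cid": 1149936, "callsign": "URO504",
         "flight_plan": {"flight_rules": "I", "aircraft": "A346"}},
        {"cid": 1000001, "callsign": "TEST01", "flight_plan": None},
    ],
}


def aircraft_by_cid(data):
    """Map each pilot's cid to the aircraft in its flight plan, or None."""
    result = {}
    for pilot in data["pilots"]:               # "pilots" is a list, so iterate
        plan = pilot.get("flight_plan") or {}  # tolerate null or missing plans
        result[pilot["cid"]] = plan.get("aircraft")
    return result


print(aircraft_by_cid(obj))
```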

How to print out the exact field/string of the JSON output?

I'm trying to filter the result that I get from a GET request.
The output that I want is just the summary, key and self fields.
But I'm getting a lot of JSON data.
I've tried googling how to do this and I'm getting nowhere.
Here is my code.
The commented lines are the code that I have tried.
import requests
import json
import re
import sys
url ="--------"
auth='i.g--t----------', 'X4------'
r = requests.get(url, auth=(auth))
data = r.json()
#print( json.dumps(data, indent=2) )
#res1 = " ".join(re.split("summary", data))
#print ("first string result: ", str(res1))
#json_str = json.dumps(data)
#resp = json.loads(json_str)
#print (resp['id'])
#resp_dict = json.loads(resp_str)
#resp_dict.get('name')
#print('dasdasd', json_str["summary"])
Example of the GET API output that I'm getting using this code: print( json.dumps(data, indent=2) )
{
"id": "65621",
"self": "https://bboxxltd.atlassian.net/rest/api/2/issue/65621",
"key": "CMS-5901",
"fields": {
"summary": "new starter: Edoardo Bologna",
"customfield_10700": [
{
"id": "2",
"name": "BBOXX Rwanda HQ",
"_links": {
"self": "https://bboxxltd.atlassian.net/rest/servicedeskapi/organization/2"
}
}
},
"inwardIssue": {
"id": "65862",
"key": "BMT-2890",
"self": "https://bboxxltd.atlassian.net/rest/api/2/issue/65862",
"fields": {
"summary": "ERP Databases access with Read Only",
"status": {
"self": "https://bboxxltd.atlassian.net/rest/api/2/status/10000",
"description": "",
"iconUrl": "https://bboxxltd.atlassian.net/",
"name": "To Do",
"id": "10000",
"statusCategory": {
"self": "https://bboxxltd.atlassian.net/rest/api/2/statuscategory/2",
"id": 2,
"key": "new",
"colorName": "blue-gray",
"name": "To Do"
}
},
"priority": {
"self": "https://bboxxltd.atlassian.net/rest/api/2/priority/4",
"iconUrl": "https://bboxxltd.atlassian.net/images/icons/priorities/low.svg",
"name": "Low",
My error is:
Traceback (most recent call last):
File "c:/Users/IanJayloG/Desktop/Python Files/Ex_Files_Learning_Python/Exercise Files/Test/Untitled-1.py", line 17, in <module>
print('dasdasd', data["summary"])
KeyError: 'summary'
PS C:\Users\IanJayloG\Desktop\Python Files\Ex_Files_Learning_Python\Exercise Files> & C:/Users/IanJayloG/AppData/Local/Programs/Python/Python37-32/python.exe "c:/Users/IanJayloG/Desktop/Python Files/Ex_Files_Learning_Python/Exercise Files/Test/Untitled-1.py"
Traceback (most recent call last):
File "c:/Users/IanJayloG/Desktop/Python Files/Ex_Files_Learning_Python/Exercise Files/Test/Untitled-1.py", line 17, in <module>
print('dasdasd', json_str["summary"])
TypeError: string indices must be integers
The problem behind your error message
print('dasdasd', json_str["summary"])
TypeError: string indices must be integers
is that you are trying to access the named field summary on a string (the variable json_str), which does not work because strings don't have fields to access by name. If you use the indexing operator [] on a string, you can only provide integers or slices to extract single characters or substrings. This is obviously not what you intend. The KeyError from data["summary"] has a different cause: data is a dictionary, but it has no top-level summary key.
The keys self and key are on top level of your JSON document, whereas summary is under fields. This should do it, without any extra transformation applied:
import requests
r = requests.get(url, auth=(auth))
data = r.json()
data_summary = data['fields']['summary']
data_self = data['self']
data_key = data['key']
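For a self-contained check of the same lookups, here is a small sketch that runs on a trimmed copy of the output from the question instead of a live request. The helper name pick_issue_fields is hypothetical, and the use of .get() is an assumption about how missing keys should be handled (None instead of an exception).

```python
# Trimmed copy of the response shown in the question.
data = {
    "id": "65621",
    "self": "https://bboxxltd.atlassian.net/rest/api/2/issue/65621",
    "key": "CMS-5901",
    "fields": {"summary": "new starter: Edoardo Bologna"},
}


def pick_issue_fields(issue):
    """Return only key, self and summary from one parsed issue dictionary."""
    fields = issue.get("fields", {})       # summary lives one level down
    return {
        "key": issue.get("key"),           # top-level key
        "self": issue.get("self"),         # top-level key
        "summary": fields.get("summary"),  # nested under "fields"
    }


picked = pick_issue_fields(data)
print(picked)
```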

Creating new orders with Python API, get AttributeError: 'str' object has no attribute 'iteritems'

The code I have that's causing this is
new_order = shopify.Order.create(json.dumps({'order': { "email": "foo@example.com", "fulfillment_status": "fulfilled", "line_items": [{'message': "words go here"}]}}))
I tried without the json.dumps and got the response that it was an unhashable type. I also tried this after some research:
data = dict()
data['order']= { "email": "foo@example.com", "fulfillment_status": "fulfilled", "line_items": [{'message': "words go here"}]}
print(data['order'])
new_order = shopify.Order.create(json.dumps(data))
What can I do to properly send in a simple order like in https://help.shopify.com/api/reference/order#create
C:\Python27\python.exe C:/Users/Kris/Desktop/moon_story/story_app.py
Traceback (most recent call last):
File "C:/Users/Kris/Desktop/moon_story/story_app.py", line 41, in <module>
{'fulfillment_status': 'fulfilled', 'email': 'foo@example.com', 'line_items': [{'message': 'words go here'}]}
get_story(1520)
File "C:/Users/Kris/Desktop/moon_story/story_app.py", line 29, in get_story
new_order = shopify.Order.create(json.dumps(data))
File "C:\Python27\lib\site-packages\pyactiveresource\activeresource.py", line 448, in create
resource = cls(attributes)
File "C:\Python27\lib\site-packages\shopify\base.py", line 126, in __init__
prefix_options, attributes = self.__class__._split_options(attributes)
File "C:\Python27\lib\site-packages\pyactiveresource\activeresource.py", line 465, in _split_options
for key, value in six.iteritems(options):
File "C:\Python27\lib\site-packages\six.py", line 599, in iteritems
return d.iteritems(**kw)
AttributeError: 'str' object has no attribute 'iteritems'
After some digging, I was able to get this working. You shouldn't need to do anything special with the argument passed to create. The following works for me:
shop_url = "https://%s:%s@%s.myshopify.com/admin" % (shopify_key, shopify_pass, shopify_store_name)
shopify.ShopifyResource.set_site(shop_url)
order_data = {
    "email": "test@test.com",
    "fulfillment_status": "fulfilled",
    "line_items": [
        {
            "title": "ITEM TITLE",
            "variant_id": 7214792579,
            "quantity": 1,
            "price": 895
        }
    ]
}
shopify.Order.create(order_data)
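The traceback in the question comes from the library iterating over the argument as a dictionary (six.iteritems), which a JSON string does not support. Below is a minimal sketch of that difference, independent of the Shopify library. count_attributes is a hypothetical stand-in for code that iterates over a mapping; it is not part of shopify or pyactiveresource.

```python
import json

order_data = {
    "email": "test@test.com",
    "fulfillment_status": "fulfilled",
    "line_items": [{"variant_id": 7214792579, "quantity": 1}],
}

# json.dumps() turns the dictionary into a plain string.
payload = json.dumps(order_data)


def count_attributes(attributes):
    """Hypothetical stand-in for library code that iterates over a mapping."""
    return len(list(attributes.items()))


print(type(order_data).__name__, count_attributes(order_data))

try:
    count_attributes(payload)  # a str has no .items(), just as it has no .iteritems()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```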
It's worth noting that this Python library relies on another Shopify created library called pyactiveresource. That library provides the underlying create method, which calls the save method.
The save method has the following notes about responses:
Args:
None
Returns:
True on success, False on ResourceInvalid errors (sets the errors
attribute if an <errors> object is returned by the server).
Raises:
connection.Error: On any communications problems.
I was continually getting a False response. This helped me understand which fields were actually required by looking at the errors attribute, so I figured it might be helpful here.
Comment: ... get an order(None) as response. ... Any thoughts?
Comparing with help.shopify.com/api/reference, there are the following differences:
The endpoint has to be /admin/orders.json.
Why do you use /admin?
The main key in the JSON dict has to be order.
Why don't you use this, for example:
{
    "order": {
        "email": "foo@example.com",
        "fulfillment_status": "fulfilled",
        "line_items": [
            {
                "variant_id": 447654529,
                "quantity": 1
            }
        ]
    }
}
Use:
new_order = shopify.Order.create(data['order'])

Unable to parse JSON file, keep getting ValueError: Extra Data

So, leading on from my prior issue [found here][1], I'm attempting to parse a JSON file that I've managed to download with @SiHa's help. The JSON is structured like so:
{"properties": [{"property": "name", "value": "A random company name"}, {"property": "companyId", "value": 123456789}]}{"properties": [{"property": "name", "value": "Another random company name"}, {"property": "companyId", "value": 31415999}]}{"properties": [{"property": "name", "value": "Yet another random company"}, {"property": "companyId", "value": 10101010}]}
I've been able to get this by slightly modifying @SiHa's code:
def get_companies():
    create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}".format(hapikey=wta_hubspot_api_key)
    headers = {'content-type': 'application/json'}
    create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)
    if create_get_recent_companies_response.status_code == 200:
        while True:
            for i in create_get_recent_companies_response.json()[u'companies']:
                all_the_companies = { "properties": [
                        { "property": "name", "value": i[u'properties'][u'name'][u'value'] },
                        { "property": "companyId", "value": i[u'companyId'] }
                    ]
                }
                with open("all_the_companies.json", "a") as myfile:
                    myfile.write(json.dumps(all_the_companies))
                #print(companyProperties)
            offset = create_get_recent_companies_response.json()[u'offset']
            hasMore = create_get_recent_companies_response.json()[u'has-more']
            if not hasMore:
                break
            else:
                create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}&offset={offset}".format(hapikey=wta_hubspot_api_key, offset=offset)
                create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)
    else:
        print("Something went wrong, check the supplied field values.\n")
        print(json.dumps(create_get_recent_companies_response.json(), sort_keys=True, indent=4))
So that was part one. Now I'm trying to use the code below to extract two things: 1) the name and 2) the companyId.
#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys
import os.path
import requests
import json
import csv
import glob2
import shutil
import time
import time as howLong
from time import sleep
from time import gmtime, strftime
# Local Testing Version
findCSV = glob2.glob('*contact*.csv')
theDate = time=strftime("%Y-%m-%d", gmtime())
theTime = time=strftime("%H:%M:%S", gmtime())
# Exception handling
try:
    testData = findCSV[0]
except IndexError:
    print ("\nSyncronisation attempted on {date} at {time}: There are no \"contact\" CSVs, please upload one and try again.\n").format(date=theDate, time=theTime)
    print("====================================================================================================================\n")
    sys.exit()
for theCSV in findCSV:
def process_companies():
    with open('all_the_companies.json') as data_file:
        data = json.load(data_file)
        for i in data:
            company_name = data[i][u'name']
            #print(company_name)
            if row[0].lower() == company_name.lower():
                contact_company_id = data[i][u'companyId']
                #print(contact_company_id)
                return contact_company_id
            else:
                print("Something went wrong, check the \"get_companies()\" function.\n")
                print(json.dumps(create_get_recent_companies_response.json(), sort_keys=True, indent=4))
if __name__ == "__main__":
    start_time = howLong.time()
    process_companies()
    print("This operation took %s seconds.\n" % (howLong.time() - start_time))
    sys.exit()
Unfortunately, it's not working - I'm getting the following traceback:
Traceback (most recent call last):
File "wta_parse_json.py", line 62, in <module>
process_companies()
File "wta_parse_json.py", line 47, in process_companies
data = json.load(data_file)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 290, in load
**kw)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 369, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 130 - line 1 column 1455831 (char 129 - 1455830)
I've made sure that I'm using json.dumps, not json.dump, to write the file, but it's still not working. :(
I've now given up on JSON, and am trying to export a simple CSV with the code below:
def get_companies():
    create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}".format(hapikey=wta_hubspot_api_key)
    headers = {'content-type': 'application/json'}
    create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)
    if create_get_recent_companies_response.status_code == 200:
        while True:
            for i in create_get_recent_companies_response.json()[u'companies']:
                all_the_companies = "{name},{id}\n".format(name=i[u'properties'][u'name'][u'value'], id=i[u'companyId'])
                all_the_companies.encode('utf-8')
                with open("all_the_companies.csv", "a") as myfile:
                    myfile.write(all_the_companies)
                #print(companyProperties)
            offset = create_get_recent_companies_response.json()[u'offset']
            hasMore = create_get_recent_companies_response.json()[u'has-more']
            if not hasMore:
                break
            else:
                create_get_recent_companies_call = "https://api.hubapi.com/companies/v2/companies/?hapikey={hapikey}&offset={offset}".format(hapikey=wta_hubspot_api_key, offset=offset)
                create_get_recent_companies_response = requests.get(create_get_recent_companies_call, headers=headers)
[1]: http://stackoverflow.com/questions/36148346/unable-to-loop-through-paged-api-responses-with-python
But it looks like this isn't right either, even though I've read up on the formatting issues and have added the .encode('utf-8') call. I still end up getting the following traceback:
Traceback (most recent call last):
File "wta_get_companies.py", line 78, in <module>
get_companies()
File "wta_get_companies.py", line 57, in get_companies
all_the_companies = "{name},{id}\n".format(name=i[u'properties'][u'name'][u'value'], id=i[u'companyId'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 3: ordinal not in range(128)
The JSON data has three Objects one after the other; simplified:
{ .. }{ .. }{ .. }
That's not something that's supported by the JSON standard. How is Python supposed to parse that? Automatically wrap it in an array? Assign it to three different variables? Just use the first one?
You probably want to wrap it in an array, simplified:
[{ .. },{ .. },{ .. }]
Or full:
[{"properties": [{"property": "name", "value": "A random company name"}, {"property": "companyId", "value": 123456789}]},{"properties": [{"property": "name", "value": "Another random company name"}, {"property": "companyId", "value": 31415999}]},{"properties": [{"property": "name", "value": "Yet another random company"}, {"property": "companyId", "value": 10101010}]}]
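If the file has already been written as back-to-back objects, it can still be read without editing it by hand. The sketch below uses json.JSONDecoder.raw_decode from the standard library, which parses one value and reports where it stopped. The helper name iter_concatenated_json is illustrative, and the input is a shortened copy of the data from the question. It is one possible approach, not the only one.

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode() does not skip leading whitespace, so skip it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)  # pos moves past the parsed value
        yield obj


raw = ('{"properties": [{"property": "name", "value": "A random company name"}, '
       '{"property": "companyId", "value": 123456789}]}'
       '{"properties": [{"property": "name", "value": "Another random company name"}, '
       '{"property": "companyId", "value": 31415999}]}')

companies = []
for record in iter_concatenated_json(raw):
    # Turn the list of {"property": ..., "value": ...} pairs into one dict.
    companies.append({p["property"]: p["value"] for p in record["properties"]})

print(companies)
```

On the writing side, a simpler fix is to collect all records in a list and call json.dumps once, or to write one object per line and parse the file line by line.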

How to extract a particular value from nested JSON values?

I am new to Python.
I have a small requirement: I want to extract only one value from JSON-formatted data.
Please correct me if I am wrong.
JSON input is:
{
"meta": {
"limit": 1,
"next": "/api/v1/ips/?username=sic1&api_key=689db0740ed73c2bf6402a7de0fcf2d7b57111ca&limit=1&objects=&offset=1",
"offset": 0,
"previous": null,
"total_count": 56714
},
"objects": [
{
"_id": "556f4c81dcddec0c41463529",
"bucket_list": [],
"campaign": [
{
"analyst": "prabhu",
"confidence": "medium",
"date": "2015-06-03 14:50:41.440000",
"name": "Combine"
}
],
"created": "2015-06-03 14:50:41.436000",
"ip": "85.26.162.70",
"locations": [],
"modified": "2015-06-18 09:50:51.612000",
"objects": [],
"relationships": [
{
"analyst": "prabhu",
"date": "2015-06-18 09:50:51.369000",
"rel_confidence": "unknown",
"rel_reason": "N/A",
"relationship": "Related_To",
"relationship_date": "2015-06-18 09:50:51.369000",
"type": "Indicator",
"value": "556f4c81dcddec0c4146353a"
}
],
"releasability": [],
"schema_version": 3,
"screenshots": [],
"sectors": [],
"source": [
{
"instances": [
{
"analyst": "prabhu",
"date": "2015-06-03 14:50:41.438000",
"method": "trawl",
"reference": "http://www.openbl.org/lists/base_30days.txt"
}
],
"name": "www.openbl.org"
}
],
"status": "New",
"tickets": [],
"type": "Address - ipv4-addr"
}
]
}
The code I used to get only the IPs from objects:
import requests
from pprint import pprint
import json
url = 'http://127.0.0.1:8080/api/v1/ips/'
params = {'api_key':'xxxxxx','username': 'abcd'}
r = requests.get(url, params=params, verify=False)
parsed = json.loads(r)
print (parsed['objects']['ip'])
The error I am receiving is:
Traceback (most recent call last):
File "testapi.py", line 9, in <module>
parsed = json.loads(r)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer
I just want to get the IPs from that JSON input.
Thanks.
You are passing a requests Response object instead of a str object to json.loads(). You need to change
parsed = json.loads(r)
to
parsed = json.loads(r.text)
Also, parsed['objects'] is a list, so you need to access its first element and then get the key ip:
>>> print(parsed['objects'][0]['ip'])
The problem is in this line: parsed = json.loads(r)
You're receiving the JSON response, but instead of feeding the JSON text to json.loads you're feeding it <Response [200]>
>>> r = requests.get('http://www.google.com')
>>> r
<Response [200]>
>>> type(r)
<class 'requests.models.Response'>
(Look closely at the error message: expected string or buffer. It means you're providing something that is NOT a string or buffer, an object in this case.)
This is the reason why str(r) didn't work: it just converted <Response [200]> to '<Response [200]>', which obviously is not JSON.
Change this line to parsed = json.loads(r.text).
>>> type(r.text)
<type 'unicode'>
and then parsed['objects'][0]['ip'] should give you the IP address :)
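Since objects is a list, the same lookup extends to every entry, not just the first. Here is a minimal sketch on a trimmed copy of the JSON from the question; response_text stands in for r.text, and skipping entries without an ip key is an assumption about the desired behaviour.

```python
import json

# Trimmed copy of the response body from the question; stands in for r.text.
response_text = '''
{
  "meta": {"limit": 1, "offset": 0, "total_count": 56714},
  "objects": [
    {"_id": "556f4c81dcddec0c41463529", "ip": "85.26.162.70", "status": "New"}
  ]
}
'''

parsed = json.loads(response_text)

# "objects" is a list of dictionaries; collect the "ip" of each entry that has one.
ips = [entry["ip"] for entry in parsed["objects"] if "ip" in entry]

print(ips)
```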
You can find more about the requests module here
