how to read json file with json.load - python

I want to pick "ocr_text" out of this JSON.
How can I pick ocr_text with json.loads?
{'message': 'Success', 'result': [{'message': 'Success', 'input': '1.jpg', 'prediction': [{'id': 'a6447ad9-80f7-4bce-bb5e-588bef3874e6', 'label': 'number_plate', 'xmin': 93, 'ymin': 405, 'xmax': 248, 'ymax': 445, 'score': 0.99992895, **'ocr_text': 'MH 02 CB 4545'**, 'type': 'field', 'status': 'correctly_predicted', 'page_no': 0, 'label_id': '45aaf761-4b60-42e9-b9a7-21d7ea8b927a'}], auto=compress&expires=1670532718&or=90&s=373803a82f093ab6b3b68d530f85f294', 'original_with_long_expiry': 'https://nnts.imgix.net/uploadedfiles/59aedc47-df0d-4e93-a52d-dd7076da1287/PredictionImages/658c79d6-c4c7-4ce3-8dfc-41d8884d5719.jpeg?expires=1686070318&or=0&s=849652a08454ccca0ac5cfb779c0cba3'}, 'uploadedfiles/59aedc47-df0d-4e93-a52d-dd7076da1287/RawPredictions/1-2022-12-08T16-51-56.347.jpg': {'original': 'https://nanonets.s3.us-west-2.amazonaws.com/uploadedfiles/59aedc47-df0d-4e93-a52d-dd7076da1287/RawPredictions/1-2022-12-08T16-51-56.347.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5F4WPNNTLX3QHN4W%2F20221208%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20221208T165158Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&response-cache-control=no-cache&X-Amz-Signature=6ccfc59eb43ffe89dda229ca2a91f09f883596014c7ab0bba6028432f506438d', 'original_compressed': '', 'thumbnail': '', 'acw_rotate_90': '', 'acw_rotate_180': '', 'acw_rotate_270': '', 'original_with_long_expiry': ''}}}

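The text you pasted is not valid JSON: it uses single quotes (it looks like a printed Python dict) and part of it appears to be cut off. I will still try my best to answer.
If the JSON data is in a string in your code, you can use json.loads like so:
import json
json_string = ... # this is the string with json data
json_dict = json.loads(json_string)
# "ocr_text" is nested: result -> first item -> prediction -> first item
print(json_dict["result"][0]["prediction"][0]["ocr_text"])
First, json.loads parses the JSON data in json_string into a dictionary (json_dict).
Then I treat it like any other dictionary (with lists nested inside it), using square brackets.
If the data is in a file rather than a string, open the file and pass the file object to json.load instead.
The Python json module documentation has more info if you'd like.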
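In the data shown in the question, ocr_text is not a top-level key: it sits inside the first element of "prediction", which sits inside the first element of "result". Here is a minimal sketch of that lookup, assuming a trimmed copy of the response rewritten as valid JSON (double quotes). The sample string and the variable names are illustrative, not the real API output.

```python
import json

# Trimmed, hypothetical copy of the response from the question, as valid JSON.
json_string = """
{
  "message": "Success",
  "result": [
    {
      "message": "Success",
      "input": "1.jpg",
      "prediction": [
        {"label": "number_plate", "score": 0.99992895, "ocr_text": "MH 02 CB 4545"}
      ]
    }
  ]
}
"""

json_dict = json.loads(json_string)

# "result" and "prediction" are lists, so index into them before using keys.
ocr_text = json_dict["result"][0]["prediction"][0]["ocr_text"]
print(ocr_text)

# Collect every ocr_text when there are several results or predictions.
all_texts = [
    prediction["ocr_text"]
    for result in json_dict["result"]
    for prediction in result["prediction"]
    if "ocr_text" in prediction
]
print(all_texts)
```

The list comprehension is only needed when the response can hold more than one plate; for a single prediction the indexed lookup is enough.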

Related

Iterate on nested json through python

I have JSON data coming from an API and I need to print some specific values from it. I am using the code below, but it raises KeyError: 'results'.
import json
import requests
headers = {"Content-Type":"application/json","Accept":"application/json"}
r = requests.get('API', headers=headers)
data=r.text
parse_json=json.loads(data)
for result in parse_json['results']:
    for cpu in result['cpumfs']:
        print(cpu.get('pct_im_utilization'))
Below is the JSON data:
{'d': {'__count': '0', 'results': [{'ID': '6085', 'Name': 'device1', 'DisplayName': None, 'DisplayDescription': None, 'cpumfs': {'results': [{'ID': '6117', 'Timestamp': '1649157300', 'DeviceItemID': '6085', 'pct_im_Utilization': '4.0'}, {'ID': '6117', 'Timestamp': '1649157600', 'DeviceItemID': '6085', 'pct_im_Utilization': '1.0'}, {'ID': '6117', 'Timestamp': '1649157900', 'DeviceItemID': '6085', 'pct_im_Utilization': '4.0'}, {'ID': '6117', 'Timestamp': '1649158200', 'DeviceItemID': '6085', 'pct_im_Utilization': '1.0'},
I need to print the Name of the device, the Timestamp, and pct_im_Utilization.
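Judging only from the data pasted above, the KeyError seems to come from the nesting: the device list is under the top-level 'd' key, 'cpumfs' is itself a dictionary with its own 'results' list, and the utilization key is spelled 'pct_im_Utilization' with a capital U. A sketch under those assumptions follows; the sample dictionary is a shortened, hypothetical copy of the response.

```python
# Hypothetical, shortened copy of the response shown in the question.
parse_json = {
    "d": {
        "__count": "0",
        "results": [
            {
                "ID": "6085",
                "Name": "device1",
                "cpumfs": {
                    "results": [
                        {"ID": "6117", "Timestamp": "1649157300", "pct_im_Utilization": "4.0"},
                        {"ID": "6117", "Timestamp": "1649157600", "pct_im_Utilization": "1.0"},
                    ]
                },
            }
        ],
    }
}

rows = []
# The device list lives under the top-level "d" key.
for device in parse_json["d"]["results"]:
    # "cpumfs" is a dict whose own "results" key holds the samples.
    for cpu in device["cpumfs"]["results"]:
        # Keys are case-sensitive: "pct_im_Utilization" has a capital U.
        rows.append((device["Name"], cpu["Timestamp"], cpu["pct_im_Utilization"]))

for name, timestamp, utilization in rows:
    print(name, timestamp, utilization)
```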

Sending python requests and handling JSON lists

I am sending requests to a crypto network for data on accounts. You get sent back information, but I haven't yet encountered lists being sent in JSON until now. I want to parse certain information, but am having trouble because the JSON is a list and is not as easy to parse compared to normal JSON data.
import requests
import json
url = 'https://s1.ripple.com:51234/'
payload = {
    "method": "account_objects",
    "params": [
        {
            "account": "r9cZA1mLK5R5Am25ArfXFmqgNwjZgnfk59",
            "ledger_index": "validated",
            "type": "state",
            "deletion_blockers_only": False,
            "limit": 10
        }
    ]
}
response = requests.post(url, data=json.dumps(payload))
print(response.text)
data = response.text
parsed = json.loads(data)
price = parsed['result']
price = price['account_objects']
for Balance in price:
    print(Balance)
You will receive all the tokens the account holds and the value. I can not figure out how to parse this correctly and receive the correct one. This particular test account has a lot of tokens so I will only show the first tokens info.
RESULT
{'Balance': {'currency': 'ASP', 'issuer': 'rrrrrrrrrrrrrrrrrrrrBZbvji', 'value': '0'}, 'Flags': 65536, 'HighLimit': {'currency': 'ASP', 'issuer': 'r9cZA1mLK5R5Am25ArfXFmqgNwjZgnfk59', 'value': '0'}, 'HighNode': '0', 'LedgerEntryType': 'RippleState', 'LowLimit': {'currency': 'ASP', 'issuer': 'r3vi7mWxru9rJCxETCyA1CHvzL96eZWx5z', 'value': '10'}, 'LowNode': '0', 'PreviousTxnID': 'BF7555B0F018E3C5E2A3FF9437A1A5092F32903BE246202F988181B9CED0D862', 'PreviousTxnLgrSeq': 1438879, 'index': '2243B0B630EA6F7330B654EFA53E27A7609D9484E535AB11B7F946DF3D247CE9'}
I want to get the first piece of information here: {'Balance': {'currency': 'ASP', 'issuer': 'rrrrrrrrrrrrrrrrrrrrBZbvji', 'value': '0'},
specifically 'value' and its number.
I have tried to parse 'Balance', but since it is inside a list it is not as straightforward.
You're mixing up lists and dictionaries. price is a list of dictionaries, so each Balance in the loop is a dictionary, and you read a dictionary entry by its key, like this:
for Balance in price:
    print(Balance['Balance'])
This yields the following results:
{'currency': 'CHF', 'issuer': 'rrrrrrrrrrrrrrrrrrrrBZbvji', 'value': '-0.3488146605801446'}
{'currency': 'BTC', 'issuer': 'rrrrrrrrrrrrrrrrrrrrBZbvji', 'value': '0'}
{'currency': 'USD', 'issuer': 'rrrrrrrrrrrrrrrrrrrrBZbvji', 'value': '-11.68225001668339'}
If you only want to extract the value, you simply go one level deeper:
for Balance in price:
    print(Balance['Balance']['value'])
Which yields:
-0.3488146605801446
0
-11.68225001668339
I assume that parsed['result']['account_objects'] is a list of dictionaries, and that each dictionary has a key like this one: 'Balance': {'currency': 'ASP', 'issuer': 'rrrrrrrrrrrrrrrrrrrrBZbvji', 'value': '0'}. If so, why don't you iterate over the list and then access each dictionary, like:
account_objects = parsed['result']['account_objects']
for account_object in account_objects:
    print(account_object['Balance'])
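The same idea as a self-contained sketch that runs without a network call. The parsed dictionary below is a hypothetical sample shaped like the response in the question, keeping only the fields used here; the variable names are illustrative.

```python
# Hypothetical sample shaped like the parsed response in the question.
parsed = {
    "result": {
        "account_objects": [
            {"Balance": {"currency": "CHF", "issuer": "rrrrrrrrrrrrrrrrrrrrBZbvji",
                         "value": "-0.3488146605801446"}, "LedgerEntryType": "RippleState"},
            {"Balance": {"currency": "BTC", "issuer": "rrrrrrrrrrrrrrrrrrrrBZbvji",
                         "value": "0"}, "LedgerEntryType": "RippleState"},
            {"Balance": {"currency": "USD", "issuer": "rrrrrrrrrrrrrrrrrrrrBZbvji",
                         "value": "-11.68225001668339"}, "LedgerEntryType": "RippleState"},
        ]
    }
}

account_objects = parsed["result"]["account_objects"]

# First entry only: index into the list, then use dictionary keys.
first_value = account_objects[0]["Balance"]["value"]

# Every entry: map currency code to value, skipping objects without "Balance".
values_by_currency = {
    obj["Balance"]["currency"]: obj["Balance"]["value"]
    for obj in account_objects
    if "Balance" in obj
}

# The values in this sample are strings; convert them when arithmetic is needed.
usd = float(values_by_currency["USD"])
print(first_value)
print(values_by_currency)
print(usd)
```

Keying by currency assumes each currency appears once per account; if that does not hold, a list of (currency, value) pairs would be a safer shape.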

Replace all occurrences of a string in JSON object regardless of key

I have a JSON object in Python created through the requests library's built-in .json() method.
Here is a simplified sample of what I'm doing:
data = session.get(url)
obj = data.json()
s3object = s3.Object(s3_bucket, output_file)
s3object.put(Body=(bytes(json.dumps(obj).encode('UTF-8'))))
Example obj:
{'id': 'fab779b7-2586-4895-9f3b-c9518f34e028', 'project_id': 'a1a73e68-9943-4584-9d59-cc84a0d3e92b', 'created_at': '2017-10-23 02:57:03 -0700', 'sections': [{'section_name': '', 'items': [{'id': 'ffadc652-dd36-4b9f-817c-6539a4b462ab', 'created_at': '2017-10-23 03:36:13 -0700', 'updated_at': '2017-10-23 03:38:32 -0700', 'created_by': 'paul', 'question_text': 'Drawing Ref(s)', 'spec_ref': '', 'display_number': null, 'response': '', 'comment': 'see attached mh309', 'position': 1, 'is_conforming': 'N/A', 'display_type': 'text'}]}]}
I need to replace any occurrence of the string "N/A" with "Not Applicable" anywhere it appears, regardless of its key or location, before I upload the JSON to S3. I cannot use local disk writes, which is why it is done this way.
Is this possible?
My original plan was to turn it into a string and do the replacement before turning it back. Is this inefficient?
Thanks,
As mentioned in the comments, obj is a dict. One way to replace N/A with Not Applicable regardless of location is to serialize it to a string with json.dumps, use str.replace, and convert it back to a dict with json.loads for further processing.
import json
#Original dict with N/A
obj = {'id': 'fab779b7-2586-4895-9f3b-c9518f34e028', 'project_id': 'a1a73e68-9943-4584-9d59-cc84a0d3e92b', 'created_at': '2017-10-23 02:57:03 -0700', 'sections': [{'section_name': '', 'items': [{'id': 'ffadc652-dd36-4b9f-817c-6539a4b462ab', 'created_at': '2017-10-23 03:36:13 -0700', 'updated_at': '2017-10-23 03:38:32 -0700', 'created_by': 'paul', 'question_text': 'Drawing Ref(s)', 'spec_ref': '', 'display_number': None, 'response': '', 'comment': 'see attached mh309', 'position': 1, 'is_conforming': 'N/A', 'display_type': 'text'}]}]}
#Convert to string and replace
obj_str = json.dumps(obj).replace('N/A', 'Not Applicable')
#Get obj back with replacement
obj = json.loads(obj_str)
Although @Devesh Kumar Singh's answer works with the sample JSON data in your question, converting the whole thing to a string and then doing a bulk replace of the substring is potentially error-prone, because it can also change text outside the values associated with dictionary keys (for example, a key that happens to contain N/A).
To avoid that, I would suggest the following, which is more selective even though it takes a few more lines of code:
import json
def replace_NA(obj):
    def decode_dict(a_dict):
        # Called by json.loads for every decoded JSON object (dict).
        for key, value in a_dict.items():
            try:
                a_dict[key] = value.replace('N/A', 'Not Applicable')
            except AttributeError:
                # Non-string values have no .replace(); leave them unchanged.
                pass
        return a_dict
    return json.loads(json.dumps(obj), object_hook=decode_dict)
obj = {'id': 'fab779b7-2586-4895-9f3b-c9518f34e028', 'project_id': 'a1a73e68-9943-4584-9d59-cc84a0d3e92b', 'created_at': '2017-10-23 02:57:03 -0700', 'sections': [{'section_name': '', 'items': [{'id': 'ffadc652-dd36-4b9f-817c-6539a4b462ab', 'created_at': '2017-10-23 03:36:13 -0700', 'updated_at': '2017-10-23 03:38:32 -0700', 'created_by': 'paul', 'question_text': 'Drawing Ref(s)', 'spec_ref': '', 'display_number': None, 'response': '', 'comment': 'see attached mh309', 'position': 1, 'is_conforming': 'N/A', 'display_type': 'text'}]}]}
obj = replace_NA(obj)
I assume the object you've pasted here is a dict; you can check that with isinstance(json_object, dict). With that assumption you can do it as follows (note that this loop only checks the top-level values, not nested ones):
for key in json_object:
    if json_object[key] == "N/A":
        json_object[key] = "Not Applicable"
Hope it helps!
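Another option is to walk the structure recursively, which reaches nested dictionaries and lists without a round trip through a string. The sketch below assumes that only values exactly equal to "N/A" should change, which is one reading of the question; the function name replace_value and the shortened sample are hypothetical.

```python
def replace_value(node, old="N/A", new="Not Applicable"):
    """Return a copy of node with every string value equal to old replaced by new."""
    if isinstance(node, dict):
        # Keys are left untouched; only values are visited.
        return {key: replace_value(value, old, new) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_value(item, old, new) for item in node]
    if isinstance(node, str) and node == old:
        return new
    return node


# Hypothetical, shortened version of the object from the question.
obj = {
    "id": "fab779b7-2586-4895-9f3b-c9518f34e028",
    "sections": [
        {
            "section_name": "",
            "items": [
                {"comment": "see attached mh309", "position": 1,
                 "display_number": None, "is_conforming": "N/A"}
            ],
        }
    ],
}

cleaned = replace_value(obj)
print(cleaned["sections"][0]["items"][0]["is_conforming"])
```

Because the function builds new containers, the original obj is left unchanged; json.dumps(cleaned) can then be uploaded as before.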

How to fix "'User' object is not subscriptable"

I am trying to write some Python code that gets a user's GitLab profile picture/avatar so it can be sent in a Discord embed later in the code. However, when I try to read the JSON that the GitLab API returns, I receive the error "'User' object is not subscriptable". This JSON doesn't look like other JSON returned by the GitLab API.
I have tried to use attributes, and I have also tried just reading it, but I still receive the same error.
import gitlab
import json
# private token or personal token authentication
gl = gitlab.Gitlab('URL', private_token='')
project = gl.projects.get(13)
json_data = project.tags.list(order_by='updated', sort='desc')
newest_tagjson = (json_data[0].attributes)
latesttag = newest_tagjson["name"]
name1 = newest_tagjson["commit"]["author_name"]
projectid = newest_tagjson["project_id"]
footer1 = "Panel"
if name1 == "------":
    ID = 16
    user = gl.users.get(ID)
    print(user)
    user2 = (user['avatar_url'].attributes)
I should receive clean JSON that I can read, but instead I receive this from the print:
<class 'gitlab.v4.objects.User'> => {'id': 16, 'name': '', 'username': '', 'state': 'active', 'avatar_url': 'https://URL.io/uploads/-/system/user/avatar/16/avatar.png', 'web_url': '', 'created_at': '2019-01-29T18:30:53.819Z', 'bio': ' \r\n', 'location': ', United Kingdom', 'public_email': '', 'skype': '', 'linkedin': '', 'twitter': '', 'website_url': '', 'organization': ''}
and I cannot read this.
The error seems pretty clear: the result of calling gl.users.get(ID) is not a Python dictionary, so you can't access keys with subscripts as in user['avatar_url']. You can access attributes using Python's dot notation, as in user.avatar_url.
You can of course extract the information you want into a Python dictionary:
>>> user_dict = {k: getattr(user, k) for k in
... ['id', 'name', 'state', 'avatar_url', 'web_url']}
>>> user_dict
{'id': 28841, 'name': 'Lars Kellogg-Stedman', 'state': 'active', 'avatar_url': 'https://secure.gravatar.com/avatar/1c09a8d9e719f9d13b6c99f6bb2637d8?s=80&d=identicon', 'web_url': 'https://gitlab.com/larsks'}
And then you can serialize this to JSON:
>>> print(json.dumps(user_dict, indent=2))
{
"id": 28841,
"name": "Lars Kellogg-Stedman",
"state": "active",
"avatar_url": "https://secure.gravatar.com/avatar/1c09a8d9e719f9d13b6c99f6bb2637d8?s=80&d=identicon",
"web_url": "https://gitlab.com/larsks"
}
The Python gitlab module wraps the gitlab API in a variety of managers designed to make certain things easier, but if your goal is to serialize things to JSON it might be easier to simply call the REST API yourself:
>>> import requests
>>> session = requests.Session()
>>> session.headers['private-token'] = your_private_token
>>> res = session.get('https://gitlab.com/api/v4/users/28841')
>>> res.json()
{'id': 28841, 'name': 'Lars Kellogg-Stedman', 'username': 'larsks', 'state': 'active', 'avatar_url': 'https://secure.gravatar.com/avatar/1c09a8d9e719f9d13b6c99f6bb2637d8?s=80&d=identicon', 'web_url': 'https://gitlab.com/larsks', 'created_at': '2014-04-26T01:52:14.000Z', 'bio': '', 'location': None, 'public_email': '', 'skype': '', 'linkedin': '', 'twitter': 'larsks', 'website_url': 'http://blog.oddbit.com/', 'organization': None}
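To try the attribute-based approach without a GitLab server, the sketch below uses types.SimpleNamespace as a stand-in for the object returned by gl.users.get(ID). That stand-in is an assumption made for illustration; it only mimics the fact, described above, that fields are read with dot notation rather than subscripts.

```python
import json
from types import SimpleNamespace

# Stand-in for the User object from the question (not a real python-gitlab object).
user = SimpleNamespace(
    id=16,
    name="",
    state="active",
    avatar_url="https://URL.io/uploads/-/system/user/avatar/16/avatar.png",
    web_url="",
)

# Dot notation works where user['avatar_url'] would raise TypeError.
avatar_url = user.avatar_url

# Build a plain dict from the attributes of interest; getattr's default of
# None avoids an AttributeError if a field is missing.
wanted = ["id", "name", "state", "avatar_url", "web_url"]
user_dict = {key: getattr(user, key, None) for key in wanted}

user_json = json.dumps(user_dict, indent=2)
print(avatar_url)
print(user_json)
```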

Parse this JSON response From App Annie in Python

I am working with the requests module in Python to grab certain fields from a JSON response.
import json
import requests
fn = 'download.json'
response = requests.get('http://api.appannie.com/v1/accounts/1000/apps/mysuperapp/sales?break_down=application+iap&start_date=2013-10-01&end_date=2013-10-02',
                        auth=('username', 'password'))
data = response.json()
print(data)
This works in python, as the response is the following:
{'prev_page': None, 'currency': 'USD', 'next_page': None, 'sales_list': [{'revenue': {'ad': '0.00', 'iap': {'refunds': '0.00', 'sales': '0.00', 'promotions': '0.00'}, 'app': {'refunds': '0.00', 'updates': '0.00', 'downloads': '0.00', 'promotions': '0.00'}},
'units': {'iap': {'refunds': 0, 'sales': 0, 'promotions': 0}, 'app': {'refunds': 0, 'updates': 0, 'downloads': 2000, 'promotions': 0}}, 'country': 'all', 'date': 'all'}], 'iap_sales': [], 'page_num': 1, 'code': 200, 'page_index': 0}
The question is how do I parse this to get my downloads number within the 'app' block - namely the "2000" value?
After response.json(), data is already a dictionary; otherwise response.json() would have raised an exception. You can therefore access it just like any other dictionary.
You can use the loads() function of json on the response text:
import json
import requests
response = requests.get('http://api.appannie.com/v1/accounts/1000/apps/mysuperapp/sales?break_down=application+iap&start_date=2013-10-01&end_date=2013-10-02',
                        auth=('username', 'password'))
data = json.loads(response.text) # data is a dictionary now
sales_list = data.get('sales_list')
for sales in sales_list:
    # the download count is under 'units' -> 'app' -> 'downloads'
    print(sales['units']['app']['downloads'])
You can use json.loads:
import json
import requests
response = requests.get(...)
json_data = json.loads(response.text)
This converts a given string into a dictionary which allows you to access your JSON data easily within your code.
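Once the response is a dictionary, the downloads number asked about sits under 'sales_list', then 'units', 'app', 'downloads'. A sketch using a hard-coded copy of the response printed in the question, so it runs without credentials; the names downloads_per_entry and total_downloads are illustrative.

```python
# Copy of the response printed in the question, as returned by response.json().
data = {
    "prev_page": None,
    "currency": "USD",
    "next_page": None,
    "sales_list": [
        {
            "revenue": {
                "ad": "0.00",
                "iap": {"refunds": "0.00", "sales": "0.00", "promotions": "0.00"},
                "app": {"refunds": "0.00", "updates": "0.00",
                        "downloads": "0.00", "promotions": "0.00"},
            },
            "units": {
                "iap": {"refunds": 0, "sales": 0, "promotions": 0},
                "app": {"refunds": 0, "updates": 0, "downloads": 2000, "promotions": 0},
            },
            "country": "all",
            "date": "all",
        }
    ],
    "iap_sales": [],
    "page_num": 1,
    "code": 200,
    "page_index": 0,
}

# "sales_list" is a list, so read the count from each entry in turn.
downloads_per_entry = [
    entry["units"]["app"]["downloads"] for entry in data.get("sales_list", [])
]
total_downloads = sum(downloads_per_entry)
print(downloads_per_entry)
print(total_downloads)
```

Note that 'revenue' -> 'app' -> 'downloads' also exists but holds a revenue string ('0.00'), not the unit count.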
