Parse JSON string in Python

A simple one, but I've just not yet been able to wrap my head around parsing nested lists and json structures in Python...
Here is the raw message I am trying to parse.
{
"Records": [
{
"messageId": "1b9c0952-3fe3-4ab4-a8ae-26bd5d3445f8",
"receiptHandle": "AQEBy40IsvNDy33dOhn4KB8+7apBecWpSuw5OgL9sw/Nf+tM2esLgqmWjGsd4n0oqB",
"body": "{\n \"Type\" : \"Notification\",\n \"MessageId\" : \"dce5c301-029f-55e1-8cee-959b1ad4e500\",\n \"TopicArn\" : \"arn:aws:sns:ap-southeast-2:062497424678:vid\",\n \"Message\" : \"ChiliChallenge.mp4\",\n \"Timestamp\" : \"2020-01-16T07:51:39.807Z\",\n \"SignatureVersion\" : \"1\",\n \"Signature\" : \"oloRF7SzS8ipWQFZieXDQ==\",\n \"SigningCertURL\" : \"https://sns.ap-southeast-2.amazonaws.com/SimpleNotificationService-a.pem\",\n \"UnsubscribeURL\" : \"https://sns.ap-southeast-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:ap-southeast-2:062478:vid\"\n}",
"attributes": {
"ApproximateReceiveCount": "1",
"SentTimestamp": "1579161099897",
"SenderId": "AIDAIY4XD42",
"ApproximateFirstReceiveTimestamp": "1579161099945"
},
"messageAttributes": {},
"md5OfBody": "1f246d643af4ea232d6d4c91f",
"eventSource": "aws:sqs",
"eventSourceARN": "arn:aws:sqs:ap-southeast-2:062497424678:vid",
"awsRegion": "ap-southeast-2"
}
]
}
I am trying to extract the Message in the body section, ending up with the string "ChiliChallenge.mp4".
Thanks!
Essentially I just keep getting either TypeError: string indices must be integers or parsing the body but not getting any further into the list without an error.
Here's my attempt:
import json
with open("event_testing.txt", "r") as myfile:
    event = myfile.read().replace('\n', '')
str(event)
event = json.loads(event)
key = event['Records'][0]['body']
print(key)

The body value is itself a JSON string, so you can call json.loads a second time on it:
with open("event_testing.txt", "r") as fp:
    event = json.loads(fp.read())
key = json.loads(event['Records'][0]['body'])['Message']
print(key)
This prints:
ChiliChallenge.mp4

Say your parsed message is stored in phrase.
I would rewrite your code like this:
phrase_2 = phrase["Records"]
print(phrase_2[0]["body"])
This works because "Records" holds a list (a JSON array), so you have to index into it with [0] before you can use the "body" key.
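To make the two layers of encoding explicit, here is a minimal sketch that uses a shortened inline copy of the event instead of a file. The helper name extract_messages and the trimmed sample are illustrative assumptions, not part of the original post.

```python
import json

# Shortened stand-in for the event in the question: "body" is a JSON
# document stored inside a JSON string, so it needs its own json.loads.
raw_event = json.dumps({
    "Records": [
        {
            "messageId": "1b9c0952-3fe3-4ab4-a8ae-26bd5d3445f8",
            "body": json.dumps({"Type": "Notification",
                                "Message": "ChiliChallenge.mp4"}),
        }
    ]
})


def extract_messages(event_text):
    """Return the "Message" field of every record (hypothetical helper)."""
    event = json.loads(event_text)         # first decode: the outer event
    messages = []
    for record in event["Records"]:        # "Records" is a list, so iterate over it
        body = json.loads(record["body"])  # second decode: the embedded string
        messages.append(body["Message"])
    return messages


print(extract_messages(raw_event))
```

Looping over "Records" instead of hard-coding [0] is a design choice here: it keeps working if an event ever carries more than one record.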

Related

Converting JSON file into a suitable string issue with Python

I have a JSON file as follows :
{
"desired":{
"property1":{
"port":"/dev/usbserial",
"rx":{
"watchdoginterval":3600
},
"state":{
"path":"/Users/user1"
},
"enabled":"true",
"active":{
"enabled":"true"
}
},
"property2":{
"signal_interrupt":"USR2",
"signal_description_path":"/tmp/logger.log"
},
"property3":{
"periodmins":40
},
}
}
I am having issues trying to convert this into a string for use with AWS IoT. The function I am using is deviceShadowHandler.shadowUpdate(JSONPayload, customShadowCallback_Update, 5)
Where JSONPayload should be the JSON string.
I have tried :
with open('JSONfile.json' , 'r') as f:
    dict = json.load(f)
JSONPayload = str(dict)
but I receive an "Invalid JSON file error".
An attempt to manually create a literal string from the JSON file gets messy, with complaints about "EOL while scanning string literal" etc.
What is the best way to solve this? I am new to both JSON and Python.
Trailing commas are not allowed in JSON.
{
"desired":{
"property1":{
"port":"/dev/usbserial",
"rx":{
"watchdoginterval":3600
},
"state":{
"path":"/Users/user1"
},
"enabled":"true",
"active":{
"enabled":"true"
}
},
"property2":{
"signal_interrupt":"USR2",
"signal_description_path":"/tmp/logger.log"
},
"property3":{
"periodmins":40
} # <- no comma there
}
}
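Separately from the trailing comma, str() on the loaded dictionary produces Python's repr (single quotes), which is not JSON, so a consumer expecting a JSON string may still reject it. The standard-library call for serializing back to JSON text is json.dumps. A minimal sketch, assuming a shortened inline copy of the corrected file; the shadowUpdate call is left out because it needs a live device connection:

```python
import json

# Shortened copy of the corrected file content (no trailing comma).
file_text = '{"desired": {"property3": {"periodmins": 40}}}'

payload_dict = json.loads(file_text)

as_repr = str(payload_dict)         # Python repr: single quotes, not JSON
as_json = json.dumps(payload_dict)  # JSON text: double quotes

print(as_repr)
print(as_json)

# Only the json.dumps result parses back as JSON.
assert json.loads(as_json) == payload_dict
try:
    json.loads(as_repr)
except json.JSONDecodeError as exc:
    print("str() output is not JSON:", exc)
```

In the question's code, that would mean passing json.dumps(...) of the loaded data as JSONPayload rather than str(...).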

TypeError: string indices must be integers, not str with JSON parsing

I am getting the above error when trying to parse a JSON file.
Code:
import json
data = open('output.json').read()
for host in data['ASSET_DATA_REPORT']['HOST_LIST']['HOST']:
    print(host['IMAGE_ID'])
Traceback:
Traceback (most recent call last):
  File "json_format.py", line 11, in <module>
    for host in data['ASSET_DATA_REPORT']['HOST_LIST']['HOST']:
TypeError: string indices must be integers, not str
JSON:
{
"ASSET_DATA_REPORT": {
"HOST_LIST": {
"HOST": [
{
"IP": {
"network_id": "0"
},
"TRACKING_METHOD": "EC2",
"ASSET_TAGS": {
"ASSET_TAG": [
"EC2 Running",
"IF - Database - MySQL",
]
},
"DNS": "i-xxxxxxx",
"EC2_INSTANCE_ID": "i-xxxxxx",
"EC2_INFO": {
"PUBLIC_DNS_NAME": "ec2-xxxxxxxx.amazonaws.com",
"IMAGE_ID": "ami-xxxxxx",
"VPC_ID": "vpc-xxxxxx",
"INSTANCE_STATE": "RUNNING",
"PRIVATE_DNS_NAME": "ip-xxxx.ec2.internal",
"INSTANCE_TYPE": "m3.xlarge"
}
}
]
}
}
}
It seems like host is a string for some reason and I'm not sure how to overcome this error.
Importing json is not enough. data = open('output.json').read() just treats it as any other file.
TypeError: string indices must be integers, not str is not complaining about the 'HOST' key; data['ASSET_DATA_REPORT'] on its own won't be valid either because the whole thing is a string.
Try:
with open('output.json') as infile:
    data = json.load(infile)
As pointed out by @Milton Arango G, there is an error in the JSON you posted. Change:
"IF - Database - MySQL",
to:
"IF - Database - MySQL"
After that, you can obtain the 'IMAGE_ID' field with:
print(data['ASSET_DATA_REPORT']['HOST_LIST']['HOST'][0]['EC2_INFO']['IMAGE_ID'])
You have a couple of problems, some in your code, some in your JSON.
First, the JSON: you have an extra comma after the last list entry:
"ASSET_TAG": [
"EC2 Running",
"IF - Database - MySQL",
]
Your code has two problems.
First, you never parse the contents of the file as JSON, so data remains a string:
data = open('output.json').read()
You want something like what roganjosh already described:
with open('output.json') as f:
    data = json.load(f)
Your follow-up problem is that the structure of the JSON doesn't match your code.
'IMAGE_ID' isn't a key in the (unnamed) dictionary stored in the 'HOST' list; it's a key of the 'EC2_INFO' dictionary, which is contained inside that nameless dictionary.
This:
print(host['IMAGE_ID'])
Should be something like:
print(host['EC2_INFO']['IMAGE_ID'])
The output is a string:
ami-xxxxxx
That is not a good way to read a JSON file:
open('output.json').read()
returns the file contents as a string.
A better way is:
import json
with open('output.json', 'r') as my_file:
    data = json.load(my_file)
for host in data['ASSET_DATA_REPORT']['HOST_LIST']['HOST']:
    # IMAGE_ID lives inside the nested EC2_INFO dictionary
    print(host['EC2_INFO']['IMAGE_ID'])
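The difference between the raw text and the parsed structure can be reproduced without a file. This is a minimal sketch that assumes a trimmed inline copy of the JSON from the question; json.loads on a string plays the same role here that json.load plays for a file object.

```python
import json

# Trimmed inline copy of the document from the question.
text = ('{"ASSET_DATA_REPORT": {"HOST_LIST": {"HOST": '
        '[{"EC2_INFO": {"IMAGE_ID": "ami-xxxxxx"}}]}}}')

# Indexing the raw text with a key fails: a str only accepts integer indices.
try:
    text["ASSET_DATA_REPORT"]
except TypeError as exc:
    print("before parsing:", exc)

# After parsing, the same key works because data is now a dict.
data = json.loads(text)
image_ids = [host["EC2_INFO"]["IMAGE_ID"]
             for host in data["ASSET_DATA_REPORT"]["HOST_LIST"]["HOST"]]
print(image_ids)
```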

TypeError: string indices must be integers // working with JSON as dict in python

Okay, so I've been banging my head on this for the last 2 days, with no real progress. I am a beginner with python and coding in general, but this is the first issue I haven't been able to solve myself.
So I have this long JSON-formatted file with about 7000 entries from the YouTube API.
Right now I want a short script that prints certain info ('videoId') for a certain dictionary key (referred to as 'key'):
My script:
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key']['Items']['id']['videoId'])
# print(trailers['key']['videoId'] gives same response
Error:
print(trailers['key']['Items']['id']['videoId'])
TypeError: string indices must be integers
It does work when I want to print all the information for the dictionary key:
This script works
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key'])
Also print(type(trailers)) results in class 'dict', as it's supposed to.
My JSON File is formatted like this and is from the youtube API, youtube#searchListResponse.
{
"kind": "youtube#searchListResponse",
"etag": "",
"nextPageToken": "",
"regionCode": "",
"pageInfo": {
"totalResults": 1000000,
"resultsPerPage": 1
},
"items": [
{
"kind": "youtube#searchResult",
"etag": "",
"id": {
"kind": "youtube#video",
"videoId": ""
},
"snippet": {
"publishedAt": "",
"channelId": "",
"title": "",
"description": "",
"thumbnails": {
"default": {
"url": "",
"width": 120,
"height": 90
},
"medium": {
"url": "",
"width": 320,
"height": 180
},
"high": {
"url": "",
"width": 480,
"height": 360
}
},
"channelTitle": "",
"liveBroadcastContent": "none"
}
}
]
}
What other information is needed to be given for you to understand the problem?
The following code gives me all the videoIds from the provided sample data (which in fact contains no IDs at all, since every videoId is an empty string):
import json
with open('sampledata', 'r') as datafile:
    data = json.loads(datafile.read())
print([item['id']['videoId'] for item in data['items']])
Perhaps you can try this with more data.
Hope this helps.
I didn't really look into the YouTube API, but looking at the code and the sample you gave, it seems you missed a [0]. In the structure of the JSON, the key items (lowercase) holds a list.
import json
f = open ('json1.json', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['items'][0]['id']['videoId'])
I've not used JSON before at all, but as far as I understand, it is imported as dicts containing more dicts, lists and so on, where applicable.
So when you do type(trailers) you get dict. Then you index into it with trailers['key']. If you check the type of that, it should also be a dict if things work correctly. Checking the type of the value at each step should eventually locate your error.
Python's error says you are indexing a string, which only accepts integer indices, with a string key as if it were a dict. So you need to find out why you are getting a string rather than a dict at that step.
Edit to add an example: if your dict holds a string under the key 'item', you get a string back, not a new dict that you can index further. In the JSON, for example, items appears to be a list with dicts in it, not a dict itself.
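The step-by-step type check described above can be sketched as a small helper. The name describe_path and the trimmed sample are assumptions made for illustration; the sample keeps only the items/id/videoId part of the response in the question.

```python
import json

# Trimmed sample: only the part of the search response that matters here.
sample = json.loads(
    '{"items": [{"id": {"kind": "youtube#video", "videoId": ""}}]}'
)


def describe_path(obj, path):
    """Follow `path` one step at a time and record the type at each level
    (hypothetical debugging helper)."""
    seen = [type(obj).__name__]
    for step in path:
        obj = obj[step]  # raises at the first step that does not fit
        seen.append(type(obj).__name__)
    return seen


print(describe_path(sample, ["items"]))
print(describe_path(sample, ["items", 0, "id", "videoId"]))
```

The first call shows that "items" leads to a list, which is why an integer index such as 0 is needed before "id".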

List Indices in json in Python

I've got a json file that I've pulled from a web service and am trying to parse it. I see that this question has been asked a whole bunch, and I've read whatever I could find, but the json data in each example appears to be very simplistic in nature. Likewise, the json example data in the python docs is very simple and does not reflect what I'm trying to work with. Here is what the json looks like:
{"RecordResponse": {
"Id": blah
"Status": {
"state": "complete",
"datetime": "2016-01-01 01:00"
},
"Results": {
"resultNumber": "500",
"Summary": [
{
"Type": "blah",
"Size": "10000000000",
"OtherStuff": {
"valueOne": "first",
"valueTwo": "second"
},
"fieldIWant": "value i want is here"
The code block in question is:
jsonFile = r'C:\Temp\results.json'
with open(jsonFile, 'w') as dataFile:
    json_obj = json.load(dataFile)
for i in json_obj["Summary"]:
    print(i["fieldIWant"])
Not only am I not getting into the field I want, but I'm also getting a key error on trying to suss out "Summary".
I don't know how the indices work within the array; once I even get into the "Summary" field, do I have to issue an index manually to return the value from the field I need?
The example you posted is not valid JSON (no commas after object fields), so it's hard to dig in much. If it's straight from the web service, something's messed up. If you did fix it with proper commas, the "Summary" key is inside the "Results" object, which in turn is inside "RecordResponse", so you'd need to index through both. Also open the file for reading: mode 'w' truncates the file, and json.load cannot read from a write-only handle.
with open(jsonFile, 'r') as dataFile:
    json_obj = json.load(dataFile)
for i in json_obj["RecordResponse"]["Results"]["Summary"]:
    print(i["fieldIWant"])
If you don't know the structure at all, you could look through the resulting object recursively:
def findfieldsiwant(obj, keyname="Summary", fieldname="fieldIWant"):
    try:
        for key, val in obj.items():
            if key == keyname:
                return [d[fieldname] for d in val]
            else:
                # pass the names down so non-default arguments survive the recursion
                sub = findfieldsiwant(val, keyname, fieldname)
                if sub:
                    return sub
    except AttributeError:  # obj is not a dict
        pass
    # keyname not found
    return None
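The recursive search above only descends through dictionaries. As a variation, here is a minimal sketch that also walks into lists and collects every match instead of stopping at the first one. The function name find_fields and the completed sample are hypothetical; the sample only mimics the shape of the truncated JSON in the question.

```python
def find_fields(obj, keyname="Summary", fieldname="fieldIWant"):
    """Collect `fieldname` from every dict in any list stored under `keyname`."""
    found = []
    if isinstance(obj, dict):
        for key, val in obj.items():
            if key == keyname and isinstance(val, list):
                found.extend(d[fieldname] for d in val
                             if isinstance(d, dict) and fieldname in d)
            else:
                found.extend(find_fields(val, keyname, fieldname))
    elif isinstance(obj, list):
        for item in obj:
            found.extend(find_fields(item, keyname, fieldname))
    return found


# Hypothetical, completed version of the structure in the question.
sample = {
    "RecordResponse": {
        "Status": {"state": "complete"},
        "Results": {
            "resultNumber": "500",
            "Summary": [
                {"Type": "blah", "fieldIWant": "value i want is here"},
            ],
        },
    }
}

print(find_fields(sample))
```

Returning a list in every case (empty when nothing matches) means the caller does not have to check for None.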

python - exporting dictionary(array) to json

I have an array of dictionaries like so:
myDict[0] = {'date':'today', 'status': 'ok'}
myDict[1] = {'date':'yesterday', 'status': 'bad'}
and I'm trying to export this array to a json file where each dictionary is its own entry. The problem is when I try to run:
dump(myDict, open("test.json", "w"))
It outputs a json file with a number prefix before each entry
{"0": {"date": "today", "status": "ok"}, "1": {"date": "yesterday", "status": "bad"} }
which apparently isn't legal json since my json parser (protovis) is giving me error messages
Any ideas?
Thanks
Use a list instead of a dictionary; you probably used:
myDict = {}
myDict[0] = {...}
You should use:
myList = []
myList.append({...})
P.S.: It looks like valid JSON to me anyway, but it is an object and not an array; maybe this is why your parser is complaining.
You should use a JSON serializer...
Also, an array of dictionaries would better serialize to something like this:
[
{
"date": "today",
"status": "ok"
},
{
"date": "yesterday",
"status": "bad"
}
]
That is, you should just use a JavaScript array.
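Both answers come down to serializing a list rather than an integer-keyed dict. A minimal sketch with the data from the question, using json.dumps so the result can be inspected as a string (json.dump with a file object behaves the same way); the variable names are illustrative:

```python
import json

# A list (not a dict keyed by 0, 1, ...) serializes to a JSON array.
my_list = []
my_list.append({"date": "today", "status": "ok"})
my_list.append({"date": "yesterday", "status": "bad"})

as_array = json.dumps(my_list)
print(as_array)

# For comparison: integer dict keys are converted to strings,
# which yields a JSON object like the output shown in the question.
my_dict = {0: my_list[0], 1: my_list[1]}
as_object = json.dumps(my_dict)
print(as_object)

# If the data already sits in an int-keyed dict, take its values in key order.
recovered = [my_dict[k] for k in sorted(my_dict)]
```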
