I've got a json file that I've pulled from a web service and am trying to parse it. I see that this question has been asked a whole bunch, and I've read whatever I could find, but the json data in each example appears to be very simplistic in nature. Likewise, the json example data in the python docs is very simple and does not reflect what I'm trying to work with. Here is what the json looks like:
{"RecordResponse": {
"Id": blah
"Status": {
"state": "complete",
"datetime": "2016-01-01 01:00"
},
"Results": {
"resultNumber": "500",
"Summary": [
{
"Type": "blah",
"Size": "10000000000",
"OtherStuff": {
"valueOne": "first",
"valueTwo": "second"
},
"fieldIWant": "value i want is here"
The code block in question is:
jsonFile = r'C:\Temp\results.json'
with open(jsonFile, 'w') as dataFile:
json_obj = json.load(dataFile)
for i in json_obj["Summary"]:
print(i["fieldIWant"])
Not only am I not getting into the field I want, but I'm also getting a key error on trying to suss out "Summary".
I don't know how the indices work within the array; once I even get into the "Summary" field, do I have to issue an index manually to return the value from the field I need?
The example you posted is not valid JSON (no commas after object fields), so it's hard to dig in much. If it's straight from the web service, something's messed up. If you did fix it with proper commas, the "Summary" key is within the "Results" object, so you'd need to change your loop to
with open(jsonFile, 'w') as dataFile:
json_obj = json.load(dataFile)
for i in json_obj["Results"]["Summary"]:
print(i["fieldIWant"])
If you don't know the structure at all, you could look through the resulting object recursively:
def findfieldsiwant(obj, keyname="Summary", fieldname="fieldIWant"):
try:
for key,val in obj.items():
if key == keyname:
return [ d[fieldname] for d in val ]
else:
sub = findfieldsiwant(val)
if sub:
return sub
except AttributeError: #obj is not a dict
pass
#keyname not found
return None
Related
I had .xlsx file, and I converted it to JSON. I am using python to get the data from this json file. I am able for example to search for Build# and then I get the corresponding level, but when I search for values in, for example, "14H0232" or "14H4812" it throws a KeyError.
'''
import json
f = open('try.json')
data = json.load(f)
input= input('Enter the value: ')
for i in data['F6']:
if i['14H0232'] == input:
print(i['LEVEL'])
f.close()
'''
A Snippet of the json file.
'''
{
"F6": [
{
"LEVEL": "2.0.6.0",
"ID": "dataID",
"Build#": "9",
"prod/dev": "prod ",
"14H4812": "data1\r\ndata2",
"14H4826": "data",
"14H4813": "data1\r\ndata2"
}
],
"F5": [
{
"LEVEL": "2.0.5.1",
"ID": "dataID",
"Build#": "18",
"prod/dev": "prod",
"14H0232": "data1: data1\r\ndata2: data2\r\ndata3: data3",
"14H12321": "data1\r\ndata2"
}
],
"F4": [
{
"LEVEL": "2.0.4.1",
"ID": "dataID",
"Build#": "18",
"prod/dev": "prod",
"14H0232": "data1: data1\r\ndata2: data2\r\ndata3: data3",
"14H12321": "data1\r\ndata2"
}
]
}
'''
The problem is in your loop. When you are trying to access the value for "14H0232" it just doesnt exist in your json file. The case for 'Build#' is different I assume, as the key is always there. The example you shared is also not showing that F6 has a key with that id you specified.
So to avoid this kind of errors you can put your 'if' statement in a try block and catch the error.
try:
if i['14H0232'] == input:
print(i['LEVEL'])
except KeyError:
print("The key is not found but the code continues to execute")
A simple one, but I've just not yet been able to wrap my head around parsing nested lists and json structures in Python...
Here is the raw message I am trying to parse.
{
"Records": [
{
"messageId": "1b9c0952-3fe3-4ab4-a8ae-26bd5d3445f8",
"receiptHandle": "AQEBy40IsvNDy33dOhn4KB8+7apBecWpSuw5OgL9sw/Nf+tM2esLgqmWjGsd4n0oqB",
"body": "{\n \"Type\" : \"Notification\",\n \"MessageId\" : \"dce5c301-029f-55e1-8cee-959b1ad4e500\",\n \"TopicArn\" : \"arn:aws:sns:ap-southeast-2:062497424678:vid\",\n \"Message\" : \"ChiliChallenge.mp4\",\n \"Timestamp\" : \"2020-01-16T07:51:39.807Z\",\n \"SignatureVersion\" : \"1\",\n \"Signature\" : \"oloRF7SzS8ipWQFZieXDQ==\",\n \"SigningCertURL\" : \"https://sns.ap-southeast-2.amazonaws.com/SimpleNotificationService-a.pem\",\n \"UnsubscribeURL\" : \"https://sns.ap-southeast-2.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:ap-southeast-2:062478:vid\"\n}",
"attributes": {
"ApproximateReceiveCount": "1",
"SentTimestamp": "1579161099897",
"SenderId": "AIDAIY4XD42",
"ApproximateFirstReceiveTimestamp": "1579161099945"
},
"messageAttributes": {},
"md5OfBody": "1f246d643af4ea232d6d4c91f",
"eventSource": "aws:sqs",
"eventSourceARN": "arn:aws:sqs:ap-southeast-2:062497424678:vid",
"awsRegion": "ap-southeast-2"
}
]
}
I am trying to extract the Message in the body section, ending up with a string as "ChiliChallenge.mp4\"
Thanks!
Essentially I just keep getting either TypeError: string indices must be integers or parsing the body but not getting any further into the list without an error.
Here's my attempt:
import json
with open ("event_testing.txt", "r") as myfile:
event=myfile.read().replace('\n', '')
str(event)
event = json.loads(event)
key = event['Records'][0]['body']
print(key)
you can use json.loads to load string
with open ("event_testing.txt", "r") as fp:
event = json.loads(fp.read())
key = json.loads(event['Records'][0]['body'])['Message']
print(key)
'ChiliChallenge.mp4'
Say your message is phrase,
I rebuild your code like:
phrase_2 = phrase["Records"]
print(phrase_2[0]["body"])
Then it works clearly. Because beginning of the Records, it looks like an array so you need to organized it.
I'm using requests and JSON to pull some data from an API, and I'm struggling with using a nested dict.
Here is the JSON data:
{"data": [
{
"ContactId": "123",
"EmailAddress": "abc#xyz.com",
"FirstName": null,
"LastName": null,
"ClickDate": "6/6/1966",
"Clicks": "5",
"IPAddress": "1.1.1.1.1",
"UserAgent": "IE8.0",
"UniqueLinksClicked": [
{
"LinkURL": "http://link1.com",
"LinkURL": "http://link2.com",
"LinkURL": "http://link3.com"
}
]
}
]}
I'm able to access all of the ContactID and other 1st level stuff fine, but I can't figure out how to traverse the "LinkURL" stuff.
Here is my python...
result = requests.get(requesturl, headers=headers)
jdata = json.loads(result.content)
for result in jdata["data"]:
contactID = str([(result["ContactId"])])
for result in jdata["data"]["UniqueLinksClicked"]: #I'm doing this wrong, but I'm not sure how.
print(ContactID + " " + str([(result["LinkURL"])]))
The line marked with a comment above generates a TypeError indicating it's a list, where I expected it to be a dict:
list indices must be integers or slices, not str
If instead I drop the ["data"] dereference and try to access "UniqueLinksClicked" on jdata:
for link in jdata["UniqueLinksClicked"]:
I get a key error because the ["UniqueLinksClicked"] is an item inside of the ["data"] dict.
How do I do this correctly?
You can iterate over the links in a nested loop. Do not use the same variable name result in two nested loops! Use a different variable name in the inner loop.
for link in result["UniqueLinksClicked"]:
print(ContactID, link["LinkURL"])
(Moved from question.)
[OP] was confused about the variable naming in the for variable1 in variable2["dict"]: portion. After some help from HÃ¥ken Lid, [they] figured it out.
It should look like this...
for item in jdata["data"]:
contactID = str([(item["ContactId"])])
print(contactID)
for link in item["UniqueLinksClicked"]:
print(link["LinkURL"])
Okay, so I've been banging my head on this for the last 2 days, with no real progress. I am a beginner with python and coding in general, but this is the first issue I haven't been able to solve myself.
So I have this long file with JSON formatting with about 7000 entries from the youtubeapi.
right now I want to have a short script to print certain info ('videoId') for a certain dictionary key (refered to as 'key'):
My script:
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key']['Items']['id']['videoId'])
# print(trailers['key']['videoId'] gives same response
Error:
print(trailers['key']['Items']['id']['videoId'])
TypeError: string indices must be integers
It does work when I want to print all the information for the dictionary key:
This script works
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key'])
Also print(type(trailers)) results in class 'dict', as it's supposed to.
My JSON File is formatted like this and is from the youtube API, youtube#searchListResponse.
{
"kind": "youtube#searchListResponse",
"etag": "",
"nextPageToken": "",
"regionCode": "",
"pageInfo": {
"totalResults": 1000000,
"resultsPerPage": 1
},
"items": [
{
"kind": "youtube#searchResult",
"etag": "",
"id": {
"kind": "youtube#video",
"videoId": ""
},
"snippet": {
"publishedAt": "",
"channelId": "",
"title": "",
"description": "",
"thumbnails": {
"default": {
"url": "",
"width": 120,
"height": 90
},
"medium": {
"url": "",
"width": 320,
"height": 180
},
"high": {
"url": "",
"width": 480,
"height": 360
}
},
"channelTitle": "",
"liveBroadcastContent": "none"
}
}
]
}
What other information is needed to be given for you to understand the problem?
The following code gives me all the videoId's from the provided sample data (which is no id's at all in fact):
import json
with open('sampledata', 'r') as datafile:
data = json.loads(datafile.read())
print([item['id']['videoId'] for item in data['items']])
Perhaps you can try this with more data.
Hope this helps.
I didn't really look into the youtube api but looking at the code and the sample you gave it seems you missed out a [0]. Looking at the structure of json there's a list in key items.
import json
f = open ('json1.json', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['items'][0]['id']['videoId'])
I've not used json before at all. But it's basically imported in the form of dicts with more dicts, lists etc. Where applicable. At least from my understanding.
So when you do type(trailers) you get type dict. Then you do dict with trailers['key']. If you do type of that, it should also be a dict, if things work correctly. Working through the items in each dict should in the end find your error.
Pythons error says you are trying find the index/indices of a string, which only accepts integers, while you are trying to use a dict. So you need to find out why you are getting a string and not dict when using each argument.
Edit to add an example. If your dict contains a string on key 'item', then you get a string in return, not a new dict which you further can get a dict from. item in the json for example, seem to be a list, with dicts in it. Not a dict itself.
I have a json object which I would like to filter for misspelled key name. So for the example below, I would like to have a json object without the misspelled test_name key. What is the easiest way to do this?
json_data = """{
"my_test": [{
"group_name": "group-A",
"results": [{
"test_name": "test1",
"time": "8.556",
"status": "pass"
}, {
"test_name": "test2",
"time": "45.909",
"status": "pass"
}, {
"test_nameZASSD": "test3",
"time": "9.383",
"status": "fail"
}]
}]
}"""
This is an online test, and looks like i'm not allowed to use jsonSchema.
So far my code looks like this:
if 'test_suites' in data:
for suites in data["test_suites"]:
if 'results' in suites and 'suite_name' in suites:
for result in suites["results"]:
if 'test_name' not in result or 'time' not in result or 'status' not in result:
result.clear()
else:
....
else:
print("Check 'suite_name' and/or 'results'")
else:
print("Check 'test_suites'")
It kind of works, but result.clear() leaves a empty {}, which get annoying later. What can I do here?
It looks like your data have a consistent schema. So I would try using json schema to solve your problem. With that you can set up a schema and only allow objects with certain key names.
If you just want to check if a certain key is in the dictionary and make sure that you only get the ones that are according to spec you could do something like this:
passed = []
for item in result:
if 'test_name' in item.keys():
passed.append(item)
But if you have a lot of different keys you need to check for it will become unwieldy. So for bigger projects I would say that json schema is the way to go.