How to parse nested JSON in Python

I'm struggling to access some values in this nested JSON in Python.
How can I access ['Records'][0]['s3']['bucket']['name']? I searched a lot for a simple Python snippet, but had no luck. Thanks in advance!
{
"Records": [
{
"eventName": "xxxxxxx",
"userIdentity": {
"principalId": "AWS:XXXXXXXXXXXXXX"
},
"requestParameters": {
"sourceIPAddress": "XX.XX.XX.XX"
},
"responseElements": {
"x-amz-request-id": "8CXXXXXXXXXXHRQX",
"x-amz-id-2": "doZ3+gxxxxxxx"
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "X-Event",
"bucket": {
"name": "bucket-name",
"ownerIdentity": {
"principalId": "xxxxxxx"
},
"arn": "arn:aws:s3:::bucket-name"
},
"object": {
"key": "object.png",
"sequencer": "0060XXXXXXX75X"
}
}
}
]
}

Since the JSON is a string, use the json.loads function from the built-in json module.
import json
# Shortened sample for illustration; replace with your own JSON string
json_string = '{"Records": [{"s3": {"bucket": {"name": "bucket-name"}}}]}'
parsed_string = json.loads(json_string)
print(parsed_string)  # despite the name, this is a Python dict
print(parsed_string['Records'][0]['s3']['bucket']['name'])  # prints the bucket name

Have you tried running your example? If the data is already a native Python dictionary, the lookup works as written. If you're loading the JSON from elsewhere, you first need to convert it to a dictionary using the json library (as others mentioned, json.loads(data)).
kv = {
"Records": [
{
"eventName": "xxxxxxx",
"userIdentity": {
"principalId": "AWS:XXXXXXXXXXXXXX"
},
"requestParameters": {
"sourceIPAddress": "XX.XX.XX.XX"
},
"responseElements": {
"x-amz-request-id": "8CXXXXXXXXXXHRQX",
"x-amz-id-2": "doZ3+gxxxxxxx"
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "X-Event",
"bucket": {
"name": "bucket-name",
"ownerIdentity": {
"principalId": "xxxxxxx"
},
"arn": "arn:aws:s3:::bucket-name"
},
"object": {
"key": "object.png",
"sequencer": "0060XXXXXXX75X"
}
}
}
]
}
print("RESULT:",kv['Records'][0]['s3']['bucket']['name'])
RESULT: bucket-name
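
If some events might lack a key along the way, a chained lookup raises KeyError or IndexError. The following is a minimal sketch of a tolerant lookup; the helper name get_nested and the shortened sample event are assumptions made for illustration, not part of the answers above.

```python
import json

def get_nested(data, path, default=None):
    """Follow dict keys and list indexes in `path`; return `default` if a step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a non-container value in the middle
            return default
    return current

# Shortened version of the event from the question
event = json.loads('{"Records": [{"s3": {"bucket": {"name": "bucket-name"}}}]}')

bucket_name = get_nested(event, ["Records", 0, "s3", "bucket", "name"])
missing = get_nested(event, ["Records", 1, "s3", "bucket", "name"], default="<none>")
print(bucket_name)
print(missing)
```

Whether a default is preferable to an exception depends on the caller; for a well-formed S3 event the direct lookup is simpler.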

Related

Convert all int type values to strings in a JSON object

I have a JSON object with the following structure of repeating dictionaries. I want to replace the values of id, which are all stored as integers, with string values. So in the following example, I want to change the value of id from 1998459 to "1998459".
At the moment, I believe these integer values are causing a JSON decode error when I try to read the JSON with json.loads.
The error looks like this:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes:
So far I have tried to find and replace the integers with strings in the Atom editor, but I have found no clear way to do it. I am not sure whether there is a library I should use or another approach I should take.
The JSON sample looks like this:
{ "data": {
"imageData": {
"data": [
{
"id": 1998459,
"url": "https:some_url",
"isInspirational": "true",
"classifications": [
{
"categoryLabel": "Scene",
},
{
"categoryLabel": "Contemporary",
},
{
"categoryLabel": "Rectangle",
},
{
"categoryLabel": "Square",
},
{
"categoryLabel": "Trapezoid",
},
{
"categoryLabel": "Triangle",
},
{
"categoryLabel": "Hard",
},
{
"categoryLabel": "Soft",
},
{
"categoryLabel": "Smooth",
},
{
"categoryLabel": "Stripe",
},
{
"categoryLabel": "Primary_Color",
},
{
"categoryLabel": "Calm",
},
{
"categoryLabel": "Light",
},
{
"categoryLabel": "Interior",
},
{
"categoryLabel": "Residential",
}
]
} ]
}
]
} }}
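
Integer values are valid JSON, so the decode error quoted above more likely comes from the trailing commas after each "categoryLabel" entry (and the unbalanced brackets at the end) than from id. Once the text parses, the conversion itself can be done on the loaded data. This is a minimal sketch under that assumption; the function name stringify_ids and the shortened, corrected sample are hypothetical.

```python
import json

def stringify_ids(node, key="id"):
    """Return a copy of `node` where integer values stored under `key` become strings."""
    if isinstance(node, dict):
        return {
            k: (str(v)
                if k == key and isinstance(v, int) and not isinstance(v, bool)
                else stringify_ids(v, key))
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [stringify_ids(item, key) for item in node]
    return node  # strings, numbers, booleans and None are returned unchanged

# Shortened sample with the trailing commas removed so that it parses
sample = json.loads(
    '{"data": [{"id": 1998459, "classifications": [{"categoryLabel": "Scene"}]}]}'
)
converted = stringify_ids(sample)
print(json.dumps(converted))
```

The bool check is there because bool is a subclass of int in Python, so true/false values under the same key would otherwise be converted as well.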

Parse ElasticSearch time format

I want to know the format string for a timestamp like 2021-02-11T14:05:22.123123, to use in a query like this:
query =
'{
"sort": [
{
"date": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"gte": "2021-02-11T14:05:22.123123",
"format": "WHAT ???????"
}
}
}
]
}
}
}'
What should I write in "format"?

Use the date format below to parse 2021-02-11T14:05:22.123123:
{
"mappings": {
"properties": {
"date": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSSSS"
}
}
}
}
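
As a rough sketch, the same pattern can be placed in the "format" field of the range query when the request body is built in Python. This assumes the pattern from the mapping above is also what the query should carry; the Python strptime pattern is an assumed equivalent, used here only to check the timestamp's shape locally.

```python
import json
from datetime import datetime

# Java-style pattern from the answer above, and an assumed Python equivalent
ES_FORMAT = "yyyy-MM-dd'T'HH:mm:ss.SSSSSS"
PY_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

timestamp = "2021-02-11T14:05:22.123123"
parsed = datetime.strptime(timestamp, PY_FORMAT)  # raises ValueError if the shape differs

# Same query as in the question, with the format filled in
query = {
    "sort": [{"date": {"order": "desc"}}],
    "query": {
        "bool": {
            "must": [
                {"range": {"date": {"gte": timestamp, "format": ES_FORMAT}}}
            ]
        }
    },
}
body = json.dumps(query)
print(body)
```

Building the body as a dict and serializing it with json.dumps avoids the quoting problems of a hand-written JSON string.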

Python - Parse complex JSON with objectpath

I need to parse a Terraform file written in JSON format. I have to extract two pieces of data, the resource name and its id. This is an example file:
{
"version": 1,
"serial": 1,
"modules": [
{
"path": [
"root"
],
"outputs": {
},
"resources": {
"aws_security_group.vpc-xxxxxxx-test-1": {
"type": "aws_security_group",
"primary": {
"id": "sg-xxxxxxxxxxxxxx",
"attributes": {
"description": "test-1",
"name": "test-1"
}
}
},
"aws_security_group.vpc-xxxxxxx-test-2": {
"type": "aws_security_group",
"primary": {
"id": "sg-yyyyyyyyyyyy",
"attributes": {
"description": "test-2",
"name": "test-2"
}
}
}
}
}
]
}
For every resource, I need to export the top-level key and the value of id. In this case, that is aws_security_group.vpc-xxxxxxx-test-1 sg-xxxxxxxxxxxxxx and aws_security_group.vpc-xxxxxxx-test-2 sg-yyyyyyyyyyyy.
I have tried to write this in Python:
#!/usr/bin/python3.6
import json
import objectpath
with open('file.json') as json_file:
    data = json.load(json_file)
    json_tree = objectpath.Tree(data['modules'])
    result = tuple(json_tree.execute('$..resources[0]'))
result is
('aws_security_group.vpc-xxxxxxx-test-1', 'aws_security_group.vpc-xxxxxxx-test-2')
That works, but I can't extract the id. Any help is appreciated, including other methods.
Thanks

I don't know objectpath, but I think you need:
json_tree.execute('$..resources[0]..primary.id')
or even just
json_tree.execute('$..resources[0]..id')
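
Since the question is open to other methods, here is a minimal sketch that uses only the standard json module and plain dictionary iteration instead of objectpath. The function name resource_ids is hypothetical, and the sample is a shortened copy of the file from the question.

```python
import json

# Shortened copy of the state file from the question
state_text = """
{"modules": [{"path": ["root"], "resources": {
  "aws_security_group.vpc-xxxxxxx-test-1": {
    "type": "aws_security_group", "primary": {"id": "sg-xxxxxxxxxxxxxx"}},
  "aws_security_group.vpc-xxxxxxx-test-2": {
    "type": "aws_security_group", "primary": {"id": "sg-yyyyyyyyyyyy"}}
}}]}
"""

def resource_ids(state):
    """Yield (resource name, primary id) pairs from every module in the state dict."""
    for module in state.get("modules", []):
        for name, resource in module.get("resources", {}).items():
            # The id is None if a resource has no "primary" block or no "id"
            yield name, resource.get("primary", {}).get("id")

pairs = list(resource_ids(json.loads(state_text)))
for name, resource_id in pairs:
    print(name, resource_id)
```

With a real file, json.load on the opened file would replace json.loads(state_text).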

How to search through JSON object for specific list of expected key values?

I have this JSON object, and I would like to know how to iterate through the name of each serviceCatalog entry and alert on any name that is not "service-foo" or "service-bar".
Here is my JSON object:
{
"access": {
"serviceCatalog": [
{
"endpoints": [
{
"internalURL": "https://snet-storage101.example.com//v1.0",
"publicURL": "https://storage101.example.com//v1.0",
"region": "LON",
"tenantId": "1
},
{
"internalURL": "https://snet-storage101.example.com//v1.0",
"publicURL": "https://storage101.example.com//v1.0",
"region": "USA",
"tenantId": "1
}
],
"name": "service-foo",
"type": "object-store"
},
{
"endpoints": [
{
"publicURL": "https://x.example.com:9384/v1.0/x",
"tenantId": "6y5t4re32"
}
],
"name": "service-bar",
"type": "rax:test"
},
{
"endpoints": [
{
"publicURL": "https://y.example.com:9384/v1.0/x",
"tenantId": "765432"
}
],
"name": "service-thesystem",
"type": "rax:test"
}
]
}

If x is the dictionary above, you could do:
for item in x["access"]["serviceCatalog"]:
    if item["name"] not in ["service-foo", "service-bar"]:
        print(item["name"])
P.S. You can use json.loads() to decode the JSON data, if that is what you are asking. Also, your JSON contains errors.
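
The same check can be wrapped in a small function that returns the unexpected names instead of printing them. This is a sketch, not the only way to do it; the name unexpected_services and the shortened, corrected sample are assumptions for illustration.

```python
import json

EXPECTED = {"service-foo", "service-bar"}  # names that should not trigger an alert

def unexpected_services(catalog, expected=EXPECTED):
    """Return the names of serviceCatalog entries that are not in `expected`."""
    entries = catalog.get("access", {}).get("serviceCatalog", [])
    # An entry without a "name" key is reported as None
    return [entry.get("name") for entry in entries if entry.get("name") not in expected]

# Shortened sample with the JSON errors from the question removed
x = json.loads("""
{"access": {"serviceCatalog": [
  {"name": "service-foo", "type": "object-store"},
  {"name": "service-bar", "type": "rax:test"},
  {"name": "service-thesystem", "type": "rax:test"}
]}}
""")

alerts = unexpected_services(x)
print(alerts)
```

A set is used for the expected names so the membership test stays cheap as the list of allowed services grows.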

Check whether a JSON object property exists and print it as decoded Unicode

I get the following data from the Instagram API, and I'm trying to get the text property from the caption using the following code:
data = simplejson.load(info) # info is retrieved using urllib2
for post in data['data']:
    if post['caption'] is not "null":
        try:
            post['caption']['text']
        except NameError:
            post['caption']['text'] = 0
        if post['caption']['text'] is not 0:
            print post['caption']['text']
But I keep getting TypeError: 'NoneType' object has no attribute '__getitem__', as well as UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-5: character maps to <undefined> when printing the Unicode strings.
Here is the JSON data that is retrieved and stored in info:
{
"pagination":{
"next_url":"https:\/\/api.instagram.com\/v1\/users\/self\/feed?access_token=184046392.f59def8.c5726b469ad2462f85c7cea5f72083c0&count=3&max_id=247821697921944007_6064449",
"next_max_id":"247821697921944007_6064449"
},
"meta":{
"code":200
},
"data":[
{
"attribution":null,
"tags":[
"usausausa",
"olympics"
],
"type":"image",
"location":{
"latitude":37.785929,
"name":"Aquatech Swim School",
"longitude":-122.278718,
"id":16343815
},
"comments":{
"count":0,
"data":[
]
},
"filter":"Valencia",
"created_time":"1343765260",
"link":"http:\/\/instagr.am\/p\/NwhEktJvEp\/",
"likes":{
"count":0,
"data":[
]
},
"images":{
"low_resolution":{
"url":"http:\/\/distilleryimage1.s3.amazonaws.com\/61d9cbeedb4b11e1b8e822000a1e8b8e_6.jpg",
"width":306,
"height":306
},
"thumbnail":{
"url":"http:\/\/distilleryimage1.s3.amazonaws.com\/61d9cbeedb4b11e1b8e822000a1e8b8e_5.jpg",
"width":150,
"height":150
},
"standard_resolution":{
"url":"http:\/\/distilleryimage1.s3.amazonaws.com\/61d9cbeedb4b11e1b8e822000a1e8b8e_7.jpg",
"width":612,
"height":612
}
},
"caption":{
"created_time":"1343765325",
"text":"Part of my job to watch swimming. #olympics #USAUSAUSA",
"from":{
"username":"kissinkatkelly",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_4672491_75sq_1341713095.jpg",
"id":"4672491",
"full_name":"kissinkatkelly"
},
"id":"247843973390332239"
},
"user_has_liked":false,
"id":"247843429330383145_4672491",
"user":{
"username":"kissinkatkelly",
"website":"",
"bio":"I sing the body electric\r\n\r\nBay Area, CA",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_4672491_75sq_1341713095.jpg",
"full_name":"kissinkatkelly",
"id":"4672491"
}
},
{
"attribution":null,
"tags":[
],
"type":"image",
"location":{
"latitude":36.020832061,
"longitude":-121.548835754
},
"comments":{
"count":4,
"data":[
{
"created_time":"1343763343",
"text":"I wanna cut your mustache off. \ue313\ue313\ue313\ue004",
"from":{
"username":"glorias_noodles",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_24432017_75sq_1343633079.jpg",
"id":"24432017",
"full_name":"\ue340MeGusta Gloria\ue340"
},
"id":"247827343962686703"
},
{
"created_time":"1343763844",
"text":"Ahaha^",
"from":{
"username":"chloe_carter",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_44766575_75sq_1343509145.jpg",
"id":"44766575",
"full_name":"Chloe Carter"
},
"id":"247831551235474746"
},
{
"created_time":"1343763958",
"text":"Amazingg thoo",
"from":{
"username":"saulyp",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_18051263_75sq_1335648741.jpg",
"id":"18051263",
"full_name":"Saul Perez"
},
"id":"247832506790200642"
},
{
"created_time":"1343764298",
"text":"#popesaintvictor where is that? :o",
"from":{
"username":"youknow_jameson",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_194001394_75sq_1343613135.jpg",
"id":"194001394",
"full_name":"Jameson Medina"
},
"id":"247835358103225704"
}
]
},
"filter":"Normal",
"created_time":"1343763202",
"link":"http:\/\/instagr.am\/p\/NwdJRpBkfX\/",
"likes":{
"count":611,
"data":[
{
"username":"jakyvedder",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_18148021_75sq_1336938690.jpg",
"id":"18148021",
"full_name":"Janycken"
},
{
"username":"nadjasinbruker",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_174576513_75sq_1343582260.jpg",
"id":"174576513",
"full_name":"Nadja"
},
{
"username":"vivi11",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_1193390_75sq_1338169730.jpg",
"id":"1193390",
"full_name":"Viviana Rodriguez"
},
{
"username":"me_4_eva",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_181498114_75sq_1343506811.jpg",
"id":"181498114",
"full_name":"Kelly"
},
{
"username":"roxczajkowski",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_9367244_75sq_1343696914.jpg",
"id":"9367244",
"full_name":"Czajkowski \u041a\u043e\u0440\u0448\u0443\u043d\u043e\u0432\u0430"
},
{
"username":"arsi1989",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_201586134_75sq_1343761866.jpg",
"id":"201586134",
"full_name":"Arsalan MemOn"
},
{
"username":"puppyluva",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_201579504_75sq_1343760137.jpg",
"id":"201579504",
"full_name":"puppyluva"
},
{
"username":"paulinamurr",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_49364097_75sq_1343428499.jpg",
"id":"49364097",
"full_name":"Paulina Murray"
},
{
"username":"_mcquadeface_",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_20679753_75sq_1327617901.jpg",
"id":"20679753",
"full_name":"Emily McQuade"
}
]
},
"images":{
"low_resolution":{
"url":"http:\/\/distilleryimage11.s3.amazonaws.com\/96cd5b90db4611e1827612313814176c_6.jpg",
"width":306,
"height":306
},
"thumbnail":{
"url":"http:\/\/distilleryimage11.s3.amazonaws.com\/96cd5b90db4611e1827612313814176c_5.jpg",
"width":150,
"height":150
},
"standard_resolution":{
"url":"http:\/\/distilleryimage11.s3.amazonaws.com\/96cd5b90db4611e1827612313814176c_7.jpg",
"width":612,
"height":612
}
},
"caption":null,
"user_has_liked":false,
"id":"247826160271378391_605400",
"user":{
"username":"popesaintvictor",
"website":"http:\/\/popesaintvictor.com",
"bio":"artist, friend, and brand designer for blood:water mission in nashville, tennessee. \r\n\r\nhusband to #ohsoamy\r\n\r\nbe inspired. be awesome.\r\n",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_605400_75sq_1342893414.jpg",
"full_name":"pope saint victor",
"id":"605400"
}
},
{
"attribution":null,
"tags":[
],
"type":"image",
"location":{
"latitude":40.738834381,
"longitude":-73.994163513
},
"comments":{
"count":6,
"data":[
{
"created_time":"1343762733",
"text":"Nice :)",
"from":{
"username":"belieberpernille99",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_186196238_75sq_1341347304.jpg",
"id":"186196238",
"full_name":"official belieber"
},
"id":"247822232418879860"
},
{
"created_time":"1343762748",
"text":"Those pants \ud83d\ude0d",
"from":{
"username":"morganmarzulli",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_29155556_75sq_1337653621.jpg",
"id":"29155556",
"full_name":"morganmarzulli"
},
"id":"247822351461615990"
},
{
"created_time":"1343762777",
"text":"That outfit is to die for. I love her pants! They're so fun. \ud83d\udc4d",
"from":{
"username":"ninavnegron",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_18421926_75sq_1343356820.jpg",
"id":"18421926",
"full_name":"Nina V"
},
"id":"247822600452278654"
},
{
"created_time":"1343762782",
"text":"YEAH THEIR COOL",
"from":{
"username":"belieberpernille99",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_186196238_75sq_1341347304.jpg",
"id":"186196238",
"full_name":"official belieber"
},
"id":"247822639375419775"
},
{
"created_time":"1343762782",
"text":"Another day another shoot! Look out for me and my chicest staff on #racked!",
"from":{
"username":"rebeccaminkoff",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_6064449_75sq_1332274636.jpg",
"id":"6064449",
"full_name":"Rebecca Minkoff"
},
"id":"247822641497737600"
},
{
"created_time":"1343764430",
"text":"Hot mama! Miss you!",
"from":{
"username":"ashleekoston",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_12925089_75sq_1340081366.jpg",
"id":"12925089",
"full_name":"ashleekoston"
},
"id":"247836463642020446"
}
]
},
"filter":"Walden",
"created_time":"1343762670",
"link":"http:\/\/instagr.am\/p\/NwcIVwRYnH\/",
"likes":{
"count":528,
"data":[
{
"username":"claireyoung48",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_40091175_75sq_1338778945.jpg",
"id":"40091175",
"full_name":"claireyoung48"
},
{
"username":"l_christine_k",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_14871166_75sq_1341962995.jpg",
"id":"14871166",
"full_name":"Lauren Kawano"
},
{
"username":"grcdaly",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_41426567_75sq_1335023058.jpg",
"id":"41426567",
"full_name":"\u24bc\u24c7\u24b6\u24b8\u24ba \u24b9\u24b6\u24c1\u24e8"
},
{
"username":"vanessaalcalaa",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_18115905_75sq_1342828120.jpg",
"id":"18115905",
"full_name":"Vanessa Alcala"
},
{
"username":"makennalenover",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_6464394_75sq_1343268613.jpg",
"id":"6464394",
"full_name":"Makenna Lenover"
},
{
"username":"heyitsmaryanne",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_623979_75sq_1340838647.jpg",
"id":"623979",
"full_name":"Maryanne L"
},
{
"username":"sarabeen",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/anonymousUser.jpg",
"id":"6463387",
"full_name":"sarabeen"
},
{
"username":"boldincrimson",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_191122242_75sq_1341889110.jpg",
"id":"191122242",
"full_name":"Pilar Chapa"
},
{
"username":"elizzabethhope",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_6761399_75sq_1342553102.jpg",
"id":"6761399",
"full_name":"Lizz\u270c"
}
]
},
"images":{
"low_resolution":{
"url":"http:\/\/distilleryimage7.s3.amazonaws.com\/59bc708edb4511e1b7ea22000a1cbb16_6.jpg",
"width":306,
"height":306
},
"thumbnail":{
"url":"http:\/\/distilleryimage7.s3.amazonaws.com\/59bc708edb4511e1b7ea22000a1cbb16_5.jpg",
"width":150,
"height":150
},
"standard_resolution":{
"url":"http:\/\/distilleryimage7.s3.amazonaws.com\/59bc708edb4511e1b7ea22000a1cbb16_7.jpg",
"width":612,
"height":612
}
},
"caption":null,
"user_has_liked":false,
"id":"247821697921944007_6064449",
"user":{
"username":"rebeccaminkoff",
"website":"http:\/\/www.rebeccaminkoff.com",
"bio":"The Downtown Romantic. My life, my work, my world.",
"profile_picture":"http:\/\/images.instagram.com\/profiles\/profile_6064449_75sq_1332274636.jpg",
"full_name":"Rebecca Minkoff",
"id":"6064449"
}
}
]
}

You don't need the intricate tests on whether 'text' is present in the post caption.
This code works well with the JSON string you posted:
for post in data['data']:
    if post.get('caption'):
        print post['caption'].get('text', 0)
Furthermore, you could be more defensive and iterate over data.get('data', []) when starting the loop, in case Instagram sends you empty JSON.

Basically, when the JSON is loaded and deserialized, null in JSON becomes None in Python.
So your line:
if post['caption'] is not "null":
should become:
if post['caption']:
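
The null-to-None mapping can be checked directly. The sketch below uses Python 3 and the standard json module rather than simplejson; the helper name caption_texts and the two-post sample are hypothetical stand-ins for the feed in the question.

```python
import json

# Two posts: one with a caption object, one where the caption is null
data = json.loads(
    '{"data": [{"caption": {"text": "Part of my job"}}, {"caption": null}]}'
)

def caption_texts(feed):
    """Collect caption text for posts that have a caption; skip posts where it is null."""
    texts = []
    for post in feed.get("data", []):
        caption = post.get("caption")  # None when the JSON value was null
        if caption:
            texts.append(caption.get("text", ""))
    return texts

texts = caption_texts(data)
print(data["data"][1]["caption"] is None)
print(texts)
```

Comparing against the string "null" never matches here, because the decoder has already turned the JSON null into None.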
