How to get rid of backslashes in nested json - python

I make a request to a backend API and get the data back in json format
The response looks something like this. Kindly note that the body key values will be different and there are over 100's of them. The data1.json looks like this
[
{
"body": "[{\"task_ids\":[],\"accounts\":[],\"entity_ids\":[12814],\"guid\":\"2DFEB337-5F5D-4DF5-84CF-E951D237D448\",\"id\":\"0034030fb97251b3\",\"subject\":\"Uploaded Application\"}]",
code": 200,
"headers": {
"Content-Type": "application/json"
},
"msg": "OK",
"name": "0"
},
{
"body": "[{\"task_ids\":[],\"accounts\":[],\"entity_ids\":[12814],\"guid\":\"2DFEB337-5F5D-4DF5-84CF-E951D237D448\",\"id\":\"0034030fb97251b3\",\"subject\":\"Uploaded Application\",\}]",
code": 200,
"headers": {
"Content-Type": "application/json"
},
"msg": "OK",
"name": "0"
},
...
]
I need to get rid of the
"\" in all of the body key in the json response
Concatenate the key[body'] into one array
ideally it should look something like this.
[
{"body":"[{"task_ids":[],"accounts":[],"entity_ids":[12814],"guid":"2DFEB337-5F5D-4DF5-84CF-E951D237D448","id":"0034030fb97251b3","subject":"Uploaded Application",]","[{"task_ids":[],"accounts":[],"entity_ids":[12814],"guid":"2DFEB337-5F5D-4DF5-84CF-E951D237D448","id":"0034030fb97251b3","subject":"Uploaded Application",]",..}
]
I have tried replace and a lot of methods but none of them are replacing the \ so I cannot even go to step 2. What I did find out was if I save it to a text file the backslashes are replaced but then I cannot again send the response back as a json object. The code to get the data1.json file so far looks like this.
data = json.loads(r.text)
with open('data1.json', 'w') as outfile:
json.dump(data, outfile, sort_keys = True, indent = 4,
ensure_ascii = False)
Any suggestions on how to achieve the first points as in my desired output? Thanks.

(For starters, what you gave is invalid JSON, and json will either fail to parse it completely or produce something bogus. You need to make sure you extract the response correctly, and if this is really what the response is, have the sending party fix it.)
Now, about the question as asked:
You don't need to do anything special. That's just how JSON represents values that themselves contain JSON's special characters ("escapes" them with a backslash).
If you load the data via a proper JSON parser (e.g. json.loads()), it will undo that escaping, and in e.g. data[0]['body'], you will see proper data.
Of course, since that string is JSON itself, you will need to further parse it with json, too, if you need to split it into its meaningful parts...

The JSON data is incorrectly formatted and invalid JSON (missing quotes in "key" strings (e.g. code": 200,), invalid syntax in second dictionary body object, as mentioned in comment (e.g. "Uploaded Application\",\}]")).
However, after fixing these, a simple str.replace() statement can be used to get the expected JSON format. Then, simply parse the JSON content and build the desired list:
import json
data = '''[
{
"body": "[{\"task_ids\":[],\"accounts\":[],\"entity_ids\":[12814],\"guid\":\"2DFEB337-5F5D-4DF5-84CF-E951D237D448\",\"id\":\"0034030fb97251b3\",\"subject\":\"Uploaded Application\"}]",
"code": 200,
"headers": {
"Content-Type": "application/json"
},
"msg": "OK",
"name": "0"
},
{
"body": "[{\"task_ids\":[],\"accounts\":[],\"entity_ids\":[12814],\"guid\":\"2DFEB337-5F5D-4DF5-84CF-E951D237D448\",\"id\":\"0034030fb97251b3\",\"subject\":\"Uploaded Application\"}]",
"code": 200,
"headers": {
"Content-Type": "application/json"
},
"msg": "OK",
"name": "0"
}
]'''
r = json.loads(data.replace('\\', '').replace('"[', "[").replace("]\"", "]"))
l = []
for d in r:
l.append(d)
Now inspect the contents of l:
>>> l
[{u'body': [{u'entity_ids': [12814], u'accounts': [], u'task_ids': [], u'guid': u'2DFEB337-5F5D-4DF5-84CF-E951D237D448', u'id': u'0034030fb97251b3', u'subject': u'Uploaded Application'}], u'headers': {u'Content-Type': u'application/json'}, u'code': 200, u'name': u'0', u'msg': u'OK'},
{u'body': [{u'entity_ids': [12814], u'accounts': [], u'task_ids': [], u'guid': u'2DFEB337-5F5D-4DF5-84CF-E951D237D448', u'id': u'0034030fb97251b3', u'subject': u'Uploaded Application'}], u'headers': {u'Content-Type': u'application/json'}, u'code': 200, u'name': u'0', u'msg': u'OK'}]

Related

Reading key values a JSON array which is a set in Python

I have the following code
import requests
import json
import sys
credentials_User=sys.argv[1]
credentials_Password=sys.argv[2]
email=sys.argv[3]
def auth_api(login_User,login_Password,):
gooddata_user=login_User
gooddata_password=login_Password
body = json.dumps({
"postUserLogin":{
"login": gooddata_user,
"password": gooddata_password,
"remember":1,
"verify_level":0
}
})
headers = {
'Content-Type': 'application/json',
'Accept': 'application/json'
}
url="https://reports.domain.com/gdc/account/login"
response = requests.request(
"POST",
url,
headers=headers,
data=body
)
sst=response.headers.get('Set-Cookie')
return sst
def query_api(cookie,email):
url="https://reports.domain.com/gdc/account/domains/domain/users?login="+email
body={}
headers={
'Content-Type': 'application/json',
'Accept': 'application/json',
'Cookie': cookie
}
response = requests.request(
"GET",
url,
headers=headers,
data=body
)
jsonContent=[]
jsonContent.append({response.text})
accountSettings=jsonContent[0]
print(accountSettings)
cookie=auth_api(credentials_User,credentials_Password)
profilehash=query_api(cookie,email)
The code itself works and sends a request to the Gooddata API.
The query_api() function returns JSON similar to below
{
"accountSettings": {
"items": [
{
"accountSetting": {
"login": "user#example.com",
"email": "user#example.com",
"firstName": "First Name",
"lastName": "Last Name",
"companyName": "Company Name",
"position": "Data Analyst",
"created": "2020-01-08 15:44:23",
"updated": "2020-01-08 15:44:23",
"timezone": null,
"country": "United States",
"phoneNumber": "(425) 555-1111",
"old_password": "secret$123",
"password": "secret$234",
"verifyPassword": "secret$234",
"authenticationModes": [
"SSO"
],
"ssoProvider": "sso-domain.com",
"language": "en-US",
"ipWhitelist": [
"127.0.0.1"
],
"links": {
"projects": "/gdc/account/profile/{profile_id}/projects",
"self": "/gdc/account/profile/{profile_id}",
"domain": "/gdc/domains/default",
"auditEvents": "/gdc/account/profile/{profile_id}/auditEvents"
},
"effectiveIpWhitelist": "[ 127.0.0.1 ]"
}
}
],
"paging": {
"offset": 20,
"count": 100,
"next": "/gdc/uri?offset=100"
}
}
}
The issue I am having is reading specific keys from this JSON Dict, I can use accountSettings=jsonContent[0] but that just returns the same JSON.
What I want to do is read the value of the project key within links
How would I do this with a dict?
Thanks
Based on your description, uyou have your value inside a list, (not a set. Foergt about set: sets are not used with JSON). Inside your list, you either your content as a single string, which then you'd have to parse with json.loads, or it is simply a well behaved nested data structure already extracted from JSON, but which is inside a single element list. This seems the most likely.
So, you should be able to do:
accountlink = jsonContent[0]["items"][0]["accountSetting"]["login"]
otherwise, if it is encoded as a a json string, you have to parse it first:
import json
accountlink = json.loads(jsonContent[0])["items"][0]["accountSetting"]["login"]
Now, given your question, I'd say your are on a begginer level as a programmer, or a casual user, just using Python to automatize something either way, I'd recommend you do try some exercising before proceeding: it will save you time (a lot of time). I am not trying to bully or mock anything here: this is the best advice I can offer you. Seek for tutorials that play around on the interactive mode, rather than trying entire programs at once that you'd just copy and paste.
Using the below code fixed the issue
jsonContent=json.loads(response.text)
print(type(jsonContent))
test=jsonContent["accountSettings"]["items"][0]
test2=test["accountSetting"]["links"]["self"]
print(test)
print(test2)
I believe this works because for some reason I didn't notice I was using .append for my jsonContent. This resulted in the data type being something other than it should have been.
Thanks to everyone who tried helping me.

Converting JSON file into a suitable string issue with Python

I have a JSON file as follows :
{
"desired":{
"property1":{
"port":"/dev/usbserial",
"rx":{
"watchdoginterval":3600
},
"state":{
"path":"/Users/user1"
},
"enabled":"true",
"active":{
"enabled":"true"
}
},
"property2":{
"signal_interrupt":"USR2",
"signal_description_path":"/tmp/logger.log"
},
"property3":{
"periodmins":40
},
}
}
I am having issues trying to convert this into a string for use with AWS IoT. The function I am using is deviceShadowHandler.shadowUpdate(JSONPayload, customShadowCallback_Update, 5)
Where JSONPayload should be the JSON string.
I have tried :
with open('JSONfile.json' , 'r') as f:
dict = json.load(f)
JSONPayload = str(dict)
but I receive an "Invalid JSON file error".
An attempt to manually create a literal string from the jSON file gets messy with complaints about "EOL while scanning string literal" etc.
What is the best solution to solve this? I am new to JSON and stuff and Python.
Trailing commas are not allowed in JSON.
{
"desired":{
"property1":{
"port":"/dev/usbserial",
"rx":{
"watchdoginterval":3600
},
"state":{
"path":"/Users/user1"
},
"enabled":"true",
"active":{
"enabled":"true"
}
},
"property2":{
"signal_interrupt":"USR2",
"signal_description_path":"/tmp/logger.log"
},
"property3":{
"periodmins":40
} # <- no comma there
}
}

Accessing nested objects with python

I have a response that I receive from foursquare in the form of json. I have tried to access the certain parts of the object but have had no success. How would I access say the address of the object? Here is my code that I have tried.
url = 'https://api.foursquare.com/v2/venues/explore'
params = dict(client_id=foursquare_client_id,
client_secret=foursquare_client_secret,
v='20170801', ll=''+lat+','+long+'',
query=mealType, limit=100)
resp = requests.get(url=url, params=params)
data = json.loads(resp.text)
msg = '{} {}'.format("Restaurant Address: ",
data['response']['groups'][0]['items'][0]['venue']['location']['address'])
print(msg)
Here is an example of json response:
"items": [
{
"reasons": {
"count": 0,
"items": [
{
"summary": "This spot is popular",
"type": "general",
"reasonName": "globalInteractionReason"
}
]
},
"venue": {
"id": "412d2800f964a520df0c1fe3",
"name": "Central Park",
"contact": {
"phone": "2123106600",
"formattedPhone": "(212) 310-6600",
"twitter": "centralparknyc",
"instagram": "centralparknyc",
"facebook": "37965424481",
"facebookUsername": "centralparknyc",
"facebookName": "Central Park"
},
"location": {
"address": "59th St to 110th St",
"crossStreet": "5th Ave to Central Park West",
"lat": 40.78408342593807,
"lng": -73.96485328674316,
"labeledLatLngs": [
{
"label": "display",
"lat": 40.78408342593807,
"lng": -73.96485328674316
}
],
the full response can be found here
Like so
addrs=data['items'][2]['location']['address']
Your code (at least as far as loading and accessing the object) looks correct to me. I loaded the json from a file (since I don't have your foursquare id) and it worked fine. You are correctly using object/dictionary keys and array positions to navigate to what you want. However, you mispelled "address" in the line where you drill down to the data. Adding the missing 'a' made it work. I'm also correcting the typo in the URL you posted.
I answered this assuming that the example JSON you linked to is what is stored in data. If that isn't the case, a relatively easy way to see exact what python has stored in data is to import pprint and use it like so: pprint.pprint(data).
You could also start an interactive python shell by running the program with the -i switch and examine the variable yourself.
data["items"][2]["location"]["address"]
This will access the address for you.
You can go to any level of nesting by using integer index in case of an array and string index in case of a dict.
Like in your case items is an array
#items[int index]
items[0]
Now items[0] is a dictionary so we access by string indexes
item[0]['location']
Now again its an object s we use string index
item[0]['location']['address]

TypeError: string indices must be integers // working with JSON as dict in python

Okay, so I've been banging my head on this for the last 2 days, with no real progress. I am a beginner with python and coding in general, but this is the first issue I haven't been able to solve myself.
So I have this long file with JSON formatting with about 7000 entries from the youtubeapi.
right now I want to have a short script to print certain info ('videoId') for a certain dictionary key (refered to as 'key'):
My script:
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key']['Items']['id']['videoId'])
# print(trailers['key']['videoId'] gives same response
Error:
print(trailers['key']['Items']['id']['videoId'])
TypeError: string indices must be integers
It does work when I want to print all the information for the dictionary key:
This script works
import json
f = open ('path file.txt', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['key'])
Also print(type(trailers)) results in class 'dict', as it's supposed to.
My JSON File is formatted like this and is from the youtube API, youtube#searchListResponse.
{
"kind": "youtube#searchListResponse",
"etag": "",
"nextPageToken": "",
"regionCode": "",
"pageInfo": {
"totalResults": 1000000,
"resultsPerPage": 1
},
"items": [
{
"kind": "youtube#searchResult",
"etag": "",
"id": {
"kind": "youtube#video",
"videoId": ""
},
"snippet": {
"publishedAt": "",
"channelId": "",
"title": "",
"description": "",
"thumbnails": {
"default": {
"url": "",
"width": 120,
"height": 90
},
"medium": {
"url": "",
"width": 320,
"height": 180
},
"high": {
"url": "",
"width": 480,
"height": 360
}
},
"channelTitle": "",
"liveBroadcastContent": "none"
}
}
]
}
What other information is needed to be given for you to understand the problem?
The following code gives me all the videoId's from the provided sample data (which is no id's at all in fact):
import json
with open('sampledata', 'r') as datafile:
data = json.loads(datafile.read())
print([item['id']['videoId'] for item in data['items']])
Perhaps you can try this with more data.
Hope this helps.
I didn't really look into the youtube api but looking at the code and the sample you gave it seems you missed out a [0]. Looking at the structure of json there's a list in key items.
import json
f = open ('json1.json', 'r')
s = f.read()
trailers = json.loads(s)
print(trailers['items'][0]['id']['videoId'])
I've not used json before at all. But it's basically imported in the form of dicts with more dicts, lists etc. Where applicable. At least from my understanding.
So when you do type(trailers) you get type dict. Then you do dict with trailers['key']. If you do type of that, it should also be a dict, if things work correctly. Working through the items in each dict should in the end find your error.
Pythons error says you are trying find the index/indices of a string, which only accepts integers, while you are trying to use a dict. So you need to find out why you are getting a string and not dict when using each argument.
Edit to add an example. If your dict contains a string on key 'item', then you get a string in return, not a new dict which you further can get a dict from. item in the json for example, seem to be a list, with dicts in it. Not a dict itself.

python - exporting dictionary(array) to json

I have an array of dictionaries like so:
myDict[0] = {'date':'today', 'status': 'ok'}
myDict[1] = {'date':'yesterday', 'status': 'bad'}
and I'm trying to export this array to a json file where each dictionary is its own entry. The problem is when I try to run:
dump(myDict, open("test.json", "w"))
It outputs a json file with a number prefix before each entry
{"0": {"date": "today", "status": "ok"}, "1": {"date": "yesterday", "status": "bad"} }
which apparently isn't legal json since my json parser (protovis) is giving me error messages
Any ideas?
Thanks
Use a list instead of a dictionary; you probably used:
myDict = {}
myDict[0] = {...}
You should use:
myList = []
myList.append({...}
P.S.: It seems valid json to me anyways, but it is an object and not a list; maybe this is the reason why your parser is complaining
You should use a JSON serializer...
Also, an array of dictionaries would better serialize to something like this:
[
{
"date": "today",
"status": "ok"
},
{
"date": "yesterday",
"status": "bad"
}
]
That is, you should just use a JavaScript array.

Categories