Traversing json array in python

I'm using urllib.request.urlopen to get a JSON response that looks like this:
{
"batchcomplete": "",
"query": {
"pages": {
"76972": {
"pageid": 76972,
"ns": 0,
"title": "Title",
"thumbnail": {
"original": "https://linktofile.com"
}
}
}
}
}
The relevant code to get the response:
response = urllib.request.urlopen("https://example.com?title="+object.title)
data = response.read()
encoding = response.info().get_content_charset('utf-8')
json_object = json.loads(data.decode(encoding))
I'm trying to retrieve the value of "original", but I'm having a hard time getting there.
I can do print(json_object['query']['pages']), but once I do print(json_object['query']['pages'][0]) I run into a KeyError: 0.
How can I retrieve the value of "original" with Python?

Do this instead:
my_content = json_object['query']['pages']['76972']['thumbnail']['original']
The reason is that an integer index such as [0] only works when the object is a list. In your case every level is a dict, so you need to specify a key instead of an index.
If the page number is dynamic, you can do:
page_content = json_object['query']['pages']
for content in page_content.values():
    my_content = content['thumbnail']['original']
where my_content is the required information.
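As a minimal, self-contained sketch of the point above, the snippet below parses a hard-coded copy of the sample response, shows that indexing the dict with 0 raises KeyError, and then collects every "original" URL by iterating over the values. The names raw, lookup_failed and originals are made up for illustration, and skipping pages that have no "thumbnail" is an assumption about how you want to handle missing data.

```python
import json

# Hard-coded copy of the sample response from the question.
raw = '''
{
  "batchcomplete": "",
  "query": {
    "pages": {
      "76972": {
        "pageid": 76972,
        "ns": 0,
        "title": "Title",
        "thumbnail": {"original": "https://linktofile.com"}
      }
    }
  }
}
'''
json_object = json.loads(raw)
pages = json_object["query"]["pages"]

# "pages" is a dict keyed by page-id strings, so pages[0] raises KeyError.
try:
    pages[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Collect every "original" URL, skipping pages without a thumbnail (assumption).
originals = [
    page["thumbnail"]["original"]
    for page in pages.values()
    if "thumbnail" in page
]
print(lookup_failed, originals)
```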

Doing [0] looks for a key 0, which doesn't exist. Assuming you don't always know the key of the page, try this:
pages = json_object['query']['pages']
for key, value in pages.items():  # this is python3
    original = value['thumbnail']['original']
Otherwise you can simply grab it by the key if you do know (what appears to be) the pageid:
json_object['query']['pages']['76972']['thumbnail']['original']

You can iterate over keys:
for page_no in json_object['query']['pages']:
    page_data = json_object['query']['pages'][page_no]
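If the response only ever contains a single page, a loop is not strictly needed. The sketch below (on a hand-written dict standing in for the parsed response, so the data is illustrative only) takes the first key/value pair with next(iter(...)) without knowing the key in advance.

```python
# Hand-written stand-in for the parsed response; the page id is treated as unknown.
json_object = {
    "query": {
        "pages": {
            "76972": {
                "pageid": 76972,
                "title": "Title",
                "thumbnail": {"original": "https://linktofile.com"},
            }
        }
    }
}

pages = json_object["query"]["pages"]

# Take the first (here: only) key/value pair without knowing the key.
page_no, page_data = next(iter(pages.items()))
original = page_data["thumbnail"]["original"]
print(page_no, original)
```

Note that next() raises StopIteration if "pages" is empty, so this only fits when at least one page is guaranteed.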

Related

Array in a json

I am creating a Python project that works with an API. The API returns JSON like this:
"1": "2018-10-13T08:28:38.809469028Z",
"result": [
{
"id":3027531,
"created_at":"2018-10-13T08:20:38.809469028Z",
"date":"2018-10-13T08:19:38Z",
"text":"banana",
}
],
I can get 1, but I can't get text in result. Can someone help me?
I tried:
response.json()['result']['text']
response.json()['result'].text
response.json().result[0].text
Try this
response_json = response.json()
response_json["result"][0]["text"]
The value for result is a list of dictionaries. We take the first item in that list, and then we ask for the value of text.
Be aware that this assumes that response_json["result"] has at least one item. If it is empty, you will get an IndexError. You should probably check the length of response_json["result"] before using it. Here is an example of how this could be done:
response_json = response.json()
if response_json["result"]:
    print(response_json["result"][0]["text"])
else:
    print("result is empty")
d = {"1": "2018-10-13T08:28:38.809469028Z",
"result": [
{
"id":3027531,
"created_at":"2018-10-13T08:20:38.809469028Z",
"date":"2018-10-13T08:19:38Z",
"text":"banana",
}
]}
d['result'][0]['id'] # 3027531
d['result'][0]['text'] # banana
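The emptiness check can also be wrapped in a small helper. This is only a sketch: first_text is a hypothetical name, and returning None for a missing or empty "result" is an assumption about what the caller wants.

```python
def first_text(payload):
    """Return the "text" of the first item in payload["result"], or None."""
    results = payload.get("result") or []  # missing key or empty list -> []
    if not results:
        return None
    return results[0].get("text")


d = {
    "1": "2018-10-13T08:28:38.809469028Z",
    "result": [
        {
            "id": 3027531,
            "created_at": "2018-10-13T08:20:38.809469028Z",
            "date": "2018-10-13T08:19:38Z",
            "text": "banana",
        }
    ],
}

print(first_text(d))
print(first_text({"result": []}))
```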

Extracting certain value from MongoDB using Python

I have a Mongo database that includes the following collection:
{
"_id": {
"$oid": "12345"
},
"id": "333555",
"token": [
{
"access_token": "ac_33bc",
"expires_in": 3737,
"token_type": "bearer",
"expires_at": {
"$date": "2021-07-02T13:37:28.123Z"
}
}
]
}
In the following Python script I'm trying to return and print only the access_token, but I can't figure out how to do so. I've tried various methods, none of which worked. I've given the "id" as a parameter.
def con_mongo():
    try:
        client = pymongo.MongoClient("mongodb:localhost")
        # DB name
        db = client["db1"]
        # Collection
        coll = db["coll1"]
        # 1st method
        x = coll.find({"id":"333555"},{"token":"access_token"})
        for data in x:
            print(x)
        # 2nd method
        x = coll.find({"id":"333555"})
        tok = x.distinct("access_token")
        #print(x[0])
        for data in tok:
            print(data)
    except Exception:
        logging.info(Exception)
It doesn't work this way. If I replace "access_token" with simply "token" (or remove it), it works, but I get back all the information in the "token" field, where I only need the value of "access_token".
Since access_token is a field of an array element, you need to qualify its name with the name of the array to access its value properly.
Actually you can first extract the whole document and get the desired value through simple list and dict indexing.
So, assuming you are retrieving many documents with that same id:
x = [doc["token"][0]["access_token"] for doc in coll.find({"id":"333555"})]
The list comprehension above creates a list with the access_tokens of all the documents matching the given id.
If you just need the first (and maybe only) occurrence of a document with that id, you can use find_one() instead:
x = coll.find_one({"id":"333555"})["token"][0]["access_token"]
# returns ac_33bc
token is a list so you have to reference the list element, e.g.
x = coll.find({"id":"333555"},{"token.access_token": 1})
for data in x:
    print(data.get('token')[0].get('access_token'))
prints:
ac_33bc
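Because pymongo hands documents back as ordinary dicts, the extraction step can be tried without a database. The sketch below uses a hand-written dict as a stand-in for what find_one() would return; access_tokens is a hypothetical helper name, and tolerating a missing "token" field is an assumption.

```python
def access_tokens(doc):
    """Return every access_token found in the document's "token" array."""
    return [
        entry["access_token"]
        for entry in doc.get("token", [])
        if "access_token" in entry
    ]


# Plain-dict stand-in for a document returned by find_one() (no database needed).
doc = {
    "_id": "12345",
    "id": "333555",
    "token": [
        {"access_token": "ac_33bc", "expires_in": 3737, "token_type": "bearer"}
    ],
}

print(access_tokens(doc))
```

With a real collection, keep in mind that find_one() returns None when nothing matches, so check for that before passing the result to a helper like this.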

How to remove the first and last portion of a string in Python?

How can I cut everything before and including the first [ and everything after and including the last ] from a string (JSON) like this with Python?
{
"Customers": [
{
"cID": "w2-502952",
"soldToId": "34124"
},
...
...
],
"status": {
"success": true,
"message": "Customers: 560",
"ErrorCode": ""
}
}
I want to end up with only
{
"cID" : "w2-502952",
"soldToId" : "34124",
}
...
...
String manipulation is not the way to do this. You should parse your JSON into Python and extract the relevant data using normal data structure access.
obj = json.loads(data)
relevant_data = obj["Customers"]
In addition to @Daniel Rosman's answer, if you want all the lists from the JSON:
result = []
obj = json.loads(data)
for value in obj.values():
    if isinstance(value, list):
        result.extend(value)  # append(*value) fails unless the list has exactly one item
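Here is a self-contained sketch of the parsing approach. The JSON string is a shortened stand-in for the one in the question (the second customer is made up for illustration), and all_items is a hypothetical name.

```python
import json

# Shortened stand-in for the JSON text in the question; the second customer is invented.
data = '''
{
  "Customers": [
    {"cID": "w2-502952", "soldToId": "34124"},
    {"cID": "w2-502953", "soldToId": "34125"}
  ],
  "status": {"success": true, "message": "Customers: 2", "ErrorCode": ""}
}
'''

obj = json.loads(data)

# Direct access when the key is known.
customers = obj["Customers"]

# Generic variant: gather the items of every top-level list.
all_items = []
for value in obj.values():
    if isinstance(value, list):
        all_items.extend(value)

# Serialize just the customers back to text if a string is needed.
print(json.dumps(customers, indent=2))
```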
While I agree that Daniel's answer is the absolute best way to go, if you must use string slicing, you can try .find() and .rfind()
string = data  # however you are loading this JSON text into a string
start = string.find('[') + 1  # just past the first [
end = string.rfind(']')  # position of the last ]
customers = string[start:end]
print(customers)
The output will be everything between the first [ and the last ].
If you really want to do this via string manipulation (which I don't recommend), you can do it this way:
start = s.find('[') + 1
finish = s.rfind(']')  # rfind, because the question asks for the last ]
inner = s[start:finish]
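To see why the position of the last ] matters, here is a small sketch on a made-up string (the "tags" field is hypothetical and is there only to put a nested list inside the outer one). str.find returns the first match and str.rfind returns the last.

```python
# Made-up JSON text with a nested list inside the outer "Customers" list.
s = '{"Customers": [{"cID": "w2-502952", "tags": ["a", "b"]}], "status": {"message": "ok"}}'

start = s.find("[") + 1       # just past the first [
first_close = s.find("]")     # first ], which closes the nested list
last_close = s.rfind("]")     # last ], which closes "Customers"

truncated = s[start:first_close]
inner = s[start:last_close]

print(truncated)
print(inner)
```

The first slice is cut off in the middle of the nested list, while the second contains the whole content of the outer list. Parsing with json.loads avoids this class of problem altogether.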

Accessing Nested Dict from JSON

I'm using requests and JSON to pull some data from an API, and I'm struggling with using a nested dict.
Here is the JSON data:
{"data": [
{
"ContactId": "123",
"EmailAddress": "abc@xyz.com",
"FirstName": null,
"LastName": null,
"ClickDate": "6/6/1966",
"Clicks": "5",
"IPAddress": "1.1.1.1.1",
"UserAgent": "IE8.0",
"UniqueLinksClicked": [
{
"LinkURL": "http://link1.com",
"LinkURL": "http://link2.com",
"LinkURL": "http://link3.com"
}
]
}
]}
I'm able to access all of the ContactID and other 1st level stuff fine, but I can't figure out how to traverse the "LinkURL" stuff.
Here is my python...
result = requests.get(requesturl, headers=headers)
jdata = json.loads(result.content)
for result in jdata["data"]:
    contactID = str([(result["ContactId"])])
    for result in jdata["data"]["UniqueLinksClicked"]: #I'm doing this wrong, but I'm not sure how.
        print(ContactID + " " + str([(result["LinkURL"])]))
The line marked with a comment above generates a TypeError indicating it's a list, where I expected it to be a dict:
list indices must be integers or slices, not str
If instead I drop the ["data"] dereference and try to access "UniqueLinksClicked" on jdata:
for link in jdata["UniqueLinksClicked"]:
I get a key error because the ["UniqueLinksClicked"] is an item inside of the ["data"] dict.
How do I do this correctly?
You can iterate over the links in a nested loop. Do not use the same variable name result in two nested loops! Use a different variable name in the inner loop.
for link in result["UniqueLinksClicked"]:
    print(contactID, link["LinkURL"])
(Moved from question.)
[OP] was confused about the variable naming in the for variable1 in variable2["dict"]: portion. After some help from Håken Lid, [they] figured it out.
It should look like this...
for item in jdata["data"]:
    contactID = str([(item["ContactId"])])
    print(contactID)
    for link in item["UniqueLinksClicked"]:
        print(link["LinkURL"])
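A runnable sketch of the nested loop on hand-written data follows. It assumes the API really sends one object per link; in the sample from the question all three "LinkURL" entries share a single object, and Python's json module keeps only the last value for a repeated key. The name pairs is made up for illustration.

```python
# Hand-written stand-in for the parsed response, with one object per link (assumption).
jdata = {
    "data": [
        {
            "ContactId": "123",
            "UniqueLinksClicked": [
                {"LinkURL": "http://link1.com"},
                {"LinkURL": "http://link2.com"},
            ],
        }
    ]
}

pairs = []
for item in jdata["data"]:  # outer loop: one dict per contact
    contact_id = item["ContactId"]
    # inner loop: a different variable name, iterating this contact's links
    for link in item.get("UniqueLinksClicked", []):
        pairs.append((contact_id, link["LinkURL"]))

for contact_id, url in pairs:
    print(contact_id, url)
```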

How to get json data by name using python

I'm fairly new to JSON and Python, and I want to use the keys and values of a JSON response in a comparison.
I'm getting the JSON from a webpage using the requests library.
Recently, I've done this:
import requests;
URL = 'https://.../update.php';
PARAMS = { 'platform':'0', 'authcode':'code', 'app':'60' };
request = requests.get( url=URL, params=PARAMS );
data = request.json( );
I used this loop to get the keys and values from that json:
for key, value in data.items( ):
    print( key, value );
It returns output like this:
rescode 0
desc success
config {
"app":"14",
"os":"2",
"version":"6458",
"lang":"zh-CN",
"minimum":"5",
"versionName":"3.16.0-6458",
"md5":"",
"explain":"",
"DiffUpddate":[ ]
}
But in Firefox, using pretty print, I get a different result that looks like this:
{
"rescode": 0,
"desc": "success",
"config": "{
\n\t\"app\":\"14\",
\n\t\"os\":\"2\",
\n\t\"version\":\"6458\",
\n\t\"lang\":\"zh-CN\",
\n\t\"minimum\":\"5\",
\n\t\"versionName\":\"3.16.0-6458\",
\n\t\"md5\":\"\",
\n\t\"explain\":\"\",
\n\t\"DiffUpddate\":[\n\t\t\n\t]\n
}"
}
What I'm planning to do is:
if data['config']['version'] == '6458':
    print('TRUE!');
But every time I get this error:
TypeError: string indices must be integers
You need to parse the config, because its value is a JSON string rather than an object:
json.loads(data['config'])['version']
Or edit the PHP to return an associative array rather than a string for the config object
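A small sketch of the situation, with no network call: the response body is rebuilt locally with json.dumps so that "config" holds a JSON string, as in the Firefox view. The body is a shortened, hypothetical stand-in for the real response, and is_target is a made-up name.

```python
import json

# Rebuild a response whose "config" value is itself a JSON-encoded string.
inner = {"app": "14", "version": "6458", "lang": "zh-CN"}
body = json.dumps({"rescode": 0, "desc": "success", "config": json.dumps(inner)})

data = json.loads(body)

# After the first parse, "config" is still a str, so data["config"]["version"]
# would raise "TypeError: string indices must be integers".
config_is_str = isinstance(data["config"], str)

# A second parse turns the string into a dict.
config = json.loads(data["config"])
is_target = config["version"] == "6458"
print(config_is_str, is_target)
```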
