Parsing JSON with Python from URL


I'm trying to get JSON from a URL. The request works and I get the JSON back, but I'm not able to print specific things from it.
request_url = 'http://api.tumblr.com/v2/user/following?limit=1'
r = requests.get(request_url, auth=oauth).json()
r["updated"]
I'm very new to Python. I'm guessing I need to get the JSON into an array, but I have no idea where to even begin.
According to the tumblr api I should be able to get something like this.
{
"meta": {
"status": 200,
"msg": "OK"
},
"response": {
"total_blogs": 4965,
"blogs": [
{
"name": "xxxxx",
"title": "xxxxxx",
"description": "",
"url": "http://xxxxxx.tumblr.com/",
"updated": 1439795949
}
]
}
}
I only need the name, url, and updated; I just have no idea how to separate those out.

Just access the levels one by one.
for i in r["response"]["blogs"]:
    print(i["name"], i["url"], i["updated"])
This code prints the name, url, and updated value of every object inside the blogs list.
To explain how this works:
JSON objects are decoded into dictionaries in Python. Dictionaries are simple key-value pairs. In your example,
r is a dictionary with the following keys:
meta, response
You access the value of a key using r["meta"].
Now meta itself is a dictionary. The keys associated are:
status,msg
So, r["meta"]["status"] gives the status value returned by the request.
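A minimal sketch of inspecting the structure one level at a time. It assumes r holds the sample response from the question; the sample is pasted in as a string here so the snippet runs without a network call or OAuth credentials.

```python
import json

# Sample payload copied from the question. In real code this would be
# the result of requests.get(request_url, auth=oauth).json().
raw = '''
{
  "meta": {"status": 200, "msg": "OK"},
  "response": {
    "total_blogs": 4965,
    "blogs": [
      {"name": "xxxxx", "title": "xxxxxx", "description": "",
       "url": "http://xxxxxx.tumblr.com/", "updated": 1439795949}
    ]
  }
}
'''
r = json.loads(raw)

top_level_keys = sorted(r.keys())
print(top_level_keys)                          # the keys of the outer dictionary
print(type(r["meta"]).__name__)                # a JSON object becomes a dict
print(type(r["response"]["blogs"]).__name__)   # a JSON array becomes a list
print(r["meta"]["status"])                     # one more level down
```

Printing the type and keys at each level is a quick way to see whether the next step needs a key (dict) or an index or loop (list).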

You should be able to print values as though they were nested arrays:
r["response"]["blogs"][0]["updated"] should get you the updated value. Don't try to go straight to it; work your way down one level at a time. Note that blogs is an array, so in a normal case you would work down to r["response"]["blogs"], loop through it, and grab ["updated"] from each item.
Similarly, r["meta"]["msg"] will get you the meta message.

The JSON data is converted to a dict, which your code assigns to r.
To access the value associated with the updated key, you first need to access the levels above it.
First access r["response"], which contains the actual response of the API. From there, access r["response"]["blogs"] and loop through it to find the value of the updated key.
If there is a single blog, you can do something like r["response"]["blogs"][0]["updated"].
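To pull out only the three fields the question asks for, a list comprehension over the blogs list is one option. This is a sketch: blog_summaries is a made-up helper name, and r is a hand-written stand-in for the decoded response.

```python
# Stand-in for the decoded response `r` from the question (trimmed).
r = {
    "meta": {"status": 200, "msg": "OK"},
    "response": {
        "total_blogs": 4965,
        "blogs": [
            {"name": "xxxxx", "title": "xxxxxx", "description": "",
             "url": "http://xxxxxx.tumblr.com/", "updated": 1439795949},
        ],
    },
}


def blog_summaries(payload):
    """Return a (name, url, updated) tuple for every blog in the payload."""
    return [
        (blog["name"], blog["url"], blog["updated"])
        for blog in payload["response"]["blogs"]
    ]


summaries = blog_summaries(r)
for name, url, updated in summaries:
    print(name, url, updated)
```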

Related

Mapping from a specific point in Microsoft Graph API, Python

I've been beating my head against the wall for a couple of days now and can't quite come up with an answer.
Inside of the Microsoft Graph API, when you call for a specific type of email data it returns JSON with a 'value' key at the top level.
{
"@odata.context": "https://graph.microsoft.com/v1.0/$metadata#users('938381cd-71f5-4a3d-a381-0a59a721948a')/messages",
"value": [
{
"@odata.etag": "W/\"CQAAABYAAAD0YZL2mrEQQpwOvq9h/XWNAAACpJ2E\"",
"bccRecipients": [],
"body": {
I'm attempting to dump the JSON into a dict and go into the value key to be able to get to the data I actually need.
print('\nGet user unread messages -> https://graph.microsoft.com/v1.0/me/messages?$filter=isRead ne true')
user_MAIL = session.get(api_endpoint('me/messages?$filter=isRead ne true'))
{len(user_MAIL.text)}\n')
text = json.dumps(user_MAIL.json(), indent=4, sort_keys=True)
The issue I keep running into is I can't figure out how to access the 'value' part. In Javascript I know I could just do something like a .map, but I've attempted several things here and can't seem to find an answer.
All I need is to get into the data under 'value' and list the keys and values of everything within it.

I accidentally found the answer.
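If you do a for loop like this:
# assumed definition; the original snippet did not show where length came from
length = len(json_object['value'])
for key in range(length):
    print(json_object['value'][key]['id'])
    print(28 * ' ', "BREAK")
you can index into the top-level 'value' list and grab exactly the field you want from each item, however many items it holds.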
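Since 'value' is a list of dictionaries, it can also be iterated directly, which is the closest Python equivalent to a JavaScript .map. The sketch below uses a hand-written stand-in for user_MAIL.json(); the ids and subjects are invented for illustration.

```python
# Hypothetical, trimmed-down stand-in for user_MAIL.json().
json_object = {
    "value": [
        {"id": "msg-1", "subject": "First", "isRead": False},
        {"id": "msg-2", "subject": "Second", "isRead": False},
    ],
}

# Iterate over the list itself rather than over range(len(...)).
for message in json_object["value"]:
    for key, value in message.items():   # every key and value of one message
        print(f"{key}: {value}")
    print(28 * " ", "BREAK")

# A list comprehension plays the role of .map when only one field is needed.
ids = [message["id"] for message in json_object["value"]]
print(ids)
```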

Accessing list within a dictionary via Python from a JSON file

I am new to Python and could use some help, please. I'm trying to use Python to read from my JSON file and get the values of the list within a dictionary.
Once I read the JSON into my program, I have:
request body = {
"providerName": "ProviderNameXYZ",
"rateRequestPayments": [
{
"amount": 128.0,
"creditorProfileId": "7539903942457444658",
"debtorProfileId": "2072612712266192555",
"paymentMethodId": "2646780961603748694016",
"currency": "EUR",
"userReference": "INVALID user ref automation single",
"reconciliationId": "343546578753349"
},
{
"amount": 129.0,
"creditorProfileId": "7539903942457444658",
"debtorProfileId": "2072612712266192555",
"paymentMethodId": "2646780961603748694016",
"currency": "EUR",
"userReference": "INVALID user ref automation single",
"reconciliationId": "343546578753340"
}
]
}
I now want to be able to grab the value of any particular key.
I've tried accessing it via several routes:
rateRequestPayments[0].amount
rateRequestPayments()[0].amount
for k, v in request_body.rateRequestPayments(): print(k, v)
for each in request_body.rateRequestPayments: print(each('amount'))
values = eval(request_body['rateRequestPayments']) print(values[0])
All of these end up with errors.
Per the 2nd comment below: request_body['rateRequestPayments'][0]['amount']
This works!
I also want to be able to delete the whole key-value pair ("amount": 128.0,) from the request_body. request_body['rateRequestPayments'][0]['amount'] does not work for this. Not sure how to reference this.
I know I am missing something simple but I'm just unsure what it is. Any help would be greatly appreciated.

Python makes it easy to find data in a nested structure like this. Index into it one level at a time: request_body['rateRequestPayments'][<index of the payment you want>][<key of the value you want>]

First you'll need to put an underscore between requests and body so it's requests_body = {...}.
To access a key-value pair:
requests_body["rateRequestPayments"] #to access the list of two dicts
requests_body["rateRequestPayments"][0] #to access the individual dicts by index
requests_body["rateRequestPayments"][0]["amount"] #to access the value of a particular key (in this case "amount" is the key)
To delete a key value pair, just use the del statement:
del requests_body["rateRequestPayments"][0]["amount"]
This will delete the key-value pair in the variable requests_body, so trying to access this key again will raise a KeyError.
To access and delete a key-value pair, use .pop():
value = requests_body["rateRequestPayments"][0].pop("amount")
#value is now 128.0 and the key-value pair is deleted
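A short sketch of the pop() behaviour on a trimmed copy of the question's data. The second argument to pop() is a default that is returned instead of raising KeyError when the key is absent, which is handy when removing a key from every payment.

```python
# Trimmed copy of the data from the question.
requests_body = {
    "providerName": "ProviderNameXYZ",
    "rateRequestPayments": [
        {"amount": 128.0, "currency": "EUR"},
        {"amount": 129.0, "currency": "EUR"},
    ],
}

first = requests_body["rateRequestPayments"][0]
removed = first.pop("amount")         # returns the value and deletes the pair
missing = first.pop("amount", None)   # key already gone: default is returned

# Remove "amount" from every payment in one pass, whether or not it is present.
for payment in requests_body["rateRequestPayments"]:
    payment.pop("amount", None)

print(removed, missing)
print(requests_body["rateRequestPayments"])
```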

Count unique values in a JSON

I have a json called thefile.json which looks like this:
{
"domain": "Something",
"domain": "Thingie",
"name": "Another",
"description": "Thing"
}
I am trying to write a python script which would make a set of the values in domain. In this example it would return
{'Something', 'Thingie'}
Here is what I tried:
import json
with open("thefile.json") as my_file:
    data = json.load(my_file)
ids = set(item["domain"] for item in data.values())
print(ids)
I get the error message
unique_ids.add(item["domain"])
TypeError: string indices must be integers
Having looked up answers on stack exchange, I'm stumped. Why can't I have a string as an index, seeing as I am using a json whose data type is a dictionary (I think!)? How do I get it so that I can get the values for "domain"?

So, to start, you can read more about JSON formats here: https://www.w3schools.com/python/python_json.asp
Second, dictionaries must have unique keys. Therefore, having two keys named domain is incorrect. You can read more about python dictionaries here: https://www.w3schools.com/python/python_dictionaries.asp
Now, I recommend the following two designs that should do what you need:
Multiple Names, Multiple Domains: In this design, you can access websites and check the domain of each of its values like ids = set(item["domain"] for item in data["websites"])
{
"websites": [
{
"domain": "Something.com",
"name": "Something",
"description": "A thing!"
},
{
"domain": "Thingie.com",
"name": "Thingie",
"description": "A thingie!"
}
]
}
One Name, Multiple Domains: In this design, each website has multiple domains that can be accessed using JVM_Domains = set(data["domains"])
{
"domains": ["Something.com","Thingie.com","Stuff.com"],
"name": "Me Domains",
"description": "A list of domains belonging to Me"
}
I hope this helps. Let me know if I missed any details.

Your JSON has a problem: duplicate keys. I am not sure whether that is forbidden, but it is certainly badly formed.
Besides that, it is going to cause you a lot of problems.
A dictionary cannot have duplicate keys; what would a lookup on a duplicated key return?
So, fix your JSON, something like this,
{
"domain": ["Something", "Thingie"],
"name": "Another",
"description": "Thing"
}
A good format almost solves your problem by itself (you can have duplicates in the list) :)
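For what it is worth, Python's json module does not raise an error on a repeated key; it keeps the last value. The TypeError in the question comes from iterating data.values(), which yields the string values themselves, so item["domain"] ends up indexing a string. If the file cannot be changed, one possible workaround is the object_pairs_hook parameter of json.loads (and json.load), which receives every key/value pair including the duplicates. A sketch, with collect_domains as a made-up helper name:

```python
import json

# Contents of thefile.json from the question, inlined as a string.
raw = '''
{
  "domain": "Something",
  "domain": "Thingie",
  "name": "Another",
  "description": "Thing"
}
'''

# Default behaviour: the later "domain" silently overwrites the earlier one.
plain = json.loads(raw)


def collect_domains(pairs):
    """Hypothetical hook: gather every value stored under the key 'domain'."""
    return {value for key, value in pairs if key == "domain"}


# The hook is called with a list of (key, value) tuples for each JSON object.
domains = json.loads(raw, object_pairs_hook=collect_domains)

print(plain["domain"])
print(domains)
```

Fixing the file as suggested above is still the cleaner option; the hook is only a way to cope with input you do not control.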

How to convert a list of dictionaries to JSON in Python / Django?

I searched on Google and found an answer but it's not working for me. I have to send a list as JsonResponse in Django, similar to this:
list_to_json =[{"title": "hello there",
"link": "www.domain.com",
"date": ...},
{},{},{},...]
I am converting this to JSON by applying StackOverflow question1 and question2 but it's not working for me. I get the following error:
In order to allow non-dict objects to be serialized set the safe parameter to False
Here's my code:
def json_response(request):
    list_to_json=[{"title": ..., "link": ..., "date": ...},{...}]
    return JsonResponse(json.dumps(list_to_json) )

return JsonResponse(list_to_json, safe=False)
Take a look at the documentation:
The safe boolean parameter defaults to True. If it’s set to False, any object can be passed for serialization (otherwise only dict instances are allowed). If safe is True and a non-dict object is passed as the first argument, a TypeError will be raised.

Adding this answer for anyone wondering why this isn't "safe" by default. Packing a non-dict data structure into a response makes the service vulnerable to a pre-ES5 JSON Hijacking attack.
Basically, with the JsonResponse you're using here, if a user is authenticated to your site, he can now retrieve that list of {title, link, date} objects and that's fine. However, an attacker could include that endpoint as a script source on his own malicious page (cross-site script inclusion, aka XSSI):
<script src="https://www.yourwebsite.com/secretlinks/"></script>
Then, if an unsuspecting authenticated user navigates to the malicious page, the browser will unknowingly request the array of data from your site. Since your service is just returning an unassigned array, the attacker must also poison the js Array constructor (this is the part of the attack that was fixed in ES5). Before ES5, the attacker could simply override the Array constructor like so:
Array = function() {secret = this;}
Now secret contains your list of dictionaries, and is available to the rest of the attacker's script, where he can send it off to his own server. ES5 fixed this by forcing the use of brackets to be evaluated by the default Array constructor.
Why wasn't this ever an issue for dictionary objects? Because a script that starts with a curly bracket is parsed by JavaScript as a block statement, not as an object literal. A top-level JSON object is therefore not evaluated as data (it normally fails with a syntax error), so the attacker has no way to capture its contents.
More info here: https://security.stackexchange.com/questions/159609/how-is-it-possible-to-poison-javascript-array-constructor-and-how-does-ecmascrip?newreg=c70030debbca44248f54cec4cdf761bb

You have to include serializers, or you can pass safe=False with your response data.
Like this:
return JsonResponse(list_to_json, safe=False)

This is not a valid dictionary:
{"title": , "link" : , "date": }
because the values are missing.
If you try adding the missing values instead, it works fine:
>>> json.dumps([{"title": "hello there", "link": "www.domain.com", "date": 2016}, {}])
'[{"link": "www.domain.com", "date": 2016, "title": "hello there"}, {}]'

Multiple FOR loops in iterating over dictionary in Python

This is a simplistic example of a dictionary created by a json.load that I have to deal with:
{
"name": "USGS REST Services Query",
"queryInfo": {
"timePeriod": "PT4H",
"format": "json",
"data": {
"sites": [{
"id": "03198000",
"params": "[00060, 00065]"
},
{
"id": "03195000",
"params": "[00060, 00065]"
}]
}
}
}
Sometimes there may be 15-100 sites with unknown sets of parameters at each site. My goal is to either create two lists (one storing "site" IDs and the other storing "params") or a much simplified dictionary from this original dictionary. Is there a way to do this using nested for loops with key, value pairs using the iteritems() method?
What I have tried so far is this:
queryDict = {}
for key,value in WS_Req_dict.iteritems():
    if key == "queryInfo":
        if value == "data":
            for key, value in WS_Req_dict[key][value].iteritems():
                if key == "sites":
                    siteVal = key
                    if value == "params":
                        paramList = [value]
queryDict["sites"] = siteVal
queryDict["sites"]["params"] = paramList
I run into trouble getting the second FOR loop to work. I haven't looked into pulling out lists yet.
I think this may be an overall stupid way of doing it, but I can't see around it yet.

I think you can make your code much simpler by just indexing, when feasible, rather than looping over iteritems.
for site in WS_Req_dict['queryInfo']['data']['sites']:
    queryDict[site['id']] = site['params']
If some of the keys might be missing, dict's get method is your friend:
for site in WS_Req_dict.get('queryInfo',{}).get('data',{}).get('sites',[]):
would let you quietly ignore missing keys. But, this is much less readable, so, if I needed it, I'd encapsulate it into a function -- and often you may not need this level of precaution! (Another good alternative is a try/except KeyError encapsulation to ignore missing keys, if they are indeed possible in your specific use case).
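The two precautions mentioned above (chained get calls and try/except KeyError) could be wrapped in functions roughly as follows. The function names are invented for this sketch, and the sample dictionary is a trimmed copy of the one in the question.

```python
def site_params(ws_req_dict):
    """Map each site id to its params, quietly tolerating missing levels."""
    sites = ws_req_dict.get("queryInfo", {}).get("data", {}).get("sites", [])
    return {site["id"]: site["params"] for site in sites}


def site_params_strict(ws_req_dict):
    """Same result, but using try/except KeyError instead of chained get()."""
    try:
        sites = ws_req_dict["queryInfo"]["data"]["sites"]
    except KeyError:
        return {}
    return {site["id"]: site["params"] for site in sites}


# Trimmed copy of the dictionary from the question.
WS_Req_dict = {
    "name": "USGS REST Services Query",
    "queryInfo": {
        "timePeriod": "PT4H",
        "format": "json",
        "data": {
            "sites": [
                {"id": "03198000", "params": "[00060, 00065]"},
                {"id": "03195000", "params": "[00060, 00065]"},
            ]
        },
    },
}

query_dict = site_params(WS_Req_dict)

# The two separate lists the question also asked about.
site_ids = list(query_dict.keys())
param_list = list(query_dict.values())
print(site_ids, param_list)
```

Both helpers return an empty dictionary when a level is missing; whether silently ignoring that is appropriate depends on the use case.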
