I am very new to APIs. I am trying to get the response of a requests.post call as JSON or a dictionary. I get a status_code of 200, so I know the request succeeded, but response.text returns everything as a single string. I have read parts of the Quickstart guide for Requests, but the examples I found only seem to use .text to extract the data. Ideally, the output for this particular API would be a JSON file or a dictionary I can work with.
What I have so far (I get this is not a full reproducible example, but I think it gets the point across, otherwise refer to here for some examples):
import pandas as pd
import requests
response = requests.post(
    url=request_url,
    headers=headers,
    json=body,
)
response.text # returns a string
response.json # returns a method
pd.json_normalize(response.text) #throws an error that pandas does not have this attribute (which it does, idk why not)
pd.read_json(response.text) #somewhat workable dataframe.
pd.read_json() gets me somewhere, but the result is an object stored in a single cell of a DataFrame, which does not feel like the right route to go down.
Based on John Gordon's comment above, you can do the following
data = response.json()
Then, with from pandas.io.json import json_normalize (or pd.json_normalize directly in pandas 1.0 and later), you can also do
df = json_normalize(data)
This will convert the response into a pandas dataframe.
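As a rough illustration of the difference between response.text and response.json(), here is a minimal sketch that does not touch the network. The payload and its field names (status, items, id, name) are made up for the example, and json.loads stands in for what response.json() does with the body.

```python
import json

# Hypothetical body, standing in for response.text from the API.
raw_text = '{"status": "ok", "items": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}'

# response.json() parses the body much like json.loads(response.text) does.
data = json.loads(raw_text)

print(type(raw_text).__name__)  # str
print(type(data).__name__)      # dict

# Once parsed, nested values are ordinary dict and list lookups.
rows = data["items"]
names = [row["name"] for row in rows]
print(names)  # ['a', 'b']
```

A list of flat dictionaries like rows is the kind of input that json_normalize(rows) or pd.DataFrame(rows) turns into one row per item.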
I am a new programmer and I'm learning the requests module. I'm stuck because I don't know how to get a specific part of a JSON response. I think it's called a header, or maybe it's the thing inside a header? I'm not sure. The API returns simple JSON. This is the API:
https://mcapi.us/server/status?ip=mc.hypixel.net
As an example, let's say the API returns this JSON:
{"status":"success","online":true}
If I wanted to get the "online" value, how would I do that?
This is the code I'm currently working with:
import requests
def main():
    ask = input("IP : ")
    response = requests.get('https://mcapi.us/server/status?ip=' + ask)
    print(response.content)

main()
To be honest, I don't even know if this is JSON. I think it is, but the API page mentions CORS. If it isn't, I'm sorry.
In your example, the response is a dictionary with the key "online".
You need to parse it first with .json(); after that, you can read any value as dict[key].
In your case:
response = requests.get('https://mcapi.us/server/status?ip=' + ask).json()
print(response["online"])
or, if the value you want is stored under a "content" key:
response = requests.get('https://mcapi.us/server/status?ip=' + ask).json()
print(response["content"])
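A slightly more defensive version of the same idea, as a sketch only: fetch_status and is_online are hypothetical helper names, and passing the IP through params= (which lets requests build the query string) is a choice made here, not something the API is known to require. The sample payload is the one from the question.

```python
def fetch_status(ip):
    """Request the status for `ip` and return the parsed JSON as a dict."""
    import requests  # imported here so is_online() can be tried without requests installed

    response = requests.get(
        "https://mcapi.us/server/status",
        params={"ip": ip},  # requests appends ?ip=... to the URL
        timeout=10,
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.json()


def is_online(payload):
    """Return the "online" flag, or False if the key is missing."""
    return bool(payload.get("online", False))


# Sample payload from the question, already parsed into a dict.
sample = {"status": "success", "online": True}
print(is_online(sample))  # True
```

With a live connection, is_online(fetch_status("mc.hypixel.net")) would combine the two steps.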
https://www.apple.com/covid19/mobility
import requests
from bs4 import BeautifulSoup
source = requests.get("https://www.apple.com/covid19/mobility")
soup = BeautifulSoup(source.text, "lxml")
I'm currently trying to get the URL behind the All Data CSV button, which can be found by inspecting the element. requests.get doesn't seem to return the full body with all of the elements.
Use the following endpoint, which returns the data in JSON format.
https://covid19-static.cdn-apple.com/covid19-mobility-data/current/v1/index.json
Then use the keys in the response to build the URL.
Code:
url='https://covid19-static.cdn-apple.com/covid19-mobility-data/current/v1/index.json'
data=requests.get(url).json()
print("https://covid19-static.cdn-apple.com"+data['basePath'] +data['regions']['en-us']['csvPath'])
Output:
https://covid19-static.cdn-apple.com/covid19-mobility-data/2006HotfixDev17/v1/en-us/applemobilitytrends-2020-04-25.csv
To get the CSV data in JSON format, try this endpoint:
url='https://covid19-static.cdn-apple.com/covid19-mobility-data/2006HotfixDev17/v1/en-us/applemobilitytrends.json'
data=requests.get(url).json()
print(data)
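The URL-building step can be tried without the network, as in the sketch below. build_csv_url is a hypothetical helper name, and sample_index is an assumed stand-in for index.json: the exact split between basePath and csvPath is a guess, chosen so that the result matches the output shown above.

```python
BASE = "https://covid19-static.cdn-apple.com"


def build_csv_url(index):
    """Join the host, basePath and the en-us csvPath from index.json."""
    return BASE + index["basePath"] + index["regions"]["en-us"]["csvPath"]


# Assumed sample of index.json; only the keys used in the answer are included.
sample_index = {
    "basePath": "/covid19-mobility-data/2006HotfixDev17/v1",
    "regions": {
        "en-us": {"csvPath": "/en-us/applemobilitytrends-2020-04-25.csv"}
    },
}

csv_url = build_csv_url(sample_index)
print(csv_url)
```

With a live response, build_csv_url(requests.get(url).json()) would do the same thing. Since the path contains a build name and a date, it presumably changes over time, so reading it from index.json avoids hard-coding it.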
I am learning to work with the requests library and pandas, but I have been struggling to get past the starting point, even with a good number of examples online.
I am trying to extract NBA shot data from the URL below using a GET request, and then turn it into a DataFrame:
def extractData():
    Harden_data_url = "https://stats.nba.com/events/?flag=3&CFID=33&CFPARAMS=2017-18&PlayerID=201935&ContextMeasure=FGA&Season=2017-18&section=player&sct=hex"
    response = requests.get(Harden_data_url)
    data = response.json()
    shots = data['resultSets'][0]['rowSet']
    headers = data['resultSets'][0]['headers']
    df = pandas.DataFrame.from_records(shots, columns = headers)
However, I get this error starting on line 2, "response = requests.get(url)":
ValueError: No JSON object could be decoded
I imagine I am missing something basic, any debugging help is appreciated!
The problem is that you are using the wrong URL for fetching the data.
The URL you used returns the HTML page, which is in charge of the layout of the site. The data comes from a different URL, which returns it in JSON format.
The correct URL for the data you are looking for is this:
https://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2017-18&ContextMeasure=FGA&DateFrom=&DateTo=&EndPeriod=10&EndRange=28800&GameID=&GameSegment=&GroupQuantity=5&LastNGames=0&LeagueID=00&Location=&Month=0&OnOff=&OpponentTeamID=0&Outcome=&PORound=0&Period=0&PlayerID=201935&PlayerPosition=&RangeType=0&RookieYear=&Season=2017-18&SeasonSegment=&SeasonType=Regular+Season&StartPeriod=1&StartRange=0&TeamID=0&VsConference=&VsDivision=
If you open it in a browser, you will see only the raw JSON data, which is exactly what your code will receive, so response.json() can parse it and the rest of your code works.
This blog post explains the method to find the data URL, and although the API has changed a little since the post was written, the method still works:
http://www.gregreda.com/2015/02/15/web-scraping-finding-the-api/
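One way to spot this kind of problem sooner is to attempt the parse yourself and look at the start of the body when it fails. The sketch below uses only the standard library; parse_body and rows_to_records are hypothetical helper names, and the sample payload reuses the resultSets / headers / rowSet layout from the question with made-up column names and values.

```python
import json


def parse_body(text):
    """Return parsed JSON, or None (with a hint printed) if the body is not JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        print("Not JSON; body starts with:", text[:40])
        return None


def rows_to_records(data):
    """Pair each row in the first result set with its column headers."""
    result = data["resultSets"][0]
    return [dict(zip(result["headers"], row)) for row in result["rowSet"]]


html_body = "<!DOCTYPE html><html><head></head></html>"
json_body = json.dumps({
    "resultSets": [{
        "headers": ["PLAYER", "MADE"],   # made-up column names
        "rowSet": [["A", 1], ["B", 0]],  # made-up rows
    }]
})

print(parse_body(html_body))  # prints the hint, then None
records = rows_to_records(parse_body(json_body))
print(records)
```

A list of dictionaries like records can be passed straight to pandas.DataFrame(records).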
I tried to scrape Twitter followers using the requests library. In the end, I saved the response for the required page as JSON and then tried to search it for the parts I need. The question is: how do I find the required elements in the JSON object?
My code is:
s = requests.session()
res = s.post("https://twitter.com/sessions", data=payload, headers=headers)
r = s.get("https://twitter.com/akhiltaker619/following/users?include_available_features=1&include_entities=1&max_position=1590310744326457266&reset_error_state=false")
dp = r.text
dp1 = json.loads(dp)
x = json.dumps(dp1)
print(res.status_code)
soup = BeautifulSoup(x, "html.parser")
x1 = soup.find_all("b", {"class": "u-linkComplex-target"})
for i in x1:
    print(i.text)
The parsing part at the end is wrong, because I am trying to scrape a JSON object with Beautiful Soup, which is not possible. When I print the JSON object, I get this:
(The attached link contains the output of the JSON object.)
From this object, I want the elements with class "u-linkComplex-target" that appear in the "item_html" field. How do I get them? Or is there a way to get the same content without going through the JSON object? (The content is the followers list page on Twitter.) I used JSON in order to load the dynamic content of the page.
The Beautiful Soup library is for parsing HTML and similar tagged languages, not JSON.
If your requests return JSON responses then you should call the r.json() method. This will return a dictionary of the JSON structure. Suppose you used
j = r.json()
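then you probably want something like j['item_html'], which, going by your description, holds an HTML string that you can then pass to Beautiful Soup to find the u-linkComplex-target elements. If you explore the dictionary interactively you will probably find what you want.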
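The two-step idea (parse the JSON first, then parse the HTML string stored inside it) can be sketched as follows. To keep the example free of extra packages, it uses the standard library's html.parser in place of Beautiful Soup; TargetTextParser is a hypothetical class name, and the payload is a made-up stand-in that only assumes an "item_html" key as described in the question.

```python
import json
from html.parser import HTMLParser


class TargetTextParser(HTMLParser):
    """Collect the text inside <b class="u-linkComplex-target"> tags."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "b" and "u-linkComplex-target" in classes:
            self._inside = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "b":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.names[-1] += data


# Made-up response body: JSON whose "item_html" value is an HTML fragment.
raw = json.dumps({
    "item_html": '<div><b class="u-linkComplex-target">alice</b> '
                 '<b class="u-linkComplex-target">bob</b></div>'
})

payload = json.loads(raw)           # step 1: JSON -> dict (what r.json() returns)
parser = TargetTextParser()
parser.feed(payload["item_html"])   # step 2: parse the embedded HTML string
print(parser.names)  # ['alice', 'bob']
```

With Beautiful Soup installed, step 2 would instead be BeautifulSoup(payload["item_html"], "html.parser").find_all("b", {"class": "u-linkComplex-target"}).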
All,
I wrote the following code:
import requests,bs4
res=requests.get('http://itunes.apple.com/lookup?id=551798799')
res.raise_for_status()
wwe=bs4.BeautifulSoup(res.text)
print wwe.select('p averageUserRating')
If I only do print wwe.select('p'), the code works, but it prints everything in the list. However, when I run the code above, it throws an error saying the selector is unsupported.
I am basically only trying to return the averageUserRating value (which is 4.0).
Thanks for the help!
The content of that response isn't HTML, which is what BeautifulSoup is designed to read; it's a different data format called JSON. Thankfully, the requests library makes it really easy to parse JSON: if you call .json() on a response, it parses the body into a dictionary. You need to access averageUserRating, which is inside the first element of the results list, so you can use this to access what you need:
>>> data = res.json()
>>> data["results"][0]["averageUserRating"]
4.0
To modify your existing code:
import requests
res=requests.get('http://itunes.apple.com/lookup?id=551798799')
res.raise_for_status()
data = res.json()
print(data["results"][0]["averageUserRating"])
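If the lookup might return no results, indexing with [0] raises an IndexError. A small guard, sketched below, avoids that; average_rating is a hypothetical helper name, and sample is a trimmed, made-up stand-in for res.json() that keeps only the keys used above.

```python
def average_rating(payload):
    """Return averageUserRating from the first result, or None if absent."""
    results = payload.get("results", [])
    if not results:
        return None
    return results[0].get("averageUserRating")


# Trimmed stand-in for res.json(); the real response has more fields.
sample = {"results": [{"averageUserRating": 4.0}]}

print(average_rating(sample))           # 4.0
print(average_rating({"results": []}))  # None
```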