I'm trying to pull data from an API for a fantasy football project I'm working on. You can pull data on various players using the url: 'https://fantasy.premierleague.com/api/element-summary/i/' where i is a number that relates to a player and runs from 1 to 400 or so.
I wrote code that pulls this data for a specific player and stores it as a dataframe for future analysis using the code:
import pandas as pd
import json
import requests
from pandas.io.json import json_normalize
r = requests.get('https://fantasy.premierleague.com/api/element-summary/1/')
r_json = r.json()
r_df = pd.DataFrame(r_json['history'])
r_df.head()
And this works great. The issue is it's only for 1 player and there are lots of them, so what I want is a DataFrame that contains this data for all of the players. I figured I could use a for loop for this but I can't get it to work. I'm trying the code:
import pandas as pd
import json
import requests
from pandas.io.json import json_normalize
for i in range(5):
    r = requests.get('https://fantasy.premierleague.com/api/element-summary/{}/'.format(i))
    r_json = r.json()
    r_df = pd.DataFrame(r_json['history'])
    r_df.head()
Where I've put the logic in a for loop, but I get the error:
KeyError: 'history'
When I try to run this. Why does it not like the line r_df= pd.DataFrame(r_json['history']) when it's in a for loop when it's OK outside of one?
Any help appreciated!
Thanks!
This is because your loop is trying to query 'https://fantasy.premierleague.com/api/element-summary/0/' on the first iteration, which doesn't exist. (Or, rather, gives you the JSON {"detail": "Not found."}, which doesn't have a "history" key.)
The built-in range function generates integers starting from zero by default. Try changing the loop range to range(1, 5) instead. (Also note that the end of the range is exclusive, so this will give you integers from 1 to 4.)
A KeyError indicates that the dictionary does not have this specific key.
Your specific problem here is that range(5) generates the integers 0, 1, 2, 3, 4: if you don't specify an initial value, it starts at 0 and stops one short of the final value.
And so the first request goes to https://fantasy.premierleague.com/api/element-summary/0/, which has no player data, so the JSON that comes back has no 'history' key.
To fix this change range(5) to range(1, 6).
Note: the syntax of the range() function is: range(starting_point, ending_point, step)
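Putting both answers together, a minimal sketch of the full loop might look like the following. It assumes that a missing player id returns JSON without a 'history' key (as the {"detail": "Not found."} response suggests). The names collect_history and fetch are hypothetical; fetch stands in for whatever performs the request, so the demo runs on canned data instead of the network.

```python
def collect_history(player_ids, fetch):
    """Gather every 'history' row for the given player ids.

    fetch is any callable that takes a player id and returns the decoded
    JSON (a dict) for that player; it is a hypothetical stand-in for the
    requests.get(...).json() call in the question.
    """
    rows = []
    for player_id in player_ids:
        payload = fetch(player_id)
        # Ids that do not exist come back without a 'history' key,
        # so skip them instead of raising KeyError.
        if "history" not in payload:
            continue
        rows.extend(payload["history"])
    return rows


# Canned responses standing in for the API (made-up values, for illustration only).
fake_api = {
    1: {"history": [{"element": 1, "total_points": 2}]},
    2: {"history": [{"element": 2, "total_points": 6},
                    {"element": 2, "total_points": 1}]},
}


def fake_fetch(player_id):
    return fake_api.get(player_id, {"detail": "Not found."})


# range(0, 4) deliberately includes the missing ids 0 and 3.
demo_rows = collect_history(range(0, 4), fake_fetch)
print(len(demo_rows))
```

With the real API, fetch could be lambda i: requests.get('https://fantasy.premierleague.com/api/element-summary/{}/'.format(i)).json(), and pd.DataFrame(demo_rows) would turn the combined rows into a single DataFrame.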
Related
I'm learning and would appreciate any help with this code.
The issue is printing the values in the data that are contained in one line of the JSON using Python.
import json
import requests
data = json.loads(response.text)
print(len(data)) #showing correct value
# Where I'm going wrong: the code below obviously prints the first value and then the second, because it's indexed. Question: how do I print all values with separate print statements when the total number of items is unknown?
for item in data:
print(data[0]['full_name'])
print(data[1]['full_name'])
I tried without the index value, but this gave me the first value multiple times, depending on the length.
I expect to be able to access from the JSON file each indexed value separately even though they are named the same thing "full_name" for example.
import json
import requests
data = json.loads(response.text)
print(len(data))  # showing correct value
for item in data:
    print(item['full_name'])
# Hard-coded indexes (Python indexes start at 0) only reach those specific items,
# and raise IndexError if the list is shorter than expected.
print(data[0]['full_name'])
print(data[1]['full_name'])
Hope this helps.
Presuming data is a list of dictionaries, where each dictionary contains a full_name key:
for item in data:
    print(item['full_name'])
This code sample from your post makes no sense:
for item in data:
print(data[0]['full_name'])
print(data[1]['full_name'])
Firstly it's a syntax error because there is nothing indented underneath the loop.
Secondly it's a logic error, because the loop variable is item but then you never refer to that variable.
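A small sketch of the loop-variable approach, assuming data is a list of dictionaries that each have a full_name key. The sample names are made up and stand in for the parsed response.

```python
# Made-up sample standing in for json.loads(response.text).
data = [
    {"full_name": "Ada Lovelace"},
    {"full_name": "Alan Turing"},
    {"full_name": "Grace Hopper"},
]

# enumerate() supplies the position, so no index is hard-coded and the
# loop works for a list of any length.
lines = []
for position, item in enumerate(data):
    lines.append("{}: {}".format(position, item["full_name"]))

print("\n".join(lines))
```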
I am trying to get a full database of all active players career stats in the NBA.
I'm relatively new to Python and am trying to figure out a way to iterate a loop function by looking up the playerID for each PlayerCareerStat data frame. Ultimately I will summarize and group the data so it's easier to read, but I am trying to return a list of all players' career stats by season.
I am able to use the players.get_active_players() endpoint to return a list of all players with their player_id: [1]: https://i.stack.imgur.com/BFxgv.png
With that, I am trying to loop the player_id through each data frame in the PlayerCareerStats() endpoint ... I think? Since the parameter for this endpoint requires a single player_id, I can't seem to get all the players. Please see picture [1]: https://i.stack.imgur.com/skM8Y.png
Does anyone know how I might be able to get the output I am trying to find?
When you create a function, it will end when it reaches return. So your function returns the player id, and doesn't move on to the part where you are trying to pull the career stats.
So get rid of that return player['id'], and the function should run all the way through:
from nba_api.stats.static import players
from nba_api.stats.endpoints import playercareerstats
import pandas as pd
def get_nba_id():
    nba_players = players.get_active_players()
    for player in nba_players:
        career_stats = playercareerstats.PlayerCareerStats(player_id=player['id'])
        career_df = career_stats.get_data_frames()[0]
        career_df['NAME'] = player['full_name']
        print(career_df)
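To see why an early return cuts the loop short, here is a tiny stand-alone illustration. The player records are made up, and first_id_only and all_ids are hypothetical names, not part of nba_api.

```python
# Made-up records shaped like the dicts returned for each player.
sample_players = [
    {"id": 101, "full_name": "Player A"},
    {"id": 102, "full_name": "Player B"},
    {"id": 103, "full_name": "Player C"},
]


def first_id_only(player_list):
    for player in player_list:
        # return exits the whole function on the first pass through the loop.
        return player["id"]


def all_ids(player_list):
    ids = []
    for player in player_list:
        ids.append(player["id"])
    # Returning after the loop lets every player be processed first.
    return ids


print(first_id_only(sample_players))
print(all_ids(sample_players))
```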
I am making a price statistics project with Python, and I have a problem with scraping data from an API. The API is https://www.rolimons.com/api/activity
I want to get prices from the API, which are the last 2 values from one block.
For example, from [1588247532, 0, "1028606", 464, 465] I would need 464 and 465 only. Also I want to do this for all tables.
How can I do that? Here is the code I have so far:
import requests
import json
r = requests.get('https://www.rolimons.com/api/activity')
content = json.loads(r.content.decode())
for key, value in content.items():
    print(key)
Give this a go:
for value in content['activities']:
    print(value[-2:])
It iterates through activities and prints the last two items of each value.
Or you can collect the prices in a separate list to use later on like so:
prices = [value[-2:] for value in content['activities']]
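As a sketch of the same idea on the sample row from the question, assuming each entry in content['activities'] is a list whose last two items are the prices. The names last_two and sample_activities are hypothetical, and the second row is made up.

```python
def last_two(rows):
    """Return the final two items of each row as a tuple."""
    pairs = []
    for row in rows:
        # Extended unpacking: ignore everything except the last two items.
        *_, first_price, second_price = row
        pairs.append((first_price, second_price))
    return pairs


sample_activities = [
    [1588247532, 0, "1028606", 464, 465],    # row quoted in the question
    [1588247540, 0, "1029025", 1200, 1187],  # made-up row for illustration
]

print(last_two(sample_activities))
```

Unpacking gives each price its own name, at the cost of raising ValueError for any row with fewer than two items; slicing with [-2:] is more forgiving.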
I recommend using print statements whenever you are not sure how the data is structured. See below; it might help give a visual of what is going on.
import requests
import json
r = requests.get('https://www.rolimons.com/api/activity')
content = json.loads(r.content.decode())
for key, value in content.items():
    print("Key: ", key)
    print("content[key]: ", content[key])
for array in content["activities"]:
    print("array: ", array)
    print("array[len(array)-1]:", array[len(array)-1])
    print("array[len(array)-2]:", array[len(array)-2])
I am new to Python and learning. As part of my learning, I am trying to do an API integration. I am getting a result, but it's limited to 100 records, while totalresults is around 7000. Is there a way I can call the API multiple times to bring back the entire result in CSV format? I am adding my code below and am not sure how to proceed further.
import requests
import pandas as pd
resp = requests.get('apipath' + '?company=XXXX', auth=('XXXXX', 'XXXXXX'))
dataframe = resp.json()
dataset = pd.DataFrame(dataframe["items"]).to_csv('dict_file.csv', header=True)
Please help.
You'll need to check the API documentation, but generally there will be a parameter such as "maxResults" (or similar) that you can add to the URL to retrieve more than the default number of results.
Your request (after modifying the query string in the URL) would look something like this:
resp = requests.get('apipath' + '?company=XXXX&maxResults=1000', auth=('XXXXX', 'XXXXXX'))
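If the API caps the page size, raising the limit will not be enough and the request has to be repeated page by page. The sketch below shows that loop under loud assumptions: the offset and limit arguments, the 'items' key holding each page's records (the question's code reads that key), and the names fetch_all and fetch_page are all hypothetical, so check the API documentation for the real parameter names.

```python
def fetch_all(fetch_page, page_size=100):
    """Collect items from successive pages until a short or empty page.

    fetch_page(offset, limit) is a hypothetical callable returning the
    decoded JSON for one page, e.g. a thin wrapper around requests.get.
    """
    items = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        batch = page.get("items", [])
        items.extend(batch)
        # A page shorter than page_size means there is nothing left.
        if len(batch) < page_size:
            break
        offset += page_size
    return items


# Fake backend with 250 records, standing in for the real API.
records = [{"id": n} for n in range(250)]


def fake_page(offset, limit):
    return {"items": records[offset:offset + limit]}


all_items = fetch_all(fake_page)
print(len(all_items))
```

The combined list can then be written out once, for example with pd.DataFrame(all_items).to_csv('dict_file.csv', header=True).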
This script is meant to parse Bloomberg finance data to find the GBP value during the day. The following script does that; however, what it returns looks like this:
{'dateTime': '2017-01-17T22:00:00Z', 'value': 1.6406}
I don't want the dateTime or value text there, and I don't know how to get rid of it. When I try, I get errors like this: list index out of range.
Any answers will be greatly appreciated. Here is the script (in Python 3):
import urllib.request
import json
htmltext = urllib.request.urlopen('https://www.bloomberg.com/markets/api/bulk-time-series/price/GBPAUD%3ACUR?timeFrame=1_DAY').read().decode('utf8')
data = json.loads(htmltext)
datapoints = data[1]['price']
print(datapoints)
This should work for you.
print (data[0]['price'][-1]['value'])
EDIT: To get all the values,
for data_row in data[0]['price']:
    print(data_row['value'])
EXPLANATION: data[0] gets the first and only element of the list, which is a dict. ['price'] gets the list corresponding to the price key. [-1] gets the last element of the list, which is presumably the data you'll be looking for as it's the latest data point.
Finally, ['value'] gets the value of the currency conversion from the dict we obtained earlier.
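The indexing chain can be tried on a small stand-in for the response. The structure below is an assumption based on the output quoted in the question (a list containing one dict whose 'price' key holds dateTime/value records); only the last record is taken from the question, and the first is made up.

```python
# Stand-in for json.loads(htmltext); shape assumed from the question's output.
data = [
    {
        "price": [
            {"dateTime": "2017-01-17T21:55:00Z", "value": 1.6399},  # made-up record
            {"dateTime": "2017-01-17T22:00:00Z", "value": 1.6406},  # from the question
        ]
    }
]

# Latest value only: first list element -> 'price' list -> last record -> 'value'.
latest = data[0]["price"][-1]["value"]

# Every value, without the dateTime field.
values = [row["value"] for row in data[0]["price"]]

print(latest)
print(values)

# data[1] does not exist in a one-element list, which is why the
# question's data[1]['price'] raises "list index out of range".
```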