I have a question about rate limits.
I take data from a CSV, put it into the query, and store the output in a list.
I get an error because I make too many requests at once (I can only make 20 requests per second). How can I stay under the rate limit?
import requests
import pandas as pd

df = pd.read_csv("Data_1000.csv")
list = []

def requestSummonerData(summonerName, APIKey):
    URL = "https://euw1.api.riotgames.com/lol/summoner/v3/summoners/by-name/" + summonerName + "?api_key=" + APIKey
    response = requests.get(URL)
    return response.json()

def main():
    APIKey = (str)(input('Copy and paste your API Key here: '))
    for index, row in df.iterrows():
        summonerName = row['Player_Name']
        responseJSON = requestSummonerData(summonerName, APIKey)
        ID = responseJSON['accountId']
        ID = int(ID)
        list.insert(index, ID)
    df["accountId"] = list
If you already know you can only make 20 requests per second, you just need to work out how long to wait between each request:
Divide 1 second by 20, which gives you 0.05. So you just need to sleep for 0.05 seconds between requests and you shouldn't hit the limit (maybe increase it a bit if you want to be safe).
Add import time at the top of your file and then call time.sleep(0.05) inside your for loop (you could also just write time.sleep(1/20)).
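For example, here is a minimal sketch of the loop with the delay added (reusing requestSummonerData, df, and APIKey from the question):

import time

for index, row in df.iterrows():
    summonerName = row['Player_Name']
    responseJSON = requestSummonerData(summonerName, APIKey)
    list.insert(index, int(responseJSON['accountId']))
    time.sleep(0.05)  # wait 50 ms between calls to stay under 20 requests/second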
I'm trying to set up a loop to pull in a year of weather data for about 500 weather stations that I have in my dataframe. The base URL stays the same; the only part that changes is the weather station ID.
I'd like to create a dataframe with the results. I believe I'd use requests.get to pull in data for all the weather stations in my list; the IDs to use in the URL are in a column called "API ID" in my dataframe. I'm a Python beginner, so any help would be appreciated! My code is below, but it doesn't work and returns an error:
"InvalidSchema: No connection adapters were found for '0 " http://www.ncei.noaa.gov/access/services/data/...\nName: API ID, Length: 497, dtype: object'
def callAPI(API_id):
    for IDs in range(len(API_id)):
        url = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP,SNOW,TMAX,TMIN&stations=' + distances['API ID'] + '&startDate=2020-01-01&endDate=2020-12-31&includeAttributes=0&includeStationName=true&units=standard&format=json')
        r = requests.request('GET', url)
        d = r.json()

ll = []
for index1, rows1 in distances.iterrows():
    station = rows1['Closest Station']
    API_id = rows1['API ID']
    data = callAPI(API_id)
    ll.append([(data)])
I'm not sure about your whole code base, but here is a function that will return the data from the API. If you have multiple station IDs in a single dataframe column, you can use a for loop; otherwise there's no need for one.
Also, you were not returning the result from the function. Note the return keyword at the end of the function.
Working code:
import requests

def callAPI(API_id):
    url = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP,SNOW,TMAX,TMIN&stations=' + API_id + '&startDate=2020-01-01&endDate=2020-12-31&includeAttributes=0&includeStationName=true&units=standard&format=json')
    r = requests.request('GET', url)
    d = r.json()
    return d

print(callAPI('USC00457180'))
So your full code will be something like this,
def callAPI(API_id):
    url = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP,SNOW,TMAX,TMIN&stations=' + API_id + '&startDate=2020-01-01&endDate=2020-12-31&includeAttributes=0&includeStationName=true&units=standard&format=json')
    r = requests.request('GET', url)
    d = r.json()
    return d

ll = []
for index1, rows1 in distances.iterrows():
    station = rows1['Closest Station']
    API_id = rows1['API ID']
    data = callAPI(API_id)
    ll.append([(data)])
Note: Even better, use asynchronous calls to the API to make the process faster. Something like this: https://stackoverflow.com/a/56926297/1138192
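For illustration, here is a minimal sketch of that idea using asyncio and aiohttp (aiohttp is my assumption here; it's a third-party package not used in the answer above), fetching all station IDs concurrently:

import asyncio
import aiohttp

BASE = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries'
        '&dataTypes=PRCP,SNOW,TMAX,TMIN&stations={}&startDate=2020-01-01'
        '&endDate=2020-12-31&includeAttributes=0&includeStationName=true'
        '&units=standard&format=json')

async def fetch(session, api_id):
    # one GET per station ID; returns the parsed JSON body
    async with session.get(BASE.format(api_id)) as resp:
        return await resp.json()

async def fetch_all(api_ids):
    # a shared session lets aiohttp reuse connections across requests
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, i) for i in api_ids))

results = asyncio.run(fetch_all(distances['API ID'].tolist()))

The speedup comes from asyncio.gather firing the requests concurrently instead of waiting for each response in turn.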
Background: I'm attempting to create a dataframe using data called from Twitch's API. They only allow 100 records per call, so with each pull a new pagination cursor is offered in order to move on to the next page. I'm using the following code to try to pull this data efficiently rather than manually adjusting the after=(pagination value) in the get response. Right now the variable I'm trying to make dynamic is the 'Pagination' variable, but it only gets updated once the loop finishes, which isn't helpful. Take a look below and see if you notice anything I can change to achieve this goal. Any help is appreciated!
TwitchTopGamesDataFrame = [] #This is our Data List
BaseURL = 'https://api.twitch.tv/helix/games/top?first=100'
Headers = {'client-id':'lqctse0orgdbs5gdf5faz665api03r','Authorization': 'Bearer a1yl09mwmnwetp6ovocilheias8pzt'}
Indent = 2
Pagination = ''
FullURL = BaseURL + Pagination
Response = requests.get(FullURL,headers=Headers)
iterations = 1 # Data records returned are equivalent to iterations x100

#Loop: Response, Convert JSON data, Append to Data List, Get Pagination & Replace String in Variable - Iterate until 300 records
while count <= 3:
    #Grab JSON Data, Convert, & Append
    ResponseJSONData = Response.json()
    #print(pgn) - Debug
    pd.set_option('display.max_rows', None)
    TopGamesDF = pd.DataFrame(ResponseJSONData['data'])
    TopGamesDF = TopGamesDF[['id','name']]
    TopGamesDF = TopGamesDF.rename(columns={'id':'GameID','name':'GameName'})
    TopGamesDF['Rank'] = TopGamesDF.index + 1
    TwitchTopGamesDataFrame.append(TopGamesDF)
    #print(FullURL) - Debug
    #Grab & Replace Pagination Value
    RPagination = pd.DataFrame(ResponseJSONData['pagination'],index=[0])
    pgn = str('&after='+RPagination.to_string(index=False,header=False).strip())
    Pagination = pgn
    #print(FullURL) - Debug
    iterations += 1

TwitchTopGamesDataFrame
Figured it out:
def top_games(page_count):
    from time import gmtime, strftime
    print("Time of Execution:", strftime("%Y-%m-%d %H:%M:%S", gmtime()))
    #In order to condense the code above and be more efficient, a while/for loop works well.
    #Goal: Run a while loop to build a larger DataFrame through pagination, as the Twitch API only allows 100 records per call.
    baseURL = 'https://api.twitch.tv/helix/games/top?first=100' #Base URL
    Headers = {'client-id':'lqctse0orgdbs5gdf5faz665api03r','Authorization': 'Bearer a1yl09mwmnwetp6ovocilheias8pzt'}
    Pagination = ''
    start_count = 0
    count = 0 # Data records returned are equivalent to count x100
    max_count = page_count
    #Loop: Request, convert JSON data, extend the list, grab the pagination cursor, and rebuild the URL each iteration
    while count <= max_count:
        #Rebuild the URL with the current cursor, then grab JSON data and extend the list
        FullURL = baseURL + Pagination
        Response = requests.get(FullURL,headers=Headers)
        ResponseJSONData = Response.json()
        pd.set_option('display.max_rows', None)
        if count == start_count:
            TopGamesDFL = ResponseJSONData['data']
        if count > start_count:
            i = ResponseJSONData['data']
            TopGamesDFL.extend(i)
        #Grab & Replace Pagination Value
        RPagination = pd.DataFrame(ResponseJSONData['pagination'],index=[0])
        pgn = str('&after='+RPagination.to_string(index=False,header=False).strip())
        Pagination = pgn
        count += 1
        if count == max_count:
            FinalDataFrame = pd.DataFrame(TopGamesDFL)
            FinalDataFrame = FinalDataFrame[['id','name']]
            FinalDataFrame = FinalDataFrame.rename(columns={'id':'GameID','name':'GameName'})
            FinalDataFrame['Rank'] = FinalDataFrame.index + 1
            return FinalDataFrame
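A quick usage sketch (my example; it assumes requests and pandas are imported as in the question):

df = top_games(3)  # makes 3 paginated calls of up to 100 records each and returns one ~300-row dataframe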
I'm trying to get stock prices from an API using Python, but when I put the call in a while loop, the printed price doesn't update even though the price is updating in the API. Also, is there any way to run the loop every 5 minutes? Here's the code:
import urllib.request
import json

urlprices = "https://financialmodelingprep.com/api/v3/quote-short/AMZN?apikey=555555555555555555"
obj = urllib.request.urlopen(urlprices)
data = json.load(obj)

a = 0
while a == 0:
    print(float(data[0]['price']))
It is possible, but you need to re-fetch the data inside the while loop; otherwise you keep printing the same stale response. And to run it every 5 minutes, sleep for 300 seconds at the end of each iteration:
import urllib.request
import json
import time

a = 0
while a == 0:
    # re-request the quote on every iteration so the price is fresh
    urlprices = "https://financialmodelingprep.com/api/v3/quote-short/AMZN?apikey=555555555555555555"
    obj = urllib.request.urlopen(urlprices)
    data = json.load(obj)
    print(float(data[0]['price']))
    # pause so the loop does not hit the request limit for the api
    time.sleep(300)  # 300 seconds = 5 minutes
What I have so far is a first request gathering IDs. I would then like to insert the returned draftgroupid into the second URL request. Is it possible to send two requests in the same script, and if so, how would I do a for i in range(draftgroupid) in the second URL request?
import requests
import json

req1 = requests.get(url="https://www.draftkings.com/lobby/getcontests?sport=NHL")
req1.raise_for_status()
data = req1.json()

for i, contest in enumerate(data['DraftGroups']):
    draftgroupid = contest['DraftGroupId']
Output of draftgroupid:
16901
16905
16902
16903
req2 = requests.get(url="https://api.draftkings.com/draftgroups/v1/draftgroups/THEVALUEIWANTTOLOOPTHROUGH/draftables?format=json")
EDIT
import csv
import requests
import json

req = requests.get(url="https://www.draftkings.com/lobby/getcontests?sport=NHL")
req.raise_for_status()
data = req.json()

for i, contest in enumerate(data['DraftGroups']):
    draftgroupid = contest['DraftGroupId']
    req2 = requests.get(url="https://api.draftkings.com/draftgroups/v1/draftgroups/" + str(draftgroupid) + "/draftables?format=json")
    data2 = req2.json
    for i, player_info in enumerate(data2['draftables'][0]):
        date = player_info['competition']['startTime']
        print(date)
Running into a TypeError: 'method' object is not subscriptable
As I understand it, your problem is related to string manipulation rather than to the requests library.
So basically,
import requests
import json

req1 = requests.get(url="https://www.draftkings.com/lobby/getcontests?sport=NHL")
req1.raise_for_status()
data = req1.json()

for i, contest in enumerate(data['DraftGroups']):
    draftgroupid = contest['DraftGroupId']
    requests.get(url="https://api.draftkings.com/draftgroups/v1/draftgroups/" + str(draftgroupid) + "/draftables?format=json")
should do the job.
More elegant ways to concatenate strings can be found at http://www.pythonforbeginners.com/concatenation/string-concatenation-and-formatting-in-python
Edit
For example,
"some string " + str(123)
"some string %d" % 123
"some string %s" % 123
Will all give the same output. There are more ways to concatenate strings. You just need to choose the best fit based on the context.
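On Python 3.6+, an f-string is another common option (my addition, not part of the original answer):

url = f"https://api.draftkings.com/draftgroups/v1/draftgroups/{draftgroupid}/draftables?format=json"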
for i, contest in enumerate(data['DraftGroups']):
    draftgroupid = contest['DraftGroupId']
    req2 = requests.get(url="https://api.draftkings.com/draftgroups/v1/draftgroups/%d/draftables?format=json" % draftgroupid)
I assume you didn't actually mean for i in range(draftgroupid) as you stated in the question, because that would mean making 16901 requests, followed by 16905 requests (all of which except the last four would be duplicates of the first batch), followed by 16902 requests (of which all would be duplicates), etc.
I am using the Google search API, but by default it shows 4 and at most 8 results per page. I want more results per page.
Add the rsz=8 parameter to this Google search demonstration code, then use the start=... parameter to control which group of results you receive. This, for example, gives you 50 results:
import urllib
import json
import sys
import itertools

def hits(astr):
    # generator that yields result URLs, requesting one page of 8 per iteration
    for start in itertools.count():
        query = urllib.urlencode({'q': astr, 'rsz': 8, 'start': start*8})
        url = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s' % (query)
        search_results = urllib.urlopen(url)
        results = json.loads(search_results.read())
        data = results['responseData']
        if data:
            hits = data['results']
            for h in hits:
                yield h['url']
        else:
            # no more pages: end the generator
            raise StopIteration

def showmore(astr, num):
    for i, h in enumerate(itertools.islice(hits(astr), num)):
        print('{i}: {h}'.format(i=i, h=h))

if __name__ == '__main__':
    showmore(sys.argv[1], 50)
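For example, run it from the command line with the query as the first argument (the script name here is my assumption):

python hits.py "some search term"

This prints up to 50 numbered result URLs.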