Background and Code
I have the below function to handle rate limiting in Twitter's V2 API based on the HTTP status codes.
from datetime import datetime
from osometweet.utils import pause_until

def manage_rate_limits(response):
    """Manage Twitter V2 Rate Limits

    This method takes in a `requests` response object after querying
    Twitter and uses the headers["x-rate-limit-remaining"] and
    headers["x-rate-limit-reset"] headers objects to manage Twitter's
    most common, time-dependent HTTP errors.

    Wiki Reference: https://github.com/osome-iu/osometweet/wiki/Info:-HTTP-Status-Codes-and-Errors
    Twitter Reference: https://developer.twitter.com/en/support/twitter-api/error-troubleshooting
    """
    while True:
        # The x-rate-limit-remaining parameter is not always present.
        # If it is, we want to use it.
        try:
            # Get number of requests left with our tokens
            remaining_requests = int(response.headers["x-rate-limit-remaining"])
            # If that number is one, we get the reset-time
            # and wait until then, plus 15 seconds (you're welcome, Twitter).
            # The regular 429 exception is caught below as well,
            # however, we want to program defensively, where possible.
            if remaining_requests == 1:
                buffer_wait_time = 15
                resume_time = datetime.fromtimestamp(int(response.headers["x-rate-limit-reset"]) + buffer_wait_time)
                print(f"One request from being rate limited. Waiting on Twitter.\n\tResume Time: {resume_time}")
                pause_until(resume_time)
        except Exception as e:
            print("An x-rate-limit-* parameter is likely missing...")
            print(e)
        # Explicitly checking for time dependent errors.
        # Most of these errors can be solved simply by waiting
        # a little while and pinging Twitter again - so that's what we do.
        if response.status_code != 200:
            # Too many requests error
            if response.status_code == 429:
                buffer_wait_time = 15
                resume_time = datetime.fromtimestamp(int(response.headers["x-rate-limit-reset"]) + buffer_wait_time)
                print(f"Too many requests. Waiting on Twitter.\n\tResume Time: {resume_time}")
                pause_until(resume_time)
            # Twitter internal server error
            elif response.status_code == 500:
                # Twitter needs a break, so we wait 30 seconds
                resume_time = datetime.now().timestamp() + 30
                print(f"Internal server error # Twitter. Giving Twitter a break...\n\tResume Time: {resume_time}")
                pause_until(resume_time)
            # Twitter service unavailable error
            elif response.status_code == 503:
                # Twitter needs a break, so we wait 30 seconds
                resume_time = datetime.now().timestamp() + 30
                print(f"Twitter service unavailable. Giving Twitter a break...\n\tResume Time: {resume_time}")
                pause_until(resume_time)
            # If we get this far, we've done something wrong and should exit
            raise Exception(
                "Request returned an error: {} {}".format(
                    response.status_code, response.text
                )
            )
        # Each time we get a 200 response, exit the function and return the response object
        if response.ok:
            return response
This function is fed a response object from a requests call like the one below:
response = requests.get(
    url,
    headers=self._header,
    params=payload
)
response = manage_rate_limits(response)
In the above call, the parameters are as follows:
url = Twitter's base endpoint URL (in this case, the full-archive academic search)
params/payload = a combination of endpoint search operators (these should be irrelevant, but I can include them if necessary)
headers/self._header = the user's bearer token in the proper header form, shown below
self._header = {"Authorization": f"Bearer {MY_BEARER_TOKEN}"}
Question & Error:
Using the above code in a long-running script, I get the error below from the manage_rate_limits function (in rate_limit_manager.py).
Traceback (most recent call last):
File "/scratch/mdeverna/Superspreaders/src/get_rts_of_user.py", line 218, in get_rts_of_user
full_archive_search = True
File "/nfs/nfs5/home/scratch/mdeverna/osometweet/osometweet/api.py", line 248, in search
response = self._oauth.make_request(url, payload)
File "/nfs/nfs5/home/scratch/mdeverna/osometweet/osometweet/oauth.py", line 181, in make_request
response = manage_rate_limits(response)
File "/nfs/nfs5/home/scratch/mdeverna/osometweet/osometweet/rate_limit_manager.py", line 67, in manage_rate_limits
response.status_code, response.text
Exception: Request returned an error: 429 {"title":"Too Many Requests","type":"about:blank","status":429,"detail":"Too Many Requests"}
What I don't understand is that the line that raises this exception is...
# If we get this far, we've done something wrong and should exit
raise Exception(
    "Request returned an error: {} {}".format(
        response.status_code, response.text
    )
)
... and this shows that response.status_code equals 429. However, the conditional earlier in the function checks for exactly this status code and seems to miss it. It looks as if the check for status code 429 is skipped, only for the code further down to report that the status code is 429.
What is going on here?
Even if the status code is 429 or 500 or 503, you're going to flow off the bottom of the if/elif/elif sequence and right into the raise. Did you intend to return at the end of each? Or did you mean for the raise to be in an else: clause?
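One way to act on this, as a minimal sketch rather than the library's actual fix: move the raise into an else branch and re-send the request after each wait, since sleeping on a response that has already been received cannot change its status code. The function name `call_with_rate_limit_handling`, the `send_request` callable, and the `max_attempts` cap are assumptions introduced for illustration, and `time.sleep` stands in for `pause_until`.

```python
import time


def call_with_rate_limit_handling(send_request, max_attempts=5,
                                  buffer_seconds=15, sleep=time.sleep,
                                  now=time.time):
    """Call send_request() until it returns 200 or attempts run out.

    send_request is a hypothetical zero-argument callable that performs
    the HTTP request and returns a requests-style response object.
    """
    for _ in range(max_attempts):
        response = send_request()
        status = response.status_code
        if status == 200:
            return response
        if status == 429:
            # Wait until the reset time reported by the server, plus a buffer.
            reset_at = int(response.headers.get("x-rate-limit-reset", now()))
            sleep(max(0, reset_at - now()) + buffer_seconds)
        elif status in (500, 503):
            # Server-side trouble: give it a short break before retrying.
            sleep(30)
        else:
            # Only non-retryable statuses reach the raise.
            raise RuntimeError(
                f"Request returned an error: {status} {response.text}"
            )
    raise RuntimeError(f"Gave up after {max_attempts} attempts")
```

With `requests`, the callable could be something like `lambda: requests.get(url, headers=header, params=payload)`, so every retry issues a fresh request.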
Example of 522 error when I go to the webpage manually
Example of 525 error when I go to the webpage manually
Example of 504 error when I go to the webpage manually
I am running the following for loop which goes through a dictionary of subreddits(key) and urls (value). The urls produce a dictionary with all posts from 2022 of a given subreddit. Sometimes the for loop stops and produces a 'http error 525' or other errors.
I'm wondering how I can check for these errors when reading the url and then try again until the error is not given before moving to the next subreddit.
for subredd, url in dict_last_subreddit_posts.items():
    print(subredd)
    page = urllib.request.urlopen(url).read()
    dict_last_posts[subredd] = page
I haven't been able to figure it out.
You can put this code in a try/except block like this:
import urllib.error
import urllib.request

for subredd, url in dict_last_subreddit_posts.items():
    print(subredd)
    while True:
        try:
            page = urllib.request.urlopen(url).read()
            dict_last_posts[subredd] = page
            break  # exit the while loop if the request succeeded
        except urllib.error.HTTPError as e:
            if e.code in (504, 522, 525):
                print("Encountered HTTP error while reading URL. Retrying...")
            else:
                raise  # re-raise the exception if it's a different error
This code will catch any HTTPError that occurs while reading the URL and check whether the error code is 504, 522, or 525. If it is, it will print a message and try reading the URL again. If it's a different error, it will re-raise the exception so that you can handle it appropriately.
NOTE: This code will retry reading the URL indefinitely until it succeeds or a different error occurs. You may want to add a counter or a timeout to prevent the loop from going on forever in case the error persists.
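A counter of that kind could look like the sketch below. It is only an illustration: `fetch_with_retries`, `RETRYABLE_CODES`, the attempt limit, and the linearly growing delay are assumptions, not part of the code above, and the `opener` and `sleep` parameters exist only so the function can be exercised without a network.

```python
import time
import urllib.error
import urllib.request

RETRYABLE_CODES = {504, 522, 525}


def fetch_with_retries(url, max_attempts=5, delay_seconds=2.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Read a URL, retrying a limited number of times on selected errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Give up on unexpected codes, or once the attempts are used up.
            if error.code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Wait a little longer after each failed attempt.
            sleep(delay_seconds * attempt)
```

In the loop, `dict_last_posts[subredd] = fetch_with_retries(url)` would then either store the page or raise after the fifth failure.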
It's unwise to indefinitely retry a request. Set a limit even if it's very high, but don't set it so high that it causes you to be rate limited (HTTP status 429). The backoff_factor will also have an impact on rate limiting.
Use the requests package for this. This makes it very easy to set a custom adapter for all of your requests via Session, and it includes Retry from urllib3 which takes care of retry behavior in an object you can pass to your adapter.
import requests
from requests.adapters import HTTPAdapter, Retry

s = requests.Session()
retries = Retry(
    total=5,
    backoff_factor=0.1,
    status_forcelist=[504, 522, 525]
)
s.mount('https://', HTTPAdapter(max_retries=retries))

for subredd, url in dict_last_subreddit_posts.items():
    response = s.get(url)
    dict_last_posts[subredd] = response.content
You can play around with total (maximum number of retries) and backoff_factor (adjusts wait time between retries) to get the behavior you want.
Try something like this:
for subredd, url in dict_last_subreddit_posts.items():
    print(subredd)
    http_response = urllib.request.urlopen(url)
    while http_response.status != 200:
        if http_response.status == 503:
            http_response = urllib.request.urlopen(url)
        elif http_response.status == 523:
            pass  # enter code here
        else:
            pass  # enter code here
    dict_last_posts[subredd] = http_response.read()
That said, Michael Ruth's answer is better.
My Service Bus queue trigger fires every time a new file lands in blob storage, and the trigger function sends a POST request to Airflow to start an Airflow job. However, whether the request is good (200) or bad (404), the trigger is automatically marked as successful, even when the Airflow job was not triggered. As a result, we lose that message and have no way to retrieve it.
I thought of two solutions: first, catching the error as an Exception; second, manually failing the job if the status code != 200. Neither works.
As the code below shows, the trigger function is still marked as succeeded when the status is 404. What did I do wrong in my code? How can I make my logic work?
try:
    result = requests.post(
        f"http://localhost:8080/api/v1/dags/{dag_id}/dagRunss",
        data = json.dumps(data),
        headers = header,
        auth = ("airflow", "airflow"))
    logging.info(result.content.decode('utf-8'))
    if result.status_code != 200:
        raise ValueError('This is the exception you expect to handle')
except Exception as err:
    logging.error(err)
I have modified your code to handle the response status code:
try:
    result = requests.post(
        f"http://localhost:8080/api/v1/dags/{dag_id}/dagRunss",
        data = json.dumps(data),
        headers = header,
        auth = ("airflow", "airflow"))
    logging.info(result.content.decode('utf-8'))
except Exception as err:
    logging.error(err)
    raise  # without a response there is no status code to check below
# Check the status code and trigger your airflow job
if result.status_code == 200:
    # do your further steps, such as triggering your airflow job
    return "return your airflow job response"
# If status_code != 200, do your other work and do not trigger the airflow job
else:
    logging.error(f'Status Code: {result.status_code}')
    # here you will get the other response status codes, such as 4xx and 5xx
    return "return your not 200 response and dont call the airflow job"
If you get a 404 response status code, the code under else: runs: logging.error(f'Status Code: {result.status_code}'), followed by return "return your not 200 response and dont call the airflow job".
Refer here for more information
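For the trigger itself to be marked as failed, the error has to leave the function instead of being logged and swallowed. The following is a minimal sketch, assuming an unhandled exception is what the trigger runtime treats as a failure; `trigger_dag_run`, `DagTriggerError`, and the injected `post` callable (for example `requests.post`) are hypothetical names used for illustration.

```python
import json
import logging


class DagTriggerError(RuntimeError):
    """Raised when the DAG-run request does not return HTTP 200."""


def trigger_dag_run(post, url, data, headers, auth):
    """Send the DAG-run request and raise if Airflow did not accept it."""
    response = post(url, data=json.dumps(data), headers=headers, auth=auth)
    logging.info(response.text)
    if response.status_code != 200:
        # Do not catch this inside the trigger function: letting it
        # propagate is what signals the failure to the caller.
        raise DagTriggerError(
            f"Airflow returned {response.status_code}: {response.text}"
        )
    return response
```

The design choice here is simply to avoid a broad `except Exception` around the call; logging can still happen, as long as the exception is re-raised afterwards.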
I'm setting up a small Python script so my colleagues can collect data from a certain internal API based on a few inputs using the following code:
url = "https://....."
params = dict(...)
client = BackendApplicationClient(client_id=client_id)
client.prepare_request_body(scope=[])
session = OAuth2Session(client=client)
response = session.get(url=url, params=params, verify=session.verify)
where the params are based on the manual inputs. I can guarantee some of the inputs will not conform to the API's requirements fully (like lower case letters where upper case is needed, etc.). In this case, the API will return a response with status 400:
>> response
<Response [400]>
>> response.text
{"statusCode":400,"errorMessage":"Bad Request","errors": ...}
>> response.status_code
400
I thought I could capture this with response.raise_for_status(), but no Exception is raised, and the returned value is None:
>> response.raise_for_status()
None
Why is this? I thought the raise_for_status function was supposed to raise an Exception based on the response's status_code
raise_for_status() on a response from the requests module will raise an HTTPError exception if the HTTP status code is 400. This is a peculiarity of OAuth2Session which you can read about here
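Whatever the cause turns out to be, the status code can also be checked explicitly, which does not rely on raise_for_status() at all. A small sketch follows; `ensure_success` and `ApiError` are hypothetical names, and the response object only needs `status_code` and `text` attributes.

```python
class ApiError(RuntimeError):
    """Raised for responses whose status code is 4xx or 5xx."""


def ensure_success(response):
    """Return the response unchanged, or raise ApiError on 4xx/5xx."""
    if 400 <= response.status_code < 600:
        raise ApiError(f"{response.status_code}: {response.text}")
    return response
```

Calling `ensure_success(session.get(url=url, params=params))` would then surface the 400 response as an exception that carries the API's error text.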
I created a function that transforms some data and sends it to the FB API.
It works perfectly when the FB API responds with a 200 code; otherwise, the function returns an internal server error.
I've added raise_for_status(), and now I can return an error message if the FB API responds with a non-200 code.
How can I make my function respond not only with a relevant error message but also with the relevant status code?
response = requests.request("POST", url, headers=headers, data=payload, params=params)
resp = {}
try:
    response.raise_for_status()
except requests.exceptions.HTTPError:
    response.status_code = 400
    resp['message'] = response.text
else:
    resp['message'] = response.text
finally:
    return resp
Add the HTTP code after your response, like this
return resp, 403
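Building on that, the upstream status can be passed through instead of hard-coding one. This is a sketch under the assumption that the function runs on a Flask-style framework where a handler may return a `(body, status)` tuple; `build_reply` is a hypothetical helper, not part of any SDK.

```python
def build_reply(upstream_status, upstream_text):
    """Build a (body, status) pair that mirrors the upstream API's result."""
    body = {"message": upstream_text}
    if 200 <= upstream_status < 300:
        return body, 200
    # Forward the upstream error code so the caller sees what went wrong.
    return body, upstream_status
```

Inside the function this would be used as `return build_reply(response.status_code, response.text)`, which also removes the need to overwrite `response.status_code`.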
I recently started learning Python 3 and am trying to write my first program. The program automatically lists items for sale on the trading floor, using the API at https://market.csgo.com/docs-v2. Everything would be fine if not for the errors that appear while the script is running. I know I should use try and except, but how do I do it right? My code:
while True:
    try:
        ip = {'18992549780':'10000', '18992548863':'20000','18992547710':'30000','18992546824':'40000', '18992545927':'50000', '18992544515':'60000', '18992543504':'70000', '18992542365':'80000', '18992541028':'90000', '18992540218':'100000'}
        for key,value in ip.items():
            url3 = ('https://market.csgo.com/api/v2/add-to-sale?key=MYAPIKEY&id={id}&price={price}&cur=RUB')
            addtosale = url3.format(id = key, price = value)
            onsale = requests.get(addtosale)
            onsale.raise_for_status()
            r = onsale.json()
            print(addtosale)
            print(onsale.raise_for_status)
            print(r)
            time.sleep(5)
    except requests.HTTPError as exception:
        print(exception)
My goal is to rerun the code between try and except whenever any error occurs (a 5xx, for example).
Traceback (most recent call last):
File "D:\Python\tmsolve1.py", line 30, in <module>
onsale.raise_for_status()
File "C:\Users\<username>\AppData\Local\Programs\Python\Python38-32\lib\site-packages\requests\models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://market.csgo.com/api/v2/add-to-sale?key=MYAPIKEY&id=18992545927&price=50000&cur=RUB
502 Server Error: Bad Gateway for url: https://market.csgo.com/api/v2/add-to-sale?key=MYAPIKEY&id=18992549780&price=10000&cur=RUB
Error handling can be done in multiple ways. You have 10 API calls. You can stop the code on the first error, retry the request, or continue with the next call.
The example below will continue running through all requests.
Also, except requests.HTTPError as exception may not be needed. This error is raised by response.raise_for_status(). You can perform logging before calling .raise_for_status(). The try/except only allows the code to continue in the loop.
import requests
import time
import json

# while True:  # Uncomment to make the code loop continuously
ip = {'18992549780':'10000', '18992548863':'20000','18992547710':'30000','18992546824':'40000', '18992545927':'50000', '18992544515':'60000', '18992543504':'70000', '18992542365':'80000', '18992541028':'90000', '18992540218':'100000'}
for key, value in ip.items():
    # The try sits inside the loop so that one failed call does not stop the rest
    try:
        url = 'https://market.csgo.com/api/v2/add-to-sale'
        payload = {'key': 'MYAPIKEY', 'id': key, 'price': value, 'cur': 'RUB'}
        response = requests.get(url, params=payload)
        print(f'Status code: {response.status_code}')
        print(f'Response text: {response.text}')  # This will contain an error message or json results.
        response.raise_for_status()  # This will only raise if the status code is 4xx or 5xx
        results = response.json()
        if results.get('error'):  # "results" can contain {"error":"Bad KEY","success":false}
            raise Exception('Error in response json')
        print(json.dumps(results))
        time.sleep(5)
    except requests.HTTPError as exception:  # Captures response.raise_for_status() - 4xx or 5xx status code. If you remove this, the generic handler below is used
        print(exception)
    except Exception as exception:  # Generic error handler for raise Exception('Error in response json') and "Max retries exceeded."
        print(exception)
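To rerun a failed call rather than skip it, each request can be wrapped in a bounded retry. This is a minimal sketch, assuming 5xx responses are the ones worth retrying; `retry_on_server_error`, the `send` callable, and the default limits are illustrative choices and not part of the market API.

```python
import time


def retry_on_server_error(send, max_attempts=3, delay_seconds=5,
                          sleep=time.sleep):
    """Call send() until the status is below 500 or attempts run out.

    send is a hypothetical zero-argument callable returning an object
    with a status_code attribute. The last response is always returned,
    so the caller can still inspect it or call raise_for_status() on it.
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code < 500:
            break
        if attempt < max_attempts:
            # Pause before the next try; no pause after the final one.
            sleep(delay_seconds)
    return response
```

In the loop above, `response = retry_on_server_error(lambda: requests.get(url, params=payload))` could replace the plain `requests.get` call, so a 502 is retried a few times before the error handling takes over.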