getting a meaningful exception with Python Retry - python

Making a request to an API, and I want to retry if I get a 500. Alright, simple: I just use this solution (which you can also find on SO), and it works wonders.
First, define this function:
import requests
from requests.adapters import HTTPAdapter, Retry

def requests_retry_session(
    retries=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    session=None,
):
    session = session or requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
And later use it like so:
s = requests_retry_session()
response = s.get('http://httpbin.org')
I also like this method because it keeps the main part of my code clear of try/excepts, and readability is important to me.
This method works and retries a failed request for the number of retries you set.
When testing the solution against an endpoint that returns an error code, http://httpbin.org/status/500 (a URL that gives you a status 500 response),
I get this exception:
Traceback (most recent call last):
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 878, in urlopen
return self.urlopen(
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 878, in urlopen
return self.urlopen(
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 878, in urlopen
return self.urlopen(
[Previous line repeated 3 more times]
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 868, in urlopen
retries = retries.increment(method, url, response=response, _pool=self)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: /status/500 (Caused by ResponseError('too many 500 error responses'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/user/Downloads/request_retry.py", line 25, in <module>
r = s.get('http://httpbin.org/status/500')
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\sessions.py", line 600, in get
return self.request("GET", url, **kwargs)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\adapters.py", line 556, in send
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: /status/500 (Caused by ResponseError('too many 500 error responses'))
My expectation was to get the exception you would get if you ran this code:
import requests
response = requests.get('http://httpbin.org/status/500')
response.raise_for_status()
(Since a plain get request doesn't raise an exception for a 500, I use response.raise_for_status() to raise the expected exception.)
Doing that, you get this small and very nice exception (the one I'm looking for):
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: http://httpbin.org/status/500
Could someone help me make sense out of these exceptions and get normal ones?

The link you provided as the working solution itself explains this. You are getting a MaxRetryError, which is in fact "Caused by ResponseError('too many 500 error responses')", exactly as the message says.
As it shows, you still need to catch the exception and handle it as you wish:
try:
    response = requests_retry_session().get(
        'http://httpbin.org/status/500',
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)  # here comes your code
else:
    print('It eventually worked', response.status_code)
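If what you want is the short HTTPError from raise_for_status() instead of a RetryError, another option is to tell Retry to hand back the last response once the retries are exhausted rather than raising. This is a minimal sketch assuming urllib3's Retry, whose raise_on_status flag (default True) controls exactly that:
import requests
from requests.adapters import HTTPAdapter, Retry

session = requests.Session()
# raise_on_status=False: when retries are exhausted, return the final
# response instead of raising MaxRetryError/RetryError
retry = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    raise_on_status=False,
)
session.mount('http://', HTTPAdapter(max_retries=retry))
session.mount('https://', HTTPAdapter(max_retries=retry))

response = session.get('http://httpbin.org/status/500')
response.raise_for_status()  # raises requests.exceptions.HTTPError: 500 ...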

Related

Python requests module times out with every https proxy and uses my real IP with http proxies

Basically, I get this error with every single https proxy I've tried, on every website.
Code:
import requests
endpoint = 'https://ipinfo.io/json'
proxies = {'http': 'http://45.167.95.184:8085', 'https': 'https://45.167.95.184:8085'}
r = requests.get(endpoint, proxies=proxies, timeout=10)
Error:
Traceback (most recent call last):
File "<pyshell#5>", line 1, in <module>
r = requests.get(endpoint,proxies=proxies,timeout=10)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 529, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 645, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Utente\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 507, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='ipinfo.io', port=443): Max retries exceeded with url: /json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001FDCFB9E6A0>, 'Connection to 45.167.95.184 timed out. (connect timeout=10)'))
And when I only use http:
import requests
endpoint = 'https://ipinfo.io/json'
proxies = {'http': 'http://45.167.95.184:8085'}
r = requests.get(endpoint, proxies=proxies, timeout=10)
The request is sent, but websites that return public IPs show my real IP. Both requests (2.27.1) and urllib3 (1.26.8) are updated to their latest versions; what could the issue be?
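A likely explanation, worth checking: in requests, the keys of the proxies dict are matched against the scheme of the target URL, so a dict with only an 'http' key is simply ignored for https:// URLs and the request goes out directly, revealing your real IP. Conversely, a proxy URL that itself starts with https:// makes requests attempt TLS to the proxy, which most plain HTTP proxies don't speak, hence the connect timeout. A sketch of the usual setup, reusing the question's example proxy address:
import requests

endpoint = 'https://ipinfo.io/json'
# The dict keys select the proxy by the *target* URL's scheme; most HTTP
# proxies expect a plain http:// proxy URL for both target schemes.
proxies = {
    'http': 'http://45.167.95.184:8085',
    'https': 'http://45.167.95.184:8085',  # note: http://, not https://
}
r = requests.get(endpoint, proxies=proxies, timeout=10)
print(r.json())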

2captcha selenium out of range

I'm trying to implement 2captcha using Selenium with Python.
I just copied the example from their documentation:
https://github.com/2captcha/2captcha-api-examples/blob/master/ReCaptcha%20v2%20API%20Examples/Python%20Example/2captcha_python_api_example.py
This is my code:
from selenium import webdriver
from time import sleep
from selenium.webdriver.support.select import Select
import requests

driver = webdriver.Chrome('chromedriver.exe')
driver.get('the_url')
current_url = driver.current_url
captcha = driver.find_element_by_id("captcha-box")
captcha2 = captcha.find_element_by_xpath("//div/div/iframe").get_attribute("src")
captcha3 = captcha2.split('=')
#print(captcha3[2])

# Add these values
API_KEY = 'my_api_key'  # Your 2captcha API KEY
site_key = captcha3[2]  # site-key, read the 2captcha docs on how to get this
url = current_url  # example url
proxy = 'Myproxy'  # example proxy
proxy = {'http': 'http://' + proxy, 'https': 'https://' + proxy}

s = requests.Session()
# here we post the site key to 2captcha to get the captcha ID (and we parse it here too)
captcha_id = s.post("http://2captcha.com/in.php?key={}&method=userrecaptcha&googlekey={}&pageurl={}".format(API_KEY, site_key, url), proxies=proxy).text.split('|')[1]
# then we parse the gresponse from the 2captcha response
recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id), proxies=proxy).text
print("solving ref captcha...")
while 'CAPCHA_NOT_READY' in recaptcha_answer:
    sleep(5)
    recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id), proxies=proxy).text
recaptcha_answer = recaptcha_answer.split('|')[1]

# we make the payload for the post data here, use something like mitmproxy or fiddler to see what is needed
payload = {
    'key': 'value',
    'gresponse': recaptcha_answer  # This is the response from 2captcha, which is needed for the post request to go through.
}

# then send the post request to the url
response = s.post(url, payload, proxies=proxy)
# And that's all there is to it other than scraping data from the website, which is dynamic for every website.
This is my error:
solving ref captcha...
Traceback (most recent call last):
File "main.py", line 38, in
recaptcha_answer = recaptcha_answer.split('|')[1]
IndexError: list index out of range
The captcha is getting solved, because I can see it on the 2captcha dashboard, so what is the error if this is the official documentation?
EDIT:
For some reason, without any modification, I'm getting the captcha solved by 2captcha, but then I get this error:
solving ref captcha...
OK|this_is_the_2captch_answer
Traceback (most recent call last):
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 594, in urlopen
self._prepare_proxy(conn)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 805, in _prepare_proxy
conn.connect()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 308, in connect
self._tunnel()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 906, in _tunnel
(version, code, message) = response._read_status()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 278, in _read_status
raise BadStatusLine(line)
http.client.BadStatusLine: <html>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\util\retry.py", line 368, in increment
raise six.reraise(type(error), error, _stacktrace)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\packages\six.py", line 685, in reraise
raise value.with_traceback(tb)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 594, in urlopen
self._prepare_proxy(conn)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connectionpool.py", line 805, in _prepare_proxy
conn.connect()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\urllib3\connection.py", line 308, in connect
self._tunnel()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 906, in _tunnel
(version, code, message) = response._read_status()
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\http\client.py", line 278, in _read_status
raise BadStatusLine(line)
urllib3.exceptions.ProtocolError: ('Connection aborted.', BadStatusLine('<html>\r\n'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 49, in <module>
response = s.post(url, payload, proxies=proxy)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 581, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "C:\Users\Usuari\AppData\Local\Programs\Python\Python37-32\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine('<html>\r\n'))
Why am I getting this error?
I'm setting site_key = current_url_where_captcha_is_located.
Is this correct?
Use your debugger, or put a print(recaptcha_answer) before the failing line, to see the value of recaptcha_answer before you call .split('|') on it. There is no | in the string, so trying to get the second element of the resulting list with [1] fails.
It looks like you don't have any valid proxy connection parameters, yet you are passing this proxy to requests when connecting to the API.
Just comment these two lines:
#proxy = 'Myproxy' # example proxy
#proxy = {'http': 'http://' + proxy, 'https': 'https://' + proxy}
And then remove proxies=proxy from four lines:
captcha_id = s.post("http://2captcha.com/in.php?key={}&method=userrecaptcha&googlekey={}&pageurl={}".format(API_KEY, site_key, url)).text.split('|')[1]
recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id)).text
recaptcha_answer = s.get("http://2captcha.com/res.php?key={}&action=get&id={}".format(API_KEY, captcha_id)).text
response = s.post(url, payload)
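For robustness, the submit-and-poll dance can be wrapped in one helper. Here is a sketch built on the same in.php/res.php endpoints and the OK|answer / CAPCHA_NOT_READY response formats the question's code already relies on:
import time
import requests

def solve_recaptcha(api_key, site_key, page_url, poll_every=5, max_wait=180):
    # Submit the reCAPTCHA task, then poll res.php until 2captcha solves it.
    s = requests.Session()
    submit = s.post(
        'http://2captcha.com/in.php',
        data={'key': api_key, 'method': 'userrecaptcha',
              'googlekey': site_key, 'pageurl': page_url},
    ).text
    if not submit.startswith('OK|'):
        raise RuntimeError('2captcha rejected the task: ' + submit)
    captcha_id = submit.split('|')[1]

    deadline = time.time() + max_wait
    while time.time() < deadline:
        time.sleep(poll_every)
        answer = s.get(
            'http://2captcha.com/res.php',
            params={'key': api_key, 'action': 'get', 'id': captcha_id},
        ).text
        if answer == 'CAPCHA_NOT_READY':
            continue
        if answer.startswith('OK|'):
            return answer.split('|')[1]
        raise RuntimeError('2captcha returned an error: ' + answer)
    raise TimeoutError('2captcha did not solve the captcha in time')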

SSL handshake failure on Google Translate

I have successfully been using the gTTS module in order to get audio from Google Translate for a while. I use it quite sparsely (I must have made 25 requests in total), and don't believe I could have hit any kind of limit that would cause my address to be blocked from using the service.
However, today, after trying to use it again (I hadn't used it in 1-2 months), I found that the following program:
from gtts import gTTS
tts = gTTS('hallo', 'de')
tts.save('hallo.mp3')
causes an error. I tracked down the problem, and saw that even this simple program:
import requests
response = requests.get("https://translate.google.com/")
causes the following error:
Traceback (most recent call last):
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
chunked=chunked)
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 346, in _make_request
self._validate_conn(conn)
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 850, in _validate_conn
conn.connect()
File "C:\...\lib\site-packages\urllib3\connection.py", line 326, in connect
ssl_context=context)
File "C:\...\lib\site-packages\urllib3\util\ssl_.py", line 329, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\...\lib\ssl.py", line 407, in wrap_socket
_context=self, _session=session)
File "C:\...\lib\ssl.py", line 814, in __init__
self.do_handshake()
File "C:\...\lib\ssl.py", line 1068, in do_handshake
self._sslobj.do_handshake()
File "C:\...\lib\ssl.py", line 689, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:777)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\...\lib\site-packages\requests\adapters.py", line 440, in send
timeout=timeout
File "C:\...\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\...\lib\site-packages\urllib3\util\retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='translate.google.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:777)'),))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main2.py", line 2, in <module>
response = requests.get("https://translate.google.com/")
File "C:\...\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\...\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\...\lib\site-packages\requests\sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "C:\...\lib\site-packages\requests\sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "C:\...\lib\site-packages\requests\adapters.py", line 506, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='translate.google.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:777)'),))
I would like to know if anyone has an idea what the issue could be. I can get on the Google Translate website without any problems from my browser, and have no issues using the audio either.
The accepted answer did not work for me, since the code has changed; the way I got it to work was to add verify=False in gtts_token.py instead:
response = requests.get("https://translate.google.com/", verify=False)
This looks like an error related to your proxy settings, especially if you are using your work PC. I got the same issue, but with a different error message, for example:
gTTSError: Connection error during token calculation:
HTTPSConnectionPool(host='translate.google.com', port=443): Max
retries exceeded with url: / (Caused by SSLError(SSLError("bad
handshake: Error([('SSL routines', 'ssl3_get_server_certificate',
'certificate verify failed')],)",),))
To further investigate the issue, you can debug it from the command line:
(base) c:\gtts-cli "sample text to debug" --debug --output test.mp3
You should see results like the below:
ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',)))
Solution:
I checked the gTTS documentation; there is no way to pass your proxy settings to the API. The workaround is to skip SSL verification, but gTTS doesn't expose that either, so the only way is to change the following gtts files:
In tts.py, at line 208, change the request call to add verify=False:
r = requests.get(self.GOOGLE_TTS_URL,
                 params=payload,
                 headers=self.GOOGLE_TTS_HEADERS,
                 proxies=urllib.request.getproxies(),
                 verify=False)
In lang.py, at line 56:
page = requests.get(URL_BASE, verify=False)
Then try the debug command line again; you should be able to get the file recorded now:
(base) c:\gtts-cli "sample text to debug" --debug --output test.mp3
gtts.tts - DEBUG - status-0: 200
gtts.tts - DEBUG - part-0 written to <_io.BufferedWriter name=test.mp3'>
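One side effect to be aware of: with verify=False the connection is no longer protected against interception, and requests emits an InsecureRequestWarning on every call. If you accept the risk, the warning can be silenced; a minimal sketch using urllib3 directly:
import urllib3

# verify=False disables certificate checking, so urllib3 warns on each
# request; suppress the warning explicitly if you accept the risk
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)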

python client request fails with default timeout

The following request from a Python client to Elasticsearch fails:
2014-12-19 13:39:05,429 WARNING GET http://10.129.0.53:9200/delivery-logs-index.prod-20141218/_search?timeout=20m [status:N/A request:10.010s]
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/elasticsearch/connection/http_urllib3.py", line 46, in perform_request
response = self.pool.urlopen(method, url, body, retries=False, headers=headers, **kw)
File "/usr/lib/python2.6/site-packages/urllib3/connectionpool.py", line 559, in urlopen
_pool=self, _stacktrace=stacktrace)
File "/usr/lib/python2.6/site-packages/urllib3/util/retry.py", line 223, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/lib/python2.6/site-packages/urllib3/connectionpool.py", line 516, in urlopen
body=body, headers=headers)
File "/usr/lib/python2.6/site-packages/urllib3/connectionpool.py", line 336, in _make_request
self, url, "Read timed out. (read timeout=%s)" % read_timeout)
ReadTimeoutError: HTTPConnectionPool(host=u'10.129.0.53', port=9200): Read timed out. (read timeout=10)
The client is instantiated as:
Elasticsearch([es_host],
              sniff_on_start=True,
              max_retries=100,
              retry_on_timeout=True,
              sniff_on_connection_fail=True,
              sniff_timeout=1000)
Is there a way to increase the request timeout? Currently it seems to default to a read timeout of 10 seconds.
You can try adding a request_timeout value to your request, like:
res = client.search(index=blabla, search_type="count", timeout="20m", request_timeout="10000", body={
You can also pass timeout=60 when instantiating the client object (60 meaning 60 seconds, and of course only an example).
This parameter overrides the 10s default specified in the Connection constructor:
https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/connection/base.py#L27
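A minimal sketch of that second approach, assuming the elasticsearch-py client of that era and the host from the question:
from elasticsearch import Elasticsearch

es_host = '10.129.0.53:9200'
# timeout=60 becomes the default read timeout (in seconds) for every
# request made through this client, replacing the 10s default
es = Elasticsearch([es_host],
                   timeout=60,
                   max_retries=100,
                   retry_on_timeout=True)

# a per-request override is still possible:
res = es.search(index='delivery-logs-index.prod-20141218',
                request_timeout=120,
                body={'query': {'match_all': {}}})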

Get content type of requested url using python

I am trying to get the headers of a URL with Python, following the tutorial at http://docs.python-requests.org/en/latest/. I am trying the following code in the Python IDLE shell, and I am getting the following error:
>>> import requests
>>> r = requests.get('https://api.github.com/user')
Traceback (most recent call last):
File "<pyshell#32>", line 1, in <module>
r = requests.get('https://api.github.com/user')
File "C:\Python27\lib\site-packages\requests-2.3.0-py2.7.egg\requests\api.py", line 55, in get
return request('get', url, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.3.0-py2.7.egg\requests\api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.3.0-py2.7.egg\requests\sessions.py", line 456, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests-2.3.0-py2.7.egg\requests\sessions.py", line 559, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.3.0-py2.7.egg\requests\adapters.py", line 375, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /user (Caused by <class 'socket.error'>: [Errno 10013] An attempt was made to access a socket in a way forbidden by its access permissions)
Looks like GitHub is denying you access to the requested page. Before attempting to request pages in Python, try typing them into the browser to see what is returned. When I did this, I was returned some JSON stating:
{
"message": "Requires authentication",
"documentation_url": "https://developer.github.com/v3"
}
If you want to test your code and look at the headers of a webpage, try a publicly accessible webpage before delving into APIs.
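Note also that errno 10013 ("An attempt was made to access a socket in a way forbidden by its access permissions") is raised by Windows itself, so a local firewall, antivirus, or proxy blocking outbound connections is worth ruling out too. Once a request does go through, the content type is just a response header; a minimal sketch against a public page:
import requests

# HEAD fetches only the headers, which is enough to read the content type
r = requests.head('https://www.python.org/', allow_redirects=True)
print(r.status_code)
print(r.headers.get('Content-Type'))  # e.g. 'text/html; charset=utf-8'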
