python requests exception - keeps throwing errors

The script works fine when it receives an HTTP 200. However, as soon as an exception occurs I get an error. I know the website does not have a valid SSL cert installed, and I've tried catching requests.exceptions.RequestException to catch anything, but it still throws errors. I also get exceptions such as proxy errors now and then, and requests.exceptions.RequestException did not catch those either. I have also tried putting request.raise_for_status() under try:, and that still throws an error. The code is as follows:
def verify_ssl(proxy_info, target):
    print('Attempting to verify SSL Cert on %s:443' % target)
    if proxy_info == None:
        response = requests.get('https://%s' % target)
    if proxy_info != None:
        response = requests.get('https://%s' % target, proxies=proxy_info)
    try:
        response
    except requests.exceptions.SSLError as g:
        print('SSL Verification Error %s' % g)
    return ('Successfully Verified SSL Cert: HTTP 200 received.\n')
Debug throws me this:
Traceback (most recent call last):
File "c:\Users\appleta\Desktop\ping.py", line 69, in <module>
ssl_result = verify_ssl(proxy_info, target)
File "c:\Users\appleta\Desktop\ping.py", line 37, in verify_ssl
response = requests.get('https://%s' % target)
File "C:\Python34\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Python34\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python34\lib\site-packages\requests\sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python34\lib\site-packages\requests\sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "C:\Python34\lib\site-packages\requests\adapters.py", line 511, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='blabla.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:600)'),))

In Python, evaluating the bare expression:
response
can never raise an exception; the exception is raised by the requests.get call itself. You have to put both get calls inside the try block instead:
def verify_ssl(proxy_info, target):
    print('Attempting to verify SSL Cert on %s:443' % target)
    try:
        if proxy_info is None:
            response = requests.get('https://%s' % target)
        else:
            response = requests.get('https://%s' % target, proxies=proxy_info)
    except requests.exceptions.SSLError as g:
        print('SSL Verification Error %s' % g)
        return 'Got SSL error'
    return 'Successfully Verified SSL Cert: HTTP 200 received.\n'
Also: note that your version always returned the string Successfully Verified SSL Cert ... because even in case of error it only printed the error message and then resumed execution. You probably want to return something different in the except block, as the version above does.
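For completeness, since the question also mentions proxy errors slipping through: in requests' exception hierarchy, ProxyError and SSLError both derive from requests.exceptions.RequestException, so a handler for the base class will catch them once the get call is actually inside the try block. A minimal sketch (check_target is a hypothetical helper, not the original function):

import requests

def check_target(target, proxy_info=None):
    try:
        # passing proxies=None is the same as passing no proxies at all
        response = requests.get('https://%s' % target, proxies=proxy_info)
        response.raise_for_status()  # raise HTTPError on 4xx/5xx responses
    except requests.exceptions.SSLError as g:
        return 'SSL Verification Error %s' % g
    except requests.exceptions.ProxyError as g:
        return 'Proxy Error %s' % g
    except requests.exceptions.RequestException as g:
        # base class catches any other requests failure; order matters,
        # so the more specific handlers above must come first
        return 'Request failed: %s' % g
    return 'Successfully Verified SSL Cert: HTTP 200 received.\n'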

Related

getting a meaningful exception with Python Retry

I am making a request to an API and I want to retry if I get a 500. Alright, simple: I just use this solution (which you can also find on SO), and it works wonders:
Create this function in the beginning:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def requests_retry_session(
    retries=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    session=None,
):
    session = session or requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
And later use it like so:
s = requests_retry_session()
response = s.get('http://httpbin.org')
I also like to use this method because the main part of my code is clear of try-catches and readability is important to me.
This method works, retrying a failed request up to the number of retries you set.
When testing the solution on http://httpbin.org/status/500 (a URL that always responds with status 500), I get this exception:
Traceback (most recent call last):
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 878, in urlopen
return self.urlopen(
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 878, in urlopen
return self.urlopen(
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 878, in urlopen
return self.urlopen(
[Previous line repeated 3 more times]
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\connectionpool.py", line 868, in urlopen
retries = retries.increment(method, url, response=response, _pool=self)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: /status/500 (Caused by ResponseError('too many 500 error responses'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/user/Downloads/request_retry.py", line 25, in <module>
r = s.get('http://httpbin.org/status/500')
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\sessions.py", line 600, in get
return self.request("GET", url, **kwargs)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "C:\Users\user\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\requests\adapters.py", line 556, in send
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: /status/500 (Caused by ResponseError('too many 500 error responses'))
My expectation was to get the exception you would get if you ran this code:
import requests
response = requests.get('http://httpbin.org/status/500')
response.raise_for_status()
(Since making a normal get request doesn't raise an exception I would use response.raise_for_status() to raise the expected exception.)
Doing that you'll get this small and very nice exception (which I'm looking for):
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\requests\models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: http://httpbin.org/status/500
Could someone help me make sense out of these exceptions and get normal ones?
The link you provided as the working solution itself explains this: you are getting a MaxRetryError, which, as the message shows, was (Caused by ResponseError('too many 500 error responses')).
As it shows, you still need to catch the exception and handle it as you wish:
try:
    response = requests_retry_session().get(
        'http://httpbin.org/status/500',
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)  # here comes your code
else:
    print('It eventually worked', response.status_code)
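If what you want is the small requests.exceptions.HTTPError from raise_for_status() rather than the wrapped RetryError, one option (a sketch, assuming a reasonably recent urllib3) is to build the Retry with raise_on_status=False. The session then returns the final 500 response after the retries are exhausted instead of raising MaxRetryError, and you can call raise_for_status() on it yourself:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    raise_on_status=False,  # hand back the last response instead of raising
)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

response = session.get('http://httpbin.org/status/500')
response.raise_for_status()  # raises requests.exceptions.HTTPError: 500 Server Error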

requests.get crashes on certain urls

import requests
r = requests.get('https://www.whosampled.com/search/?q=marvin+gaye')
This returns the following error:
Traceback (most recent call last):
File "C:\Users\thoma\Downloads\RealisticMellowProfile\Python\New folder\Term project demo.py", line 8, in <module>
r = requests.get('https://www.whosampled.com/search/?q=marvin+gaye')
File "c:\users\thoma\miniconda3\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "c:\users\thoma\miniconda3\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "c:\users\thoma\miniconda3\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "c:\users\thoma\miniconda3\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "c:\users\thoma\miniconda3\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
You can change the user agent so the server does not close the connection:
import requests
headers = {"User-Agent": "Mozilla/5.0"}
r = requests.get('https://www.whosampled.com/search/?q=marvin+gaye', headers=headers)
Either the URL is broken or the server serving it is.
Try to get it with
wget https://www.whosampled.com/search/?q=marvin+gaye
or with
curl https://www.whosampled.com/search/?q=marvin+gaye
Use try / except to handle such situations.
However, you won't be able to get data from it (the same as with wget or curl):
import requests

try:
    r = requests.get('https://www.whosampled.com/search/?q=marvin+gaye')
except requests.exceptions.ConnectionError:
    print("can't get data from this server")
    r = None

if r is not None:
    pass  # handle successful request
else:
    pass  # handle error situation

requests.exceptions.SSLError - Failing to use python module

I am using the Python module udemy-dl, which I installed via pypi.org/project/udemy-dl. When I run the script, I keep getting an SSL error. I have looked through many questions on Stack Overflow and none of them seem to work. I get the following in my terminal:
[INFO-835] Downloading to: /Users/dev/the-complete-python-web-course-learn-by-building-8-apps
[INFO-107] Trying to log in ...
Traceback (most recent call last):
File "/Users/dev/homebrew/Cellar/python#2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/Users/dev/homebrew/Cellar/python#2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/Users/dev/homebrew/lib/python2.7/site-packages/udemy_dl/dev.py", line 8, in <module>
main()
File "/Users/dev/homebrew/lib/python2.7/site-packages/udemy_dl/udemy_dl.py", line 837, in main
udemy_dl(username, password, link, lecture_start, lecture_end, save_links, safe_file_names, just_list, output_dest)
File "/Users/dev/homebrew/lib/python2.7/site-packages/udemy_dl/udemy_dl.py", line 658, in udemy_dl
login(username, password)
File "/Users/dev/homebrew/lib/python2.7/site-packages/udemy_dl/udemy_dl.py", line 109, in login
csrf_token = get_csrf_token()
File "/Users/dev/homebrew/lib/python2.7/site-packages/udemy_dl/udemy_dl.py", line 95, in get_csrf_token
response = session.get('https://www.udemy.com/join/login-popup')
File "/Users/dev/homebrew/lib/python2.7/site-packages/udemy_dl/udemy_dl.py", line 66, in get
return self.session.get(url, headers=self.headers)
File "/Users/dev/homebrew/lib/python2.7/site-packages/requests/sessions.py", line 488, in get
return self.request('GET', url, **kwargs)
File "/Users/dev/homebrew/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
resp = self.send(prep, **send_kwargs)
File "/Users/dev/homebrew/lib/python2.7/site-packages/requests/sessions.py", line 596, in send
r = adapter.send(request, **kwargs)
File "/Users/dev*emphasized text*/homebrew/lib/python2.7/site-packages/requests/adapters.py", line 497, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:726)
I see that adapters.py is where the exception is raised:
def send(self, request, stream=False, timeout=None, verify=True, cert=None, proxies=None):
    """Sends PreparedRequest object. Returns Response object.

    :param request: The :class:`PreparedRequest <PreparedRequest>` being sent.
    :param stream: (optional) Whether to stream the request content.
    :param timeout: (optional) How long to wait for the server to send
        data before giving up, as a float, or a :ref:`(connect timeout,
        read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param verify: (optional) Whether to verify SSL certificates.
    :param cert: (optional) Any user-provided SSL certificate to be trusted.
    :param proxies: (optional) The proxies dictionary to apply to the request.
    :rtype: requests.Response
    """
    conn = self.get_connection(request.url, proxies)

    self.cert_verify(conn, request.url, verify, cert)
    url = self.request_url(request, proxies)
    self.add_headers(request)

    chunked = not (request.body is None or 'Content-Length' in request.headers)

    if isinstance(timeout, tuple):
        try:
            connect, read = timeout
            timeout = TimeoutSauce(connect=connect, read=read)
        except ValueError as e:
            # this may raise a string formatting error.
            err = ("Invalid timeout {0}. Pass a (connect, read) "
                   "timeout tuple, or a single float to set "
                   "both timeouts to the same value".format(timeout))
            raise ValueError(err)
    else:
        timeout = TimeoutSauce(connect=timeout, read=timeout)

    try:
        if not chunked:
            resp = conn.urlopen(
                method=request.method,
                url=url,
                body=request.body,
                headers=request.headers,
                redirect=False,
                assert_same_host=False,
                preload_content=False,
                decode_content=False,
                retries=self.max_retries,
                timeout=timeout
            )

        # Send the request.
        else:
            if hasattr(conn, 'proxy_pool'):
                conn = conn.proxy_pool

            low_conn = conn._get_conn(timeout=DEFAULT_POOL_TIMEOUT)

            try:
                low_conn.putrequest(request.method,
                                    url,
                                    skip_accept_encoding=True)

                for header, value in request.headers.items():
                    low_conn.putheader(header, value)

                low_conn.endheaders()

                for i in request.body:
                    low_conn.send(hex(len(i))[2:].encode('utf-8'))
                    low_conn.send(b'\r\n')
                    low_conn.send(i)
                    low_conn.send(b'\r\n')
                low_conn.send(b'0\r\n\r\n')

                # Receive the response from the server
                try:
                    # For Python 2.7+ versions, use buffering of HTTP
                    # responses
                    r = low_conn.getresponse(buffering=True)
                except TypeError:
                    # For compatibility with Python 2.6 versions and back
                    r = low_conn.getresponse()

                resp = HTTPResponse.from_httplib(
                    r,
                    pool=conn,
                    connection=low_conn,
                    preload_content=False,
                    decode_content=False
                )
            except:
                # If we hit any problems here, clean up the connection.
                # Then, reraise so that we can handle the actual exception.
                low_conn.close()
                raise

    except (ProtocolError, socket.error) as err:
        raise ConnectionError(err, request=request)

    except MaxRetryError as e:
        if isinstance(e.reason, ConnectTimeoutError):
            # TODO: Remove this in 3.0.0: see #2811
            if not isinstance(e.reason, NewConnectionError):
                raise ConnectTimeout(e, request=request)

        if isinstance(e.reason, ResponseError):
            raise RetryError(e, request=request)

        if isinstance(e.reason, _ProxyError):
            raise ProxyError(e, request=request)

        raise ConnectionError(e, request=request)

    except ClosedPoolError as e:
        raise ConnectionError(e, request=request)

    except _ProxyError as e:
        raise ProxyError(e)

    except (_SSLError, _HTTPError) as e:
        if isinstance(e, _SSLError):
            raise SSLError(e, request=request)
The script verifies the site's SSL certificate and only allows the connection when verification succeeds. Possible workarounds:
1.) Download the certificate bundle for the website and pass its path to the requests call as verify='path/to/ssl/certificate/', or
2.) find the requests call in the script and set verify=False.
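As a rough sketch of both workarounds (the certificate path is a placeholder, and verify=False disables certificate checking entirely, so treat it as a last resort):

import requests
import urllib3  # on older requests versions: requests.packages.urllib3

url = 'https://www.udemy.com/join/login-popup'

# Workaround 1: pass the path to the site's CA bundle (placeholder path)
response = requests.get(url, verify='path/to/ssl/certificate.pem')

# Workaround 2: disable verification and silence the InsecureRequestWarning
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
response = requests.get(url, verify=False)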

Webscraping with Python3.7: ConnectionError: HTTPSConnectionPool(host='www.google.com', port=443):

I want to scrape web results from google.com. I followed the first answer from this question, Google Search Web Scraping with Python. Unfortunately I am getting a connection error. I happened to check with other websites too, and they don't connect either. Is it because of the corporate proxy settings?
Please note that I am using the virtual env "Webscraping".
from urllib.parse import urlencode, urlparse, parse_qs
from lxml.html import fromstring
from requests import get

raw = get("https://www.google.com/search?q=StackOverflow").text
page = fromstring(raw)

for result in page.cssselect(".r a"):
    url = result.get("href")
    if url.startswith("/url?"):
        url = parse_qs(urlparse(url).query)['q']
        print(url[0])
raw = get("https://www.google.com/search?q=StackOverflow").text
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
raw = get("https://www.google.com/search?q=StackOverflow").text
File "c:\users\appdata\local\programs\python\python37\webscraping\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "c:\users\appdata\local\programs\python\python37\webscraping\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "c:\users\appdata\local\programs\python\python37\webscraping\lib\site-packages\requests\sessions.py", line 524, in request
resp = self.send(prep, **send_kwargs)
File "c:\users\appdata\local\programs\python\python37\webscraping\lib\site-packages\requests\sessions.py", line 637, in send
r = adapter.send(request, **kwargs)
File "c:\users\appdata\local\programs\python\python37\webscraping\lib\site-packages\requests\adapters.py", line 516, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: /search?q=StackOverflow (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x0000021B79768748>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
Please advise. Thanks
EDIT: I tried pinging google.com; it fails.
import os

hostname = "https://www.google.com"  # example
response = os.system("ping -c 1 " + hostname)

# and then check the response...
if response == 0:
    print(hostname, 'is up!')
else:
    print(hostname, 'is down!')
https://www.google.com is down!
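Note that this ping test is misleading in itself: ping expects a bare hostname rather than a URL with an https:// scheme, and on Windows the count flag is -n, not the Unix -c. A corrected check would look like:

import os

hostname = "www.google.com"  # bare hostname, no https:// scheme
response = os.system("ping -n 1 " + hostname)  # -n on Windows, -c on Unix

if response == 0:
    print(hostname, 'is up!')
else:
    print(hostname, 'is down!')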
I think you are getting this error because of your proxy settings.
Try running one of the following commands in the command prompt:
set http_proxy=http://proxy_address:port
set http_proxy=http://user:password@proxy_address:port
set https_proxy=https://proxy_address:port
set https_proxy=https://user:password@proxy_address:port
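requests honours the http_proxy/https_proxy environment variables automatically; you can also pass the proxy explicitly per request. A sketch with placeholder proxy details:

import requests

# placeholder corporate proxy; substitute the real address, port and credentials
proxies = {
    'http': 'http://user:password@proxy_address:port',
    'https': 'http://user:password@proxy_address:port',
}

raw = requests.get('https://www.google.com/search?q=StackOverflow',
                   proxies=proxies).text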

What does this python exception mean

I see the following exception sometimes when I try to hit my home page.
ERROR:root:HTTPConnectionPool(host='0.0.0.0', port=8003): Max retries exceeded with url:
/snapshot/?app=cdnstats&key=28736ba5fbe151d5ff6678015c8f6ade (Caused by <class 'socket.error'>:
[Errno 61] Connection refused)
Traceback (most recent call last):
File "/Users/rokumar/CDNStats/cdnstats/app/core/views.py", line 257, in get_snapshot_data
data = templates.render_snapshot(controllers.get_snapshot_data())
File "/Users/rokumar/CDNStats/cdnstats/util.py", line 260, in decorated
expiry, mem_args, func, args, kwargs)
File "/Users/rokumar/CDNStats/cdnstats/util.py", line 227, in get_data_from_meminstance
data = func(*args, **kwargs)
File "/Users/rokumar/CDNStats/cdnstats/app/core/controllers.py", line 255, in get_snapshot_data
return util.call_get_api(config.CDNSTATS_API_URL + 'snapshot/?', data)
File "/Users/rokumar/CDNStats/cdnstats/util.py", line 123, in call_get_api
raise ex
The following is the piece of code generating the exception.
def call_get_api(url, data):
    try:
        data = data.copy()
        data['key'] = request.args.get('key')
        data['app'] = config.APPNAME
        query = soft_urlencode(data)
        response = requests.get(url + query)
        if response.status_code == 200:
            return response.json()
        else:
            apiexception = APIException(response.content)
            apiexception.status_code = response.status_code
            raise apiexception
    except UnicodeEncodeError as ex:
        print ex
        raise ex
    except Exception as ex:
        raise ex
I see the exception intermittently and my app slows down heavily. I don't really understand the exception or what is wrong. The exception says max retries exceeded, but I do not have any retry logic going on.
In urlopen, try setting retries=False or retries=1. The default is 3, so that is probably the retry logic going on there.
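For what it's worth, the code above goes through requests.get rather than calling urlopen directly, and requests' own default is zero retries; the "Max retries exceeded" wording comes from urllib3 even when nothing is retried. If you want explicit control anyway, a sketch (not the original app's code) that pins retries to zero and handles the connection refusal:

import requests
from requests.adapters import HTTPAdapter

session = requests.Session()
session.mount('http://', HTTPAdapter(max_retries=0))  # no retries at all

try:
    response = session.get('http://0.0.0.0:8003/snapshot/')
except requests.exceptions.ConnectionError as ex:
    # raised when the server refuses the connection ([Errno 61] above)
    print('API is unreachable: %s' % ex)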
