python requests/urllib3 connection pooling not catching HTTP errors

Python requests (urllib3) with connection pooling is not catching HTTP errors. Is this a bug, or am I doing something wrong?
#!/usr/bin/env python
import contextlib
import requests
import sys

connection_pool_size = 2
adapter = requests.adapters.HTTPAdapter(pool_connections=connection_pool_size,
                                        pool_maxsize=connection_pool_size)
r_session = requests.Session()
r_session.mount('http', adapter)
try:
    with contextlib.closing(r_session.get(sys.argv[1], timeout=30, allow_redirects=True)) as r:
        print 'success %r' % r
except requests.exceptions.HTTPError as e:
    print 'HTTPError %r' % e
except Exception as e:
    print 'Exception %r' % e
output:
$ ./test.py https://github.com
success <Response [200]>
$ ./test.py https://github.com/sithlordyoyoma
success <Response [404]>
I was expecting HTTPError. Am I doing something wrong?
The contextlib closing pattern I took from this thread: should I call close() after urllib.urlopen()?, as suggested by Alex Martelli.
Actually, running requests without connection pooling also shows this behaviour:
#!/usr/bin/env python
import contextlib
import requests
import sys

try:
    with contextlib.closing(requests.get(sys.argv[1], timeout=30, allow_redirects=True)) as r:
        print 'success %r' % r
except requests.exceptions.HTTPError as e:
    print 'HTTPError %r' % e
except Exception as e:
    print 'Exception %r' % e
output:
$ ./test.py https://github.com
success <Response [200]>
$ ./test.py https://github.com/sithlordyoyoma
success <Response [404]>
urllib2 does this correctly:
#!/usr/bin/env python
import contextlib
import urllib2
import sys

try:
    with contextlib.closing(urllib2.urlopen(sys.argv[1], timeout=30)) as r:
        print 'success %r' % r
except urllib2.HTTPError as e:
    print 'HTTPError %r' % e
except Exception as e:
    print 'Exception %r' % e
output:
$ ./test.py https://github.com
success <addinfourl at 4338734792 whose fp = <socket._fileobject object at 0x1025a5c50>>
$ ./test.py https://github.com/sithlordyoyoma
HTTPError HTTPError()

Regardless of connection pooling, requests.post (and the other HTTP methods) does not raise HTTPError on a 404 response. HTTPError is raised only when you call .raise_for_status(), as this example demonstrates:
#!/usr/bin/env python
import requests

r = requests.post(
    'https://github.com/sithlordyoyoma',
    timeout=30,
    allow_redirects=True
)
print 'success %r' % r
r.raise_for_status()
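If you want the behaviour the question was expecting, a minimal sketch (keeping the sys.argv URL handling from the question) is to call raise_for_status() inside the try block, mirroring the original handlers:
#!/usr/bin/env python
import requests
import sys

try:
    r = requests.get(sys.argv[1], timeout=30, allow_redirects=True)
    # raise_for_status() raises requests.exceptions.HTTPError for 4xx/5xx responses
    r.raise_for_status()
    print('success %r' % r)
except requests.exceptions.HTTPError as e:
    print('HTTPError %r' % e)
except requests.exceptions.RequestException as e:
    print('Exception %r' % e)
With the 404 URL above, this takes the HTTPError branch instead of printing success.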

Related

Create a Connection Timeout using urllib2.urlopen()

I want to create a connection timeout exception using urlopen. This is the code:
try:
    urllib2.urlopen("http://example.com", timeout=5)
except urllib2.URLError, e:
    raise MyException("There was an error: %r" % e)
I want to force a timeout so that this code raises the exception.
Thank you in advance.
You need to catch the socket.timeout exception; see the example below.
import urllib2
import socket

class MyException(Exception):
    pass

try:
    urllib2.urlopen("http://example.com", timeout=1)
except socket.timeout, e:
    # For Python 2.7
    raise MyException("There was an error: %r" % e)
I strongly recommend using the Requests library for making requests; it will make your life easier.
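For completeness, a minimal sketch of the same timeout check done with Requests (MyException is just the exception class from the question):
import requests

class MyException(Exception):
    pass

try:
    requests.get("http://example.com", timeout=1)
except requests.exceptions.Timeout as e:
    # requests wraps both connect and read timeouts in requests.exceptions.Timeout
    raise MyException("There was an error: %r" % e)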

python capture URLError code

I want to use Python to monitor a website that uses HTTPS.
The problem is that the certificate on the website is invalid.
I don't care about that, I just want to know that the website is running.
My working code looks like this:
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("https://somedomain.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print('server couldn\'t fulfill the request')
    print('error code: ', e.code)
except URLError as e:
    print(e.args)
else:
    print('website ok')
That ends in URLError being raised. The error code is 645:
C:\python>python monitor443.py
(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)'),)
So I'm trying to treat error code 645 as OK. I've tried this:
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("https://somedomain.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print('server couldn\'t fulfill the request')
    print('error code: ', e.code)
except URLError as e:
    if e.code == 645:
        print("ok")
    print(e.args)
else:
    print('website ok')
but I get this error:
Traceback (most recent call last):
  File "monitor443.py", line 11, in <module>
    if e.code == 645:
AttributeError: 'URLError' object has no attribute 'code'
How do I handle this exception?
Please have a look at the great requests package. It will simplify your life when doing HTTP communication. See http://requests.readthedocs.io/en/master/.
pip install requests
To skip the certificate check, you would do something like this (note the verify parameter!):
>>> requests.get('https://kennethreitz.com', verify=False)
<Response [200]>
See the full documentation here.
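One caveat, not covered in the thread: with verify=False, recent urllib3 versions log an InsecureRequestWarning for every request. Assuming the standalone urllib3 package is available, it can be silenced like this:
import urllib3

# suppress the warning urllib3 emits for unverified HTTPS requests
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)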
HTH
I couldn't install the SSL library (egg_info error).
This is what I ended up doing:
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

def sendEmail(r):
    # send notification
    print('send notify')

req = Request("https://somedomain.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print('server couldn\'t fulfill the request')
    print('error code: ', e.code)
    sendEmail('server couldn\'t fulfill the request')
except URLError as e:
    theReason = str(e.reason)
    # [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)
    if theReason.find('CERTIFICATE_VERIFY_FAILED') == -1:
        sendEmail(theReason)
    else:
        print('website ok')
else:
    print('website ok')
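An alternative sketch, not from the original thread: if the goal is simply to ignore the invalid certificate, urlopen accepts a context parameter (Python 3.4.3+), so an unverified SSL context avoids the CERTIFICATE_VERIFY_FAILED error entirely:
import ssl
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("https://somedomain.com")
# an unverified context skips certificate checking, so an invalid cert no longer raises URLError
ctx = ssl._create_unverified_context()
try:
    response = urlopen(req, context=ctx)
except HTTPError as e:
    print('server couldn\'t fulfill the request, error code:', e.code)
except URLError as e:
    print('failed to reach the server:', e.reason)
else:
    print('website ok')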

I use Jython 2.7, and when sending requests an error sometimes occurs: "Got this failure java.lang.NullPointerException during SSL handshake".

My code:
# trust_all_certificates
def go_url(self, url, data=None, headers={}):
    global response
    request = urllib2.Request(url, data, headers)
    request.add_header('Authorization', 'Basic %s' % self.AdminAuthBase64)
    for x in xrange(3):
        try:
            response = urllib2.urlopen(request)
            break
        except IOError, e:
            if hasattr(e, 'reason'):
                API.log('We failed to reach a server. Reason: ' + str(e.reason))
            elif hasattr(e, 'code'):
                API.log('The server couldn\'t fulfill the request. Error code: ' + str(e.code))
            sleep(30)
    else:
        API.halt('Can not send an request to server')
Exception:
We failed to reach a server.
Reason: [Errno 1] Unmapped exception: java.lang.NullPointerException
Got this failure java.lang.NullPointerException during SSL handshake (<_realsocket at 0x9d type=client open_count=1 channel=[id: 0xda3c990d, /my ip => to ip] timeout=60.0>)
The most interesting part is that this is not constant, and it only happens on Jython 2.7; on Jython 2.5 it works correctly.

Checking for Timeout Error in python

So I have a pretty generic logging statement after a request:
try:
    r = requests.get(testUrl, timeout=10.0)
except Exception, err:
    logger.error({"message": err.message})
This works great for everything I've thrown at it except TimeoutError. When the request times out, the err I get back is a tuple that it tries and fails to serialize.
My question is how do I catch just this one type of error? For starters TimeoutError is not something I have access to. I have tried adding from exceptions import * but with no luck. I've also tried importing OSError because the docs say TimeoutError is a subclass, but I was unable to access TimeoutError after importing OSError.
TimeoutError docs
I plan to either list my exceptions in order:
except TimeoutError, err:
    # handle this specific error
except Exception, err:
    # handle all other errors
or just check for type:
except Exception, err:
    if isinstance(err, TimeoutError):
        # handle specific error
    # handle all other errors
Python 2.7.3 & Django 1.5
You can catch the requests.Timeout exception:
try:
    r = requests.get(testUrl, timeout=10.0)
except requests.Timeout as err:
    logger.error({"message": err.message})
except requests.RequestException as err:
    # handle other errors
    pass
Example:
>>> import requests
>>> url = "http://httpbin.org/delay/2"
>>> try:
... r = requests.get(url, timeout=1)
... except requests.Timeout as err:
... print(err.message)
...
HTTPConnectionPool(host='httpbin.org', port=80): Read timed out. (read timeout=1)
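If you need to distinguish connect timeouts from read timeouts, requests also provides more specific subclasses of Timeout in requests.exceptions; a small sketch, reusing the httpbin URL from above:
import requests

try:
    r = requests.get("http://httpbin.org/delay/2", timeout=1)
except requests.exceptions.ConnectTimeout as err:
    # the TCP connection could not be established within the timeout
    print("connect timeout: %s" % err)
except requests.exceptions.ReadTimeout as err:
    # connected, but the server did not send a response within the timeout
    print("read timeout: %s" % err)
except requests.exceptions.RequestException as err:
    print("other requests error: %s" % err)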

urllib2.urlopen(z).read() - try for x seconds then move to the next item

I have a list of x websites from which I want to scrape data.
Code:
import urllib2
from urllib2 import Request, urlopen, HTTPError, URLError

def checkurl(z):
    print urllib2.urlopen('http://'+z).read()

for x in t2w:  # t2w is my list
    print x
    checkurl(x)
    print "\n"
As of now, the whole process stops as soon as a website is unavailable. What can I do to make urllib2 try for x seconds, report an error such as "website not available", and then move on to the next item in the list?
Maybe I should have mentioned that this is for .onion sites.
import socks
import socket

def create_connection(address, timeout=None, source_address=None):
    sock = socks.socksocket()
    sock.connect(address)
    return sock

socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 9150)
socket.socket = socks.socksocket
socket.create_connection = create_connection

#####

import urllib2
from urllib2 import Request, urlopen, HTTPError, URLError

def checkurl(z):
    try:
        urllib2.urlopen("http://"+z, timeout=1).read()
    except urllib2.URLError, e:
        raise MyException("Error raised: %r" % e)
    #print urllib2.urlopen('http://'+z).read()
You can use the timeout parameter.
try:
    urllib2.urlopen("http://example.com", timeout=1)
except urllib2.URLError, e:
    raise MyException("Error raised: %r" % e)
From the docs:
The optional timeout parameter specifies a timeout in seconds for
blocking operations like the connection attempt (if not specified, the
global default timeout setting will be used). This actually only works
for HTTP, HTTPS and FTP connections.
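Putting it together with the loop from the question, a minimal sketch: catch the error inside checkurl, print a message, and let the loop continue. The t2w hostnames here are placeholders, and socket.timeout is caught alongside URLError because read timeouts are not always wrapped in URLError:
import socket
import urllib2

t2w = ["example1.onion", "example2.onion"]  # hypothetical list of hosts

def checkurl(z):
    try:
        return urllib2.urlopen("http://" + z, timeout=5).read()
    except (urllib2.URLError, socket.timeout) as e:
        print "website not available: %s (%r)" % (z, e)
        return None

for x in t2w:
    print x
    checkurl(x)  # on error this prints a message and the loop moves on
    print "\n"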
