Check if the internet cannot be accessed in Python

I have an app that makes an HTTP GET request to a particular URL on the internet. But when the network is down (say, the public wifi is unavailable, or my ISP is down, or some such thing), I get the following traceback at urllib2.urlopen:
70, in get
u = urllib2.urlopen(req)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 391, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 409, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1161, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/urllib2.py", line 1136, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>
I want to print a friendly error to the user telling him that his network may be down, instead of this unfriendly "nodename nor servname provided" error message. Sure, I can catch URLError, but that would catch every URL error, not just those related to network downtime.
I am not a purist, so even an error message like "The server example.com cannot be reached; either the server is indeed having problems or your network connection is down" would be nice. How do I go about selectively catching such errors? (For a start, if DNS resolution fails at urllib2.urlopen, can that reasonably be taken to mean the network is inaccessible? If so, how do I "catch" it in the except block?)

You should wrap the request in a try/except statement so that you catch the failure and can then inform the user.
import urllib2
from urllib2 import HTTPError, URLError

try:
    u = urllib2.urlopen(req)
except HTTPError as e:
    # inform them of the specific error here (based on the error code)
    print "The server could not fulfill the request (HTTP %s)." % e.code
except URLError as e:
    # inform them of the specific error here
    print "Failed to reach the server: %s" % e.reason
except Exception as e:
    # inform them that a general error has occurred
    print "An unexpected error occurred: %s" % e

urllib2 - The Missing Manual has a good section on how to handle URLError and HTTPError exceptions and how to differentiate the conditions that caused them.

How about catching URLError, then testing the reason attribute? If the reason isn't one you're interested in, re-raise the URLError and handle it somewhere else.
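A minimal sketch of that approach, assuming (as the question suggests) that a DNS failure is a reasonable stand-in for "network down"; the fetch helper and the message text are illustrative, not part of the original answer:
import socket
import urllib2

def fetch(url):
    try:
        return urllib2.urlopen(url)
    except urllib2.HTTPError:
        # the server answered, so the network is up; let the caller handle the HTTP status
        raise
    except urllib2.URLError as e:
        # DNS lookup failures surface as a socket.gaierror in e.reason
        if isinstance(e.reason, socket.gaierror):
            print "The server cannot be reached; either it is having problems or your network connection is down."
            return None
        # any other reason isn't one we're interested in; re-raise it
        raise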
Alternatively, you could try httplib2. Its ServerNotFoundError exception would probably suit your needs.
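For completeness, a rough sketch of the httplib2 route (the helper name and message are placeholders, not from the original answer):
import httplib2

def fetch_with_httplib2(url):
    try:
        # httplib2.Http.request returns a (response, content) tuple; GET is the default method
        response, content = httplib2.Http().request(url)
        return content
    except httplib2.ServerNotFoundError:
        # raised when the host name cannot be resolved
        print "The server cannot be reached; either it is having problems or your network connection is down."
        return None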

Related

Python Urllib.urlopen:IOError: [Errno socket error] [Errno 10060] [duplicate]

OS: Windows 7; Python 2.7.3 using the Python GUI Shell
I'm trying to read a website through Python, and several authors use the urllib and urllib2 libraries. To store the site in a variable, I've seen an approach like the following proposed:
import urllib
import urllib2
g = "http://www.google.com/"
read = urllib2.urlopen(g)
The last line generates an error after 120+ seconds:
Traceback (most recent call last):
File "<pyshell#27>", line 1, in <module>
r = urllib2.urlopen(o)
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 418, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python27\lib\urllib2.py", line 1177, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>
I tried bypassing the g variable and calling urlopen("http://www.google.com/") directly, with no success either (it generates the same error after the same length of time).
Error code 10060 means the connection attempt to the remote host timed out. It might be a network problem, but it is more often a configuration issue on your side, such as a proxy setting.
You could try connecting to the same host with other tools (such as ncat) and/or from another PC on the same local network, to find out where the problem is occurring.
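If you want to do that same kind of reachability check from within Python rather than with an external tool, a rough sketch (the host, port, and timeout are arbitrary placeholders, not from the original answer):
import socket

def can_connect(host, port, timeout=5):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        # bypasses urllib2 entirely, so proxy settings are not involved
        sock = socket.create_connection((host, port), timeout)
        sock.close()
        return True
    except (socket.timeout, socket.error):
        return False

print can_connect("www.google.com", 80)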
For proxy issues, there is some material here:
Using an HTTP PROXY - Python
Why can't I get Python's urlopen() method to work on Windows?
Hope it helps!
Answer (Basic is advance!):
Error: 10060
Adding a timeout parameter to the request solved the issue for me.
Example 1
import urllib
import urllib2
g = "http://www.google.com/"
read = urllib2.urlopen(g, timeout=20)
Example 2
A similar error also occurred while I was making a GET request with the requests library. Again, passing a timeout parameter solved the 10060 error.
import requests

response = requests.get(param_url, timeout=20)
This is because of the proxy settings.
I also had the same problem, where I could not use any of the modules that fetch data from the internet.
There are simple steps to follow:
1. Open the Control Panel.
2. Open Internet Options.
3. Under the Connections tab, open LAN settings.
4. Go to advanced settings and uncheck everything there, deleting every proxy entry. Alternatively, just uncheck the proxy server checkbox, which has the same effect.
5. Save all the settings by clicking OK.
You are done. Try running the program again; it should work. It did for me, at least.
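If you would rather not change the system-wide setting, you can also tell urllib2 to ignore any configured proxy for a single opener (a sketch, not part of the original answer; the URL and timeout are placeholders):
import urllib2

# an empty ProxyHandler disables proxy detection for this opener only
opener = urllib2.build_opener(urllib2.ProxyHandler({}))
response = opener.open("http://www.google.com/", timeout=20)
print response.getcode()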
Just switching to a different internet connection can also make it work.

Python - Trying to write a script that tests an http interface

I'm still getting to grips with the Python basics.
My current requirement is to develop a Python script that will test the availability of the web-based interfaces of multiple devices (e.g. where you might enter "http://192.168.0.2:9876" in a web browser); it does not have to be over-complicated.
I'm trying to convert from a simple bash curl command; originally I had something like the following in a bash script:
date=`date +"%Y-%m-%d_%H-%M-%S-%N"`
curl -s --connect-timeout 1 ${ip} -o /dev/null
test=$?
if [[ $test == 0 ]] ;then
    echo "${date}:webping - Web Page Up for ${ip}" >> $log
else
    echo "${date}:webping - Web Page Down for ${ip}" >> $log
fi
This worked for the original concept, but I was looking to have something similar in Python. The output can vary, within reason... does anyone have any pointers on where to start?
P.S. I have looked at some other questions on here, but they appear to give false positives where the interface has been "taken down" (i.e. I stopped the service) and they still report a status code of 200.
EDIT: Below is the code I have tried.
import urllib2

for url in ["http://www.google.co.uk", "http://192.168.0.2:8000"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print "none"
CORRECTION: I get the following results...
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 391, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 409, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1173, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python27\lib\urllib2.py", line 1148, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 10061] No connection could be made because the target machine actively refused it>
I would prefer not to see the python error output.
Thanks in advance
Take a look at http://docs.python-requests.org/en/latest/index.html for a Python module providing the facilities you need with a nice friendly API. In this instance you'd do something along these lines:
import requests
...
try:
    r = requests.get(url, timeout=1)
    ok = (r.status_code // 100) == 2
except:
    ok = False
# now use the value of ok
though I don't know whether the particular test I've used there (success means a 2xx response) is exactly what you want.
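Tying that back to the original bash script, a rough sketch of the full loop (the device URLs and log path are placeholders, and the 2xx test carries the same caveat as above):
import datetime
import requests

urls = ["http://www.google.co.uk", "http://192.168.0.2:8000"]  # placeholder device URLs
log_path = "webping.log"                                       # placeholder log file

with open(log_path, "a") as log:
    for url in urls:
        stamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S-%f")
        try:
            r = requests.get(url, timeout=1)
            ok = (r.status_code // 100) == 2
        except requests.exceptions.RequestException:
            ok = False
        state = "Up" if ok else "Down"
        log.write("%s:webping - Web Page %s for %s\n" % (stamp, state, url))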

Python socket error resilience / workaround

I have a script running that is testing a series of urls for availability.
This is one of the functions.
import httplib
from urlparse import urlparse

def checkUrl(url): # Only downloads headers, returns status code.
    p = urlparse(url)
    conn = httplib.HTTPConnection(p.netloc)
    conn.request('HEAD', p.path)
    resp = conn.getresponse()
    return resp.status
Occasionally the VPS will lose connectivity, and the entire script crashes when that occurs.
File "/usr/lib/python2.6/httplib.py", line 914, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.6/httplib.py", line 951, in _send_request
self.endheaders()
File "/usr/lib/python2.6/httplib.py", line 908, in endheaders
self._send_output()
File "/usr/lib/python2.6/httplib.py", line 780, in _send_output
self.send(msg)
File "/usr/lib/python2.6/httplib.py", line 739, in send
self.connect()
File "/usr/lib/python2.6/httplib.py", line 720, in connect
self.timeout)
File "/usr/lib/python2.6/socket.py", line 561, in create_connection
raise error, msg
socket.error: [Errno 101] Network is unreachable
I'm not at all familiar with handling errors like this in python.
What is the appropriate way to keep the script from crashing when network connectivity is temporarily lost?
Edit:
I ended up with this - feedback?
def checkUrl(url): # Only downloads headers, returns status code.
    try:
        p = urlparse(url)
        conn = httplib.HTTPConnection(p.netloc)
        conn.request('HEAD', p.path)
        resp = conn.getresponse()
        return resp.status
    except IOError, e:
        if e.errno == 101:
            print "Network Error"
            time.sleep(1)
            checkUrl(url)
    else:
        raise
I'm not sure I fully understand what raise does there, though.
If you just want to handle this "Network is unreachable" error (errno 101) and let other exceptions propagate, you can do the following, for example.
from errno import ENETUNREACH

try:
    # tricky code goes here
except IOError as e:
    # an IOError exception occurred (socket.error is a subclass)
    if e.errno == ENETUNREACH:
        # now we had the error code 101, network unreachable
        do_some_recovery
    else:
        # other exceptions we reraise again
        raise
The problem with your solution as it stands is that you're going to run out of stack space if there are too many errors on a single URL (more than 1000 by default) due to the recursion. Also, the extra stack frames could make tracebacks hard to read (500 calls to checkUrl). I'd rewrite it to be iterative, like so:
def checkUrl(url): # Only downloads headers, returns status code.
    while True:
        try:
            p = urlparse(url)
            conn = httplib.HTTPConnection(p.netloc)
            conn.request('HEAD', p.path)
            resp = conn.getresponse()
            return resp.status
        except IOError as e:
            if e.errno == 101:
                print "Network Error"
                time.sleep(1)
        except:
            raise
Also, you want the last clause in your try to be a bare except not an else. Your else only gets executed if control falls through the try suite, which can never happen, since the last statement of the try suite is return.
This is very easy to change to allow a limited number of retries. Just change the while True: line to for _ in xrange(5) or however many retries you wish to accept. The function will then return None if it can't connect to the site after 5 attempts. You can have it return something else or raise an exception by adding return or raise SomeException at the very end of the function (indented the same as the for or while line).
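For concreteness, a sketch of that bounded-retry variant: the same code as above with only the changes the paragraph describes (five attempts and the trailing return None are the example choices mentioned there; imports are added so the snippet stands alone):
import httplib
import time
from urlparse import urlparse

def checkUrl(url): # Only downloads headers; returns status code, or None after 5 failed attempts.
    for _ in xrange(5):
        try:
            p = urlparse(url)
            conn = httplib.HTTPConnection(p.netloc)
            conn.request('HEAD', p.path)
            resp = conn.getresponse()
            return resp.status
        except IOError as e:
            if e.errno == 101:
                print "Network Error"
                time.sleep(1)
        except:
            raise
    return None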
Put try...except around your code to catch exceptions.
http://docs.python.org/tutorial/errors.html

Python urllib2 URLError exception?

I installed Python 2.6.2 earlier on a Windows XP machine and ran the following code:
import urllib2
import urllib
page = urllib2.Request('http://www.python.org/fish.html')
urllib2.urlopen( page )
I get the following error:
Traceback (most recent call last):
File "C:\Python26\test3.py", line 6, in <module>
urllib2.urlopen( page )
File "C:\Python26\lib\urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 383, in open
response = self._open(req, data)
File "C:\Python26\lib\urllib2.py", line 401, in _open
'_open', req)
File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 1130, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python26\lib\urllib2.py", line 1105, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 11001] getaddrinfo failed>
import urllib2
response = urllib2.urlopen('http://www.python.org/fish.html')
html = response.read()
You're doing it wrong.
Have a look in the urllib2 source, at the line specified by the traceback:
File "C:\Python26\lib\urllib2.py", line 1105, in do_open
raise URLError(err)
There you'll see the following fragment:
try:
    h.request(req.get_method(), req.get_selector(), req.data, headers)
    r = h.getresponse()
except socket.error, err: # XXX what error?
    raise URLError(err)
So, it looks like the source is a socket error, not an HTTP-protocol-related error. Possible reasons: you are not online, you are behind a restrictive firewall, your DNS is down, ...
All this aside from the fact, as mcandre pointed out, that your code is wrong.
Name resolution error.
getaddrinfo is used to resolve the hostname (python.org) in your request. If it fails, it means that the name could not be resolved for one of the following reasons (a quick way to check this directly is sketched after the list):
It does not exist, or the records are outdated (unlikely; python.org is a well-established domain name)
Your DNS server is down (unlikely; if you can browse other sites, you should be able to fetch that page through Python)
A firewall is blocking Python or your script from accessing the Internet (most likely; Windows Firewall sometimes does not ask you if you want to allow an application)
You live on an ancient voodoo cemetery. (unlikely; if that is the case, you should move out)
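To tell the name-resolution case apart from the others, you can call the same resolver directly (a small sketch; the hostname and port are just the ones from the question, and the messages are illustrative):
import socket

try:
    # the same lookup urllib2 performs under the hood for the host in the URL
    socket.getaddrinfo("www.python.org", 80)
    print "Name resolution works; look elsewhere (firewall, proxy, ...)"
except socket.gaierror as e:
    print "Name resolution failed:", e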
Windows Vista, python 2.6.2
It's a 404 page, right?
>>> import urllib2
>>> import urllib
>>>
>>> page = urllib2.Request('http://www.python.org/fish.html')
>>> urllib2.urlopen( page )
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python26\lib\urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 389, in open
response = meth(req, response)
File "C:\Python26\lib\urllib2.py", line 502, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python26\lib\urllib2.py", line 427, in error
return self._call_chain(*args)
File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
>>>
DJ
First, I see no reason to import urllib; I've only ever seen urllib2 used to replace urllib entirely and I know of no functionality that's useful from urllib and yet is missing from urllib2.
Next, I notice that http://www.python.org/fish.html gives a 404 error to me. (That doesn't explain the backtrace/exception you're seeing; I get urllib2.HTTPError: HTTP Error 404: Not Found.)
Normally, if you just want to do a default fetch of a web page (without adding special HTTP headers, doing any sort of POST, etc.), then the following suffices:
req = urllib2.urlopen('http://www.python.org/')
html = req.read()
# and req.close() if you want to be pedantic
