urllib2 giving an error in Python

I tried running this:
>>> urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
But it gives the error below. Can anyone tell me a solution?
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
File "C:\Python26\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 391, in open
response = self._open(req, data)
File "C:\Python26\lib\urllib2.py", line 409, in _open
'_open', req)
File "C:\Python26\lib\urllib2.py", line 369, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 1161, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python26\lib\urllib2.py", line 1136, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 11001] getaddrinfo failed>

Double-check whether the domain is accessible.
At the moment I am getting a 504 Gateway Timeout error for the domain tycho.usno.navy.mil.
It looks like the site is down; downforeveryoneorjustme.com also reports:
It's not just you!
http://tycho.usno.navy.mil looks down
from here.
That's why getaddrinfo is failing.
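A quick way to confirm that it is a name-resolution problem (which is what "getaddrinfo failed" indicates) is to try resolving the host directly, for example:
import socket

try:
    # getaddrinfo failed means the hostname could not be resolved to an address
    print socket.gethostbyname('tycho.usno.navy.mil')
except socket.gaierror as e:
    print "DNS lookup failed:", e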

Wrapping in try..except could help keep it neat:
import urllib2

try:
    urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
except urllib2.URLError:
    print "Error opening URL"

Related

Python 2.7.11 [SSL: CERTIFICATE_VERIFY_FAILED]

What I am trying to do is open an xml.zip file from a website for an intro to Python project. I'm pretty sure there's nothing wrong with my code, but when I go to run the program, I get:
Traceback (most recent call last):
File "/Users/Chris/Documents/clientDownloadFile-Handout.py", line 16, in <module>
u = urllib2.urlopen("https://nvd.nist.gov/download/nvdcve-Recent.xml.zip")
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1240, in https_open
context=self._context)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)>
This is my code:
import urllib2
u=urllib2.urlopen("https://nvd.nist.gov/download/nvdcveRecent.xml.zip")
localFile = open('nvd.xml.zip', 'wb')
localFile.write(u.read())
localFile.close()
I've done some research, including on this site. I've tried some modifications like 'import ssl' and other 'insecure' workarounds to this problem, to no avail. I am wondering if anyone else (possibly Mac OS X El Capitan users) has encountered this, and if there is a more secure way around this?
The URL has changed; try https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-Recent.xml.zip instead.
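For reference, here is the asker's snippet with the updated URL dropped in (assuming the feed is still published at that path):
import urllib2

u = urllib2.urlopen("https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-Recent.xml.zip")
localFile = open('nvd.xml.zip', 'wb')
localFile.write(u.read())
localFile.close()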

Boto urllib2 URLError (Caught exception reading instance data)

I am using the common_crawl_index library for Python from here to get some data from S3.
When I run the command cci_lookup com.abc to get the data, it produces the error below.
I still get the output (the list of URLs for the looked-up domain), but I do not know why the error happens.
MainThread:2014-12-19 01:48:16,150:ERROR:utils:224 Caught exception reading instance data
Traceback (most recent call last):
File "/home/deploy/anaconda/lib/python2.7/site-packages/boto/utils.py", line 211, in retry_url
r = opener.open(req)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 1214, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 1184, in do_open
raise URLError(err)
URLError: <urlopen error timed out>
com.abc.www/forum/index.php/groupcp.php:http
com.abc.www/forum/index.php/includes/MLOtvHD:http
com.abc.www/forum/index.php/includes/blog.php:http
com.abc.www/forum/index.php/includes/content.php:http
com.abc.www/forum/index.php/includes/index.php:http
I hope someone can help!

Python 3 get HTML content

I'm using this code to get a web site's HTML content:
import urllib.request
import lxml.html as lh
req= urllib.request.Request("http://www.ip-adress.com/ip_tracer/157.123.22.11",
headers={'User-Agent' : "Magic Browser"})
html = urllib.request.urlopen(req).read()
doc = lh.fromstring(html)
print(''.join(doc.xpath('.//*[@class="odd"]')[-1].text_content().split()))
I want to get the Organization: Zenith Data Systems.
but it shows this error:
Traceback (most recent call last):
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 1135, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/usr/local/python3.2.3/lib/python3.2/http/client.py", line 967, in request
self._send_request(method, url, body, headers)
File "/usr/local/python3.2.3/lib/python3.2/http/client.py", line 1005, in _send_request
self.endheaders(body)
File "/usr/local/python3.2.3/lib/python3.2/http/client.py", line 963, in endheaders
self._send_output(message_body)
File "/usr/local/python3.2.3/lib/python3.2/http/client.py", line 808, in _send_output
self.send(msg)
File "/usr/local/python3.2.3/lib/python3.2/http/client.py", line 746, in send
self.connect()
File "/usr/local/python3.2.3/lib/python3.2/http/client.py", line 724, in connect
self.timeout, self.source_address)
File "/usr/local/python3.2.3/lib/python3.2/socket.py", line 404, in create_connection
raise err
File "/usr/local/python3.2.3/lib/python3.2/socket.py", line 395, in create_connection
sock.connect(sa)
socket.error: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "ext.py", line 4, in <module>
html = urllib.request.urlopen(req).read()
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 138, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 369, in open
response = self._open(req, data)
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 387, in _open
'_open', req)
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 347, in _call_chain
result = func(*args)
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 1155, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/local/python3.2.3/lib/python3.2/urllib/request.py", line 1138, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 111] Connection refused>
How can I solve it? Thanks.
Basically, "Connection refused" means the server actively rejected the connection, for example because the page is restricted to registered users, the server is under heavy maintenance, or similar reasons.
If you want to handle such errors in your code, you can wrap it in try/except, like this:
import urllib.request
import urllib.error
import lxml.html as lh

try:
    req = urllib.request.Request("http://www.ip-adress.com/ip_tracer/157.123.22.11",
                                 headers={'User-Agent': "Magic Browser"})
    html = urllib.request.urlopen(req).read()
    doc = lh.fromstring(html)
    print(''.join(doc.xpath('.//*[@class="odd"]')[-1].text_content().split()))
except urllib.error.URLError as e:
    print(e.reason)

urlopen error An error occured while connecting to the server: ApplicationError: 2 [Errno 10104] getaddrinfo failed

I am getting this URLError when calling urlopen from a Google App Engine app. However, the same Python script works fine when run as a standalone script. I have searched a lot on the internet but couldn't find a solution. Has anyone encountered this and resolved it successfully?
response = opener.open(req)
File "C:\Python27\lib\urllib2.py", line 394, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 412, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1207, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "C:\Python27\lib\urllib2.py", line 1174, in do_open
raise URLError(err)
URLError: urlopen error An error occured while connecting to the server: ApplicationError: 2 [Errno 10104] getaddrinfo failed

How can I iterate through a list of proxies with Socksipy

For some reason I can't get this to work; using a single proxy, everything seems fine.
#This works
import socks
import urllib2
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '68.xx.193.xx', 666)
socks.wrapmodule(urllib2)
print urllib2.urlopen('http://automation.whatismyip.com/n09230945.asp').read()
&
#This doesn't
import socks
import urllib2
proxies=['68.xx.193.xx','xx.178.xx.70','98.xx.84.xx','83.xx.86.xx']
ports=[666,1080,859,910]
for i in range(len(proxies)):
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, repr(proxies[i]), ports[i])
    socks.wrapmodule(urllib2)
    print urllib2.urlopen('http://automation.whatismyip.com/n09230945.asp').read()
Error:
Traceback (most recent call last):
File "/home/zer0/Aptana Studio 3 Workspace/sy/src/test.py", line 38, in <module>
print urllib2.urlopen('http://automation.whatismyip.com/n09230945.asp').read()
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 391, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 409, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1185, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1160, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno -5] No address associated with hostname>
Run this:
proxies=['68.xx.193.xx','xx.178.xx.70','98.xx.84.xx','83.xx.86.xx']
ports=[666,1080,859,910]
for i in range(len(proxies)):
    print (repr(proxies[i]), ports[i])
You'll get
("'68.xx.193.xx'", 666)
("'xx.178.xx.70'", 1080)
("'98.xx.84.xx'", 859)
("'83.xx.86.xx'", 910)
You're adding quotes you don't want with the repr call, so urllib2 thinks it's a hostname instead of an IP address. Get rid of it and you should be fine.
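A corrected version of the loop, passing the plain strings instead of repr(...), might look like this:
import socks
import urllib2

proxies = ['68.xx.193.xx', 'xx.178.xx.70', '98.xx.84.xx', '83.xx.86.xx']
ports = [666, 1080, 859, 910]

for proxy, port in zip(proxies, ports):
    # pass the address string itself so SocksiPy gets a real host/IP, not "'68.xx...'"
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, proxy, port)
    socks.wrapmodule(urllib2)
    print urllib2.urlopen('http://automation.whatismyip.com/n09230945.asp').read()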
