What I am trying to do is open an xml.zip file from a website for an intro to Python project. I'm pretty sure there's nothing wrong with my code, but when I go to run the program, I get:
Traceback (most recent call last):
File "/Users/Chris/Documents/clientDownloadFile-Handout.py", line 16, in <module>
u = urllib2.urlopen("https://nvd.nist.gov/download/nvdcve-Recent.xml.zip")
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1240, in https_open
context=self._context)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)>
This is my code:
import urllib2
u = urllib2.urlopen("https://nvd.nist.gov/download/nvdcve-Recent.xml.zip")
localFile = open('nvd.xml.zip', 'wb')
localFile.write(u.read())
localFile.close()
I've done some research, including on this site. I've tried some modifications such as 'import ssl' and other 'insecure' workarounds, to no avail. I am wondering whether anyone else (possibly other Mac OS X El Capitan users) has encountered this, and whether there is a more secure way around it.
The URL has changed; try https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-Recent.xml.zip instead.
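As for a more secure workaround: the traceback shows a Python 2.7.9+ urllib2 (it passes a context into https_open), so you can hand urlopen an SSL context backed by an up-to-date CA bundle instead of disabling verification. This is only a sketch, and it assumes the third-party certifi package (pip install certifi) for the CA bundle:

import ssl
import urllib2
import certifi

# Build a verifying SSL context from certifi's CA bundle rather than
# turning certificate checks off.
ctx = ssl.create_default_context(cafile=certifi.where())
u = urllib2.urlopen("https://nvd.nist.gov/feeds/xml/cve/nvdcve-2.0-Recent.xml.zip",
                    context=ctx)
localFile = open('nvd.xml.zip', 'wb')
localFile.write(u.read())
localFile.close()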
I am trying to download a dataset from a web API for a work project that requires using Python. I am using Python 3.4 and the urllib library to open the request. This does not work:
from urllib import request
r = request.urlopen(SOME_URL)
This gives the error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Anaconda3\lib\urllib\request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "C:\Anaconda3\lib\urllib\request.py", line 463, in open
response = self._open(req, data)
File "C:\Anaconda3\lib\urllib\request.py", line 481, in _open
'_open', req)
File "C:\Anaconda3\lib\urllib\request.py", line 441, in _call_chain
result = func(*args)
File "C:\Anaconda3\lib\urllib\request.py", line 1210, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "C:\Anaconda3\lib\urllib\request.py", line 1184, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time,
or established connection failed because connected host has failed to respond>
But when I use RStudio with the same URL, it works:
dt = read.csv(SOME_URL)
This gives me the exact dataset I want.
For this project we want to keep a unified tech stack (Python throughout the process), so does anyone have an idea why the URL can be opened in R but not in Python? Is there any special setup I need to configure for Python?
Thanks
The following should do the job; note that in Python 3 the module is urllib.request, not urllib2:
from urllib import request
request.urlopen(SOME_URL).read()
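If that still times out while R succeeds, one common difference is that R or RStudio picks up the system proxy settings while urllib does not. This is only a sketch under that assumption; the proxy address below is a placeholder for your organisation's actual proxy:

from urllib import request

# Placeholder proxy address; replace with the proxy R/RStudio is using.
proxy = request.ProxyHandler({
    'http': 'http://proxy.example.com:8080',
    'https': 'http://proxy.example.com:8080',
})
opener = request.build_opener(proxy)
r = opener.open(SOME_URL, timeout=30)
data = r.read()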
I am using the common_crawl_index library for Python from here to get some data from S3.
When I run the command cci_lookup com.abc to get data, it raises the error below.
I still get the output, which is the list of URLs for the lookup domain, but I do not know why the error happens.
MainThread:2014-12-19 01:48:16,150:ERROR:utils:224 Caught exception reading instance data
Traceback (most recent call last):
File "/home/deploy/anaconda/lib/python2.7/site-packages/boto/utils.py", line 211, in retry_url
r = opener.open(req)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 1214, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/home/deploy/anaconda/lib/python2.7/urllib2.py", line 1184, in do_open
raise URLError(err)
URLError: <urlopen error timed out>
com.abc.www/forum/index.php/groupcp.php:http
com.abc.www/forum/index.php/includes/MLOtvHD:http
com.abc.www/forum/index.php/includes/blog.php:http
com.abc.www/forum/index.php/includes/content.php:http
com.abc.www/forum/index.php/includes/index.php:http
Hope anyone can help!
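For what it's worth, the "Caught exception reading instance data" message comes from boto timing out while querying the EC2 instance metadata service, which it falls back to when it cannot find AWS credentials anywhere else; that lookup cannot succeed outside EC2, which is likely why the lookup output still appears while the error is logged. One thing to try, if you drive common_crawl_index from Python rather than the cci_lookup script, is to supply credentials explicitly so boto never needs to hit the metadata service. This is only a sketch with placeholder key values:

import os

# Placeholder credentials; export these in the shell instead if you are
# running the cci_lookup command-line tool.
os.environ['AWS_ACCESS_KEY_ID'] = 'YOUR_ACCESS_KEY'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'YOUR_SECRET_KEY'

import boto
conn = boto.connect_s3()  # picks up the environment variables above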
I am getting this URLError when the code runs inside a Google App Engine app. However, the same Python script works fine when run as a standalone script. I have searched a lot on the internet but couldn't find a solution. Has anyone encountered this and resolved it successfully?
response = opener.open(req)
File "C:\Python27\lib\urllib2.py", line 394, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 412, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 372, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1207, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "C:\Python27\lib\urllib2.py", line 1174, in do_open
raise URLError(err)
URLError: urlopen error An error occured while connecting to the server: ApplicationError: 2 [Errno 10104] getaddrinfo failed
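Since "getaddrinfo failed" is a name-resolution error, one quick check is whether the hostname resolves at all from the environment the App Engine dev server runs in (a proxy or DNS setting available to the standalone script but not to the server would explain the difference). A minimal check, with the hostname as a placeholder:

import socket

# Placeholder hostname; use the host your request is actually contacting.
host = "www.example.com"
try:
    print socket.getaddrinfo(host, 443)
except socket.gaierror as e:
    print "DNS lookup failed:", e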
import urllib2
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers

def picscrazy(str, int):
    # Register Poster's streaming HTTP handlers with urllib2.
    register_openers()
    # Build the multipart body and headers from the file being uploaded.
    datagen, headers = multipart_encode({"imagefile[]": open(str, "rb")})
    request = urllib2.Request("http://www.picscrazy.com/process.php", datagen, headers)
    print(urllib2.urlopen(request).read())
str is the filename and int is just another flag.
The code uploads a file to an image-hosting website. I am using the Poster library for the POST request. The program stops after the request statement and gives an error. I can't tell from the error whether the problem is in my network or in the program.
Below is the traceback of the error:
Traceback (most recent call last):
File "C:\Documents and Settings\Administrator\Desktop\for exbii\res.py", line 42, in <module>
picscrazy(fname,1)
File "C:\Documents and Settings\Administrator\Desktop\for exbii\res.py", line 14, in picscrazy
print(urllib2.urlopen(request).read())
File "C:\Python25\Lib\urllib2.py", line 121, in urlopen
return _opener.open(url, data)
File "C:\Python25\Lib\urllib2.py", line 374, in open
response = self._open(req, data)
File "C:\Python25\Lib\urllib2.py", line 392, in _open
'_open', req)
File "C:\Python25\Lib\urllib2.py", line 353, in _call_chain
result = func(*args)
File "C:\Python25\lib\poster\streaminghttp.py", line 142, in http_open
return self.do_open(StreamingHTTPConnection, req)
File "C:\Python25\Lib\urllib2.py", line 1076, in do_open
raise URLError(err)
URLError: <urlopen error (10054, 'Connection reset by peer')>
If you can't even display the headers coming back from the server, then the server has simply cut you off.
It may be that your request is bad, but that's unlikely.
It may be that you've exceeded bandwidth restrictions.
It may be that your requests look like a DDoS attack because they're happening too frequently.
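If the resets are indeed caused by request frequency, spacing the uploads out and retrying on failure can help confirm that. A minimal sketch, wrapped around the existing request object rather than anything Poster-specific:

import time
import urllib2

def open_with_retries(request, attempts=3, delay=5):
    # Retry the same request a few times, pausing between attempts, so a
    # transient "connection reset by peer" does not abort the whole run.
    for attempt in range(attempts):
        try:
            return urllib2.urlopen(request).read()
        except urllib2.URLError as e:
            print "Attempt %d failed: %s" % (attempt + 1, e)
            if attempt == attempts - 1:
                raise
            time.sleep(delay)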
I tried running this:
>>> urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
But it gives the error below; can anyone tell me a solution?
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
File "C:\Python26\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 391, in open
response = self._open(req, data)
File "C:\Python26\lib\urllib2.py", line 409, in _open
'_open', req)
File "C:\Python26\lib\urllib2.py", line 369, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 1161, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python26\lib\urllib2.py", line 1136, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 11001] getaddrinfo failed>
Double-check whether the domain is accessible at all.
I am getting a 504 Gateway Timeout error for the domain tycho.usno.navy.mil at the moment.
It looks like the site is down; downforeveryoneorjustme.com also reports "It's not just you! http://tycho.usno.navy.mil looks down from here."
That's why getaddrinfo is failing.
Wrapping in try..except could help keep it neat:
try:
    urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
except urllib2.URLError:
    print "Error opening URL"