I wanted to run this code, which I found on the Python website:
#!/usr/bin/python
import urllib2
f = urllib2.urlopen('http://www.python.org/')
print f.read(100)
but the result was this:
Traceback (most recent call last):
File "WebServiceAusführen.py", line 5, in <module>
f = urllib2.urlopen('http://www.python.org/')
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1146, in do_open
h = http_class(host, timeout=req.timeout) # will parse host:port
File "/usr/lib/python2.7/httplib.py", line 693, in __init__
self._set_hostport(host, port)
File "/usr/lib/python2.7/httplib.py", line 721, in _set_hostport
raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: 'port/'
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/apport_python_hook.py", line 70, in apport_excepthook
binary = os.path.realpath(os.path.join(os.getcwdu(), sys.argv[0]))
File "/usr/lib/python2.7/posixpath.py", line 71, in join
path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)
Original exception was:
Traceback (most recent call last):
File "WebServiceAusführen.py", line 5, in <module>
f = urllib2.urlopen('http://www.python.org/')
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1146, in do_open
h = http_class(host, timeout=req.timeout) # will parse host:port
File "/usr/lib/python2.7/httplib.py", line 693, in __init__
self._set_hostport(host, port)
File "/usr/lib/python2.7/httplib.py", line 721, in _set_hostport
raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: 'port/'
Any help would be great, because I really don't understand how such a small program can produce such a big error!
The exception is thrown because you have a faulty proxy configuration.
On POSIX systems, the library looks for *_proxy environment variables (in both upper and lower case). For an HTTP URL, that's the http_proxy environment variable.
You can verify this by looking for proxy variables:
import os
print {k: v for k, v in os.environ.items() if k.lower().endswith('_proxy')}
Whatever value you have configured on your system is not a valid hostname and port combination and Python is choking on that.
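As a rough sketch of that check in Python 3 (the question itself uses Python 2, so treat this as an adaptation rather than the exact code path urllib2 follows): the helper names find_proxy_vars and proxy_problem are made up for illustration, and the check assumes that urllib.parse.urlsplit rejects a non-numeric port when .port is read, which should hold on Python 3.6 or later.

```python
import os
from urllib.parse import urlsplit


def find_proxy_vars(environ):
    """Return every *_proxy variable, whatever its case."""
    return {k: v for k, v in environ.items() if k.lower().endswith('_proxy')}


def proxy_problem(value):
    """Return a short description of what is wrong with a proxy value, or None."""
    # A bare "host:port" has no scheme, so prefix "//" to make urlsplit see a netloc.
    candidate = value if '://' in value else '//' + value
    try:
        parts = urlsplit(candidate)
        parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError as err:
        return 'invalid host or port: %s' % err
    if not parts.hostname:
        return 'missing hostname'
    return None


for name, value in sorted(find_proxy_vars(os.environ).items()):
    if name.lower() == 'no_proxy':
        continue  # a list of hosts to bypass, not a proxy address
    print('%s=%r -> %s' % (name, value, proxy_problem(value) or 'looks fine'))
```

If this flags a value such as http://proxy:port/, that variable is the likely culprit; unset it or replace the placeholder with a real host and numeric port before running the script again.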
I'm trying to run a script from an online course and had to convert the source code from Python 2 to Python 3 syntax. The script downloads an archive, and I have already converted that part to:
import urllib.request
url = "https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tgz"
urllib.request.urlretrieve(url, filename="../enron_mail_20150507.tgz")
However, something seems to be wrong with the URL, since it gives me the following error:
Traceback (most recent call last):
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/encodings/idna.py", line 165, in encode
raise UnicodeError("label empty or too long")
UnicodeError: label empty or too long
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "ud120_code_py35fork/tools/startup.py", line 37, in <module>
data = urllib.request.urlretrieve("http://...")
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 187, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 465, in open
response = self._open(req, data)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 483, in _open
'_open', req)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 1268, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/urllib/request.py", line 1240, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 1083, in request
self._send_request(method, url, body, headers)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 1128, in _send_request
self.endheaders(body)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 1079, in endheaders
self._send_output(message_body)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 911, in _send_output
self.send(msg)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 854, in send
self.connect()
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/http/client.py", line 826, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/socket.py", line 693, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/home/xiaolong/development/Python/udacity_intro_to_machine_learning/localpython/lib/python3.5/socket.py", line 732, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)
I already tried replacing the ~ and the . in the URL with %7E and %2E, but that didn't help at all.
What's wrong with the URL and how can I fix this?
Exact Python version: 3.5.1
I was developing my project on a Mac and tried to do the same thing on a BeagleBone Black (BBB).
However, I can't get urllib to work on the BBB, so I'm stuck (it works fine on my Mac).
I tried this simple code as an example:
import urllib
conn = urllib.urlopen('http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content')
Then this error occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
return opener.open(url)
File "/usr/lib/python2.7/urllib.py", line 207, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 351, in open_http
'got a bad status line', None)
IOError: ('http protocol error', 0, 'got a bad status line', None)
I need to fetch HTML data for my project.
How can I solve this problem? Do you have any ideas?
Thank you.
When I tried urllib2, I got this:
>>> import urllib2
>>> conn = urllib2.urlopen('http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1180, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
response.begin()
File "/usr/lib/python2.7/httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
Also I tried this:
curl http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content
curl: (52) Empty reply from server
and this:
wget http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content
Connecting to stackoverflow.com (198.252.206.16:80)
wget: error getting response
but neither of them worked.
I also tried at home, where it failed with a different error:
conn = urllib2.urlopen('http://stackoverflow.com/questions/8479736/using-python-urllib-how-to-avoid-non-html-content')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno -2] Name or service not known>
Environment:
BBB: Linux beaglebone 3.8.13 #1 SMP Tue Jun 18 02:11:09 EDT 2013 armv7l GNU/Linux
Python version: 2.7.3
I really recommend the requests library:
>>> import requests
>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'
http://www.python-requests.org/en/latest/
How to install:
sudo pip install requests
I'm getting a couple of exceptions popping up from time to time but can't think of the cause.
Here's a snippet:
try:
    r = urllib2.urlopen(url)
except urllib2.URLError, e:
    if hasattr(e, 'code'):
        # unauthorized
        print('UA: %s' % url)
    elif hasattr(e, 'reason'):
        print('TO: %s' % url)
        # timeout
else:
    i = r.info()
    try:
        server = i['server']
    except:
        pass
    else:
        if not 'authenticate' in server:
            print('NA: %s' % url)
I'm thinking perhaps that r.info() is causing an exception but not sure why it would as the r = urllib2.urlopen(url) is covered with the try.
The errors are:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 551, in __bootstrap_inner
self.run()
File "C:\Users\anthony\Scripts\checker.py", line 35, in run
r = urllib2.urlopen(url)
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 418, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python27\lib\urllib2.py", line 1180, in do_open
r = h.getresponse(buffering=True)
File "C:\Python27\lib\httplib.py", line 1030, in getresponse
response.begin()
File "C:\Python27\lib\httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "C:\Python27\lib\httplib.py", line 371, in _read_status
raise BadStatusLine(line)
BadStatusLine: ''
and
File "C:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 400, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 418, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1207, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python27\lib\urllib2.py", line 1180, in do_open
r = h.getresponse(buffering=True)
File "C:\Python27\lib\httplib.py", line 1030, in getresponse
response.begin()
File "C:\Python27\lib\httplib.py", line 407, in begin
version, status, reason = self._read_status()
File "C:\Python27\lib\httplib.py", line 365, in _read_status
line = self.fp.readline()
File "C:\Python27\lib\socket.py", line 447, in readline
data = self._sock.recv(self._rbufsize)
error: [Errno 10054] An existing connection was forcibly closed by the remote host
I've read a bit about [Errno 10054] but have no idea how to prevent it.
Any help would be appreciated.
I'm thinking perhaps that r.info() is causing an exception but not
sure why it would as the r = urllib2.urlopen(url) is covered with the
try.
Nope. The first exception has nothing to do with r.info(): it is raised by urllib2.urlopen(url), as you can see in the traceback.
The BadStatusLine exception is defined in httplib, and your except urllib2.URLError clause simply doesn't catch it. You should probably extend your exception handling, for example:
except (httplib.HTTPException, urllib2.URLError) as err:
    ...
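To make that concrete, here is a minimal sketch of the same idea in Python 3, where httplib became http.client and urllib2's URLError lives in urllib.error. The fetch helper and its opener parameter are hypothetical, added only so the handling can be exercised without a network. I've also included OSError, on the assumption that a reset connection like the [Errno 10054] in the second traceback surfaces as a socket-level error rather than as an HTTPException.

```python
import http.client
import urllib.error
import urllib.request


def fetch(url, opener=urllib.request.urlopen):
    """Return (response, None) on success or (None, error) on a handled failure."""
    try:
        return opener(url), None
    except (http.client.HTTPException, urllib.error.URLError, OSError) as err:
        # HTTPException covers BadStatusLine; URLError covers HTTP error codes
        # and unreachable hosts; OSError covers low-level socket failures
        # such as a connection reset by the remote host.
        return None, err


def broken_server(url):
    # Stand-in opener (hypothetical) that fails the way the first traceback does.
    raise http.client.BadStatusLine('')


response, error = fetch('http://example.invalid/', opener=broken_server)
print(type(error).__name__)
```

The caller can then inspect the returned error (for example with isinstance) to decide whether to retry, log, or skip the URL.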
I have a problem deploying an application to GAE. I use Ubuntu. When I run the command to update the application, this error occurs:
2011-07-22 20:13:28,598 ERROR appcfg.py:2064 An unexpected error occurred. Aborting.
Traceback (most recent call last):
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 2005, in DoUpload
missing_files = self.Begin()
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 1674, in Begin
self.Send('/api/appversion/create', payload=self.config.ToYAML())
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 1632, in Send
return self.rpcserver.Send(url, payload=payload, **self.params)
File "/home/grzegorz/google_appengine/google/appengine/tools/appengine_rpc.py", line 365, in Send
f = self.opener.open(req)
File "/usr/lib/python2.7/urllib2.py", line 391, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 409, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1193, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/home/grzegorz/google_appengine/lib/fancy_urllib/fancy_urllib/__init__.py", line 367, in do_open
raise url_error
URLError: <urlopen error [Errno 110] Connection timed out>
Traceback (most recent call last):
File "./google_appengine/appcfg.py", line 76, in <module>
run_file(__file__, globals())
File "./google_appengine/appcfg.py", line 72, in run_file
execfile(script_path, globals_)
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 3708, in <module>
main(sys.argv)
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 3699, in main
result = AppCfgApp(argv).Run()
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 2345, in Run
self.action(self)
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 3484, in __call__
return method()
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 2745, in Update
app_summary = self.UpdateVersion(rpcserver, self.basepath, appyaml)
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 2734, in UpdateVersion
lambda path: self.opener(os.path.join(basepath, path), 'rb'))
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 2005, in DoUpload
missing_files = self.Begin()
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 1674, in Begin
self.Send('/api/appversion/create', payload=self.config.ToYAML())
File "/home/grzegorz/google_appengine/google/appengine/tools/appcfg.py", line 1632, in Send
return self.rpcserver.Send(url, payload=payload, **self.params)
File "/home/grzegorz/google_appengine/google/appengine/tools/appengine_rpc.py", line 365, in Send
f = self.opener.open(req)
File "/usr/lib/python2.7/urllib2.py", line 391, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 409, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1193, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/home/grzegorz/google_appengine/lib/fancy_urllib/fancy_urllib/__init__.py", line 367, in do_open
raise url_error
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>
I tried to figure out why this happens, especially since it never happened to me before.
Nothing I've tried has worked. What's weird is that if I keep trying, after about 20-30 attempts it just magically deploys.
I think there might be a Wi-Fi problem. For example, when I start XChat, it can't connect to the server. I think these problems might be related.
Still, I can't figure it out, so I would be thankful for any help.
With an opener created by opener = urllib2.build_opener(), if I try to add a header:
request.add_header('if-modified-since',request.headers.get('last-nodified'))
I get this error:
Traceback (most recent call last):
File "<pyshell#19>", line 1, in <module>
feeddata = opener.open(request)
File "C:\Python27\lib\urllib2.py", line 391, in open
response = self._open(req, data)
File "C:\Python27\lib\urllib2.py", line 409, in _open
'_open', req)
File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 1173, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python27\lib\urllib2.py", line 1142, in do_open
h.request(req.get_method(), req.get_selector(), req.data, headers)
File "C:\Python27\lib\httplib.py", line 946, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 986, in _send_request
self.putheader(hdr, value)
File "C:\Python27\lib\httplib.py", line 924, in putheader
str = '%s: %s' % (header, '\r\n\t'.join(values))
TypeError: sequence item 0: expected string, NoneType found
How do you get around this?
I tried building a class from urllib2.BaseHandler and it doesn't work.
Your traceback says "expected string, NoneType found", from which I deduce that you've stored a None value as a header. Did you really write 'last-nodified'? The header you meant was probably 'last-modified', but even then you should check that it exists, and not reuse it as a header if request.headers.get() returns None.
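A minimal sketch of that guard, written for Python 3's urllib.request (the question uses urllib2, where Request.add_header behaves the same way as far as I know). The helper name add_if_modified_since, the example URL, and the date string are all made up for illustration, and I'm assuming the Last-Modified value comes from an earlier response rather than from the request itself.

```python
import urllib.request


def add_if_modified_since(request, last_modified):
    """Attach If-Modified-Since only when a Last-Modified value is available."""
    if last_modified is not None:
        request.add_header('If-Modified-Since', last_modified)
    return request


# Hypothetical feed URL, used only to build a Request object; nothing is fetched.
request = urllib.request.Request('http://example.com/feed.xml')

# No previous Last-Modified value: the header is simply left out.
add_if_modified_since(request, None)
print(request.has_header('If-modified-since'))

# A value taken from an earlier response (made-up date) is added as a string.
add_if_modified_since(request, 'Wed, 21 Oct 2015 07:28:00 GMT')
print(request.get_header('If-modified-since'))
```

Note that Request stores header names capitalized as 'If-modified-since', which is why the lookups above use that spelling.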