Traceback when I use urllib2, get a HTTP 500 error - python

My code is as follows, but it throws an error when it runs.
search_request = urllib2.Request(url,data=tmp_file_name,headers={'X-Requested-With':'WoMenShi888XMLHttpRequestWin'})
#print search_request.get_method()
search_response = urllib2.urlopen(search_request)
html_data = search_response.read()
The error is:
Traceback (most recent call last):
File "xx_tmp.py", line 83, in <module>
print hello_lfi()
File "xx_tmp.py", line 69, in hello_lfi
search_response = urllib2.urlopen(search_request)
File "D:\Python27\lib\urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "D:\Python27\lib\urllib2.py", line 406, in open
response = meth(req, response)
File "D:\Python27\lib\urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "D:\Python27\lib\urllib2.py", line 444, in error
return self._call_chain(*args)
File "D:\Python27\lib\urllib2.py", line 378, in _call_chain
result = func(*args)
File "D:\Python27\lib\urllib2.py", line 527, in http_error_defau
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error
I don't know how to fix it. I mean, when an error happens, how can my code continue to work?
When I try to use
try:
    search_response = urllib2.urlopen(search_request)
except urllib2.HTTPError:
    pass
I get a new error:
UnboundLocalError: local variable 'search_response' referenced before assignment
I then tried
global search_response
and got this error:
NameError: global name 'search_response' is not defined

You can catch the exception, which will prevent your program from stopping so abruptly:
try:
    search_response = urllib2.urlopen(search_request)
except urllib2.HTTPError:
    print 'There was an error with the request'
If you want to continue, you can simply:
try:
    search_response = urllib2.urlopen(search_request)
except urllib2.HTTPError:
    pass
This will allow your program to continue, but your other statement, html_data = search_response.read(), won't give you the expected result, because search_response is never assigned when the request fails. To fix this problem permanently, you need to debug your request to see why it's failing; this isn't something specific to Python.
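One way to avoid the UnboundLocalError from the question is to give the result a defined value on the failure path. The following is a minimal sketch, not the asker's code: it is written for Python 3 (where urllib2 was split into urllib.request and urllib.error), and the function name fetch_or_none and its opener parameter are made up for illustration.

```python
import urllib.error
import urllib.request


def fetch_or_none(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the server sends an HTTP error.

    `opener` is a hypothetical hook (any callable taking a URL) so the
    function can be exercised without a network connection.
    """
    try:
        response = opener(url)
    except urllib.error.HTTPError as err:
        # err.code is the status (e.g. 500); the body of the error page,
        # if any, is still available through err.read().
        print('Request to %s failed with HTTP status %d' % (url, err.code))
        return None
    # Only reached when the request succeeded, so `response` is always bound.
    return response.read()
```

A caller can then write html_data = fetch_or_none(url) and check for None, instead of reading from a variable that was never assigned.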

I had the same error when I was trying to send a large post request to my GAE Python server. It turns out the server threw the error because I was trying to write the received POST string into a db.StringProperty(). I changed that to db.TextProperty() and it didn't throw the error anymore.
Source: Overcome appengine 500 byte string limit in python? consider text

Related

I keep getting HTTP Error 400: Bad Request from urlopen

I'm studying Python through Udacity.
Because I use a different version, I got stuck while programming the profanity editor.
This is my code:
import urllib.request
def readdocument(x):
    document = open(x)
    profanitycheck(document.read())
    document.close()
def profanitycheck(urcontent):
    q = urllib.request.Request("http://www.wdylike.appspot.com/?q="+urcontent)
    with urllib.request.urlopen(q) as content2:
        output = content2.read()
    print(output)
filelocate=(r"C:\Users\Sutthikiat\Desktop\movie_quotes.txt")
readdocument(filelocate)
This is the text file:
-- Houston, we have a problem. (Apollo 13)
-- Mama always said, life is like a box of chocolates. You never know what you are going to get. (Forrest Gump)
-- You cant handle the truth. (A Few Good Men)
-- I believe everything and I believe nothing. (A Shot in the Dark)
But when I create a new text file and check with it, it runs properly, so I don't understand why my code raises an error. Maybe it's about exception handling?
This is the error:
Traceback (most recent call last):
File "C:\Users\Sutthikiat\Desktop\cursecheck.py", line 13, in <module>
readdocument(filelocate)
File "C:\Users\Sutthikiat\Desktop\cursecheck.py", line 4, in readdocument
profanitycheck(document.read())
File "C:\Users\Sutthikiat\Desktop\cursecheck.py", line 8, in profanitycheck
with urllib.request.urlopen(q) as content2:
File "C:\Users\Sutthikiat\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\Sutthikiat\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 532, in open
response = meth(req, response)
File "C:\Users\Sutthikiat\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\Sutthikiat\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 570, in error
return self._call_chain(*args)
File "C:\Users\Sutthikiat\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 504, in _call_chain
result = func(*args)
File "C:\Users\Sutthikiat\AppData\Local\Programs\Python\Python36\lib\urllib\request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
The server considers the request invalid, just like this one:
urllib.request.urlopen("https://www.baidu.com/s?wd="+"a\nb")
The URL contains an invalid character: \n (just like the content you read from the file). You need to quote it:
import urllib.request
from urllib.parse import quote
q = urllib.request.urlopen("https://www.baidu.com/s?wd=" + quote("a\nb"))
print(q.url)
'https://www.baidu.com/s?wd=a%0Ab'
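The quoting step can also be tried without sending a request. This is a small sketch for Python 3; build_query_url is a hypothetical helper, and the q parameter name is taken from the URL in the question. Note that urlencode uses form-style encoding, so spaces become + rather than %20.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url, text):
    """Append `text` to `base_url` as a percent-encoded `q` parameter."""
    # urlencode escapes newlines, spaces, commas, and other unsafe characters.
    return base_url + "?" + urlencode({"q": text})


# quote() escapes a single value; the newline becomes %0A.
encoded = quote("a\nb")
```

In the question's code, the call would look like urllib.request.Request(build_query_url("http://www.wdylike.appspot.com/", urcontent)).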

HTTP Basic Authentication is failing in python script

I am trying to connect to a REST resource and retrieve the data using a Python script (Python 3.2.3). When I run the script, I get the error HTTP Error 401: Unauthorized. Please note that I am able to access the given REST resource from a REST client using Basic Authentication. In the REST client I specified the hostname, user, and password details (a realm is not required).
Below is the code and complete error. Your help is very much appreciated.
Code:
import urllib.request
# set up authentication info
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm=None,
                          uri=r'http://hostname/',
                          user='administrator',
                          passwd='administrator')
opener = urllib.request.build_opener(auth_handler)
urllib.request.install_opener(opener)
res = opener.open(r'http://hostname:9004/apollo-api/nodes')
nodes = res.read()
Error
Traceback (most recent call last):
File "C:\Python32\scripts\get-nodes.py", line 12, in <module>
res = opener.open(r'http://tolowa.wysdm.lab.emc.com:9004/apollo-api/nodes')
File "C:\Python32\lib\urllib\request.py", line 375, in open
response = meth(req, response)
File "C:\Python32\lib\urllib\request.py", line 487, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python32\lib\urllib\request.py", line 413, in error
return self._call_chain(*args)
File "C:\Python32\lib\urllib\request.py", line 347, in _call_chain
result = func(*args)
File "C:\Python32\lib\urllib\request.py", line 495, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 401: Unauthorized
Try to give the correct realm name. You can find this out for example when opening the page in a browser - the password prompt should display the name.
You can also read the realm by catching the exception that was raised:
import urllib.error
import urllib.request
# set up authentication info
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm=None,
                          uri=r'http://hostname/',
                          user='administrator',
                          passwd='administrator')
opener = urllib.request.build_opener(auth_handler)
urllib.request.install_opener(opener)
try:
    res = opener.open(r'http://hostname:9004/apollo-api/nodes')
    nodes = res.read()
except urllib.error.HTTPError as e:
    print(e.headers['www-authenticate'])
You should get the following output:
Basic realm="The realm you are after"
Read the realm from above and set it in your add_password method and it should be good to go.
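If the realm is not known in advance, the standard library also provides HTTPPasswordMgrWithDefaultRealm, which falls back to credentials registered under realm None. Below is a minimal sketch under that assumption; the helper names are hypothetical, and the host, user, and password are the placeholders from the question. The port is included in the registered URL on purpose: as far as I can tell, credentials registered for http://hostname/ are not matched against requests to http://hostname:9004/.

```python
import urllib.request


def make_password_manager(base_url, user, password):
    """Store credentials that apply to any realm under `base_url`."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # With this manager class, realm=None acts as a catch-all realm.
    manager.add_password(None, base_url, user, password)
    return manager


def make_opener(base_url, user, password):
    """Build an opener that answers Basic auth challenges for `base_url`."""
    manager = make_password_manager(base_url, user, password)
    handler = urllib.request.HTTPBasicAuthHandler(manager)
    return urllib.request.build_opener(handler)
```

Usage would then be opener = make_opener('http://hostname:9004/', 'administrator', 'administrator') followed by opener.open('http://hostname:9004/apollo-api/nodes').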

Parse.com user login - 404 error

I am fairly inexperienced with user authentication, especially through RESTful APIs. I am trying to use Python to log in with a user that is set up on parse.com. The following is the code I have:
API_LOGIN_ROOT = 'https://api.parse.com/1/login'
params = {'username':username,'password':password}
encodedParams = urllib.urlencode(params)
url = API_LOGIN_ROOT + "?" + encodedParams
request = urllib2.Request(url)
request.add_header('Content-type', 'application/x-www-form-urlencoded')
# we could use urllib2's authentication system, but it seems like overkill for this
auth_header = "Basic %s" % base64.b64encode('%s:%s' % (APPLICATION_ID, MASTER_KEY))
request.add_header('Authorization', auth_header)
request.add_header('X-Parse-Application-Id', APPLICATION_ID)
request.add_header('X-Parse-REST-API-Key', MASTER_KEY)
request.get_method = lambda: http_verb
# TODO: add error handling for server response
response = urllib2.urlopen(request)
#response_body = response.read()
#response_dict = json.loads(response_body)
This is a modification of an open source library used to access the parse rest interface.
I get the following error:
Traceback (most recent call last):
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 703, in __call__
handler.post(*groups)
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 464, in post
url = user.login()
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 313, in login
url = self._executeCall(self.username, self.password, 'GET', data)
File "/Users/nazbot/src/PantryPal_AppEngine/fridgepal.py", line 292, in _executeCall
response = urllib2.urlopen(request)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 513, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 438, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 372, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 521, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 404: Not Found
Can someone point me to where I am screwing up? I'm not quite sure why I'm getting a 404 instead of an access denied or some other issue.
Make sure the "User" class was created on Parse.com as a special user class. When you are adding the class, make sure to change the Class Type to "User" instead of "Custom". A little user head icon will show up next to the class name on the left hand side.
This stumped me for a long time until Matt from the Parse team showed me the problem.
Please change: API_LOGIN_ROOT = 'https://api.parse.com/1/login' to the following: API_LOGIN_ROOT = 'https://api.parse.com/1/login/'
I had the same problem using PHP; adding the / at the end fixed the 404 error.

Why am I getting an AttributeError when trying to print out

I am learning about urllib2 by following this tutorial (http://docs.python.org/howto/urllib2.html#urlerror). Running the code below yields a different outcome from the tutorial:
import urllib2
req = urllib2.Request('http://www.pretend-o-server.org')
try:
    urllib2.urlopen(req)
except urllib2.URLError, e:
    print e.reason
The Python interpreter spits this back:
Traceback (most recent call last):
File "urlerror.py", line 8, in <module>
print e.reason
AttributeError: 'HTTPError' object has no attribute 'reason'
How come this is happening?
UPDATE
When I try to print out the code attribute it works fine
import urllib2
req = urllib2.Request('http://www.pretend-o-server.org')
try:
    urllib2.urlopen(req)
except urllib2.URLError, e:
    print e.code
Depending on the error type, the object e may or may not carry that attribute.
In the link you provided there is a more complete example:
Number 2
from urllib2 import Request, urlopen, URLError
req = Request(someurl)
try:
    response = urlopen(req)
except URLError, e:
    if hasattr(e, 'reason'): # <--
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'): # <--
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
else:
    # everything is fine
    pass
Because there is no such attribute. Try:
print str(e)
and you will get a nice message:
HTTP Error 404: Not Found
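The difference between the two exception types can also be handled with isinstance checks rather than hasattr. This is a sketch for Python 3, where the classes live in urllib.error; describe_error is a hypothetical helper name, not part of the tutorial.

```python
import urllib.error


def describe_error(err):
    """Return a one-line description for an exception raised by urlopen."""
    # HTTPError is a subclass of URLError, so it must be checked first.
    if isinstance(err, urllib.error.HTTPError):
        return 'The server could not fulfill the request: HTTP %d %s' % (err.code, err.msg)
    if isinstance(err, urllib.error.URLError):
        return 'Failed to reach the server: %s' % (err.reason,)
    return 'Unexpected error: %r' % (err,)
```

It would typically be called from an except urllib.error.URLError as e: block around urlopen, which catches both kinds of error.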
The reason I got the AttributeError was that I was using OpenDNS. Apparently, even when you pass in a bogus URL, OpenDNS treats it as if it exists. After switching to Google's DNS server, I get the expected result, which is:
[Errno -2] Name or service not known
I should also mention the traceback I got from running this code, which is everything except the try and except:
from urllib2 import Request, urlopen, URLError, HTTPError
req = Request('http://www.pretend_server.com')
urlopen(req)
It is this:
Traceback (most recent call last):
File "urlerror.py", line 5, in <module>
urlopen(req)
File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.6/urllib2.py", line 397, in open
response = meth(req, response)
File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.6/urllib2.py", line 435, in error
return self._call_chain(*args)
File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
A kind gentle(wo)man from IRC #python told me this was highly strange and then asked if I was using OpenDNS, to which I replied yes. So they suggested I switch to Google's, which I proceeded to do.

Python urllib2 URLError exception?

I installed Python 2.6.2 earlier on a Windows XP machine and ran the following code:
import urllib2
import urllib
page = urllib2.Request('http://www.python.org/fish.html')
urllib2.urlopen( page )
I get the following error.
Traceback (most recent call last):
File "C:\Python26\test3.py", line 6, in <module>
urllib2.urlopen( page )
File "C:\Python26\lib\urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 383, in open
response = self._open(req, data)
File "C:\Python26\lib\urllib2.py", line 401, in _open
'_open', req)
File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 1130, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "C:\Python26\lib\urllib2.py", line 1105, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 11001] getaddrinfo failed>
import urllib2
response = urllib2.urlopen('http://www.python.org/fish.html')
html = response.read()
You're doing it wrong.
Have a look in the urllib2 source, at the line specified by the traceback:
File "C:\Python26\lib\urllib2.py", line 1105, in do_open
raise URLError(err)
There you'll see the following fragment:
try:
    h.request(req.get_method(), req.get_selector(), req.data, headers)
    r = h.getresponse()
except socket.error, err: # XXX what error?
    raise URLError(err)
So, it looks like the source is a socket error, not an HTTP protocol error. Possible reasons: you are not online, you are behind a restrictive firewall, your DNS is down, and so on.
All this aside from the fact, as mcandre pointed out, that your code is wrong.
Name resolution error.
getaddrinfo is used to resolve the hostname (python.org) in your request. If it fails, it means that the name could not be resolved because:
It does not exist, or the records are outdated (unlikely; python.org is a well-established domain name)
Your DNS server is down (unlikely; if you can browse other sites, you should be able to fetch that page through Python)
A firewall is blocking Python or your script from accessing the Internet (most likely; Windows Firewall sometimes does not ask you if you want to allow an application)
You live on an ancient voodoo cemetery. (unlikely; if that is the case, you should move out)
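To check whether name resolution is the failing step, the lookup can be run on its own, outside urllib2. This is a minimal sketch; can_resolve and its resolver parameter are hypothetical names added for illustration, and port 80 is only a placeholder argument.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address."""
    try:
        # Port 80 is only a placeholder; the lookup is what matters here.
        return len(resolver(hostname, 80)) > 0
    except socket.gaierror:
        # Raised when name resolution fails ("getaddrinfo failed").
        return False
```

If can_resolve('www.python.org') returns False from the script while a browser on the same machine can open the site, a firewall or proxy setting affecting Python is a likely suspect.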
Windows Vista, python 2.6.2
It's a 404 page, right?
>>> import urllib2
>>> import urllib
>>>
>>> page = urllib2.Request('http://www.python.org/fish.html')
>>> urllib2.urlopen( page )
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python26\lib\urllib2.py", line 124, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python26\lib\urllib2.py", line 389, in open
response = meth(req, response)
File "C:\Python26\lib\urllib2.py", line 502, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python26\lib\urllib2.py", line 427, in error
return self._call_chain(*args)
File "C:\Python26\lib\urllib2.py", line 361, in _call_chain
result = func(*args)
File "C:\Python26\lib\urllib2.py", line 510, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
>>>
First, I see no reason to import urllib; I've only ever seen urllib2 used to replace urllib entirely and I know of no functionality that's useful from urllib and yet is missing from urllib2.
Next, I notice that http://www.python.org/fish.html gives a 404 error to me. (That doesn't explain the backtrace/exception you're seeing. I get urllib2.HTTPError: HTTP Error 404: Not Found.)
Normally, if you just want to do a default fetch of a web page (without adding special HTTP headers, doing any sort of POST, etc.), then the following suffices:
req = urllib2.urlopen('http://www.python.org/')
html = req.read()
# and req.close() if you want to be pedantic
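For comparison, here is roughly the same default fetch in Python 3, where the response object can be used in a with statement so that it is closed automatically. This is a sketch; the fetch helper and its timeout default are assumptions, not something from the answer above.

```python
import urllib.request


def fetch(url, timeout=10):
    """Fetch `url` and return the body as bytes, closing the response afterwards."""
    # The with statement closes the response even if read() raises.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

For example, html = fetch('http://www.python.org/') returns the page body as bytes.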
