I'm working with an API that I POST files to. When I receive the response, the HTTP status code is a 202. This is expected, but the API also responds with XML content. So in my try/except block, urllib2.urlopen raises a urllib2.HTTPError and the XML content is lost.
try:
    response = urllib2.urlopen(req)
except urllib2.HTTPError, http_e:
    if http_e.code == 202:
        print 'accepted!'
    pass
print response.read()  # UnboundLocalError: local variable 'response' referenced before assignment
How can I accept the 202 and keep the response content without raising an error?
Edit
Being silly, I forgot to inspect the exception that is returned by urllib2. It features all of the properties I've been waxing on about for httplib. This should do the trick for you:
try:
    urllib2.urlopen(req)
except urllib2.HTTPError, e:
    print "Response code", e.code   # prints 202 in your case
    print "Response body", e.read() # prints the body of the response...
                                    # ie: your XML
    print "Headers", e.headers.headers
Original
In this case, given that you're using HTTP as your transport protocol, you'll probably have more luck with the httplib library:
>>> import httplib
>>> conn = httplib.HTTPConnection("www.stackoverflow.com")
>>> conn.request("GET", "/dlkfjadslkfjdslkfjd.html")
>>> r = conn.getresponse()
>>> r.status
301
>>> r.reason
'Moved Permanently'
>>> r.read()
'<head><title>Document Moved</title></head>\n<body><h1>Object Moved</h1>This document may be found here</body>'
You can further use r.getheaders() and so forth to inspect other aspects of the response.
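For example, continuing the session above (the header values shown here are illustrative, not captured output):
>>> r.getheaders()
[('content-length', '0'), ('location', 'https://stackoverflow.com/')]
>>> r.getheader('location')
'https://stackoverflow.com/'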
Related
Making a request to the server, as in the code below, I get status code 500, which is not caught as an exception. The output is "500", but I need all 500 codes to result in sys.exit(). Does requests.exceptions.RequestException not treat 500 as an exception, or is it something else? The requests module docs (http://docs.python-requests.org/en/latest/user/quickstart/#errors-and-exceptions) are not very clear on what falls under this class. How do I make sure that all 500 codes result in sys.exit()?
import requests
import json
import sys

url = 'http://www.XXXXXXXX.com'
headers = {'user': 'me'}

try:
    r = requests.post(url, headers=headers)
    status = r.status_code
    response = json.dumps(r.json(), sort_keys=True, separators=(',', ': '))
    print status
except requests.exceptions.RequestException as e:
    print "- ERROR - Web service exception, msg = {}".format(e)
    if r.status_code < 500:
        print r.status_code
    else:
        sys.exit(-1)
A status code 500 is not an exception: there was a server error when processing the request and the server returned a 500. That is more of a problem with the server than with the request.
You can therefore do away with the try-except:
r = requests.post(url, headers=headers)
status = r.status_code
response = json.dumps(r.json(), sort_keys=True, separators=(',', ': '))
if str(status).startswith('5'):
    ...
From the Requests documentation:
If we made a bad request (a 4XX client error or 5XX server error
response), we can raise it with Response.raise_for_status():
>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404
>>> bad_r.raise_for_status()
Traceback (most recent call last):
File "requests/models.py", line 832, in raise_for_status
raise http_error
requests.exceptions.HTTPError: 404 Client Error
So, use
r = requests.post(url, headers=headers)
try:
    r.raise_for_status()
except requests.exceptions.HTTPError:
    sys.exit(-1)  # gave a 500 or 404
else:
    pass  # Move on with your life! Yay!
If you want a successful request but a "non-OK" response to raise an error, call response.raise_for_status(). You can then catch that error and handle it appropriately. It raises a requests.exceptions.HTTPError with the response object attached to the error.
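As a minimal sketch of that pattern, reusing the url and headers from the question, the original response is then available on the error:

import sys
import requests

url = 'http://www.XXXXXXXX.com'
headers = {'user': 'me'}

r = requests.post(url, headers=headers)
try:
    r.raise_for_status()
except requests.exceptions.HTTPError as e:
    # e.response is the same Response object that failed the check
    if e.response.status_code >= 500:
        sys.exit(-1)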
I have to make a series of requests to my local server and check the response. Basically I am trying to hit the right URL by brute forcing. This is my code:
for i in range(48, 126):
    test = chr(i)
    urln = '012a4' + test
    url = {"tk": urln}
    data = urllib.urlencode(url)
    print data
    request = urllib2.Request("http://127.0.0.1/brute.php", data)
    response = urllib2.urlopen(request)
    status_code = response.getcode()
I have to make requests like: http://127.0.0.1/brute.php?tk=some_val
I am getting an error because the URL is not being encoded properly. I get an internal server error 500 even though one of the URLs in the series should give a 200; requesting that URL manually confirms it. Also, what is the right way to skip 500/400 errors until I get a 200?
When using urllib2 you should always handle any exceptions that are raised as follows:
import urllib, urllib2

for i in range(0x012a40, 0x012a8e):
    url = {"tk": '{:x}'.format(i)}
    data = urllib.urlencode(url)
    print data
    try:
        request = urllib2.Request("http://127.0.0.1/brute.php", data)
        response = urllib2.urlopen(request)
        status_code = response.getcode()
    except urllib2.URLError, e:
        print e.reason
This will display the following when the connection fails, and then continue to try the next connection:
[Errno 10061] No connection could be made because the target machine actively refused it
e.reason gives you the textual reason, and the wrapped socket error (when there is one) carries the numeric error code. So you could still stop if the error was something other than 10061, for example.
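For example, a sketch of that early-stop built on the loop above (the getattr guard is an assumption: reason only carries errno when the failure came from the socket layer):

import urllib, urllib2

for i in range(0x012a40, 0x012a8e):
    data = urllib.urlencode({"tk": '{:x}'.format(i)})
    try:
        response = urllib2.urlopen(urllib2.Request("http://127.0.0.1/brute.php", data))
        print response.getcode()
    except urllib2.URLError, e:
        # stop on anything other than "connection refused" (errno 10061)
        err_no = getattr(e.reason, 'errno', None)
        if err_no is not None and err_no != 10061:
            raise
        print e.reason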
Lastly, you seem to be cycling through a range of numbers in hex format? You might find it easier to work directly with 0x formatting to build your strings.
It sounds like you will benefit from a try/except block:
for i in range(48, 126):
    test = chr(i)
    urln = '012a4' + test
    url = {"tk": urln}
    data = urllib.urlencode(url)
    print data
    request = urllib2.Request("http://127.0.0.1/brute.php", data)
    try:
        response = urllib2.urlopen(request)
        status_code = response.getcode()
    except urllib2.HTTPError, e:
        status_code = e.code  # the error response still carries a status code
    print status_code
You typically would also want to catch the error itself:
except Exception, e:
    print e
Or catch specific errors only, for example:
except ValueError:
    # do stuff
Though you wouldn't get a ValueError in your code.
In Python, when an HTTP request is invalid, the response is None; in that case, how do I get the response code from the response? The invalid requests in my code have two causes: one is an invalid token, for which I expect a 401, and the other is an invalid parameter, for which I expect a 400. But in both cases response is always None, so I am not able to get the response code by calling response.getcode(). How do I solve this?
req = urllib2.Request(url)
response = None
try:
    response = urllib2.urlopen(req)
except urllib2.URLError as e:
    res_code = response.getcode()  # AttributeError: 'NoneType' object has no attribute 'getcode'
You can't get the status code when URLError is raised, because when it is raised (e.g. DNS couldn't resolve the domain name), the request hasn't reached the server yet, so no HTTP response was generated.
In your scenario (a 4xx HTTP status code), urllib2 throws HTTPError instead, and you can derive the status code from it.
The documentation says:
code
An HTTP status code as defined in RFC 2616. This numeric value corresponds to a value found in the dictionary of codes as found in BaseHTTPServer.BaseHTTPRequestHandler.responses.
import urllib2

request = urllib2.Request(url)
try:
    response = urllib2.urlopen(request)
    res_code = response.code
except urllib2.HTTPError as e:
    res_code = e.code
Hope this helps.
I've been searching all around for a Python 3.x code sample to get HTTP Header information.
Something as simple as PHP's get_headers cannot be found easily in Python. Or maybe I am not sure how best to wrap my head around it.
In essence, I would like to code something where I can see whether a URL exists or not,
something along the lines of
h = get_headers(url)
if(h[0] == 200)
{
print("Bingo!")
}
So far, I tried
h = http.client.HTTPResponse('http://docs.python.org/')
But I always got an error.
To get an HTTP response code in python-3.x, use the urllib.request module:
>>> import urllib.request
>>> response = urllib.request.urlopen(url)
>>> response.getcode()
200
>>> if response.getcode() == 200:
... print('Bingo')
...
Bingo
The returned HTTPResponse Object will give you access to all of the headers, as well. For example:
>>> response.getheader('Server')
'Apache/2.2.16 (Debian)'
If the call to urllib.request.urlopen() fails, an HTTPError Exception is raised. You can handle this to get the response code:
import urllib.request
import urllib.error

try:
    response = urllib.request.urlopen(url)
    if response.getcode() == 200:
        print('Bingo')
    else:
        print('The response code was not 200, but: {}'.format(
            response.getcode()))
except urllib.error.HTTPError as e:
    print('''An error occurred: {}
The response code was {}'''.format(e, e.getcode()))
For Python 2.x
urllib, urllib2, or httplib can be used here. Note, however, that urllib and urllib2 both use httplib under the hood; so if you plan to do this check a lot (thousands of times), it would be better to use httplib directly. Additional documentation and examples are here.
Example code:
import httplib

try:
    h = httplib.HTTPConnection("www.google.com")
    h.connect()
except Exception as ex:
    print "Could not connect to page."
For Python 3.x
A similar story to urllib (or urllib2) and httplib from Python 2.x applies to the urllib.request and http.client libraries in Python 3.x. Again, http.client should be quicker. For more documentation and examples look here.
Example code:
import http.client

try:
    conn = http.client.HTTPConnection("www.google.com")
    conn.connect()
except Exception as ex:
    print("Could not connect to page.")
and if you wanted to check the status codes you would need to replace
conn.connect()
with
conn.request("GET", "/index.html")  # Could also use "HEAD" instead of "GET".
res = conn.getresponse()
if res.status == 200 or res.status == 302:  # Specify codes here.
    print("Page Found!")
Note, in both examples, if you would like to catch the specific exception relating to when the URL doesn't exist, rather than all of them, catch the socket.gaierror exception instead (see the socket documentation).
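For example, a minimal sketch of catching just that case (the hostname here is a deliberately unresolvable placeholder):

import http.client
import socket

try:
    conn = http.client.HTTPConnection("www.doesnotexist.invalid")
    conn.connect()
except socket.gaierror as ex:
    # raised when the hostname cannot be resolved
    print("Could not resolve host: {}".format(ex))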
You can use requests module to check it:
import requests

url = "http://www.example.com/"
res = requests.get(url)
if res.status_code == 200:
    print("bingo")
You can also check the header contents before downloading the whole webpage by using a HEAD request.
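A minimal sketch of that, again against http://www.example.com/ (requests.head fetches only the status line and headers, not the body):

import requests

url = "http://www.example.com/"
res = requests.head(url)
if res.status_code == 200:
    print("bingo")
    print(res.headers.get('Content-Type'))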
You can use the urllib2 library:
import urllib2

if urllib2.urlopen(url).code == 200:
    print "Bingo"
According to the urllib2 documentation,
Because the default handlers handle redirects (codes in the 300 range), and codes in the 100-299 range indicate success, you will usually only see error codes in the 400-599 range.
And yet the following code
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)
raises an HTTPError with code 201 (created):
ERROR 2011-08-11 20:40:17,318 __init__.py:463] HTTP Error 201: Created
So why is urllib2 throwing HTTPErrors on this successful request?
It's not too much of a pain; I can easily extend the code to:
try:
    request = urllib2.Request(url, data, headers)
    response = urllib2.urlopen(request)
except urllib2.HTTPError, e:
    if e.code == 201:
        pass  # success! :)
    else:
        pass  # fail! :(
else:
    pass  # when will this happen...?
But this doesn't seem like the intended behavior, based on the documentation and the fact that I can't find similar questions about this odd behavior.
Also, what should the else block be expecting? If successful status codes are all interpreted as HTTPErrors, then when does urllib2.urlopen() just return a normal file-like response object like all the urllib2 documentation refers to?
You can write a custom Handler class for use with urllib2 to prevent specific error codes from being raised as HTTError. Here's one I've used before:
class BetterHTTPErrorProcessor(urllib2.BaseHandler):
    # a substitute/supplement to urllib2.HTTPErrorProcessor
    # that doesn't raise exceptions on status codes 201, 204, 206
    def http_error_201(self, request, response, code, msg, hdrs):
        return response
    def http_error_204(self, request, response, code, msg, hdrs):
        return response
    def http_error_206(self, request, response, code, msg, hdrs):
        return response
Then you can use it like:
opener = urllib2.build_opener(BetterHTTPErrorProcessor)
urllib2.install_opener(opener)
req = urllib2.Request(url, data, headers)
urllib2.urlopen(req)
As the actual library documentation mentions:
For 200 error codes, the response object is returned immediately.
For non-200 error codes, this simply passes the job on to the protocol_error_code handler methods, via OpenerDirector.error(). Eventually, urllib2.HTTPDefaultErrorHandler will raise an HTTPError if no other handler handles the error.
http://docs.python.org/library/urllib2.html#httperrorprocessor-objects
I personally think it was a mistake and very nonintuitive for this to be the default behavior.
It's true that non-2XX codes imply a protocol level error, but turning that into an exception is too far (in my opinion at least).
In any case, I think the most elegant way to avoid this is:
import urllib.request

opener = urllib.request.build_opener()
for processor in opener.process_response['https']:  # or 'http', depending on what you're using
    if isinstance(processor, urllib.request.HTTPErrorProcessor):  # HTTPErrorProcessor also handles https
        opener.process_response['https'].remove(processor)
        break  # there's only one such handler by default

response = opener.open('https://www.google.com')
Now you have the response object, and you can check its status code, headers, body, etc.
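For instance, a quick sketch of the accessors involved (the object returned by opener.open() here is an http.client.HTTPResponse):

print(response.status)        # e.g. 200, or a non-2XX code that no longer raises
print(response.getheaders())  # list of (name, value) tuples
body = response.read()        # the raw body, as bytes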