In this little piece of code, what is the fourth line all about?
from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
if result.status_code == 200:
    doSomethingWithResult(result.content)
It's an HTTP status code; it means "OK" (i.e., the server successfully answered the HTTP request).
See the list of HTTP status codes on Wikipedia.
Whoever wrote that should have used a constant instead of a magic number. The httplib module (http.client in Python 3) has all the HTTP response codes.
E.g.:
>>> import httplib
>>> httplib.OK
200
>>> httplib.NOT_FOUND
404
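On Python 3, the standard library also offers the http.HTTPStatus enum, which can be compared directly with integers. A minimal sketch of the same check with a named constant; is_ok is a hypothetical helper name used only for illustration:

```python
from http import HTTPStatus


def is_ok(status_code):
    """Return True when the status code is 200 OK (hypothetical helper)."""
    return status_code == HTTPStatus.OK


# Each member carries both the numeric value and the reason phrase.
print(HTTPStatus.OK.value, HTTPStatus.OK.phrase)
print(HTTPStatus.NOT_FOUND.value, HTTPStatus.NOT_FOUND.phrase)
```

With the snippet from the question, the condition would then read is_ok(result.status_code) instead of comparing against a bare 200.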
200 is the HTTP status code for "OK", a successful response. (Other codes you may be familiar with are 404 Not Found, 403 Forbidden, and 500 Internal Server Error.)
See RFC 2616 for more information.
url = "https://www.avito.ma/fr/2_mars/sacs_et_accessoires/Ch%C3%A2les_en_Vrai_Soie_Chanel_avec_boite_38445885.htm"
try:
    r = requests.get(url, headers={'User-Agent': ua.random}, timeout=timeout)  # execute a timed website request
    if r.status_code > 299:  # check for bad status
        r.raise_for_status()  # if confirmed raise bad status
    else:
        print(r.status_code, url)  # otherwise print status code and url
except Exception as e:
    print('\nThe following exception: {0}, \nhas been found on the following post: "{1}".\n'.format(e, url))
Expected status = 301 Moved Permanently
You can visit the page or check http://www.redirect-checker.org/index.php with the url for a correct terminal print.
Returned status = 200 OK
The page has been moved, so it should return the 301 Moved Permanently above; however, it returns a 200. I read the requests docs and checked all the parameters (allow_redirects=False etc.), but I don't think it is a configuration mistake.
I am puzzled at why requests wouldn't see the redirects.
Any ideas?
Thank you in advance.
The Python Requests module sets the allow_redirects parameter to True by default. I've tested it with False and it gives the 301 code that you're looking for.
Note, after reading your comment above: r.history holds each response that preceded the final one, whose code is stored in r.status_code (only if you leave the parameter set to True).
I'm using Python 3.7 with urllib.
Everything works fine, but it does not seem to redirect automatically when it gets an HTTP redirect response (307).
This is the error I get:
ERROR 2020-06-15 10:25:06,968 HTTP Error 307: Temporary Redirect
I have to handle it with a try-except and manually send another request to the new Location. It works fine, but I don't like it.
This is the piece of code I use to perform the request:
req = urllib.request.Request(url)
req.add_header('Authorization', auth)
req.add_header('Content-Type','application/json; charset=utf-8')
req.data=jdati
self.logger.debug(req.headers)
self.logger.info(req.data)
resp = urllib.request.urlopen(req)
url is an https resource, and I set headers with some Authorization info and the content type.
req.data is JSON.
From the urllib documentation I understood that redirects are performed automatically by the library itself, but it doesn't work for me. It always raises an HTTP 307 error and doesn't follow the redirect URL.
I've also tried to use an opener specifying the default redirect handler, but with the same result:
opener = urllib.request.build_opener(urllib.request.HTTPRedirectHandler)
req = urllib.request.Request(url)
req.add_header('Authorization', auth)
req.add_header('Content-Type','application/json; charset=utf-8')
req.data=jdati
resp = opener.open(req)
What could be the problem?
The reason why the redirect isn't done automatically has been correctly identified by yours truly in the discussion in the comments section. Specifically, RFC 2616, Section 10.3.8 states that:
If the 307 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Back to the question: given that data has been assigned, get_method automatically returns POST (as per how this method is implemented). Since the request method is POST and the response code is 307, an HTTPError is raised instead, as per the above specification. In the context of Python's urllib, this specific section of the urllib.request module raises the exception.
For an experiment, try the following code:
import urllib.error
import urllib.parse
import urllib.request

url = 'http://httpbin.org/status/307'
req = urllib.request.Request(url)
req.data = b'hello'  # comment out to not trigger manual redirect handling
try:
    resp = urllib.request.urlopen(req)
except urllib.error.HTTPError as e:
    if e.status != 307:
        raise  # not a status code that can be handled here
    redirected_url = urllib.parse.urljoin(url, e.headers['Location'])
    resp = urllib.request.urlopen(redirected_url)
    print('Redirected -> %s' % redirected_url)  # the original redirected url
print('Response URL -> %s ' % resp.url)  # the final url
Running the code as is may produce the following:
Redirected -> http://httpbin.org/redirect/1
Response URL -> http://httpbin.org/get
Note that the subsequent redirect to /get was done automatically, as the subsequent request was a GET request. Commenting out the req.data assignment line will result in the "Redirected" output line not being printed.
Two other things are worth noting in the exception handling block. First, e.read() may be called to retrieve the response body produced by the server as part of the HTTP 307 response (since data was posted, there might be a short entity in the response that can be processed). Second, urljoin is needed because the Location header may be a relative URL (or may simply have the host missing) for the subsequent resource.
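An alternative to catching the exception, sketched here under assumptions rather than taken from the urllib documentation: subclass urllib.request.HTTPRedirectHandler and override redirect_request so that a 307 answer to a POST is re-sent with the same method and body. The class name PostPreservingRedirectHandler is hypothetical. Note that this skips the user confirmation the RFC asks for and re-sends the original headers (including Authorization) to the new location, so it is only reasonable when the redirect target is trusted.

```python
import urllib.request


class PostPreservingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: follow 307 for POST, keeping method and body."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code == 307 and req.get_method() == "POST":
            # Build a new request for the redirect target, re-sending the
            # original body and headers (assumption: the target is trusted).
            return urllib.request.Request(
                newurl,
                data=req.data,
                headers=req.headers,
                origin_req_host=req.origin_req_host,
                unverifiable=True,
                method="POST",
            )
        # Every other case keeps the default behaviour.
        return super().redirect_request(req, fp, code, msg, headers, newurl)


# Passing the subclass replaces the default redirect handler in the opener.
opener = urllib.request.build_opener(PostPreservingRedirectHandler)
```

The request from the question would then be sent with opener.open(req) instead of urllib.request.urlopen(req).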
Also, as a matter of interest (and for linkage purposes), this specific question has been asked multiple times before, and I am rather surprised that none of them ever got an answer. They are:
How to handle 307 redirection using urllib2 from http to https
HTTP Error 307: Temporary Redirect in Python3 - INTRANET
HTTP Error 307 - Temporary redirect in python script
I cannot wrap my brain around this issue:
When I run this code in my IDE (PyCharm), or via the command line, I get a 204 HTTP response and no content. When I set breakpoints in my debugger to see what is happening, the code executes fine, and r.content and r.text are populated with the results from the request. r.status_code also has a value of 200 when running in the debugger.
code:
r = requests.post(self.dispatchurl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
print 'first request to get sid: status {}'.format(r.status_code)
json_data = json.loads(r.text)
self.sid = json_data['sid']
print 'the sid is: {}'.format(self.sid)
self.getresulturl = '{}/services/search/jobs/{}/results{}'.format(self.url, self.sid, self.outputmode)
x = requests.get(self.getresulturl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
print 'second request to get the data: status {}'.format(x.status_code)
print 'content: {}'.format(x.text)
output when run through debugger:
first request to get sid: status 201
the sid is: sanitizedatahere
second request to get the data: status 200
content: {"preview":false...}
Process finished with exit code 0
When I execute the code normally, without the debugger, I get a 204 on the second response.
output:
first request to get sid: status 201
the sid is: sanitizedatahere
second request to get the data: status 204
content:
Process finished with exit code 0
I am guessing this has something to do with the debugger slowing down the requests and allowing the server to respond with the data? This seems like a race condition. I've never run into this with requests.
Is there something I am doing wrong? I'm at a loss. Thanks in advance for looking.
Solved by adding this loop:
while r.status_code == 204:
    time.sleep(1)
    r = requests.get(self.resulturl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
As I suspected, the REST API was taking longer to collect results, hence the 204. Running under the debugger slowed the process enough for the API to complete the initial request, thus giving a 200.
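A loop like the one above never ends if the server keeps answering 204. A bounded variant could look like the following Python 3 sketch; poll_until_content, fetch, max_attempts and delay are hypothetical names, and fetch stands for any zero-argument callable that returns an object with a status_code attribute (for example, a lambda wrapping the requests.get call above):

```python
import time


def poll_until_content(fetch, max_attempts=30, delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning 204, or give up.

    fetch is a hypothetical zero-argument callable returning an object
    with a status_code attribute.
    """
    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code != 204:
            return response  # the results are ready (or an error occurred)
        if attempt < max_attempts - 1:
            sleep(delay)  # give the server time to finish the job
    raise TimeoutError("still 204 after %d attempts" % max_attempts)
```

The sleep parameter is only there so the wait can be replaced in tests; in normal use the default time.sleep applies.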
The HTTP 204 No Content success status response code indicates that the request has succeeded, but that the client doesn't need to navigate away from its current page. A 204 response is cacheable by default.
The loop below would solve the issue.
r = requests.get(splunk_end, headers=headers, verify=False)
while r.status_code == 204:
    time.sleep(1)
    r = requests.get(splunk_end, headers=headers, verify=False)
After retrying, the 204 response is eventually followed by a 200. Please check the logs below.
https://localhost:8089/services/search/jobs/4D44-A45E-7BDB8F0BE473/results?output_mode=json
/usr/lib/python2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
/usr/lib/python2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
<Response [204]>
/usr/lib/python2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
<Response [200]>
Thanks
In this case, right after the SID is generated, the results are requested while the job status request returns 200 but the dispatchState is not yet DONE. That is why the results request returns 204.
We can keep checking the status of the job ("<s:key name="dispatchState">DONE</s:key>") by filtering. Once the dispatchState shows DONE, request the results; the response code will then be 200 straight away.
I am new to Python. Can anyone tell me which Python tools I should use to get my work done? Is there a good way to build a Python script that automatically finds these 404 and 5XX requests? Thanks in advance!
We can check the response status code:
>>> r = requests.get('http://httpbin.org/get')
>>> r.status_code
200
Requests also comes with a built-in status code lookup object for easy reference:
>>> r.status_code == requests.codes.ok
True
If we made a bad request (a 4XX client error or 5XX server error response), we can raise it with Response.raise_for_status():
>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404
But, since our status_code for r was 200, when we call raise_for_status() we get:
>>> r.raise_for_status()
None
Reference: the Requests documentation.
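To find the 404 and 5XX responses automatically, one possible sketch uses only the standard library (urllib instead of Requests) under Python 3. The function names fetch_status and find_failures are hypothetical, and connection failures (urllib.error.URLError) are deliberately left unhandled to keep it short:

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, including 4XX/5XX codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # urllib raises on 4XX/5XX; the code is still available


def find_failures(urls, get_status=fetch_status):
    """Return (url, status) pairs whose status is 404 or 5XX."""
    failures = []
    for url in urls:
        status = get_status(url)
        if status == 404 or 500 <= status <= 599:
            failures.append((url, status))
    return failures
```

Calling find_failures with a list of URLs returns only the failing ones; the get_status parameter exists so a different client, such as a function wrapping requests.get, can be swapped in.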
I use Flask framework in my project with pure json api. It renders only json responses without html or static files.
I am trying to make abort() work with a custom HTTP code, in my case 204 (No Content), which isn't defined by default. I currently have code like:
# Error define
class NoContent(HTTPException):
    code = 204
    description = ('No Content')

abort.mapping[204] = NoContent

def make_json_error(ex):
    response = jsonify(error=str(ex))
    response.status_code = (ex.code
                            if isinstance(ex, HTTPException)
                            else 500)
    return response

custom_exceptions = {}
custom_exceptions[NoContent.code] = NoContent
for code in custom_exceptions.iterkeys():
    app.error_handler_spec[None][code] = make_json_error

# Route
@app.route("/results/<name>")
def results(name=None):
    return jsonify(data=results) if results else abort(204)
It works well; I get a response like:
127.0.0.1 - - [02/Dec/2014 10:51:09] "GET /results/test HTTP/1.1" 204 -
But without any content. It renders nothing, not even blank white page in browser.
I can use errorhandler
@app.errorhandler(204)
def error204(e):
    response = jsonify(data=[])
    return response
But it returns a 200 HTTP code. I need 204 here. When I add a line like this to error204():
response.status_code = 204
It renders nothing once again.
I am stuck and I have no idea where there is an error with this approach. Please help.
If my approach is wrong from design perspective please propose something else.
Thanks in advance.
Remember, HTTP 204 is "No Content". RFC 7231 (and RFC 2616 before it) requires that user-agents ignore everything after the last header line:
The 204 (No Content) status code indicates that the server has successfully fulfilled the request and that there is no additional content to send in the response payload body ... A 204 response is terminated by the first empty line after the header fields because it cannot contain a message body.
~ RFC 7231 (emphasis mine)
The 204 response MUST NOT include a message-body, and thus is always terminated by the first empty line after the header fields.
~ RFC 2616
You need to return the status code in the error handler.
@app.errorhandler(204)
def error204(e):
    response = jsonify(data=[])
    return response, 204
Leaving off the status code is interpreted as 200 by Flask.