I cannot wrap my brain around this issue:
When I run this code in my IDE (PyCharm) or from the command line, I get a 204 HTTP response and no content. When I set breakpoints in the debugger to see what is happening, the code executes fine, and r.content and r.text are populated with the results of the request. r.status_code is also 200 when running in the debugger.
code:
r = requests.post(self.dispatchurl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
print 'first request to get sid: status {}'.format(r.status_code)
json_data = json.loads(r.text)
self.sid = json_data['sid']
print 'the sid is: {}'.format(self.sid)
self.getresulturl = '{}/services/search/jobs/{}/results{}'.format(self.url, self.sid, self.outputmode)
x = requests.get(self.getresulturl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
print 'second request to get the data: status {}'.format(x.status_code)
print 'content: {}'.format(x.text)
output when run through debugger:
first request to get sid: status 201
the sid is: sanitizedatahere
second request to get the data: status 200
content: {"preview":false...}
Process finished with exit code 0
When I execute the code normally, without the debugger, I get a 204 on the second request.
output:
first request to get sid: status 201
the sid is: sanitizedatahere
second request to get the data: status 204
content:
Process finished with exit code 0
I am guessing this has something to do with the debugger slowing down the requests and allowing the server to respond with the data? This seems like a race condition. I've never run into this with requests.
Is there something I am doing wrong? I'm at a loss. Thanks in advance for looking.
Solved by adding this loop:
while r.status_code == 204:
    time.sleep(1)
    r = requests.get(self.resulturl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd))
As I suspected, the REST API was taking longer to collect results, hence the 204. The debugger slowed the process down long enough for the API to complete the initial request, which is why I got a 200 there.
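A loop like this waits forever if the job never finishes. Below is a minimal Python 3 sketch of the same idea with an upper bound on the wait. `fetch` is a hypothetical zero-argument callable that wraps the `requests.get(...)` call and returns the response; the `interval` and `timeout` defaults are arbitrary assumptions.

```python
import time


def poll_while_204(fetch, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch() until it returns a status other than 204 or time runs out.

    fetch is a hypothetical zero-argument callable returning an object with
    a status_code attribute (for example a requests.Response).
    """
    deadline = time.monotonic() + timeout
    response = fetch()
    while response.status_code == 204:
        if time.monotonic() >= deadline:
            raise TimeoutError("results still not ready after %s seconds" % timeout)
        sleep(interval)  # give the server time to finish the job
        response = fetch()
    return response
```

With the code from the question, this might be called as `x = poll_while_204(lambda: requests.get(self.getresulturl, verify=False, auth=HTTPBasicAuth(self.user, self.passwd)))`.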
The HTTP 204 No Content success status response code indicates that the request has succeeded, but that the client doesn't need to go away from its current page. A 204 response is cacheable by default.
The loop below solves the issue.
r = requests.get(splunk_end, headers=headers, verify=False)
while r.status_code == 204:
    time.sleep(1)
    r = requests.get(splunk_end, headers=headers, verify=False)
The 204 response gives way to a 200 once the results are ready. Please check the logs below.
https://localhost:8089/services/search/jobs/4D44-A45E-7BDB8F0BE473/results?output_mode=json
/usr/lib/python2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
/usr/lib/python2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
<Response [204]>
/usr/lib/python2.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py:768: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html
InsecureRequestWarning)
<Response [200]>
Thanks
In this case, the results are requested immediately after the SID is generated. The job status request returns 200, but the dispatchState is not yet DONE, so the results request returns 204.
We can keep checking the status of the job by filtering for <s:key name="dispatchState">DONE</s:key>. Once the dispatchState shows DONE, request the results; the response code will then be 200 straight away.
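The dispatchState check can be sketched as follows. This assumes the job status body contains the `<s:key name="dispatchState">` element exactly as quoted; the helper names are made up for illustration, and a real client would probably use a proper XML parser instead of a regular expression.

```python
import re

# Matches the element quoted in the answer above.
_DISPATCH_STATE = re.compile(r'<s:key name="dispatchState">([^<]*)</s:key>')


def dispatch_state(status_body):
    """Return the dispatchState found in a job status body, or None."""
    match = _DISPATCH_STATE.search(status_body)
    return match.group(1) if match else None


def job_is_done(status_body):
    """True once the job reports dispatchState DONE."""
    return dispatch_state(status_body) == "DONE"
```

The results URL would then be requested only after `job_is_done(...)` returns True for the text of the job status response.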
Related
url = "https://www.avito.ma/fr/2_mars/sacs_et_accessoires/Ch%C3%A2les_en_Vrai_Soie_Chanel_avec_boite_38445885.htm"
try:
    r = requests.get(url, headers={'User-Agent': ua.random}, timeout=timeout) # execute a timed website request
    if r.status_code > 299: # check for bad status
        r.raise_for_status() # if confirmed raise bad status
    else:
        print(r.status_code, url) # otherwise print status code and url
except Exception as e:
    print('\nThe following exception: {0}, \nhas been found on the following post: "{1}".\n'.format(e, url))
Expected status = 301 Moved Permanently
You can visit the page or check http://www.redirect-checker.org/index.php with the url for a correct terminal print.
Returned status = 200 OK
The page has been moved and it should return the above 301 Moved Permanently; however, it returns a 200. I read the requests docs and checked all the parameters (allow_redirects=False etc.), but I don't think it is a configuration mistake.
I am puzzled at why requests wouldn't see the redirects.
Any ideas?
Thank you in advance.
The Python Requests module has the allow_redirects parameter set to True by default. I've tested it with False, and it gives the 301 code that you're looking for.
Note after reading your comment above: r.history holds each response (and its status code) that came before the final one, whose code is in r.status_code. It is only populated if you leave the parameter set to True.
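To see the whole chain at once, a small helper can walk r.history. This is only a sketch: `redirect_chain` is a made-up name, and it relies on nothing beyond the `.history`, `.status_code` and `.url` attributes of a requests response.

```python
def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, final response last.

    response is expected to look like a requests.Response: it needs
    .history (a list of earlier responses), .status_code and .url.
    """
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]
```

`redirect_chain(requests.get(url))` would list the 301 ahead of the final 200, while `requests.get(url, allow_redirects=False)` stops at the first response, so its `status_code` is the 301 itself and its history is empty.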
I'm using Python 3.7 with urllib.
Everything works fine, but it does not seem to redirect automatically when it gets an HTTP redirect response (307).
This is the error I get:
ERROR 2020-06-15 10:25:06,968 HTTP Error 307: Temporary Redirect
I have to handle it with a try-except and manually send another request to the new Location. It works fine, but I don't like it.
This is the piece of code I use to perform the request:
req = urllib.request.Request(url)
req.add_header('Authorization', auth)
req.add_header('Content-Type','application/json; charset=utf-8')
req.data=jdati
self.logger.debug(req.headers)
self.logger.info(req.data)
resp = urllib.request.urlopen(req)
url is an HTTPS resource, and I set headers with some Authorization info and the content type.
req.data is JSON.
From the urllib documentation I understood that redirects are performed automatically by the library itself, but it doesn't work for me. It always raises an HTTP 307 error and doesn't follow the redirect URL.
I've also tried using an opener that specifies the default redirect handler, but with the same result:
opener = urllib.request.build_opener(urllib.request.HTTPRedirectHandler)
req = urllib.request.Request(url)
req.add_header('Authorization', auth)
req.add_header('Content-Type','application/json; charset=utf-8')
req.data=jdati
resp = opener.open(req)
What could be the problem?
The reason why the redirect isn't done automatically has been correctly identified by yours truly in the discussion in the comments section. Specifically, RFC 2616, Section 10.3.8 states that:
If the 307 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Back to the question: because data has been assigned, get_method returns POST (that is how the method is implemented). Since the request method is POST and the response code is 307, an HTTPError is raised, as per the above specification. In Python's urllib, it is this specific section of the urllib.request module that raises the exception.
For an experiment, try the following code:
import urllib.request
import urllib.parse
url = 'http://httpbin.org/status/307'
req = urllib.request.Request(url)
req.data = b'hello' # comment out to not trigger manual redirect handling
try:
    resp = urllib.request.urlopen(req)
except urllib.error.HTTPError as e:
    if e.status != 307:
        raise # not a status code that can be handled here
    redirected_url = urllib.parse.urljoin(url, e.headers['Location'])
    resp = urllib.request.urlopen(redirected_url)
    print('Redirected -> %s' % redirected_url) # the original redirected url
print('Response URL -> %s ' % resp.url) # the final url
Running the code as is may produce the following
Redirected -> http://httpbin.org/redirect/1
Response URL -> http://httpbin.org/get
Note the subsequent redirect to get was done automatically, as the subsequent request was a GET request. Commenting out req.data assignment line will result in the lack of the "Redirected" output line.
Two other things worth noting in the exception handling block: e.read() may be called to retrieve the response body that the server sent with the HTTP 307 response (since data was posted, there might be a short entity in the response worth processing), and urljoin is needed because the Location header may be a relative URL (or simply lack the host) pointing to the subsequent resource.
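If resending the POST is acceptable for a given service, the try/except can be replaced by a redirect handler that keeps the method and body on a 307. The following is a sketch under that assumption, not what urllib does by default: the class name is made up, and it deliberately goes against the "MUST NOT automatically redirect" rule quoted earlier, so it only fits cases where the caller has already decided that resending is safe.

```python
import urllib.request


class Resend307Handler(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: follow a 307 with the same method and body."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code == 307 and req.get_method() not in ("GET", "HEAD"):
            # Caution: this resends the body and every header, including
            # Authorization, to whatever URL the server named.
            return urllib.request.Request(
                newurl,
                data=req.data,
                headers=req.headers,
                origin_req_host=req.origin_req_host,
                unverifiable=True,
                method=req.get_method(),
            )
        # Everything else keeps the standard library behaviour.
        return super().redirect_request(req, fp, code, msg, headers, newurl)
```

It would be installed with `opener = urllib.request.build_opener(Resend307Handler)` and used as `resp = opener.open(req)`.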
Also, as a matter of interest (and for linkage purposes), this specific question has been asked multiple times before, and I am rather surprised that none of them ever got an answer. They are:
How to handle 307 redirection using urllib2 from http to https
HTTP Error 307: Temporary Redirect in Python3 - INTRANET
HTTP Error 307 - Temporary redirect in python script
I am planning to send a request to a server with the following code. I have spent more than a day trying to resolve this, without any progress. Please forgive me for hiding the real URL; company security policy requires it.
import requests
get_ci = requests.session()
get_ci_url = 'https://this_is_a_fake_URL_to_paste_in_stackoverflow.JSON'
get_ci_param_dict = {"Username": "fake","Password": "fakefakefake","CIType": "system","CIID": "sampleid","CIName": "","AttrFilter": "","SubObjFilter": ""}
get_ci_param_str = str(get_ci_param_dict)
print(get_ci_param_dict)
print(get_ci_param_str)
get_ci_result = get_ci.request('POST', url=get_ci_url, params=get_ci_param_str, verify=False)
print(get_ci_result.status_code)
print(get_ci_result.text)
And what I get in the Run result is,
C:\Python34\python.exe C:/Users/this/is/the/fake/path/Test_02.py
{'CIID': 'sampleid', 'CIType': 'system', 'AttrFilter': '', 'Password': 'fake', 'CIName': '', 'Username': 'fake', 'SubObjFilter': ''}
{'CIID': 'sampleid', 'CIType': 'system', 'AttrFilter': '', 'Password': 'fake', 'CIName': '', 'Username': 'fake', 'SubObjFilter': ''}
C:\Python34\lib\requests\packages\urllib3\connectionpool.py:843: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
500
<ns1:XMLFault xmlns:ns1="http://cxf.apache.org/bindings/xformat"><ns1:faultstring xmlns:ns1="http://cxf.apache.org/bindings/xformat">*org.codehaus.jettison.json.JSONException: A JSONObject text must begin with '{' at character 0 of* </ns1:faultstring></ns1:XMLFault>
Process finished with exit code 0
More tips:
I have contacted the server-side developer. All they need is a string in JSON format sent as a "parameter", which suggests that using params in request() is correct.
I have tried json.dumps(get_ci_param_dict) => the same result.
Requesting only the server's root returns a 200, which shows the URL is OK.
Additional logs after changing params to data:
C:\Python34\lib\requests\packages\urllib3\connectionpool.py:843: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
500
<html><head><title>Apache Tomcat/7.0.61 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 500 - 1</h1><HR size="1" noshade="noshade"><p><b>type</b> Exception report</p><p><b>message</b> <u>1</u></p><p><b>description</b> <u>The server encountered an internal error that prevented it from fulfilling this request.</u></p><p><b>exception</b> <pre>java.lang.ArrayIndexOutOfBoundsException: 1
com.fake.security.XSSHttpReuquestWrapper.GeneralParameters(XSSHttpReuquestWrapper.java:158)
com.fake.security.XSSHttpReuquestWrapper.checkParameter(XSSHttpReuquestWrapper.java:101)
com.fake.security.XSSHttpReuquestWrapper.validateParameter(XSSHttpReuquestWrapper.java:142)
com.fake.security.XSSSecurityFilter.doFilter(XSSSecurityFilter.java:35)
com.fake.webservice.interceptor.GetContextFilter.doFilter(GetContextFilter.java:24)
org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88)
org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)
</pre></p><p><b>note</b> <u>The full stack trace of the root cause is available in the Apache Tomcat/7.0.61 logs.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.61</h3></body></html>
Process finished with exit code 0
FINAL SOLUTION FOR THIS QUESTION
@e4c5, your suggestion helped me figure out the final solution. The parameter should be sent to the server via data. Since data must be a dict or bytes, as defined in the official documentation, the JSON string has to be sent as the value of a dict key named param. Please see the code below:
import requests
import json
get_ci_url = 'https://sample.fake.com:0000/sample/fake/fakeagain.JSON'
get_ci_param_dict = {"Username": "fake","Password": "fakefake".......}
get_ci_param_json = json.dumps(get_ci_param_dict)
params = {'param': get_ci_param_json}
get_ci_result = requests.request('POST', url=get_ci_url, data=params, verify=False)
print(get_ci_result.status_code)
print(get_ci_result.text)
ROOT CAUSE: the parameter should be sent via the data argument. The official documentation clearly states => :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:Request.
Thanks to my colleague Mr. J and @e4c5 for their great help.
If what your server expects is JSON, you should use the json parameter of Python requests:
get_ci_result = get_ci.request('POST', url=get_ci_url,
json=get_ci_param_dict, verify=False)
Also note that the params parameter is usually used with GET (it is used to build the query string of a URL). With POST and form data, use data, again with a dictionary.
For additional information please refer to : http://docs.python-requests.org/en/master/api/
Your dictionary of data will automatically be form-encoded when the request is made. Use json parameter when the server accepts JSON-Encoded POST/PATCH data instead of form-encoded data.
response = requests.post(url=url, headers=headers, json=data)
Using the json parameter in the request will change the Content-Type in the header to application/json.
Visit https://2.python-requests.org/en/master/user/quickstart/#More-complicated-POST-requests
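To make the difference between params, data and json concrete, here is a standard-library-only approximation of what each one puts on the wire. It is an illustration and not the library's actual code; the three function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def as_query_string(payload):
    """Roughly what params=payload appends to the URL after '?'."""
    return urlencode(payload)


def as_form_body(payload):
    """Roughly what data=payload (a dict) sends: a form-encoded body."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return headers, urlencode(payload).encode("utf-8")


def as_json_body(payload):
    """Roughly what json=payload sends: a JSON document as the body."""
    headers = {"Content-Type": "application/json"}
    return headers, json.dumps(payload).encode("utf-8")
```

The final solution in the question corresponds to `as_form_body({'param': json.dumps(get_ci_param_dict)})`: a form field named param whose value happens to be a JSON string. That is different from a JSON body, which is what `as_json_body` produces.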
I also had a similar problem.
Even though my data was already a dictionary I needed to json.dumps(data) again:
response = requests.post(url=url, headers=head, data=json.dumps(data))
I just wanted to share this in case somebody has a similar problem.
I am new to Python. Can anyone tell me which Python tools I should use to get my work done? Any good ideas for building a Python script that automatically finds these 404 and 5XX requests? Thanks in advance!
We can check the response status code:
>>> r = requests.get('http://httpbin.org/get')
>>> r.status_code
200
Requests also comes with a built-in status code lookup object for easy reference:
>>> r.status_code == requests.codes.ok
True
If we made a bad request (a 4XX client error or 5XX server error response), we can raise it with Response.raise_for_status():
>>> bad_r = requests.get('http://httpbin.org/status/404')
>>> bad_r.status_code
404
But, since our status_code for r was 200, when we call raise_for_status() we get:
>>> r.raise_for_status()
None
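For the original goal of finding the 404 and 5XX requests, the status codes only need to be filtered once they have been collected. A minimal sketch follows; `failing_requests` is a made-up helper, and where the (url, status_code) pairs come from is left open.

```python
def failing_requests(results):
    """Return the (url, status_code) pairs whose status is 4xx or 5xx.

    results is any iterable of (url, status_code) pairs; how they are
    collected (log file, live requests) is left open here.
    """
    return [(url, code) for url, code in results if 400 <= code <= 599]
```

The pairs could be built with something like `(url, requests.get(url).status_code)` for each URL of interest.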
In this little piece of code, what is the fourth line all about?
from google.appengine.api import urlfetch
url = "http://www.google.com/"
result = urlfetch.fetch(url)
if result.status_code == 200:
doSomethingWithResult(result.content)
It's an HTTP status code; it means "OK" (i.e., the server successfully answered the HTTP request).
See the list of them on Wikipedia.
Whoever wrote that should have used a constant instead of a magic number. The httplib module has all the http response codes.
E.g.:
>>> import httplib
>>> httplib.OK
200
>>> httplib.NOT_FOUND
404
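The httplib example is Python 2. In Python 3 the module is named http.client, and the same constants are also available through the http.HTTPStatus enum. A short sketch (`is_ok` is a made-up helper):

```python
from http import HTTPStatus
import http.client

# HTTPStatus members are ints, so they compare equal to raw status codes.
assert HTTPStatus.OK == 200
assert HTTPStatus.NOT_FOUND == 404

# http.client (the Python 3 name of httplib) exposes the same constants.
assert http.client.OK == HTTPStatus.OK


def is_ok(status_code):
    """Readable replacement for `status_code == 200`."""
    return status_code == HTTPStatus.OK
```

The check from the question would then read `if is_ok(result.status_code):`.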
200 is the HTTP status code for "OK", a successful response. (Other codes you may be familiar with are 404 Not Found, 403 Forbidden, and 500 Internal Server Error.)
See RFC 2616 for more information.