I am making a streaming POST request with around 12 MB of body data. Once this request is complete, I intend to send another POST request on the same TCP connection. However, I do not see that happening: the second request goes out on a different TCP connection (as per Wireshark). As per my understanding from here and here, that should not be the case. Here is the snippet:
# First POST request without Auth info
with open(file) as f:
    r = requests.post(url, data=f, headers=h)

# Second POST request with Auth info
r = requests.post(url, headers=h)
I then tried using a Session object; however, in this case I do not see the second POST request in Wireshark (as a second POST method) at all. It actually gets appended to the end of the first streaming POST data:
# First POST request without Auth info
s = requests.session()
with open(file) as f:
    r = s.post(url, data=f, headers=h)

# Second POST request with Auth info
r = s.post(url, headers=h)
SPBWZKSCM3RJQAKKC0B7UQ1DIRDWHPBXDYMTUPODQ4TFAFPZTQFMY6Q2SIY6ZET8W6BD4889Z69WMO7UIKQOZB22BOBQ1TH2EUUOOSQJA8B0Y***POST / HTTP/1.1
Content-Length:***
So, I have the following questions:
Why does the first case not reuse the connection?
In the second case, why does the second POST get appended to the previous streaming POST data?
Thanks.
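For reference, a minimal sketch of the pattern that is normally expected to reuse the underlying connection: a single Session, with each response fully consumed before the next request is issued. The URL, headers and file name below are placeholders, and whether the two requests actually share one TCP connection also depends on the server keeping the connection alive.

import requests

# Placeholder URL, headers and file name, only to make the sketch self-contained.
url = "http://example.com/upload"
h = {"Content-Type": "application/octet-stream"}

s = requests.Session()

# First POST: streaming upload from a file object.
with open("payload.bin", "rb") as f:
    r1 = s.post(url, data=f, headers=h)
r1.content  # ensure the response body is fully read (a no-op with the default stream=False)

# Second POST on the same Session, which should go out on the pooled connection.
r2 = s.post(url, headers=h)
print(r1.status_code, r2.status_code)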
I have created a list of payloads/params/data, one of which needs to be passed as the body of every POST request. I am using grequests to send the requests in parallel, so the idea is that each parallel POST request should take its own payload from the list. But what I am observing with my code is that every request goes out with the same payload from the list; it is not using all the payloads in the list.
I have tried two different ways; please help me figure out what is wrong with them. I am kind of lost, because I have tested the same approach with different URLs and headers and it works, so why is it not working for the param/data/payload of the POST request?
Below is my code:
for i in range(0, count):
    list_payload_user.append(fp1())

url2 = "https:xxxxxyzzz/rest/v3/edit-user-requests/" + devid
headers = {'Authorization': token, 'Content-Type': 'application/json'}

# First way I tried: final_pl1() should get called every time, and I have seen
# that it builds a different payload, but grequests sends every request with
# the first payload only.
rs = [grequests.post(url2, headers=headers, data=final_pl1(), hooks={'response': res}) for i in range(0, len(list_payload))]

# Another way: using the list, grequests still sends the first payload from
# the list for every POST request.
rs = [grequests.post(url2, headers=headers, data=pl, hooks={'response': res}) for pl in list_payload_user]

resp = grequests.map(rs, exception_handler=err_handler)
print('Mapped response status code ==> ', resp)
Any help would be appreciated.
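For comparison, a minimal sketch of one way to pair each request with its own payload when building the grequests list; the URL, token and payload contents below are made-up placeholders, not the original final_pl1()/fp1() code:

import json
import grequests

# Placeholders standing in for the original devid, token and payload builder.
url2 = "https://example.com/rest/v3/edit-user-requests/device-123"
headers = {'Authorization': 'Bearer <token>', 'Content-Type': 'application/json'}
payloads = [{'user': 'user-%d' % i, 'action': 'edit'} for i in range(5)]

# Build one request per payload; serializing each dict separately makes it easy
# to confirm that every request carries a different body.
reqs = [grequests.post(url2, headers=headers, data=json.dumps(pl)) for pl in payloads]

responses = grequests.map(reqs)
for resp in responses:
    print(resp if resp is None else resp.status_code)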
Consider an HTTP request using an OAuth token. The access token needs to be included in the header as a bearer token. However, if the token has expired, another request needs to be made to refresh the token, and then the original request should be retried. So the custom Retry object will look like:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

s = requests.Session()

# token is added to the header here
s.headers.update(token_header)

retry = OAuthRetry(
    total=2,
    read=2,
    connect=2,
    backoff_factor=1,
    status_forcelist=[401],
    method_whitelist=frozenset(['GET', 'POST']),
    session=s
)
adapter = HTTPAdapter(max_retries=retry)
s.mount('http://', adapter)
s.mount('https://', adapter)

r = s.post(url, data=data)
The Retry class:
class OAuthRetry(Retry):
    def increment(self, method, url, *args, **kwargs):
        # Refresh the token here. This could be done by getting a reference to
        # the session or in any other way.
        return super(OAuthRetry, self).increment(method, url, *args, **kwargs)
The problem is that after the token is refreshed, HTTPConnectionPool still uses the same headers to make the request after calling increment. See: https://github.com/urllib3/urllib3/blob/master/src/urllib3/connectionpool.py#L787.
Although the pool instance is passed to increment, changing the headers there will not affect the call, since the pool is using a local copy of the headers.
This seems like a use case that should come up frequently: the request parameters need to change between retries.
Is there a way to change the request headers in between two subsequent retries?
No, not in the current versions of Requests (2.18.4) and urllib3 (1.22).
Retries are ultimately handled by urlopen in urllib3, and tracing the code of that whole function shows there is no interface for changing headers between retries.
Dynamically changing the headers should not be considered a solution either. From the docs:
headers – Dictionary of custom headers to send, such as User-Agent, If-None-Match, etc. If None, pool headers are used. If provided, these headers completely replace any pool-specific headers.
headers is a parameter passed to the function, and there is no guarantee that it will not be copied after being passed. Although in the current version of urllib3 urlopen does not copy headers, any solution based on changing the headers is hacky, since it relies on the implementation rather than on the documented behavior.
One workaround
Interrupting a function to edit some variable it is using is very dangerous. Instead of injecting something into urllib3, one simple solution is to check the response status and try again if needed:
r = s.post(url, data=data)
if r.status_code == 401:
    # refresh the token here
    r = s.post(url, data=data)
Why does the original approach not work?
Requests copies the headers in prepare_headers before sending them to urllib3, so when retrying, urllib3 uses the copy that was created before your edit.
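If that check-and-retry needs to live in one place, it can be wrapped in a small helper around the session. A rough sketch, assuming a hypothetical refresh_token() callable that returns the new token header (neither name is part of the original code):

import requests

def post_with_refresh(session, url, refresh_token, **kwargs):
    # POST once; on a 401, refresh the token on the session and retry once.
    # refresh_token is a hypothetical callable returning a header dict such as
    # {'Authorization': 'Bearer <new token>'}.
    r = session.post(url, **kwargs)
    if r.status_code == 401:
        session.headers.update(refresh_token())
        r = session.post(url, **kwargs)
    return r

# usage sketch:
# s = requests.Session()
# r = post_with_refresh(s, url, refresh_token, data=data)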
I'm using Python 3.7 with urllib.
Everything works fine, but it does not seem to automatically follow the redirect when it gets an HTTP redirect response (307).
This is the error I get:
ERROR 2020-06-15 10:25:06,968 HTTP Error 307: Temporary Redirect
I have to handle it with a try-except and manually send another request to the new Location: it works fine, but I don't like it.
This is the piece of code I use to perform the request:
req = urllib.request.Request(url)
req.add_header('Authorization', auth)
req.add_header('Content-Type', 'application/json; charset=utf-8')
req.data = jdati
self.logger.debug(req.headers)
self.logger.info(req.data)
resp = urllib.request.urlopen(req)
url is an HTTPS resource, and I set headers with some Authorization info and the content type.
req.data is a JSON payload.
From the urllib documentation I understood that redirects are performed automatically by the library itself, but it doesn't work for me. It always raises an HTTP 307 error and doesn't follow the redirect URL.
I've also tried to use an opener specifying the default redirect handler, but with the same result:
opener = urllib.request.build_opener(urllib.request.HTTPRedirectHandler)
req = urllib.request.Request(url)
req.add_header('Authorization', auth)
req.add_header('Content-Type', 'application/json; charset=utf-8')
req.data = jdati
resp = opener.open(req)
What could be the problem?
The reason why the redirect isn't done automatically has been correctly identified by yours truly in the discussion in the comments section. Specifically, RFC 2616, Section 10.3.8 states that:
If the 307 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Back to the question: given that data has been assigned, this automatically results in get_method returning POST (as per how this method was implemented), and since the request method is POST and the response code is 307, an HTTPError is raised instead, as per the above specification. In the context of Python's urllib, this specific section of the urllib.request module raises the exception.
For an experiment, try the following code:
import urllib.request
import urllib.parse

url = 'http://httpbin.org/status/307'
req = urllib.request.Request(url)
req.data = b'hello'  # comment out to not trigger manual redirect handling

try:
    resp = urllib.request.urlopen(req)
except urllib.error.HTTPError as e:
    if e.code != 307:
        raise  # not a status code that can be handled here
    redirected_url = urllib.parse.urljoin(url, e.headers['Location'])
    resp = urllib.request.urlopen(redirected_url)
    print('Redirected -> %s' % redirected_url)  # the original redirected url

print('Response URL -> %s' % resp.url)  # the final url
Running the code as is may produce the following output:
Redirected -> http://httpbin.org/redirect/1
Response URL -> http://httpbin.org/get
Note that the subsequent redirect to get was done automatically, as the subsequent request was a GET request. Commenting out the req.data assignment line will result in the "Redirected" output line not being printed.
Other things to note in the exception handling block: e.read() may be used to retrieve the response body produced by the server as part of the HTTP 307 response (since data was posted, there might be a short entity in the response that may be processed), and urljoin is needed because the Location header may be a relative URL (or simply missing the host) to the subsequent resource.
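For illustration, a minimal sketch of retrieving that body inside the except block of the example above (this just extends the snippet; it is not required for the redirect handling):

# inside the `except urllib.error.HTTPError as e:` block shown above
body = e.read()  # entity (if any) the server sent along with the 307 response
print('307 response body: %r' % body)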
Also, as a matter of interest (and for linkage purposes), this specific question has been asked multiple times before, and I am rather surprised that those questions never got any answers; they follow:
How to handle 307 redirection using urllib2 from http to https
HTTP Error 307: Temporary Redirect in Python3 - INTRANET
HTTP Error 307 - Temporary redirect in python script
I need to test POST and GET calls against an NGINX server.
I need to capture the error codes and verify the response. I was able to test the GET requests by hitting localhost:8080 (NGINX is running on Docker, exposing 8080), but I'm not sure how to test the POST calls.
Can we construct a dummy request and test a POST call? NGINX runs with the default page.
Below is one way to make a POST request to an endpoint in Python:
import requests

API_ENDPOINT = "http://pastebin.com/api/api_post.php"

data = {'param1': 'value1',
        'param2': 'value2'}

# sending post request and saving response as response object
r = requests.post(url=API_ENDPOINT, data=data)

# extracting response text
pastebin_url = r.text
print("The pastebin URL is: %s" % pastebin_url)
I am trying to delete a Git branch from GitLab, using the GitLab API with a personal access token.
If I use curl like this:
curl --request DELETE --header "PRIVATE_TOKEN: somesecrettoken" "deleteurl"
then it works and the branch is deleted.
But if I use requests like this:
token_data = {'private_token': "somesecrettoken"}
requests.Request("DELETE", url, data= token_data)
it doesn't work; the branch is not deleted.
Your requests code is indeed not doing the same thing. You are setting data=token_data, which puts the token in the request body. The curl command line uses an HTTP header instead and leaves the body empty.
Do the same in Python:
token_data = {'Private-Token': "somesecrettoken"}
requests.Request("DELETE", url, headers=token_data)
You can also put the token in the URL parameters, via the params argument:
token_data = {'private_token': "somesecrettoken"}
requests.Request("DELETE", url, params=token_data)
This adds ?private_token=somesecrettoken to the URL sent to GitLab.
However, GitLab does accept the private_token value in the request body as well, either as form data or as JSON, so the real problem is that you are using the requests API wrong.
A requests.Request() instance is not going to be sent without additional work. It is normally only needed if you want to access the prepared data before sending.
If you don't need to use this more advanced feature, use the requests.delete() method:
response = requests.delete(url, headers=token_data)
If you do need that feature, use a requests.Session() object: first prepare the request object, then send it:
with requests.Session() as session:
    request = requests.Request("DELETE", url, params=token_data)
    prepped = request.prepare()
    response = session.send(prepped)
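As a usage note, the point of the prepare step is that the prepared request can be inspected before it is sent; a small sketch, with a made-up GitLab URL purely for illustration:

import requests

# Hypothetical project/branch URL, only for illustration.
url = "https://gitlab.example.com/api/v4/projects/123/repository/branches/old-branch"
token_data = {'private_token': 'somesecrettoken'}

with requests.Session() as session:
    prepped = requests.Request("DELETE", url, params=token_data).prepare()
    print(prepped.method, prepped.url)  # DELETE ...?private_token=somesecrettoken
    print(prepped.headers)              # exactly the headers that will be sent
    response = session.send(prepped)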
Even without needing to use prepared requests, a session is very helpful when using an API. You can set the token once, on a session:
with requests.Session() as session:
    session.headers['Private-Token'] = 'somesecrettoken'
    # now all requests via the session will use this header
    response = session.get(url1)
    response = session.post(url2, json=....)
    response = session.delete(url3)
    # etc.