What HTTP response codes are retried by python Requests - python

What are the list of HTTP response status codes retried by default and how many times, by the Python Requests. How can I change the number of retries? I couldn't find any documentation for it.
I tried below code and there were two retries on 401 status code.
import requests
from http.client import HTTPConnection
from requests.auth import HTTPDigestAuth
HTTPConnection.debuglevel = 1
requests.adapters.DEFAULT_RETRIES = 5
def test():
data = 'testdata'
username = 'testuser'
password = 'test'
url='https://example.com:443/captionen_0001.vtt'
try:
response = requests.put(url, auth=HTTPDigestAuth(username,password), data=data, verify=False)
except Exception as e:
print('error'+str(e))
test()
warnings.warn(
send: b'PUT /channel_captionen_0001.vtt HTTP/1.1\r\nHost: example.com\r\nUser-Agent: python-requests/2.24.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 8\r\n\r\n'
send: b'testdata'
reply: 'HTTP/1.1 401 Authorization Required\r\n'
header: Date: Sat, 05 Feb 2022 07:50:25 GMT
header: WWW-Authenticate: Digest realm="WebDAV", nonce="tE/JnkDX845db3", algorithm=MD5, qop="auth"
warnings.warn(
send: b'PUT /channel_captionen_0001.vtt HTTP/1.1\r\nHost: example.com\r\nUser-Agent: python-requests/2.24.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 8\r\nAuthorization: Digest username="testuser", realm="WebDAV", nonce="tE/JnkDX845db3", uri="/channel_captionen_0001.vtt", response="1c3299c716797e8f36528f6e6dbaeb50", algorithm="MD5", qop="auth", nc=00000001, cnonce="dd0835ef485c6b71"\r\n\r\n'
send: b'testdata'
reply: 'HTTP/1.1 401 Authorization Required\r\n'
header: Date: Sat, 05 Feb 2022 07:50:25 GMT
header: WWW-Authenticate: Digest realm="WebDAV", nonce="/vXKnkDXBQc098a4", algorithm=MD5, qop="auth

It's not obviously to find. You have to know requests is not the package that manage the connection, urllib3 does.
In the source code of HTTPAdapter (use it when you want more control on requests), the docstring on max_retries parameter said:
If you need granular control over the conditions under which we retry a request, import urllib3's Retry class and pass that instead
Now you can refer to the documentation of urllib3 for Retry class.
Read especially status_forcelist parameter and RETRY_AFTER_STATUS_CODES (default: frozenset({413, 429, 503}))
Update
import requests
import urllib3
my_code_list = [401, 403, ...]
s = requests.Session()
r = urllib3.util.Retry(status_forcelist=my_code_list)
a = requests.adapters.HTTPAdapter(max_retries=r)
s.mount('http://', a)

Related

Python. Cannot make a request to simple site api. Flask and requests

I am trying to create simple API for my site. I created the route with flask:
#api.route('/api/rate&message_id=<message_id>&performer=<performer_login>', methods=['POST'])
def api_rate_msg(message_id, performer_login):
print("RATE API ", message_id, ' ', performer_id)
return 400
print(...) function don't execute...
I use flask-socketio to communicate between client and server.
I send json from client and process it with:
#socket.on('rate')
def handle_rate(data):
print(data)
payload = {'message_id':data['message_id'], 'performer':data['performer']}
r = requests.post('/api/rate', params=payload)
print (r.status_code)
Note, that data variable is sending from client and is correct(I've checked it).
print(r.status_code) don't exec too...
Where I'm wrong? Please, sorry for my bad english :(
This api function must increase rate of message, which stored in mongodb, if interesting.
Don't put &message_id=<message_id>&performer=<performer_login> in your route string. Instead, get these arguments from request.args.
Try it:
from flask import request
...
#api.route('/api/rate', methods=['POST'])
def api_rate_msg():
print(request.args)
return ''
I've tested it with httpie:
$ http -v POST :5000/api/rate message_id==123 performer_login==foo
POST /api/rate?message_id=123&performer_login=foo HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 0
Host: localhost:5000
User-Agent: HTTPie/0.9.8
HTTP/1.0 200 OK
Content-Length: 0
Content-Type: text/html; charset=utf-8
Date: Sun, 02 Apr 2017 13:54:40 GMT
Server: Werkzeug/0.11.11 Python/2.7.13
And from flask's log:
ImmutableMultiDict([('message_id', u'123'), ('performer_login', u'foo')])
127.0.0.1 - - [02/Apr/2017 22:54:40] "POST /api/rate?message_id=123&performer_login=foo HTTP/1.1" 200 -
Remove the below part from your api route
&message_id=<message_id>&performer=<performer_login
This is not required in POST request. It helps in GET requests. API call in request is not matching the route definition and therefore you have the current problem

httplib - http not accepting content length

Problem
When I switched Macbooks, all of the sudden I am getting an HTTP 411: Length Required (I wasn't getting this using a different Mac) trying to use a POST request with httplib. I cannot seem to find a work around for this.
Code Portion 1: from a supporting class; retrieves data and other things,
class Data(object):
def __init__(self, value):
self.company_id = None
self.host = settings.CONSUMER_URL
self.body = None
self.headers = {"clienttype": "Cloud-web", "Content-Type": "application/json", "ErrorLogging": value}
def login(self):
'''Login and store auth token'''
path = "/Security/Login"
body = self.get_login_info()
status_code, resp = self.submit_request("POST", path, json.dumps(body))
self.info = json.loads(resp)
company_id = self.get_company_id(self.info)
self.set_token(self.info["token"])
return company_id
def submit_request(self, method, path, body=None, header=None):
'''Submit requests for API tests'''
conn = httplib.HTTPSConnection(self.host)
conn.set_debuglevel(1)
conn.request(method, path, body, self.headers)
resp = conn.getresponse()
return resp.status, resp.read()
Code Portion 2: my unittests,
# logging in
cls.api = data.Data(False) # initializing the data class from Code Portion 1
cls.company_id = cls.api.login()
...
# POST Client/Register
def test_client_null_body(self):
'''Null body object - 501'''
status, resp = self.api.submit_request('POST', '/Client/register')
if status != 500:
log.log_warning('POST /Client/register: %s, %s' % (str(status), str(resp)))
self.assertEqual(status, 500)
Code Portion 3: example of the data I send from a settings file,
API_ACCOUNT = {
"userName": "account#account.com",
"password": "password",
"companyId": 107
}
From Logging
WARNING:root: POST /Client/register: 411, <!DOCTYPE HTML PUBLIC "-//W3C//DTD
HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Length Required</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Length Required</h2>
<hr><p>HTTP Error 411. The request must be chunked or have a content length.</p>
</BODY></HTML>
Additional Info: I was using a 2008 Macbook Pro without issue. Switched to a 2013 Macbook Pro and this keeps occurring.
I took a look at this post:
Python httplib and POST and it seems that at the time httplib did not automatically generate the content length.
Now https://docs.python.org/2/library/httplib.html:
If one is not provided in headers, a Content-Length header is added automatically for all methods if the length of the body can be determined, either from the length of the str representation, or from the reported size of the file on disk.
when using conn.set_debuglevel(1) we see that httplib is sending a header
reply: 'HTTP/1.1 411 Length Required\r\n'
header: Content-Type: text/html; charset=us-ascii
header: Server: Microsoft-HTTPAPI/2.0
header: Date: Thu, 26 May 2016 17:08:46 GMT
header: Connection: close
header: Content-Length: 344
Edit
Unittest Failure:
======================================================================
FAIL: test_client_null_body (__main__.NegApi)
Null body object - 501
----------------------------------------------------------------------
Traceback (most recent call last):
File "API_neg.py", line 52, in test_client_null_body
self.assertEqual(status, 500)
AssertionError: 411 != 500
.send: 'POST /Client/register HTTP/1.1\r\nHost: my.host\r\nAccept-Encoding: identity\r\nAuthorizationToken: uhkGGpJ4aQxm8BKOCH5dt3bMcwsHGCHs1p+OJvtf9mHKa/8pTEnKyYeJr+boBr8oUuvWvZLr1Fd+Og2xJP3xVw==\r\nErrorLogging: False\r\nContent-Type: application/json\r\nclienttype: Cloud-web\r\n\r\n'
reply: 'HTTP/1.1 411 Length Required\r\n'
header: Content-Type: text/html; charset=us-ascii
header: Server: Microsoft-HTTPAPI/2.0
header: Date: Thu, 26 May 2016 17:08:27 GMT
header: Connection: close
header: Content-Length: 344
Any ideas as to why this was working on a previous Mac and is currently not working here? It's the same code, same operating systems. Let me know if I can provide any more information.
Edit 2
The issue seemed to be with OSX 10.10.4, after upgrading to 10.10.5 all is well. I still would like to get some insight on why I was having this issue.
The only change from 10.10.4 to 10.10.5, that seems close, would have been the python update from 2.7.6 to 2.7.10 which includes this bug fix: http://bugs.python.org/issue22417

Python requests - print entire http request (raw)?

While using the requests module, is there any way to print the raw HTTP request?
I don't want just the headers, I want the request line, headers, and content printout. Is it possible to see what ultimately is constructed from HTTP request?
Since v1.2.3 Requests added the PreparedRequest object. As per the documentation "it contains the exact bytes that will be sent to the server".
One can use this to pretty print a request, like so:
import requests
req = requests.Request('POST','http://stackoverflow.com',headers={'X-Custom':'Test'},data='a=1&b=2')
prepared = req.prepare()
def pretty_print_POST(req):
"""
At this point it is completely built and ready
to be fired; it is "prepared".
However pay attention at the formatting used in
this function because it is programmed to be pretty
printed and may differ from the actual request.
"""
print('{}\n{}\r\n{}\r\n\r\n{}'.format(
'-----------START-----------',
req.method + ' ' + req.url,
'\r\n'.join('{}: {}'.format(k, v) for k, v in req.headers.items()),
req.body,
))
pretty_print_POST(prepared)
which produces:
-----------START-----------
POST http://stackoverflow.com/
Content-Length: 7
X-Custom: Test
a=1&b=2
Then you can send the actual request with this:
s = requests.Session()
s.send(prepared)
These links are to the latest documentation available, so they might change in content:
Advanced - Prepared requests and API - Lower level classes
import requests
response = requests.post('http://httpbin.org/post', data={'key1': 'value1'})
print(response.request.url)
print(response.request.body)
print(response.request.headers)
Response objects have a .request property which is the PreparedRequest object that was sent.
An even better idea is to use the requests_toolbelt library, which can dump out both requests and responses as strings for you to print to the console. It handles all the tricky cases with files and encodings which the above solution does not handle well.
It's as easy as this:
import requests
from requests_toolbelt.utils import dump
resp = requests.get('https://httpbin.org/redirect/5')
data = dump.dump_all(resp)
print(data.decode('utf-8'))
Source: https://toolbelt.readthedocs.org/en/latest/dumputils.html
You can simply install it by typing:
pip install requests_toolbelt
Note: this answer is outdated. Newer versions of requests support getting the request content directly, as AntonioHerraizS's answer documents.
It's not possible to get the true raw content of the request out of requests, since it only deals with higher level objects, such as headers and method type. requests uses urllib3 to send requests, but urllib3 also doesn't deal with raw data - it uses httplib. Here's a representative stack trace of a request:
-> r= requests.get("http://google.com")
/usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
/usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
/usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)
Inside the httplib machinery, we can see HTTPConnection._send_request indirectly uses HTTPConnection._send_output, which finally creates the raw request and body (if it exists), and uses HTTPConnection.send to send them separately. send finally reaches the socket.
Since there's no hooks for doing what you want, as a last resort you can monkey patch httplib to get the content. It's a fragile solution, and you may need to adapt it if httplib is changed. If you intend to distribute software using this solution, you may want to consider packaging httplib instead of using the system's, which is easy, since it's a pure python module.
Alas, without further ado, the solution:
import requests
import httplib
def patch_send():
old_send= httplib.HTTPConnection.send
def new_send( self, data ):
print data
return old_send(self, data) #return is not necessary, but never hurts, in case the library is changed
httplib.HTTPConnection.send= new_send
patch_send()
requests.get("http://www.python.org")
which yields the output:
GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae
requests supports so called event hooks (as of 2.23 there's actually only response hook). The hook can be used on a request to print full request-response pair's data, including effective URL, headers and bodies, like:
import textwrap
import requests
def print_roundtrip(response, *args, **kwargs):
format_headers = lambda d: '\n'.join(f'{k}: {v}' for k, v in d.items())
print(textwrap.dedent('''
---------------- request ----------------
{req.method} {req.url}
{reqhdrs}
{req.body}
---------------- response ----------------
{res.status_code} {res.reason} {res.url}
{reshdrs}
{res.text}
''').format(
req=response.request,
res=response,
reqhdrs=format_headers(response.request.headers),
reshdrs=format_headers(response.headers),
))
requests.get('https://httpbin.org/', hooks={'response': print_roundtrip})
Running it prints:
---------------- request ----------------
GET https://httpbin.org/
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
None
---------------- response ----------------
200 OK https://httpbin.org/
Date: Thu, 14 May 2020 17:16:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
<!DOCTYPE html>
<html lang="en">
...
</html>
You may want to change res.text to res.content if the response is binary.
Here is a code, which makes the same, but with response headers:
import socket
def patch_requests():
old_readline = socket._fileobject.readline
if not hasattr(old_readline, 'patched'):
def new_readline(self, size=-1):
res = old_readline(self, size)
print res,
return res
new_readline.patched = True
socket._fileobject.readline = new_readline
patch_requests()
I spent a lot of time searching for this, so I'm leaving it here, if someone needs.
A fork of #AntonioHerraizS answer (HTTP version missing as stated in comments)
Use this code to get a string representing the raw HTTP packet without sending it:
import requests
def get_raw_request(request):
request = request.prepare() if isinstance(request, requests.Request) else request
headers = '\r\n'.join(f'{k}: {v}' for k, v in request.headers.items())
body = '' if request.body is None else request.body.decode() if isinstance(request.body, bytes) else request.body
return f'{request.method} {request.path_url} HTTP/1.1\r\n{headers}\r\n\r\n{body}'
headers = {'User-Agent': 'Test'}
request = requests.Request('POST', 'https://stackoverflow.com', headers=headers, json={"hello": "world"})
raw_request = get_raw_request(request)
print(raw_request)
Result:
POST / HTTP/1.1
User-Agent: Test
Content-Length: 18
Content-Type: application/json
{"hello": "world"}
💡 Can also print the request in the response object
r = requests.get('https://stackoverflow.com')
raw_request = get_raw_request(r.request)
print(raw_request)
I use the following function to format requests. It's like #AntonioHerraizS except it will pretty-print JSON objects in the body as well, and it labels all parts of the request.
format_json = functools.partial(json.dumps, indent=2, sort_keys=True)
indent = functools.partial(textwrap.indent, prefix=' ')
def format_prepared_request(req):
"""Pretty-format 'requests.PreparedRequest'
Example:
res = requests.post(...)
print(format_prepared_request(res.request))
req = requests.Request(...)
req = req.prepare()
print(format_prepared_request(res.request))
"""
headers = '\n'.join(f'{k}: {v}' for k, v in req.headers.items())
content_type = req.headers.get('Content-Type', '')
if 'application/json' in content_type:
try:
body = format_json(json.loads(req.body))
except json.JSONDecodeError:
body = req.body
else:
body = req.body
s = textwrap.dedent("""
REQUEST
=======
endpoint: {method} {url}
headers:
{headers}
body:
{body}
=======
""").strip()
s = s.format(
method=req.method,
url=req.url,
headers=indent(headers),
body=indent(body),
)
return s
And I have a similar function to format the response:
def format_response(resp):
"""Pretty-format 'requests.Response'"""
headers = '\n'.join(f'{k}: {v}' for k, v in resp.headers.items())
content_type = resp.headers.get('Content-Type', '')
if 'application/json' in content_type:
try:
body = format_json(resp.json())
except json.JSONDecodeError:
body = resp.text
else:
body = resp.text
s = textwrap.dedent("""
RESPONSE
========
status_code: {status_code}
headers:
{headers}
body:
{body}
========
""").strip()
s = s.format(
status_code=resp.status_code,
headers=indent(headers),
body=indent(body),
)
return s
test_print.py content:
import logging
import pytest
import requests
from requests_toolbelt.utils import dump
def print_raw_http(response):
data = dump.dump_all(response, request_prefix=b'', response_prefix=b'')
return '\n' * 2 + data.decode('utf-8')
#pytest.fixture
def logger():
log = logging.getLogger()
log.addHandler(logging.StreamHandler())
log.setLevel(logging.DEBUG)
return log
def test_print_response(logger):
session = requests.Session()
response = session.get('http://127.0.0.1:5000/')
assert response.status_code == 300, logger.warning(print_raw_http(response))
hello.py content:
from flask import Flask
app = Flask(__name__)
#app.route('/')
def hello_world():
return 'Hello, World!'
Run:
$ python -m flask hello.py
$ python -m pytest test_print.py
Stdout:
------------------------------ Captured log call ------------------------------
DEBUG urllib3.connectionpool:connectionpool.py:225 Starting new HTTP connection (1): 127.0.0.1:5000
DEBUG urllib3.connectionpool:connectionpool.py:437 http://127.0.0.1:5000 "GET / HTTP/1.1" 200 13
WARNING root:test_print_raw_response.py:25
GET / HTTP/1.1
Host: 127.0.0.1:5000
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 13
Server: Werkzeug/1.0.1 Python/3.6.8
Date: Thu, 24 Sep 2020 21:00:54 GMT
Hello, World!

HTTP Authentication headers with no "realm"

The following code works but gives no output even though there is a file instance lying behind response
import urllib2
from ntlm import HTTPNtlmAuthHandler
user = 'id'
password = "Password"
url = "http://abc.def.ghij:3080"
passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, url, user, password)
# create the NTLM authentication handler
auth_NTLM = HTTPNtlmAuthHandler.HTTPNtlmAuthHandler(passman)
# create and install the opener
opener = urllib2.build_opener(auth_NTLM)
urllib2.install_opener(opener)
# retrieve the result
response = urllib2.urlopen("http://abc.def.ghij:3080")
print response.read()
print response.headers
header response:
Cache-Control: private
Content-Length: 0
Location: /index.epx
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.19
Persistent-Auth: true
X-Powered-By: ASP.NET
Date: Thu, 04 Jul 2013 15:30:03 GMT
Connection: close
but print response.read doesnt give any content :O
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
The problem is that your webserver is requesting NTLM authentication, so it wont accept BasicAuth. Use NTML authentication while sending request or change your webserver config to allow Basic authentication.
Dont use BasicAuth because it sends user/pwd in plaintext over the network. Minimum safe is DigestAuth or NTLM or GSSNegotiate auth.

'urllib2.urlopen' adding Host header

I'm using Observium to pull Nginx stats on localhost however it returns '405 Not Allowed':
# curl -I localhost/nginx_status
HTTP/1.1 405 Not Allowed
Server: nginx
Date: Wed, 19 Jun 2013 22:12:37 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 166
Connection: keep-alive
Keep-Alive: timeout=5
# curl -I -H "Host: example.com" localhost/nginx_status
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 19 Jun 2013 22:12:43 GMT
Content-Type: text/plain
Connection: keep-alive
Keep-Alive: timeout=5
Could you please advise how to add Host header with 'urllib2.urlopen' in Python (Python 2.6.6
):
Current script:
#!/usr/bin/env python
import urllib2
import re
data = urllib2.urlopen('http://localhost/nginx_status').read()
params = {}
for line in data.split("\n"):
smallstat = re.match(r"\s?Reading:\s(.*)\sWriting:\s(.*)\sWaiting:\s(.*)$", line)
req = re.match(r"\s+(\d+)\s+(\d+)\s+(\d+)", line)
if smallstat:
params["Reading"] = smallstat.group(1)
params["Writing"] = smallstat.group(2)
params["Waiting"] = smallstat.group(3)
elif req:
params["Requests"] = req.group(3)
else:
pass
dataorder = [
"Active",
"Reading",
"Writing",
"Waiting",
"Requests"
]
print "<<<nginx>>>\n";
for param in dataorder:
if param == "Active":
Active = int(params["Reading"]) + int(params["Writing"]) + int(params["Waiting"])
print Active
else:
print params[param]
You might want to check out the urllib2 missing manual for more information, but basically you create a dictionary of your header labels and values and pass it to the urllib2.Request method. A (slightly) modified version of the code from the linked manual:
from urllib import urlencode
from urllib2 import Request urlopen
# Define values that we'll pass to our urllib and urllib2 methods
url = 'http://www.something.com/blah'
user_host = 'example.com'
values = {'name' : 'Engineero', # dict of keys and values for our POST data
'location' : 'Interwebs',
'language' : 'Python' }
headers = { 'Host' : user_host } # dict of keys and values for our header
# Set up our request, execute, and read
data = urlencode(values) # encode for sending URL request
req = Request(url, data, headers) # make POST request to url with data and headers
response = urlopen(req) # get the response from the server
the_page = response.read() # read the response from the server
# Do other stuff with the response

Categories