Redirect with no auth - python

According to the docs, it should be as simple as:
data = self.http_pool.urlopen('GET', file_url,
                              preload_content=False,
                              retries=max_download_retries)
request.add_unredirected_header(key, header)
Add a header that will not be added to a redirected request.
But I cannot seem to find any examples on how this can be achieved.
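(For reference, that add_unredirected_header method belongs to urllib.request.Request, not to urllib3, which PyUpdater uses. A minimal standalone sketch of how it would be used with Basic credentials, not wired into PyUpdater, would look something like this:)
import base64
import urllib.request

url = 'https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz'
token = base64.b64encode(b'<username>:<password>').decode()

req = urllib.request.Request(url)
# Sent on the initial request only; urllib drops it when it follows the 302 to S3.
req.add_unredirected_header('Authorization', 'Basic ' + token)
with urllib.request.urlopen(req) as resp:
    data = resp.read()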
I am using pyupdater to download updates from Bitbucket and launch the newest version of the exe. I am using this library in a script that connects to Bitbucket fine, but the request is then redirected to Amazon with authorization: Basic <redacted>\r\n\r\n (this is the Bitbucket auth) still attached, meaning I get 'HTTP/1.1 400 Bad Request\r\n'. Amazon does not support Basic auth. This should be easily solvable, but I cannot find much on this issue.
The solutions presented here require recreating each redirected request manually. That would become an ever-growing list and get tedious very quickly if I had to do it for every new file I uploaded. It also does not continue with the rest of the script, but rather downloads to the same directory.
As this is how PyUpdater handles the downloads, this is where the issue would most likely need to be solved.
Line 366 of downloader.py:
data = self.http_pool.urlopen('GET', file_url,
                              preload_content=False,
                              retries=max_download_retries)
Any ideas on how to fix this so it no longer produces this error?
Full Error (ctrl f -> 400):
Python main.py
DEBUG:root:Version - 2.5.1
DEBUG:pyupdater.client:PyUpdater Version 2.5.1
Current version is 1.3
{'authorization': 'Basic <redacted>'}
DEBUG:pyupdater.client:Setting up directories...
DEBUG:pyupdater.client:Downloading key file
DEBUG:pyupdater.client.downloader:Url for request: https://api.bitbucket.org/2.0/repositories/ brofewfefwefewef/eee/downloads/keys.gz
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.bitbucket.org
send: b'GET /2.0/repositories/ brofewfefwefewef/eee/downloads/keys.gz HTTP/1.1\r\nHost: api.bitbucket.org\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
DEBUG:urllib3.connectionpool:https://api.bitbucket.org:443 "GET /2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz HTTP/1.1" 302 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
INFO:urllib3.poolmanager:Redirecting https://api.bitbucket.org/2.0/repositories/ brofewfefwefewef/eee/downloads/keys.gz -> https://bbuseruploads.s3.amazonaws.com/a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/3fc0be6d-ca69-42d3-9711-fbb5cfd2bc38/keys.gz?Signature=<redacted>&Expires=1515976464&AWSAccessKeyId=<redacted>&versionId=n.ymY11KRkq36Xozy25aChvfUT.YzTf5&response-content-disposition=attachment%3B%20filename%3D%22keys.gz%22
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): bbuseruploads.s3.amazonaws.com
header: Server header: Vary header: Content-Type header: X-OAuth-Scopes header: Strict-Transport-Security header: Date header: Location header: X-Served-By header: ETag header: X-Static-Version header: X-Content-Type-Options header: X-Accepted-OAuth-Scopes header: X-Credential-Type header: X-Render-Time header: Connection header: X-Request-Count header: X-Frame-Options header: X-Version header: Content-Length send: b'GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/3fc0be6d-ca69-42d3-9711-fbb5cfd2bc38/keys.gz?Signature=<redacted>&Expires=1515976464&AWSAccessKeyId=<redacted>&versionId=n.ymY11KRkq36Xozy25aChvfUT.YzTf5&response-content-disposition=attachment%3B%20filename%3D%22keys.gz%22 HTTP/1.1\r\nHost: bbuseruploads.s3.amazonaws.com\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
reply: 'HTTP/1.1 400 Bad Request\r\n'
DEBUG:urllib3.connectionpool:https://bbuseruploads.s3.amazonaws.com:443 "GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/3fc0be6d-ca69-42d3-9711-fbb5cfd2bc38/keys.gz?Signature=<redacted>&Expires=1515976464&AWSAccessKeyId=<redacted>&versionId=n.ymY11KRkq36Xozy25aChvfUT.YzTf5&response-content-disposition=attachment%3B%20filename%3D%22keys.gz%22 HTTP/1.1" 400 None
DEBUG:pyupdater.client.downloader:Resource URL: https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz
DEBUG:pyupdater.client.downloader:Got content length of: None
DEBUG:pyupdater.client.downloader:Content-Length not in headers
DEBUG:pyupdater.client.downloader:Callbacks will not show time left or percent downloaded.
DEBUG:pyupdater.client.downloader:Using file as storage since the file is too large
DEBUG:pyupdater.client.downloader:Block size: 1036
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'downloading', 'percent_complete': '-.-%', 'time': '--:--'}
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'finished', 'percent_complete': '-.-%', 'time': '00:00'}
DEBUG:pyupdater.client.downloader:Download Complete
DEBUG:pyupdater.client.downloader:No hash to verify
WARNING:pyupdater.client.downloader:Downloaded file is very large, reading it in to memory may crash the app
DEBUG:pyupdater.client:Failed to decompress gzip file
DEBUG:pyupdater.client:Version file download failed
header: x-amz-request-id header: x-amz-id-2 header: Content-Type header: Transfer-Encoding header: Date header: Connection header: Server {'authorization': 'Basic <redacted>'}
DEBUG:pyupdater.client:Not a gzipped file (b'<?')
Traceback (most recent call last):
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\pyupdater\client\__init__.py", line 440, in _get_key_data
decompressed_data = _gzip_decompress(data)
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\dsdev_utils\helpers.py", line 58, in gzip_decompress
data = decompressed_file.read()
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 276, in read
return self._buffer.read(size)
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 463, in read
if not self._read_gzip_header():
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 411, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'<?')
DEBUG:pyupdater.client:Loading version file...
DEBUG:pyupdater.client:Downloading online version file
DEBUG:pyupdater.client.downloader:Url for request: https://api.bitbucket.org/2.0/repositories/ brofewfefwefewef/eee/downloads/versions.gz
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.bitbucket.org
send: b'GET /2.0/repositories/ brofewfefwefewef/eee/downloads/versions.gz HTTP/1.1\r\nHost: api.bitbucket.org\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
DEBUG:urllib3.connectionpool:https://api.bitbucket.org:443 "GET /2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz HTTP/1.1" 302 0
DEBUG:urllib3.util.retry:Incremented Retry for (url='https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
INFO:urllib3.poolmanager:Redirecting https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz -> https://bbuseruploads.s3.amazonaws.com/a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/0b04c4a8-dd59-49d2-9cd7-95d22379a5e6/versions.gz?Signature=<redacted>&Expires=1515976465&AWSAccessKeyId=<redacted>&versionId=jLhOcIbVAU4xRghD3kB2NfB4iLqUr7PM&response-content-disposition=attachment%3B%20filename%3D%22versions.gz%22
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): bbuseruploads.s3.amazonaws.com
header: Server header: Vary header: Content-Type header: X-OAuth-Scopes header: Strict-Transport-Security header: Date header: Location header: X-Served-By header: ETag header: X-Static-Version header: X-Content-Type-Options header: X-Accepted-OAuth-Scopes header: X-Credential-Type header: X-Render-Time header: Connection header: X-Request-Count header: X-Frame-Options header: X-Version header: Content-Length send: b'GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/0b04c4a8-dd59-49d2-9cd7-95d22379a5e6/versions.gz?Signature=<redacted>&Expires=1515976465&AWSAccessKeyId=<redacted>&versionId=jLhOcIbVAU4xRghD3kB2NfB4iLqUr7PM&response-content-disposition=attachment%3B%20filename%3D%22versions.gz%22 HTTP/1.1\r\nHost: bbuseruploads.s3.amazonaws.com\r\nAccept-Encoding: identity\r\nauthorization: Basic <redacted>\r\n\r\n'
DEBUG:urllib3.connectionpool:https://bbuseruploads.s3.amazonaws.com:443 "GET /a0e395b6-0c54-4efb-9074-57ec4190020b/downloads/0b04c4a8-dd59-49d2-9cd7-95d22379a5e6/versions.gz?Signature=<redacted>&Expires=1515976465&AWSAccessKeyId=<redacted>&versionId=jLhOcIbVAU4xRghD3kB2NfB4iLqUr7PM&response-content-disposition=attachment%3B%20filename%3D%22versions.gz%22 HTTP/1.1" 400 None
reply: 'HTTP/1.1 400 Bad Request\r\n'
DEBUG:pyupdater.client.downloader:Resource URL: https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/versions.gz
DEBUG:pyupdater.client.downloader:Got content length of: None
DEBUG:pyupdater.client.downloader:Content-Length not in headers
DEBUG:pyupdater.client.downloader:Callbacks will not show time left or percent downloaded.
DEBUG:pyupdater.client.downloader:Using file as storage since the file is too large
DEBUG:pyupdater.client.downloader:Block size: 1036
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'downloading', 'percent_complete': '-.-%', 'time': '--:--'}
DEBUG:pyupdater.client.downloader:{'total': None, 'downloaded': 519, 'status': 'finished', 'percent_complete': '-.-%', 'time': '00:00'}
DEBUG:pyupdater.client.downloader:Download Complete
DEBUG:pyupdater.client.downloader:No hash to verify
WARNING:pyupdater.client.downloader:Downloaded file is very large, reading it in to memory may crash the app
DEBUG:pyupdater.client:Failed to decompress gzip file
DEBUG:pyupdater.client:Version file download failed
DEBUG:pyupdater.client:Not a gzipped file (b'<?')
Traceback (most recent call last):
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\pyupdater\client\__init__.py", line 417, in _get_manifest_from_http
decompressed_data = _gzip_decompress(data)
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\lib\site-packages\dsdev_utils\helpers.py", line 58, in gzip_decompress
data = decompressed_file.read()
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 276, in read
return self._buffer.read(size)
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 463, in read
if not self._read_gzip_header():
File "C:\Users\Django\AppData\Local\Continuum\miniconda3\Lib\gzip.py", line 411, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'<?')
DEBUG:dsdev_utils.paths:Changing to Directory --> C:\Users\Django\AppData\Local\any\main
DEBUG:pyupdater.client:Found version file on file system
DEBUG:pyupdater.client:Loaded version file from file system
DEBUG:dsdev_utils.paths:Moving back to Directory --> C:\Users\Django\privacy 4
DEBUG:pyupdater.client:Data type: <class 'bytes'>
DEBUG:pyupdater.client:App key is None
DEBUG:pyupdater.client:Version Data:
{'latest': {'main': {'stable': {'win': '1.4.0.2.0'}}}, 'updates': {'main': {'1.3.0.2.0': {'win': {'file_hash': '807c743b8c29f0053f4f9d9e6a8895b0e037f77480e7065c1470c2aba1cb08a0', 'file_size': 12194381, 'filename': 'main-win-1.3.zip', 'patch_hash': '29fec1006c2736eb78cc859f89e165af942daae6d9ac994a1a686d9b7b418ef6', 'patch_name': 'main-win-5', 'patch_size': 147}}, '1.4.0.2.0': {'win': {'file_hash': 'd59a22a95229f0a9c64909c646bfba31daf6bf8689dc16c9c93180c1602e9d3c', 'file_size': 12195571, 'filename': 'main-win-1.4.zip', 'patch_hash': 'baf3eba3a4b3184919ed9e57c3e8be9494a50862b40b1590ecb64e39e71a4ce3', 'patch_name': 'main-win-6', 'patch_size': 479625}}}}, 'signature': '<redacted>'}
DEBUG:dsdev_utils.helpers:Version str: 1.3
DEBUG:pyupdater.client:Failed version file verification
For those who want to replicate the error themselves, I’ve written up the exact steps I took.

Edit-1:
You need to use the code below for your main.py, without any changes to downloader.py:
from __future__ import print_function

import urllib3.poolmanager

orig_urlopen = urllib3.poolmanager.PoolManager.urlopen

def new_urlopen(self, method, url, redirect=True, **kw):
    # Drop the Bitbucket Basic auth header before the request is redirected to S3
    if "s3.amazonaws.com" in url and 'authorization' in self.headers:
        self.headers.pop('authorization')
    return orig_urlopen(self, method, url, redirect, **kw)

urllib3.poolmanager.PoolManager.urlopen = new_urlopen

import logging
from selenium import webdriver
logging.basicConfig(level=logging.DEBUG)
from client_config import ClientConfig
from pyupdater.client import Client, AppUpdate
import http.client as http_client
http_client.HTTPConnection.debuglevel = 1

def check_for_update():
    client = Client(ClientConfig(), refresh=True, headers={'basic_auth': '<username>:<password>'})
    app_update = client.update_check(ClientConfig.APP_NAME, ClientConfig.APP_VERSION, channel='stable')
    if app_update is not None:
        if app_update.download():
            if isinstance(app_update, AppUpdate):
                app_update.extract_restart()
                return True
            else:
                app_update.extract()
                return True
    return False

def main():
    print('Current version is ', ClientConfig.APP_VERSION)
    if check_for_update():
        print('there\'s a new update :D')
        # driver = webdriver.Firefox()
        # driver.get('http://stackoverflow.com')

if __name__ == "__main__":
    main()
original answer
You need to use monkey patching for this. The patch below should do the job:
import urllib3.poolmanager

orig_urlopen = urllib3.poolmanager.PoolManager.urlopen

def new_urlopen(self, method, url, redirect=True, **kw):
    if "s3.amazonaws.com" in url and 'Authorization' in self.headers:
        self.headers.pop('Authorization')
    return orig_urlopen(self, method, url, redirect, **kw)

urllib3.poolmanager.PoolManager.urlopen = new_urlopen
A sample test worked for me with the above patch:
import urllib3
pool = urllib3.PoolManager()
pool.headers.update({'Authorization': 'Basic XYZ=='})
r = pool.urlopen('GET', 'https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz')
print(r.data)
You need to execute this code before importing pyupdater.

I just checked, and I believe it is a problem with pyupdater (I don't know exactly what the problem is; I've never used the library).
It seems to assume that the response body will always be gzip-compressed, and there is no flag I can find that would prevent this assumption. In this case, the actual content is not compressed at all.
Here is some relevant code from pyupdater:
pyupdater/client/__init__.py:
def _get_manifest_from_http(self):
    log.debug('Downloading online version file')
    try:
        fd = _FD(self.version_file, self.update_urls, verify=self.verify,
                 urllb3_headers=self.urllib3_headers)
        data = fd.download_verify_return()
        try:
            import ipdb            # breakpoint I added to inspect `data`
            ipdb.set_trace()
            decompressed_data = _gzip_decompress(data)
        except IOError:
            log.debug('Failed to decompress gzip file')
            # Will be caught down below.
            # Just logging the error
            raise
        log.debug('Version file download successful')
        # Writing version file to application data directory
        self._write_manifest_2_filesystem(decompressed_data)
        return decompressed_data
    except Exception as err:
        log.debug('Version file download failed')
        log.debug(err, exc_info=True)
        return None
Here is a sample of data I receive:
ipdb> data
b'{"type": "error", "error": {"message": "keys.gz"}}'
I believe you should open a ticket on https://github.com/JMSwag/PyUpdater and see if they can help you further.

Requests is a pretty fantastic library; don't waste time with anything else unless there is a really good reason:
import requests
import zlib

def download(url, username, password):
    r = requests.get(url, auth=requests.auth.HTTPBasicAuth(username, password))
    r.raise_for_status()
    # 15 + 32 tells zlib to auto-detect the gzip header
    return zlib.decompress(r.content, 15 + 32)

download('https://api.bitbucket.org/2.0/repositories/brofewfefwefewef/eee/downloads/keys.gz', 'brofewfefwefewef', your_password)
Also, it's probably worth noting that the credentials posted here shouldn't be used anymore; Basic auth can be decoded pretty simply.

Related

What HTTP response codes are retried by python Requests

What is the list of HTTP response status codes retried by default, and how many times, by Python Requests? How can I change the number of retries? I couldn't find any documentation for it.
I tried the code below and there were two retries on a 401 status code.
import requests
from http.client import HTTPConnection
from requests.auth import HTTPDigestAuth

HTTPConnection.debuglevel = 1
requests.adapters.DEFAULT_RETRIES = 5

def test():
    data = 'testdata'
    username = 'testuser'
    password = 'test'
    url = 'https://example.com:443/captionen_0001.vtt'
    try:
        response = requests.put(url, auth=HTTPDigestAuth(username, password), data=data, verify=False)
    except Exception as e:
        print('error' + str(e))

test()
warnings.warn(
send: b'PUT /channel_captionen_0001.vtt HTTP/1.1\r\nHost: example.com\r\nUser-Agent: python-requests/2.24.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 8\r\n\r\n'
send: b'testdata'
reply: 'HTTP/1.1 401 Authorization Required\r\n'
header: Date: Sat, 05 Feb 2022 07:50:25 GMT
header: WWW-Authenticate: Digest realm="WebDAV", nonce="tE/JnkDX845db3", algorithm=MD5, qop="auth"
warnings.warn(
send: b'PUT /channel_captionen_0001.vtt HTTP/1.1\r\nHost: example.com\r\nUser-Agent: python-requests/2.24.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\nContent-Length: 8\r\nAuthorization: Digest username="testuser", realm="WebDAV", nonce="tE/JnkDX845db3", uri="/channel_captionen_0001.vtt", response="1c3299c716797e8f36528f6e6dbaeb50", algorithm="MD5", qop="auth", nc=00000001, cnonce="dd0835ef485c6b71"\r\n\r\n'
send: b'testdata'
reply: 'HTTP/1.1 401 Authorization Required\r\n'
header: Date: Sat, 05 Feb 2022 07:50:25 GMT
header: WWW-Authenticate: Digest realm="WebDAV", nonce="/vXKnkDXBQc098a4", algorithm=MD5, qop="auth
It's not obvious to find. You have to know that requests is not the package that manages the connection; urllib3 does.
In the source code of HTTPAdapter (use it when you want more control over requests), the docstring for the max_retries parameter says:
If you need granular control over the conditions under which we retry a request, import urllib3's Retry class and pass that instead
Now you can refer to the urllib3 documentation for the Retry class.
Pay particular attention to the status_forcelist parameter and RETRY_AFTER_STATUS_CODES (default: frozenset({413, 429, 503})).
Update
import requests
import urllib3
my_code_list = [401, 403, ...]
s = requests.Session()
r = urllib3.util.Retry(status_forcelist=my_code_list)
a = requests.adapters.HTTPAdapter(max_retries=r)
s.mount('http://', a)
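With the adapter mounted, any request the session makes over plain HTTP retries on the listed codes. A quick check against httpbin (just a stand-in endpoint) might look like:
try:
    # httpbin echoes back whatever status code you ask for
    resp = s.get('http://httpbin.org/status/401')
    print(resp.status_code)
except requests.exceptions.RetryError as err:
    # raised once the retries on the forced status codes are exhausted
    print('gave up retrying:', err)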

httplib - http not accepting content length

Problem
When I switched MacBooks, all of a sudden I started getting an HTTP 411: Length Required (I wasn't getting this on a different Mac) when making a POST request with httplib. I cannot seem to find a workaround for this.
Code Portion 1: from a supporting class that retrieves data, among other things:
class Data(object):

    def __init__(self, value):
        self.company_id = None
        self.host = settings.CONSUMER_URL
        self.body = None
        self.headers = {"clienttype": "Cloud-web", "Content-Type": "application/json", "ErrorLogging": value}

    def login(self):
        '''Login and store auth token'''
        path = "/Security/Login"
        body = self.get_login_info()
        status_code, resp = self.submit_request("POST", path, json.dumps(body))
        self.info = json.loads(resp)
        company_id = self.get_company_id(self.info)
        self.set_token(self.info["token"])
        return company_id

    def submit_request(self, method, path, body=None, header=None):
        '''Submit requests for API tests'''
        conn = httplib.HTTPSConnection(self.host)
        conn.set_debuglevel(1)
        conn.request(method, path, body, self.headers)
        resp = conn.getresponse()
        return resp.status, resp.read()
Code Portion 2: my unit tests:
# logging in
cls.api = data.Data(False)  # initializing the Data class from Code Portion 1
cls.company_id = cls.api.login()
...

# POST Client/Register
def test_client_null_body(self):
    '''Null body object - 501'''
    status, resp = self.api.submit_request('POST', '/Client/register')
    if status != 500:
        log.log_warning('POST /Client/register: %s, %s' % (str(status), str(resp)))
    self.assertEqual(status, 500)
Code Portion 3: an example of the data I send, from a settings file:
API_ACCOUNT = {
    "userName": "account@account.com",
    "password": "password",
    "companyId": 107
}
From Logging
WARNING:root: POST /Client/register: 411, <!DOCTYPE HTML PUBLIC "-//W3C//DTD
HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Length Required</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Length Required</h2>
<hr><p>HTTP Error 411. The request must be chunked or have a content length.</p>
</BODY></HTML>
Additional info: I was using a 2008 MacBook Pro without issue. I switched to a 2013 MacBook Pro and this keeps occurring.
I took a look at this post: Python httplib and POST, and it seems that at the time httplib did not automatically generate the Content-Length header.
Now, from https://docs.python.org/2/library/httplib.html:
If one is not provided in headers, a Content-Length header is added automatically for all methods if the length of the body can be determined, either from the length of the str representation, or from the reported size of the file on disk.
When using conn.set_debuglevel(1), we can see the reply httplib receives:
reply: 'HTTP/1.1 411 Length Required\r\n'
header: Content-Type: text/html; charset=us-ascii
header: Server: Microsoft-HTTPAPI/2.0
header: Date: Thu, 26 May 2016 17:08:46 GMT
header: Connection: close
header: Content-Length: 344
Edit
Unittest Failure:
======================================================================
FAIL: test_client_null_body (__main__.NegApi)
Null body object - 501
----------------------------------------------------------------------
Traceback (most recent call last):
File "API_neg.py", line 52, in test_client_null_body
self.assertEqual(status, 500)
AssertionError: 411 != 500
.send: 'POST /Client/register HTTP/1.1\r\nHost: my.host\r\nAccept-Encoding: identity\r\nAuthorizationToken: uhkGGpJ4aQxm8BKOCH5dt3bMcwsHGCHs1p+OJvtf9mHKa/8pTEnKyYeJr+boBr8oUuvWvZLr1Fd+Og2xJP3xVw==\r\nErrorLogging: False\r\nContent-Type: application/json\r\nclienttype: Cloud-web\r\n\r\n'
reply: 'HTTP/1.1 411 Length Required\r\n'
header: Content-Type: text/html; charset=us-ascii
header: Server: Microsoft-HTTPAPI/2.0
header: Date: Thu, 26 May 2016 17:08:27 GMT
header: Connection: close
header: Content-Length: 344
Any ideas as to why this was working on a previous Mac and is currently not working here? It's the same code, same operating systems. Let me know if I can provide any more information.
Edit 2
The issue seemed to be with OS X 10.10.4; after upgrading to 10.10.5 all is well. I would still like some insight into why I was having this issue.
The only change from 10.10.4 to 10.10.5 that seems relevant would have been the Python update from 2.7.6 to 2.7.10, which includes this bug fix: http://bugs.python.org/issue22417
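For reference, a workaround that should sidestep the 411 on servers that insist on a length (an untested sketch, not the fix that actually worked here) is to declare an explicit zero Content-Length when the body is empty, e.g. inside submit_request:
headers = dict(self.headers)
if body is None:
    # Microsoft-HTTPAPI rejects unframed requests with 411, so declare the empty body explicitly.
    headers['Content-Length'] = '0'
conn.request(method, path, body, headers)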

Return response code python HTTP header

I have a Python server serving CGI scripts, and I want to add a status code to my response. I did:
try:
    cgi = CGI()
    output = cgi.fire()
    print 'Content-Type text/json'
    print 'Status:200 success'
    print
    print json.dumps(output)
except:
    print 'Content-Type: text/json'
    print 'Status: 403 Forbidden'
    print
    print json.dumps({'msg':'error'})
But when I request this script via a Dojo XHR request, I get a 200 request status. Why is that?
Header
Request URL:http://192.168.2.72:8080/cgi-bin/cgi.py
Request Method:POST
Status Code:200 Script output follows
Request Headers
Accept:*/*
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Connection:keep-alive
Content-Length:125
Content-Type:application/x-www-form-urlencoded
Host:192.168.2.72:7999
Origin:http://192.168.2.72:7999
Pragma:no-cache
Referer:http://192.168.2.72:7999/home.html
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
X-Requested-With:XMLHttpRequest
Form Data
Response Headers
Content-Type:text/json
Date:Fri, 08 Aug 2014 05:16:29 GMT
Server:SimpleHTTP/0.6 Python/2.7.3
Status:403 Forbidden
Any inputs?
What I have already tried:
result.ioArgs.xhr.getAllResponseHeaders() // returns string
ioargs.xhr.status // returns the request status.
If json.dumps(output) raises an exception, you will have already printed your headers, including the status code (which would generally be spelled Status: 200 OK), and a blank line to end the header section of the HTTP response.
Then, the except block will print a second set of headers, but those are actually considered part of the body of the response at that point because printing an empty line ended the headers. See the HTTP message spec.
The solution is to wait until you know what your output is going to be before printing any headers.
json.dumps can raise exceptions if you give it input that is not serializable. And given that cgi.fire() appears to be a method of some custom CGI object (the built-in cgi module doesn't have that method), it could be returning anything.
To debug you need to log what exception is being raised, preferably with traceback. The bare except: block you have will catch all errors and then do nothing with them, so you don't know what's going on, nor does anyone looking at the question. You might also need to log the value of output.
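(A minimal sketch of that serialize-first approach, reusing the CGI object from the question:)
import json

try:
    output = CGI().fire()
    body = json.dumps(output)   # may raise, but no headers have been printed yet
    status = 'Status: 200 OK'
except Exception:
    body = json.dumps({'msg': 'error'})
    status = 'Status: 403 Forbidden'

print 'Content-Type: text/json'
print status
print
print body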
To complement what Jason S says in his answer, I reproduced exactly the same failure with a non-JSON-serializable object (in this example an md5 hash) and got the same behaviour as the original poster: a 200 return code.
#!/usr/bin/env python
import json
import traceback

class CGI:
    def fire(self):
        import md5
        return md5.md5()

try:
    cgi = CGI()
    output = cgi.fire()
    print 'Content-Type text/json'
    print 'Status:200 success'
    print
    print json.dumps(output)
except:
    traceback.print_exc()
    print 'Content-Type: text/json'
    print 'Status: 403 Forbidden'
    print
    print json.dumps({'msg':'error'})
Interacting with the server:
$ socat - TCP4:localhost:8000
input
GET /cgi-bin/test.py HTTP/1.0
output
HTTP/1.0 200 Script output follows
Server: SimpleHTTP/0.6 Python/3.4.0
Date: Sun, 17 Aug 2014 16:16:19 GMT
Content-Type text/json
Status:200 success
Content-Type: text/json
Status: 403 Forbidden
{"msg": "error"}
traceback:
127.0.0.1 - - [17/Aug/2014 18:16:19] "GET /cgi-bin/test.py HTTP/1.0" 200 -
Traceback (most recent call last):
File "/home/xcombelle/dev/test/cgi-bin/test.py", line 16, in <module>
print json.dumps(output)
File "/usr/lib/python2.7/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 200, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/json/encoder.py", line 263, in iterencode
return _iterencode(o, 0)
File "/usr/lib/python2.7/json/encoder.py", line 177, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <md5 HASH object @ 0x7fcb090d0a30> is not JSON serializable
Just do json.dumps() to a string before outputting your headers and you should be fine, no?
That will protect you from setting the headers and then getting an exception, since an exception in print itself is unlikely.
For anyone coming across this in the future, the reason for this is the Python http.server that was being used to serve the content. For some reason it is designed to send a Status 200: script output follows header before the CGI script starts running. This means that you can't change the status code returned from within your script (see the documentation for CGIHTTPRequestHandler on this page).
This actually makes it a real pain to use when developing, as errors don't propagate in the same way they would in production.
It looks to me like you are setting a Status: header field but you want to set Status-Code:.
Does your script really write Status Code:200 Script output follows as a header field?

Python requests - print entire http request (raw)?

While using the requests module, is there any way to print the raw HTTP request?
I don't want just the headers; I want the request line, headers, and content printout. Is it possible to see what is ultimately constructed from the HTTP request?
Since v1.2.3, Requests has included the PreparedRequest object. As per the documentation, "it contains the exact bytes that will be sent to the server".
One can use this to pretty print a request, like so:
import requests

req = requests.Request('POST', 'http://stackoverflow.com', headers={'X-Custom': 'Test'}, data='a=1&b=2')
prepared = req.prepare()

def pretty_print_POST(req):
    """
    At this point it is completely built and ready
    to be fired; it is "prepared".

    However pay attention at the formatting used in
    this function because it is programmed to be pretty
    printed and may differ from the actual request.
    """
    print('{}\n{}\r\n{}\r\n\r\n{}'.format(
        '-----------START-----------',
        req.method + ' ' + req.url,
        '\r\n'.join('{}: {}'.format(k, v) for k, v in req.headers.items()),
        req.body,
    ))

pretty_print_POST(prepared)
which produces:
-----------START-----------
POST http://stackoverflow.com/
Content-Length: 7
X-Custom: Test
a=1&b=2
Then you can send the actual request with this:
s = requests.Session()
s.send(prepared)
These links are to the latest documentation available, so they might change in content:
Advanced - Prepared requests and API - Lower level classes
import requests
response = requests.post('http://httpbin.org/post', data={'key1': 'value1'})
print(response.request.url)
print(response.request.body)
print(response.request.headers)
Response objects have a .request property which is the PreparedRequest object that was sent.
An even better idea is to use the requests_toolbelt library, which can dump out both requests and responses as strings for you to print to the console. It handles all the tricky cases with files and encodings which the above solution does not handle well.
It's as easy as this:
import requests
from requests_toolbelt.utils import dump
resp = requests.get('https://httpbin.org/redirect/5')
data = dump.dump_all(resp)
print(data.decode('utf-8'))
Source: https://toolbelt.readthedocs.org/en/latest/dumputils.html
You can simply install it by typing:
pip install requests_toolbelt
Note: this answer is outdated. Newer versions of requests support getting the request content directly, as AntonioHerraizS's answer documents.
It's not possible to get the true raw content of the request out of requests, since it only deals with higher level objects, such as headers and method type. requests uses urllib3 to send requests, but urllib3 also doesn't deal with raw data - it uses httplib. Here's a representative stack trace of a request:
-> r= requests.get("http://google.com")
/usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
/usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
/usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
/usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)
Inside the httplib machinery, we can see HTTPConnection._send_request indirectly uses HTTPConnection._send_output, which finally creates the raw request and body (if it exists), and uses HTTPConnection.send to send them separately. send finally reaches the socket.
Since there's no hooks for doing what you want, as a last resort you can monkey patch httplib to get the content. It's a fragile solution, and you may need to adapt it if httplib is changed. If you intend to distribute software using this solution, you may want to consider packaging httplib instead of using the system's, which is easy, since it's a pure python module.
Alas, without further ado, the solution:
import requests
import httplib

def patch_send():
    old_send = httplib.HTTPConnection.send
    def new_send(self, data):
        print data
        return old_send(self, data)  # return is not necessary, but never hurts, in case the library is changed
    httplib.HTTPConnection.send = new_send

patch_send()
requests.get("http://www.python.org")
which yields the output:
GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae
requests supports so-called event hooks (as of 2.23 there's actually only a response hook). The hook can be used on a request to print the full request-response pair's data, including the effective URL, headers and bodies, like:
import textwrap
import requests

def print_roundtrip(response, *args, **kwargs):
    format_headers = lambda d: '\n'.join(f'{k}: {v}' for k, v in d.items())
    print(textwrap.dedent('''
        ---------------- request ----------------
        {req.method} {req.url}
        {reqhdrs}
        {req.body}
        ---------------- response ----------------
        {res.status_code} {res.reason} {res.url}
        {reshdrs}
        {res.text}
    ''').format(
        req=response.request,
        res=response,
        reqhdrs=format_headers(response.request.headers),
        reshdrs=format_headers(response.headers),
    ))

requests.get('https://httpbin.org/', hooks={'response': print_roundtrip})
Running it prints:
---------------- request ----------------
GET https://httpbin.org/
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
None
---------------- response ----------------
200 OK https://httpbin.org/
Date: Thu, 14 May 2020 17:16:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
<!DOCTYPE html>
<html lang="en">
...
</html>
You may want to change res.text to res.content if the response is binary.
Here is code which does the same, but for the response headers:
import socket

def patch_requests():
    old_readline = socket._fileobject.readline
    if not hasattr(old_readline, 'patched'):
        def new_readline(self, size=-1):
            res = old_readline(self, size)
            print res,
            return res
        new_readline.patched = True
        socket._fileobject.readline = new_readline

patch_requests()
I spent a lot of time searching for this, so I'm leaving it here, if someone needs.
A fork of @AntonioHerraizS's answer (HTTP version missing, as stated in the comments).
Use this code to get a string representing the raw HTTP packet without sending it:
import requests

def get_raw_request(request):
    request = request.prepare() if isinstance(request, requests.Request) else request
    headers = '\r\n'.join(f'{k}: {v}' for k, v in request.headers.items())
    body = '' if request.body is None else request.body.decode() if isinstance(request.body, bytes) else request.body
    return f'{request.method} {request.path_url} HTTP/1.1\r\n{headers}\r\n\r\n{body}'

headers = {'User-Agent': 'Test'}
request = requests.Request('POST', 'https://stackoverflow.com', headers=headers, json={"hello": "world"})
raw_request = get_raw_request(request)
print(raw_request)
Result:
POST / HTTP/1.1
User-Agent: Test
Content-Length: 18
Content-Type: application/json
{"hello": "world"}
💡 You can also print the request from the response object:
r = requests.get('https://stackoverflow.com')
raw_request = get_raw_request(r.request)
print(raw_request)
I use the following function to format requests. It's like @AntonioHerraizS's, except it will pretty-print JSON objects in the body as well, and it labels all parts of the request.
import functools
import json
import textwrap

format_json = functools.partial(json.dumps, indent=2, sort_keys=True)
indent = functools.partial(textwrap.indent, prefix=' ')

def format_prepared_request(req):
    """Pretty-format 'requests.PreparedRequest'

    Example:
        res = requests.post(...)
        print(format_prepared_request(res.request))

        req = requests.Request(...)
        req = req.prepare()
        print(format_prepared_request(res.request))
    """
    headers = '\n'.join(f'{k}: {v}' for k, v in req.headers.items())
    content_type = req.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(json.loads(req.body))
        except json.JSONDecodeError:
            body = req.body
    else:
        body = req.body
    s = textwrap.dedent("""
        REQUEST
        =======
        endpoint: {method} {url}
        headers:
        {headers}
        body:
        {body}
        =======
    """).strip()
    s = s.format(
        method=req.method,
        url=req.url,
        headers=indent(headers),
        body=indent(body),
    )
    return s
And I have a similar function to format the response:
def format_response(resp):
    """Pretty-format 'requests.Response'"""
    headers = '\n'.join(f'{k}: {v}' for k, v in resp.headers.items())
    content_type = resp.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(resp.json())
        except json.JSONDecodeError:
            body = resp.text
    else:
        body = resp.text
    s = textwrap.dedent("""
        RESPONSE
        ========
        status_code: {status_code}
        headers:
        {headers}
        body:
        {body}
        ========
    """).strip()
    s = s.format(
        status_code=resp.status_code,
        headers=indent(headers),
        body=indent(body),
    )
    return s
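Typical use, with httpbin standing in for whatever endpoint you are calling:
import requests

resp = requests.post('https://httpbin.org/post', json={'hello': 'world'})
print(format_prepared_request(resp.request))
print(format_response(resp))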
test_print.py content:
import logging

import pytest
import requests
from requests_toolbelt.utils import dump

def print_raw_http(response):
    data = dump.dump_all(response, request_prefix=b'', response_prefix=b'')
    return '\n' * 2 + data.decode('utf-8')

@pytest.fixture
def logger():
    log = logging.getLogger()
    log.addHandler(logging.StreamHandler())
    log.setLevel(logging.DEBUG)
    return log

def test_print_response(logger):
    session = requests.Session()
    response = session.get('http://127.0.0.1:5000/')
    assert response.status_code == 300, logger.warning(print_raw_http(response))
hello.py content:
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'
Run:
$ python -m flask hello.py
$ python -m pytest test_print.py
Stdout:
------------------------------ Captured log call ------------------------------
DEBUG urllib3.connectionpool:connectionpool.py:225 Starting new HTTP connection (1): 127.0.0.1:5000
DEBUG urllib3.connectionpool:connectionpool.py:437 http://127.0.0.1:5000 "GET / HTTP/1.1" 200 13
WARNING root:test_print_raw_response.py:25
GET / HTTP/1.1
Host: 127.0.0.1:5000
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 13
Server: Werkzeug/1.0.1 Python/3.6.8
Date: Thu, 24 Sep 2020 21:00:54 GMT
Hello, World!

How to handle "413: Request Entity Too Large" in python flask server

I'm using Flask-Uploads to upload files to my Flask server. The max size allowed is set using flaskext.uploads.patch_request_class(app, 16 * 1024 * 1024).
My client application (a unit test) uses requests to post a file that is too large.
I can see that my server returns an HTTP response with status 413: Request Entity Too Large, but the client raises an exception inside the requests code:
ConnectionError: HTTPConnectionPool(host='api.example.se', port=80): Max retries exceeded with url: /images (Caused by <class 'socket.error'>: [Errno 32] Broken pipe)
My guess is that the server disconnects the receiving socket and sends the response back to the client, but when the client hits the broken sending socket, it raises an exception and never reads the response.
Questions:
Is my guess about Flask-Uploads and requests correct?
Do Flask-Uploads and requests handle the 413 error correctly?
Should I expect my client code to get back some HTML when the post is too large?
Update
Here is a simple example reproducing my problem.
server.py
from flask import Flask, request

app = Flask(__name__)
app.config['MAX_CONTENT_LENGTH'] = 1024

@app.route('/post', methods=('POST',))
def view_post():
    return request.data

app.run(debug=True)
client.py
from tempfile import NamedTemporaryFile
import requests

def post(size):
    print "Post with size %s" % size,
    f = NamedTemporaryFile(delete=False, suffix=".jpg")
    for i in range(0, size):
        f.write("CoDe")
    f.close()

    # Post
    files = {'file': ("tempfile.jpg", open(f.name, 'rb'))}
    r = requests.post("http://127.0.0.1:5000/post", files=files)
    print "gives status code = %s" % r.status_code

post(16)
post(40845)
post(40846)
result from client
Post with size 16 gives status code = 200
Post with size 40845 gives status code = 413
Post with size 40846
Traceback (most recent call last):
File "client.py", line 18, in <module>
post(40846)
File "client.py", line 13, in post
r = requests.post("http://127.0.0.1:5000/post", files=files)
File "/opt/python_env/renter/lib/python2.7/site-packages/requests/api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "/opt/python_env/renter/lib/python2.7/site-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/opt/python_env/renter/lib/python2.7/site-packages/requests/sessions.py", line 357, in request
resp = self.send(prep, **send_kwargs)
File "/opt/python_env/renter/lib/python2.7/site-packages/requests/sessions.py", line 460, in send
r = adapter.send(request, **kwargs)
File "/opt/python_env/renter/lib/python2.7/site-packages/requests/adapters.py", line 354, in send
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='127.0.0.1', port=5000): Max retries exceeded with url: /post (Caused by <class 'socket.error'>: [Errno 32] Broken pipe)
my versions
$ pip freeze
Flask==0.10.1
Flask-Mail==0.9.0
Flask-SQLAlchemy==1.0
Flask-Uploads==0.1.3
Jinja2==2.7.1
MarkupSafe==0.18
MySQL-python==1.2.4
Pillow==2.1.0
SQLAlchemy==0.8.2
Werkzeug==0.9.4
blinker==1.3
itsdangerous==0.23
passlib==1.6.1
python-dateutil==2.1
requests==2.0.0
simplejson==3.3.0
six==1.4.1
virtualenv==1.10.1
voluptuous==0.8.1
wsgiref==0.1.2
Flask is closing the connection; you can set an error handler for the 413 error:
@app.errorhandler(413)
def request_entity_too_large(error):
    return 'File Too Large', 413
Now the client should get a 413 error; note that I didn't test this code.
Update:
I tried recreating the 413 error, and I didn't get a ConnectionError exception.
Here's a quick example:
from flask import Flask, request

app = Flask(__name__)
app.config['MAX_CONTENT_LENGTH'] = 1024

@app.route('/post', methods=('POST',))
def view_post():
    return request.data

app.run(debug=True)
After running the file, I used the terminal to test requests and sending large data:
>>> import requests
>>> r = requests.post('http://127.0.0.1:5000/post', data={'foo': 'a'})
>>> r
<Response [200]>
>>> r = requests.post('http://127.0.0.1:5000/post', data={'foo': 'a'*10000})
>>> r
<Response [413]>
>>> r.status_code
413
>>> r.content
'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>413 Request Entity Too Large</title
>\n<h1>Request Entity Too Large</h1>\n<p>The data value transmitted exceeds the capacity limit.</p>\n'
As you can see, we got the 413 error response from Flask, and requests didn't raise an exception.
By the way I'm using:
Flask: 0.10.1
Requests: 2.0.0
RFC 2616, the specification for HTTP 1.1, says:
10.4.14 413 Request Entity Too Large
The server is refusing to process a request because the request
entity is larger than the server is willing or able to process. The
server MAY close the connection to prevent the client from continuing
the request.
If the condition is temporary, the server SHOULD include a Retry-After header field to indicate that it is temporary and after what time the client MAY try again.
This is what's happening here: flask is closing the connection to prevent the client from continuing the upload, which is giving you the Broken pipe error.
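If the client also needs to cope with the server tearing down the connection before the 413 arrives, one defensive pattern (just a sketch, not part of Flask or requests) is to treat the connection error like a rejected upload:
import requests

def upload(url, files):
    try:
        return requests.post(url, files=files)
    except requests.exceptions.ConnectionError:
        # The server most likely aborted the upload (e.g. body over MAX_CONTENT_LENGTH);
        # handle it the same way as an explicit 413 response.
        return None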
Based on this GitHub issue answer (https://github.com/benoitc/gunicorn/issues/1733#issuecomment-377000612):
@app.before_request
def handle_chunking():
    """
    Sets the "wsgi.input_terminated" environment flag, thus enabling
    Werkzeug to pass chunked requests as streams. The gunicorn server
    should set this, but it's not yet been implemented.
    """
    transfer_encoding = request.headers.get("Transfer-Encoding", None)
    if transfer_encoding == u"chunked":
        request.environ["wsgi.input_terminated"] = True
