I am trying to understand how and when the WSGI environment HTTP Header(s) get renamed in an app's request object.
I am trying Eve and I am sending a POST or a PUT with X-HTTP-Method-Override.
The code within Eve reads the request headers like this:
return request.headers.get('X-HTTP-Method-Override', request.method)
In my WSGI Environment I have a HTTP_X_HTTP_METHOD_OVERRIDE with value PATCH.
When I try to do a request.headers dump, I get:
Request Header: ('X-Http-Method-Override', u'PATCH')
Request Header: ('Origin', u'http://localhost:9000')
Request Header: ('Content-Length', u'622')
Request Header: ('Host', u'localhost:24435')
Request Header: ('Accept', u'application/json;charset=UTF-8')
Request Header: ('Content-Type', u'application/json')
Request Header: ('Accept-Encoding', u'identity')
I checked online and other Python applications are trying to access this specific request header with the case:
X-HTTP-Method-Override and not X-Http-Method-Override (which I get in request)
Flask takes care of extracting the headers from the WSGI environment variables for you, removing the initial HTTP_ prefix in the process. The prefix is there in the WSGI environment to distinguish the headers from other WSGI information, but it is entirely redundant once you have extracted the headers into a dedicated structure.
The request object also provides you with a specialised dictionary where keys are matched case insensitively. It doesn't matter what case you use here, as long as the lowercased version matches the lowercased header key; http, Http, HTTP and HtTp all are valid case variations. That's because the HTTP standard explicitly states that case should be ignored when handling headers.
See the Headers class reference in the Werkzeug documentation; it is the basis for the request.headers object. It in turn is compatible with the wsgiref.headers.Headers class, including this:
For each of these methods, the key is the header name (treated case-insensitively), and the value is the first value associated with that header name.
Emphasis mine.
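To make the renaming concrete, here is a minimal sketch using only the standard library. The helper headers_from_environ is hypothetical (it is not Flask's or Werkzeug's actual code); it strips the HTTP_ prefix as described above and stores the result in wsgiref.headers.Headers, whose lookups ignore case:

```python
from wsgiref.headers import Headers


def headers_from_environ(environ):
    """Hypothetical helper: collect HTTP headers from a WSGI environ dict."""
    pairs = []
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            # HTTP_X_HTTP_METHOD_OVERRIDE -> X-Http-Method-Override
            pairs.append((key[5:].replace("_", "-").title(), value))
        elif key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            # These two are passed without the HTTP_ prefix.
            pairs.append((key.replace("_", "-").title(), value))
    return Headers(pairs)


environ = {"HTTP_X_HTTP_METHOD_OVERRIDE": "PATCH", "REQUEST_METHOD": "POST"}
headers = headers_from_environ(environ)

# The lookup succeeds regardless of the case used for the header name.
override = headers.get("X-HTTP-Method-Override", environ["REQUEST_METHOD"])
print(override)
```

The stored name is X-Http-Method-Override, the same spelling as in the dump from the question, yet the lookup with X-HTTP-Method-Override still finds it.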
I am trying to use AWS DynamoDB in a Flutter app, and given the lack of an official AWS SDK for Dart I am forced to use the low level HTTP REST API.
The method for signing an AWS HTTP request is quite tedious, but using an AWS supplied sample as a guide, I was able to convert the Python to Dart pretty much line-for-line relatively easily. The end result was both sets of code producing the same auth signatures.
My issue came when I actually went to send the request. The Python works as expected but sending a POST with Dart's HTTP package gives the error
The request signature we calculated does not match the signature you
provided. Check your AWS Secret Access Key and signing method. Consult
the service documentation for details.
I'll spare you the actual code for generating the auth signature, as the issue can be replicated simply by sending the same request hard-coded. See the Python and Dart code below.
Note: A valid response will return
Signature expired: 20190307T214900Z is now earlier than
20190307T215809Z (20190307T221309Z - 15 min.)
as the request signature uses current date and is only valid for 15 mins.
*****PYTHON CODE*****
import requests
headers = {'Content-Type':'application/json',
'X-Amz-Date':'20190307T214900Z',
'X-Amz-Target':'DynamoDB_20120810.GetItem',
'Authorization':'AWS4-HMAC-SHA256 Credential=AKIAJFZWA7QQAQT474EQ/20190307/ap-southeast-2/dynamodb/aws4_request, SignedHeaders=content-type;host;x-amz-date;x-amz-target, Signature=297c5a03c59db6da45bfe2fda6017f89a0a1b2ab6da2bb6e0d838ca40be84320'}
endpoint = 'https://dynamodb.ap-southeast-2.amazonaws.com/'
request_parameters = '{"TableName": "player-exports","Key": {"exportId": {"S": "HG1T"}}}'
r = requests.post(endpoint, data=request_parameters, headers=headers)
print('Response status: %d\n' % r.status_code)
print('Response body: %s\n' % r.text)
*****DART CODE*****
import 'package:http/http.dart' as http;
void main(List<String> arguments) async {
var headers = {'Content-Type':'application/json',
'X-Amz-Date':'20190307T214900Z',
'X-Amz-Target':'DynamoDB_20120810.GetItem',
'Authorization':'AWS4-HMAC-SHA256 Credential=AKIAJFZWA7QQAQT474EQ/20190307/ap-southeast-2/dynamodb/aws4_request, SignedHeaders=content-type;host;x-amz-date;x-amz-target, Signature=297c5a03c59db6da45bfe2fda6017f89a0a1b2ab6da2bb6e0d838ca40be84320'};
var endpoint = 'https://dynamodb.ap-southeast-2.amazonaws.com/';
var request_parameters = '{"TableName": "player-exports","Key": {"exportId": {"S": "HG1T"}}}';
http.post(endpoint, body: request_parameters, headers: headers).then((response) {
print("Response status: ${response.statusCode}");
print("Response body: ${response.body}");
});
}
The endpoint, headers and body are literally copy and pasted between the two sets of code.
Is there some nuance to how Dart HTTP works that I am missing here? Is there some map/string/json conversion of the headers or request_parameters happening?
One thing I did note is that in the AWS provided example it states
For DynamoDB, the request can include any headers, but MUST include
"host", "x-amz-date", "x-amz-target", "content-type", and
"Authorization". Except for the authorization header, the headers must
be included in the canonical_headers and signed_headers values, as
noted earlier. Order here is not significant. Python note: The 'host'
header is added automatically by the Python 'requests' library.
But
a) When I add 'Host':'dynamodb.ap-southeast-2.amazonaws.com' to the headers in the Dart code I get the same result
and
b) If I look at r.request.headers after the Python requests returns, I can see that it has added a few new headers (Content-Length etc) automatically, but "Host" isn't one of them.
Any ideas why the seemingly same HTTP request works for Python Requests but not Dart HTTP?
Ok this is resolved now. My issue was in part a massive user-error. I was using a new IDE and when I generated the hardcoded example I provided I was actually still executing the previous file. Stupid, stupid, stupid.
But...
I was able to sort out the actual issue that caused me to raise the question in the first place. I found that if you set the content type to "application/json" in the headers, the Dart HTTP package automatically appends "; charset=utf-8". Because this value is part of the auth signature, when AWS encodes the values from the header to compare to the user-generated signature, they don't match.
The fix is simply to set the content-type header to "application/json; charset=utf-8" rather than "application/json" when generating the signature, so that the signed value matches what is actually sent.
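As a rough illustration of why the appended charset breaks the signature, the sketch below builds only the canonical-headers part of a Signature Version 4 request (lower-cased names, trimmed values, sorted by name) and hashes it. The helper canonical_headers is my own simplification, not the AWS sample code, and the real signing process involves more steps than shown here:

```python
import hashlib


def canonical_headers(headers):
    # Lower-case the names, collapse whitespace in the values, sort by name,
    # and emit one "name:value" line per header.
    items = sorted((k.lower(), " ".join(v.split())) for k, v in headers.items())
    return "".join(f"{name}:{value}\n" for name, value in items)


# Headers as they were signed.
signed = {
    "Content-Type": "application/json",
    "Host": "dynamodb.ap-southeast-2.amazonaws.com",
    "X-Amz-Date": "20190307T214900Z",
    "X-Amz-Target": "DynamoDB_20120810.GetItem",
}
# Headers as they go out once "; charset=utf-8" has been appended.
sent = {**signed, "Content-Type": "application/json; charset=utf-8"}

digest_signed = hashlib.sha256(canonical_headers(signed).encode()).hexdigest()
digest_sent = hashlib.sha256(canonical_headers(sent).encode()).hexdigest()
print(digest_signed == digest_sent)  # False
```

Any difference in a signed header value changes what the server hashes, so the signature it computes no longer matches the one supplied in the Authorization header.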
Found a bit more discussion about this "bug" after the fact here.
Why does autobahn web socket server change all http header keys to lower case? I need to implement authentication token in header with OAuth2 standard with custom header 'Authorization:Bearer $token'. But it seems from autobahn 'request.headers' in onConnect method of WebSocketServerProtocol class all the keys are changed to lower case. What is the reason behind this? Can I use 'authorization' instead of 'Authorization' as the key to fetch auth token from request in this scenario?
According to the HTTP RFC, "HTTP header ... field names are case-insensitive." In your example, any of the following incoming header spellings are equivalent: "Authorization", "authorization", "AuThOrIzAtIoN".
The software in question lowercases the header names to make lookups easier. You should always use the lower-case version as the key.
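For example, assuming the headers arrive as a plain dict with lower-cased keys (as described in the question), the token could be pulled out like this. extract_bearer_token is a hypothetical helper name, not part of autobahn:

```python
def extract_bearer_token(headers):
    """Return the bearer token from a lower-cased header dict, or None."""
    value = headers.get("authorization", "")
    scheme, _, token = value.partition(" ")
    # The scheme name is matched without regard to case here.
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


print(extract_bearer_token({"authorization": "Bearer abc123"}))  # abc123
print(extract_bearer_token({"host": "example.com"}))             # None
```

Inside onConnect you would pass request.headers to such a helper and reject the connection when it returns None.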
Does the requests package of Python cache data by default?
For example,
import requests
resp = requests.get('https://some website')
Will the response be cached? If so, how do I clear it?
Add a 'Cache-Control: no-cache' header:
self.request = requests.get('http://google.com',
headers={'Cache-Control': 'no-cache'})
See https://stackoverflow.com/a/55613686/469045 for complete answer.
Late answer, but Python requests doesn't cache responses. If a cache somewhere else along the way is serving stale data, send the Cache-Control and Pragma headers, i.e.:
import requests
h = {
    ...
    "Cache-Control": "no-cache",
    "Pragma": "no-cache"
}
r = requests.get("url", headers=h)
...
HTTP/Headers
Cache-Control
The Cache-Control general-header field is used to specify directives for caching mechanisms in both requests and responses. Caching directives are unidirectional, meaning that a given directive in a request is not implying that the same directive is to be given in the response.
Pragma
Implementation-specific header that may have various effects anywhere
along the request-response chain. Used for backwards compatibility
with HTTP/1.0 caches where the Cache-Control header is not yet
present.
Directive
no-cache
Forces caches to submit the request to the origin server for
validation before releasing a cached copy.
Note on Pragma:
Pragma is not specified for HTTP responses and is therefore not a
reliable replacement for the general HTTP/1.1 Cache-Control header,
although it does behave the same as Cache-Control: no-cache, if the
Cache-Control header field is omitted in a request. Use Pragma only
for backwards compatibility with HTTP/1.0 clients.
Python-requests doesn't have any caching features.
However, if you need them you can look at requests-cache, although I never used it.
Requests does not do caching by default. You can easily plug it in by using something like CacheControl.
I was getting an outdated version of a website and I suspected a requests cache too, but adding the no-cache parameter to the headers didn't change anything. It turned out that the cookie I was passing was causing the server to present the outdated site.
I was working on a simple API server using tornado and all requests require the parameter access_token. I was playing with curl, and was surprised to find that DELETE and GET requests will not extract this value from the request body--they only allow this param to be passed via the query string.
ie, when I do
curl -i -X DELETE -d access_token=1234 http://localhost:8888/
In the delete method of my web handler, this returns None:
self.get_argument('access_token', None)
However, when I do
curl -i -X DELETE http://localhost:8888/?access_token=1234
This yields "1234" as expected:
self.get_argument('access_token', None)
I examined the tornado source, and found that the body is only parsed for POST and PUT requests: https://github.com/facebook/tornado/blob/4b346bdde80c1e677ca0e235e04654f8d64b365c/tornado/httpserver.py#L258
Is it correct to ignore the request body for GET, HEAD, and DELETE requests, or is this a choice made by the authors of tornado?
This is correct per the HTTP/1.1 protocol specification.
The specification defines no semantics for entity data enclosed in DELETE and GET requests, so a server is free to ignore it.
According to the definition, GET requests retrieve their entity data from the request URI.
HEAD requests are defined as identical to GET requests except that the server should not return a message body in the response.
Therefore the authors of tornado were correct to ignore the "post" data for GET, HEAD, and DELETE.
See HTTP/1.1 Method Definitions
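If you nevertheless want to read access_token from the body of a DELETE request, you can parse the body yourself. The sketch below uses only the standard library; argument_from_request is a hypothetical helper, and it assumes the body is form-encoded. In a Tornado handler the two inputs would be self.request.query and self.request.body:

```python
from urllib.parse import parse_qs


def argument_from_request(name, query, body, default=None):
    """Look up `name` in the query string first, then in a form-encoded body."""
    for raw in (query, body.decode("utf-8")):
        values = parse_qs(raw).get(name)
        if values:
            # Return the last value if the argument was given more than once.
            return values[-1]
    return default


# Corresponds to: curl -i -X DELETE -d access_token=1234 http://localhost:8888/
print(argument_from_request("access_token", "", b"access_token=1234"))
# Corresponds to: curl -i -X DELETE http://localhost:8888/?access_token=1234
print(argument_from_request("access_token", "access_token=1234", b""))
```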
It is a good idea not to accept requests with a payload unless they are POST or PUT, for security reasons. Some servers, e.g. lighttpd, return a server error in this case.
I'm playing around with some APIs and I'm trying to figure this out.
I am making a basic HTTP authenticated request to my server via the API. As part of this request, the authenticated key is stored in the HTTP header as username.
So my question is, how do I get the contents of the incoming request such that I can perform a check against it?
What I am trying to do:
if incoming request has header == 'myheader':
    do some stuff
else:
    return ('not authorised')
For those interested, I am trying to get this to work.
UPDATE
I am using Django
http://docs.djangoproject.com/en/dev/ref/request-response/
HttpRequest.META
A standard Python dictionary containing all available HTTP headers.
Available headers depend on the client and server, but here are some examples:
CONTENT_LENGTH
CONTENT_TYPE
HTTP_ACCEPT_ENCODING
HTTP_ACCEPT_LANGUAGE
HTTP_HOST -- The HTTP Host header sent by the client.
HTTP_REFERER -- The referring page, if any.
HTTP_USER_AGENT -- The client's user-agent string.
QUERY_STRING -- The query string, as a single (unparsed) string.
REMOTE_ADDR -- The IP address of the client.
REMOTE_HOST -- The hostname of the client.
REMOTE_USER -- The user authenticated by the Web server, if any.
REQUEST_METHOD -- A string such as "GET" or "POST".
SERVER_NAME -- The hostname of the server.
SERVER_PORT -- The port of the server.
With the exception of CONTENT_LENGTH and CONTENT_TYPE, as
given above, any HTTP headers in the
request are converted to META keys by
converting all characters to
uppercase, replacing any hyphens with
underscores and adding an HTTP_ prefix
to the name. So, for example, a header
called X-Bender would be mapped to the
META key HTTP_X_BENDER.
So:
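# .get() avoids a KeyError when the client did not send the header
if request.META.get('HTTP_USERNAME'):
    pass  # header present: do some stuff
else:
    pass  # header missing: return 'not authorised'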
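The renaming rule quoted above can be sketched as a small function. meta_key is a hypothetical name used only for illustration; Django does this conversion for you:

```python
def meta_key(header_name):
    """Map an HTTP header name to the key it gets in request.META."""
    key = header_name.upper().replace("-", "_")
    # CONTENT_LENGTH and CONTENT_TYPE are the two exceptions without a prefix.
    if key in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return key
    return "HTTP_" + key


print(meta_key("X-Bender"))      # HTTP_X_BENDER
print(meta_key("Content-Type"))  # CONTENT_TYPE
```

So a header sent as "username" ends up under request.META['HTTP_USERNAME'].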
When the script runs under CGI, the headers are stored in os.environ. So you can access the HTTP headers like this:
import os
if "SOME_HEADER" in os.environ:
    # do something with the header, i.e. os.environ["SOME_HEADER"]
    value = os.environ["SOME_HEADER"]