How do I clear cache with Python Requests? - python

Does the requests package of Python cache data by default?
For example,
import requests
resp = requests.get('https://some website')
Will the response be cached? If so, how do I clear it?

Add a 'Cache-Control: no-cache' header:
self.request = requests.get('http://google.com',
headers={'Cache-Control': 'no-cache'})
See https://stackoverflow.com/a/55613686/469045 for complete answer.

Late answer, but Python requests doesn't cache responses itself. To keep caches along the way from serving a stored copy, send the Cache-Control and Pragma headers, i.e.:
import requests
h = {
...
"Cache-Control": "no-cache",
"Pragma": "no-cache"
}
r = requests.get("url", headers=h)
...
HTTP/Headers
Cache-Control
The Cache-Control general-header field is used to specify directives for caching mechanisms in both requests and responses. Caching directives are unidirectional, meaning that a given directive in a request is not implying that the same directive is to be given in the response.
Pragma
Implementation-specific header that may have various effects anywhere
along the request-response chain. Used for backwards compatibility
with HTTP/1.0 caches where the Cache-Control header is not yet
present.
Directive
no-cache
Forces caches to submit the request to the origin server for
validation before releasing a cached copy.
Note on Pragma:
Pragma is not specified for HTTP responses and is therefore not a
reliable replacement for the general HTTP/1.1 Cache-Control header,
although it does behave the same as Cache-Control: no-cache, if the
Cache-Control header field is omitted in a request. Use Pragma only
for backwards compatibility with HTTP/1.0 clients.

Python-requests doesn't have any caching features.
However, if you need them, you can look at requests-cache, although I have never used it.

Requests does not do caching by default. You can easily plug it in by using something like CacheControl.

I was getting an outdated version of a website and suspected a requests cache too, but adding the no-cache directive to the headers didn't change anything. It turned out that the cookie I was passing was causing the server to present the outdated site.

Related

Can I use Python requests library to send inconsistent requests

I am trying to write a Python script containing a somewhat unusual HTTP request as part of learning about web attacks and solving the lab at
https://portswigger.net/web-security/request-smuggling/lab-basic-cl-te.
There, I need to issue a request containing both a Content-Length and a Transfer-Encoding header that are in disagreement.
My basic and still unmanipulated request looks like this and works as expected:
with requests.Session() as client:
    client.verify = False
    client.proxies = proxies
    [...]
    data = '0\r\n\r\nX'
    req = requests.Request('POST', host, data=data)
    prep = client.prepare_request(req)
    client.send(prep)
[...]
Content-Length: 6\r\n
\r\n
0\r\n
\r\n
X
However, as soon as I add the Transfer-Encoding header, the request itself gets modified.
data = '0\r\n\r\nX'
req = requests.Request('POST', host, data=data)
prep = client.prepare_request(req)
prep.headers['Transfer-Encoding'] = 'chunked'
client.send(prep)
The request that is actually sent down the wire is
[...]
Content-Length: 0\r\n
\r\n
whereas the expected request would be
[...]
Content-Length: 6\r\n
Transfer-Encoding: chunked\r\n
\r\n
0\r\n
\r\n
X
The same thing happens if I flip things around, prepare a chunked request and modify the Content-Length header afterwards:
def gen():
    yield b'0\r\n'
    yield b'\r\n'
    yield b'X'
req = requests.Request('POST', host, data=gen())
prep = client.prepare_request(req)
prep.headers['Content-Length'] = '6'
client.send(prep)
Basically, the Transfer-Encoding header gets removed completely, the data is reinterpreted according to the chunking and the Content-Length header gets recalculated to match.
I was under the impression that preparing a request and manipulating its content before sending should send the modified content, but either this is a wrong assumption or I do things horribly wrong.
Is sending such a request possible this way or do I have to go onto a lower level to put arbitrary data on the wire?
requests is a good HTTP client, and as such it prevents you from generating malformed HTTP requests, since malformed requests result in 400 errors in a lot of cases.
To generate syntactically invalid HTTP requests you need to avoid high-level HTTP clients (a browser, but also an HTTP library). Instead, you need to go down to TCP/IP socket management (and maybe SSL as well) and write the full HTTP protocol with your own code, without a library.
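As a rough illustration of that lower-level approach, here is a minimal sketch using only the standard library. The function names, the Content-Type header, port 443 and the timeout are assumptions made for the example; the proxy and the disabled certificate verification from the question are left out.

```python
import socket
import ssl


def build_cl_te_request(host: str, path: str = "/") -> bytes:
    """Build a POST whose Content-Length and Transfer-Encoding disagree."""
    body = b"0\r\n\r\nX"  # terminating chunk followed by one stray byte
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def send_raw(host: str, payload: bytes, port: int = 443, timeout: float = 10.0) -> bytes:
    """Send the bytes verbatim over TLS and return whatever comes back."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(payload)
            chunks = []
            try:
                while True:
                    data = tls_sock.recv(4096)
                    if not data:
                        break
                    chunks.append(data)
            except socket.timeout:
                pass  # a keep-alive server may never close; stop reading
            return b"".join(chunks)
```

Nothing rewrites the headers here, because the payload is assembled by hand and written to the socket unchanged. Calling `send_raw(host, build_cl_te_request(host))` with the lab's hostname would put both headers on the wire.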

Can't use DjangoRestFramework with Requests because of self-signed-cert in chain

I've got 2 Django applications that I need to be talking to each other via the DjangoRestFramework. The apps live on Windows Servers that run Apache to serve the data.
They live on separate boxes within my infrastructure and here we use a self-signed cert for internal things because that's a requirement.
When I try to make the request, it's upset with the self-signed-certificate in the chain and doesn't finish the request.
I made a little utility function that reaches out to the API:
def get_future_assignments(user_id):
    """gets a users data from the API
    Arguments:
        user_id {int} -- user_id for a User
    """
    headers = {
        'User-Agent': 'Mozilla/5.0',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'X-Requested-With': 'XMLHttpRequest'
    }
    api_app = settings.MY_API
    api_model = 'api_endpoint/'
    api_query = '?user_id='+str(user_id)
    json_response = requests.get(
        api_app+api_model+api_query,
        headers=headers,
        verify=False
    )
    return json.loads(json_response.content)
Using verify=False allows the endpoint to run and give back appropriate data. If I remove that, it fails, saying there is a self-signed certificate in the chain, and does not finish the call.
The overall layout is like this:
Server1: https - self-signed (server1.cer (making the call))
Server2: https - self-signed (server2.cer (api lives on this server))
I have the root certificate we install on browsers and such within the organization (rootcert.cer).
I'm not too well versed in certs so I am not sure which ones I should be sending across with the request.
I know there is a cert argument that can be used in requests - but I'm unsure what to put in there.
I'm pretty sure I would need something like this:
json_response = requests.get(
    api_app+api_model+api_query,
    headers=headers,
    cert= our_cert, # not sure which cert this would be...
    key= our_root_ca # do I need this? Not sure...
)
But any help from people who know certs a bit better is appreciated!
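One possible direction, sketched under assumptions: in requests, the `verify` argument accepts a path to a CA bundle, while `cert` is for client-side certificates, which this setup does not appear to need. The sketch assumes the organization's root certificate has been exported in PEM format; the file name `rootcert.pem`, the helper names and the timeout are hypothetical.

```python
import requests


def make_internal_session(ca_bundle_path: str) -> requests.Session:
    """Return a Session that validates servers against the given CA bundle."""
    session = requests.Session()
    # `verify` takes the path to a PEM file holding the trusted CA certificate(s).
    session.verify = ca_bundle_path
    return session


def get_future_assignments(session: requests.Session, base_url: str, user_id: int):
    """Fetch a user's data from the API over a verified HTTPS connection."""
    response = session.get(
        base_url + "api_endpoint/",
        params={"user_id": user_id},  # requests builds the query string
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```

Usage would look like `session = make_internal_session("rootcert.pem")` followed by `get_future_assignments(session, settings.MY_API, user_id)`. If the root certificate is only available as a DER-encoded `.cer` file, it would need converting to PEM first.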

How can I set custom Server header with Tornado?

I noticed that my app returns this HTTP response header:
Server: TornadoServer/4.5.2
Is it possible to change it to custom?
Use RequestHandler.set_default_headers()
Do note that setting such headers in the normal flow of request processing may not do what you want, since headers may be reset during error handling.
Here is the source from the documentation.
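A minimal sketch of that approach, assuming a shared base class for the application's handlers. The class names and the `MyServer/1.0` value are made up for the example.

```python
import tornado.web


class BaseHandler(tornado.web.RequestHandler):
    def set_default_headers(self):
        # Runs at the start of each request and again whenever the response
        # is reset for error handling, so the override is reapplied there too.
        self.set_header("Server", "MyServer/1.0")  # hypothetical value


class MainHandler(BaseHandler):
    def get(self):
        self.write("hello")


def make_app() -> tornado.web.Application:
    return tornado.web.Application([(r"/", MainHandler)])
```

Only handlers that inherit from `BaseHandler` pick up the custom header, so every handler in the application would have to derive from it.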
You can use RequestHandler.set_header() for the headers you want to add or change.
Here is an example (set_header() is an instance method, so call it on self inside a handler method):
self.set_header('Access-Control-Allow-Origin', '*')
self.set_header('Access-Control-Allow-Methods', 'POST, GET, PUT, DELETE, OPTIONS')
self.set_header('Access-Control-Max-Age', 1000)
You can also use RequestHandler.set_header().
This method changes the headers of the response that is finally returned.

Flask/Eve + WSGI and HTTP_X_HTTP_METHOD_OVERRIDE

I am trying to understand how and when the WSGI environment HTTP Header(s) get renamed in an app's request object.
I am trying Eve and I am sending a POST or a PUT with X-HTTP-Method-Override.
The code, within Eve, is trying to access the request headers using the following code (here):
return request.headers.get('X-HTTP-Method-Override', request.method)
In my WSGI Environment I have a HTTP_X_HTTP_METHOD_OVERRIDE with value PATCH.
When I try to do a request.headers dump, I get:
Request Header: ('X-Http-Method-Override', u'PATCH')
Request Header: ('Origin', u'http://localhost:9000')
Request Header: ('Content-Length', u'622')
Request Header: ('Host', u'localhost:24435')
Request Header: ('Accept', u'application/json;charset=UTF-8')
Request Header: ('Content-Type', u'application/json')
Request Header: ('Accept-Encoding', u'identity')
I checked online and other Python applications are trying to access this specific request header with the case:
X-HTTP-Method-Override and not X-Http-Method-Override (which I get in request)
Flask takes care of extracting the headers from the WSGI environment variables for you, in the process removing the initial HTTP_ prefix. The prefix is there in the WSGI environment to distinguish the headers from other WSGI information, but that prefix is entirely redundant once you have extracted the headers into a dedicated structure.
The request object also provides you with a specialised dictionary where keys are matched case-insensitively. It doesn't matter what case you use here, as long as the lowercased version matches the lowercased header key; http, Http, HTTP and HtTp are all valid case variations. That's because the HTTP standard explicitly states that case should be ignored when handling headers.
See the Headers class reference in the Werkzeug documentation; it is the basis for the request.headers object. It in turn is compatible with the wsgiref.headers.Headers class, including this:
For each of these methods, the key is the header name (treated case-insensitively), and the value is the first value associated with that header name.
Emphasis mine.
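The behaviour can be reproduced with the standard library alone. The sketch below is an approximation of the extraction step, not Werkzeug's actual code; the helper name and the sample environ are made up for illustration.

```python
from wsgiref.headers import Headers


def headers_from_environ(environ: dict) -> Headers:
    """Collect the HTTP_* entries of a WSGI environ into a Headers object."""
    headers = Headers([])
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            # HTTP_X_HTTP_METHOD_OVERRIDE -> X-Http-Method-Override
            name = key[len("HTTP_"):].replace("_", "-").title()
            headers.add_header(name, value)
    return headers


environ = {
    "REQUEST_METHOD": "POST",
    "HTTP_X_HTTP_METHOD_OVERRIDE": "PATCH",
    "HTTP_ACCEPT_ENCODING": "identity",
}
headers = headers_from_environ(environ)

# Lookups ignore case, so every spelling finds the same header.
print(headers.get("X-HTTP-Method-Override"))  # PATCH
print(headers.get("x-http-method-override", environ["REQUEST_METHOD"]))  # PATCH
```

The stored name comes out as `X-Http-Method-Override`, matching the dump in the question, while the lookup Eve performs with `X-HTTP-Method-Override` still succeeds.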

sending http requests with specific/non-existent http version protocol in Python

Is there some way to send HTTP requests in Python with a specific HTTP protocol version? I think that, with httplib or urllib, it is not possible.
For example: GET / HTTP/6.9
Thanks in advance.
The simple answer to your question is: You're right, neither httplib nor urllib has public, built-in functionality to do this. (Also, you really shouldn't be using urllib for most things, in particular for urlopen.)
Of course you can always rely on implementation details of those modules, as in Lukas Graf's answer.
Or, alternatively, you could fork one of those modules and modify it, which guarantees that your code will work on other Python 2.x implementations.* Note that httplib is one of those modules that has a link to the source up at the top, which means it's meant to serve as example code, not just as a black-box library.
Or you could just reimplement the lowest-level function that needs to be hooked but that's publicly documented. For httplib, I believe that's httplib.HTTPConnection.putrequest, which is a few hundred lines long.
Or you could pick a different library that has more hooks in it, so you have less to hook.
But really, if you're trying to craft a custom request to manually fingerprint the results, why are you using an HTTP library at all? Why not just do this?
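import socket
from contextlib import closing
from functools import partial
msg = 'GET / HTTP/6.9\r\n\r\n'
s = socket.create_connection((host, 80))
with closing(s):
    s.send(msg)
    # read until the server closes the connection (Python 2 str semantics)
    buf = ''.join(iter(partial(s.recv, 4096), ''))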
* That's not much of a benefit, given that there will never be a 2.8, all of the existing major 2.7 implementations share the same source for this module, and it's not likely any new 2.x implementation will be any different. And if you go to 3.x, httplib has been reorganized and renamed, while urllib has been removed entirely, so you'll already have bigger changes to worry about.
You can do it easily enough by subclassing httplib.HTTPConnection and redefining the class attribute _http_vsn_str:
from httplib import HTTPConnection
class MyHTTPConnection(HTTPConnection):
    # used verbatim as the version token of the request line (default: 'HTTP/1.1')
    _http_vsn_str = 'HTTP/6.9'
conn = MyHTTPConnection("www.stackoverflow.com")
conn.request("GET", "/")
response = conn.getresponse()
print "Status: {} {}".format(response.status, response.reason)
print "Headers: {}".format(response.getheaders())
print "Body: {}".format(response.read())
Of course this will result in a 400 Bad Request for most servers:
Status: 400 Bad Request
Headers: [('date', 'Tue, 11 Nov 2014 21:21:12 GMT'), ('connection', 'close'), ('content-type', 'text/html; charset=us-ascii'), ('content-length', '311')]
Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN""http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>Bad Request</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=us-ascii"></HEAD>
<BODY><h2>Bad Request</h2>
<hr><p>HTTP Error 400. The request is badly formed.</p>
</BODY></HTML>
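For Python 3, where httplib became http.client, the same trick might look like the sketch below. It still relies on the private `_http_vsn_str` attribute, which is an implementation detail and could change between versions; the class name, helper name and timeout are assumptions.

```python
import http.client


class OddVersionConnection(http.client.HTTPConnection):
    # Private attribute that putrequest() uses verbatim as the version token
    # of the request line; not part of the documented API.
    _http_vsn_str = "HTTP/6.9"


def fetch_status(host: str, path: str = "/", timeout: float = 10.0):
    """Send a GET with the odd version string and return (status, reason)."""
    conn = OddVersionConnection(host, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()
```

As with the Python 2 version, most servers can be expected to answer such a request with 400 Bad Request.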
Choosing a specific (existing) HTTP version is possible with pycurl by setting this option:
c.setopt(pycurl.HTTP_VERSION, pycurl.CURL_HTTP_VERSION_1_0)
However, you need to use Linux or macOS, since pycurl is not officially supported on Windows.
