I cannot for the life of me figure out how to perform an HTTP PUT request with verbatim binary data in Python 2.7 with the standard Python libraries.
I thought I could do it with urllib2, but that fails because urllib2.Request expects its data in application/x-www-form-urlencoded format. I do not want to encode the binary data, I just want to transmit it verbatim, after the headers that include
Content-Type: application/octet-stream
Content-Length: (whatever my binary data length is)
This seems so simple, but I keep going round in circles and can't seem to figure out how.
How can I do this? (aside from opening a raw binary socket and writing to it)
I found my problem. It seems there is some obscure behavior in urllib2.Request / urllib2.urlopen() (at least in Python 2.7).
The urllib2.Request(url, data, headers) constructor seems to expect the same type of string in its url and data parameters.
I was giving the data parameter raw data from a file read() call (which in Python 2.7 returns a 'plain' byte string), but my url was accidentally Unicode because I concatenated part of the URL from the result of another function that returned Unicode strings.
Rather than "downcasting" url from Unicode to a plain string, it tried to "upcast" the data parameter to Unicode, which gave me a codec error. (Oddly enough, this happens on the urllib2.urlopen() call, not in the urllib2.Request constructor.)
When I changed my function call to
# headers contains `{'Content-Type': 'application/octet-stream'}`
r = urllib2.Request(url.encode('utf-8'), data, headers)
it worked fine.
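For reference, a minimal sketch of the failure mode and the fix described above (the URL and payload file are made-up placeholders standing in for my real ones):
import urllib2

# url ends up Unicode because part of it came from a function returning Unicode strings
url = u'http://example.net/put/' + u'some-id'
data = open('payload.bin', 'rb').read()   # plain byte string in Python 2.7
headers = {'Content-Type': 'application/octet-stream'}

# With a Unicode url and non-ASCII bytes in data, urllib2.urlopen() tries to
# coerce data to Unicode and raises a codec error (as described above).
# Encoding the url back to a plain byte string avoids the coercion:
r = urllib2.Request(url.encode('utf-8'), data, headers)
resp = urllib2.urlopen(r)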
You're misreading the documentation: urllib2.Request expects the data already encoded, and for POST that usually means the application/x-www-form-urlencoded format. You are free to send any other data, including binary, like this:
import urllib2
data = b'binary-data'
r = urllib2.Request('http://example.net/put', data,
{'Content-Type': 'application/octet-stream'})
r.get_method = lambda: 'PUT'
urllib2.urlopen(r)
This will produce the request you want:
PUT /put HTTP/1.1
Accept-Encoding: identity
Content-Length: 11
Host: example.net
Content-Type: application/octet-stream
Connection: close
User-Agent: Python-urllib/2.7
binary-data
Have you considered/tried using httplib?
HTTPConnection.request(method, url[, body[, headers]])
This will send a request to the server using the HTTP request method
method and the selector url. If the body argument is present, it
should be a string of data to send after the headers are finished.
Alternatively, it may be an open file object, in which case the
contents of the file is sent; this file object should support fileno()
and read() methods. The header Content-Length is automatically set to
the correct value. The headers argument should be a mapping of extra
HTTP headers to send with the request.
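As a rough sketch of how that could look for the PUT in question (the host and path are placeholders, not tested against a real server):
import httplib

conn = httplib.HTTPConnection('example.net')
# Passing an open file object lets httplib compute Content-Length from fileno()
body = open('payload.bin', 'rb')
conn.request('PUT', '/put', body, {'Content-Type': 'application/octet-stream'})
response = conn.getresponse()
print response.status, response.reason
body.close()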
This snippet worked for me to PUT an image to an HTTPS site. If you don't need HTTPS, use httplib.HTTPConnection(URL) instead.
import httplib
import ssl

API_URL = "api-mysight.com"
TOKEN = "myDummyToken"
IMAGE_FILE = "myimage.jpg"
imageID = "myImageID"
URL_PATH_2_USE = "/My/image/" + imageID + "?objectId=AAA"
headers = {"Content-Type": "application/octet-stream", "X-Access-Token": TOKEN}
imgData = open(IMAGE_FILE, "rb")   # open file object; httplib sets Content-Length for it
REQUEST = "PUT"

conn = httplib.HTTPSConnection(API_URL, context=ssl.SSLContext(ssl.PROTOCOL_TLSv1))
conn.request(REQUEST, URL_PATH_2_USE, imgData, headers)
response = conn.getresponse()
result = response.read()
I am trying to write a Python script containing a somewhat unusual HTTP request as part of learning about web attacks and solving the lab at
https://portswigger.net/web-security/request-smuggling/lab-basic-cl-te.
There, I need to issue a request containing both a Content-Length and a Transfer-Encoding header that are in disagreement.
My basic and still unmanipulated request looks like this and works as expected:
with requests.Session() as client:
    client.verify = False
    client.proxies = proxies
    [...]
    data = '0\r\n\r\nX'
    req = requests.Request('POST', host, data=data)
    prep = client.prepare_request(req)
    client.send(prep)
[...]
Content-Length: 6\r\n
\r\n
0\r\n
\r\n
X
However, as soon as I add the Transfer-Encoding header, the request itself gets modified.
data = '0\r\n\r\nX'
req = requests.Request('POST', host, data=data)
prep = client.prepare_request(req)
prep.headers['Transfer-Encoding'] = 'chunked'
client.send(prep)
The request that is actually sent down the wire is
[...]
Content-Length: 0\r\n
\r\n
whereas the expected request would be
[...]
Content-Length: 6\r\n
Transfer-Encoding: chunked\r\n
\r\n
0\r\n
\r\n
X
The same thing happens if I flip things around, prepare a chunked request and modify the Content-Length header afterwards:
def gen():
    yield b'0\r\n'
    yield b'\r\n'
    yield b'X'
req = requests.Request('POST', host, data=gen())
prep = client.prepare_request(req)
prep.headers['Content-Length'] = '6'
client.send(prep)
Basically, the Transfer-Encoding header gets removed completely, the data is reinterpreted according to the chunking, and the Content-Length header gets recalculated to match.
I was under the impression that preparing a request and manipulating its content before sending would send the modified content, but either that assumption is wrong or I am doing something horribly wrong.
Is sending such a request possible this way, or do I have to go to a lower level to put arbitrary data on the wire?
requests is a good HTTP client, and as such it will prevent you from generating bad HTTP queries, since malformed queries would result in 400 errors in a lot of cases.
To generate syntax errors in HTTP queries you need to avoid using high-level HTTP clients (like a browser, but also an HTTP library). Instead you need to go down to TCP/IP socket management (and maybe SSL as well) and write the full HTTP protocol with your own code, no library.
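As a rough illustration of that lower-level approach, here is a sketch that writes the conflicting headers by hand over a TLS socket (the host name is a placeholder for your lab instance; adjust the path and body to what the lab expects):
import socket
import ssl

host = 'YOUR-LAB-ID.web-security-academy.net'   # placeholder
body = '0\r\n\r\nX'
request = (
    'POST / HTTP/1.1\r\n'
    'Host: ' + host + '\r\n'
    'Content-Type: application/x-www-form-urlencoded\r\n'
    'Content-Length: ' + str(len(body)) + '\r\n'
    'Transfer-Encoding: chunked\r\n'
    'Connection: close\r\n'
    '\r\n'
    + body
)

raw_sock = socket.create_connection((host, 443))
tls_sock = ssl.create_default_context().wrap_socket(raw_sock, server_hostname=host)
tls_sock.sendall(request.encode('ascii'))
print(tls_sock.recv(4096))   # first chunk of the raw response
tls_sock.close()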
I am trying to convert this curl command in python requests, but I am unsure about how I should pass the data:
curl -X POST -u "apikey:{apikey}" \
--header "Content-Type: text/plain" \
--data "some plain text data" \
"{url}"
I have tried passing the string directly and encoding it with str.encode('utf-8'), but I get an error 415 Unsupported Media Type.
This is my code:
text = "some random text"
resp = requests.post(url, data=text, headers={'Content-Type': 'text/plain'}, auth=('apikey', self.apikey))
When using the requests library, it is usually a good idea not to set the Content-Type header manually via the headers= keyword.
requests will set this header for you when it is needed (for example, posting JSON will always result in a Content-Type: application/json header).
Another reason for not setting this type of header manually is encoding, because sometimes you should specify something like Content-Type: text/plain; charset=utf-8.
One more important thing about Content-Type is that this header is not required for making POST requests. From RFC 2616:
Any HTTP/1.1 message containing an entity-body SHOULD include a
Content-Type header field defining the media type of that body. If
and only if the media type is not given by a Content-Type field, the
recipient MAY attempt to guess the media type via inspection of its
content and/or the name extension(s) of the URI used to identify the
resource. If the media type remains unknown, the recipient SHOULD
treat it as type "application/octet-stream".
So depending on the type of server you're making the request to, this header may be left out entirely.
Sorry for this explanation being a bit vague; I cannot give you an exact explanation of why this approach worked for you unless you provide the target URL.
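For concreteness, a sketch of two variants worth trying (the URL and API key are placeholders; which one the server accepts depends on the server):
import requests

url = "https://example.com/endpoint"   # placeholder
apikey = "my-api-key"                  # placeholder
text = "some plain text data"

# Variant 1: send the encoded body and let the server fall back to a default type
resp = requests.post(url, data=text.encode('utf-8'), auth=('apikey', apikey))

# Variant 2: declare the charset explicitly, matching the encoded body
resp = requests.post(url,
                     data=text.encode('utf-8'),
                     headers={'Content-Type': 'text/plain; charset=utf-8'},
                     auth=('apikey', apikey))
print(resp.status_code)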
I'm having an issue trying to decode a python dictionary sent to my server as json.
This is what I have in my application:
payload = {'status':[bathroomId,item,percentage,timestamp]}
r=requests.post(url,None,json.dumps(payload))
Here is what I do in my Flask server:
req = request.get_json()
print req['status']
When I try to print the content of req['status'], it seems like Python won't recognize it as a dictionary, and I get an internal server error.
I tried printing req, and I get None
What am I missing?
Unless you set the Content-Type header to application/json in your request, Flask will not attempt to decode any JSON found in your request body.
Instead, get_json will return None (which is what you're seeing right now).
So, you need to set the Content-Type header in your request.
Fortunately since version 2.4.2 (released a year ago), requests has a helper argument to post JSON; this will set the proper headers for you. Use:
requests.post(url, json=payload)
Alternatively (e.g. using requests < 2.4.2), you can set the header yourself:
requests.post(url, data=json.dumps(payload), headers={"Content-Type": "application/json"})
Here is how the relevant code in Flask behaves:
Flask only loads JSON if is_json is True (or if you pass force=True to get_json). Otherwise, it returns None.
is_json looks at the Content-Type header, and looks for application/json there.
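Paraphrased as a standalone sketch (this is not the exact Flask source, just the gist of the check):
import json

def is_json(mimetype):
    # Flask treats 'application/json' and 'application/*+json' as JSON
    return (mimetype == 'application/json'
            or (mimetype.startswith('application/') and mimetype.endswith('+json')))

def get_json(mimetype, body, force=False):
    if not (force or is_json(mimetype)):
        return None                     # this is the None you are seeing
    return json.loads(body)

print(get_json('text/plain', '{"status": [1, 2]}'))           # None
print(get_json('application/json', '{"status": [1, 2]}'))     # parsed dict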
I'm working on an API wrapper. The spec I'm trying to build to has the following request in it:
curl -H "Content-type:application/json" -X POST -d data='{"name":"Partner13", "email":"example#example.com"}' http://localhost:5000/
This request produces the following response from a little test server I set up to see exactly what headers/params etc. are sent. This little script produces:
uri: http://localhost:5000/,
method: POST,
api_key: None,
content_type: application/json,
params: None,
data: data={"name":"Partner13", "email":"example#example.com"}
So the above is the result I want my Python script to produce when it hits the little test script.
I'm using the Python requests module, which is the most beautiful HTTP lib I have ever used. So here is my Python code:
uri = "http://localhost:5000/"
headers = {'content-type': 'application/json' }
params = {}
data = {"name":"Partner13", "email":"example#exmaple.com"}
params["data"] = json.dumps(data)
r = requests.post(uri, data=params, headers=headers)
Simple enough stuff: set the headers, and create a dictionary for the POST parameters. That dictionary has one entry called "data", which is the JSON string of the data I want to send to the server. Then I call post. However, the result my little test script gives back is:
uri: http://localhost:5000/,
method: POST,
api_key: None,
content_type: application/json,
params: None,
data: data=%7B%22name%22%3A+%22Partner13%22%2C+%22email%22%3A+%22example%40example.com%22%7D
So essentially the JSON data I wanted to send under the data parameter has been URL-encoded.
Does anyone know how to fix this? I have looked through the requests documentation and cannot seem to find a way to stop it from automatically URL-encoding the data being sent.
Thanks very much,
Kevin
When creating the object for the data keyword, simply assign the result of json.dumps(data) to a variable.
Also, because HTTP POST can accept both URL parameters and data in the body of the request, and because the requests.post function has a keyword argument named "params", it might be better to use a different variable name for readability. The requests docs use the variable name "payload", so that's what I use.
data = {"name":"Partner13", "email":"example#exmaple.com"}
payload = json.dumps(data)
r = requests.post(uri, data=payload, headers=headers)
Requests automatically URL-encodes dictionaries passed as data here. John_GG's solution works because, rather than posting a dictionary containing the JSON-encoded string in the 'data' field, it simply passes the JSON-encoded string directly: strings are not automatically encoded. I can't say I understand the reason for this behaviour in Requests, but regardless, it is what it is. There is no way to toggle this behaviour off that I can find.
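To illustrate the difference, a small sketch using the URL from the question:
import json
import requests

payload = {"name": "Partner13", "email": "example@example.com"}

# A dict passed as data is form-encoded by requests (application/x-www-form-urlencoded)
requests.post("http://localhost:5000/", data=payload)

# A string passed as data is sent verbatim, so the JSON survives untouched
requests.post("http://localhost:5000/",
              data=json.dumps(payload),
              headers={"Content-Type": "application/json"})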
Best of luck with it, Kevin.
I'm having some trouble with twisted.web.client.Agent...
I think the string data in my POST request isn't being formatted properly. I'm trying to do something analogous to this synchronous code:
from urllib import urlencode
import urllib2
page = 'http://example.com/'
id_string = 'this:is,my:id:string'
req = urllib2.Request(page, data=urlencode({'id': id_string})) # urlencode call returns 'id=this%3Ais%2Cmy%3Aid%3Astring'
resp = urllib2.urlopen(req)
Here's how I'm building my Agent request as of right now:
from urllib import urlencode
from StringIO import StringIO

from twisted.internet import reactor
from twisted.web.client import Agent, FileBodyProducer
from twisted.web.http_headers import Headers

agent = Agent(reactor)
page = 'http://example.com/'
id_string = 'my:id_string'
head = {'User-Agent': ['user agent goes here']}
data = urlencode({'id': id_string})
request = agent.request('POST', page, Headers(head), FileBodyProducer(StringIO(data)))
request.addCallback(foo)
Because of the HTTP response I'm getting (a null JSON string), I'm beginning to suspect that the id is not being properly encoded in the POST request, but I'm not sure what I can do about it. Is using urlencode with the Agent.request call valid? Is there another way I should be encoding these things?
EDIT: Some kind folks on IRC have suggested that the problem may stem from the fact that I didn't send the header information that indicates the data is URL-encoded. I know remarkably little about this kind of stuff... Can anybody point me in the right direction?
As requested, here's my comment in the form of an answer:
HTTP requests with bodies should have the Content-Type header set (to tell the server how to interpret the bytes in the body); in this case, it seems the server is expecting URL-encoded data, like a web browser would send when a form is filled out.
urllib2.Request apparently defaults the content type for you, but the Twisted library seems to need it to be set manually. In this case, you want a content type of application/x-www-form-urlencoded.
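A sketch of what that looks like with the Agent code from the question (the callback here is just a placeholder that prints the response code, standing in for foo):
from urllib import urlencode
from StringIO import StringIO

from twisted.internet import reactor
from twisted.web.client import Agent, FileBodyProducer
from twisted.web.http_headers import Headers

def print_response(response):
    # placeholder callback standing in for `foo` from the question
    print 'Response code:', response.code
    reactor.stop()
    return response

agent = Agent(reactor)
page = 'http://example.com/'
data = urlencode({'id': 'my:id_string'})
headers = Headers({
    'User-Agent': ['user agent goes here'],
    'Content-Type': ['application/x-www-form-urlencoded'],
})
d = agent.request('POST', page, headers, FileBodyProducer(StringIO(data)))
d.addCallback(print_response)
reactor.run()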