Create and parse multipart HTTP requests in Python

I'm trying to write some Python code which can create multipart MIME HTTP requests in the client, and then appropriately interpret them on the server. I have, I think, partially succeeded on the client end with this:
from email.mime.multipart import MIMEMultipart, MIMEBase
import httplib
h1 = httplib.HTTPConnection('localhost:8080')
msg = MIMEMultipart()
fp = open('myfile.zip', 'rb')
base = MIMEBase("application", "octet-stream")
base.set_payload(fp.read())
msg.attach(base)
h1.request("POST", "http://localhost:8080/server", msg.as_string())
The only problem with this is that the email library also includes the Content-Type and MIME-Version headers, and I'm not sure how they're going to be related to the HTTP headers included by httplib:
Content-Type: multipart/mixed; boundary="===============2050792481=="
MIME-Version: 1.0
--===============2050792481==
Content-Type: application/octet-stream
MIME-Version: 1.0
This may be the reason that when this request is received by my web.py application, I just get an error message. The web.py POST handler:
class MultipartServer:
    def POST(self, collection):
        print web.input()
Throws this error:
Traceback (most recent call last):
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 242, in process
return self.handle()
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 233, in handle
return self._delegate(fn, self.fvars, args)
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 415, in _delegate
return handle_class(cls)
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 390, in handle_class
return tocall(*args)
File "/home/richard/Development/server/webservice.py", line 31, in POST
print web.input()
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/webapi.py", line 279, in input
return storify(out, *requireds, **defaults)
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 150, in storify
value = getvalue(value)
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 139, in getvalue
return unicodify(x)
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 130, in unicodify
if _unicode and isinstance(s, str): return safeunicode(s)
File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 326, in safeunicode
return obj.decode(encoding)
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 137-138: invalid data
My line of code is represented by the error line about half way down:
File "/home/richard/Development/server/webservice.py", line 31, in POST
print web.input()
It's coming along, but I'm not sure where to go from here. Is this a problem with my client code, or a limitation of web.py (perhaps it just can't support multipart requests)? Any hints or suggestions of alternative code libraries would be gratefully received.
EDIT
The error above was caused by the data not being automatically base64 encoded. Adding
encoders.encode_base64(base)
Gets rid of this error, and now the problem is clear. The HTTP request isn't being interpreted correctly by the server, presumably because the email library is putting what should be HTTP headers in the body instead:
<Storage {'Content-Type: multipart/mixed': u'',
' boundary': u'"===============1342637378=="\n'
'MIME-Version: 1.0\n\n--===============1342637378==\n'
'Content-Type: application/octet-stream\n'
'MIME-Version: 1.0\n'
'Content-Transfer-Encoding: base64\n'
'\n0fINCs PBk1jAAAAAAAAA.... etc
So something is not right there.
Thanks
Richard

I used this package by Will Holcomb, http://pypi.python.org/pypi/MultipartPostHandler/0.1.0, to make multipart requests with urllib2; it may help you out.

After a bit of exploration, the answer to this question has become clear. The short answer is that although the Content-Disposition header is optional in a MIME-encoded message, web.py requires it on each MIME part in order to correctly parse the HTTP request.
Contrary to other comments on this question, the difference between HTTP and email is irrelevant, as they are simply transport mechanisms for the MIME message and nothing more. Multipart/related (not multipart/form-data) messages are common in content-exchanging web services, which is the use case here. The code snippets provided are accurate, though, and led me to a slightly briefer solution to the problem.
import httplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart

# open an HTTP connection
h1 = httplib.HTTPConnection('localhost:8080')
# create a mime multipart message of type multipart/related
msg = MIMEMultipart("related")
# create a mime-part containing a zip file, with a Content-Disposition header
# on the section
fp = open('file.zip', 'rb')
base = MIMEBase("application", "zip")
base['Content-Disposition'] = 'file; name="package"; filename="file.zip"'
base.set_payload(fp.read())
fp.close()
encoders.encode_base64(base)
msg.attach(base)
# Here's a rubbish bit: chomp through the header rows, until hitting a newline on
# its own, and read each string on the way as an HTTP header, and reading the rest
# of the message into a new variable
header_mode = True
headers = {}
body = []
for line in msg.as_string().splitlines(True):
    if line == "\n" and header_mode:
        header_mode = False
    if header_mode:
        key, value = line.split(":", 1)
        headers[key.strip()] = value.strip()
    else:
        body.append(line)
body = "".join(body)
# do the request, with the separated headers and body
h1.request("POST", "http://localhost:8080/server", body, headers)
This is picked up perfectly well by web.py, so it's clear that email.mime.multipart is suitable for creating Mime messages to be transported by HTTP, with the exception of its header handling.
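The header-chomping loop above could probably be shortened. A minimal sketch, assuming Python 3 (where httplib became http.client) and the email package's default "\n" line endings; split_mime_message is a hypothetical helper name, and MIMEApplication is swapped in for MIMEBase plus encode_base64 because it base64-encodes its payload by default:

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def split_mime_message(msg):
    """Return (headers, body) so the MIME headers can be sent as HTTP headers."""
    # Serializing first makes the email package generate the boundary and
    # record it in the message's Content-Type header.
    raw = msg.as_bytes()
    # The first blank line separates the top-level headers from the body.
    _, _, body = raw.partition(b"\n\n")
    headers = dict(msg.items())
    return headers, body


msg = MIMEMultipart("related")
# Placeholder bytes stand in for the contents of file.zip.
part = MIMEApplication(b"PK\x03\x04 example bytes", _subtype="zip")
part.add_header("Content-Disposition", "attachment", name="package", filename="file.zip")
msg.attach(part)

headers, body = split_mime_message(msg)
```

The resulting headers and body could then be passed to http.client.HTTPConnection.request("POST", path, body, headers) in the same way as the httplib call above.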
My other overall concern is scalability. Neither this solution nor the others proposed here scale well, as they read the whole contents of a file into a variable before bundling it up in the MIME message. A better solution would be one which could serialise on demand as the content is piped out over the HTTP connection. It's not urgent for me to fix that, but I'll come back here with a solution if I get to it.
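One way the "serialise on demand" idea might look is a generator that yields the body piece by piece. This is only a sketch under assumptions: it targets Python 3, it writes a single multipart/form-data part with raw (not base64) bytes rather than the multipart/related message used above, and iter_multipart, the field name and the chunk size are all made up for illustration:

```python
import io
import uuid


def iter_multipart(fileobj, field_name, filename, boundary, chunk_size=64 * 1024):
    """Yield a one-part multipart/form-data body without loading the file into memory."""
    yield (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    # Stream the file in fixed-size chunks so memory use stays bounded.
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
    # Closing delimiter that ends the multipart body.
    yield f"\r\n--{boundary}--\r\n".encode("utf-8")


boundary = uuid.uuid4().hex
payload = io.BytesIO(b"x" * 200_000)  # stands in for open("file.zip", "rb")
chunks = list(iter_multipart(payload, "package", "file.zip", boundary))
```

In Python 3, http.client accepts an iterable of bytes as the request body, so the generator could be passed to request() directly together with a Content-Type header naming the same boundary; here the chunks are only collected into a list to keep the sketch self-contained.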

There are a number of things wrong with your request. As TokenMacGuy suggests, multipart/mixed is unused in HTTP; use multipart/form-data instead. In addition, parts should have a Content-Disposition header. A Python fragment to do that can be found in the Code Recipes.
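For the parsing side, the following sketch shows what a multipart/form-data body with a Content-Disposition header looks like on the wire and one way the standard library can read it back. It assumes Python 3; parse_multipart, the boundary "XYZ" and the sample payload are invented for illustration, and this is not how web.py itself parses requests:

```python
from email import message_from_bytes


def parse_multipart(content_type, body):
    """Parse a multipart HTTP body into (name, filename, payload bytes) tuples."""
    # The email parser expects headers in front of the body, so the HTTP
    # Content-Type header (which carries the boundary) is prepended.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    msg = message_from_bytes(prefix + body)
    if not msg.is_multipart():
        raise ValueError("body is not a well-formed multipart message")
    parts = []
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        parts.append((name, part.get_filename(), part.get_payload(decode=True)))
    return parts


body = (
    b"--XYZ\r\n"
    b'Content-Disposition: form-data; name="package"; filename="file.zip"\r\n'
    b"Content-Type: application/zip\r\n"
    b"\r\n"
    b"zip-bytes\r\n"
    b"--XYZ--\r\n"
)
parts = parse_multipart('multipart/form-data; boundary="XYZ"', body)
```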

Related

How do I get my JSON decoder to work properly?

I was working on API testing, and I tried everything but it would not print the JSON response as a string. I was wondering if it was the website I was testing the API requests against, as I kept getting a 406 error. I even tried taking code from online that shows how to do this, but it still would not print and it would give the error listed below. Here is the code I used and the response PyCharm's console gave me.
import json
import requests
res = requests.get("http://dummy.restapiexample.com/api/v1/employees")
data = json.loads(res.text)
data = json.dumps(data)
print(data)
print(type(data))
Traceback (most recent call last):
File "C:/Users/johnc/PycharmProjects/API_testing/api_testing.py", line 8, in <module>
data = json.loads(res.text)
File "D:\Program Files (x86)\lib\json\__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "D:\Program Files (x86)\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "D:\Program Files (x86)\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

REST APIs vary widely in the types of requests they will accept. A 406 (Not Acceptable) means the server could not produce a response in a format that matches what the request said it would accept. You should include a User-Agent, because APIs are frequently tweaked to deal with the foibles of various HTTP clients, and explicitly list the output format you want. Adding acceptable encodings lets the API compress data. Specifying a charset is a good idea. You could even add a language request, but most APIs don't care.
import json
import requests
headers = {"Accept": "application/json",
           "User-agent": "Mozilla/5.0",
           "Accept-Charset": "utf-8",
           "Accept-Encoding": "gzip, deflate",
           "Accept-Language": "en-US"}  # or your favorite language
res = requests.get("http://dummy.restapiexample.com/api/v1/employees", headers=headers)
data = json.loads(res.text)
data = json.dumps(data)
print(data)
print(type(data))
The thing about REST APIs is that they may ignore some or all of the headers and return what they please. It's a good idea to form the request properly anyway.
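Because a server may answer with an error page instead of JSON, it can help to check the response before decoding it. A small sketch, where decode_json_body is a hypothetical helper (not part of requests) that takes the status code, Content-Type header and body text as plain values:

```python
import json


def decode_json_body(status_code, content_type, text):
    """Return the parsed JSON, or raise ValueError explaining why the body is unusable."""
    if status_code != 200:
        raise ValueError(f"unexpected HTTP status {status_code}")
    if "json" not in content_type.lower():
        raise ValueError(f"unexpected Content-Type {content_type!r}")
    return json.loads(text)


data = decode_json_body(200, "application/json; charset=utf-8", '{"status": "success"}')
```

With requests, the three arguments would come from res.status_code, res.headers.get("Content-Type", "") and res.text, so a 406 shows up as a clear error rather than a JSONDecodeError at char 0.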

The default Python User-Agent was probably being blocked by the hosting company.
You can set any string, or look up a real device's User-Agent string.
res = requests.get("http://dummy.restapiexample.com/api/v1/employees", headers={"User-Agent": "XY"})

It's you, your connection or a proxy. Things work just fine for me.
>>> import requests
>>> res = requests.get("http://dummy.restapiexample.com/api/v1/employees")
>>> res.raise_for_status() # would raise if status != 200
>>> print(res.json()) # `res.json()` is the canonical way to extract JSON from Requests
{'status': 'success', 'data': [{'id': '1', 'employee_name': 'Tiger Nixon', 'employee_salary': '320800', ...

Why can response.content be read twice and can't be decoded to json

I found a strange behavior today.
I sent a message in google cloud messaging via the python requests lib.
Then I tried to decode the response to json like this:
response = requests.post(Message_Broker.host, data=json.dumps(payload), headers=headers)
response_results = json.loads(response.content)["results"]
This crashed with a decode error:
response_results = json.loads(response.content)["results"]
File "/usr/local/lib/python2.7/dist-packages/simplejson/__init__.py", line 505, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python2.7/dist-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/usr/local/lib/python2.7/dist-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
This happened on my production system, so I added some debug logging to find out what the actual content of the response is, like this:
logger.info("GCM-Response: " + str(response))
logger.info("GCM-Response: " + response.content)
logger.info("GCM-Response: " + str(response.headers))
Now the actual weird behavior occurred. It got logged correctly and didn't throw the decoding error anymore.
Can someone explain that behavior to me?
I have also checked what response.content actually is:
@property
def content(self):
    """Content of the response, in bytes."""
    if self._content is False:
        # Read the contents.
        try:
            if self._content_consumed:
                raise RuntimeError(
                    'The content for this response was already consumed')
            if self.status_code == 0:
                self._content = None
            else:
                self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
        except AttributeError:
            self._content = None
    self._content_consumed = True
    # don't need to release the connection; that's been handled by urllib3
    # since we exhausted the data.
    return self._content
It is part of the requests models. It is not an actual attribute but is made accessible via the @property decorator.
As I understand it, the first time the content is read (for the logging), the _content_consumed flag is set to True. Therefore the second time, when I read it for the JSON decoding, it should actually raise the RuntimeError.
Is there an explanation that I just haven't found when browsing the requests docs?

Therefore the second time, when I read it for the json decoding it should actually raise the Runtime Error.
No, it won't raise a RuntimeError. When you access response.content the first time, it caches the actual data in self._content. On the second (third, fourth, etc.) access, the check if self._content is False: fails, so you get the content cached in self._content.
The if self._content_consumed: check is most likely an internal assertion to catch attempts to read data from the socket multiple times (which is obviously an error).
It can't be decoded as JSON because the response body was either not JSON or empty. Maybe it was a 500 or 429 response. It's impossible to say without seeing the actual response.
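The caching behaviour described above can be illustrated with a toy class. This is a simplified stand-in written for this explanation, not the requests implementation; CachedBody and its reads counter are invented names:

```python
class CachedBody:
    """Toy stand-in for the caching pattern used by the content property."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)  # pretend this is the socket
        self._content = False        # False means "not read yet"
        self.reads = 0               # how many times the stream was drained

    @property
    def content(self):
        if self._content is False:
            # Only the first access touches the underlying stream.
            self.reads += 1
            self._content = b"".join(self._chunks)
        return self._content


body = CachedBody([b'{"results"', b": []}"])
first = body.content
second = body.content
```

Both accesses return the same bytes and the stream is drained only once, which is why reading the content for logging does not prevent reading it again for JSON decoding.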

Return response code python HTTP header

I have a Python server serving CGI scripts, and I want to add a status code to my response. I did:
try:
    cgi = CGI()
    output = cgi.fire()
    print 'Content-Type text/json'
    print 'Status:200 success'
    print
    print json.dumps(output)
except:
    print 'Content-Type: text/json'
    print 'Status: 403 Forbidden'
    print
    print json.dumps({'msg':'error'})
But when I request this script via a dojo xhr request, I get a 200 request status. Why is that?
Header
Request URL:http://192.168.2.72:8080/cgi-bin/cgi.py
Request Method:POST
Status Code:200 Script output follows
Request Headers
Accept:*/*
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:no-cache
Connection:keep-alive
Content-Length:125
Content-Type:application/x-www-form-urlencoded
Host:192.168.2.72:7999
Origin:http://192.168.2.72:7999
Pragma:no-cache
Referer:http://192.168.2.72:7999/home.html
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.22 (KHTML, like Gecko) Ubuntu Chromium/25.0.1364.160 Chrome/25.0.1364.160 Safari/537.22
X-Requested-With:XMLHttpRequest
Form Data
Response Headers
Content-Type:text/json
Date:Fri, 08 Aug 2014 05:16:29 GMT
Server:SimpleHTTP/0.6 Python/2.7.3
Status:403 Forbidden
Any input?
What I have already tried:
result.ioArgs.xhr.getAllResponseHeaders() // returns string
ioargs.xhr.status // returns the request status.

If json.dumps(output) raises an exception, you will have already printed your headers including status code (generally would be spelled as Status: 200 OK) and a blank line to end the header section of the HTTP response.
Then, the except block will print a second set of headers, but those are actually considered part of the body of the response at that point because printing an empty line ended the headers. See the HTTP message spec.
The solution is to wait until you know what your output is going to be to print any headers.
-more-
json.dumps can raise exceptions if you give it input that is not serializable. And given that cgi.fire() appears to be a method of some custom CGI object (builtin cgi module doesn't have that method) it could be returning anything.
To debug you need to log what exception is being raised, preferably with traceback. The bare except: block you have will catch all errors and then do nothing with them, so you don't know what's going on, nor does anyone looking at the question. You might also need to log the value of output.
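The "wait until you know your output" advice could be sketched like this. It is written for Python 3 (the question uses Python 2 print statements), build_cgi_response is a hypothetical helper, and it only helps under a server that actually honours the CGI Status header:

```python
import json


def build_cgi_response(produce_output):
    """Return the full CGI response text, choosing the status only after serializing."""
    try:
        body = json.dumps(produce_output())
        status = "200 OK"
    except Exception:
        # Serialization (or the handler itself) failed before anything was emitted.
        body = json.dumps({"msg": "error"})
        status = "403 Forbidden"
    # Headers are emitted exactly once, followed by the blank line that ends them.
    return f"Content-Type: text/json\nStatus: {status}\n\n{body}\n"


ok = build_cgi_response(lambda: {"ok": True})
failed = build_cgi_response(lambda: object())  # object() is not JSON serializable
```

A script would then write the returned string to standard output in a single step, so a failure can never leave a second header block inside the body.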

To complement Jason S's answer: I reproduced exactly the same failure with an object that is not JSON serializable (in this example an md5 hash) and got the same behaviour as the original poster, a 200 return code.
#!/usr/bin/env python
import json
import traceback
class CGI:
    def fire(self):
        import md5
        return md5.md5()
try:
    cgi = CGI()
    output = cgi.fire()
    print 'Content-Type text/json'
    print 'Status:200 success'
    print
    print json.dumps(output)
except:
    traceback.print_exc()
    print 'Content-Type: text/json'
    print 'Status: 403 Forbidden'
    print
    print json.dumps({'msg':'error'})
interacting with the server
$ socat - TCP4:localhost:8000
input
GET /cgi-bin/test.py HTTP/1.0
output
HTTP/1.0 200 Script output follows
Server: SimpleHTTP/0.6 Python/3.4.0
Date: Sun, 17 Aug 2014 16:16:19 GMT
Content-Type text/json
Status:200 success
Content-Type: text/json
Status: 403 Forbidden
{"msg": "error"}
traceback:
127.0.0.1 - - [17/Aug/2014 18:16:19] "GET /cgi-bin/test.py HTTP/1.0" 200 -
Traceback (most recent call last):
File "/home/xcombelle/dev/test/cgi-bin/test.py", line 16, in <module>
print json.dumps(output)
File "/usr/lib/python2.7/json/__init__.py", line 231, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 200, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python2.7/json/encoder.py", line 263, in iterencode
return _iterencode(o, 0)
File "/usr/lib/python2.7/json/encoder.py", line 177, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <md5 HASH object @ 0x7fcb090d0a30> is not JSON serializable

Just do json.dumps() to a string before outputting your headers and you should be fine, no?
That will protect you from setting headers and then getting an exception, as an exception in print is unlikely.

For anyone coming across this in future: the reason for this is the Python http.server which was being used to serve the content. For some reason this is designed to send a 200 "Script output follows" status line before the CGI script starts running. This means that you can't change the status code returned from within your script (see the documentation for CGIHTTPRequestHandler).
This actually makes it a real pain to use when developing, as the errors don't propagate in the same way they would in production.

It looks to me like you are setting a Status: header field but you want to set Status-Code:.
Does your script really write Status Code:200 Script output follows as a header field?

Travelport Galileo python SoapClient

I need to develop a Python SOAP client for the Travelport Galileo uAPI.
These are 30-day trial credentials for the Travelport Universal API:
Universal API User ID: Universal API/uAPI2514620686-0edbb8e4
Universal API Password: D54HWfck9nRZNPbXmpzCGwc95
Branch Code for Galileo (1G): P7004130
URLs: https://emea.universal-api.pp.travelport.com/B2BGateway/connect/uAPI/
This is a quote from the Galileo documentation:
HTTP Header
The HTTP header includes:
SOAP endpoints, which vary by:
Geographical region.
Requested service. In the preceding example, the HotelService is used for the endpoint; however, the service name is modified based on the request transaction.
gzip compression, which is optional, but strongly recommended. To accept gzip compression in the response, specify “Accept-Encoding: gzip,deflate” in the header.
Authorization, which follows the standard basic authorization pattern.
The text that follows “Authorization: Basic” can be encoded using Base 64. This functionality is supported by most programming languages.
The syntax of the authorization credentials must include the prefix "Universal API/" before the User Name and Password assigned by Travelport.
POST https://americas.universal-api.pp.travelport.com/B2BGateway/connect/uAPI/HotelService HTTP/2.0
Accept-Encoding: gzip,deflate
Content-Type: text/xml;charset=UTF-8
SOAPAction: ""
Authorization: Basic UniversalAPI/UserName:Password
Content-Length: length
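The header rules quoted above might translate into something like the following sketch. It assumes Python 3 and uses placeholder credentials; build_uapi_headers is a hypothetical helper name, and the header set simply mirrors the documentation excerpt rather than anything verified against the live service:

```python
import base64


def build_uapi_headers(user_id, password):
    """Build HTTP headers following the quoted documentation excerpt."""
    # user_id is expected to already carry the "Universal API/" prefix.
    credentials = f"{user_id}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {
        "Accept-Encoding": "gzip,deflate",
        "Content-Type": "text/xml;charset=UTF-8",
        "SOAPAction": '""',
        "Authorization": f"Basic {token}",
    }


headers = build_uapi_headers("Universal API/UserName", "Password")
```

Note that gzip is requested through Accept-Encoding, as in the documentation, and that the whole "user:password" pair is Base64-encoded after the word Basic.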
This is my Python code:
import urllib2
import base64
import suds
class HTTPSudsPreprocessor(urllib2.BaseHandler):
    def http_request(self, req):
        message = \
        """
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:air="http://www.travelport.com/schema/air_v16_0" xmlns:com="http://www.travelport.com/schema/common_v13_0" -->
<soapenv:header>
<soapenv:body>
<air:availabilitysearchreq xmlns:air="http://www.travelport.com/schema/air_v16_0" xmlns:com="http://www.travelport.com/schema/common_v13_0" authorizedby="Test" targetbranch="P7004130">
<air:searchairleg>
<air:searchorigin>
<com:airport code="LHR">
</com:airport></air:searchorigin>
<air:searchdestination>
<com:airport code="JFK">
</com:airport></air:searchdestination>
<air:searchdeptime preferredtime="2011-11-08">
</air:searchdeptime></air:searchairleg>
</air:availabilitysearchreq>
</soapenv:body>
"""
        auth = base64.b64encode('Universal API/uAPI2514620686-0edbb8e4:D54HWfck9nRZNPbXmpzCGwc95')
        req.add_header('Content-Type', 'text/xml; charset=utf-8')
        req.add_header('Accept', 'gzip,deflate')
        req.add_header('Cache-Control','no-cache')
        req.add_header('Pragma', 'no-cache')
        req.add_header('SOAPAction', '')
        req.add_header('Authorization', 'Basic %s'%(auth))
        return req
    https_request = http_request
URL = "https://emea.universal-api.pp.travelport.com/B2BGateway/connect/uAPI/"
https = suds.transport.https.HttpTransport()
opener = urllib2.build_opener(HTTPSudsPreprocessor)
https.urlopener = opener
suds.client.Client(URL, transport = https)
But it is not working.
Traceback (most recent call last):
File "soap.py", line 42, in <module>
suds.client.Client(URL, transport = https)
File "/usr/local/lib/python2.7/site-packages/suds/client.py", line 112, in __init__
self.wsdl = reader.open(url)
File "/usr/local/lib/python2.7/site-packages/suds/reader.py", line 152, in open
d = self.fn(url, self.options)
File "/usr/local/lib/python2.7/site-packages/suds/wsdl.py", line 136, in __init__
d = reader.open(url)
File "/usr/local/lib/python2.7/site-packages/suds/reader.py", line 79, in open
d = self.download(url)
File "/usr/local/lib/python2.7/site-packages/suds/reader.py", line 95, in download
fp = self.options.transport.open(Request(url))
File "/usr/local/lib/python2.7/site-packages/suds/transport/http.py", line 64, in open
raise TransportError(str(e), e.code, e.fp)
suds.transport.TransportError: HTTP Error 500: Dynamic backend host not specified
I have been trying to solve this problem for the past 2 weeks, so if you can, please advise me on a solution.

I think you can try downloading the WSDL files as a ZIP archive from this URL: https://support.travelport.com/webhelp/uAPI/uAPI.htm#Getting_Started/Universal_API_Schemas_and_WSDLs.htm
You will then be able to generate your client classes from those WSDL files, because there is no WSDL endpoint at https://emea.universal-api.pp.travelport.com/B2BGateway/connect/uAPI/
(like ?wsdl or /.wsdl).

Python httplib ResponseNotReady

I'm writing a REST client for elgg using python, and even when the request succeeds, I get this in response:
Traceback (most recent call last):
File "testclient.py", line 94, in <module>
result = sendMessage(token, h1)
File "testclient.py", line 46, in sendMessage
res = h1.getresponse().read()
File "C:\Python25\lib\httplib.py", line 918, in getresponse
raise ResponseNotReady()
httplib.ResponseNotReady
Looking at the header, I see ('content-length', '5749'), so I know there is a page there, but I can't use .read() to see it because the exception comes up. What does ResponseNotReady mean and why can't I see the content that was returned?

Previous answers are correct, but there's another case where you could get that exception:
Making multiple requests without reading any intermediate responses completely.
For instance:
conn.request('PUT',...)
conn.request('GET',...)
# will not work: raises ResponseNotReady

# works: read each response before sending the next request
conn.request('PUT',...)
r = conn.getresponse()
r.read() # <-- that's the important call!
conn.request('GET',...)
r = conn.getresponse()
r.read() # <-- same thing
and so on.
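The effect can be reproduced locally. The sketch below assumes Python 3 (http.client instead of httplib) and starts a throwaway keep-alive server on a free loopback port; the Handler class and the request paths are made up for the demonstration:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)

# Reading each response fully before the next request works.
conn.request("GET", "/first")
first = conn.getresponse().read()
conn.request("GET", "/second")
second = conn.getresponse().read()

# Skipping read() leaves the previous response pending on the connection.
conn.request("GET", "/third")
unread = conn.getresponse()  # response obtained, body never read
conn.request("GET", "/fourth")
try:
    conn.getresponse()
    outcome = "ok"
except http.client.ResponseNotReady:
    outcome = "ResponseNotReady"

# Draining the pending body makes the connection usable again.
unread.read()
fourth = conn.getresponse().read()

conn.close()
server.shutdown()
server.server_close()
```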

Make sure you don't reuse the same object from a previous connection. You will hit this once the server keep-alive ends and the socket closes.

I was running into this same exception today, using this code:
conn = httplib.HTTPConnection(self._host, self._port)
conn.putrequest('GET',
'/retrieve?id={0}'.format(parsed_store_response['id']))
retr_response = conn.getresponse()
I didn't notice that I was using putrequest rather than request; I was mixing my interfaces. ResponseNotReady is raised because I haven't actually sent the request yet.
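This case can be shown without any server, since putrequest() only buffers the request line. A minimal sketch, assuming Python 3's http.client; the host, port and path are placeholders and nothing is actually sent:

```python
import http.client

# No connection is opened here: putrequest() only buffers the request line
# and the automatic Host and Accept-Encoding headers.
conn = http.client.HTTPConnection("localhost", 8080)
conn.putrequest("GET", "/retrieve?id=1")
try:
    conn.getresponse()
    outcome = "response received"
except http.client.ResponseNotReady:
    # The request was started but never finished with endheaders().
    outcome = "ResponseNotReady"
```

Either calling conn.endheaders() after putrequest(), or using conn.request() in the first place, completes the request so that getresponse() has something to wait for.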

Additionally, errors like this can occur when the server sends a response without a Content-Length header, which will cripple the state of the HTTP client if Keep-Alive is used and another request is sent over the same socket.

This can also occur if a firewall blocks the connection.

I am unable to comment on @Bokeh's answer, as I do not have the requisite reputation on this platform yet.
So, adding this as an answer: Bokeh's answer worked for me.
I was trying to send multiple requests sequentially over the same connection object. For a few of them I wanted to process the response later, and so failed to read the response.
From my experience, I second Bokeh's answer:
response.read() is a must after each request, whether or not you wish to process the response.
From my standpoint this question would have been incomplete without Bokeh's answer.
Thanks @Bokeh
