I received the error "expected str instance, bytes found" when I tried to add binary image data to a multipart/form-data request.
The problem is that I am appending imageData, which is bytes, to a list of strings. Is there a way to add a binary image to multipart/form-data?
I'm at my wits' end and would appreciate some help with this.
imageData = request.FILES['filePath'].read()
content_type, request_body = encode_multipart_formdata([('include_target_data', targetMetaDataRet),
('max_num_results', str(maxNoResultRet))],
[('image', imagePath, imageData)])
def encode_multipart_formdata(fields, files):
    BOUNDARY = '----------ThIs_Is_tHe_bouNdaRY_$'
    CRLF = '\r\n'
    lines = []
    for (key, value) in fields:
        lines.append('--' + BOUNDARY)
        lines.append('Content-Disposition: form-data; name="%s"' % key)
        lines.append('')
        lines.append(value)
    for (key, filename, value) in files:
        lines.append('--' + BOUNDARY)
        lines.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (key, filename))
        lines.append('Content-Type: %s' % get_content_type(filename))
        lines.append('')
        lines.append(value)
    lines.append('--' + BOUNDARY + '--')
    lines.append('')
    body = CRLF.join(lines)
    content_type = 'multipart/form-data; boundary=%s' % BOUNDARY
    return content_type, body
Traceback:
35. response = get_response(request)
128. response = self.process_exception_by_middleware(e, request)
126. response = wrapped_callback(request, *callback_args, **callback_kwargs)
166. [('image', imagePath, imageData)])
232. body = CRLF.join(lines)
Exception Type: TypeError at /identify_shrine
Exception Value: sequence item 12: expected str instance, bytes found
Following @coltoneakins's suggestion, I changed the request body to bytes, but now I get a Bad Request error. Any idea why?
Code:
content_type = 'multipart/form-data; boundary=----------ThIs_Is_tHe_bouNdaRY_$'
request_body = '----------ThIs_Is_tHe_bouNdaRY_$' + '\n'+'Content-Disposition: form-data; name="include_target_data"' + '\n' + '\n' + 'top'+ '\n' + '----------ThIs_Is_tHe_bouNdaRY_$' +'\n' + 'Content-Disposition: form-data; name="max_num_results"' + '\n' + '\n' + '1' + '\n' + '----------ThIs_Is_tHe_bouNdaRY_$' +'\n' + 'Content-Disposition: form-data; name="image"; filename="img_2.jpg"' + '\n' + 'Content-Type: image/jpeg' + '\n' + '\n'
request_body1 = request_body.encode('utf-8')
request_body2 = imageData
request_body3 = ('\n' + '\n' + '----------ThIs_Is_tHe_bouNdaRY_$').encode('utf-8')
request_body4 = request_body1 + request_body2 + request_body3
content_type_bare = 'multipart/form-data'
# Sign the request and get the Authorization header
# use client key
auth_header = authorization_header_for_request(CLIENT_ACCESS_KEY, CLIENT_SECRET_KEY, HTTP_METHOD, request_body4,
content_type_bare,
date, path)
request_headers = {
'Accept': 'application/json',
'Authorization': auth_header,
'Content-Type': content_type,
'Date': date
}
try:
# Make the request over HTTPS on port 443
connection = http.client.HTTPSConnection(CLOUD_RECO_API_ENDPOINT, 443)
connection.request(HTTP_METHOD, path, request_body4, request_headers)
response = connection.getresponse()
response_body = response.read()
reason = response.reason
status = response.status
finally:
connection.close()
You have a type issue in your code. You are getting a TypeError expected str instance, bytes found because you are attempting to join() a list that contains both str types and bytes types in Python.
Look at these lines in your code:
for (key, filename, value) in files:
lines.append('--' + BOUNDARY)
lines.append('Content-Disposition: form-data; name="%s"; filename="%s"' % (key, filename))
lines.append('Content-Type: %s' % get_content_type(filename))
lines.append('')
lines.append(value) # <---------- THIS IS BYTES, EVERYTHING ELSE IS STR
lines.append('--' + BOUNDARY + '--')
lines.append('')
body = CRLF.join(lines) # <---------- AHHH RED FLAG!!!
CRLF is of type str, but value (which is appended to your lines list) is bytes. This means lines ends up containing both str and bytes objects. When you send an image in a multipart/form-data request, the whole body of the request is bytes. So, you need to call join() on a bytes separator with a list that contains only bytes.
This is what you are doing:
body = CRLF.join(lines)
which is really:
'\r\n, i am a str'.join(['i am also str', b'I am not a str, I am bytes']) # <-- NO
This is what you need to be doing:
b'I am bytes'.join([b'I am also bytes', b'Me too!'])
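The bytes-only join described above can be sketched as follows. This is a minimal sketch, not the asker's final code: it assumes Python 3, str field values, and bytes file data, and it uses mimetypes.guess_type as a stand-in for the get_content_type helper, which the question does not show. Note that every boundary line is prefixed with "--" and lines are separated by CRLF, as multipart/form-data expects; the hand-built body in the follow-up uses bare "\n" and omits the "--" prefix, which may be related to the Bad Request response.

```python
import mimetypes

CRLF = b"\r\n"


def encode_multipart_formdata(fields, files,
                              boundary="----------ThIs_Is_tHe_bouNdaRY_$"):
    """Build a multipart/form-data body made only of bytes.

    fields: iterable of (name, value) pairs, values are str
    files:  iterable of (name, filename, data) triples, data is bytes
    """
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for key, value in fields:
        parts.append(delimiter)
        parts.append(('Content-Disposition: form-data; name="%s"' % key).encode("utf-8"))
        parts.append(b"")
        parts.append(str(value).encode("utf-8"))
    for key, filename, data in files:
        # Assumption: guess the type from the file name (stand-in helper)
        guessed = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        parts.append(delimiter)
        parts.append(('Content-Disposition: form-data; name="%s"; filename="%s"'
                      % (key, filename)).encode("utf-8"))
        parts.append(("Content-Type: %s" % guessed).encode("ascii"))
        parts.append(b"")
        parts.append(data)  # already bytes, appended untouched
    parts.append(delimiter + b"--")  # closing delimiter
    parts.append(b"")
    body = CRLF.join(parts)  # bytes separator, bytes items
    content_type = "multipart/form-data; boundary=%s" % boundary
    return content_type, body
```

Because every item in parts is bytes, the image data never has to be decoded or converted to str.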
Also, just so that you are aware, the Requests library provides mechanisms for you to send files. See the files parameter in the Requests documentation or this StackOverflow answer:
https://stackoverflow.com/a/12385661/9347694
So, you may not need to reinvent the wheel here. Requests will multipart encode the file and construct the request for you.
I'm calling a REST API with basic authentication in an AWS Lambda function. This is my code
import json, os, base64
from urllib import request
def lambda_handler(event, context):
    retStatusCode = 0
    retBody = ""
    try:
        url = os.environ['URL']
        username = os.environ['USERNAME']
        password = os.environ['PASSWORD']
        requestURL = url + "/" + event['queueID'] + "/" + event['cli']
        #print ("QUEUEID IS: " + event['queueID'])
        #print ("CLI IS: " + event['cli'])
        #print ("URL IS: " + requestURL)
        req = request.Request(requestURL, method="POST")
        myStr = '%s:%s' % (username, password)
        myBytes = myStr.encode("utf-8")
        base64string = base64.b64encode(myBytes)
        req.add_header("Authorization", "Basic %s" % base64string)
        resp = request.urlopen(req)
        responseJSON = json.load(resp)
        retStatusCode = responseJSON["Result"]
        retBody = responseJSON["Message"]
    except Exception as e:
        retStatusCode = 500
        retBody = "An exception occurred: " + str(e)
    return {
        'statusCode': retStatusCode,
        'body': retBody
    }
However, I'm getting a "HTTP Error 401: Unauthorized" returned. If I call the API method in Postman with the same credentials, it returns data successfully, so I figure it must be something to do with the format of the header I'm adding, but just can't see what's wrong.
The problem is in this line:
req.add_header("Authorization", "Basic %s" % base64string)
According to the documentation, base64.b64encode returns the encoded bytes, not a str.
If you run this in a REPL, you'll see that the resulting header is wrong. Formatting a bytes object with %s inserts its repr, including the b'...' wrapper:
>>> "Basic %s" % base64string
"Basic b'aGVsbG86d29ybGQ='"
The b'...' notation is how Python 3 displays a bytes object.
So you need to decode the bytes to a str before formatting the header:
req.add_header("Authorization", "Basic %s" % base64string.decode('utf-8'))
The result now looks like a valid Authorization header:
>>> "Basic %s" % base64string.decode('utf-8')
'Basic aGVsbG86d29ybGQ='
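The fix above can be wrapped in a small helper so the bytes-to-str step is not forgotten. This is a minimal sketch; the name basic_auth_header is hypothetical and not part of the question's code.

```python
import base64


def basic_auth_header(username, password):
    """Return the value for an HTTP Basic Authorization header as a str."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    # b64encode returns bytes; decode so no b'...' wrapper ends up in the header
    token = base64.b64encode(credentials).decode("ascii")
    return "Basic " + token


print(basic_auth_header("hello", "world"))  # Basic aGVsbG86d29ybGQ=
```

In the Lambda handler this would be used as req.add_header("Authorization", basic_auth_header(username, password)).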
I am writing a python script to fetch mail attachments through Graph API.
In the Graph Explorer, I can perfectly download file attachments by manually pressing the download button after calling:
https://graph.microsoft.com/v1.0/me/messages/{message-id}/attachments/{attachment-id}/$value
However, when trying to make the same request in my Python script, all I get returned is 'Response [200]' (so the request works, but the file is not reachable).
I try to make the request like this:
def get_mails_json():
requestHeaders = {'Authorization': 'Bearer ' +result["access_token"],'Content-Type': 'application/json'}
queryResults = msgraph_request(graphURI + "/v1.0/me/messages?$filter=isRead ne true",requestHeaders)
return json.dumps(queryResults)
try:
data = json.loads(mails)
values = data['value']
for i in values:
mail_id = i['id']
mail_subj = i['subject']
if i['hasAttachments'] != False:
attachments = o365.get_attachments(mail_id)
attachments = json.loads(attachments)
attachments = attachments['value']
for i in attachments:
details = o365.get_attachment_details(mail_id,i["id"])
except Exception as e:
print(e)
def get_attachment_details(mail,attachment):
requestHeaders = {'Authorization': 'Bearer ' + result["access_token"],'Content-Type': 'application/json'}
queryResults = msgraph_request(graphURI + "/v1.0/me/messages/"+mail+"/attachments/"+attachment+'/$value',requestHeaders)
return json.dumps(queryResults)
Is there any way to download the file through my Python script?
I found a simple solution to downloading a file through a python script!
I used chip's answer, found in another thread.
I make the request for the attachment like so:
def get_attachment_details(mail,attachment):
    requestHeaders = {'Authorization': 'Bearer ' + result["access_token"],'Content-Type': 'application/file'}
    resource= graphURI + "/v1.0/me/messages/"+mail+"/attachments/"+attachment+'/$value'
    payload = {}
    results = requests.request("GET", resource,headers=requestHeaders,data=payload, allow_redirects=False)
    return results.content
This returns the raw bytes of the file, which I then write to a file like so:
for i in attachments:
    details = o365.get_attachment_details(mail_id,i["id"])
    toread = io.BytesIO()
    toread.write(details)
    with open(i['name'], 'wb') as f:
        f.write(toread.getbuffer())
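Since the attachment content is already a bytes object, it can also be written to disk directly, without the intermediate BytesIO. A minimal sketch, assuming the bytes come from something like results.content; save_attachment and the sample data are hypothetical names used only for illustration.

```python
import os
import tempfile


def save_attachment(content, filename, directory):
    """Write raw attachment bytes to directory/filename and return the path."""
    # basename() drops any directory components in the attachment name
    path = os.path.join(directory, os.path.basename(filename))
    with open(path, "wb") as f:  # "wb": binary mode, no decoding involved
        f.write(content)
    return path


# Usage with made-up bytes standing in for the downloaded content
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_attachment(b"example attachment bytes", "report.pdf", tmp_dir)
    with open(saved_path, "rb") as f:
        round_trip = f.read()
```

In the loop above, this would replace the BytesIO lines with a single call per attachment.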
I have a 300 MB file that I need to upload, and my current code just isn't cutting it.
#----------------------------------------------------------------------------------
def _post_multipart(self, host, selector,
fields, files,
ssl=False,port=80,
proxy_url=None,proxy_port=None):
""" performs a multi-post to AGOL, Portal, or AGS
Inputs:
host - string - root url (no http:// or https://)
ex: www.arcgis.com
selector - string - everything after the host
ex: /PWJUSsdoJDp7SgLj/arcgis/rest/services/GridIndexFeatures/FeatureServer/0/1/addAttachment
fields - dictionary - additional parameters like token and format information
files - tuple array- tuple with the file name type, filename, full path
ssl - option to use SSL
proxy_url - string - url to proxy server
proxy_port - integer - port value if not on port 80
Output:
JSON response as dictionary
Usage:
import urlparse
url = "http://sampleserver3.arcgisonline.com/ArcGIS/rest/services/SanFrancisco/311Incidents/FeatureServer/0/10261291"
parsed_url = urlparse.urlparse(url)
params = {"f":"json"}
print _post_multipart(host=parsed_url.hostname,
selector=parsed_url.path,
files=files,
fields=params
)
"""
content_type, body = self._encode_multipart_formdata(fields, files)
headers = {
'content-type': content_type,
'content-length': str(len(body))
}
if proxy_url:
if ssl:
h = httplib.HTTPSConnection(proxy_url, proxy_port)
h.request('POST', 'https://' + host + selector, body, headers)
else:
h = httplib.HTTPConnection(proxy_url, proxy_port)
h.request('POST', 'http://' + host + selector, body, headers)
else:
if ssl:
h = httplib.HTTPSConnection(host,port)
h.request('POST', selector, body, headers)
else:
h = httplib.HTTPConnection(host,port)
h.request('POST', selector, body, headers)
resp_data = h.getresponse().read()
try:
result = json.loads(resp_data)
except:
return None
if 'error' in result:
if result['error']['message'] == 'Request not made over ssl':
return self._post_multipart(host=host, selector=selector, fields=fields,
files=files, ssl=True,port=port,
proxy_url=proxy_url,proxy_port=proxy_port)
return result
def _encode_multipart_formdata(self, fields, files):
boundary = mimetools.choose_boundary()
buf = StringIO()
for (key, value) in fields.iteritems():
buf.write('--%s\r\n' % boundary)
buf.write('Content-Disposition: form-data; name="%s"' % key)
buf.write('\r\n\r\n' + self._tostr(value) + '\r\n')
for (key, filepath, filename) in files:
if os.path.isfile(filepath):
buf.write('--%s\r\n' % boundary)
buf.write('Content-Disposition: form-data; name="%s"; filename="%s"\r\n' % (key, filename))
buf.write('Content-Type: %s\r\n' % (self._get_content_type3(filename)))
file = open(filepath, "rb")
try:
buf.write('\r\n' + file.read() + '\r\n')
finally:
file.close()
buf.write('--' + boundary + '--\r\n\r\n')
buf = buf.getvalue()
content_type = 'multipart/form-data; boundary=%s' % boundary
return content_type, buf
I cannot use the requests module and must use standard libraries such as urllib2 and urllib for Python 2.7.x.
Is there a way to upload the 300 MB file to a site without loading the whole thing into memory?
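One way to avoid holding the whole file in memory is to describe the body as a list of small header pieces plus file markers, compute the Content-Length from file sizes, and read the file in chunks only while sending. The following is a rough sketch under stated assumptions: it is written for Python 3, the helper names (build_segments, body_length, iter_body) are hypothetical, and the content type is fixed to application/octet-stream for brevity.

```python
import os
import tempfile


def build_segments(fields, files, boundary):
    """Describe the body as bytes pieces and ('file', path) markers."""
    dash = b"--" + boundary.encode("ascii")
    segments = []
    for key, value in fields.items():
        head = ('Content-Disposition: form-data; name="%s"\r\n\r\n' % key).encode("utf-8")
        segments.append(dash + b"\r\n" + head + str(value).encode("utf-8") + b"\r\n")
    for key, filepath, filename in files:
        head = ('Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
                % (key, filename)).encode("utf-8")
        segments.append(dash + b"\r\n" + head
                        + b"Content-Type: application/octet-stream\r\n\r\n")
        segments.append(("file", filepath))  # content is read later, in chunks
        segments.append(b"\r\n")
    segments.append(dash + b"--\r\n")
    return segments


def body_length(segments):
    """Total body size, using file sizes instead of file contents."""
    total = 0
    for seg in segments:
        total += len(seg) if isinstance(seg, bytes) else os.path.getsize(seg[1])
    return total


def iter_body(segments, chunk_size=64 * 1024):
    """Yield the body piece by piece; only one chunk is in memory at a time."""
    for seg in segments:
        if isinstance(seg, bytes):
            yield seg
        else:
            with open(seg[1], "rb") as f:
                for chunk in iter(lambda: f.read(chunk_size), b""):
                    yield chunk


# Usage with a throwaway file standing in for the large upload
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200000)
segments = build_segments({"f": "json"}, [("file", tmp.name, "big.bin")], "example-boundary")
total = body_length(segments)
sent = sum(len(piece) for piece in iter_body(segments))
os.remove(tmp.name)
```

In Python 3, http.client accepts an iterable of bytes as the request body, so iter_body(segments) can be passed to request() together with an explicit content-length header set to str(total). With Python 2.7's httplib, the same pieces could instead be passed one at a time to the connection's send() method after endheaders().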
UPDATE:
So I switched to requests, and now I get: MissingSchema: Invalid URL u'www.arcgis.com/sharing/rest/content/users//addItem?': No schema supplied. Perhaps you meant http://www.arcgis.com/sharing/rest/content/users//addItem??
What does this mean?
I provide the fields to requests.post() like this:
#----------------------------------------------------------------------------------
def _post_big_files(self, host, selector,
fields, files,
ssl=False,port=80,
proxy_url=None,proxy_port=None):
import sys
sys.path.insert(1,os.path.dirname(__file__))
from requests_toolbelt import MultipartEncoder
import requests
if proxy_url is not None:
proxyDict = {
"http" : "%s:%s" % (proxy_url, proxy_port),
"https" : "%s:%s" % (proxy_url, proxy_port)
}
else:
proxyDict = {}
for k,v in fields.iteritems():
print k,v
fields[k] = json.dumps(v)
for key, filepath, filename in files:
fields[key] = ('filename', open(filepath, 'rb'), self._get_content_type3(filepath))
m = MultipartEncoder(
fields=fields)
print host + selector
r = requests.post(host + selector , data=m,
headers={'Content-Type': m.content_type})
print r
I followed the examples in the documentation for both requests and the toolbelt. Any ideas why this is breaking?
Thank you,
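The MissingSchema message says that the URL passed to requests has no http:// or https:// prefix: host + selector yields www.arcgis.com/sharing/..., which is not an absolute URL. A minimal sketch of joining the two with a scheme follows. It is written for Python 3 (in Python 2.7 urlsplit lives in the urlparse module), and build_url is a hypothetical helper, not part of the original code.

```python
from urllib.parse import urlsplit


def build_url(host, selector, ssl=False):
    """Join host and selector into an absolute URL, adding a scheme if missing."""
    if urlsplit(host).scheme in ("http", "https"):
        base = host  # caller already supplied a scheme
    else:
        base = ("https://" if ssl else "http://") + host
    return base.rstrip("/") + "/" + selector.lstrip("/")


print(build_url("www.arcgis.com", "/sharing/rest/content/users//addItem", ssl=True))
```

The result would then be passed to requests.post() in place of host + selector.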
I need to send all the data from a Django form to another application (through a REST API).
The problem is forwarding the InMemoryUploadedFile (which I take from the request).
When I build the new request, the content of the file is always empty (the uploaded file arrives empty).
I had to write my own request body builder (encode_multipart_formdata) because, for reasons I don't understand, requests doesn't do this properly. Inside this function, when I call tmpfile.read()
I get an empty string, but when I do the same thing earlier, e.g. in addContent(), everything is fine.
views.py
def addContent(request):
if request.method == 'POST': # If the form has been submitted...
form = ContentForm(request.POST, request.FILES)
if form.is_valid():
data = restApiController.addContent(request.POST, request.FILES)
return HttpResponseRedirect('/content') # Redirect after POST
else:
form = ContentForm # An unbound form
return render(request, 'content/addNew.html', {'form': form, })
restApiController.py
import requests
from io import BytesIO
def addContent(requestPOST, requestFILE):
content_type, body = encode_multipart_formdata(requestPOST, requestFILE)
h = {'Content-Type': content_type}
r = requests.post(settings.CONTENTS_URL, auth=('user', 'pass'), headers=h, data=body)
def encode_multipart_formdata(fields, files):
boundary = 'ARCFormBoundaryovmtr0efdw019k9'
CRLF = '\r\n'
L = []
for (key, value) in fields.iteritems():
L.append('--' + boundary)
L.append('Content-Disposition: form-data; name="%s"' % key)
L.append('')
L.append(value)
for (key, value) in files.iteritems():
L.append('--' + boundary)
L.append('Content-Disposition: form-data; name="%s"; filename="%s"' % ('contentFile', files['contentFile']._name))
L.append('Content-Type: %s' % get_content_type(files['contentFile']._name))
L.append('')
L.append(files['contentFile'].read())
L.append('--' + boundary + '--')
L.append('')
#body = CRLF.join(L) INSTEAD DO THIS:
s = BytesIO()
for element in L:
s.write(str(element))
s.write(CRLF)
body = s.getvalue()
content_type = 'multipart/form-data; boundary=%s' % boundary
return content_type, body
body content:
-----------------------------11286521771531197711838573892
Content-Disposition: form-data; name="name"
test
-----------------------------11286521771531197711838573892
Content-Disposition: form-data; name="language"
eng
-----------------------------11286521771531197711838573892
Content-Disposition: form-data; name="contentFile"; filename="chaos_handdrums.wav"
Content-Type: audio/x-wav
-----------------------------11286521771531197711838573892
Content-Disposition: form-data; name="type"
stream
-----------------------------11286521771531197711838573892--
The body should also contain the content of the binary file, but it doesn't.
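You can also just use requests.post(url, data=[('name', 'test'), ('language', 'eng'), ('type', 'stream')], files={'contentFile': ('chaos_handdrums.wav', <file-like-object>)}) and requests will do the multipart encoding for you. Note that the key in files is the form field name, and the file name goes inside the tuple.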
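As for the empty read(): the thread does not establish the cause, but one common reason is that the file-like object was already read once, so its position is at the end and a second read() returns nothing. This is an assumption about the code above, not a confirmed diagnosis. A minimal sketch of the effect, using io.BytesIO as a stand-in for the uploaded file and a hypothetical helper read_from_start:

```python
import io


def read_from_start(fileobj):
    """Rewind a seekable file-like object and return all of its bytes."""
    fileobj.seek(0)
    return fileobj.read()


upload = io.BytesIO(b"RIFF....WAVE")  # stand-in for the uploaded file
first = upload.read()                 # consumes the stream
second = upload.read()                # position is at the end, so this is b""
third = read_from_start(upload)       # rewinding makes the content available again
```

If that is what happens here, calling seek(0) on the uploaded file before building the body (or before handing it to requests) would be worth trying.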
I am trying to post a multipart form using httplib. The URL is hosted on Google App Engine. The POST fails with "Method Not Allowed", although a POST using urllib2 works. A full working example is attached.
My question is: what is the difference between the two, and why does one work but not the other?
Is there a problem in my multipart form post code?
Or is the problem with Google App Engine?
Or something else?
import httplib
import urllib2, urllib
# multipart form post using httplib fails, saying
# 405, 'Method Not Allowed'
url = "http://mockpublish.appspot.com/publish/api/revision_screen_create"
_, host, selector, _, _ = urllib2.urlparse.urlsplit(url)
print host, selector
h = httplib.HTTP(host)
h.putrequest('POST', selector)
BOUNDARY = '----------THE_FORM_BOUNDARY'
content_type = 'multipart/form-data; boundary=%s' % BOUNDARY
h.putheader('content-type', content_type)
h.putheader('User-Agent', 'Python-urllib/2.5,gzip(gfe)')
content = ""
L = []
L.append('--' + BOUNDARY)
L.append('Content-Disposition: form-data; name="test"')
L.append('')
L.append("xxx")
L.append('--' + BOUNDARY + '--')
L.append('')
content = '\r\n'.join(L)
h.putheader('content-length', str(len(content)))
h.endheaders()
h.send(content)
print h.getreply()
# post using urllib2 works
data = urllib.urlencode({'test':'xxx'})
request = urllib2.Request(url)
f = urllib2.urlopen(request, data)
output = f.read()
print output
Edit: After changing putrequest to request (on Nick Johnson's suggestion), it works
url = "http://mockpublish.appspot.com/publish/api/revision_screen_create"
_, host, selector, _, _ = urllib2.urlparse.urlsplit(url)
h = httplib.HTTPConnection(host)
BOUNDARY = '----------THE_FORM_BOUNDARY'
content_type = 'multipart/form-data; boundary=%s' % BOUNDARY
content = ""
L = []
L.append('--' + BOUNDARY)
L.append('Content-Disposition: form-data; name="test"')
L.append('')
L.append("xxx")
L.append('--' + BOUNDARY + '--')
L.append('')
content = '\r\n'.join(L)
h.request('POST', selector, content,{'content-type':content_type})
res = h.getresponse()
print res.status, res.reason, res.read()
So now the question remains: what is the difference between the two approaches, and can the first be made to work?
Nick Johnson's answer
Have you tried sending the request with httplib using .request() instead of .putrequest() etc, supplying the headers as a dict?
it works!