webapp2 request handler and byte arrays

I am writing a Python service with webapp2 and want to read a byte array from a client POST request and save it to a file.
Whenever I try to read the data field that contains the byte array from the request object, I get an exception saying:
'utf8' codec can't decode byte 0xff in position 0: invalid start byte
My post() code:
def post(self):
    file_data = self.request.get('file_data')
Is there another method I should use to read the field, given that it is not a string?

You can use self.request.body to get the raw request body (a byte string).
Example for a request whose body is a UTF-8 JSON string:
import json
def post(self):
    binary_body = self.request.body  # the raw request body, as bytes
    utf8_json_string = binary_body.decode('utf-8')
    json_object = json.loads(utf8_json_string)
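Since the question is about saving the uploaded bytes to a file, a minimal sketch of that step may help. It assumes the whole request body is the file content. The helper name save_upload and the target path are hypothetical, and sample_body stands in for self.request.body so the snippet runs without webapp2.

```python
import os
import tempfile


def save_upload(raw_body, path):
    """Write raw request bytes to `path` without decoding them."""
    # Binary mode ('wb') means Python never tries to decode the bytes.
    with open(path, 'wb') as fh:
        fh.write(raw_body)
    return len(raw_body)


# Stand-in for self.request.body: bytes that are not valid UTF-8.
sample_body = b'\xff\xd8\xff\xe0 fake image bytes'
target = os.path.join(tempfile.mkdtemp(), 'upload.bin')
written = save_upload(sample_body, target)
```

The point of the sketch is that binary data is never passed through .decode(); calling sample_body.decode('utf-8') here would raise the same kind of "invalid start byte" error as in the question.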

Related

UnicodeEncodeError : 'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256). How to solve? [duplicate]

I need to access a database using this code, provided by MobiDB, to get disorder predictions for proteins.
import urllib2
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt'  # text/csv and text/plain supported
request = urllib2.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept": acceptHeader})
# Send request
response = urllib2.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# Handle data
print(data)
Since I'm not using Python 2.6, I changed the script as follows:
import urllib.request
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt'  # text/csv and text/plain supported
request = urllib.request.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept": acceptHeader})
# Send request
response = urllib.request.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# Handle data
print(data)
So I am using urllib.request instead of urllib2. The problem arises when the request variable is passed to urllib.request.urlopen, which raises this error:
'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256)
I understand that this has something to do with character encoding, but since I am new to Python and under deadline pressure, I would appreciate any help you can give me.
Much obliged.

Decode the bytes content as UTF-8 and parse it using json.loads:
response = urllib.request.urlopen(request)
# Get the content and decode it as UTF-8
respcontent = response.read().decode('utf-8')
data = json.loads(respcontent)
print(data)
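The answer hard-codes UTF-8. As a hedged variation, the charset can be read from the Content-Type header and UTF-8 used only as a fallback. The sketch below assumes the headers object behaves like email.message.Message (which is what response.headers from urlopen is based on). The helper name decode_json_body and the fake header and body values are made up for illustration so that no network access is needed.

```python
import json
from email.message import Message


def decode_json_body(raw, headers, default_charset='utf-8'):
    """Decode `raw` bytes using the charset from `headers`, then parse JSON."""
    # get_content_charset() returns None when no charset parameter is present.
    charset = headers.get_content_charset() or default_charset
    return json.loads(raw.decode(charset))


# Stand-ins for response.headers and response.read().
fake_headers = Message()
fake_headers['Content-Type'] = 'application/json; charset=utf-8'
fake_body = '{"name": "caf\u00e9"}'.encode('utf-8')
data = decode_json_body(fake_body, fake_headers)
```

With a real response, the call would presumably look like decode_json_body(response.read(), response.headers).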

sending binary data over json

I want to upload a file to S3 using Python. I am using the requests_aws4auth library for this:
import requests
from requests_aws4auth import AWS4Auth
# data = encode_audio(data)
endpoint = 'http://bucket.s3.amazonaws.com/testing89.mp3'
data = b'...'  # placeholder: some binary data (mp3) here
auth = AWS4Auth('xxxxxx', 'xxxxxx', 'eu-west-2', 's3')
response = requests.put(endpoint, data=data, auth=auth, headers={'x-amz-acl': 'public-read'})
print(response.text)
I tried the above code and got the following error:
'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
This works fine if I send text data, since that is also ASCII, but when binary data is sent I think there is some error in concatenating the binary data with the auth data. Am I going wrong somewhere? Any guidance would be appreciated. Thanks.

Get request body as string in Django

I'm sending a POST request with JSON body to a Django server (fairly standard). On the server I need to decode this using json.loads().
The problem is how do I get the body of the request in a string format?
I currently have the following code:
body_data = {}
if request.META.get('CONTENT_TYPE', '').lower() == 'application/json' and len(request.body) > 0:
    try:
        body_data = json.loads(request.body)
    except Exception as e:
        return HttpResponseBadRequest(json.dumps({'error': 'Invalid request: {0}'.format(str(e))}), content_type="application/json")
However, this gives the error: the JSON object must be str, not 'bytes'.
How do I retrieve the body of the request as a string, with the correct encoding applied?

The request body, request.body, is a byte string. In Python 3.0 to 3.5.x, json.loads() will only accept a unicode string, so you must decode request.body before passing it to json.loads().
body_unicode = request.body.decode('utf-8')
body_data = json.loads(body_unicode)
In Python 2, json.loads will accept a unicode string or a byte string, so the decode step is not necessary.
When decoding the string, I think you're safe to assume 'utf-8' - I can't find a definitive source for this, but see the quote below from the jQuery docs:
Note: The W3C XMLHttpRequest specification dictates that the charset is always UTF-8; specifying another charset will not force the browser to change the encoding.
In Python 3.6, json.loads() accepts bytes or bytearrays. Therefore you shouldn't need to decode request.body (assuming it's encoded in UTF-8).
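The version differences above can be hidden behind a small helper. This is only a sketch: the name parse_json_body is hypothetical, and it assumes the body is UTF-8 unless told otherwise.

```python
import json


def parse_json_body(body, encoding='utf-8'):
    """Return the parsed JSON from `body`, which may be bytes or str."""
    if isinstance(body, (bytes, bytearray)):
        # An explicit decode works on every Python 3 version, not only 3.6+.
        body = body.decode(encoding)
    return json.loads(body)


print(parse_json_body(b'{"foo": "bar"}'))
print(parse_json_body('{"foo": "bar"}'))
```

In a Django view this would be called as parse_json_body(request.body). Both UnicodeDecodeError and json.JSONDecodeError are subclasses of ValueError, so a single `except ValueError` around the call covers a badly encoded body as well as malformed JSON.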

I believe that the other end from where you receive this request does not convert the data to JSON before sending the request. Either you have to convert the data to JSON before you send, or just try accessing request.body in your view.

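If your goal is to end up with a dictionary of the data you have just sent to the server, and the client can send it form-encoded (application/x-www-form-urlencoded or multipart/form-data) instead of as a JSON body, you can skip decoding the body yourself and use the request.POST dictionary-like object that Django already provides out of the box. Note that Django does not populate request.POST from a JSON body; for JSON you still have to parse request.body.
So suppose you POST this form field to the server:
foo=bar
Then the following method
def my_handler(request):
    foo = request.POST['foo']
    print(foo)
would print bar to the console.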

Character encoding in a GET request with http.client lib (Python)

I am a beginner in Python and I wrote this little script to send an HTTP GET request to my local server (localhost). It works great, except that I would also like to send accented Latin characters.
import http.client
httpMethod = "GET"
url = "localhost"
params = "Hello World"
def httpRequest(httpMethod, url, params):
    conn = http.client.HTTPConnection(url)
    conn.request(httpMethod, '/?param=' + params)
    conn.getresponse().read()
    conn.close()
    return
httpRequest(httpMethod, url, params)
When I put accented words in my "params" parameter, this error message appears:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 14: ordinal not in range(128)
I don't know whether there is a solution using the http.client library, but I think there is. In the http.client documentation I can see this:
HTTPConnection.request
Strings are encoded as ISO-8859-1, the default charset for HTTP

You shouldn't construct arguments manually. Use urlencode instead:
>>> from urllib.parse import urlencode
>>> params = 'Aserejé'
>>> urlencode({'params': params})
'params=Aserej%C3%A9'
So, you can do:
conn.request(httpMethod, '/?' + urlencode({'params': params}))
Also note that your string will be encoded as UTF-8 before being URL-escaped.
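To make the escaping step concrete, here is a short sketch using only urllib.parse. The parameter names and values are just examples. It builds a query string, shows that the result is plain ASCII, and reverses the escaping the way a server would.

```python
from urllib.parse import parse_qs, quote, urlencode

# urlencode() escapes each value: spaces become '+', and non-ASCII
# characters are UTF-8 encoded and then percent-escaped.
query = urlencode({'param': 'Hello World', 'name': 'Aserejé'})
path = '/?' + query

# The path is now pure ASCII, so this encode cannot fail.
ascii_path = path.encode('ascii')

# quote() escapes a single value, e.g. for a path segment.
segment = quote('Aserejé')

# parse_qs() reverses the escaping on the receiving side.
decoded = parse_qs(query)

print(path)     # /?param=Hello+World&name=Aserej%C3%A9
print(segment)  # Aserej%C3%A9
print(decoded)  # {'param': ['Hello World'], 'name': ['Aserejé']}
```

Because the escaped path contains only ASCII characters, passing it to conn.request() no longer triggers the UnicodeEncodeError from the question.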

Decode Cookie variable extracted from a HTTP Stream - Python

I am using python to send a request to a server. I get a cookie from the server. I am trying to decode the encoding scheme used by the server - I suspect it's either utf-8 or base64.
So I create my header and connection objects and send the request:
resp, content = httpobj.request(server, 'POST', headers=HTTPheader, body=HTTPbody)
Then I extract the cookie from the HTTP response:
cookie = resp['set-cookie']
I have tried str.decode() and unicode(), but I am unable to get the unpacked content of the cookie.
Assume the cookie is
MjAyMTNiZWE4ZmYxYTMwOVPJ7Jh0B%2BMUcE4si5oDcH7nKo4kAI8CMYgKqn6yXpgtXOSGs8J9gm20bgSlYMUJC5rmiQ1Ch5nUUlQEQNmrsy5LDgAuuidQaZJE5z%2BFqAJPnlJaAqG2Fvvk5ishG%2FsH%2FA%3D%3D
The output I am expecting is
20213bea8ff1a309SÉì˜tLQÁ8².hÁûœª8<Æ
*©úÉzµs’Ïö¶Ñ¸•ƒ$.kš$5gQIPf®Ì¹,8�ºèA¦IœöZ€$ùå% *ao¾Nb²¶ÁöÃ

Try like this:
import urllib
import base64
cookie_val = """MjAyMTNiZWE4ZmYxYTMwOVPJ7Jh0B%2BMUcE4si5oDcH7nKo4kAI8CMYgKqn6yXpgtXOSGs8J9gm20bgSlYMUJC5rmiQ1Ch5nUUlQEQNmrsy5LDgAuuidQaZJE5z%2BFqAJPnlJaAqG2Fvvk5ishG%2FsH%2FA%3D%3D"""
res = base64.b64decode(urllib.unquote(cookie_val))
print repr(res)
Output:
"20213bea8ff1a309S\xc9\xec\x98t\x07\xe3\x14pN,\x8b\x9a\x03p~\xe7*\x8e$\x00\x8f\x021\x88\n\xaa~\xb2^\x98-\\\xe4\x86\xb3\xc2}\x82m\xb4n\x04\xa5`\xc5\t\x0b\x9a\xe6\x89\rB\x87\x99\xd4RT\x04#\xd9\xab\xb3.K\x0e\x00.\xba'Pi\x92D\xe7?\x85\xa8\x02O\x9eRZ\x02\xa1\xb6\x16\xfb\xe4\xe6+!\x1b\xfb\x07\xfc"
Of course, the result here is an 8-bit string, so you have to decode it to get the string that you want. I'm not sure which encoding to use, but here is the result of decoding with unicode-escape (unicode literal):
>>> print unicode(res, 'unicode-escape')
20213bea8ff1a309SÉìtãpN,p~ç*$1ª~²^-\ä³Â}m´n¥`ÅBÔRT#Ù«³.K.º'PiDç?¨ORZ¡¶ûäæ+!ûü
Hope this helps.
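The answer's code is Python 2 (urllib.unquote, the print statement, unicode()). A rough Python 3 equivalent is sketched below, under the same assumption the answer makes: that the cookie is base64 data that has been URL-escaped. The helper name decode_cookie is hypothetical, and the latin-1 step is only one possible way to view the bytes as text.

```python
import base64
from urllib.parse import unquote

# Cookie value copied from the question.
cookie_val = (
    "MjAyMTNiZWE4ZmYxYTMwOVPJ7Jh0B%2BMUcE4si5oDcH7nKo4kAI8CMYgKqn6yXpgtXOSGs8J9"
    "gm20bgSlYMUJC5rmiQ1Ch5nUUlQEQNmrsy5LDgAuuidQaZJE5z%2BFqAJPnlJaAqG2Fvvk5ish"
    "G%2FsH%2FA%3D%3D"
)


def decode_cookie(value):
    """URL-unescape `value`, then base64-decode it to raw bytes."""
    return base64.b64decode(unquote(value))


res = decode_cookie(cookie_val)
print(repr(res))

# latin-1 maps every byte to one code point, so this never raises; whether
# the text is meaningful depends on what the server actually stored.
text = res.decode('latin-1')
```

Only the first 16 bytes of this cookie are readable ASCII; the rest looks like binary data, so keeping res as bytes is probably the safer choice unless the server's format is known.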
