Trouble Decompressing web-socket data - python

I want to get data from Bitmart's WebSocket (Bitmart is an exchange). I was able to subscribe to the WebSocket and receive data, but the data is compressed. According to the documentation I am supposed to use zlib to decompress it, but when I tried, I got this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcd in position 1: invalid continuation byte
This is my code:
import json
from websocket import create_connection
import zlib
ws = create_connection("wss://ws-manager-compress.bitmart.news?protocol=1.1")
ws.send(json.dumps({
    "op": "subscribe",
    "args": ["spot/ticker:BTC_USDT"]
}))
while True:
    result = ws.recv()
    message = result
    compressed = zlib.compress(message)
    decompressed = zlib.decompress(compressed).decode('UTF-8')
    print(decompressed)
ws.close()
BTW ws.recv() returns data like this:
b'5\xcd\xd1\x0e\x82 \x18\x05\xe0w\xf9\xaf\x1d\x01\x82\xbfzY\xbdAv\xd5\x1aCc\xe9\xc2pB\xb5\xe6|\xf7`\xcb\xdb\xef\x9c\x9d\xb3\xc0M\x07\r\xf5e\x81V{\xa3\xde\xce\xbeF\xa3\xb8\xe8\xa1\x06N\xab\x1c\x11+\x82\x122\xe8\x87{\xff\x0fdI)%\x94F\xb5\xda\x075\xcdCg\x92#2I\x10\x93\xbb\xcfV\x96L\xe4\xa4,"\xba\xc9<7\xc5\x9cK\xc2\xd2\x84W\x01jVp*\xa8(\xa5\x8c\xf0\x1d[gci\xdf\x1c\xd4\xf9tl`\xbdf\x10tk\xd3\x89\x9f\\\xd8\x85\xa1{\x98\x19\xd6\x1f'

decompressed = zlib.decompress(message, -zlib.MAX_WBITS).decode('UTF-8')
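A negative wbits value tells zlib to treat the input as a raw DEFLATE stream with no zlib header or trailing checksum, which is what the line above relies on. Here is a minimal offline sketch of that call. The helper name inflate_raw and the sample JSON payload are made up for illustration and do not come from the exchange.

```python
import json
import zlib


def inflate_raw(payload):
    """Decompress a raw DEFLATE payload (no zlib header) and decode it as UTF-8."""
    return zlib.decompress(payload, -zlib.MAX_WBITS).decode("utf-8")


# Build a raw DEFLATE frame locally so the sketch runs without a network connection.
# The JSON content is a hypothetical stand-in for a real ticker message.
sample = json.dumps({"table": "spot/ticker", "data": []}).encode("utf-8")
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
frame = compressor.compress(sample) + compressor.flush()

print(inflate_raw(frame))
```

In the loop from the question, this would replace the compress/decompress pair: the bytes returned by ws.recv() are passed straight to the decompress call.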

Related

UnicodeEncodeError : 'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256). How to solve? [duplicate]

I have to access a database using this code, which is provided by MobiDB, to get disorder predictions for proteins.
import urllib2
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt' # text/csv and text/plain supported
request = urllib2.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept" : acceptHeader})
# Send request
response = urllib2.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# handle data
print(data)
Since I'm not using Python 2.6 I changed the script as follows:
import urllib.request
import json
# Define request
acceptHeader = 'My_File_TrEMBL.txt'  # text/csv and text/plain supported
request = urllib.request.Request("https://mobidb.org/ws/P04050/uniprot", headers={"Accept": acceptHeader})
# Send request
response = urllib.request.urlopen(request)
# Parse JSON response into a Python dict
data = json.load(response)
# handle data
print(data)
So I am using urllib.request instead of urllib2. The problem arises when the variable request is passed to urllib.request.urlopen, which raises this error:
" 'latin-1' codec can't encode character '\u01e2' in position 8: ordinal not in range(256) "
I understand that this is something related to ASCII encoding, but since I am new to Python and pressed for time given the deadline for this work, I would appreciate any help you can give me.
Obliged
Decode the bytes content using UTF-8 and parse the result using json.loads:
response = urllib.request.urlopen(request)
#get the content and decode it using utf-8
respcontent = response.read().decode('utf-8')
data = json.loads(respcontent)
print(data)
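The reported error is an encoding error rather than a decoding error: Python's http.client encodes string header values as Latin-1 before sending them. One possible cause, offered as an assumption rather than a diagnosis, is that the real Accept header value contains a character outside the Latin-1 range. The sketch below uses a hypothetical helper, find_non_latin1, and a made-up header value to show how such a character could be located.

```python
def find_non_latin1(value):
    """Return (index, character) pairs that cannot be encoded as Latin-1."""
    return [(i, ch) for i, ch in enumerate(value) if ord(ch) > 0xFF]


# Hypothetical header value with U+01E2 at index 8, mirroring the reported error.
suspect = "My_File_\u01e2.txt"
offending = find_non_latin1(suspect)
print(offending)

# A plain MIME type such as the ones the original comment mentions is safe.
print(find_non_latin1("text/plain"))
```

If the check reports any characters for a header value, replacing that value with a plain ASCII one (for example one of the MIME types the code comment lists) would avoid the encoding step failing.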

Unable to call an API and get respective output in Python

I am new to Python and I am trying to call an API and get its output. However, I am running into the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8f in position 2: invalid start byte
The code I have written thus far is as follows:
import requests
file = open("NLPData/testimg.png")
payload = {file}
response = requests.get("http://sampleapi.com/get", params=payload)
print(response.text)
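The code above opens a PNG file in text mode, so Python tries to decode the image bytes as text when the file is read. A likely fix, assuming the decode error comes from that read, is to open the file in binary mode ("rb"). The sketch below demonstrates the difference on a temporary file that holds only the 8-byte PNG signature. The helper names are illustrative, and the UTF-8 encoding is set explicitly so the behaviour does not depend on the platform default.

```python
import os
import tempfile

# The 8-byte signature that starts every PNG file; it is not valid UTF-8.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_as_text(path):
    """Read the file in text mode, decoding its bytes as UTF-8."""
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def read_as_bytes(path):
    """Read the file in binary mode, with no decoding step."""
    with open(path, "rb") as fh:
        return fh.read()


# Write a tiny stand-in "image" to a temporary file.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(PNG_SIGNATURE)
    sample_path = tmp.name

try:
    read_as_text(sample_path)
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

raw = read_as_bytes(sample_path)
os.remove(sample_path)
print(text_mode_failed, raw == PNG_SIGNATURE)
```

With requests, a file object opened in binary mode can be passed through the files argument of requests.post. Whether this particular API expects a multipart upload is not stated in the question, so that part remains an assumption.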

UnicodeDecodeError on Windows, but not when running the exact same code on Mac

I'm trying to download json data via an API. The code is as follows:
import urllib.request, ssl, json
context = ssl._create_unverified_context()
rsbURL = "https://rsbuddy.com/exchange/summary.json"
with urllib.request.urlopen(rsbURL, context=context) as url:
    data = json.loads(url.read().decode('UTF-8'))
This code works perfectly fine on my Mac, and I confirmed that data is what is supposed to be the JSON string. However, when I run the exact same code on windows, I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
What is going on and how do I fix it?
Looks like the server is sending a compressed response for some reason (it shouldn't be doing that unless you explicitly set the accept-encoding header). You can adapt your code to work with compressed responses like this:
import gzip
import urllib.request, ssl, json
context = ssl._create_unverified_context()
rsbURL = "https://rsbuddy.com/exchange/summary.json"
with urllib.request.urlopen(rsbURL, context=context) as url:
    if url.info().get('Content-Encoding') == 'gzip':
        body = gzip.decompress(url.read())
    else:
        body = url.read()
    data = json.loads(body)
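The byte 0x8b at position 1 in the error message matches the second byte of the gzip magic number (0x1f 0x8b). As an alternative to checking the Content-Encoding header, the body itself can be inspected. This is a sketch under that assumption, and the helper name decode_json_body and the sample payload are made up for illustration.

```python
import gzip
import json

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_json_body(body):
    """Decompress the body if it looks like gzip, then parse it as UTF-8 JSON."""
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


# Local stand-ins for a server response, both compressed and uncompressed.
payload = {"item": "example", "price": 1}
plain_body = json.dumps(payload).encode("utf-8")
gzip_body = gzip.compress(plain_body)

print(decode_json_body(plain_body) == decode_json_body(gzip_body))
```

The same function would accept the result of url.read() in the code above, whichever form the server sends.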

sending binary data over json

I want to upload a file to S3 using Python. I am using the requests_aws4auth library for this:
import requests
from requests_aws4auth import AWS4Auth
# data=encode_audio(data)
endpoint = 'http://bucket.s3.amazonaws.com/testing89.mp3'
data = b'...'  # placeholder: some binary data (mp3) here
auth = AWS4Auth('xxxxxx', 'xxxxxx', 'eu-west-2', 's3')
response = requests.put(endpoint,data=data, auth=auth, headers={'x-amz-acl': 'public-read'})
print response.text
I tried the above code and got the following error:
'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
This works fine if I send text data, since that is also ASCII, but when binary data is sent I think there is some error in concatenating the binary data with the auth data. Am I wrong somewhere? Could someone please guide me? Thanks.

webapp2 request handler and byte arrays

I am writing a Python service with webapp2 and want to get a byte array from a client POST request and save it to a file.
Whenever I try to get the data field that contains the byte array from the request object, I get an exception saying:
'utf8' codec can't decode byte 0xff in position 0: invalid start byte
My post() code:
def post(self):
    file_data = self.request.get('file_data')
Is there another method I should use to read the field because it's not a string?
You can use self.request.body to get the raw request body (a byte string).
Example for a request whose body is a UTF-8 JSON string:
import json

def post(self):
    binary_body = self.request.body  # get the raw request body as bytes
    utf8_json_string = binary_body.decode('utf-8')
    json_object = json.loads(utf8_json_string)
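Since the question asks about saving a byte array to a file rather than parsing JSON, the raw body can also be written out without any decoding. This is a minimal sketch. The helper save_raw_body, the sample bytes, and the file name are hypothetical, and the sample bytes stand in for self.request.body so the example runs without webapp2.

```python
import os
import tempfile


def save_raw_body(body, path):
    """Write the request body to `path` in binary mode, without decoding it."""
    with open(path, "wb") as fh:
        fh.write(body)
    return len(body)


# Stand-in for self.request.body: bytes that are not valid UTF-8.
fake_body = b"\xff\xd8\xff\xe0binary payload"
target = os.path.join(tempfile.mkdtemp(), "upload.bin")
written = save_raw_body(fake_body, target)

with open(target, "rb") as fh:
    stored = fh.read()
print(written, stored == fake_body)
```

Inside a handler, the call would look like save_raw_body(self.request.body, some_path), assuming the client sends the byte array as the whole request body rather than as a form field.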
