Binary contents can't be decoded to ASCII or UTF-8 - Python

I have the binary contents posted below and would like to convert them into a readable text string. I found some questions about the same issue, and they suggested decoding the binary contents as UTF-8 or ASCII. However, I tried both and got the following errors, respectively:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 54: invalid start byte
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 54: ordinal not in range(128)
Please let me know how to decode the binary contents posted below.
Binary contents:
b"MM\x00*\x00\x00\x00\x08\x00\x12\x01\x00\x00\x03\x00\x00\x00\x01\x00$\x00\x00\x01\x01\x00\x03\x00\x00\x00\x01\x00%\x00\x00\x01\x02\x00\x03\x00\x00\x00\x01\x00 \x00\x00\x01\x03\x00\x03\x00\x00\x00\x01\x80\xb2\x00\x00\x01\x06\x00\x03\x00\x00\x00\x01\x00\x01\x00\x00\x01\x11\x00\x04\x00\x00\x00\x01\x00\x00\x01\xee\x01\x15\x00\x03\x00\x00\x00\x01\x00\x01\x00\x00\x01\x16\x00\x03\x00\x00\x00\x01\x00%\x00\x00\x01\x17\x00\x04\x00\x00\x00\x01\x00\x00\x00 \x01\x1a\x00\x05\x00\x00\x00\x01\x00\x00\x00\xe8\x01\x1b\x00\x05\x00\x00\x00\x01\x00\x00\x00\xf0\x01(\x00\x03\x00\x00\x00\x01\x00\x01\x00\x00\x01S\x00\x03\x00\x00\x00\x01\x00\x03\x00\x00\x83\x0e\x00\x0c\x00\x00\x00\x03\x00\x00\x00\xf8\x84\x82\x00\x0c\x00\x00\x00\x06\x00\x00\x01\x10\x87\xaf\x00\x03\x00\x00\x004\x00\x00\x01#\x87\xb0\x00\x0c\x00\x00\x00\x03\x00\x00\x01\xa8\x87\xb1\x00\x02\x00\x00\x00.\x00\x00\x01\xc0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x01\x00\x00\x00\x01#0\x03\x80T\x81\xa0\x00#/\xfb\x9e\xac\xf4S\x07\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00A&\xb9\xf8\x04\xce4\x9bAYYY\xd7\xe3ZB\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x01\x00\x00\x00\x0c\x04\x00\x00\x00\x00\x01\x00\x01\x04\x01\x00\x00\x00\x01\x00\x01\x08\x02\x00\x00\x00\x01\x00\x01\x04\x02\x87\xb1\x00%\x00\x00\x08\x01\x87\xb1\x00\x06\x00&\x08\x06\x00\x00\x00\x01#\x8e\x08\x08\x00\x00\x00\x01\x00\x01\x08\t\x87\xb0\x00\x01\x00\x00\x08\n\x87\xb0\x00\x01\x00\x01\x08\r\x87\xb0\x00\x01\x00\x02\x0c\x00\x00\x00\x00\x01\x0f\x11\x0c\x04\x00\x00\x00\x01#)AXT\xa6#\x00\x00\x00AXT\xa6#\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00Popular Visualisation Pseudo Mercator|WGS 84|\x00x^\xed\xc3\x01\t\x00\x00\x0c\x03\xa05_\xb5G[\x8e\x83\x82\xbd\xa4\xaa\xaa\xaa\xaa\xaa\x0f\x0e]\x13|'"
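
One observation that may help: the data begins with `MM\x00*`, which is the signature of a big-endian TIFF file, and it contains the string "Popular Visualisation Pseudo Mercator|WGS 84", so it is probably a TIFF (possibly GeoTIFF) image rather than encoded text. If so, no text codec will decode it. Below is a minimal sketch that checks the header with the standard library only. The function name `sniff_tiff_header` is made up for illustration, and it reads just the header, not the image.

```python
import struct


def sniff_tiff_header(data: bytes):
    """Return (byte_order, first_ifd_offset, entry_count) if data looks like TIFF, else None."""
    if len(data) < 8:
        return None
    order = data[:2]
    if order == b"MM":      # big-endian ("Motorola") byte order
        endian = ">"
    elif order == b"II":    # little-endian ("Intel") byte order
        endian = "<"
    else:
        return None
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the offset of the first IFD.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        return None
    if len(data) < ifd_offset + 2:
        return (order.decode("ascii"), ifd_offset, None)
    # The IFD starts with a 16-bit count of directory entries (tags).
    (count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    return (order.decode("ascii"), ifd_offset, count)


# First 12 bytes of the data from the question.
header = b"MM\x00*\x00\x00\x00\x08\x00\x12\x01\x00"
print(sniff_tiff_header(header))
```

If the header checks out, writing the bytes to a .tif file and opening it with an image or GIS library is probably a better route than calling .decode().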

Related

Unicode error occurs when trying to get data from a memoryview

I have a function that returns objects as memoryview, as follows:
data:[(<memory at 0x000001C665563E80>,)]
data[0]:(<memory at 0x000001C665563E80>,)
data[0][0]:<memory at 0x000001C665563E80>
To see the data contained in the memoryview, I tried decoding it as iso-8859-1 and calling .tobytes(), but both resulted in an empty string, as follows:
iso-8859-1:
tobytes:b''
I also used base64.b64encode(data[0][0]), and the result was base64_data:b''
Please let me know how to extract the data contained in a memoryview object.
NOTE: I am using the Windows operating system.
Errors received:
UnicodeEncodeError: 'charmap' codec can't encode character '\x93'
UnicodeEncodeError: 'charmap' codec can't encode character '\x89'
Attempts to solve this issue:
str(data[0][0], 'iso-8859-1')  #> caused: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 1: invalid start byte and UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91
running: `chcp 65001`  #> did not solve it
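
A minimal sketch of inspecting a memoryview without depending on the console encoding. It assumes the view wraps plain bytes; the helper name `describe_buffer` and the sample bytes are made up for illustration. Decoding as iso-8859-1 maps 0x93 and 0x89 to the control characters U+0093 and U+0089, which a Windows console code page may be unable to print. That could explain the 'charmap' UnicodeEncodeError above, so the sketch prints hex and `ascii()` output instead.

```python
import base64


def describe_buffer(view: memoryview) -> dict:
    """Copy the buffer out of the view and render it in console-safe forms."""
    raw = view.tobytes()  # bytes copy of whatever the view exposes
    return {
        "length": len(raw),
        "hex": raw.hex(),
        "base64": base64.b64encode(raw).decode("ascii"),
        "latin1": raw.decode("iso-8859-1"),  # never fails: every byte maps to one character
    }


# Hypothetical sample containing the bytes named in the error messages.
sample = memoryview(b"\x93quoted\x94 \x89")
info = describe_buffer(sample)
print(info["length"], info["hex"])
print(ascii(info["latin1"]))  # ascii() escapes non-ASCII, so print() cannot raise here
```

If `length` is 0, the view really is empty, and the problem lies in whatever produced it rather than in the decoding.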

UBlox NAV_PVT message: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5

Does anybody know how to decode the NAV_PVT message in python?
I tried UTF-8, but I get this error message:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte
I can't find the right encoding to decode it with.
You should read the file as binary, because it is binary. u-blox has good documentation on its various formats and protocols. Check it,
e.g. https://www.u-blox.com/sites/default/files/products/documents/u-blox8-M8_ReceiverDescrProtSpec_%28UBX-13003221%29.pdf page 332. Is this what you are looking for?
Or, if you are using a library, you should check its documentation. But I assume that either you mixed up the binary protocol with the ASCII version, or you are simply using the binary protocol.
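
A minimal sketch of reading one UBX frame with `struct` instead of a text codec. The layout used here (sync bytes 0xB5 0x62, class, id, little-endian 16-bit length, payload, two checksum bytes, and NAV-PVT starting with iTOW followed by the UTC date and time) is my reading of the u-blox protocol description and should be checked against the linked specification. The function names and the frame built at the end are made up for the example.

```python
import struct

UBX_SYNC = b"\xb5\x62"  # the 0xb5 that UTF-8 rejects is the first sync byte


def ubx_checksum(body: bytes) -> bytes:
    """8-bit Fletcher checksum over class, id, length and payload."""
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return bytes([ck_a, ck_b])


def parse_ubx_frame(frame: bytes):
    """Split one UBX frame into (class, id, payload), verifying the checksum."""
    if len(frame) < 8 or frame[:2] != UBX_SYNC:
        raise ValueError("not a UBX frame")
    msg_class, msg_id, length = struct.unpack_from("<BBH", frame, 2)
    end = 6 + length
    if len(frame) < end + 2:
        raise ValueError("truncated frame")
    if ubx_checksum(frame[2:end]) != frame[end:end + 2]:
        raise ValueError("checksum mismatch")
    return msg_class, msg_id, frame[6:end]


def nav_pvt_datetime(payload: bytes):
    """Read the UTC date/time fields that follow iTOW at the start of NAV-PVT."""
    _itow, year, month, day, hour, minute, second = struct.unpack_from(
        "<IHBBBBB", payload, 0
    )
    return year, month, day, hour, minute, second


# Synthetic frame built for this example (not real receiver output).
payload = struct.pack("<IHBBBBB", 1000, 2020, 5, 17, 12, 30, 45) + bytes(81)
body = bytes([0x01, 0x07]) + struct.pack("<H", len(payload)) + payload
frame = UBX_SYNC + body + ubx_checksum(body)

msg_class, msg_id, parsed = parse_ubx_frame(frame)
print(hex(msg_class), hex(msg_id), nav_pvt_datetime(parsed))
```

When reading from a file, open it in binary mode (`open(path, "rb")`) so Python hands back bytes and never attempts a text decode.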

Can't read data using read_csv due to encoding errors

So, I am facing a huge issue. I am trying to read a CSV file that uses '|' as the delimiter. If I use utf-8 or utf-8-sig as the encoding, I get the following error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 0: invalid start byte
But if I use the unicode_escape encoding, I get this error:
UnicodeDecodeError: 'unicodeescape' codec can't decode byte 0x5c in position 13: \ at end of string
Is it an issue with the dataset?
It worked after I used 'Save with Encoding - UTF-8' in the Sublime Text editor. I think the data had some issues.
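
The same re-save can be sketched in Python. This assumes the file's real encoding is known; cp1252 is used below purely as an example, since the question does not say what the file was written in. The helper name `reencode_to_utf8` and the sample file contents are made up.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8(src: Path, dst: Path, source_encoding: str) -> None:
    """Decode src with its actual encoding and write the same text as UTF-8."""
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "legacy.csv"
    dst = Path(tmp) / "utf8.csv"
    # Hypothetical pipe-delimited file whose first byte is 0xc0 in cp1252.
    src.write_bytes("\u00c0|b\n1|2\n".encode("cp1252"))
    reencode_to_utf8(src, dst, "cp1252")
    converted = dst.read_text(encoding="utf-8")

print(ascii(converted))
```

Alternatively, passing the real encoding straight to read_csv through its `encoding` parameter avoids the intermediate file.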

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 388: invalid continuation byte

I am a real beginner at Python, and I have spent hours on this line; I can't get anywhere without fixing it.
cadastro_2019_10= pd.read_csv("inf_cadastral_fi_20191015.csv",delimiter=";")[["CNPJ_FUNDO","DENOM_SOCIAL","CLASSE"]]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 49: invalid continuation byte
cadastro_2019_10= pd.read_csv("inf_cadastral_fi_20191015.csv",delimiter=";")[["CNPJ_FUNDO","DENOM_SOCIAL","CLASSE"]]
Again:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 388: invalid continuation byte
Figure out what encoding the CSV file uses. It seems it doesn't use UTF-8. If it is, say, latin1, you can try read_csv(..., encoding="latin1").
If you are on a UNIX system, you can use the file command to try to detect the encoding.
I found that I had to add encoding='cp1252'.
But thank you for your time.
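
A minimal sketch of trying a few candidate encodings on the raw bytes before calling read_csv. The helper name `first_working_encoding`, the candidate list, and the sample row are made up for illustration. latin1 accepts every byte sequence, so it goes last, and a successful decode only means the bytes are valid in that encoding, not that the text is right.

```python
def first_working_encoding(raw: bytes, candidates=("utf-8", "cp1252", "latin1")):
    """Return (encoding, text) for the first candidate that decodes raw without error."""
    for encoding in candidates:
        try:
            return encoding, raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Hypothetical row containing byte 0xc9 (the letter É in cp1252 and latin1).
raw = "CNPJ_FUNDO;DENOM_SOCIAL;CLASSE\n00;CR\u00c9DITO PRIVADO;Fundo\n".encode("cp1252")

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 fails:", exc.reason)

encoding, text = first_working_encoding(raw)
print(encoding, ascii(text.splitlines()[1]))
```

The encoding that works can then be passed to read_csv as `encoding=...`, as in the answer above.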

UnicodeDecode 'ascii' error on end of file

I'm getting a UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 24245: ordinal not in range(128) on multiple files, and in every case the reported position is basically the end of the file. chardet.detect() gives me ASCII as the codec with 1.0 confidence.
Does anyone know what encoding this should probably be? The file was written on Windows, so I assume that has something to do with it.
Edit: Removed hex dump.
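
A minimal sketch for looking at the bytes around the first non-ASCII byte and comparing how a few Windows and DOS code pages render them. The helper names, the candidate list, and the sample data are made up. Byte 0x84 is „ in cp1252 and ä in cp437 and cp850, so the surrounding text may show which reading makes sense.

```python
def first_non_ascii(raw: bytes):
    """Return the offset of the first byte >= 0x80, or None if raw is pure ASCII."""
    for offset, byte in enumerate(raw):
        if byte >= 0x80:
            return offset
    return None


def render_candidates(window: bytes, encodings=("cp1252", "cp437", "cp850")):
    """Decode the same bytes with several code pages for side-by-side comparison."""
    return {enc: window.decode(enc, errors="replace") for enc in encodings}


# Hypothetical file content: ASCII almost everywhere, one 0x84 near the end.
raw = b"plain ascii text " * 3 + b"M\x84dchen"

offset = first_non_ascii(raw)
window = raw[max(0, offset - 8):offset + 8]
guesses = render_candidates(window)
for name, rendered in guesses.items():
    print(offset, name, ascii(rendered))
```

A file that is ASCII except for a few bytes at the very end gives a detector little evidence to work with, which may be why chardet reports ASCII here.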
