Decode request's response in python - python

The response to my request looks like this:
alle_R={'items': [{'x':..., 'y'..., ...}, {...}...]}
I am trying to access individual elements, for example reading 'x' with alle_R['items'][4]['x'], but I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 1022: invalid start byte
I tried out this solution:
alle_R=str(alle_R)
encode_alle_R = codecs.encode(alle_R, 'utf-8')
(with codecs imported, of course), but this made no difference. Writing the data to a JSON file raises the same error.
Do you have any idea how I can access my elements?
Thanks a lot!
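Byte 0xfc is "ü" in both Latin-1 and cp1252, so one plausible cause is that the response body uses one of those encodings rather than UTF-8. Calling str() on the already-parsed object and re-encoding it cannot help, because the failure happens when the raw bytes are decoded. Below is a minimal sketch, assuming the raw bytes are available and that cp1252 is the right guess; raw_body and parse_json_bytes are hypothetical names introduced for illustration.

```python
import json

# Hypothetical stand-in for the raw response bytes. A real body would come
# from the HTTP client; the cp1252 encoding is an assumption, not a given.
raw_body = '{"items": [{"x": "Müller", "y": 1}]}'.encode("cp1252")


def parse_json_bytes(raw, encoding="cp1252"):
    """Decode raw bytes with an explicit codec, then parse the JSON text."""
    return json.loads(raw.decode(encoding))


alle_R = parse_json_bytes(raw_body)
print(alle_R["items"][0]["x"])
```

If the server declares a charset in its Content-Type header, that value is a better choice than a hard-coded guess.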

Related

How to decode python byte code to ASCII? (Selenium. Getting xml from network response)

How do I decode Python bytes to ASCII?
I extract data from a network response with Selenium and should get XML.
Getting: ['b'\xa5\xff\xff\xc7\x88\xe4\xb4\xd7\x03\xa0\x11:|\xce\xdb\xb7\x0f\xf1\xdf\xfc\x1f\xdb\x93\x91^\xbc\xa3\xdd\xc2\x02V\x00\xba$\xbd\x10\xd2\xd0E\xf2\x90\xb6\xca\xee\x10\xbf\xbf_\xbf\xfc\xef?\xe9\x13{H\xf1\xa1\xa0\x00\x1c\x01(\x80\x1c\x81\x02(s\xe7Z\xf3\xb3N\xf5L\xdc>\xe7\x8f\xbbwl\xbf\x99\x91\xd4O\xde\xb4,\xf3PH\x02L1\x00\xc98\xc3,\x13!\x82\xc6\xc2\xa6Bd"k\xcb\x9d(\xb9\x13%WQr\x15%W\xb1\xe5J\t\x9e:\x8a\x03\x99\x06H\xd0\x8f\xd8\xfe\x9f9\xbc\xfc\x157\x111\xd7\x15\xaab\xfb\xe8;\xab\xee\xfc\x9b\xeeu\x10<d\x04\x06Y\xa8\xd7\x9f\x11...
Code:
...
for request in driver.requests:
    if request.response:
        text_file.write(str(request.response.body))
I've tried:
decoded = request.response.body.decode('ascii')
as well as request.response.body.decode('utf-8') and the cp1251/cp1252 codecs.
I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa5 in position 0: ordinal not in range(128)
The response should be XML (~1.5 MB), as in the attached photo.
If I use:
decoded = base64.b64decode(request.response.body)
I get something like: b'T#\x00\xad\x9a\xb5\xba\xfa3u\xca\x84PG\xbd\x8a\xab\x1f\xcdcJ%\r\xd4\xff\x0c$)\x9a>.... which is not what I expected.
Combining the two, decoded = base64.b64decode(request.response.body).decode('ascii'), doesn't help either:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 0: ordinal not in range(128)
Help me, please.
It's because of the header 'Content-Encoding': 'br'.
Installing brotli helped. Also deleting
This message helped a lot
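A body sent with 'Content-Encoding': 'br' is Brotli-compressed, so it has to be decompressed before any text codec is applied. The following is a rough sketch of that order of operations, not the library's own mechanism: decompress_body is a hypothetical helper, and the brotli import assumes the third-party brotli package is installed. The self-check uses gzip so that it runs with the standard library alone.

```python
import gzip


def decompress_body(body, content_encoding):
    """Undo the compression named by the Content-Encoding header value."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "br":
        # Third-party package, assumed installed (pip install brotli).
        import brotli
        return brotli.decompress(body)
    if encoding == "gzip":
        return gzip.decompress(body)
    # No (or unrecognised) encoding: hand the bytes back unchanged.
    return body


# Self-check with gzip, which needs only the standard library.
sample = gzip.compress("<root>ok</root>".encode("utf-8"))
xml_text = decompress_body(sample, "gzip").decode("utf-8")
print(xml_text)
```

In the loop from the question, the body would be passed through such a helper first, using the response's Content-Encoding header, and only then decoded as UTF-8 and written to the file.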

How to fix UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d : character maps to <undefined>?

I am using curl to fetch data.
import os
cmd = "curl --data \"action=getdata\" https://localhost:8070"
print(cmd)
data = os.popen(cmd).read()
The last line raises UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 565334: character maps to <undefined>.
When I debugged with breakpoints, I saw that os.popen produces a large body of text and that the error arises in read(), inside the IncrementalDecoder class in cp1252.py. I tried
data = os.popen(cmd).read().encode('utf-8').decode('ascii')
and
data = os.popen(cmd).read().encode().decode('utf-8')
But the error persists. How can we solve this?
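os.popen reads the pipe in text mode and decodes it with the locale's preferred encoding (cp1252 here), so the exception is raised inside read() before any chained .encode() or .decode() can run. One possible approach is to capture the output as bytes and decode it explicitly. The sketch below assumes the server actually sends UTF-8, which the question does not confirm; the child Python process is only a stand-in for the curl command.

```python
import subprocess
import sys

# Stand-in for the curl command: a child process that writes UTF-8 text plus
# a stray 0x9d byte, which cp1252 cannot decode.
cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'caf\\xc3\\xa9 \\x9d end')",
]

# capture_output=True without text=True returns stdout as raw bytes.
result = subprocess.run(cmd, capture_output=True, check=True)

# Decode explicitly; errors='replace' turns undecodable bytes into U+FFFD
# instead of raising.
data = result.stdout.decode("utf-8", errors="replace")
print(data)
```

With the real command, cmd would be a list such as ["curl", "--data", "action=getdata", url], which also avoids the shell quoting in the original string.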

I got an error decoding from binary to ascii

When I use the following code:
import requests
def googleSearch(qu):
    with requests.session() as c:
        url = 'https://www.google.com'
        qu = {'q': qu}
        urllink = requests.get(url, params=qu)
        x=urllink.url
        return x
x=googleSearch('translation')
print(x)
import urllib.request
site=urllib.request.urlopen(x)
bytes=site.read()
"artificial limit of size: "
"bytes=bytes[0:6000]"
text=bytes.decode("utf8")
print (text)
I got the following errors (running the program again and again):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 6116: invalid continuation byte
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 6143: invalid continuation byte
etc.
So I suppose the "site" file is too big.
When I limit the size of the file to 6000 bytes, there is no error.
What is happening? Should I slice the file and treat each slice separately?
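The reported positions (6116, 6143) all lie beyond byte 6000, so truncating simply cuts the data off before the first byte that is not valid UTF-8; file size is not the cause. A likely explanation is that the page is served in a different charset, for instance Latin-1, where 0xed is "í". The sketch below reads the charset from a Content-Type header value instead of assuming UTF-8; decode_page is a hypothetical helper, and the header string is made up for the demonstration.

```python
from email.message import Message


def decode_page(raw, content_type, fallback="utf-8"):
    """Decode raw bytes using the charset named in a Content-Type header."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    charset = msg.get_content_charset() or fallback
    return raw.decode(charset, errors="replace")


# 'í' is the single byte 0xed in Latin-1, the byte named in the error above.
raw = "día".encode("latin-1")
print(decode_page(raw, "text/html; charset=ISO-8859-1"))
```

With urllib, the object returned by urlopen exposes the same method on its headers, so site.headers.get_content_charset() can supply the charset directly.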

How to handle the network message with unicode that is not decodeable to utf-8

I receive the following byte message over a socket connection and want to convert it into a string for further processing. I am using Python 3.7.
Below is the code I have tried so far:
import codecs
a = b'0400F224648188E0801200000040000000001941678904000010237890000000000000222220418151856038556051259950760020806002468060046010403319 HSBCBSB8001101234567890MC 100 WITH ORDERIN FO AU009006Q\x00\x00\x00\x83\x00007\xa0\x00\x00\x00\x00%\x02010003855604181518562468000000000460100000'
b= codecs.decode(a, 'utf-8')
print(b)
I am getting the error below:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 208: invalid start byte
How can I convert the data to a string and process it further?
Thanks in advance
Your data is not UTF-8 encoded. You can use BeautifulSoup (bs4) to decode unknown encodings:
from bs4 import BeautifulSoup
soup = BeautifulSoup(b'0400F224648188E0801200000040000000001941678904000010237890000000000000222220418151856038556051259950760020806002468060046010403319 HSBCBSB8001101234567890MC 100 WITH ORDERIN FO AU009006Q\x00\x00\x00\x83\x00007\xa0\x00\x00\x00\x00%\x02010003855604181518562468000000000460100000'
)
print(soup.contents[0])
print(soup.original_encoding)
to get
0400F224648188E0801200000040000 ... # etc
and
windows-1252
You can use the bs4 detector separately as well: UnicodeDammit. You can also give it suggestions about which encodings to try first, or not to try, to fine-tune it.
More info on SO:
How to determine the encoding of text?
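The message mixes printable ASCII with raw binary bytes, so a detected encoding may produce a string without the binary part being meaningful text. Two standard-library alternatives are sketched below, under the assumption that the goal is a lossless or at least inspectable string: Latin-1 maps every byte to one character and round-trips exactly, while ASCII with backslashreplace keeps the readable part and shows other bytes as escapes. The sample is a shortened excerpt of the message above.

```python
# Shortened excerpt of the message from the question.
raw = b"MC 100 WITH ORDERIN FO AU009006Q\x00\x00\x00\x83\x00007\xa0\x00"

# Latin-1 maps each byte 0x00-0xFF to the code point with the same value,
# so decoding never fails and encoding back restores the original bytes.
text = raw.decode("latin-1")
assert text.encode("latin-1") == raw

# Alternative: keep the ASCII part and render every non-ASCII byte as a
# visible \xNN escape (supported for decoding since Python 3.5).
readable = raw.decode("ascii", errors="backslashreplace")
print(repr(readable))
```

If the binary fields carry structured values, slicing the original bytes object by field position before decoding only the text fields is probably the safer route.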

Anyone know how to fix a unicode error?

I am using Google App Engine for Python, but I get a Unicode error. Is there a way to work around it?
Here is my code:
def get(self):
    contents = db.GqlQuery("SELECT * FROM Content ORDER BY created DESC")
    output = StringIO.StringIO()
    with zipfile.ZipFile(output, 'w') as myzip:
        for content in contents:
            if content.code:
                code=content.code
            else:
                code=content.code2
            myzip.writestr("udacity_code", code)
    self.response.headers["Content-Type"] = "application/zip"
    self.response.headers['Content-Disposition'] = "attachment; filename=test.zip"
    self.response.out.write(output.getvalue())
I now get a unicode error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf7 in position 12: ordinal not in range(128)
I believe it is coming from output.getvalue()... Is there a way to fix this?
@Areke Ignacio's answer is the fix. For a brief walkthrough, here is a post I wrote recently, "Python and Unicode Punjabi": https://www.pippallabs.com/blog/python-and-unicode-panjabi
I had the exact same issue.
In the end I solved it by changing the call to writestr from
myzip.writestr("udacity_code", code)
to
myzip.writestr("udacity_code", code.encode('utf-8'))
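The question's code targets Python 2 (StringIO.StringIO). As a rough Python 3 sketch of the same idea, the archive can be built in an io.BytesIO buffer with each text entry encoded to UTF-8 before it is written. build_zip and the sample entry are hypothetical; the App Engine handler and datastore query are left out.

```python
import io
import zipfile


def build_zip(entries):
    """Return zip archive bytes for a mapping of archive name -> text."""
    output = io.BytesIO()
    with zipfile.ZipFile(output, "w") as myzip:
        for name, code in entries.items():
            # Encode explicitly so the archive always receives bytes.
            myzip.writestr(name, code.encode("utf-8"))
    return output.getvalue()


# Sample entry containing a non-ASCII character (U+00F7, the division sign).
archive = build_zip({"udacity_code": "result = 10 \u00f7 2"})

# Read the entry back to confirm the text survives the round trip.
with zipfile.ZipFile(io.BytesIO(archive)) as check:
    restored = check.read("udacity_code").decode("utf-8")
print(restored)
```

The returned bytes can then be written to the response unchanged, since nothing in the buffer is text any more.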
From this link:
Python UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 ordinal not in range(128)
However in the meantime your problem is that your templates are ASCII
but your data is not (can't tell if it's utf-8 or unicode). Easy
solution is to prefix each template string with u to make it Unicode.
