Anyone know how to fix a unicode error?

I am using Google App Engine for Python and I get a Unicode error. Is there a way to work around it?
Here is my code:
def get(self):
    contents = db.GqlQuery("SELECT * FROM Content ORDER BY created DESC")
    output = StringIO.StringIO()
    with zipfile.ZipFile(output, 'w') as myzip:
        for content in contents:
            if content.code:
                code = content.code
            else:
                code = content.code2
            myzip.writestr("udacity_code", code)
    self.response.headers["Content-Type"] = "application/zip"
    self.response.headers['Content-Disposition'] = "attachment; filename=test.zip"
    self.response.out.write(output.getvalue())
It fails with a Unicode error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf7 in position 12: ordinal not in range(128)
I believe it comes from output.getvalue(). Is there a way to fix this?

@Areke Ignacio's answer is the fix. For a brief walkthrough, here is a post I wrote recently, "Python and Unicode Punjabi": https://www.pippallabs.com/blog/python-and-unicode-panjabi

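I had the exact same issue.
In the end I solved it by changing the call to writestr from
myzip.writestr("udacity_code", code)
to
myzip.writestr("udacity_code", code.encode('utf-8'))
This most likely works because the datastore returns code as a unicode object. Once it is written, the StringIO buffer holds a mix of unicode and byte strings, and getvalue() fails when it joins them and has to decode non-ASCII bytes as ASCII. Encoding to UTF-8 first keeps the buffer bytes-only.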
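The change can be tried outside App Engine with a small script. This is only a sketch under assumptions, not the original handler: it targets Python 3, uses io.BytesIO in place of StringIO.StringIO, and the helper name build_zip and the sample text are made up for illustration.

```python
import io
import zipfile


def build_zip(entries):
    """Return the bytes of a zip archive built in memory.

    `entries` maps archive member names to text. Each text value is
    encoded to UTF-8 explicitly, so the archive only ever receives bytes.
    """
    output = io.BytesIO()
    with zipfile.ZipFile(output, "w") as myzip:
        for name, text in entries.items():
            myzip.writestr(name, text.encode("utf-8"))
    return output.getvalue()


# Hypothetical sample data containing a non-ASCII character (the division sign).
archive_bytes = build_zip({"udacity_code": "x = 10 \u00f7 2"})

# Reading the member back and decoding it recovers the original text.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as myzip:
    restored = myzip.read("udacity_code").decode("utf-8")
```

The point of the sketch is that the text is turned into bytes at one explicit place, with a named encoding, instead of leaving the conversion to the default codec.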

From the question "Python UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 ordinal not in range(128)":
However, in the meantime your problem is that your templates are ASCII but your data is not (I can't tell whether it is UTF-8 or unicode). The easy solution is to prefix each template string with u to make it Unicode.

Related

How to decode python byte code to ASCII? (Selenium. Getting xml from network response)

How do I decode a Python bytes object to ASCII?
I extract data from a network response with Selenium. The result should be XML.
Getting: ['b'\xa5\xff\xff\xc7\x88\xe4\xb4\xd7\x03\xa0\x11:|\xce\xdb\xb7\x0f\xf1\xdf\xfc\x1f\xdb\x93\x91^\xbc\xa3\xdd\xc2\x02V\x00\xba$\xbd\x10\xd2\xd0E\xf2\x90\xb6\xca\xee\x10\xbf\xbf_\xbf\xfc\xef?\xe9\x13{H\xf1\xa1\xa0\x00\x1c\x01(\x80\x1c\x81\x02(s\xe7Z\xf3\xb3N\xf5L\xdc>\xe7\x8f\xbbwl\xbf\x99\x91\xd4O\xde\xb4,\xf3PH\x02L1\x00\xc98\xc3,\x13!\x82\xc6\xc2\xa6Bd"k\xcb\x9d(\xb9\x13%WQr\x15%W\xb1\xe5J\t\x9e:\x8a\x03\x99\x06H\xd0\x8f\xd8\xfe\x9f9\xbc\xfc\x157\x111\xd7\x15\xaab\xfb\xe8;\xab\xee\xfc\x9b\xeeu\x10<d\x04\x06Y\xa8\xd7\x9f\x11...
Code:
...
for request in driver.requests:
    if request.response:
        text_file.write(str(request.response.body))
I've tried:
decoded = request.response.body.decode('ascii')
I also tried request.response.body.decode('utf-8') and the cp1251 and cp1252 codecs.
I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa5 in position 0: ordinal not in range(128)
The response should be XML (about 1.5 MB).
If I use:
decoded = base64.b64decode(request.response.body)
I get something like: b'T#\x00\xad\x9a\xb5\xba\xfa3u\xca\x84PG\xbd\x8a\xab\x1f\xcdcJ%\r\xd4\xff\x0c$)\x9a>.... which is not what I expected.
Combining the two as decoded = base64.b64decode(request.response.body).decode('ascii') doesn't help either:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 0: ordinal not in range(128)
Please help.

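It's because of the response header 'Content-Encoding': 'br'. The body is Brotli-compressed, so it has to be decompressed before it can be decoded as text.
Installing brotli helped. Also deleting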
This message helped a lot
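As a rough sketch of what that answer implies: decompress the body according to its Content-Encoding first, and only then decode it as text. The helper name decode_body and the sample XML are made up for illustration. The 'br' branch assumes the third-party brotli package is installed, and the demonstration uses gzip so that it runs with the standard library alone.

```python
import gzip


def decode_body(body, content_encoding, charset="utf-8"):
    """Decompress a raw HTTP body according to Content-Encoding, then decode it."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "br":
        # Assumes the third-party `brotli` package (pip install brotli).
        import brotli
        body = brotli.decompress(body)
    elif encoding == "gzip":
        body = gzip.decompress(body)
    # Any other value (or no header at all) is treated as uncompressed here.
    return body.decode(charset)


# Stand-in for a compressed XML response; gzip keeps the demo stdlib-only.
raw = gzip.compress("<root>данные</root>".encode("utf-8"))
xml_text = decode_body(raw, "gzip")
```

In the question's loop, the header value would come from the response's headers. How to read it depends on the library in use, so that part is left out.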

Decode request's response in python

The response to my request looks like this:
alle_R={'items': [{'x':..., 'y'..., ...}, {...}...]}
I am trying to work with it, for example to read 'x' via alle_R['items'][4]['x'], but I get this error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 1022: invalid start byte
I tried this solution:
alle_R=str(alle_R)
encode_alle_R = codecs.encode(alle_R, 'utf-8')
with codecs imported, of course. It made no difference, and writing the data to a JSON file raises the same error.
Do you have any idea how I can access my elements?
Thank you very much!

UnicodeEncodeError / python3

I can't solve a typical encoding issue. Cyrillic text is received via POST and this error is raised:
'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
The key is Cyrillic (Russian) text; its repr is u'\u043f\u0440\u043e'
After hitting the error I tried this, among other things:
key = key.decode('ascii').encode('utf8')
or:
key = key.decode('ascii')
Locally it works; the error is raised only in production. The Python system encoding in production is utf8.
EDIT: To clear things up, the error is raised in the form handler function (again, it works locally but not in production):
def search(request):
    if request.method == 'POST':
        key = request.POST.get("key")
        if key is not None:
            ..
So it's a str received from the input form, and the error is first raised at this point. I assumed it had to be decoded, but that didn't help.
More of the traceback:
UnicodeEncodeError at /search/
'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

Django 1.4 - django.db.models.FileField.save(filename, file, save=True) produces error with non-ascii filename

I'm building a file upload feature using django.db.models.FileField in Django 1.4.
When I try to upload a file whose name includes non-ASCII characters, it produces the error below.
'ascii' codec can't encode characters in position 109-115: ordinal not in range(128)
The actual code looks like this:
file = models.FileField(_("file"),
                        max_length=512,
                        upload_to=os.path.join('uploaded', 'files', '%Y', '%m', '%d'))
# The next line produces the error above if 'filename' includes a non-ASCII character
file.save(filename, file, save=True)
If I use unicode(filename, 'utf-8') instead of filename, it produces the error below:
TypeError: decoding Unicode is not supported
How can I upload a file whose name has non-ASCII characters?
My environment:
sys.getdefaultencoding() : 'ascii'
sys.getfilesystemencoding() : 'UTF-8'
using Django-1.4.10-py2.7.egg

You need to use .encode() to encode the string:
file.save(filename.encode('utf-8', 'ignore'), file, save=True)

In your FileField definition, the upload_to argument could be written as os.path.join(u'uploaded', 'files', '%Y', '%m', '%d')
(note that the first component, u'uploaded', starts with u). The joined path will then be of type unicode, which may help.

Insert record of utf-8 character (Chinese, Arabic, Japanese.. etc) into GAE datastore programatically with python

I want to build a simple UI translation feature in GAE (using the Python SDK).
def add_translation(self, pid=None):
    trans = Translation()
    trans.tlang = db.Key("agtwaW1kZXNpZ25lcnITCxILQXBwTGFuZ3VhZ2UY8aIEDA")
    trans.ttype = "UI"
    trans.transid = "ui-about"
    trans.content = "关于我们"
    trans.put()
This results in an encoding error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)
How do I correctly insert content that contains Unicode (UTF-8) characters?

Use the u notation:
>>> s=u"关于我们"
>>> print s
关于我们
Or state the encoding explicitly:
>>> s=unicode('אדם מתן', 'utf8')
>>> print s
אדם מתן
Read more in the Unicode HOWTO in the Python documentation.
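The two forms above are Python 2 syntax. For comparison, here is a rough Python 3 equivalent; it is a sketch that assumes the source file is saved as UTF-8, and the variable names are illustrative only.

```python
# In Python 3 every str literal is already Unicode, so no u prefix is needed.
s = "关于我们"

# Bytes that arrive from outside (a file, a socket) still have to be decoded
# with an explicit encoding, which mirrors unicode(raw, 'utf8') in Python 2.
raw = "אדם מתן".encode("utf8")  # stand-in for bytes read from elsewhere
name = raw.decode("utf8")

print(s)
print(name)
```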
