UnicodeDecodeError when I set recursive=True in FilePathField in django - python

The error happens when I add a record in the admin. My model looks like this:
import os

from django.contrib import admin
from django.db import models

class THMusic(models.Model):
    originmusic = models.ForeignKey(Originalmusic)
    name = models.CharField(max_length=100)
    audiofile = models.FilePathField(path=os.path.dirname(os.path.abspath(__file__))+'/resource', recursive=True)
    team = models.CharField(max_length=50)
    album = models.CharField(max_length=50)

class THMusicAdmin(admin.ModelAdmin):
    list_display = ('name', 'originmusic', )
But I get UnicodeDecodeError:
UnicodeDecodeError at /admin/thmusic/thmusic/add/
'ascii' codec can't decode byte 0xe3 in position 54: ordinal not in range(128)
Request Method: GET
Request URL: http://127.0.0.1:8006/admin/thmusic/thmusic/add/
Django Version: 1.6.5
Exception Type: UnicodeDecodeError
Exception Value:
'ascii' codec can't decode byte 0xe3 in position 54: ordinal not in range(128)
Exception Location: /usr/local/lib/python2.7/dist-packages/django/forms/fields.py in __init__, line 1057
Python Executable: /usr/bin/python2.7
Python Version: 2.7.6
Python Path:
['/home/hyzhappy/djangoprojects/myblog',
'/usr/local/lib/python2.7/dist-packages/simplejson-3.6.3-py2.7-linux-i686.egg',
'/home/hyzhappy/djangoprojects/myblog',
'/usr/lib/python2.7',
'/usr/lib/python2.7/plat-i386-linux-gnu',
'/usr/lib/python2.7/lib-tk',
'/usr/lib/python2.7/lib-old',
'/usr/lib/python2.7/lib-dynload',
'/usr/local/lib/python2.7/dist-packages',
'/usr/local/lib/python2.7/dist-packages/PIL',
'/usr/lib/python2.7/dist-packages',
'/usr/lib/python2.7/dist-packages/PILcompat',
'/usr/lib/python2.7/dist-packages/gst-0.10',
'/usr/lib/python2.7/dist-packages/gtk-2.0',
'/usr/lib/pymodules/python2.7',
'/usr/lib/python2.7/dist-packages/ubuntu-sso-client',
'/usr/lib/python2.7/dist-packages/wx-2.8-gtk2-unicode']
If I remove "recursive=True" from the FilePathField, it works, but that is not what I want.
More detail
Unicode error hint
The string that could not be encoded/decoded was: urce/������
UnicodeDecodeError('ascii', '/home/hyzhappy/djangoprojects/myblog/thmusic/resource/\xe3\x83\x97\xe3\x83\xac\xe3\x82\xa4\xe3\x83\xa4\xe3\x83\xbc\xe3\x82\xba\xe3\x82\xb9\xe3\x82\xb3\xe3\x82\xa2.mp3', 54, 55, 'ordinal not in range(128)')

This is probably not the right answer, but you can check it as a possible cause.
Sometimes in Django (I had the same case with tastypie), a UnicodeDecodeError occurs because of an unrelated initial error that Django tries to display but cannot, because of an unusual character.
In my case, it was because I had put French accents (non-ASCII) in my code comments. This usually does not cause errors, but when there is an actual error somewhere and Django tries to display it in debug mode, it cannot because of the non-ASCII characters, and it raises a UnicodeDecodeError instead of the initial error.
Try removing any suspicious characters from your code (if there are any), and check whether the error is still the same.

Under the hood, FilePathField uses os.walk/os.listdir to find the files. In Python 2 these functions have two "modes" of operation: byte string and unicode. The mode is selected by the type of the path you pass in: if you give it a unicode string, the resulting file list will be unicode; otherwise it will be byte strings.
So, to fix this problem, pass a unicode path to the FilePathField, like this (note the u prefix on u'/resource'):
audiofile = models.FilePathField(path=os.path.dirname(os.path.abspath(__file__))+u'/resource', recursive=True)
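The two modes can be seen directly with os.listdir. The following is only a rough sketch, not Django's own code: the helper name list_both_ways, the temporary directory and the file name track.mp3 are all made up for illustration. It lists the same directory once with a byte-string path and once with a text path and keeps both results.

```python
import os
import shutil
import tempfile


def list_both_ways(directory):
    """Return (names from a byte-string path, names from a text path)."""
    # Normalise to text first, then derive the byte-string form from it.
    if isinstance(directory, bytes):
        text_path = directory.decode('utf-8')
    else:
        text_path = directory
    byte_path = text_path.encode('utf-8')
    return os.listdir(byte_path), os.listdir(text_path)


# Hypothetical demo directory containing a single empty file.
demo_dir = tempfile.mkdtemp()
try:
    open(os.path.join(demo_dir, 'track.mp3'), 'w').close()
    byte_names, text_names = list_both_ways(demo_dir)
finally:
    shutil.rmtree(demo_dir)

# The type of each returned name follows the type of the path argument.
print(type(byte_names[0]), type(text_names[0]))
```

The names in the second list are text, so no implicit ASCII decoding is needed later when they are combined with other text.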

Related

UnicodeEncodeError / python3

I can't solve a typical encoding issue. Cyrillic text is received via POST and this error is raised:
'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
The key itself looks like this, and it should be Cyrillic (Russian) text: u'\u043f\u0440\u043e'
After that error I tried this and a few other variants:
key = key.decode('ascii').encode('utf8')
or:
key = key.decode('ascii')
It works locally; the error is raised in production only. The Python system encoding in production is utf8.
EDIT: to clear things up, the error is raised in the form handler function (again, it works locally but not in production):
def search(request):
    if request.method == 'POST':
        key = request.POST.get("key")
        if key is not None:
            ..
So it's a str received from the input form, and the error is first raised at this point. I assumed it had to be decoded, but that didn't help.
More of the traceback:
UnicodeEncodeError at /search/
'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

UnicodeEncodeError with nginx and django

I've tried this SO answer; this doc is inapplicable because I'm running nginx. I've added charset utf-8; to my nginx config, and I'm still getting this error.
A summarised traceback is here:
UnicodeEncodeError at /
'ascii' codec can't encode character u'\xe1' in position 69: ordinal not in range(128)
Request Method: GET
Request URL: http://django/
Django Version: 1.4.20
Exception Type: UnicodeEncodeError
Exception Value:
'ascii' codec can't encode character u'\xe1' in position 69: ordinal not in range(128)
Exception Location: /opt/envs/venv/lib/python2.7/genericpath.py in getmtime, line 54
Unicode error hint
The string that could not be encoded/decoded was: choacán.jpg
I think this error is not about nginx. It's on the file creation step.
Python uses system locale when saving files.
Check your system locale:
$ python manage.py shell
> import os
> print os.popen("locale").read()
If it's incorrect you should set system locale.
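As a complement to the locale command, the encodings Python itself has picked up can be inspected from the same shell. This is a small sketch using only the standard library; the helper name encoding_report is made up for illustration.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python derived from the environment."""
    return {
        'default': sys.getdefaultencoding(),
        'filesystem': sys.getfilesystemencoding(),
        'preferred': locale.getpreferredencoding(),
    }


for name, value in sorted(encoding_report().items()):
    print('{0}: {1}'.format(name, value))
```

If the filesystem or preferred encoding comes back as ASCII (often reported as ANSI_X3.4-1968), the process was most likely started without a UTF-8 locale.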
That said, filenames like this can cause all sorts of trouble for users. Consider defining a custom file storage for models.FileField and generating a random file name for every file; it's good practice.
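One lightweight way to get random names, shown here instead of a full custom storage class, is to pass a callable as upload_to; Django calls it with the model instance and the original filename. The sketch below is only an illustration: the function name random_upload_name and the 'uploads/' prefix are assumptions, not something from the question.

```python
import os
import uuid


def random_upload_name(instance, filename):
    """Return a storage path with a random stem, keeping only the extension."""
    extension = os.path.splitext(filename)[1].lower()
    # uuid4().hex is 32 lowercase hex characters, so the stem is always ASCII.
    return 'uploads/{0}{1}'.format(uuid.uuid4().hex, extension)
```

It would be wired up as models.FileField(upload_to=random_upload_name). The original name can be kept in a separate CharField if users need to see it.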

Django 1.4 - django.db.models.FileField.save(filename, file, save=True) produces error with non-ascii filename

I'm making a fileupload feature using django.db.models.FileField of Django 1.4
When I try to upload a file whose name includes non-ASCII characters, it produces the error below.
'ascii' codec can't encode characters in position 109-115: ordinal not in range(128)
The actual code is like below
file = models.FileField(_("file"),
                        max_length=512,
                        upload_to=os.path.join('uploaded', 'files', '%Y', '%m', '%d'))
file.save(filename, file, save=True)  # <- This line produces the error above if 'filename' includes a non-ASCII character
If I try to use unicode(filename, 'utf-8') instead of filename, it produces the error below
TypeError: decoding Unicode is not supported
How can I upload a file whose name has non-ascii characters?
Info of my environment:
sys.getdefaultencoding() : 'ascii'
sys.getfilesystemencoding() : 'UTF-8'
using Django-1.4.10-py2.7.egg
You need to use .encode() to encode the string:
file.save(filename.encode('utf-8', 'ignore'), file, save=True)
In your FileField definition, the upload_to argument could be written as os.path.join(u'uploaded', 'files', '%Y', '%m', '%d')
(note the u prefix on the first component, u'uploaded'), so that the joined path is a unicode string; this may help.
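The "decoding Unicode is not supported" TypeError suggests that filename is already a unicode object, so it must not be decoded a second time. A minimal sketch of a helper that decodes only byte strings and passes text through unchanged; the name to_text and the UTF-8 default are assumptions for illustration.

```python
def to_text(value, encoding='utf-8'):
    """Decode byte strings to text; return text values unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Both spellings of the same hypothetical filename end up as identical text.
from_bytes = to_text(b'caf\xc3\xa9.txt')
from_text = to_text(u'caf\xe9.txt')
```

With the name normalised this way, a single explicit .encode('utf-8') can be applied at the point where a byte string is actually required.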

How can I solve UnicodeDecodeError in Django?

I am getting this error in Django:
UnicodeDecodeError at /category/list/
'utf8' codec can't decode byte 0xf5 in position 7: invalid start byte
Request Method: GET
Request URL: ...
Django Version: 1.3.1
Exception Type: UnicodeDecodeError
Exception Value:
'utf8' codec can't decode byte 0xf5 in position 7: invalid start byte
Exception Location: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py in iterencode, line 264
...
I should save Turkish characters in the database. How can I fix this error?
A start byte of 0xf5 would nominally begin a 4-byte UTF-8 sequence, but no valid UTF-8 sequence starts with 0xf5, which is why the decoder reports an invalid start byte. One strong possibility is that the input isn't UTF-8 at all but ISO-8859-9, the Turkish ISO encoding. In that code page, 0xf5 is a lowercase o with tilde (õ).
The code below solved my problem. Thank you.
if isinstance(encObject, unicode):
    myStr = encObject.encode('utf-8')
See http://www.fileformat.info/info/unicode/char/f5/index.htm: it is an o with a tilde.
Try:
some_string.decode('latin1','replace')
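To make the difference between the candidate encodings concrete, here is a short sketch. The two input bytes are hypothetical, chosen for illustration rather than taken from the question: 0xf5 decodes the same way in Latin-1 and ISO-8859-9, while 0xfd is one of the positions where the Turkish code page differs.

```python
raw = b'\xf5\xfd'  # hypothetical input bytes, not taken from the question

# UTF-8 rejects the data outright because 0xf5 is not a valid start byte.
try:
    raw.decode('utf-8')
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# ISO-8859-9 (Turkish) maps 0xfd to dotless i (U+0131).
as_turkish = raw.decode('iso-8859-9')
# Latin-1 maps the same byte to y with acute (U+00FD).
as_latin1 = raw.decode('latin1')
```

Decoding with 'latin1' never fails, so a successful decode alone does not prove that the encoding guess was right; the Turkish-specific letters are the ones to check.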

Insert a record with UTF-8 characters (Chinese, Arabic, Japanese, etc.) into the GAE datastore programmatically with Python

I just want to build a simple UI translation feature in GAE (using the Python SDK).
def add_translation(self, pid=None):
    trans = Translation()
    trans.tlang = db.Key("agtwaW1kZXNpZ25lcnITCxILQXBwTGFuZ3VhZ2UY8aIEDA")
    trans.ttype = "UI"
    trans.transid = "ui-about"
    trans.content = "关于我们"
    trans.put()
This results in an encoding error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 0: ordinal not in range(128)
How do I correctly insert content containing Unicode (UTF-8) characters?
Use the u notation:
>>> s=u"关于我们"
>>> print s
关于我们
Or explicitly, stating the encoding:
>>> s=unicode('אדם מתן', 'utf8')
>>> print s
אדם מתן
Read more at the Unicode HOWTO page in the python documentation site.
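The byte 0xe5 in the error message is simply the first UTF-8 byte of the first character of the content. A small sketch, using the same string as the question, shows why treating those bytes as ASCII fails at position 0 while an explicit UTF-8 decode round-trips cleanly; the variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
content = u"关于我们"

# What a plain (non-u) literal holds in a UTF-8 encoded Python 2 source file.
raw = content.encode('utf-8')

# Decoding those bytes as ASCII reproduces the error from the question.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the correct codec recovers the original text.
round_trip = raw.decode('utf-8')
```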
