Locale on django and uwsgi UnicodeEncodeError - python

EDIT: I just realized that when I don't try to print that variable to the console, it works. Why?
I ran into an issue when displaying a string label that contains UTF-8 characters. I set the locale env in the uwsgi ini file like this:
env =LC_ALL=en_US.UTF-8
env =LANG=en_US.UTF-8
and in wsgi.py:
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
When I run this app code:
print (locale.getlocale(), locale.getpreferredencoding())
print locale.getdefaultlocale()
print "option_value", option_value
label = force_text(option_label)
print 'label', label #THIS FAILS
the output is:
(('en_US', 'UTF-8'), 'UTF-8')
('en_US', 'UTF-8')
option_value d
ERROR <stack trace>
print 'label', label
UnicodeEncodeError: 'ascii' codec can't encode character u'\u015b' in position 5: ordinal not in range(128)
The problem does not occur when I run the app via runserver in the production environment.
Django 1.6.5 Python 2.7.6 Ubuntu 14.04 uWSGI 2.0.5.1

I just found the answer here: http://chase-seibert.github.io/blog/2014/01/12/python-unicode-console-output.html
I realized that the console output is responsible for the error, so exporting an additional env variable in the uwsgi config file solves the issue: env = PYTHONIOENCODING=UTF-8
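A minimal sketch of why the stream encoding matters, written in Python 3 syntax for illustration (the question itself uses Python 2). It writes a label containing u'\u015b' to an in-memory stream once with an ASCII encoding, which is what Python 2 falls back to when stdout is not a terminal, and once with UTF-8, which is what PYTHONIOENCODING=UTF-8 selects. The helper name write_label and the sample label are hypothetical.

```python
import io


def write_label(label, encoding):
    """Write label to an in-memory text stream that uses the given encoding.

    Hypothetical helper: it stands in for print writing to sys.stdout.
    """
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding)
    stream.write(label)
    stream.flush()
    return buffer.getvalue()


# Hypothetical label with u'\u015b' at position 5, as in the traceback.
label = u"Pozna\u015b"

# A UTF-8 stream accepts the character.
utf8_bytes = write_label(label, "utf-8")

# An ASCII stream rejects it with UnicodeEncodeError.
try:
    write_label(label, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

The locale settings alone do not change this, because the encoding of the output stream is decided separately from the locale in this setup.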

In Django on Python 2, wherever you use Unicode text, such as in forms, you should prefix the string literal with u. Do this everywhere a Unicode string is defined before it is saved.
In this case I think it is option_label.

Related

uWSGI encode error 'ascii' codec can't encode character '\xe9' in position 7: ordinal not in range(128)

I have already tried everything suggested at uWSGI / Emperor: UnicodeEncodeError: 'ascii' codec can't encode character
but nothing worked.
Here is my ini file:
[uwsgi]
# -------------
# Settings:
# key = value
# Comments >> #
# -------------
env LANG=en_US.utf8
env LC_ALL=en_US.UTF-8
env LC_LANG=en_US.UTF-8
env = PYTHONIOENCODING=UTF-8
# socket = [addr:port]
http = 127.0.0.1:8000
# WSGI module and callable
# module = [wsgi_module_name]:[application_callable_name]
module = church.wsgi
# master = [master process (true or false)]
master = true
static-map = /static=/home/....church/static
# processes = [number of processes]
processes = 5
..
Basically, I have a simple Django app that fetches stock names containing Unicode characters such as ö and é.
The Django development server works fine and returns no error. In production it returns
" 'ascii' codec can't encode character '\xe9' in position 7: ordinal not in range(128)"
so parsing fails and the data is not added to the db.

Scrapy - FEED_EXPORT_ENCODING Doesn't Work in Ubuntu Server

Even though my local and server Scrapy versions are the same, setting FEED_EXPORT_ENCODING = 'utf-8' in settings.py makes no difference on the server when exporting the result to a JSON file.
What I've set in settings.py:
FEED_EXPORT_ENCODING = 'utf-8'
The command I run to get the result:
scrapy crawl spiderName -o file.json
What I get in return:
...
'content': u'\n \r\nTruffle
\u0628\u0627 \u0645\u0627\u0654\u0645\u0648'
u'\u0631\u06cc\u062a
\u0631\u0627\u062d\u062a\u200c\u062a\u0631 \u06a9\u0631\u062f\u0646
\u0632'
u'\u0646\u062f\u06af\u06cc
\u062f\u0648\u0644\u0648\u067e\u0631\u0647\u0627\u06cc \u06a9\u0631\
u06cc\u067e\u062a\u0648\u06a9\u
...
I follow exactly the same process on my local machine, and there every Unicode string is written out as UTF-8.
What would you suggest?
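For reference, the difference between escaped and readable JSON output can be reproduced without Scrapy. The sketch below uses only the standard json module and is not Scrapy's exporter code; it assumes the setting behaves as documented, where JSON feeds use \uXXXX escapes unless FEED_EXPORT_ENCODING is set. The sample item reuses two characters from the output above.

```python
import json

# Sample item; the two characters are taken from the output shown above.
item = {"content": u"\u0628\u0627"}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(item)

# With ensure_ascii=False the characters are kept as-is, ready to be
# written to a file opened with a UTF-8 encoding.
readable = json.dumps(item, ensure_ascii=False)

# Both strings decode to the same data; only the representation differs.
same_data = json.loads(escaped) == json.loads(readable)
```

If the exported file on the server contains escapes like the first form, one thing to check is whether the settings.py that defines FEED_EXPORT_ENCODING is the one actually loaded there.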

python3 default encoding UnicodeDecodeError ascii using apache WSGI

import locale
prefered_encoding = locale.getpreferredencoding()
prefered_encoding  # evaluates to 'ANSI_X3.4-1968'
I'm using a framework called INGInious, which uses web.py to render its templates.
web.template.render(os.path.join(root_path, dir_path),
                    globals=self._template_globals,
                    base=layout_path)
The rendering works on my localhost but not on my staging server.
They both run Python 3. I see that web.py forces UTF-8 encoding
in Python 2 only (that's out of my hands):
def __str__(self):
    self._prepare_body()
    if PY2:
        return self["__body__"].encode('utf-8')
    else:
        return self["__body__"]
Here is the stack trace:
t = self._template(name),
File "/lib/python3.5/site-packages/web/template.py", line 1028, in _template,
self._cache[name] = self._load_template(name),
File "/lib/python3.5/site-packages/web/template.py", line 1016, in _load_template
return Template(open(path).read(), filename=path, **self._keywords)
File "/lib64/python3.5/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 83: ordinal not in range(128),
My HTML does include Hebrew characters. A small example:
<div class="modal-content">
<div class="modal-header">
<button type="button" class="close" data-dismiss="modal">×</button>
<h4 class="modal-title feedback-modal-title">
חישוב האיברים הראשונים בסדרה של איבר ראשון חיובי ויחס שלילי:
<span class="red-text">אי הצלחה</span>
and I open it like so:
open('/path/to/feedback.html').read()
The line where decoding fails is the one containing the Hebrew characters.
I tried setting some environment variables in ~/.bashrc:
export PYTHONIOENCODING=utf8
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
for the user centos.
The INGInious framework is installed via pip under the python3.5 site-packages, and it is served by an Apache server running as the user apache.
I also tried setting the environment variables in the code (during app initialization) so that the Apache WSGI process would be aware of them:
import os
os.environ['LC_ALL'] = 'en_US.UTF-8'
os.environ['LANG'] = 'en_US.UTF-8'
os.environ['LANGUAGE'] = 'en_US.UTF-8'
I also edited /etc/httpd/conf/httpd.conf using the SetEnv directive:
SetEnv LC_ALL en_US.UTF-8
SetEnv LANG en_US.UTF-8
SetEnv LANGUAGE en_US.UTF-8
SetEnv PYTHONIOENCODING utf8
I restarted with sudo service httpd restart, and still no luck.
My question is: what is the best practice for solving this? I understand there are hacks for it, but I want to understand the underlying cause as well as how to solve it.
Thanks!
I finally found the answer: the fix is in how the file is read.
I changed it from
open('/path/to/feedback.html').read()
to
import codecs
with codecs.open(file_path,'r',encoding='utf8') as f:
    text = f.read()
If anyone has a more general approach that works, I'll accept their answer.
A Python 2+3 solution would be:
import io
with io.open(file_path, mode='r', encoding='utf8') as f:
    text = f.read()
See the documentation of io.open.
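The underlying cause is that Python 3's open() without an encoding argument uses locale.getpreferredencoding(False), which is ASCII ('ANSI_X3.4-1968') when the process runs in the C/POSIX locale, as the Apache process apparently does here. A small sketch that reproduces both outcomes on a temporary file; the file content is the Hebrew phrase from the HTML sample, and the explicit 'ascii' encoding stands in for the server's default.

```python
import io
import os
import tempfile

# The Hebrew phrase from the HTML sample above, written with escapes.
text = u"\u05d0\u05d9 \u05d4\u05e6\u05dc\u05d7\u05d4"

# Write the text to a temporary file as UTF-8 bytes.
with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as tmp:
    tmp.write(text.encode("utf-8"))
    path = tmp.name

try:
    # An explicit encoding does not depend on the process locale.
    with io.open(path, mode="r", encoding="utf-8") as f:
        decoded = f.read()

    # Simulate the server default (ASCII) to reproduce the failure.
    try:
        with io.open(path, mode="r", encoding="ascii") as f:
            f.read()
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
finally:
    os.remove(path)
```

Passing the encoding explicitly wherever a text file is opened keeps the behaviour the same on the development machine and on the server.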

Decoding with 'utf-8' but error show an Unicode Encode Error?

My Python 2.x script tries to download a web page that includes Chinese words. The page is encoded in UTF-8. From urllib.urlopen(url) I get content of type str, so I decode the content with UTF-8. It throws UnicodeEncodeError. I googled a lot of posts like this and this, but they don't work for me. Am I misunderstanding something?
My code is:
import urllib
import httplib
def get_html_content(url):
    response = urllib.urlopen(url)
    html = response.read()
    print type(html)
    return html
if __name__ == '__main__':
    url = 'http://weekly.manong.io/issues/58'
    html = get_html_content(url)
    print html.decode('utf-8')
Error message:
<type 'str'>
Traceback (most recent call last):
File "E:\src\infra.py", line 32, in <module>
print html.decode('utf-8')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 44: ordinal not in range(128)
[Finished in 1.6s]
The print statement converts its arguments to str objects. Encoding the text manually prevents it from being encoded with ASCII:
import sys
...
if __name__ == '__main__':
    url = 'http://weekly.manong.io/issues/58'
    html = get_html_content(url)
    print html.decode('utf-8').encode(sys.stdout.encoding, 'ignore')
If the text still does not print correctly, replace sys.stdout.encoding with the encoding of your terminal.
UPDATE
Alternatively, you can use the PYTHONIOENCODING environment variable, without encoding in the source code:
PYTHONIOENCODING=utf-8:ignore python program.py
If the standard output is redirected to a pipe then Python 2 fails to use your locale encoding:
⟫ python -c'print u"\u201c"' # no redirection -- works
“
⟫ python -c'print u"\u201c"' | cat
Traceback (most recent call last):
File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 0: ordinal not in range(128)
To fix it, you could specify the PYTHONIOENCODING environment variable, e.g., in bash:
⟫ PYTHONIOENCODING=utf-8 python -c'print u"\u201c"' | cat
“
On Windows, you need to set the envvar using a different syntax.
If your Windows console doesn't support utf-8 (it matters only for the first command where there is no redirection) then you could try to print Unicode directly using Win32 API calls like win-unicode-console does. See windows console doesn't print or input Unicode.
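The effect of PYTHONIOENCODING on piped output can also be checked from a script. This is a sketch under assumptions: it launches the current interpreter as a child with its stdout redirected to a pipe, and the child uses Python 3 print syntax. Python 3 does not fall back to ASCII on a pipe the way Python 2 does, so the failing case is forced by setting PYTHONIOENCODING=ascii. The helper name run_child is hypothetical.

```python
import os
import subprocess
import sys


def run_child(io_encoding):
    """Run a child interpreter that prints U+201C to a pipe.

    Hypothetical helper: io_encoding is passed to the child through the
    PYTHONIOENCODING environment variable.
    """
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    return subprocess.run(
        [sys.executable, "-c", "print(u'\\u201c')"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )


# With an ASCII stdout the child fails with UnicodeEncodeError.
ascii_run = run_child("ascii")

# With a UTF-8 stdout the child writes the UTF-8 bytes of the character.
utf8_run = run_child("utf-8")
```

Inspecting ascii_run.stderr shows the same kind of traceback as above, while utf8_run.stdout holds the encoded character followed by a newline.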

Python locale settings on Heroku

I'm developing a Python web application on Heroku and I'm facing a problem with the locale settings.
My aim is to format a Python datetime object as a string like this
import datetime
now = datetime.datetime.now()
print now.strftime('%a %d %B %Y') # output: Sat 14 July 2012
but in different languages.
On my local machine I use the following for that:
import locale
locale.setlocale(locale.LC_ALL, '')
or locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8') for specific languages.
On my local machine this works and I get the date in the right language, but on Heroku it fails and all I get is locale.Error: unsupported locale setting.
Am I doing something wrong, or is it not permitted to change the locale setting in a Python app on Heroku?
Thanks.
You can see available locales by running:
$ heroku run "locale -a"
Running `locale -a` attached to terminal... up, run.5061
aa_DJ.utf8
aa_ER
aa_ER@saaho
aa_ET
af_ZA.utf8
am_ET
an_ES.utf8
ar_AE.utf8
ar_BH.utf8
ar_DZ.utf8
ar_EG.utf8
ar_IN
ar_IQ.utf8
ar_JO.utf8
ar_KW.utf8
ar_LB.utf8
ar_LY.utf8
ar_MA.utf8
ar_OM.utf8
ar_QA.utf8
ar_SA.utf8
ar_SD.utf8
ar_SY.utf8
ar_TN.utf8
ar_YE.utf8
as_IN
ast_ES.utf8
az_AZ
be_BY@latin
be_BY.utf8
ber_DZ
ber_MA
bg_BG.utf8
bn_BD
bn_IN
bo_CN
bo_IN
br_FR.utf8
bs_BA.utf8
C
ca_AD.utf8
ca_ES.utf8
ca_ES.utf8@valencia
ca_FR.utf8
ca_IT.utf8
crh_UA
csb_PL
cs_CZ.utf8
cy_GB.utf8
da_DK.utf8
de_AT.utf8
de_BE.utf8
de_CH.utf8
de_DE.utf8
de_LI.utf8
de_LU.utf8
dv_MV
dz_BT
el_CY.utf8
el_GR.utf8
en_AG
en_AU.utf8
en_BW.utf8
en_CA.utf8
en_DK.utf8
en_GB.utf8
en_HK.utf8
en_IE.utf8
en_IN
en_NG
en_NZ.utf8
en_PH.utf8
en_SG.utf8
en_US.utf8
en_ZA.utf8
en_ZW.utf8
eo_US.utf8
eo.utf8
es_AR.utf8
es_BO.utf8
es_CL.utf8
es_CO.utf8
es_CR.utf8
es_DO.utf8
es_EC.utf8
es_ES.utf8
es_GT.utf8
es_HN.utf8
es_MX.utf8
es_NI.utf8
es_PA.utf8
es_PE.utf8
es_PR.utf8
es_PY.utf8
es_SV.utf8
es_US.utf8
es_UY.utf8
es_VE.utf8
et_EE.utf8
eu_ES.utf8
eu_FR.utf8
fa_IR
fi_FI.utf8
fil_PH
fo_FO.utf8
fr_BE.utf8
fr_CA.utf8
fr_CH.utf8
fr_FR.utf8
fr_LU.utf8
fur_IT
fy_DE
fy_NL
ga_IE.utf8
gd_GB.utf8
gl_ES.utf8
gu_IN
ha_NG
he_IL.utf8
hi_IN
hne_IN
hr_HR.utf8
hsb_DE.utf8
ht_HT
hu_HU.utf8
hy_AM
ia
id_ID.utf8
ig_NG
is_IS.utf8
it_CH.utf8
it_IT.utf8
iu_CA
ja_JP.utf8
ka_GE.utf8
kk_KZ.utf8
km_KH
kn_IN
ko_KR.utf8
ks_IN
ks_IN@devanagari
ku_TR.utf8
kw_GB.utf8
ky_KG
la_AU.utf8
lg_UG.utf8
li_BE
li_NL
lo_LA
lt_LT.utf8
lv_LV.utf8
mai_IN
mg_MG.utf8
mi_NZ.utf8
mk_MK.utf8
ml_IN
mn_MN
mr_IN
ms_MY.utf8
mt_MT.utf8
nan_TW@latin
nb_NO.utf8
nds_DE
nds_NL
ne_NP
nl_AW
nl_BE.utf8
nl_NL.utf8
nn_NO.utf8
nr_ZA
nso_ZA
oc_FR.utf8
om_ET
om_KE.utf8
or_IN
pa_IN
pap_AN
pa_PK
pl_PL.utf8
POSIX
pt_BR.utf8
pt_PT.utf8
ro_RO.utf8
ru_RU.utf8
ru_UA.utf8
rw_RW
sa_IN
sc_IT
sd_IN
sd_IN@devanagari
se_NO
shs_CA
si_LK
sk_SK.utf8
sl_SI.utf8
so_DJ.utf8
so_ET
so_KE.utf8
so_SO.utf8
sq_AL.utf8
sr_ME
sr_RS
sr_RS@latin
ss_ZA
st_ZA.utf8
sv_FI.utf8
sv_SE.utf8
ta_IN
te_IN
tg_TJ.utf8
th_TH.utf8
ti_ER
ti_ET
tk_TM
tlh_GB.utf8
tl_PH.utf8
tn_ZA
tr_CY.utf8
tr_TR.utf8
ts_ZA
tt_RU
tt_RU@iqtelif
ug_CN
uk_UA.utf8
ur_PK
uz_UZ@cyrillic
uz_UZ.utf8
ve_ZA
vi_VN
wa_BE.utf8
wo_SN
xh_ZA.utf8
yi_US.utf8
yo_NG
zh_CN.utf8
zh_HK.utf8
zh_SG.utf8
zh_TW.utf8
zu_ZA.utf8
To fix your issue, try
locale.setlocale(locale.LC_ALL, 'de_DE.utf8')
or
heroku config:add LANG=de_DE.utf8
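Because the set of installed locales and their spelling differ between machines, one defensive option is to try several candidate names and fall back to the C locale. This is a minimal sketch, not part of the answer above; the function name set_first_available_locale and the candidate list are hypothetical.

```python
import locale


def set_first_available_locale(candidates):
    """Try each locale name in order and return the first one accepted.

    Hypothetical helper. Returns None if no candidate is accepted.
    Note that setlocale changes state for the whole process.
    """
    for name in candidates:
        try:
            return locale.setlocale(locale.LC_ALL, name)
        except locale.Error:
            # This locale is not installed here; try the next spelling.
            continue
    return None


# Try both common spellings of the German UTF-8 locale, then fall back.
chosen = set_first_available_locale(["de_DE.UTF-8", "de_DE.utf8", "C"])
```

If chosen ends up as "C", strftime will produce English day and month names, which at least avoids the locale.Error at startup.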
Only English locales are installed on the Heroku environment by default. So far there seems to be no way to install additional locales. Your best bet will be to implement your own formatting functions for the languages you support.
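A formatting function of the kind suggested here could look like the following sketch for German. The function name format_date_de and the abbreviation tables are assumptions made for illustration; the layout mirrors the '%a %d %B %Y' pattern from the question.

```python
import datetime

# Abbreviated German weekday names, indexed by date.weekday() (Monday is 0).
GERMAN_DAYS = ["Mo", "Di", "Mi", "Do", "Fr", "Sa", "So"]

# German month names, indexed by month number minus one.
GERMAN_MONTHS = [
    "Januar", "Februar", "März", "April", "Mai", "Juni",
    "Juli", "August", "September", "Oktober", "November", "Dezember",
]


def format_date_de(value):
    """Format a date like strftime('%a %d %B %Y') with German names.

    Hypothetical helper that does not depend on installed locales.
    """
    return "{} {:02d} {} {}".format(
        GERMAN_DAYS[value.weekday()],
        value.day,
        GERMAN_MONTHS[value.month - 1],
        value.year,
    )


# The date from the question's example output.
formatted = format_date_de(datetime.date(2012, 7, 14))
```

For more than a couple of languages, keeping such tables by hand becomes tedious, so this only suits a small, fixed set of supported languages.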
This is an older question, but I thought it worth mentioning here that Heroku added support for installing additional locales in September 2018.
To do this, commit a .locales file containing the locales you'd like to set up:
de_DE
fr_FR
Then add the locale buildpack:
heroku buildpacks:add https://github.com/heroku/heroku-buildpack-locale
For more information, check the buildpack's GitHub repository.
