Cyrillic symbols do not show in Python 3 - python

I want to write a script that takes an HTTP POST request and sends a response. If the POST data contains English characters, they are returned and displayed normally, but Cyrillic characters look like this:
������ ������ �������������� ������
My Python script is as follows:
# -*- coding: utf-8 -*-
print('Content-type: text/html\n')
import cgi
from imp import reload
form = cgi.FieldStorage()
text2 = form.getfirst("postvar1", "не задано")
print(text2)
I use a custom server script:
# -*- coding: utf-8 -*-
from http.server import HTTPServer, CGIHTTPRequestHandler
server_address = ("", 8080)
httpd = HTTPServer(server_address, CGIHTTPRequestHandler)
httpd.serve_forever()
The Python script is saved as UTF-8.
How can this problem be solved?
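One possible cause, assuming the server process runs under a non-UTF-8 locale, is that print encodes its output with the locale's encoding instead of UTF-8. A minimal sketch of a workaround is to declare the charset in the header and write UTF-8 bytes straight to sys.stdout.buffer. The helper names build_response and send are hypothetical and not part of the original script.

```python
import sys


def build_response(body: str) -> bytes:
    """Return a complete CGI response (header + body) encoded as UTF-8."""
    header = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    return (header + body).encode("utf-8")


def send(body: str) -> None:
    # Write raw bytes so the locale's default encoding is never involved.
    sys.stdout.buffer.write(build_response(body))
    sys.stdout.buffer.flush()
```

In the script above, send(text2) would take the place of the two print calls.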

Related

Python CGI redirect returns "End of script output before headers"

I am trying to build a redirect service that will allow me to track clicks within emails I send to foreign websites.
Example of the URL of this script:
https://example.com/cgi-bin/redir.py?rurl=https%3A%2F%2Fwww.example.com%2Fde&utm_content=test
I am using Apache2 on Ubuntu 20.04 with the cgi module, calling the following redir.py script:
#!/usr/bin/python3
import webbrowser
redirect_url = "https://www.example.com"
webbrowser.open(redirect_url)
Now this results in the following error:
End of script output before headers: redir.py
Adding a header:
print('Content-Type: text/plain')
print('')
print('hello world')
With this header added, the output is exactly that: a "hello world" message.
How can I redirect if a header is required?
Everything you print gets returned to the requesting browser. Your python file should look like this:
#!/usr/bin/python3
# -*- coding: UTF-8 -*-
# A single blank line ends the header block.
print("Content-Type: text/html")
print()
print("<meta http-equiv=\"refresh\" content=\"0; url=TARGETURL\" />")
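Note that webbrowser.open tries to open a browser on the machine running the script (the server), not on the client. As an alternative to the meta refresh, a CGI script can emit a Location header and let the web server send the redirect. The following is a rough sketch under that assumption; the function name build_redirect, the DEFAULT_URL fallback, and the scheme check are illustrative choices, not part of the original answer.

```python
from urllib.parse import parse_qs, urlsplit

DEFAULT_URL = "https://www.example.com"  # hypothetical fallback target


def build_redirect(query_string: str) -> str:
    """Build CGI headers that redirect to the `rurl` query parameter."""
    params = parse_qs(query_string)
    target = params.get("rurl", [DEFAULT_URL])[0]
    # Only follow absolute http(s) URLs; anything else uses the fallback.
    if urlsplit(target).scheme not in ("http", "https"):
        target = DEFAULT_URL
    # Reject line breaks so the value cannot inject extra headers.
    if "\r" in target or "\n" in target:
        target = DEFAULT_URL
    return f"Status: 302 Found\nLocation: {target}\n\n"
```

In redir.py this could be used as print(build_redirect(os.environ.get("QUERY_STRING", "")), end=""), with os imported at the top.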

(python/boto sqs) UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5: ordinal not in range(128)

I cannot send messages containing accented characters to SQS from Python with the AWS SDK (boto).
Versions
Python: 2.7.6
boto: 2.20.1
CODE
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import boto.sqs
from boto.sqs.message import RawMessage
# print boto.Version
sqs_conn = boto.sqs.connect_to_region(
'my_region',
aws_access_key_id='my_kye',
aws_secret_access_key='my_secret_ky')
queue = sqs_conn.get_queue('my_queue')
queue.set_message_class(RawMessage)
msg = RawMessage()
body = '1 café, 2 cafés, 3 cafés ...'
msg.set_body(body)
queue.write(msg)
One solution:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
Full code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import boto.sqs
from boto.sqs.message import RawMessage
import sys # <== added this line
reload(sys) # <== added this line
sys.setdefaultencoding('utf-8') # <== added this line
# print boto.Version
sqs_conn = boto.sqs.connect_to_region(
'my_region',
aws_access_key_id='my_kye',
aws_secret_access_key='my_secret_ky')
queue = sqs_conn.get_queue('my_queue')
queue.set_message_class(RawMessage)
msg = RawMessage()
body = '1 café, 2 cafés, 3 cafés ...'
msg.set_body(body)
queue.write(msg)
Source: https://pythonadventures.wordpress.com/2012/03/29/ascii-codec-cant-encode-character/#comment-4672
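For context, the error in the title can be reproduced in a few lines. This is a Python 3 sketch, assuming the failure comes from the UTF-8 byte string being implicitly decoded as ASCII somewhere in the library (the exact place inside boto is not shown in the question). It illustrates why position 5 and byte 0xc3 are reported, and that decoding explicitly as UTF-8 succeeds.

```python
body = "1 café, 2 cafés, 3 cafés ..."
# In Python 2 a plain '...' literal in a UTF-8 source file holds these bytes.
raw = body.encode("utf-8")

try:
    # Mimics the implicit ASCII decode Python 2 performs when mixing str and unicode.
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding explicitly with the correct codec works.
decoded = raw.decode("utf-8")
```

Here error.start is the index of the first byte of "é", which UTF-8 encodes as the two bytes 0xc3 0xa9.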

Python UTF8 encoding with arabic

I have an encoding problem when I try to crawl YouTube (an Arabic channel):
#!/usr/bin/python
# -*- coding: utf8 -*-
from django.core.management.base import BaseCommand, CommandError
import requests, lxml, re
from lxml import html
class Command(BaseCommand):
    def handle(self, *args, **options):
        r = requests.get("https://www.youtube.com/user/aljazeerachannel/videos?view=0")
        root = lxml.html.fromstring(r.content)
        for data in root.xpath('.//*[@id="branded-page-body"]/div/div/div[1]/div/div[2]/ul/li[1]/span/span/a'):
            print data.text
The result is:
[root@vmi9105 buzzbal]# python manage.py youtube
اÙتخابات اÙÙجاÙس اÙبÙدÙØ© Ù٠سÙØ·ÙØ© عÙÙاÙ
Try this; it solved my problem in Python:
f"{yourString}".encode('latin-1').decode("utf-8")
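The one-liner above works when UTF-8 bytes were mistakenly decoded as Latin-1. A small sketch of that round trip, with a hypothetical helper name fix_mojibake and a sample Arabic word chosen only for illustration:

```python
def fix_mojibake(garbled: str) -> str:
    """Undo UTF-8 text that was mistakenly decoded as Latin-1."""
    return garbled.encode("latin-1").decode("utf-8")


original = "انتخابات"
# Simulate the bug: UTF-8 bytes interpreted as Latin-1 characters.
garbled = original.encode("utf-8").decode("latin-1")
repaired = fix_mojibake(garbled)
```

Because Latin-1 maps every byte value to a character, this particular round trip is lossless. If the text was garbled through a different codec, encode("latin-1") may raise UnicodeEncodeError instead.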

avoid user abort python subprocesses

I want the process I start from the script to keep running on the web server even if the user closes the page.
It doesn't seem to work with this:
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import cgi,cgitb,subprocess
print "Content-Type: text/plain;charset=utf-8"
print
form = cgi.FieldStorage()
ticker = form['ticker'].value
print subprocess.Popen(['/usr/bin/env/python','options.py',ticker])
Please help! Thanks!
I guess this is wrong:
'/usr/bin/env/python'
In a Popen argument list it would usually be two separate items:
'/usr/bin/env', 'python'
but better use this:
>>> import sys
>>> sys.executable # contains the executable running this python process
'C:\\Python27\\pythonw.exe'
I usually do it like this:
p = subprocess.Popen([sys.executable,'options.py',ticker])
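Beyond fixing the interpreter path, a child that inherits the CGI script's stdout can keep the request open or be stopped along with it. A minimal sketch, assuming Python 3 on a POSIX server, that starts the child with its standard streams detached and in its own session. The helper name launch_detached is hypothetical, and whether the web server still cleans up such children depends on its configuration.

```python
import subprocess
import sys


def launch_detached(args):
    """Start a Python child process detached from the caller's stdio and session."""
    return subprocess.Popen(
        [sys.executable, *args],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: child runs in its own session
        close_fds=True,
    )
```

For the script in the question, the call would look like launch_detached(['options.py', ticker]).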

Python/feedparser script won't display on CGI/ character coding

#!/usr/bin/python
# -*- coding: utf-8 -*-
import sys
import os
import cgi
import string
import feedparser
count = 0
print "Content-Type: text/html\n\n"
print """<PRE><B>WORK MAINTENANCE</B></PRE>"""
d = feedparser.parse("http://www.hep.hr/ods/rss/radovi.aspx?dp=zagreb")
for opis in d:
    try:
        print """<B>Place/Time:</B> %s<br>""" % d.entries[count].title
        print """<B>Streets:</B> %s<br>""" % d.entries[count].description
        print """<B>Published:</B> %s<br>""" % d.entries[count].date
        print "<br>"
        count += 1
    except:
        pass
I have a problem with CGI and a Python script. In the terminal the script runs fine except for "IndexError: list index out of range", which I suppress with pass. But when I run the script through CGI, I only get the WORK MAINTENANCE line and the first line from d.entries[count].title repeated 9 times. So confusing...
Also, how can I set up support in feedparser for Croatian (Balkan) letters: č, ć, š, ž, đ?
# -*- coding: utf-8 -*- is not working, and I'm running Ubuntu Server.
Thank you in advance for help.
Regards.
for opis in d:
    try:
        print """<B>Place/Time:</B> %s<br>""" % d.entries[count].title
You're not using 'opis' in your output.
Try something like this:
for entry in d.entries:
    try:
        print """<B>Place/Time:</B> %s<br>""" % entry.title
        ....
OK, I had another problem: text that I entered manually would show up through CGI, but text from the RSS feed wouldn't. So you need to encode before you write:
# -*- coding: utf-8 -*-
import sys, os, string
import cgi
import feedparser
import codecs
d = blablablabla
print "Content-Type: text/html; charset=utf-8\n\n"
print
for entry in d.entries:
    print """%s""" % entry.title.encode('utf-8')
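The same idea in Python 3 might look like the sketch below: build the HTML as text, then encode it once as UTF-8 before writing it out. The function name render_entries is hypothetical, and SimpleNamespace objects stand in for feedparser entries so the example runs without the library or network access.

```python
import html
from types import SimpleNamespace


def render_entries(entries) -> bytes:
    """Render entry titles as an HTML fragment encoded as UTF-8 bytes."""
    lines = [f"<b>Place/Time:</b> {html.escape(e.title)}<br>" for e in entries]
    return "\n".join(lines).encode("utf-8")


# Stand-in objects with a .title attribute, in place of real feed entries.
sample = [SimpleNamespace(title="č, ć, š, ž, đ")]
page = render_entries(sample)
```

In a CGI script the resulting bytes could be written with sys.stdout.buffer.write(page) after a header that declares charset=utf-8.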
