Coding: utf-8 doesn't seem to work - python

UTF-8 doesn't work on my computer. I ran the exact same code on another computer and it worked, but on mine it doesn't. The program is written in Python.
It starts like this:
# -*- coding: utf-8 -*-  # Behövs i python 2 för åäö (needed in Python 2 for åäö)
from Tkinter import *
class Kryssruta(Button):
    """ Knapp som kryssas i/ur när man trycker på den """
    # (A button that is checked/unchecked when you press it)
    def __init__(self, master, nr = 0, rad = 0, kolumn = 0):
        # Konstruktor, notera master (constructor; note the master argument)
        Button.__init__(self, master)
        self.master = master
        self.rad = rad
        self.kolumn = kolumn
        self.markerad = False
        self.kryssad = False
        self.cirklad = False
        self["command"] = self.kryssa
    def kryssa(self):
        if self.markerad == False:
            self.master.klickat(self)
On one computer it works like a charm, but on my own computer I get this message:
SyntaxError: Non-ASCII character '\xc3' in file 'blah' but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details
I'm on a PC, running the script from PowerShell.
Does anyone know what the problem might be?

You have one or more blank lines above the coding: line. From the document listed in the error message:
To define a source code encoding, a magic comment must
be placed into the source files either as first or second
line in the file, such as:
# coding=<encoding name>
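To check which encoding Python actually picks up from the first two lines, the standard library's tokenize.detect_encoding can be used. This is a minimal sketch for Python 3, where the fallback is utf-8 (Python 2, as used in the question, falls back to ASCII instead, which is what triggers the SyntaxError). The helper name declared_encoding and the sample byte strings are made up for illustration.

```python
import io
import tokenize

def declared_encoding(source_bytes):
    """Return the encoding Python 3 would use to read these source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding

# Declaration on line 1: honoured.
first_line = b"# -*- coding: latin-1 -*-\nx = 1\n"
# Declaration on line 2 (after one blank line): still honoured.
second_line = b"\n# -*- coding: latin-1 -*-\nx = 1\n"
# Declaration on line 3: ignored, so the default applies.
too_late = b"\n\n# -*- coding: latin-1 -*-\nx = 1\n"

for sample in (first_line, second_line, too_late):
    print(declared_encoding(sample))
```

Only the first two samples report the declared encoding; the third one silently falls back to the default because the comment sits on line 3.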

You declare that the source file uses UTF-8 encoding, but it actually doesn't; it uses the default Windows code page for your system.
Open the file in Notepad and save it again with Save As, choosing UTF-8 in the Encoding dropdown.
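Before converting, it can help to check whether the bytes on disk are already valid UTF-8 (a '\xc3' byte, as in the error message, is what UTF-8 itself produces for å, ä and ö). The sketch below is written for Python 3 and assumes the legacy code page is cp1252, which may not match your system; the function name reencode_to_utf8 is hypothetical.

```python
def reencode_to_utf8(raw, legacy_encoding="cp1252"):
    """Return (utf8_bytes, was_already_utf8) for the given source bytes.

    legacy_encoding is an assumption: use whatever code page the file
    was really saved in.
    """
    try:
        raw.decode("utf-8")
        return raw, True
    except UnicodeDecodeError:
        return raw.decode(legacy_encoding).encode("utf-8"), False

# Simulate a comment saved in the Windows code page rather than UTF-8.
legacy_bytes = "# åäö".encode("cp1252")
fixed_bytes, already_utf8 = reencode_to_utf8(legacy_bytes)
print(already_utf8)
print(fixed_bytes.decode("utf-8"))
```

For a real file, read it with open(path, "rb"), pass the bytes through the helper, and write the result back in binary mode.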

Related

Load font file with Windows api Python

I was looking for a way to load a TTF file and print text with that font. While searching, I found this question: load a ttf font with the Windows API
That question recommends adding a private font with AddFontResourceEx, but I couldn't find a way to reach that function through the pywin32 module or ctypes.windll. Is there any way to access it? Failing that, is there another way to print text with a TTF font without using Pillow?
Here is some code you can use for testing:
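import win32print
import win32gui
import win32ui
printer_name = "Microsoft Print to Pdf"
hprinter = win32print.OpenPrinter(printer_name)
devmode = win32print.GetPrinter(hprinter, 2)["pDevMode"]
# Create a device context for the printer, identified by its name
hdc = win32gui.CreateDC("WINSPOOL", printer_name, devmode)
# Wrap the raw handle in a pywin32 DC object
dc = win32ui.CreateDCFromHandle(hdc)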
Edit
I managed to access the function with
ctypes.windll.gdi32.AddFontResourceExA
But now I want to access the FR_PRIVATE constant. How can I do it?
Edit 2
I found out that the function doesn't work even without that constant.
I adapted the code from this answer and got it working!
Here is the code:
import ctypes
gdi32 = ctypes.windll.gdi32
def add_font_file(file):
    FR_PRIVATE = 0x10
    # Wide-character buffer, as required by the W variant of the API
    file = ctypes.byref(ctypes.create_unicode_buffer(file))
    font_count = gdi32.AddFontResourceExW(file, FR_PRIVATE, 0)
    if font_count == 0:
        raise RuntimeError("Error while loading the font.")
In case the original link goes down, the original code was as follows:
from ctypes import windll, byref, create_unicode_buffer, create_string_buffer
FR_PRIVATE = 0x10
FR_NOT_ENUM = 0x20
def loadfont(fontpath, private=True, enumerable=False):
    '''
    Makes fonts located in file `fontpath` available to the font system.
    `private` if True, other processes cannot see this font, and this
    font will be unloaded when the process dies
    `enumerable` if True, this font will appear when enumerating fonts
    See https://msdn.microsoft.com/en-us/library/dd183327(VS.85).aspx
    '''
    # This function was taken from
    # https://github.com/ifwe/digsby/blob/f5fe00244744aa131e07f09348d10563f3d8fa99/digsby/src/gui/native/win/winfonts.py#L15
    # This function is written for Python 2.x. For 3.x, you
    # have to convert the isinstance checks to bytes and str
    if isinstance(fontpath, str):
        pathbuf = create_string_buffer(fontpath)
        AddFontResourceEx = windll.gdi32.AddFontResourceExA
    elif isinstance(fontpath, unicode):
        pathbuf = create_unicode_buffer(fontpath)
        AddFontResourceEx = windll.gdi32.AddFontResourceExW
    else:
        raise TypeError('fontpath must be of type str or unicode')
    flags = (FR_PRIVATE if private else 0) | (FR_NOT_ENUM if not enumerable else 0)
    numFontsAdded = AddFontResourceEx(byref(pathbuf), flags, 0)
    return bool(numFontsAdded)
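The comment in that function says the isinstance checks have to be converted to bytes and str for Python 3. A minimal sketch of that conversion might look like the following. It is untested against a real font file, the helper font_flags is a hypothetical name introduced only to make the flag logic visible, and the platform check is an addition so the module can at least be imported outside Windows.

```python
import ctypes
import sys

FR_PRIVATE = 0x10
FR_NOT_ENUM = 0x20

def font_flags(private=True, enumerable=False):
    """Combine the AddFontResourceEx flags the same way loadfont does."""
    return (FR_PRIVATE if private else 0) | (0 if enumerable else FR_NOT_ENUM)

def loadfont(fontpath, private=True, enumerable=False):
    """Python 3 variant: bytes paths use the A call, str paths the W call."""
    if sys.platform != "win32":
        raise OSError("AddFontResourceEx is only available on Windows")
    gdi32 = ctypes.windll.gdi32
    if isinstance(fontpath, bytes):
        pathbuf = ctypes.create_string_buffer(fontpath)
        add_font = gdi32.AddFontResourceExA
    elif isinstance(fontpath, str):
        pathbuf = ctypes.create_unicode_buffer(fontpath)
        add_font = gdi32.AddFontResourceExW
    else:
        raise TypeError("fontpath must be of type bytes or str")
    added = add_font(ctypes.byref(pathbuf), font_flags(private, enumerable), 0)
    return bool(added)

print(hex(font_flags()))
```

On Windows, loadfont would be called with the path to a font file, and the font could then be selected by its family name when creating the GDI font.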

(MATE) pluma "PLUMA_SELECTED_TEXT" is missing from environment

I'm writing a pluma plugin (in Python) to automate HTML markup of selected text.
According to the (poor and scarce) documentation, the text selected in the editor should be available in os.environ["PLUMA_SELECTED_TEXT"].
However, when I select some text, run my plugin and examine the environment, there is no variable named "PLUMA_SELECTED_TEXT".
I do find 'PLUMA_CURRENT_LINE', but it contains only the last line of the selected text.
Here is the plugin itself (with debugging output):
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import os
import re
print(os.environ)
try:
    ptext = os.environ["PLUMA_SELECTED_TEXT"]
except KeyError:
    ptext = "SELECTION NOT FOUND"
print(ptext)
#ptext = re.sub('\n','<br/>\n',ptext)
#ptext = "<p>\n%s\n</p>\n"%ptext
#print(ptext)
Has anyone run into this?
I found the solution; I'm posting it for the benefit of whoever runs into this.
The selected text is actually sent to the script on STDIN, so it needs to be read from there.
The code therefore looks like this:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
try:
    ptext = sys.stdin.read()
except:
    ptext = "SELECTION NOT FOUND"
ptext = re.sub('\n','<br/>\n',ptext)
ptext = "<p>\n%s\n</p>\n"%ptext
print(ptext)

OperationalError at /liste/ - DJANGO

Code:
# -*- coding: utf-8 -*-
from django.http import *
import sqlite3
def verileri_listele(request):
    vt = sqlite3.connect("/root/bot/v1/database.db")
    im = vt.cursor()
    im.execute("""SELECT * FROM kullanim""")
    veriler = im.fetchall()
    for i in veriler:
        return HttpResponse(i[0], i[1], i[2])
Error page:
http://paste.ubuntu.com/9400322/
I am using Django and Python 2.
First of all, check the permissions on /root/bot/v1/database.db (it seems that root owns the file) and try moving it somewhere else, for example next to your Python code (don't forget to change the path in your code and to set the owner and permissions on database.db).
Also, even once the database works, your code returns after the first row, and that will be a problem.
Returning the HttpResponse object ends the request.
You must build the full response text first, and only then return a single HttpResponse object.
Reading the tutorial would help you a lot!
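As a rough sketch of "build the full text, then return once": the helper below collects every row into one string. The name rows_to_text, the in-memory demo database and its column names are all made up for illustration (the real schema of kullanim is not shown in the question), and the Django part appears only as a comment so the snippet runs without Django installed.

```python
import sqlite3

def rows_to_text(connection, query):
    """Fetch every row of the query and join them into one response body."""
    rows = connection.execute(query).fetchall()
    # One line per row, columns separated by ", ".
    return "\n".join(", ".join(str(col) for col in row) for row in rows)

# Demo with an in-memory database; the columns are hypothetical.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE kullanim (ad TEXT, sayi INTEGER, tarih TEXT)")
demo.executemany(
    "INSERT INTO kullanim VALUES (?, ?, ?)",
    [("a", 1, "2014-12-01"), ("b", 2, "2014-12-02")],
)
body = rows_to_text(demo, "SELECT * FROM kullanim ORDER BY ad")
print(body)

# Inside the view, the whole text would then be returned exactly once:
#     return HttpResponse(body, content_type="text/plain")
```

The loop now lives inside the helper, so the view itself contains a single return statement.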

Unable to display Japanese (UTF-8) characters in email body with webbrowser

I am reading text from two different .txt files and concatenating them. I then put the result into the body of an email by using webbrowser.
One text file contains English characters (ASCII) and the other Japanese (UTF-8). The text displays fine if I write it to a text file, but if I use webbrowser to insert it into an email body, the Japanese text displays as question marks.
I have tried running the script on multiple machines that have different default mail clients. Initially I thought the client might be the issue, but that does not appear to be the case: both Thunderbird and Mail (Mac OS X) display question marks.
Hello. Today is 2014-05-09
????????????????2014-05-09????
I have looked at similar issues around on SO but they have not solved the issue.
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)
Japanese in python function
Printing out Japanese (Chinese) characters
python utf-8 japanese
Is there a way to have the Japanese (UTF-8) display in the body of an email created with webbrowser in python? I could use the email functionality but the requirement is the script needs to open the default mail client and insert all the information.
The code and text files I am using are below. I have simplified it to focus on the issue.
email-template.txt
Hello. Today is {{date}}
email-template-jp.txt
こんにちは。今日は {{date}} です。
Python Script
#
# -*- coding: utf-8 -*-
#
import sys
import re
import os
import glob
import webbrowser
import codecs,sys
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
# vars
date_range = sys.argv[1:][0]
email_template_en = "email-template.txt"
email_template_jp = "email-template-jp.txt"
email_to_send = "email-to-send.txt" # finished email is saved here
# Default values for the composed email that will be opened
mail_list = "test#test.com"
cc_list = "test1#test.com, test2#test.com"
subject = "Email Subject"
# Open email templates and insert the date from the parameters sent in
try:
    f_en = open(email_template_en, "r")
    f_jp = codecs.open(email_template_jp, "r", "UTF-8")
    try:
        email_content_en = f_en.read()
        email_content_jp = f_jp.read()
        email_en = re.sub(r'{{date}}', date_range, email_content_en)
        email_jp = re.sub(r'{{date}}', date_range, email_content_jp).encode("UTF-8")
        # this throws an error
        # UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 26: ordinal not in range(128)
        # email_en_jp = (email_en + email_jp).encode("UTF-8")
        email_en_jp = (email_en + email_jp)
    finally:
        f_en.close()
        f_jp.close()
        pass
except Exception, e:
    raise e
# Open the default mail client and fill in all the information
try:
    f = open(email_to_send, "w")
    try:
        f.write(email_en_jp)
        # Does not send Japanese text to the mail client. But will write to the .txt file fine. Unsure why.
        webbrowser.open("mailto:%s?subject=%s&cc=%s&body=%s" %(mail_list, subject, cc_list, email_en_jp), new=1) # open mail client with prefilled info
    finally:
        f.close()
        pass
except Exception, e:
    raise e
edit: Forgot to add I am using Python 2.7.1
EDIT 2: Found a workable solution after all.
Replace your webbrowser call with this.
import subprocess
[... other code ...]
arg = "mailto:%s?subject=%s&cc=%s&body=%s" % (mail_list, subject, cc_list, email_en_jp)
subprocess.call(["open", arg])
This will open your default email client on MacOS. For other OSes please replace "open" in the subprocess line with the proper executable.
EDIT: I looked into it a bit more, and Mark's comment above made me read RFC 2368, which defines the mailto URL scheme:
The special hname "body" indicates that the associated hvalue is the
body of the message. The "body" hname should contain the content for
the first text/plain body part of the message. The mailto URL is
primarily intended for generation of short text messages that are
actually the content of automatic processing (such as "subscribe"
messages for mailing lists), not general MIME bodies.
And a bit further down:
8-bit characters in mailto URLs are forbidden. MIME encoded words (as
defined in [RFC2047]) are permitted in header values, but not for any
part of a "body" hname.
So it looks like this is not possible as per RFC, although that makes me question why the JavaScript solution in the JSFiddle provided by naota works at all.
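One possible explanation, offered as an assumption rather than something tested against the mail clients above: browsers percent-encode non-ASCII characters in URLs, and RFC 6068, which replaced RFC 2368, allows percent-encoded UTF-8 in mailto URIs. The sketch below builds such a URL with urllib.parse.quote. It is written for Python 3 (the question uses 2.7, where urllib.quote on UTF-8 encoded bytes plays the same role); build_mailto and the example.com addresses are hypothetical, and whether a given client decodes the body correctly still depends on that client.

```python
from urllib.parse import quote, unquote

def build_mailto(to, subject, cc, body):
    """Return a pure-ASCII mailto: URL with percent-encoded UTF-8 values."""
    # mailto bodies use CRLF line breaks, which quote() turns into %0D%0A.
    body = "\r\n".join(body.splitlines())
    fields = (("subject", subject), ("cc", cc), ("body", body))
    # quote() encodes str as UTF-8 by default; safe="" escapes everything
    # that is not an unreserved character.
    query = "&".join("%s=%s" % (name, quote(value, safe="")) for name, value in fields)
    return "mailto:%s?%s" % (quote(to, safe="@,"), query)

text = "Hello. Today is 2014-05-09\nこんにちは。今日は 2014-05-09 です。"
url = build_mailto("test@example.com", "Email Subject", "cc@example.com", text)
print(url)
```

The resulting string can be handed to webbrowser.open or to the subprocess call shown above in place of the raw, unencoded URL.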
I am leaving my previous answer below as it was, although it does not work.
I have run into the same issue with Python 2.7.x quite a few times now, and each time a different solution somehow worked.
So here are several suggestions that may or may not work, as I haven't tested them.
a) Force unicode strings:
webbrowser.open(u"mailto:%s?subject=%s&cc=%s&body=%s" % (mail_list, subject, cc_list, email_en_jp), new=1)
Notice the small u right after the opening ( and before the ".
b) Force the regex to use unicode:
email_jp = re.sub(ur'{{date}}', date_range, email_content_jp).encode("UTF-8")
# or maybe
email_jp = re.sub(ur'{{date}}', date_range, email_content_jp)
c) Another idea regarding the regex, try compiling it first with the re.UNICODE flag, before applying it.
pattern = re.compile(ur'{{date}}', re.UNICODE)
d) Not directly related, but I noticed you write the combined text via the normal open method. Try using the codecs.open here as well.
f = codecs.open(email_to_send, "w", "UTF-8")
Hope this helps.

Encoding error using Python

I wrote some code that connects to an IMAP server, parses the body of each email and inserts it into a database. But I am having problems with accented characters.
From the email header I got this information:
Content-Type: text/html; charset=ISO-8859-1
But I am not sure whether I can trust this information.
The email was written in Portuguese, so it has a lot of words with accents. For example, I extracted the following phrase from the email source (using my browser):
"...instalação de eletrônicos..."
So, I connected to imap and fetched some emails:
... typ, data = M.fetch(num, '(RFC822)') ...
When I print the content, I get the following word:
print data[0][1]
instala+º+úo de eletr+¦nicos
I tried to use .decode('utf-8') but I had no success.
instalação de eletrônicos
How can I make it human-readable? My database is in UTF-8.
The header says the email uses the "ISO-8859-1" charset, so you need to decode the string with that encoding.
Try this:
data[0][1].decode('iso-8859-1')
Specifying the source code encoding worked for me. It's the code at the top of my example code below. This should be defined at the top of your python file.
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
value = """...instalação de eletrônicos...""".decode("iso-8859-15")
print value
# prints: ...instalação de eletrônicos...
import unicodedata
value = unicodedata.normalize('NFKD', value).encode('ascii','ignore')
print value
# prints: ...instalacao de eletronicos...
And now you can do str(value) without an exception as well.
See: http://docs.python.org/2/library/unicodedata.html
This seems to keep all accents:
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
import unicodedata
value = """...instalação de eletrônicos...""".decode("iso-8859-15")
value = unicodedata.normalize('NFKC', value).encode('utf-8')
print value
print str(value)
# prints (without exceptions/errors):
# ...instalação de eletrônicos...
# ...instalação de eletrônicos...
EDIT:
Do note that with the last version, even though the output looks the same, comparing the two values does not return True. For example:
#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
import unicodedata
inValue = """...instalação de eletrônicos...""".decode("iso-8859-15")
normalizedValue = unicodedata.normalize('NFKC', inValue).encode('utf-8')
try:
    print inValue == normalizedValue
except UnicodeWarning:
    pass
# False
EDIT2:
This returns the same:
normalizedValue = unicode("""...instalação de eletrônicos...""".decode("iso-8859-15")).encode('utf-8')
print normalizedValue
print str(normalizedValue )
# prints (without exceptions/errors):
# ...instalação de eletrônicos...
# ...instalação de eletrônicos...
Though I'm not sure this will actually be valid for a utf-8 encoded database. Probably not?
Thanks to Martijn Pieters. We figured out that the email had two different encodings. I had to split these parts and handle each one individually.
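A minimal sketch of handling each part with its own charset, using the standard email package. It is written for Python 3 and is not the code that was actually used for this mailbox: the function name decode_text_parts, the fallback charset and the demo message are assumptions made for illustration.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def decode_text_parts(raw_bytes, fallback="iso-8859-1"):
    """Decode every text/* part using the charset declared on that part."""
    message = email.message_from_bytes(raw_bytes)
    texts = []
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and attachments
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        charset = part.get_content_charset() or fallback
        texts.append(payload.decode(charset, errors="replace"))
    return texts

# Demo message with two parts in two different encodings.
demo = MIMEMultipart("alternative")
demo.attach(MIMEText("instalação de eletrônicos", "plain", "iso-8859-1"))
demo.attach(MIMEText("<p>instalação de eletrônicos</p>", "html", "utf-8"))

parts = decode_text_parts(demo.as_bytes())
for text in parts:
    print(text)
```

With imaplib, the bytes in data[0][1] from the fetch call would be passed in as raw_bytes, and each decoded string can then be stored in the UTF-8 database.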
