How to disable fuzzy on django translations? - python

I don't want gettext to add the fuzzy flag. Is that possible?
For example:
When I add or change a sentence or word to translate, the entry is generally marked fuzzy automatically, and I don't like that.
#: frontend/src/components/language_consts.js:74
#, fuzzy
#| msgid "Patient Address"
msgid "Patient's address?"
msgstr "Adresse du doctor"

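This probably comes from the tooling that updates or edits your .po files. The fuzzy flag means the translation was guessed from a similar existing entry and needs reviewing; the #| msgid line shows the previous source string it was matched against. Until it is reviewed, a fuzzy entry is skipped when the catalog is compiled. Review the translation, mark it as reviewed (which removes the flag), and it should disappear.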
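To see which entries still need that review, a small script can list them. This is only a sketch using the standard library: the function name fuzzy_msgids is hypothetical, and it assumes single-line msgid values (a multi-line msgid, or the header entry, would come back as an empty string).

```python
def fuzzy_msgids(po_text):
    """Return the msgid of every entry flagged fuzzy in a PO file's text."""
    found = []
    is_fuzzy = False
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith("#,"):
            # Flag comments look like "#, fuzzy" or "#, fuzzy, python-format".
            flags = [flag.strip() for flag in line[2:].split(",")]
            is_fuzzy = "fuzzy" in flags
        elif line.startswith("msgid "):
            if is_fuzzy:
                # Keep the text between the first and last double quote.
                found.append(line[line.index('"') + 1 : line.rindex('"')])
            is_fuzzy = False
    return found


sample = '''#: frontend/src/components/language_consts.js:74
#, fuzzy
#| msgid "Patient Address"
msgid "Patient's address?"
msgstr "Adresse du doctor"

msgid "Save"
msgstr "Enregistrer"
'''
print(fuzzy_msgids(sample))
```

Only the first entry is reported, because the second one carries no fuzzy flag.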

Related

Sphinx - Split up long docstring paragraphs for internationalization

I'm trying to internationalize the documentation of a Python library using Sphinx and Crowdin.
With Sphinx I first generate the .pot files, but there's a problem with these files.
As mentioned in the Sphinx docs:
It is the maintainer’s task to split up paragraphs which are too large as there is no sane automated way to do that.
Here's an example of what I have:
...
#: ../../../disnake/client.py:docstring of disnake.client.Client:4
msgid "A number of options can be passed to the :class:`Client`."
msgstr ""
#: ../../../disnake/abc.py:docstring of disnake.abc.GuildChannel.clone:0
#: ../../../disnake/abc.py:docstring of disnake.abc.GuildChannel.create_invite:0
#: ../../../disnake/abc.py:docstring of disnake.abc.GuildChannel.delete:0
...
whereas I need every method docstring to have a msgid and an empty msgstr for translators.
Now, am I supposed to write a script to do this? If so, that script should extract paragraphs to use as msgids, but I don't know where to start. I've also searched the internet, but I couldn't find any example.
Thanks in advance.

Django French Translation - how to handle single quotes in translation strings?

I am using Python 3.5.2 and Django 1.10.
I have received the French translation .po file and can run the compilemessages command without receiving any errors.
However, when I run the site, many pages refuse to load.
I suspect that this is because the French translation .po file contains many single quotes (') in the translation strings.
For example,
#: .\core\constants\address_country_style_types.py:274
msgid "Ascension Island"
msgstr "Île de l'Ascension"
I remember reading somewhere (but cannot find the reference) that single quotes must be preceded by either a forward slash or a backslash. So I tried that, but when I ran the compilemessages command, I got this error message:
C:\Users\me\desktop\myapp\myapp\locale\fr\LC_MESSAGES\django.po:423:18: invalid control sequence
So how do I escape the single quotes in the French strings?
Here is the header of my French .po file:
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2017-05-04 12:55+1000\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"
I am unsure what the cause of this issue is (maybe the translator somehow corrupted the file?).
However, as a workaround, instead of the standard single quotation mark ', I have used this one (taken from the symbols in MS Word):
′
I am yet to check this with the French translator, but it looks and works OK.
I hope this helps someone.
The correct way is to "Escape" the single quote, however, you need to know the end-point consuming the text. Like you found out with the backslash, as in:
L\'Ascension
Trust me, nobody that is French will like seeing the backquote. Back in the DOS days of the 90's, visually, there was almost no difference. Now with fonts, it gets ugly.
Since you're producing for the web, use an HTML replacement, like &apos;
See this article:
Why shouldn't `&apos;` be used to escape single quotes?
The solution is
#: .\core\constants\address_country_style_types.py:274
msgid "Ascension Island"
msgstr "Ile de l‘Ascension"
It works, even if it is used in some JavaScript. Don't use the numeric code &#39;: it will not work inside form fields, where it is not rendered and you will see the ugly number instead. I have already tested all this.
As I said in the comments, beginning a word with an uppercase accented letter is not recommended. If you put Île and then sort the list of countries, the Î character will come after the Z and the list will not follow the natural order you would expect.
This is a separate problem with Python's default sorting. It compares strings by the code point of each character, and Î has code point 206, so it comes after Z, which is 90.
Maybe Python provides a solution to this, but I haven't found one yet. If someone has, I would be glad to know.
I'm a French speaker, so are most of my users.
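One possible approach to the sorting concern above is to sort on a key with the accents stripped, rather than avoiding accented capitals in the translation. This is a sketch using only the standard library: the helper name accent_insensitive_key and the sample country list are made up for illustration. A locale-aware alternative is locale.strxfrm, but that depends on which locales are installed on the machine.

```python
import unicodedata


def accent_insensitive_key(text):
    """Build a sort key that ignores diacritics and letter case."""
    # NFKD splits "Î" into "I" plus a combining circumflex (U+0302).
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so only the base letters remain.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


countries = ["Zambie", "Île de l'Ascension", "Inde", "Islande"]
print(sorted(countries))                              # plain code point order
print(sorted(countries, key=accent_insensitive_key))  # accent-insensitive order
```

With the plain sort, "Île de l'Ascension" lands after "Zambie"; with the key, it sorts before "Inde".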
Very annoying bug.
The normal Django escaping techniques (through \' or format_html(my_translated_string)) do not work for me either.
I have used ′ instead of ' and it works OK: the compilemessages command runs and the HTML node renders fine.
It is, however, not very elegant or robust, as every future message needs to take this into account, and that character is not commonly used.
I found another, more robust solution:
escaping through template filters.
in the HTML template:
<h5 class="modal-title">{{help_message_body|escape}}</h5>
and in javascript:
modal.find('.modal-message').html('<h5 class="modal-title">{{help_message_body|escapejs}}</h5>')
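The same idea, escaping for the place where the string ends up rather than altering the translation, can be sketched outside of templates with the standard library. This is an illustration only, not a replacement for Django's escape and escapejs filters; the variable names are hypothetical.

```python
import html
import json

translated = "Île de l'Ascension"

# For HTML text or attribute values: the apostrophe becomes &#x27;.
html_safe = html.escape(translated, quote=True)

# For a JavaScript string literal: json.dumps returns a double-quoted
# literal, so the apostrophe cannot close the string early.
js_literal = json.dumps(translated, ensure_ascii=False)

print(html_safe)
print(js_literal)
```

Note that json.dumps on its own does not neutralise sequences such as a closing script tag, so inside inline scripts the template filter remains the safer choice.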

Using .po keys in bottle i18n

In the basic example for bottle i18n, the msgID is used to get the string in the corresponding language.
return bottle.template("<b>{{_('hello')}} I18N</b>?")
in the corresponding .po, the msgid is defined:
msgid "hello"
msgstr "Hello"
In other projects, the .po does not only contain msgid and msgstr, but also a key before, defined with a hash sign. This is especially useful for longer phrases to avoid clutter in source code:
#: wordpress_file_monitor.php:138 wordpress_file_monitor.php:147
msgid "Remove Alert"
msgstr "Benachrichtigung entfernen"
How can I access this # key using bottle-i18n?

Gettext fallbacks don't work with untranslated strings

In my application's source code, the strings wrapped with gettext are written in Russian, so Russian is my default language and the *.po files are keyed on it.
Now I need to build a fallback chain: a string that is not translated in the Spanish catalog should be looked up in the English catalog, and if it is not translated there either, it should be returned as-is in Russian.
I am trying to do this with the add_fallback method, but untranslated strings in self._catalog of GNUTranslations(NullTranslations) are already mapped to themselves, so the ugettext method never falls back.
What am I doing wrong?
Example:
The current locale is Spanish, and there is no translation for the string "Титул должен быть уникальным" in the Spanish catalog, so "Title should be unique" from the English catalog should be returned.
Spanish *.po file
msgid "Титул должен быть уникальным"
msgstr "" # <— We've got no translation for this string
English *.po file
msgid "Титул должен быть уникальным"
msgstr "Title should be unique"
The Russian *.po file does not contain translations, because Russian is used for the keys in the source code (default language)
msgid "Титул должен быть уникальным"
msgstr ""
I have a Spanish translator (a GNUTranslations object), and I add an English translator (also a GNUTranslations object) as its fallback with the add_fallback method.
So my es_translator._fallback is the en_translator object.
The ugettext function looks the message up as a key in self._catalog, and only calls self._fallback if the key is missing.
But for an untranslated string, self._catalog.get(message) returns the string itself:
self._catalog["Титул должен быть уникальным"] -> "Титул должен быть уникальным", so the English catalog is never searched.
def add_fallback(self, fallback):
    if self._fallback:
        self._fallback.add_fallback(fallback)
    else:
        self._fallback = fallback

def ugettext(self, message):
    missing = object()
    tmsg = self._catalog.get(message, missing)
    if tmsg is missing:
        if self._fallback:
            return self._fallback.ugettext(message)
        return unicode(message)
    return tmsg
However, if a message is marked as fuzzy, it is not included in self._catalog and the fallback works well.
#, fuzzy
msgid "Отсутствуют файлы фотографий"
msgstr "Archivos de fotos ausentes"
OK, Python is doing something different from the standard fallback mechanism here, for added functionality that does not work the way you expect. This may warrant a bug report.
The standard fallback mechanism has only one fallback when a string is missing from a translation: use the source string. In most cases the source is English (the C or POSIX locale forces no lookups), but in your case the source messages, and therefore the C locale, carry Russian text (which may cause other problems, because the C locale is sometimes assumed to be ASCII rather than UTF-8). The current recommended best practice is to write the source strings in English, encoded in seven-bit ASCII, and then translate to all other languages. That is a significant redesign (and admittedly anglocentric), but unless someone improves the tools (an even more significant redesign), it is probably your best bet.
The only way I could solve it was to remove untranslated strings while compiling the *.mo files.
Patch write_mo in babel/messages/mofile.py with:
messages = [m for m in messages if m.string]
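An alternative to patching the compiler is to make the lookup itself treat such entries as missing. The following is only a sketch under several assumptions: it targets Python 3 (where the method is gettext rather than ugettext), the class name FallbackOnUntranslated is hypothetical, and it uses an in-memory dict instead of a compiled .mo file so that it stays self-contained.

```python
import gettext


class FallbackOnUntranslated(gettext.NullTranslations):
    """Hypothetical catalog that falls back on empty or identity entries."""

    def __init__(self, catalog):
        super().__init__()
        # Assumption: a plain dict stands in for the parsed .mo catalog.
        self._catalog = dict(catalog)

    def gettext(self, message):
        tmsg = self._catalog.get(message)
        # Missing, empty, or mapped to itself: treat as untranslated.
        if not tmsg or tmsg == message:
            if self._fallback is not None:
                return self._fallback.gettext(message)
            return message
        return tmsg


MSG = "Титул должен быть уникальным"
es = FallbackOnUntranslated({MSG: MSG})  # untranslated entry maps to itself
en = FallbackOnUntranslated({MSG: "Title should be unique"})
es.add_fallback(en)
print(es.gettext(MSG))
```

The trade-off is that a translation which is legitimately identical to its msgid is also passed down the chain, so a later catalog may answer instead.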

Detect whether the word "you" is a subject or object pronoun based on sentence context.

Ideally using regex, in Python. I'm making a simple chatbot, and it currently has problems responding correctly to phrases like "I love you" (the grammar handler throws back "You love I" when it should give back "You love me").
In addition, it would be great if you could think of good phrases to throw into this grammar handler. I'd love some testing data.
If there's a good list of transitive verbs out there (something like a "top 100 used"), it may be acceptable to use that and special-case the "transitive verb + you" pattern.
Well, what you're trying to implement is definitely a worthwhile challenge, but also very difficult.
Logic
As a starter, I would look a bit into the Grammar rules first.
Basic sentence structure :
SUBJECT + TRANSITIVE VERB + OBJECT
SUBJECT + INTRANSITIVE VERB
(Of course, we could also talk about "Subject+Verb+Indirect Object+Direct Object" formats, etc (e.g. I give you the ball) but this would get too complicated for now...)
Obviously, this scheme is VERY simplistic, but let's stick to that for now.
Then assume (another over-simplistic assumption) that each part is a single word.
So basically you have the following sentence scheme:
WORD WORD WORD
which could generally be matched using a regex like:
(\w+)\s+(\w+)(?:\s+(\w+))?
Explanation:
(\w+) # first word (=subject)
\s+ # one or more spaces
(\w+) # second word (=verb)
(?:\s+(\w+))? # (optional) one or more spaces, then a third word (=object - if the verb is transitive)
Now, obviously to formulate sentences like "You love me" and not "You love I", your algorithm should also "understand" that :
The third part of the sentence has the role of the Object
Since "I" is a personal pronoun in the nominative case (used only as a subject), we should use its accusative form (= as an object); for this purpose, you may also need personal pronoun tables like:
I - my - me
You - your - you
He - his - him
etc...
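The ideas above can be put together in a few lines of Python. This is a minimal sketch, not a full grammar handler: it uses a variant of the three-word pattern, the lookup tables and the function name reflect are hypothetical, and verb agreement ("I am" becoming "You are") is deliberately left out.

```python
import re

# Subject, verb, and an optional single-word object.
SENTENCE = re.compile(r"^\s*(\w+)\s+(\w+)(?:\s+(\w+))?\s*$")

# Hypothetical, deliberately tiny lookup tables keyed by lowercase pronoun.
SUBJECT_SWAP = {"i": "you", "you": "I"}   # nominative position
OBJECT_SWAP = {"me": "you", "you": "me"}  # accusative position


def reflect(sentence):
    """Swap first and second person, picking the case by position."""
    match = SENTENCE.match(sentence)
    if match is None:
        return None  # not a two- or three-word sentence
    subject, verb, obj = match.groups()
    parts = [SUBJECT_SWAP.get(subject.lower(), subject), verb]
    if obj is not None:
        parts.append(OBJECT_SWAP.get(obj.lower(), obj))
    result = " ".join(parts)
    # Capitalise the first letter of the reply.
    return result[0].upper() + result[1:]


print(reflect("I love you"))
print(reflect("You love me"))
```

Because the replacement is chosen by position in the sentence, "you" as an object becomes "me" while "you" as a subject becomes "I".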
Just a few ideas... (purely out of my enthusiasm for linguistics :-))
Data
As for the wordlists you are interested in, just a few samples :
330 Most Common English Verbs (most, if not all of them, are transitive)
Personal Pronouns Chart
What you want is a syntactic analyser (aka parser). This can be done with a rule-based system, as described by @Dr.Kameleon, or statistically. There are many implementations out there, one being the Stanford parser. These will generally tell you what the syntactic role of a word is (e.g. subject in "You are here", or object in "She likes you"). How you use that information to turn statements into questions is a whole different can of worms. For English, you can get a fairly simple rule-based system to work OK.
