Jupyterlab i18next: How to extract translatable strings from JupyterLab notebooks?

I would like to offer my JupyterLab notebooks in several languages (de-de, en-us).
To do so, I marked some strings for translation:
import gettext
domain = 'my_application_name'
localedir = '.'
translate = gettext.translation(domain, localedir, fallback=True)
_ = translate.gettext # using _ as name for the translation function is kind of standard in python
# do not confuse with private markers or underscore library
# https://github.com/serkanyersen/underscore.py
print(_('Hello World'))
print(_('another translation key'))
Then I downloaded xgettext.exe for windows from
https://github.com/mlocati/gettext-iconv-windows/releases/download/v0.21-v1.16/gettext0.21-iconv1.16-static-64.zip
and tried to extract the strings with the following console command:
xgettext.exe my_notebook.ipynb
I got the warning
xgettext.exe: warning: file 'my_notebook.ipynb' extension 'ipynb' is unknown; will try C
and no output file was generated.
=> What is the recommended way to extract translatable strings from JupyterLab notebooks?
I would prefer a solution that does not require an extra binary (like xgettext.exe) on Windows.
Does JupyterLab itself provide some translation features/extension (not for the UI but for notebooks)?
As a possible workaround, the notebook could first be converted to a python file with nbconvert and then passed to xgettext.exe. However, that seems to be too complicated. There should be a more elegant solution.
(The extraction of translatable strings from python files does work on windows, e.g.
xgettext.exe my_python_file.py
)
The rough workflow for translations seems to be:
1. Mark all strings that should be translated
2. Generate a translation file template from the strings (="master list" or Portable Object Template (POT) file)
3. Translate the translation file template
4. Apply the translation file
Related:
https://github.com/jupyterlab/jupyterlab/issues/11753
https://docs.python.org/3/library/i18n.html
https://www.mattlayman.com/blog/2015/i18n/

Here is the already mentioned workaround based on nbconvert and xgettext:
jupyter nbconvert --to script my_notebook.ipynb --output z_temp_file_for_translation
xgettext.exe z_temp_file_for_translation.py -o translations.pot
del z_temp_file_for_translation.py
As an alternative to xgettext, the python package Babel can be used:
pip install Babel
Also see: http://babel.pocoo.org/en/latest/cmdline.html#extract
jupyter nbconvert --to script my_notebook.ipynb --output z_temp_file_for_translation
pybabel extract z_temp_file_for_translation.py -o translations.pot
del z_temp_file_for_translation.py
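
If neither an extra binary nor a temporary file is wanted, the marked strings can also be collected from the notebook JSON directly. The following is only a minimal sketch using the standard library; the function names extract_messages and to_pot are made up for this example, it only finds plain string literals passed to _(...), and the generated entries have no POT header or source references, so it is not a replacement for xgettext or Babel.

```python
import ast


def extract_messages(notebook, func_name="_"):
    """Collect string literals passed to func_name(...) in the code cells.

    `notebook` is the parsed .ipynb content, e.g. json.load(open(path)).
    """
    messages = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # the source may be stored as a list of lines
            source = "".join(source)
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [line for line in source.splitlines()
                 if not line.lstrip().startswith(("%", "!"))]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells that still cannot be parsed
        for node in ast.walk(tree):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id == func_name
                    and node.args
                    and isinstance(node.args[0], ast.Constant)
                    and isinstance(node.args[0].value, str)
                    and node.args[0].value not in messages):
                messages.append(node.args[0].value)
    return messages


def to_pot(messages):
    """Render the messages as bare msgid/msgstr entries (assumption: no header needed)."""
    entries = []
    for message in messages:
        escaped = (message.replace("\\", "\\\\")
                          .replace('"', '\\"')
                          .replace("\n", "\\n"))
        entries.append('msgid "{}"\nmsgstr ""\n'.format(escaped))
    return "\n".join(entries)
```

Typical use would be to load my_notebook.ipynb with json.load, pass the result to extract_messages, and write the output of to_pot to translations.pot.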

Related

PyLaTeX: pylatex.errors.CompilerError: No LaTex compiler was found

I am trying to run the exact code from here to get an example of pylatex working.
In the directory that I am working in, I have copied and pasted from the link:
from pylatex import Document, Section, Subsection, Command
from pylatex.utils import italic, NoEscape
import pdflatex

def fill_document(doc):
    """Add a section, a subsection and some text to the document.
    :param doc: the document
    :type doc: :class:`pylatex.document.Document` instance
    """
    with doc.create(Section('A section')):
        doc.append('Some regular text and some ')
        doc.append(italic('italic text. '))
        with doc.create(Subsection('A subsection')):
            doc.append('Also some crazy characters: $&#{}')

if __name__ == '__main__':
    # Basic document
    doc = Document('basic')
    fill_document(doc)
    doc.generate_pdf(clean_tex=False,compiler='pdflatex')
    doc.generate_tex()
    # Document with `\maketitle` command activated
    doc = Document()
    doc.preamble.append(Command('title', 'Awesome Title'))
    doc.preamble.append(Command('author', 'Anonymous author'))
    doc.preamble.append(Command('date', NoEscape(r'\today')))
    doc.append(NoEscape(r'\maketitle'))
    fill_document(doc)
    doc.generate_pdf('basic_maketitle', clean_tex=False)
    # Add stuff to the document
    with doc.create(Section('A second section')):
        doc.append('Some text.')
    doc.generate_pdf('basic_maketitle2', clean_tex=False)
    tex = doc.dumps()  # The document as string in LaTeX syntax
I consistently get the error:
Traceback (most recent call last):
  File "test.py", line 26, in <module>
    doc.generate_pdf(clean_tex=False,compiler='pdflatex')
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.7/site-packages/pylatex/document.py", line 280, in generate_pdf
    'Either specify a LaTex compiler ' +
pylatex.errors.CompilerError: No LaTex compiler was found
Here are some of the things I've tried based on other people's suggestions:
1. If I just open a Python console in this directory and type:
from pylatex import Document, Section, Subsection, Command
from pylatex.utils import italic, NoEscape
import pdflatex
there are no errors, implying that the imports were successful.
2. I saw another recommendation that Perl must be installed, which it is:
localhost:test1$ perl --version
returns info about Perl.
3. I've specified the compiler, as was suggested elsewhere on StackOverflow:
doc.generate_pdf(clean_tex=False,compiler='pdflatex')
What else can I do? The ultimate aim is this: I have generated strings and an image with Python, and I want to put them both into a PDF and format it, in case anyone can suggest a better alternative. P.S. I'm aware of the TeX Stack Exchange, but I specifically need a way for LaTeX to interact with Python, which is why I asked here.

Apparently you'll need to install these two dependencies with apt-get:
sudo apt-get install latexmk
sudo apt-get install -y texlive-latex-extra
Also install pdflatex with pip:
pip install pdflatex
Then use:
doc.generate_pdf(compiler='pdflatex')

For me, on CentOS 8, I had to run:
pip install pylatex
sudo dnf install texlive
sudo dnf install texlive-lastpage
Then it worked for me.

I had this same error, and my problem was that I had forgotten to add pdflatex (through MiKTeX) to my PATH environment variable in Windows. My code worked after reloading my terminal. I have not installed Perl, by the way.
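
What the answers above have in common is that a LaTeX compiler executable has to be reachable from the environment in which Python runs; importing Python modules successfully says nothing about that. A quick way to check is sketched below with shutil.which. The helper name find_latex_compiler is hypothetical, and the candidate names are simply the two executables mentioned in the answers.

```python
import shutil


def find_latex_compiler(candidates=("latexmk", "pdflatex")):
    """Return the first candidate executable found on PATH, or None."""
    for name in candidates:
        if shutil.which(name):  # None when the executable is not on PATH
            return name
    return None


compiler = find_latex_compiler()
if compiler is None:
    print("No LaTeX compiler on PATH; install a TeX distribution or fix PATH.")
else:
    print("Found LaTeX compiler:", compiler)
```

If this prints the "not on PATH" message in the same terminal that runs the script, the fix is on the system side (installation or PATH), not in the PyLaTeX code.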

can not find nbconvert.py after git clone and pip install nbconvert

I am following the instruction here http://www.slideviper.oquanta.info/tutorial/slideshow_tutorial_slides.html#/10 to convert my ipynb file to a html slideshow.
As I didn't have nbconvert, I ran $ git clone https://github.com/jupyter/nbconvert.git and $ pip install nbconvert. Everything looked good.
In my folder, $ ls returns:
CONTRIBUTING.md COPYING.md docs MANIFEST.in nbconvert README.md readthedocs.yml scripts setup.cfg setup.py
Here I could not find nbconvert.py.
As nbconvert is a folder, I ran $ cd nbconvert and $ ls. It showed:
exporters __init__.py nbconvertapp.py preprocessors templates utils writers
filters __main__.py postprocessors resources tests _version.py
Still, I did not see nbconvert.py. I tried $ python nbconvertapp.py -f reveal myfile.ipynb and got the error:
  File "nbconvertapp.py", line 26, in <module>
    from .exporters.export import get_export_names, get_exporter
ValueError: Attempted relative import in non-package
I am not clear what's going on here. Could someone point me to the right way to install or use nbconvert so I can convert my ipynb file to an HTML slideshow? Thank you!

You should add the directory that contains jupyter.exe, e.g. c:\python27\Scripts (or wherever this file is), to the Windows PATH. Then cd to where your .ipynb file is and type in a Windows cmd console:
jupyter nbconvert --to format yourfile.ipynb
"format" can be: html, slides, pdf, etc.
Good luck

Because nbconvert is installed as a package on your base Python path, it is accessible from anywhere you have access to Python. Therefore the correct way to use nbconvert is:
jupyter nbconvert <command>
If you run just:
jupyter nbconvert
you will see help information on the options it accepts.
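
The relative-import error in the question comes from running nbconvertapp.py directly as a script, which makes Python treat it as a standalone file rather than as part of the nbconvert package. Since the package contains a __main__.py (see the directory listing in the question), it can be run as a module instead. A small sketch of driving that from Python follows; the helper names are hypothetical, and "slides" is assumed as the output format because the goal is a slideshow.

```python
import subprocess
import sys


def build_nbconvert_command(notebook_path, to_format="slides"):
    """Build a command that runs the installed nbconvert package as a module."""
    return [sys.executable, "-m", "nbconvert", "--to", to_format, notebook_path]


def convert_notebook(notebook_path, to_format="slides"):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    subprocess.run(build_nbconvert_command(notebook_path, to_format), check=True)
```

Calling convert_notebook("myfile.ipynb") from the directory that holds the notebook should be equivalent to typing jupyter nbconvert --to slides myfile.ipynb, provided nbconvert is installed for the same interpreter.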

pyreport LaTeX formulae not working

I'm trying to create an HTML report using pyreport, and it works except for one point: the LaTeX formulae are not generated.
Here is the input file I use for testing:
#$ This is \LaTeX : $c = 2\cdot(a+b)$
Then I run pyreport -l -t html --verbose file.py, but the report that I get is empty. When I add other comments to the input file, or some Python code, these are displayed properly within the report. Here is the output from pyreport:
Running python script
/tmp/file.py:
Outputing report to
/tmp/file.html
Ran script in 0.13s
I'm using Ubuntu and I have the texlive package installed. Why isn't the formula added to the report?

I think I have found the problem.
It lies in the RST tool that is used for the conversion to HTML.
In pyreport, when you choose the math mode, the program puts the formula in a block that starts with .. raw:: LaTeX
In newer versions of rst2html this directive no longer works for that purpose; it has been replaced by:
.. math::
If you use the command:
pyreport -l -e -t rst --verbose file.py
and then
rst2html file.rst > test.html
You will see the problem.
You can change that in the pyreport code, in main.py of pyreport (use locate to find it). There, replace
.. raw:: Latex
with
.. math::
The remaining problem is the \LaTeX command, which is not a math-mode command in LaTeX, so it will not work there.
For details, refer to the RST documentation: http://docutils.sourceforge.net/docs/ref/rst/directives.html#raw
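
If you would rather not patch pyreport itself, the same substitution could be applied to the intermediate file.rst before running rst2html. This is only a sketch under the assumption that the directive appears on a line of its own, as described above; the function name raw_latex_to_math is made up, and whether the block content needs further adjustment depends on what pyreport actually writes.

```python
import re

# Matches a line consisting only of ".. raw:: latex" (any capitalisation),
# keeping the leading indentation in group 1.
RAW_LATEX = re.compile(r"^([ \t]*)\.\. raw:: latex[ \t]*$",
                       re.IGNORECASE | re.MULTILINE)


def raw_latex_to_math(rst_text):
    """Replace every '.. raw:: latex' directive line with '.. math::'."""
    return RAW_LATEX.sub(r"\1.. math::", rst_text)
```

The rewritten text can then be saved and passed to rst2html as in the commands above.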

Ignore IPython magic in python

What is the best way to ignore IPython magic when running scripts using the python interpreter?
I often include IPython magic in my script files because I work with the code interactively. For example, with the autoreload magic, I don't have to keep reloading the modules after I make some changes and fix bugs:
%load_ext autoreload
%autoreload 2
However, when I try to run this script using the usual python interpreter, I get an error:
  File "<string>", line 1
    %load_ext autoreload
    ^
SyntaxError: invalid syntax
Wrapping IPython magic inside an if statement does not work, because incorrect syntax is detected before the file is actually run.
So what is the best way to get python to ignore IPython magic?
It's annoying to have to change your scripts whenever you want to run them in python, pdb, sphinx, etc.

For all tools that can read from standard input you could use grep to remove any magic lines and pipe the result into python:
grep -v '^%' magicscript.ipy | python
Works well as a bash alias:
alias pynomagic='( grep -v "^%" | python ) < '
pynomagic magicscript.ipy
Tools like pdb that only accept filenames could be called like this (bash again):
pdb <(grep -v '^%' magicscript.ipy)
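
Where grep is not available (for example on Windows), the same filtering can be sketched in Python. Like grep -v '^%', this drops every line that starts with %, so it would also remove such a line inside a multi-line string or a continued expression. The function name strip_magics is hypothetical.

```python
def strip_magics(source):
    """Return the source without lines that start with '%'."""
    return "".join(line for line in source.splitlines(keepends=True)
                   if not line.startswith("%"))


script = "%load_ext autoreload\n%autoreload 2\nx = 1 + 1\n"
namespace = {}
# The filtered text is plain Python and can be compiled and executed.
exec(compile(strip_magics(script), "magicscript.ipy", "exec"), namespace)
```

For a real file, read its contents first and pass the text to strip_magics before calling compile.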

In case this helps anyone.
At least for Databricks, when syncing a notebook with a .py file in GitHub, a magic function can be specified with a specially formatted comment.
Like this:
# MAGIC %run ./my_external_file

You should load such magic in your config file, not in your scripts! It is just not valid Python.
Put the following in your ~/.ipython/profile_default/ipython_config.py:
c = get_config()
c.InteractiveShellApp.extensions = ['autoreload']
c.InteractiveShellApp.exec_lines = ['%autoreload 2']
c.InteractiveShellApp.exec_lines.append('print("Warning: disable autoreload in ipython_config.py to improve performance.")')

Create a template file named simplepython.tpl. Copy the below statements.
{% extends 'python.tpl'%}
{% block codecell %}
{{ super().replace('get_ipython','#get_ipython') if "get_ipython" in super() else super() }}
{% endblock codecell %}
Save simplepython.tpl.
Type in command line:
jupyter nbconvert --to python 'IPY Notebook' --template=simplepython.tpl --stdout

Spyder gives a warning when this type of code is used and says that it is not valid Python code.
So, in order to use IPython magics, saving files with the .ipy extension may be a solution.

Have the same README both in Markdown and reStructuredText

I have a project hosted on GitHub. For this I have written my README using the Markdown syntax in order to have it nicely formatted on GitHub.
As my project is in Python, I also plan to upload it to PyPI. The syntax used for READMEs on PyPI is reStructuredText.
I would like to avoid having to handle two READMEs containing roughly the same content, so I searched for a Markdown to RST (or the other way around) translator, but couldn't find any.
The other solution I see is to perform a Markdown/HTML and then an HTML/RST translation. I found some resources for this here and here, so I guess it should be possible.
Would you have any idea that could fit better with what I want to do?

I would recommend Pandoc, the "swiss-army knife for converting files from one markup format into another" (check out the diagram of supported conversions at the bottom of the page, it is quite impressive). Pandoc allows markdown to reStructuredText translation directly. There is also an online editor here which lets you try it out, so you could simply use the online editor to convert your README files.

As @Chris suggested, you can use Pandoc to convert Markdown to RST. This can be automated using the pypandoc module and some magic in setup.py:
from setuptools import setup
try:
    from pypandoc import convert
    read_md = lambda f: convert(f, 'rst')
except ImportError:
    print("warning: pypandoc module not found, could not convert Markdown to RST")
    read_md = lambda f: open(f, 'r').read()
setup(
    # name, version, ...
    long_description=read_md('README.md'),
    install_requires=[]
)
This will automatically convert README.md to RST for the long description used on PyPI. When pypandoc is not available, it just reads README.md without the conversion, so that others are not forced to install pypandoc when they just want to build the module, not upload it to PyPI.
So you can write in Markdown as usual and don’t care about RST mess anymore. ;)
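
Newer pypandoc releases expose convert_file and convert_text, and the older convert function used above may not be available there. Below is a hedged variant of the same idea as a regular function; it assumes convert_file is present in the installed pypandoc and falls back to the raw Markdown when pypandoc or the pandoc binary is missing.

```python
def read_md(path):
    """Return the file converted to RST, or its raw content as a fallback."""
    try:
        import pypandoc
        # Assumption: the installed pypandoc provides convert_file.
        return pypandoc.convert_file(path, "rst")
    except (ImportError, OSError):
        # pypandoc is not installed, or it could not find the pandoc binary.
        with open(path, encoding="utf-8") as readme:
            return readme.read()
```

In setup.py it would be used exactly like the lambda above: long_description=read_md('README.md').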

2019 Update
The PyPI Warehouse now supports rendering Markdown as well! You just need to update your package configuration and add the long_description_content_type='text/markdown' to it. e.g.:
setup(
name='an_example_package',
# other arguments omitted
long_description=long_description,
long_description_content_type='text/markdown'
)
Therefore, there is no need to keep the README in two formats any longer.
You can find more information about it in the documentation.
Old answer:
The Markup library used by GitHub supports reStructuredText. This means you can write a README.rst file.
They even support syntax specific color highlighting using the code and code-block directives (Example)

PyPI now supports Markdown for long descriptions!
In setup.py, set long_description to a Markdown string, add long_description_content_type="text/markdown" and make sure you're using recent tooling (setuptools 38.6.0+, twine 1.11+).
See Dustin Ingram's blog post for more details.
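
The snippets above pass a long_description variable without showing where it comes from. One common way to fill it is to read README.md at build time; a small sketch follows, where the helper name markdown_description_kwargs is made up for illustration.

```python
from pathlib import Path


def markdown_description_kwargs(readme_path="README.md"):
    """Return the two setup() arguments needed for a Markdown long description."""
    return {
        "long_description": Path(readme_path).read_text(encoding="utf-8"),
        "long_description_content_type": "text/markdown",
    }
```

In setup.py the result can be unpacked into the call, for example setup(name='an_example_package', **markdown_description_kwargs()).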

You might also be interested in the fact that it is possible to write in a common subset so that your document comes out the same way when rendered as markdown or rendered as reStructuredText: https://gist.github.com/dupuy/1855764 ☺

For my requirements I didn't want to install Pandoc on my computer, so I used docverter. Docverter is a document conversion server with an HTTP interface that uses Pandoc for this.
import requests
# Upload README.md and ask the service to convert Markdown to RST
with open('README.md', 'rb') as readme:
    r = requests.post(url='http://c.docverter.com/convert',
                      data={'to': 'rst', 'from': 'markdown'},
                      files={'input_files[]': readme})
if r.ok:
    print(r.text)

I ran into this problem and solved it with the two following bash scripts.
Note that I have LaTeX bundled into my Markdown.
#!/usr/bin/env bash
if [ $# -lt 1 ]; then
    echo "$0 file.md"
    exit;
fi
filename=$(basename "$1")
extension="${filename##*.}"
filename="${filename%.*}"
if [ "$extension" = "md" ]; then
    rst=".rst"
    pandoc $1 -o $filename$rst
fi
It's also useful to convert to HTML. md2html:
#!/usr/bin/env bash
if [ $# -lt 1 ]; then
    echo "$0 file.md <style.css>"
    exit;
fi
filename=$(basename "$1")
extension="${filename##*.}"
filename="${filename%.*}"
if [ "$extension" = "md" ]; then
    html=".html"
    if [ -z $2 ]; then
        # if no css
        pandoc -s -S --mathjax --highlight-style pygments $1 -o $filename$html
    else
        pandoc -s -S --mathjax --highlight-style pygments -c $2 $1 -o $filename$html
    fi
fi
I hope that helps

Using the pandoc tool suggested by others I created a md2rst utility to create the rst files. Even though this solution means you have both an md and an rst it seemed to be the least invasive and would allow for whatever future markdown support is added. I prefer it over altering setup.py and maybe you would as well:
#!/usr/bin/env python
'''
Recursively and destructively creates a .rst file for all Markdown
files in the target directory and below.
Created to deal with PyPa without changing anything in setup based on
the idea that getting proper Markdown support later is worth waiting
for rather than forcing a pandoc dependency in sample packages and such.
Vote for
(https://bitbucket.org/pypa/pypi/issue/148/support-markdown-for-readmes)
'''
import sys, os, re
markdown_sufs = ('.md','.markdown','.mkd')
markdown_regx = r'\.(md|markdown|mkd)$'  # raw string avoids an invalid-escape warning
target = '.'
if len(sys.argv) >= 2:
    target = sys.argv[1]
md_files = []
for root, dirnames, filenames in os.walk(target):
    for name in filenames:
        if name.endswith(markdown_sufs):
            md_files.append(os.path.join(root, name))
for md in md_files:
    bare = re.sub(markdown_regx,'',md)
    cmd='pandoc --from=markdown --to=rst "{}" -o "{}.rst"'
    print(cmd.format(md,bare))
    os.system(cmd.format(md,bare))
