wiki/docbook/latex documentation template system - python

I'm searching for a documentation template system or rather will be creating one.
It should support the following features:
Create output in PDF and HTML
Support for large & complicated (LaTeX) formulas
References between documents
Bibliographies
Templates will be filled by a Python script
I've tried LaTeX with various TeX-to-HTML converters but I'm not satisfied with the results.
I've been using DocBook for a while, but I think that editing DocBook is not easy to write and the support for formulas is not yet sufficient.
The main problem is, that there will be users of this system that do not know LaTeX syntax or DocBook. I've thought about an alternative for these users providing an editing possibility with Wiki syntax (converted by Python to LaTeX).
Let's sum up: I want HTML and PDF output from at least LaTeX and Wiki input. DocBook could be used as intermediate format.
Has anybody had a similar problem or can give me an advice on which tools and which file formats I should use ?

We use sphinx: https://www.sphinx-doc.org
It does almost all of that.
Your python script or your users or whomever (I can't follow the question) can create content using RST markup (which is perhaps the easiest of markup languages). You run it through Sphinx and you get HTML and Latex.

I created a LaTeX pre-processor and python module that allows you to embed python or SQL inside a LaTeX file. The python and/or SQL is executed and the output is folded in.
With latex2html or latex2rtf you can then use the LaTeX code to produce HTM and RTF.
I've posted it for you at http://simson.net/pylatex/

Arbortext supports LaTeX natively. You can send the publishing engine or print composer LaTeX and it'll pass it through directly.
It also supports a lot of other composition languages as well and even gives the opportunity to do page-layout manipulation like you'd see in InDesign (without the headache and overhead of ID).

I think that Asciidoc is better targeted at what you are trying to get. It is a simple markup language, it allows latex formulas in it and it generates Docbook documents from which you can further generate the readable HTML or Latex representation

Related

Is there a way to add Julia, R and python to a single text file like R markdown or a notebook that could be manipulated as a text file?

Stated briefly: I would like to have a text file where I can smoothly switch among R, python and Julia. Of importance, I am looking for a way to run rather than just display code
I know it is possible to add python (and many other languages) to R markdown http://goo.gl/4w8XIb , but not sure I could add Julia. Also possible to use notebooks like Beaker http://beakernotebook.com/ with all three languages (and more) , but my issue with notebooks is that they are not nearly as fast to manipulate compared to what can be done with a text file in an editor environment (sublime, emacs, vim, atom ...). I know very little about notebooks, and the ones I know of are represented as json files, but manipulating a json file to write a report is all but user friendly.
I'm probably missing the obvious, but any other way to do this? thanks
I recently created an R package JuliaCall, and it can be used as julia engine in R Markdown document, see https://non-contradiction.github.io/JuliaCall/articles/JuliaCall_in_RMarkdown.html for an example.
Although JuliaCall is already on CRAN, this new feature is still in the development version on github. If you want to try it, use
devtools::install_github("Non-Contradiction/JuliaCall")
to install JuliaCall.
The feature includes
Multiple julia chunks running by same julia session.
Accessing R variables, functions inside julia code and vice versa.
The current limitation is that it only fully support html output.
With Restructured Text, there is good support for including code samples, where each code-block directive can include the relevant
language.
.. code-block:: ruby
Some Ruby code.
Markdown also supports mentioning the language with each code block, e.g.:
```javascript
var s = "JavaScript syntax highlighting";
alert(s);
```
```python
s = "Python syntax highlighting"
print s
```
```
No language indicated, so no syntax highlighting.
But let's throw in a <b>tag</b>.
```
I think Beaker Notebook is actually a very good solution for your needs. It is a polyglot tool which will let you combine R, Python and Julia very well. There is a Vim editing mode which is not perfect, but still quite fast. There are shortcut keys for executing cells quickly, executing only selected lines, as well as jumping between cells. Beaker is also a permissively licensed open source project on GitHub with a very responsive maintainer, so you could also contribute any missing features directly as PRs.

enable markdown syntax highlighting in code comments/docs

There are great in-code documentation standards for python, for example:
Google
Numpy
Which use some nice, simple ReStructured / Markdown like syntax. Is there a way to have emacs render ReST / Md inside the comments of python code? I.e. the major-mode would still be python.el, and normal python syntax would work; but inside a comment block ('''...''' ) it would be rendered as some sort of markdown.
You should be able to achieve this by using the polymode package (https://github.com/vspinu/polymode). It allows you to have multiple major modes in one buffer. Have a look at the screenshot examples.
There are other packages too which can enable multiple major modes in one file: http://www.emacswiki.org/emacs/MultipleModes

XML embedded in Python docstring, use syntax highlighting

I'm working on a Python project where there are some blocks of XML defined inside docstrings.
The code contains strings like this
xml_str = """<a>
<b>text</b>
</a>"""
In reality, the blocks of embedded XML are much larger. The problem is this XML becomes difficult to read. Since an IDE renders the XML as a String in one color, the text cannot be visually parsed as easily as if it had normal XML syntax highlighting applied.
I'm looking for an editor which will either natively support syntax highlighting in Strings, or where such a feature can be hacked in easily. This feature would be really cool to have, so I'm prepared to invest some time to make it happen.
I realize there are some arguments for why embedding XML in such a manner would be bad practice. I would agree, except in this situation. I have found a way to solve a problem very effectively by placing XML in a Python file directly instead of into an external resource.
Edit
I normally use PyDev for Eclipse, so I would be biased to solutions using it. Although I am prepared to swtich IDEs for this project if necessary.
Use syntax highlighting plugin http://colorer.sf.net/eclipsecolorer/ for eclipse. It Automatic folding support for mostly all (200+) supported languages.

Sweave for python

I've recently started using Sweave* for creating reports of analyses run with R, and am now looking to do the same with my python scripts.
I've found references to embedding python in Sweave docs, but that seems like a bit of a hack. Has anyone worked out a better solution, or is there an equivalent for python I'm not aware of?
* Sweave is a tool that allows to embed the R code for complete data analyses in latex documents
I have written a Python implementation of Sweave called Pweave that implements basic functionality and some options of Sweave for Python code embedded in reST or Latex document. You can get it here: http://mpastell.com/pweave and see the original blog post here: http://mpastell.com/2010/03/03/pweave-sweave-for-python/
Some suggestions:
I have been using Pweave for several years now, and it is very similar to Sweave. Highly recommended.
The most popular tool for embedded reports in python at this stage is Jupyter notebooks, which allow you to embed markdown, and they are quite useful although I personally still like writing things in LaTeX...
You can also have a look at PyLit, which is intended for literate programming with Python, but not as well maintained as some of the alternatives.
Sphinx is great for documenting with python, and can output LaTex.
Here's a list of tools for literate programming. Some of these work with any programming language.
Dexy is a very similar product to Sweave. One advantage of Dexy is that it is not exclusive to one single language. You could create a Dexy document that included R code, Python code, or about anything else.
This is a bit late, but for future reference you might consider my PythonTeX package for LaTeX. PythonTeX allows you enter Python code in a LaTeX document, run it, and bring back the output. But unlike Sweave, the document you actually edit is a valid .tex document (not .Snw or .Rnw), so editing the non-code part of the document is fast and convenient.
PythonTeX provides many features, including the following:
The document can be compiled without running any Python code; code only needs to be executed when it is modified.
All Python output is saved or cached.
Code runs in user-defined sessions. If there are multiple sessions, sessions automatically run in parallel using all available cores.
Errors and warnings are synchronized with the line numbers of the .tex document, so you know exactly where they came from.
Code can be executed, typeset, or typeset and executed. Syntax highlighting is provided by Pygments.
Anything printed by Python is automatically brought into the .tex document.
You can customize when code is re-executed (modified, errors, warnings, etc.).
The PythonTeX utilities class is available in all code that is executed. It allows you to automatically track dependencies and specify created files that should be cleaned up. For example, you can set the document to detect when the data it depends on is modified, so that code will be re-executed.
A basic PythonTeX file looks like this:
\documentclass{article}
\usepackage{pythontex}
\begin{document}
\begin{pycode}
#Whatever you want here!
\end{pycode}
\end{document}
You might consider noweb, which is language independent and is the basis for Sweave. I've used it for Python and it works well.
http://www.cs.tufts.edu/~nr/noweb/
I've restructured Matti's Pweave a bit, so that it is possible to define arbitrary "chunk-processors" as plugin-modules. This makes it easy to extend for several chunk-based text-preprocessing applications. The restructured version is available at https://bitbucket.org/edgimar/pweave/src. As an example, you could write the following LaTeX-Pweave document (notice the "processor name" in this example is specified with the name 'mplfig'):
\documentclass[a4paper]{article}
\usepackage{graphicx}
\begin{document}
\title{Test document}
\maketitle
Don't miss the great information in Figure \ref{myfig}!
<<p=mplfig, label=myfig, caption = "Figure caption...">>=
import sys
import pylab as pl
pl.plot([1,2,3,4,5],['2,4,6,8,10'], 'b.', markersize=15)
pl.axis('scaled')
pl.axis([-3,3, -3,3]) # [xmin,xmax, ymin,ymax]
#
\end{document}
You could try SageTeX which implements Sweave-Like functionality for the SAGE mathematics platform. I haven't played around with it as much as I would like to, but SAGE is basically a python shell and evaluates python as it's native language.
I have also thought about the same thing many times. After reading your questions and looking into your link I made small modifications to the custom python Sweave driver, that you link to. I modified it to also keep the source code and produce the output as well the same way that Sweave does for R.
I posted the modified version and an example here: http://mpastell.com/2010/02/09/python-in-sweave-document/
Granted, it is not optimal but I'm quite happy with the output and I like the ability to include both R and Python in the same document.
Edit about PyLit:
I also like PyLit and contrary to my original answer you can catch ouput with it as well, although it not as elegant as Sweave! Here is a small example how to do it:
import sys
# Catch PyLit output
a = range(3)
sys.stdout = open('output.txt', 'w')
print a
sys.stdout = sys.__stdout__
# .. include:: output.txt
What you're looking for is achieved with GNU Emacs and org-mode*. org-mode does far more than can be detailed in a single response, but the relevant points are:
Support for literate programming with the ability to integrate multiple languages within the same document (including using one language's results as the input for another language).
Graphics integration.
Export to LaTeX, HTML, PDF, and a variety of other formats natively, automatically generating the markup (but you can do it manually, too).
Everything is 100% customizable, allowing you to adapt the editor to your needs.
I don't have Python installed on my system, but below is an example of two different languages being run within the same session. The excerpt is modified from the wonderful org-mode R tutorial by Erik Iverson which explains the set up and effective use of org-mode for literate programming tasks. This SciPy 2013 presentation demonstrates how org-mode can be integrated into a workflow (and happens to use Python).
Emacs may seem intimidating. But for statistics/data science, it offers tremendous capabilities that either aren't offered anywhere else or are spread across various systems. Emacs allows you to integrate them all into a single interface. I think Daniel Gopar says it best in his Emacs tutorial,
Are you guys that lazy? I mean, c'mon, just read the tutorial, man.
An hour or so with the Emacs tutorial opens the door to some extremely powerful tools.
* Emacs comes with org-mode. No separate install is required.
Well, with reticulate which is a recent best implementation of a Python interface in R you could continue using Sweave and call Python inline using the R interpreter. For example this now works in a .Rnw or .Rmd markdown file.
```{r example, include=FALSE}
library(reticulate)
use_python("./dir/python")
```
```{python}
import pandas
data = pandas.read_csv("./data.csv")
print(data.head())
```
I think that Jupyter-book may do what you want.

What's the Name of the Python Module that Formats arbitrary Text to nicely looking HTML?

A while ago I came across a Python library that formats regular text to HTML similar to Markdown, reStructuredText and Textile, just that it had no syntax at all. It detected indentatations, quotes, links and newlines/paragraphs only.
Unfortunately I lost the name of the library and was unable to Google it. Anyone any ideas?
Edit: reStructuredText aka rst == docutils. That's not what I'm looking for :)
Okay. I found it now. It's called PottyMouth.
Markdown in python is a python implementation of the perl based markdown utility.
Markown converts various forms of structured text to valid html, and one of the supported forms is just plain ascii. Use is pretty straight forward.
python markdown.py input_file.txt > output_file.html
Markdown can be easily called as a module too:
import markdown
html = markdown.markdown(your_text_string)
Sphinx is a documentation generator using reStructuredText. It's quite nice, although I haven't used it personally.
The website Hazel Tree, which compiles python text uses Sphinx, and so does the new Python documentation.

Categories