How to programmatically generate markdown output in Jupyter notebooks?

I want to write a report for classes in Jupyter notebook. I'd like to count some stuff, generate some results and include them in markdown. Can I set the output of the cell to be interpreted as markdown?
I'd like a command such as print('$\phi$') to generate the phi symbol, just like in markdown.
In other words, I'd like to have a template made in markdown and insert the values generated by the program written in the notebook. Recalculating the notebook should generate new results and new markdown with those new values inserted. Is that possible with this software, or do I need to replace the values by myself?

The functions you want are in the IPython.display module.
from IPython.display import display, Markdown, Latex
# Raw strings keep the backslash in \phi from being read as an escape sequence.
display(Markdown(r'*some markdown* $\phi$'))
# If you particularly want to display maths, this is more direct:
display(Latex(r'$\phi$'))
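The template part of the question can be handled the same way: build the markdown text from computed values, then hand it to Markdown. The following is only a sketch; TEMPLATE, render_report and the sample numbers are made up for illustration, and the fallback branch exists so the snippet also runs where IPython is not installed.

```python
# Hypothetical template; {n} and {mean} are placeholders filled by str.format.
TEMPLATE = r"""### Results
- samples counted: **{n}**
- mean value: **{mean:.2f}**
- a symbol rendered as maths: $\phi$
"""

def render_report(values):
    """Fill the markdown template with freshly computed values."""
    n = len(values)
    mean = sum(values) / n
    return TEMPLATE.format(n=n, mean=mean)

report = render_report([1.0, 2.5, 4.0])

try:
    # In a notebook this renders the string as formatted markdown.
    from IPython.display import Markdown, display
    display(Markdown(report))
except ImportError:
    # Outside Jupyter, fall back to printing the raw markdown text.
    print(report)
```

Re-running the cell recomputes the values and regenerates the markdown. Note that str.format treats braces as placeholders, so LaTeX that needs literal braces would have to double them.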

You are basically asking for two different things:
1. Markdown cells outputting code results.
I'd like to count some stuff, generate some results and include them in markdown. [...] I'd like to have a template in markdown and insert values generated by the program in the notebook
2. Code cells outputting markdown.
I'd like a command such as print('$\phi$') to generate the phi symbol, just like in markdown.
Since 2. is already covered by another answer (basically: use Latex() or Markdown() imported from IPython.display), I will focus on the first one:
1. Markdown Template with inserted variables
With the Jupyter extension Python Markdown it actually is possible to do exactly what you describe.
Installation instructions can be found on the GitHub page of nbextensions. Make sure you enable the Python Markdown extension using a jupyter command or the extension configurator.
With the extension, variables are accessed via {{var-name}}. An example for such a markdown template could look like this:
Python Code in Markdown Cells
The variable a is {{a}}
You can also embed LaTeX: {{b}} in here!
Even images can be embedded: {{i}}
Naturally all variables or images a, b, i should be set in previous code. And of course you may also make use of Markdown-Latex-style expressions (like $\phi$) without the print command. This image is from the wiki of the extension, demonstrating the capability.
Further info on this functionality being integrated into ipython/jupyter is discussed in the issue trackers for ipython and jupyter.

As an addition to Thomas's answer: another, easier way to render markdown markup is to use the display_markdown function from the IPython.display module:
from IPython.display import display_markdown
display_markdown('''## heading
- ordered
- list
The table below:
| id |value|
|:---|----:|
| a | 1 |
| b | 2 |
''', raw=True)
A usage example can be found in a Google Colab notebook.
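The markdown passed to display_markdown does not have to be typed by hand. As a rough sketch, a table like the one above could be assembled from data. markdown_table is a hypothetical helper, not part of IPython, and the fallback branch is only there so the snippet also runs outside a notebook.

```python
def markdown_table(headers, rows):
    """Build a pipe-delimited markdown table from headers and row tuples."""
    head = "| " + " | ".join(headers) + " |"
    rule = "| " + " | ".join("---" for _ in headers) + " |"
    body = ["| " + " | ".join(str(cell) for cell in row) + " |" for row in rows]
    return "\n".join([head, rule, *body])

table = markdown_table(["id", "value"], [("a", 1), ("b", 2)])

try:
    # raw=True tells IPython that the argument is already markdown text.
    from IPython.display import display_markdown
    display_markdown(table, raw=True)
except ImportError:
    # Outside Jupyter, fall back to printing the raw markdown text.
    print(table)
```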

Another option is to use Rich for Markdown rendering and UnicodeIt for symbols. It has some limitations, as Rich uses CommonMark, which does not support tables, for example. Rich has other ways to render tables though; this is detailed in the documentation.
Here is an example:
from rich.markdown import Markdown
import unicodeit
alpha = unicodeit.replace('\\alpha')
epsilon = unicodeit.replace('\\epsilon')
phi = unicodeit.replace('\\phi')
MARKDOWN = f"""
# This is an h1
Rich can do a pretty *decent* job of rendering markdown.
1. This is a list item
2. This is another list item
## This is an h2
List of **symbols**:
- alpha: {alpha}
- epsilon: {epsilon}
- phi: {phi}
This is a `code` snippet:
```py
# Hello world
print('Hello world')
```
This is a blockquote:
> Rich uses [CommonMark](https://commonmark.org/) to parse Markdown.
---
### This is an h3
See [Rich](https://github.com/Textualize/rich) and [UnicodeIt](https://github.com/svenkreiss/unicodeit) for more information.
"""
Markdown(MARKDOWN)
... which produces the following output:

from tabulate import tabulate
from IPython.display import Markdown
A2 = {
    'Variable': ['Bundle Diameter', 'Shell Diameter', 'Shell Side Cross Flow area', 'Volumetric Flowrate', 'Shell Side Velocity'],
    'Result': [3.4, 34, 78.23, 1.0, 2.0],
    'Unit': ['$in$', '$in$', '$ft^2$', '$ft^{3}s^{-1}$', '$fts^{-1}$']}
temp_html = tabulate(A2, headers='keys', tablefmt='html')
Markdown(temp_html.replace('<table>', '<table style="width:50%">'))
Using .replace() to set the table width does not break the LaTeX code and keeps the columns from overstretching. This way one can dynamically generate tables that contain LaTeX.

Related

How to create a tree from a markdown file?

I would like to create a tree from a markdown file using Python. From what I have researched, it seems that I can use Python's markdown module to do this.
For example by using this file: https://github.com/Python-Markdown/markdown/blob/master/markdown/treeprocessors.py
I am stuck because I am not sure how to access the modules which Python-Markdown have not quite exposed to end users.
Here is what I would like to do for an example Markdown file:
# Heading 1
## Heading 2
Line 1
- Bullet 1
  - Bullet 1.1
I would like to be able to receive an output that has a structure something like this:
tree = process_markdown(markdown_text)
Here tree[0].title would contain "# Heading 1" and tree[0].value would contain the entire markdown text under "# Heading 1", including the "# Heading 1" line itself.
Likewise, tree[0][0].title would be equal to "## Heading 2".
Ideally, a tree element could also represent "- Bullet 1", with the sub-bullet included in the value of that tree element.
I hope this makes sense. I believe this can be obtained using the functions/classes in Python-Markdown, but I cannot figure out how to call those internal functions to extract it.
Edit:
While researching further, I stumbled upon the documentation for creating extensions for Python-Markdown, which contains this reference to a tree: https://python-markdown.github.io/extensions/api/#working_with_et. I think this is what I need, but I still cannot figure out the syntax or how to use the related functions in Python-Markdown.
This question from 2014 is similar. The first answer there suggests Mistune, however, mistune does not seem to create the tree either.
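For what it's worth, the heading part of the structure described above can be sketched with the standard library alone. This does not use Python-Markdown or its ElementTree API; Node and the body of process_markdown are hypothetical, only ATX headings (lines starting with #) are recognised, bullets are kept as text rather than turned into nodes, and a # inside a code block would be misread as a heading.

```python
import re
from dataclasses import dataclass, field

HEADING = re.compile(r"^(#{1,6})\s+\S")

@dataclass
class Node:
    title: str                                    # heading line, e.g. "# Heading 1"
    level: int                                    # number of leading '#' characters
    lines: list = field(default_factory=list)     # every line in this section
    children: list = field(default_factory=list)  # nested sub-sections

    @property
    def value(self):
        """Full text of the section, including its own heading line."""
        return "\n".join(self.lines)

    def __getitem__(self, index):
        return self.children[index]

def process_markdown(text):
    """Return the list of top-level sections found in the markdown text."""
    root = Node(title="", level=0)
    stack = [root]  # currently open sections, outermost first
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            # A new heading closes open sections at the same or a deeper level.
            while stack[-1].level >= level:
                stack.pop()
            node = Node(title=line, level=level)
            stack[-1].children.append(node)
            stack.append(node)
        # Each open section (not the artificial root) contains this line.
        for open_node in stack[1:]:
            open_node.lines.append(line)
    return root.children

markdown_text = "# Heading 1\n## Heading 2\nLine 1\n- Bullet 1\n  - Bullet 1.1"
tree = process_markdown(markdown_text)
print(tree[0].title)     # the top-level heading line
print(tree[0][0].value)  # the text of the nested section
```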

Is there a way to dynamically change the font color in Jupyter markdown?

I am using Jupyter to create a report file of an analysis I'm doing. At the end of each analysis I'll provide a summary of how many errors/irregularities the analysis has found. I was wondering if there is a way to dynamically change the font color based on the results. For example, let's say we have a variable called "font_color" and an if statement that sets the variable to "Red" if there are errors and "Black" if there are none. Then, in Jupyter markdown, the color would be set as:
In code cell:
font_color = *IF statement to define color*
In markdown cell:
<font color={{font_color}}>
- Testing
I'm open to suggestions if there is a better way to dynamically change font colors.

Yes, in Jupyter notebooks code can output markdown as well as write to the stdout and stderr channels. Jupyter notebooks also let you use HTML within the markdown to color parts of the text. Combining those, you could customize something like this for your report generation:
from IPython.display import Markdown, display
a = "Good"
if a == "Good":
    font_color = "green"
else:
    font_color = "red"
def printmd(string):
    display(Markdown(string))
printmd("Summary:")
printmd(f'**<font color={font_color}>Status for a.</font>**')
Also see here and here.
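For a report, the color choice can be wrapped in a small function so that each analysis summary picks its own color. This is only a sketch: summary_markdown and its wording are hypothetical, and the colors follow the red/black scheme from the question.

```python
def summary_markdown(error_count):
    """Return a markdown/HTML snippet colored by whether errors were found."""
    font_color = "red" if error_count > 0 else "black"
    return f'**<font color="{font_color}">{error_count} error(s) found</font>**'

summary = summary_markdown(3)

try:
    # In a notebook this renders the colored, bold summary line.
    from IPython.display import Markdown, display
    display(Markdown(summary))
except ImportError:
    # Outside Jupyter, fall back to printing the raw snippet.
    print(summary)
```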

How to use comparison operators within pandas query string in a markdown cell of Jupyter Notebook

A similar question is here: Print Variable In Jupyter Notebook Markdown Cell Python. However, the asker there was having issues with the compiling.
My issue is what Python code I should use so that the variable is viewable when the html is compiled.
I have a dataset df4 which I want to reference inline in the markdown of my Jupyter notebook.
df4['amount'].sum()
works fine.
But (all on the same line in Markdown)
df4.groupby(['customer_name'])['amount_due'].sum().reset_index() \
.query('amount_due > 0')['amount_due'] \
.sum()
returns the error **SyntaxError**: only a single expression is allowed ()
I could of course define all the variables I need in a Python cell above the Markdown cell in Jupyter Notebook and then refer to them within the Markdown cell only by name. E.g. "The total amount is {{x}}".
But since there is this functionality (and also I have already compiled this report in R Markdown with inline R code) - I wanted to use it.
This is an example of what I am trying to achieve within jupyter notebook:
# Python3 cell
import pandas as pd
dt = {'customer_name': ['a','a','b','b','c'], 'amount': [-1,-1,1,1,1000]}
df4 = pd.DataFrame(data = dt)
df4
#### Markdown cell
This is a large amount : {{df4['amount'].sum()}} - let me explain further...
Output: This is a large amount : 1000 - let me explain further...
#### Markdown cell
This is a *larger* amount : {{df4.groupby(['customer_name'])['amount'].sum().reset_index().query('amount > 0')['amount'].sum()}} - let me explain further...
Expected: This is a larger amount : 1002 - let me explain further...
When running the last cell the output is **SyntaxError**: only a single expression is allowed ().
The problem is with the ">" operator. It seems that markdown does not like the use of this character. Related issue on GitHub: Python-markdown syntax error if contains < or >.
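One way around this, sketched below under the assumption that the extension only trips over the comparison inside {{ }}, is to keep the comparison in a code cell and produce the sentence from there. The sketch uses plain Python on the question's data so that it runs without pandas; in the notebook, the pandas expression from the question would take the place of the aggregation lines.

```python
from collections import defaultdict

customer_names = ['a', 'a', 'b', 'b', 'c']
amounts = [-1, -1, 1, 1, 1000]

# Sum the amounts per customer, as groupby(...).sum() does in the question.
totals = defaultdict(int)
for name, amount in zip(customer_names, amounts):
    totals[name] += amount

# The '>' comparison lives in the code cell, never in the markdown cell.
positive_total = sum(total for total in totals.values() if total > 0)

sentence = f"This is a *larger* amount : {positive_total} - let me explain further..."

try:
    # In a notebook this renders the sentence as markdown output.
    from IPython.display import Markdown, display
    display(Markdown(sentence))
except ImportError:
    # Outside Jupyter, fall back to printing the raw sentence.
    print(sentence)
```

With the Python Markdown extension enabled, a markdown cell could instead refer to the precomputed value by name, as in {{positive_total}}.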

Apply Number formatting to Pandas HTML CSS Styling

In Pandas, there is a new styler option for formatting CSS ( http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.core.style.Styler.html ).
Before, when I wanted to make my numbers into accounting/dollar terms, I would use something like below:
df = pd.DataFrame.from_dict({'10/01/2015': {'Issued': 200}}, orient='index')
html = df.to_html(formatters={'Issued': format_money})
format_money function:
def format_money(item):
    return '${:,.0f}'.format(item)
Now I want to use the Style options, and keep my $ formatting. I'm not seeing any way to do this.
Style formatting for example would be something like this:
s = df.style.bar(color='#009900')
#df = df.applymap(config.format_money) -- Doesn't work
html = s.render()
This would add bars to my HTML table (docs here: http://pandas.pydata.org/pandas-docs/stable/style.html).
So basically, how do I do something like add the bars, and keep or also add in the dollar formatting to the table? If I try to do it before, the Style bars don't work because now they can't tell that the data is numerical and it errors out. If I try to do it after, it cancels out the styling.

That hasn't been implemented yet (as of version 0.17.1), but there is a pull request for it (https://github.com/pydata/pandas/pull/11667), which should come out in 0.18. For now you have to stick to using the formatters.

Why does IPython notebook only output one DIV from this code?

In an IPython notebook I input this code in a cell:
from IPython.display import HTML
HTML("""<div>One</div>""")
HTML("""<div>Two</div>""")
How come the output cell only contains the second div?
EDIT. @Dunno has shown how I can put all the html into one HTML() and both elements are rendered, but I still don't understand what's going on. Here's a more general case:
When I enter this in an input cell:
1
2
3
The output is
3
But if I enter the following:
print(1)
print(2)
print(3)
Then I get this output:
1
2
3
What's the difference? Is IPython notebook only evaluating the last statement when I don't use print statements? Or is each subsequent evaluation overwriting the previous one?

Yeah I found some documentation on this, and HTML is actually a class, not a function.
So the correct code would be
from IPython.display import HTML
myhtml = HTML("""<div>One</div><div>Two</div>""") #make the html object
myhtml #display it
Now it makes sense why your code displays only one div.
To display multiple parts, create multiple variables containing the html and then concatenate them inside one call to HTML.
div1 = """<div>One</div>"""
div2 = """<div>Two</div>"""
myhtml = HTML(div1 + div2)
myhtml
Edit:
I opened a ticket on ipython's github profile to see if it's a bug or a feature that only the last line of statements is displayed. Turns out, it's planned behaviour:
quoting Thomas Kluyver:
This is deliberate, because if you call something in a for loop that returns a value:
for line in lines:
    f.write(line) # Returns number of bytes written
You probably don't want to see all those numbers in the output.
The rule in IPython is: if the last statement in your code is an expression, we display its value. 1;2 is a pair of statements, whereas 1,2 is one statement, so both values will display.
I hope that explains things a bit. We're happy to revisit decisions like this, but it's been this way for years, and I don't think anyone has taken issue with it.
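The rule quoted above can be illustrated with Python's ast module. This is not IPython's actual implementation, just a sketch of the same idea; last_expression_source is a hypothetical helper that returns the text of the trailing expression of a cell, or None when the cell ends in some other kind of statement.

```python
import ast

def last_expression_source(cell_source):
    """Return the source of the cell's final expression statement, or None."""
    tree = ast.parse(cell_source)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Only this trailing expression would have its value displayed.
        return ast.get_source_segment(cell_source, tree.body[-1])
    return None

print(last_expression_source("1\n2\n3"))  # three statements, only the last counts
print(last_expression_source("1; 2"))     # a pair of statements
print(last_expression_source("1, 2"))     # one tuple expression
print(last_expression_source("x = 3"))    # an assignment is not an expression
```

Under this rule the two HTML(...) calls in the question are separate expression statements, so only the second one is picked up for display.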

Just a small addition to @Dunno's answer: if you want to display multiple IPython.display.DisplayObject objects (this includes HTML objects but also images etc.) from a single cell, you can use the IPython.display.display function.
For example, you can do:
from IPython.display import HTML, Image, display
display(HTML("""<div>One</div>"""))
display(HTML("""<div>Two</div>"""))
display(Image("http://cdn.sstatic.net/stackoverflow/company/img/logos/so/so-logo.png?v=9c558ec15d8a", format="png"))
