How to prevent Jupyter Notebook download PDF from printing outside of margin? - python

In Jupyter Notebook, I downloaded my notebook as a PDF file via "File - Download as - PDF via LaTeX (.pdf)". However, many of my code blocks get printed outside of the PDF page margins, i.e. longer code lines get cut off at the right border of the page. Is there any way to fix this so that I can have a readable PDF document (other than manually adding hard returns to each line, or the way suggested in this post)? Thanks!

I was having the same problem. Ultimately, I found the answer at:
http://www.markus-beuckelmann.de/blog/customizing-nbconvert-pdf.html
Basically, it involves adding a custom latex template that wraps lines. He also adjusts some of the font sizes so that the need for wrapping is less.
I did find one bug. His code is missing the final end macro line:
((*- endmacro *))
I put the file in ~/anaconda2/lib/python2.7/site-packages/nbconvert/templates/latex . The location will vary depending on your installation. I'm using anaconda 4.3.1 (Jupyter 4.3.1 as well?)
Actually, what I did was rename the existing article.tplx, change the "extends 'article.tplx'" line to reflect the new name, and write the new template as article.tplx. That way I can change the templates without having to restart the server.
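To see which code lines are actually wide enough to run past the margin, a small check over the saved notebook can help. This is only a minimal sketch using the standard library: it assumes the notebook is saved in the usual nbformat 4 JSON layout, and the function name and the 80-character default are illustrative choices, not something taken from the blog post or from nbconvert.

```python
import json


def find_long_code_lines(notebook_path, max_width=80):
    """Return (cell_index, line_number, length) for code lines wider than max_width."""
    with open(notebook_path, encoding="utf-8") as fh:
        notebook = json.load(fh)

    too_long = []
    for cell_index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The JSON stores a cell's source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for line_number, line in enumerate(source.splitlines(), start=1):
            if len(line) > max_width:
                too_long.append((cell_index, line_number, len(line)))
    return too_long
```

Calling find_long_code_lines("my_notebook.ipynb") (a hypothetical file name) lists the offending lines, which makes it easier to judge whether the wrapping template alone is enough or whether a few lines are worth breaking by hand.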

Related

In python, how to save print output and plots together to a single file of any file format in an automated fashion?

How can I save print output as well as plots in Python to a single file, in whichever output format (.txt, .html, .pdf, etc.), in an automated fashion? Since I will be doing this for thousands of outputs and plots, is there a Python command I can use?
I know we can save them separately using python commands, but is there a python command to save them together in the same order that they are outputted, for example how they appear in a Jupyter notebook together as shown below. The format of the file in which they are saved does not matter as long as there is a way to save both together (ideally file format should not be very memory intensive, but that is secondary).
This is so that I can open the file later in a folder and the output is saved for me to always access later. If there is a lot of output Jupyter notebook unfortunately crashes, corrupting the file and making the code irrecoverable.
Jupyter Notebook has a "File - Download as" option.
You can save as HTML and insert an HTML code fragment for it.
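Outside the notebook menu, the same idea (one HTML file holding text output and plots in order) can be sketched with the standard library alone. This is an illustration under assumptions, not an established recipe: the helper names capture_stdout and build_report are made up here, and plots are assumed to be available as PNG bytes.

```python
import base64
import contextlib
import html
import io


def capture_stdout(func, *args, **kwargs):
    """Run func and return everything it printed as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def build_report(items):
    """Build one HTML page from ("text", str) and ("png", bytes) items, in order."""
    parts = ["<html><body>"]
    for kind, payload in items:
        if kind == "text":
            # Escape the text so characters like < and & show up literally.
            parts.append("<pre>" + html.escape(payload) + "</pre>")
        elif kind == "png":
            # Embed the image inline so the report stays a single file.
            encoded = base64.b64encode(payload).decode("ascii")
            parts.append('<img src="data:image/png;base64,' + encoded + '">')
        else:
            raise ValueError("unknown item kind: " + repr(kind))
    parts.append("</body></html>")
    return "\n".join(parts)
```

If the plots come from matplotlib, the PNG bytes can be produced by saving a figure into an io.BytesIO buffer with fig.savefig(buffer, format="png") and passing buffer.getvalue() as a "png" item. The returned string is then written to disk with an ordinary open(path, "w") call, one file per run.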
So I found out that this is in fact a much bigger question and problem than I had originally thought, and the answer is bigger and less conventional than I had expected.
The open source platform MLflow (https://mlflow.org/) for the machine learning lifecycle does this, and it does a better job than just keeping the plots and text output. It does this by storing the runs as well as saving the plots and outputs as artifacts. Further, a lot of my outputs were performance metrics and hyperparameters, and MLflow provides a simple way to store these for the different runs.
This is in fact what I was trying to do and kind of solves the underlying problem I was having of storing the output and the plots in one central location where they could be later accessed.
Thanks for all the help everyone. I appreciate it.
In jupyter notebook you can save it in multiple ways like pdf, ipynb etc. Saving your edits is simple. There is a disk icon in the upper left of the Jupyter tool bar. Click the save icon and your notebook edits are saved. It's important to realize that you will only be saving edits you've made to the text sections and to the coding windows.
On the course website in Chromium, right-click on the .ipynb file you want to download, and select Save link as...
In the Save File dialog that appears, make sure to save the .ipynb file.
If you want to save it as a PDF, try opening the Jupyter notebook in Chrome, then right-click, choose Print, and save as PDF.

Custom Jupyter cell format?

If I have a Python function that can take text, parse it, and generate formatted HTML (or re-formatted text) as output, is there any way of adding that as a custom cell format to Jupyter?
I would like to create a custom markup format for register definitions and have it displayed as pretty HTML/SVG but have the source remain text.
Thanks
EXTRA: I read a bit more, and although I see input cells that can go on to generate HTML output, there seems to be nothing that allows the output to hide the input, in the same way that Markdown HTML replaces its source when not editing.
Here is a combination of answers that should get you what you're looking for. Using this answer as a guide, you can have IPython output HTML using display:
from IPython.display import display, HTML
html_custom = '<h1>%s</h1>' % 'Whatever you want'
display(HTML(html_custom))
That allows you to use python to read in whatever text you need to and format it as you need.
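As a rough sketch of how that could look for register definitions: an object that defines a _repr_html_ method is rendered as HTML by IPython when it is displayed. The input format used here (one field per line, written as NAME BITS DESCRIPTION) and the class name RegisterTable are purely hypothetical, since the question does not say what the markup looks like.

```python
import html


class RegisterTable:
    """Parse lines of 'NAME BITS DESCRIPTION' and render them as an HTML table."""

    def __init__(self, text):
        self.fields = []
        for line in text.strip().splitlines():
            if not line.strip():
                continue
            # Split into at most three parts; the description may contain spaces.
            name, bits, description = line.split(None, 2)
            self.fields.append((name, bits, description))

    def _repr_html_(self):
        # IPython calls this method, when present, to render the object as HTML.
        rows = "".join(
            "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                html.escape(name), html.escape(bits), html.escape(description)
            )
            for name, bits, description in self.fields
        )
        return (
            "<table><tr><th>Field</th><th>Bits</th><th>Description</th></tr>"
            + rows
            + "</table>"
        )
```

With this, leaving RegisterTable(source_text) as the last expression of a cell, or passing it to display, should show the table, while the source text stays plain text in the cell.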
The next step is to hide the input. The nbextensions notebook extensions give you a lot of functionality within the notebook and were suggested here. One of the available extensions is Hide input, which, as the name suggests, hides the input of a cell. The collapsed state is even maintained within the notebook metadata, so it displays collapsed as you'd expect when reopening the notebook.
Then within the notebook:

Jupyter Python Markdown: Evaluated Inline code in LaTeX output

Python Markdown is a very nice extension for the jupyter notebook, which is in turn great for literate programming, i.e. mixing text and code.
Python Markdown makes it possible to include short inline code in Markdown cells in Jupyter like in the following example:
Python cell: a = 3.1415
Markdown cell: The value of a was {{a}}.
Everything works fine in the browser interface, but when I export it to LaTeX (or PDF via LaTeX), the output will still contain {{a}} as an unevaluated expression.
It would of course be really helpful to have the evaluated expression in the output for generating reports.
The solution was actually rather trivial:
When you enable an option on the command line, this creates an entry in the web interface whether or not that option actually exists.
I had misspelled Python Markdown as python-markdown and ended up with this second entry in the web interface.
Enabling the first entry fixed the problem.
The second entry could be safely removed.
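For readers who want to see what the evaluated text should look like, or to post-process an export by hand, the substitution can be approximated in a few lines. This is not the extension's own implementation, only a simplified sketch: it replaces plain names looked up in a dictionary and does not evaluate arbitrary expressions. The function name render_inline is made up for this example.

```python
import re

# Matches {{ ... }} and captures the text between the braces.
INLINE_EXPR = re.compile(r"\{\{(.+?)\}\}")


def render_inline(markdown_text, namespace):
    """Replace each {{name}} with str(namespace[name]); leave unknown names untouched."""

    def substitute(match):
        name = match.group(1).strip()
        if name in namespace:
            return str(namespace[name])
        # Keep the original placeholder so missing values stay visible.
        return match.group(0)

    return INLINE_EXPR.sub(substitute, markdown_text)
```

For example, render_inline("The value of a was {{a}}.", {"a": 3.1415}) gives the sentence with the number filled in. The same pattern, used with INLINE_EXPR.findall, can also flag placeholders that were left unevaluated in an exported file.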

Exporting Jupyter Notebook as either PDF or HTML makes all HTML plaintext

I am just getting started with Jupyter Notebook and I'm running into an issue when exporting.
In my current notebook, I alternate between code cells and markdown cells (which explain my code).
In the markdown cells, sometimes I will use a little HTML to display a table or a list. I will also use the bold tag <b></b> to emphasize a particular portion of text.
My problem is, when I export this notebook to PDF (via the menu in Jupyter Notebook) all of my HTML gets saved as plaintext.
For example, instead of displaying a table, when exporting to PDF, the HTML will be displayed instead. <tr>Table<tr> <th>part1</th>, etc.
I've tried exporting to HTML instead, but even the HTML file displays the HTML as plaintext.
I tried downloading nbconvert (which is probably what I'm doing when I use the Jupyter GUI anyway) and using that via the terminal, but I still get the same result.
Has anyone run into this problem before?
I tried to export it to html and it worked normally.
Where did you define your HTML? Did you use the Markdown text fields?
Alternatives:
I don't have nbconvert, but what about exporting it to HTML and using another tool to convert it to a PDF?
Use markdown language, it provides tables. Link:
https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet
Consider upgrading your notebook
I fixed this myself.
It turns out that somewhere in the code, there was a <plaintext> tag.
Although it did not run the entire length of the cell, the fact that the plaintext tag was there at all changed how the whole cell was rendered.
Next, I had strange formatting errors (Text was of different size and strangely emphasized) when using = as plaintext in the cell. When opening the cell for editing, these = symbols were big bold and blue. This probably has something to do with the markdown language.
This was solved by placing the = on the same line as other text.
I did have to convert the page to HTML, then use a firefox addon to convert to PDF.
Converting to PDF from jupyter notebook uses LaTeX to transcribe the page, and all html is converted to plaintext.
The page appeared as normal with html tables, and normal html in the markdown cell. I just had to be careful with any extraneous tags.
If anyone else encounters this problem, check your html tags, and make sure that you are not accidentally doing something in markdown language.
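Checking the tags by eye gets tedious in a long notebook, so here is a rough sketch of an automated check using the standard library's html.parser. The class and function names are made up for this example, and the balance check is deliberately simple (it ignores nesting order), so treat its result as a hint rather than as validation.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "col", "area", "base", "source", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Collect tags that are opened but never closed, or closed but never opened."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.stray_closers = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag in self.open_tags:
            # Remove the most recently opened tag with this name.
            index = len(self.open_tags) - 1 - self.open_tags[::-1].index(tag)
            del self.open_tags[index]
        else:
            self.stray_closers.append(tag)


def check_tags(markdown_source):
    """Return (unclosed_tags, stray_closing_tags) for one Markdown cell's source."""
    checker = TagBalanceChecker()
    checker.feed(markdown_source)
    checker.close()
    return checker.open_tags, checker.stray_closers
```

Run on the fragment from the question, check_tags("<tr>Table<tr> <th>part1</th>") reports two tr tags that were never closed, which is the kind of extraneous tag worth looking at first.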

Jupyter get arbitrary notebook cell contents

I would like to access the textual contents of another cell in the notebook from Python so that I can feed it to the unit testing script, regardless of whether it has been executed or not. It seems like this should be possible, but I can't find the right API to do it in the IPython kernel. Thoughts?
This question asks the same question, but I don't really want to use magics unless I have to. Ideally the workflow would be "select widget choice" followed by "click widget" to cause the tests to run.
Background
I'm working with some students to teach them Python, and in the past when I've done this I've set up a bunch of unit tests and then provided instructions on how to run the tests via a shell script. However, I'm working with some students who don't have access to computers at home, so I decided to try using a Jupyter notebook environment (via mybinder.org) to allow them to do the same thing. I've got most of it working already via some ipywidgets and a helper script that runs the unit tests on some arbitrary set of code.
As far as the cells that have been run, the input and output caching system described at https://ipython.readthedocs.io/en/stable/interactive/reference.html#input-caching-system might be useful. (Examples of its use at https://stackoverflow.com/a/27952661/8508004 ). It works in Jupyter notebooks as shown below. (The corresponding notebook can be viewed/accessed here.)
Because @A. Donda raised the issue of markdown in the comments below, I'll add here that nbformat provides related abilities that work with saved notebook files. Reading a saved notebook file with nbformat allows getting cells and their content, no matter whether they are code or markdown, and sorting out which cells are markdown and which are code. I have posted a number of examples of using nbformat on the Jupyter Discourse forum that can be seen listed via this search here. It offers more utility than the related json.load(open('test.ipynb','r')) command, highlighted in a comment here, for reading a notebook file, because of the additional notebook context automatically included.
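For comparison, the plain json route mentioned above can be sketched as follows. This is a minimal example that assumes the notebook has been saved to disk in the usual nbformat 4 layout; the function name read_cell_sources is made up here, and unlike nbformat it does no validation or version handling.

```python
import json


def read_cell_sources(notebook_path, cell_type=None):
    """Return the source text of each cell in a saved notebook, in order."""
    with open(notebook_path, encoding="utf-8") as fh:
        notebook = json.load(fh)

    sources = []
    for cell in notebook.get("cells", []):
        # Optionally keep only "code" or only "markdown" cells.
        if cell_type is not None and cell.get("cell_type") != cell_type:
            continue
        source = cell.get("source", "")
        # The JSON stores a cell's source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        sources.append(source)
    return sources
```

Something like read_cell_sources("test.ipynb", cell_type="code")[2] then gives the text of the third code cell whether or not it has been executed, but only as of the last save, so the notebook has to be saved before the tests read it.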
If you want to capture the contents of a specific cell to access it from another one, one workaround seems to be to write the contents of the first cell to a file when executing it, and later load the resulting file (i.e. the text giving the former cell's content) inside the latter cell, where the content of the former cell is required.
A cell's contents can be saved to a file (when executing the respective cell) via the command
%%writefile foo.py
which has to be placed at the beginning of a cell. This results in the cell's content (in which the upper command is executed) being saved to the file foo.py and it's just a matter of later reading it in, again.
The output of a cell can be made available more easily:
Just place %%capture output in the first line of a cell. Then, the output of the cell (after execution) is going to be saved as a string to the variable output and can be used like any standard python string-variable.
References:
Programmatically get current Ipython notebook cell output? and
https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb
I found a solution to this that can get the output of any cell. It requires running some Javascript.
Here is the code. Put it in a cell and run it, and it will generate a Python variable called cell_outputs, which will be an array of the cell outputs in the same order as they appear on the page. The output of the cell executing this code will be an empty string.
%%js
{
    let outputs = [...document.querySelectorAll(".cell")].map(
        cell => {
            let output = cell.querySelector(".output_text")
            if (output) return output.innerText
            output = cell.querySelector(".rendered_html")
            if (output) return output.innerHTML
            return ""
        }
    )
    IPython.notebook.kernel.execute("cell_outputs=" + JSON.stringify(outputs))
}
If you need the output of a specific cell, just use its index, eg: cell_outputs[2] in your Python code to access the output of cell #3.
I tested this on my Jupyter notebook 6.0.3 (recent install via the Anaconda community edition) on Google Chrome. The above code should work fine on any modern browser (Chrome, Firefox or Edge).
