Force Google Colab to use R kernel for existing notebook

I have several existing Jupyter notebooks that use R instead of python.
When I open these notebooks in Colab, it sometimes automatically uses the R kernel (ir), and other times it uses the Python 3 kernel (which results in all the code being broken). I can't figure out why it uses the R kernel for one notebook but not for another.
Is there a way to manually change the kernel to R? Or some code to include that ensures Colab will recognize the notebook as being an R notebook and not a Python notebook?
I know that I can start a new notebook with the R kernel using https://colab.research.google.com/#create=true&language=r. In such a notebook, going to Runtime -> Change runtime type lets you select between the R and Python 3 kernels. However, that only works for new notebooks.
If I open an existing notebook that doesn't automatically use the R kernel and go to Runtime -> Change runtime type, it only shows me options to change the hardware acceleration; it doesn't allow me to manually select the R kernel.
Any help would be greatly appreciated.

It seems that including the following metadata at the top level of the raw .ipynb file (just before its final "}") corrected the issue:
"metadata": {
"kernelspec": {
"name": "ir",
"display_name": "R",
"language": "R"
}
}
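If you have many notebooks to fix, you can patch the metadata with a small script rather than editing the raw files by hand. Below is a minimal sketch using Python's standard json module; the filename notebook.ipynb is a placeholder:

import json

# A .ipynb file is plain JSON, so load it directly
with open("notebook.ipynb", "r", encoding="utf-8") as f:
    nb = json.load(f)

# Merge in the kernelspec so Colab (and Jupyter) pick the R kernel
nb.setdefault("metadata", {})["kernelspec"] = {
    "name": "ir",
    "display_name": "R",
    "language": "R",
}

with open("notebook.ipynb", "w", encoding="utf-8") as f:
    json.dump(nb, f, indent=1)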

Related

IPython cell magic to convert cell to raw

I'm using Databricks to run my Python notebooks, and I often use %md to create cells containing section titles and annotations in my code (Markdown cells). Is there some way to create Raw NBConvert cells using a % command? Raw NBConvert is available in JupyterLab in the cell-type drop-down menu, but not in Databricks.
No, magics can't do that. Whether a cell is a "raw" cell is independent of the kernel that executes the code; the cell type is part of the notebook document itself. There might be some hacks possible, but they would likely not be added to core Jupyter and would probably break often.
If you are using Databricks, you may want to contact their support about that.
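To illustrate why a kernel-side magic can't do this: the cell type is stored in the saved notebook file, which the kernel never sees. If you have the notebook as a .ipynb file on disk (which is not how Databricks stores notebooks, so this is a sketch of the Jupyter side only), the nbformat library can rewrite a cell's type:

import nbformat

# Read the saved notebook, normalizing it to nbformat version 4
nb = nbformat.read("notebook.ipynb", as_version=4)

# Replace the first cell with a raw cell carrying the same source;
# this edits the document, not anything the kernel executes
nb.cells[0] = nbformat.v4.new_raw_cell(nb.cells[0].source)

nbformat.write(nb, "notebook.ipynb")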

Synapse Notebook reference - Call Synapse notebook from another with parameters

I have a Synapse notebook with parameters. I am trying to call that notebook from another notebook, using the %run command.
How should I pass the parameters from the base notebook to the one that is being called?
Also, for me the above answers didn't work. As a separate solution, below is an answer.
Open the notebook and go to the Properties tab on the far right, next to the three dots.
Check "Enable unpublished notebook reference".
Commit all changes.
Now you will be able to use both the %run and mssparkutils commands.
At this point you should first import the library:
from notebookutils import mssparkutils
mssparkutils.notebook.run('yourfolder/yournotebook')
Can you use Python and follow the example shown here? The second argument is a timeout in seconds, and the third is the parameter map:
mssparkutils.notebook.run("folder/Sample1", 90, {"input": 20})

What is the usage of tags over cells in Jupyter?

I found that in Jupyter Notebook there is a tag tool over the cell, which can be activated via "View - Cell Toolbar - Tags". But I cannot figure out why we need these tags. Can someone give some suggestions or usage examples?
Tagging is a fairly recent and perhaps not quite finished feature of Jupyter notebooks, added with version 5.0. From what I understand, tags are mostly meant for tools such as nbconvert (converts notebooks to other formats such as PDF), nbval (validates notebooks), and other more or less integrated tools working with Jupyter notebooks. Being able to add tags to a cell enables different behaviours in such tools depending on a cell's tags. Some examples of what can be accomplished with tags:
nbconvert - hide a cell, hide the input leaving the output visible,
collapse a cell leaving a way to reveal it
nbconvert to latex - markdown cell contains title (or subtitle, abstract...)
nbval - check/ignore output from a cell, skip executing a cell, expect a cell
to raise an error
nbgrader - solution cell, tests cell
nbparameterise - cell contains input parameters.
as envisaged by takluyver over at Jupyter's GitHub. If you want more information on the implementation and the discussion surrounding it, you can read more here.
Adding to Christian's answer, there is an important utility you can get from using tags: you can run all cells and keep running even when encountering runtime errors, by flagging a cell with the raises-exception tag. Very useful for educational purposes. Source.
As pointed out by Christian above, one great use is to provide different input parameter values to a notebook; nbparameterise is one example (see here), and Papermill is another (see here).
From a user perspective, one can probably achieve the same thing with an environment variable via os.getenv(), or with command-line arguments via sys.argv, but adding a tag to a cell seems to be the easiest.
Under the hood, a Jupyter notebook is saved as a JSON file. Let's say you tag the first cell as Parameters and declare the variables var1 = 10 and var2 = 'adam'. The JSON file would look something like the snippet below, with the tags in the metadata section. So it is simple for a tool to parse this JSON, find the tags section, and, say, replace the variables with different values.
{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "tags": [
          "Parameters"
        ]
      },
      ...
      "source": [
        "var1 = 10",
        "var2 = 'adam'"
      ]
    }
  ]
}
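As a sketch of how simple that parsing is, assuming the notebook above is saved as test.ipynb:

import json

# A notebook file is plain JSON
with open("test.ipynb", "r", encoding="utf-8") as f:
    nb = json.load(f)

# Find cells tagged "Parameters" and print their source
for cell in nb["cells"]:
    tags = cell.get("metadata", {}).get("tags", [])
    if "Parameters" in tags:
        print("".join(cell["source"]))  # var1 = 10 / var2 = 'adam'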
Another use is papermill.
Papermill is a useful tool for running notebooks from the CLI (as part of a script, from cron, or similar).
Papermill uses tags to parametrize a notebook, so you can define some global parameters in your notebook and then run that notebook multiple times with different parameter values automatically.
source: https://papermill.readthedocs.io/en/latest/usage-parameterize.html
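For example, with a cell tagged parameters (papermill looks for that exact lowercase tag) declaring var1 and var2 defaults, a run might look like this; input.ipynb and output.ipynb are placeholders:

import papermill as pm

# Execute the notebook, injecting the new values in a cell inserted
# right after the cell tagged "parameters"
pm.execute_notebook(
    "input.ipynb",
    "output.ipynb",
    parameters={"var1": 20, "var2": "eve"},
)

The same run is available from the shell as papermill input.ipynb output.ipynb -p var1 20 -p var2 eve.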

Integrating R with Python - how to use R magics as input and output of a Python process

I have a conceptual question for anyone who knows it: you can use an R magic to run R plots from Python in a Jupyter cell. For it to work, after the load statement, you need %%R as the very first line of the cell. Alternatively, Python has its own plotting, including a ggplot port, but anecdotally I am told that the one in R is better maintained and works better.
Is it possible to set things up in Python, pass them into the R cell, and then have the results come back out for Python to work on them again? Or are R magic cells completely autonomous from the other Python cells?
If you export a Jupyter notebook as Python code and it has the %%R magic in the middle of it, will it work? Or will it fail because the %%R line is now not at the start of the script? And if it fails, is there a way to work around that?
The ultimate goal of these questions: imagine writing code in Python, using pandas and other Python libraries to get and manipulate data, then passing the results into R to work its magic for plotting and statistical calculation, then, when needed, passing a result from that R magic cell back out for Python to use in some next step.
Is this possible within a Jupyter notebook? If so, what do you need to do so your code works as a .py script as well as a Jupyter notebook?
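For what it's worth, the rpy2 extension that provides the %%R magic does support exactly this round trip through its -i (input) and -o (output) arguments. A minimal sketch, assuming rpy2 is installed (each %% block below is its own notebook cell):

%load_ext rpy2.ipython

import pandas as pd
df = pd.DataFrame({"x": [1, 2, 3], "y": [2.0, 4.1, 6.2]})

%%R -i df -o coefs
# (R cell) df arrives from Python; coefs is pushed back to Python
fit <- lm(y ~ x, data = df)
coefs <- coef(fit)

# (Python cell) coefs is now available to Python again
print(coefs)

That covers the in-notebook question; the .py-script question is a separate matter, since %% magics are IPython syntax, and an exported script typically wraps them in get_ipython() calls that require IPython rather than a plain Python interpreter.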

Jupyter get arbitrary notebook cell contents

I would like to access the textual contents of another cell in the notebook from Python so that I can feed it to the unit testing script, regardless of whether it has been executed or not. It seems like this should be possible, but I can't find the right API to do it in the IPython kernel. Thoughts?
This question asks the same thing, but I don't really want to use magics unless I have to. Ideally the workflow would be "select widget choice" followed by "click widget" to cause the tests to run.
Background
I'm working with some students to teach them Python, and in the past when I've done this I've set up a bunch of unit tests and then provided instructions on how to run the tests via a shell script. However, I'm now working with some students that don't have access to computers at home, so I decided to try to use a Jupyter notebook environment (via mybinder.org) to allow them to do the same thing. I've got most of it working already via some ipywidgets and a helper script that runs the unit tests on some arbitrary set of code.
As for cells that have already been run, the input and output caching system described at https://ipython.readthedocs.io/en/stable/interactive/reference.html#input-caching-system might be useful (examples of its use at https://stackoverflow.com/a/27952661/8508004). It works in Jupyter notebooks as sketched below. (The corresponding notebook can be viewed/accessed here.)
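A minimal sketch of that caching system; In, Out, and _ih are variables IPython itself maintains, so this only runs inside an IPython/Jupyter session:

# Run in a notebook after a few cells have executed.

# In[n] holds the source of the nth executed cell, as a string
print(In[1])

# _ih is the same input history as a list (index 0 is empty)
print(_ih[-2])  # source of the second-to-last input

# Out is a dict mapping cell numbers to results, for cells that
# produced one; .get() avoids a KeyError for cells that didn't
print(Out.get(2))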
Because A. Donda raised the issue of markdown in the comments below, I'll add here that nbformat provides related abilities that work with saved notebook files. Reading a saved notebook file with nbformat allows getting cells and content, no matter whether they are code or markdown, and sorting out which cells are markdown and which are code, etc. I have posted a number of examples of using nbformat on the Jupyter Discourse forum, which can be seen listed via this search. It offers more utility than the related json.load(open('test.ipynb','r')) command, highlighted in a comment here, for reading a notebook file, because of the additional notebook context automatically included.
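A minimal sketch of that nbformat approach, assuming a saved notebook file test.ipynb in the working directory:

import nbformat

# Read the saved notebook, normalizing older files to version 4
nb = nbformat.read("test.ipynb", as_version=4)

# Every cell records its own type, so code and markdown separate cleanly
for cell in nb.cells:
    if cell.cell_type == "code":
        print("CODE:", cell.source)
    elif cell.cell_type == "markdown":
        print("MARKDOWN:", cell.source)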
If you want to capture the contents of a specific cell to access it from another one, one workaround seems to be to write the contents of the first cell to a file when executing it, and later to load the resulting file (i.e. the text giving the former cell's content) inside the latter cell, where the content of the former cell is required.
A cell's contents can be saved to a file (when executing the respective cell) via the command
%%writefile foo.py
which has to be placed at the beginning of the cell. This results in the content of the cell (in which the above command is executed) being saved to the file foo.py, and it's just a matter of reading it in again later.
The output of a cell can be made available more easily:
Just place %%capture output in the first line of a cell. Then the output of the cell (after execution) is saved to the variable output; its stdout attribute holds the captured text as a standard Python string. (A combined sketch follows the references below.)
References:
Programmatically get current IPython notebook cell output?
https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb
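The combined sketch; foo.py is just a placeholder filename, and each %% block is its own notebook cell:

%%writefile foo.py
# This whole cell (including this comment) is written to foo.py
print("hello from the first cell")

%%capture output
# The stdout of this cell is captured instead of displayed
print("captured text")

# In a later cell: read back the first cell's source and the capture
source = open("foo.py").read()
print(source)
print(output.stdout)  # "captured text\n" as a plain string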
I found a solution that can get the output of any cell. It requires running some JavaScript.
Here is the code: put it in a cell and run it, and it will generate a Python variable called cell_outputs, an array of the cell outputs in the same order as they appear on the page. The cell executing this code will itself contribute an empty string.
%%js
{
  // Collect every notebook cell on the page, in document order
  let outputs = [...document.querySelectorAll(".cell")].map(
    cell => {
      // Plain-text output (e.g. print results)
      let output = cell.querySelector(".output_text");
      if (output) return output.innerText;
      // Otherwise, rich output rendered as HTML (tables, plots, ...)
      output = cell.querySelector(".rendered_html");
      if (output) return output.innerHTML;
      return "";
    }
  );
  // Send the array back to the kernel as a Python assignment
  IPython.notebook.kernel.execute("cell_outputs=" + JSON.stringify(outputs));
}
If you need the output of a specific cell, just use its index, e.g. cell_outputs[2] in your Python code to access the output of cell #3.
I tested this on Jupyter Notebook 6.0.3 (a recent install via the Anaconda community edition) in Google Chrome. The code above should work fine in any modern browser (Chrome, Firefox, or Edge).
