Importing ipynb file from another ipynb notebook in azure databricks - python

I am trying to import ipynb notebook from another notebook in Azure Databricks using
from ipynb.fs.full.test_1 import *
While importing I am getting the following key error
KeyError: 'package'
Here is my test code
class Test1:
    def t1():
        a = 10
        b = 10
        c = a + b
        return c

Test1.t1()
Am I missing something?

Notebooks in Databricks aren't real files; they are more like entries in a database and are not stored on the file system. Because of this, you can't use Python's import to bring code from one notebook into another.
Right now it's possible to use %run to include the content of one notebook in another (see docs), for example to implement testing of notebooks. Just split your code into two pieces:
Notebook with functions that you want to test (name it functions, for example):
def func1(....):
....
And in the notebook with test code put the following as a separate cell
%run ./functions
This will include the whole content of the first notebook in the context of the second notebook.
I have a demo project that shows how to use this approach to test notebooks on Databricks.
P.S. There is a workaround that involves downloading the notebooks onto the local file system, adding them to sys.path, etc., but it's cumbersome. You can find an example in the following answer.

Related

Synapse Notebook reference - Call Synapse notebook from another with parameters

I have a Synapse notebook with parameters. I am trying to call that notebook from another notebook using the %run command.
How should I pass the parameters from the base notebook to the one that is being called?
Also, for me the above answers didn't work.
As a separate solution to this, below is an answer.
Open the notebook and go to the properties tab on extreme right adjacent to 3 dots.
Check "Enable Unpublish Notebook reference."
Commit all changes.
Now, you will be able to use both %run and mssparkutils command.
Before calling mssparkutils, first import the library:
from notebookutils import mssparkutils
mssparkutils.notebook.run('yourfolder/yournotebook')
Can you use Python and follow the example shown here?
mssparkutils.notebook.run("folder/Sample1", 90, {"input": 20 })

Access the JSON of a Jupyter Notebook while inside of it

I find that Jupyter notebooks are stored in .json format.
How can I access this (read only) while inside of the notebook itself?
I want to programmatically get the name of the notebook I am currently working in as a string.
EDIT:
Just want to clarify that I know of the solutions using ipynbname and ipyparams, and I am looking for alternate solutions. Thanks all.
For your particular use case you might want to use a package called ipyparams.
You can import it at the top of your file and get your file's name as such:
import ipyparams
notebook_name = ipyparams.notebook_name

Starting second Jupyter notebook where first left off

Context:
I started teaching myself a few new libraries using Jupyter Lab. I know showing emotion on SO is strictly forbidden and this will get edited, but WOW, Jupyter notebooks are cool!
Anyway, I'm taking notes in markdown as I work through code examples. It gave me the idea of writing my own little textbook as I learn.
For example, in notebook 1, I talk about (teach myself) linear regression. I take notes on vocabulary, show some mathy formulas, then work through some code examples. End section.
In notebook 2, I start the conversation about different metrics to show how effective the regression model was. Then I want to execute some code to calculate those metrics... but all the code for the regression model is in the last notebook and I can't access it.
Question:
Is there a way to link these two notebooks together so that I don't have to re-write the code from the first one?
My attempt:
It seems like the closest thing to what I want to do is to use
%run notebook_01.ipynb
However, this throws an error. Note that it appears to search for a .py file to run:
ERROR:root:File 'linear_regression01.ipynb.py' not found.
I have found some questions/answers where this appears to work for other users, but it is not for me.
Edit: I got the magic command %run to work, however it runs AND prints the entire first notebook into the second. It's good to know how to do this and it does achieve the goal of not having to re-code, but it re-prints absolutely everything, which I do not want.
If you run this from the command line:
jupyter nbconvert --to script first_notebook.ipynb
It will create a python file from your first notebook called 'first_notebook.py'. After that you can import from that file into your second notebook with:
import first_notebook
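To make the idea concrete, here is a rough stand-in for that convert-then-import step using only the standard library. It is a sketch, not what nbconvert actually does internally: the helper name notebook_to_module and the demo notebook are made up for illustration, and it assumes the notebook's code cells contain plain Python (no %magics, which this sketch does not handle).

```python
import importlib.util
import json
import tempfile
from pathlib import Path


def notebook_to_module(ipynb_path, module_name):
    """Write a notebook's code cells to <module_name>.py and import that file.

    Hypothetical helper: a simplified stand-in for `nbconvert --to script`.
    """
    nb = json.loads(Path(ipynb_path).read_text(encoding="utf-8"))
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)

    py_path = Path(ipynb_path).with_name(module_name + ".py")
    py_path.write_text("\n\n".join(chunks) + "\n", encoding="utf-8")

    # Import the generated file as a module (this executes its code once).
    spec = importlib.util.spec_from_file_location(module_name, py_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Demo with a tiny hand-written notebook (made up for this example).
demo_nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Linear regression notes"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [],
         "source": ["def predict(x, slope, intercept):\n",
                    "    return slope * x + intercept\n"]},
    ],
}
workdir = Path(tempfile.mkdtemp())
nb_path = workdir / "first_notebook.ipynb"
nb_path.write_text(json.dumps(demo_nb), encoding="utf-8")

first_notebook = notebook_to_module(nb_path, "first_notebook")
result = first_notebook.predict(2, 3, 1)
print(result)
```

Note that importing still runs every code cell once, so this avoids re-typing the code but not re-executing it.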
Ok, I found the answer by way of suppressing outputs:
Just put this at the top of your second notebook:
from IPython.utils import io
with io.capture_output() as captured:
    %run your_linked_notebook.ipynb
This will cause the notebook you want to link to run, allowing you to use any of the data from it, but without having to see all of the outputs and visualizations from it.
This is probably not a great way to go if you are working with a lot of data or linking a lot of notebooks that perform expensive computations, but will likely do the trick for most people.
If there is an answer that does not involve running the notebook linked, I'd be excited to see it.

Save a Jupyter notebook with a specific name from within the code

I have a Jupyter notebook that is more or less a 'template' of how things are done. For example, the notebook is a template for, say, each country's economic data. All of the plots and analysis are standardized.
I'm looking for a way to have this saving done in code rather than manually naming it myself. Is there any way so that if I have a variable labeled as:
my_assignment = 'india'
I could save the notebook name as
file_name = my_assignment + todays_date
save(file_name)
I code in python.
You may have to jump to %%javascript to interact with Jupyter, which is different to the ipython kernel that the python code is sent to, e.g.:
%%javascript
Jupyter.notebook.copy_notebook()
Not sure you can copy with a specific name.
You can programmatically rename the current notebook with:
Jupyter.notebook.rename(<new_name>)
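If a copy on disk is enough, a pure-Python alternative is to build the file name and copy the saved .ipynb file. This is only a sketch under assumptions: the helpers dated_name and save_copy are made up for illustration, the path to the template notebook must be known, and only the last saved state of the notebook is copied (unsaved edits in the browser are not included).

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def dated_name(assignment, day):
    """Build a name such as 'india_2024-01-31.ipynb' (hypothetical helper)."""
    return f"{assignment}_{day.isoformat()}.ipynb"


def save_copy(template_path, assignment, day=None):
    """Copy the saved notebook file next to the original under a dated name."""
    day = day or date.today()
    target = Path(template_path).with_name(dated_name(assignment, day))
    shutil.copyfile(template_path, target)
    return target


# Demo: a placeholder file stands in for the real template notebook.
workdir = Path(tempfile.mkdtemp())
template = workdir / "template.ipynb"
template.write_text("{}", encoding="utf-8")

my_assignment = "india"
copy_path = save_copy(template, my_assignment, date(2024, 1, 31))
print(copy_path.name)
```

Passing the date explicitly keeps the function easy to test; leaving it out falls back to today's date.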

Jupyter get arbitrary notebook cell contents

I would like to access the textual contents of another cell in the notebook from Python so that I can feed it to the unit testing script, regardless of whether it has been executed or not. It seems like this should be possible, but I can't find the right API to do it in the IPython kernel. Thoughts?
This question asks the same question, but I don't really want to use magics unless I have to. Ideally the workflow would be "select widget choice" followed by "click widget" to cause the tests to run.
Background
I'm working with some students to teach them Python, and in the past when I've done this I've set up a bunch of unit tests and then provided instructions on how to run the tests via a shell script. However, I'm working with some students that don't have access to computers at home, so I decided to try and use a Jupyter notebook environment (via mybinder.org) to allow them to do the same thing. I've got most of it working already via some ipywidgets and a helper script that runs the unit tests on some arbitrary set of code.
As far as the cells that have been run, the input and output caching system described at https://ipython.readthedocs.io/en/stable/interactive/reference.html#input-caching-system might be useful. (Examples of its use at https://stackoverflow.com/a/27952661/8508004 ). It works in Jupyter notebooks as shown below. (The corresponding notebook can be viewed/accessed here.)
Because @A. Donda raised the issue of markdown in the comments below, I'll add here that nbformat provides related abilities that work with saved notebook files. Reading a saved notebook file with nbformat allows getting cells and their content, whether code or markdown, and sorting the cells by type, etc. I have posted a number of examples of using nbformat on the Jupyter Discourse forum that can be seen listed via this search here. It offers more utility than the related json.load(open('test.ipynb','r')) command, highlighted in a comment here to read a notebook file, because of the additional notebook context automatically included.
If you want to capture the contents of a specific cell to access it from another one, one workaround seems to be to write the contents of the first cell to a file when executing it, and later load the resulting file (i.e. the text of the former cell) inside the latter cell, where the content of the former cell is required.
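As a rough illustration of the plain json route mentioned above (not of nbformat itself), the sketch below groups the cells of a saved notebook by type. The function name cells_by_type and the demo notebook are made up for this example; it relies only on the saved file having a top-level "cells" list whose entries carry "cell_type" and "source", and it sees only what has been saved to disk.

```python
import json
import tempfile
from pathlib import Path


def cells_by_type(ipynb_path):
    """Return {cell_type: [source_text, ...]} for a saved notebook file."""
    nb = json.loads(Path(ipynb_path).read_text(encoding="utf-8"))
    grouped = {}
    for cell in nb.get("cells", []):
        source = cell.get("source", "")
        # Source may be stored as a single string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        grouped.setdefault(cell.get("cell_type"), []).append(source)
    return grouped


# Demo notebook written by hand (made up for this example).
demo_nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Exercise 1"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 1"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print(x)"]},
    ],
}
nb_path = Path(tempfile.mkdtemp()) / "test.ipynb"
nb_path.write_text(json.dumps(demo_nb), encoding="utf-8")

grouped = cells_by_type(nb_path)
print(grouped["code"])
```

The code cells are returned whether or not they have been executed, which matches the requirement in the question.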
Cell's contents can be saved to file (when executing the respective cell) via the command
%%writefile foo.py
which has to be placed at the beginning of a cell. This results in the cell's content (in which the upper command is executed) being saved to the file foo.py and it's just a matter of later reading it in, again.
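Reading the file back in could look roughly like the sketch below. It assumes the cell wrote plain Python to foo.py; here the file is created by hand so the example is self-contained, and the helper defined_functions is a made-up name for illustration.

```python
import ast
import tempfile
from pathlib import Path


def defined_functions(py_path):
    """List the names of top-level functions defined in a Python file."""
    tree = ast.parse(Path(py_path).read_text(encoding="utf-8"))
    return [node.name for node in tree.body
            if isinstance(node, ast.FunctionDef)]


# Stand-in for what a cell starting with %%writefile foo.py would produce.
foo_path = Path(tempfile.mkdtemp()) / "foo.py"
foo_path.write_text(
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "def sub(a, b):\n"
    "    return a - b\n",
    encoding="utf-8",
)

names = defined_functions(foo_path)

# Execute the saved cell text in a separate namespace so tests can call it.
namespace = {}
exec(compile(foo_path.read_text(encoding="utf-8"), str(foo_path), "exec"),
     namespace)
total = namespace["add"](2, 3)
print(names, total)
```

Running the text in its own namespace keeps the student's definitions from overwriting names in the testing notebook.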
The output of a cell can be made available more easily:
Just place %%capture output on the first line of a cell. After the cell has run, its output is stored in the variable output as a captured-output object: the printed text is available as the string output.stdout, and output.show() replays it.
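Outside of cell magics, a similar effect for printed text can be had in plain Python with contextlib.redirect_stdout. This is a stand-in for the idea, not the mechanism %%capture uses, and it only catches text written to standard output (not rich displays such as plots).

```python
import contextlib
import io

# Collect everything printed inside the with-block into a string buffer.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print("slope = 2.5")

output_text = buffer.getvalue()
print(repr(output_text))
```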
References:
Programmatically get current Ipython notebook cell output? and
https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb
I found a solution to this that can get the output of any cell. It requires running some Javascript.
Here is the code that you can put in a cell and run, and it will generate a Python variable called cell_outputs, which will be an array of the cell outputs in the same order as they appear on the page. The output of the cell executing this code will be an empty string.
%%js
{
    let outputs = [...document.querySelectorAll(".cell")].map(
        cell => {
            let output = cell.querySelector(".output_text")
            if (output) return output.innerText
            output = cell.querySelector(".rendered_html")
            if (output) return output.innerHTML
            return ""
        }
    )
    IPython.notebook.kernel.execute("cell_outputs=" + JSON.stringify(outputs))
}
If you need the output of a specific cell, just use its index, e.g. cell_outputs[2] in your Python code to access the output of cell #3.
I tested this on my Jupyter notebook 6.0.3 (recent install via the Anaconda community edition) on Google Chrome. The above code should work fine on any modern browser (Chrome, Firefox or Edge).
