How to run a Google Colab Notebook from terminal? - python

Suppose I have a Google Colab Notebook in an address like below:
https://colab.research.google.com/drive/XYZ
I want to keep it running for 12 hours, but I also want to turn my computer off. As a workaround, I can connect to our lab's server via ssh; the server is running all the time. Is it possible to load and run the notebook there?
I found a solution to connect to a Google Colab session via ssh (the colab_ssh package), but it again needs a running Colab session.
I also tried to browse the link with lynx, but it requires logging in, and that isn't supported by this browser.

Yes, it is possible. You would first need to download your Colab notebook as an .ipynb file and copy it to your server. Then you can follow one of the guides on how to connect to a remotely running Jupyter notebook session, like this one. All you need is the Jupyter notebook software on your server and an ssh client on your local computer.
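For the first step, a minimal sketch (the file names and paths are placeholders, not from the question):
# download the notebook from Colab (File > Download > Download .ipynb),
# then copy it to the server
scp ~/Downloads/notebook.ipynb remoteuser@remotehost:~/notebooks/
# on the server, install the notebook software if it is not already there
pip install notebook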
Edit: I forgot to mention this: to keep your session alive even after closing the ssh connection, you can use tools like screen. The link provides a more detailed explanation, but the general idea is that after connecting to your server, you first need to create a session like this:
screen -S <session_name>
which will create a new session and attach you to it ("attached" being the term used when you are inside a session). Then you can fire up your Jupyter notebook there, and it will keep running even after you close the ssh connection. (You just have to make sure you don't kill the screen session with Ctrl+a followed by k; detach from it instead with Ctrl+a followed by d.)
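Put together, the server-side part could look like this (the session name and port are assumptions for illustration):
# on the server: create and attach to a named screen session
screen -S jupyter
# inside the session: start the notebook server without opening a browser
jupyter notebook --no-browser --port=8888
# detach with Ctrl+a then d; the server keeps running
# reattach later to check on it:
screen -r jupyter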
Now, you have an indefinitely running jupyter notebook session on your server. You can connect to it via
ssh -N -f -L localhost:YYYY:localhost:XXXX remoteuser@remotehost
as mentioned in the first linked guide, use the browser to run a code cell on your jupyter notebook, and then turn off your laptop without worrying about interrupting your notebook session.
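For example, if the notebook server listens on port 8888 on the server (XXXX) and you want to reach it on port 9999 locally (YYYY), the tunnel would be (ports assumed for illustration):
ssh -N -f -L localhost:9999:localhost:8888 remoteuser@remotehost
# then open http://localhost:9999 in your local browser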

Related

VS Code connecting to jupyterlab server not selecting kernel

We have recently moved to a JupyterLab server from another IDE. We are trying to get VS Code hooked up so that we can code in it instead. After much struggle, we got VS Code to connect to our remote JupyterLab server; the status bar at the bottom shows the connection.
However, as soon as we connect to the JupyterLab server, all the 'run' buttons on screen disappear.
We are getting no support from our IT and have to figure it out ourselves.
A colleague suspects that VS Code is not picking up the Python kernel from the server. How do we go about selecting it, or pointing VS Code to it?
An additional question: how do we see and browse the folders on the JupyterLab server in VS Code?
Appreciate any assistance.
I think the problem is that you didn't really connect to the remote server.
Install the Remote-SSH extension. Then you will see the button at the bottom left. Click it and you can connect to your server and view your folders.
You can read the documentation about Remote-SSH for more details.
Connect to a remote Jupyter server.
According to the documentation about Jupyter, you have to do the following steps:
Open the Kernel Picker button on the top right-hand side of the notebook (or run the Notebook: Select Notebook Kernel command from the Command Palette).
Select the Existing Jupyter Server option to connect to an existing Jupyter server.
To connect to an existing server for the first time, select Enter the URL of the running Jupyter server.
When prompted to Enter the URL of the running Jupyter server, provide the server's URI (hostname) with the authentication token included as a ?token= URL parameter. (If you start the server in the VS Code terminal with an authentication token enabled, the URL with the token typically appears in the terminal output, from where you can copy it, as in the sketch below.) Alternatively, you can specify a username and password after providing the URI.
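If you start the server yourself, the URL with the token is printed at startup; a minimal sketch (the port is an assumption):
# on the server (or in the VS Code terminal): start Jupyter without a browser
jupyter notebook --no-browser --port=8888
# the startup output includes a line like
#   http://localhost:8888/?token=<long-hex-token>
# paste that URL into the kernel picker prompt described above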

Run local code in the Jupyter notebook on remote server via kernel

I want to run local code using local data on a remote server and get the execution results back into my Jupyter notebook cells.
Not the usual scheme ("run the Jupyter notebook remotely, connect to the remote notebook via ssh tunneling") but something more sophisticated: a custom remote kernel which I can choose from the kernel list, so that local code runs on the remote server seamlessly.
Some packages (like this one -- https://pypi.org/project/remote-kernel) mention that it is possible, but they look dated and come with limited usage instructions.
Does anyone know how to implement this? If so, please be as detailed as possible, thanks!
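For reference, the mechanism packages like remote-kernel build on is Jupyter's kernel connection file: start a kernel on the remote machine, forward its ZeroMQ ports, and point a local client at a copy of the connection file. A rough sketch, where host names, file names and port numbers are all assumptions:
# on the remote server: start a bare IPython kernel
ipython kernel
# it prints something like "--existing kernel-12345.json"; the file lives in
# the remote runtime dir, e.g. ~/.local/share/jupyter/runtime/

# on the local machine: copy the connection file
scp remoteuser@remotehost:~/.local/share/jupyter/runtime/kernel-12345.json .

# forward each of the five ports listed in the file (shell, iopub, stdin,
# control, hb); shown for one port, repeat with the actual numbers
ssh -f -N -L 50001:localhost:50001 remoteuser@remotehost

# attach a local client to the remote kernel
jupyter console --existing ./kernel-12345.json
Getting such a kernel to appear in the notebook's kernel list additionally requires a kernel spec that automates these steps, which is what packages like remote-kernel provide.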

AWS instance (Jupyter notebook) stops responding

I set up a new Ubuntu instance on AWS EC2. I SSH to the instance using a private key pair. I installed python, jupyter, pyspark and all the necessary modules. I then start a Jupyter notebook using tmux.
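For context, the tmux part of that setup typically looks something like this (the session name and port are assumptions):
# create a named tmux session and start Jupyter inside it
tmux new -s jupyter
jupyter notebook --no-browser --port=8888
# detach with Ctrl+b then d; reattach later with:
tmux attach -t jupyter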
My main aim is simply to run pyspark on an AWS instance (using Jupyter). Unfortunately, I keep running into problems with the stability of the Jupyter notebook/connection to the instance. After running the Jupyter notebook for some time (sometimes 5 minutes, other times 2+ hours), it ends up "disconnecting": the kernel in Jupyter disconnects and then does not process any further calls. At that point, I cannot SSH into the instance (it just hangs -> blank screen).
I tried running the same setup on GCP but ran into the same symptoms.
Is there something basic that I am missing?
Why would I not be able to SSH into the instance?
Is it possible that the Ubuntu server is crashing?

Connect to remote python kernel from python code

I have been using Papermill to execute my Python notebook periodically. To execute a compute-intensive notebook, I need to connect to a remote kernel running in my EMR cluster.
In the case of a Jupyter notebook, I can do that by starting the Jupyter server with jupyter notebook --gateway-url=http://my-gateway-server:8888, and I am able to execute my code on the remote kernel. But how do I let my local Python code (through Papermill) use a remote kernel? What changes do I need to make in the KernelManager to connect to a remote kernel?
One related SO answer I could find is here. It suggests doing port forwarding to the remote server and initializing the KernelManager with the connection file from the server. I am not able to do this, as BlockingKernelManager is no longer in IPython.zmq, and I would also prefer an HTTP connection, like Jupyter uses.
Hacky approach - set up a shell script to do the following:
Create a Python environment on your EMR master node using the hadoop user
Install sparkmagic in your environment and configure all kernels as described in the README.md file for sparkmagic
Copy your notebook to the master node, or use it directly from the s3 location
Run it with papermill:
papermill s3://path/to/notebook/input.ipynb s3://path/to/notebook/output.ipynb -p param=1
Steps 1 and 2 are one-time requirements if your cluster's master node is the same every time.
A slightly better approach:
Set up a remote kernel in your Jupyter itself: REMOTE KERNEL
Execute with papermill as a normal notebook by selecting this remote kernel, as in the sketch below.
I am using both approaches for different use cases and they seem to work fine for now.
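With a remote kernel registered, the papermill invocation only needs the kernel name passed explicitly via its -k/--kernel flag (the kernel name below is an assumption; use whatever name the remote kernel is registered under):
papermill s3://path/to/notebook/input.ipynb s3://path/to/notebook/output.ipynb -p param=1 -k my_remote_kernel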

Stdout output not remotely accessible when connecting to a Jupyter notebook over ssh

My current workflow with Jupyter notebooks is to have a notebook server running on machine X, inside a screen session:
screen -S jupyter
jupyter notebook
I then connect with my notebook server via ssh from laptop Y:
ssh -N -f -L localhost:8888:localhost:8888 [user]@[server]
What I want is to be able to start calculations on machine X from Y, disconnect, and regularly check in on them. I found that running Jupyter from a screen session worked better when starting a calculation from laptop Y: at least the currently executing cell finishes even if the ssh connection is lost. But any print statements or matplotlib plots produced while I am disconnected do not show in the notebook once I connect again. I guess stdout may be sending things only to the notebook if it's currently open?
EDIT: the problem is not related to screen at all; it is a problem with Jupyter browser notebooks in general:
https://github.com/jupyter/notebook/issues/1647
https://github.com/ipython/ipython/issues/4140
Workarounds can be found in the discussion in the links above.
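One workaround in the same spirit is to execute the notebook headlessly, so all outputs are written into the notebook file itself instead of being streamed to an open browser tab (the file name is a placeholder):
# run all cells and save their outputs into the notebook; no browser involved
jupyter nbconvert --to notebook --execute --inplace calculation.ipynb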
