Docker + Jupyter Lab + Create Persistent .ipynb files - python

I've been attempting for some time now to use Docker to build containers to use with Jupyter Lab notebooks. I've Googled, read documentation and checked out numerous SO posts but I can't get the results I'm looking for.
I want these containers to pin my Python modules/packages at specific versions so that my .ipynb files are guaranteed to work X years from now, on whatever machine.
Additionally, I want to be able to use my Docker container to develop new .ipynb files with Python code. I run the following command:
docker run -it -p 10000:8888 \
-e GRANT_SUDO=yes \
-e JUPYTER_TOKEN=letmein \
--user root \
-v "/Users/aloha2018/My Drive/coding projects/website_scraper/jupyter":/tmp \
data-science-image
This creates a new folder, jupyter, in the same folder as my Dockerfile. Then, I create a new Jupyter notebook called, say, "asdf.ipynb". I write code in it. I save it. No permission errors.
But when I terminate the Docker container, asdf.ipynb is nowhere to be found. How do I make it so that a Jupyter notebook created in my Docker container is persisted and saved to my current, local machine?
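One likely explanation is that the notebooks are being saved in the container's working directory rather than under the mounted /tmp. A minimal sketch of a bind mount that targets the notebook directory instead, assuming the image follows the Jupyter Docker Stacks layout where the server works out of /home/jovyan/work (the host path is illustrative):
docker run -it -p 10000:8888 \
-e JUPYTER_TOKEN=letmein \
-v "$PWD/jupyter":/home/jovyan/work \
data-science-image
Notebooks saved under /home/jovyan/work inside the container would then appear in ./jupyter on the host and survive the container being removed.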

Related

How to install additional dependencies in Tensorman

I am on Pop!_OS 20.04 LTS and I want to use Tensorman for TensorFlow/Python. I'm new to Docker and I want to install additional dependencies. For example, using the default image I can run a Jupyter notebook with these commands:
tensorman run -p 8888:8888 --gpu --python3 --jupyter bash
jupyter notebook --ip=0.0.0.0 --no-browser
But now I need to install additional dependencies; for example, if I want to install jupyterthemes, how can I do that? I have tried to install it directly inside the Docker container, but it's not working that way.
This issue looks similar to my problem, but there was no explanation of exactly how to make a custom image in Tensorman.
There are two ways to install dependencies:
Create a custom image, install the dependencies, and save it.
Use the --root flag to gain root access to the container, install the dependencies, and use them.
Build your own custom image
If you are working on a project and need specific dependencies for it, or just want to save all your favourite dependencies, you can create a custom image for that project, save it, and use it later.
Make a list of all the packages you need. Once you are ready, use this command:
tensorman run -p 8888:8888 --root --python3 --gpu --jupyter --name CONTAINER_NAME bash
Here CONTAINER_NAME is the name of the container (you can give it any name you want) and -p sets the port mapping (you can read up on port forwarding in Docker).
Now you are running the container as root. In the container shell, run:
# it's always a good idea to update apt; you can also install packages from apt
apt update
# install jupyterthemes
pip install jupyterthemes
# check if all your desired packages are installed
pip list
Now it's time to save your image. Open a new terminal and use this command:
tensorman save CONTAINER_NAME IMAGE_NAME
CONTAINER_NAME should be the one you used earlier, and IMAGE_NAME can be whatever you prefer.
Now you can close the terminals. Use tensorman list to check whether your custom image is there. To use your custom image, run:
tensorman =IMAGE_NAME run -p 8888:8888 --gpu bash
# to use jupyter
jupyter notebook --ip=0.0.0.0 --no-browser
Use --root and install dependencies
You might be wondering why, in a normal Jupyter notebook, you can install dependencies even from inside the notebook, but that's not the case with Tensorman. The reason is that the container is not run as root: if it were, files exported to the host machine would also carry root permissions, which is why it's good to avoid the --root flag in general. We can still use it to install dependencies, though. After installing, you have to save the image (it's not strictly necessary, you can also reinstall every time), otherwise the installed dependencies will be lost.
In the last step of the custom image build above, use these commands instead:
# notice --root
tensorman =IMAGE_NAME run -p 8888:8888 --gpu --root bash
# to use jupyter, notice --allow-root
jupyter notebook --allow-root --ip=0.0.0.0 --no-browser

Why does container stop when closing VSCode window although "shutdownAction" is set to "none"?

I use VSCode 1.63.2 to SSH into a remote machine running Ubuntu 20.04 and then work on a project inside a Docker container. Whenever I close a VSCode window while a Python script is executing in the container, it stops all terminal processes. When I reattach to the container, I see a Python terminal showing Session contents restored from <date> at <time> and the script's outputs up to the moment I disconnected from the container. However, I would like the container to just keep going when I close VSCode or shut down my local computer.
Things I tried so far: First, I cloned my GitHub repo in the remote machine and built a Docker image with the following Dockerfile
FROM python:3.8-bullseye
RUN pip install -U pip setuptools wheel &&\
useradd -m -r fabioklr
WORKDIR /home/fabioklr/masterthesis
RUN chown -R fabioklr .
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . .
ARG GIT_HASH
ENV GIT_HASH=${GIT_HASH:-dev}
USER fabioklr
RUN git config --global init.defaultBranch main &&\
git init &&\
git remote add origin <url-to-remote-repo>
Then I ran docker build . for the image, docker run -dit <image-name:tag> /bin/bash to spin up the container, and I attached VSCode to the container with the Remote-Containers: Attach to Running Container command.
Second, I tried it without a custom Dockerfile and without the command line. I opened my project folder on the remote machine, chose the Remote-Containers: Open Folder in Container command and a Python 3 base image from the command palette. VSCode did the rest automatically, but still I encountered the same problem.
Third, I tried it with the same Open Folder in Container command but using the Dockerfile from above and a custom devcontainer.json file, where I specify "shutdownAction": "none" because it says in the VSCode Docs that this setting should prevent my problem.
Indicates whether VS Code and other devcontainer.json supporting tools should stop the containers when the related tool window is closed / shut down.
Values are none, stopContainer (default for image or Dockerfile), and stopCompose (default for Docker Compose).
I managed to work around this issue with VSCode thanks to this post by using nohup, but it is not ideal for my workflow. Plus, the problem is particularly strange because I did not encounter it a few weeks ago. Am I missing something or is this an issue? Thanks!
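The nohup workaround mentioned above amounts to detaching the long-running script from the terminal so that closing the window does not kill it; a rough sketch (script and log file names are illustrative):
# start the script detached from the terminal; output goes to a log file
nohup python train_model.py > train.log 2>&1 &
# follow progress later with
tail -f train.log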
Plus, the problem is particularly strange because I did not encounter it a few weeks ago.
Hi,
it sounds a bit like a problem after an upgrade. Have you tried downgrading the ms-vscode-remote.remote-containers extension? (Right-click -> Install Another Version.)
I am using v0.245.2 and "shutdownAction": "none" keeps my container running when VS Code is closed.
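For reference, a minimal devcontainer.json with that setting might look like this (the name and Dockerfile path are illustrative):
{
    "name": "masterthesis",
    "build": { "dockerfile": "Dockerfile" },
    "shutdownAction": "none"
}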

How to watch content of a csv file using bash and Docker

I have a Keras model, written in Python, and I want to run it in a Docker container. The Python script outputs a set of CSV-files (which are the predictions). I have tested the Python script locally on my PC and everything looks fine. When I run my Docker container, after building it, I write the following in the terminal:
docker run -it username/file bash
After this, I run my prediction, which creates some CSV files. I can see that the files are there, but I don't know how to view their contents.
You can always use the vim editor to open files inside a Linux container.
There is also a csv.vim plugin you can use.
Alternatively, you can use the -v flag with the docker command to link a local OS directory to a container directory. Any file that is added, changed, or removed will then be visible in the OS directory.
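A rough sketch of that approach (host and container paths are illustrative): mount a host folder over the directory the script writes to, and the CSV files become visible on the host; inside the container you can also inspect them without an editor:
# run the container with a host directory mounted (paths illustrative)
docker run -it -v "$PWD/predictions":/app/output username/file bash
# inside the container, peek at a CSV without installing anything
head -n 5 /app/output/predictions.csv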

Use docker image without jupyter auto starting

I'm using the Docker image jupyter/scipy-notebook (which includes a lot of packages and launches a Jupyter notebook with them).
Problem: I want to use the notebook and also, with the same packages, run files from the terminal. I can't, because I can't kill the notebook without killing the container.
How can I modify the image in order to delete the auto-run of the notebook?
The entrypoint and command for the image are defined here: https://github.com/jupyter/docker-stacks/blob/6c85e4b4/base-notebook/Dockerfile#L108-L109
ENTRYPOINT ["tini", "-g", "--"]
CMD ["start-notebook.sh"]
You can edit these to run the container without the notebook server. You can either define new ones in your own Dockerfile, or you can override them on the command line, e.g.:
docker run -it --rm --entrypoint=bash jupyter/scipy-notebook -c 'echo hi'
Here -c 'echo hi' is the command and bash replaces the entrypoint. With this image specifically you'll need to override both to prevent the notebook server from starting.
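If you'd rather bake this into your own image, a minimal sketch of a Dockerfile that keeps the packages but drops you into a shell instead of the notebook server (the ENTRYPOINT simply mirrors the base image's):
FROM jupyter/scipy-notebook
# keep tini as PID 1, but replace the default CMD so the
# notebook server is not started automatically
ENTRYPOINT ["tini", "-g", "--"]
CMD ["bash"]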

How do you iteratively develop with docker?

How does one iteratively develop their app using Docker? I have only just started using it and my workflow is very slow, so I'm pretty sure I'm using it wrong.
I'm following along with a python machine learning course on Youtube, and so I am using Docker to work with python 3. I know I can use virtualenv or a VM, but I want to learn Docker as well so bear with me.
My root directory looks like so:
Dockerfile main.py*
My docker file:
FROM python
COPY . /src
RUN pip install quandl
RUN pip install pandas
CMD ["python", "/src/main.py"]
And the Python file:
#!/usr/bin/env python
import pandas as pd
import quandl
print("Hello world from main.py")
df = quandl.get("WIKI/GOOGL")
print("getting data frame for WIKI/GOOGL")
print(df.head())
My workflow has been:
Learn something new from the tutorial
Update python file
Build the docker image: docker build -t myapp .
Run the app: docker run myapp python /src/main.py
Questions:
How can I speed this all up? For every change I want to try, I end up rebuilding. This causes pip to get dependencies each time which takes way too long.
Instead of editing a python file and running it, how might I get an interactive shell from the python version running in the container?
If I wanted my program to write out a file, how could I get this file back to my local system from the container after the program has finished?
Thanks for the help!
Edit:
I should add, this was the tutorial I was following in general to run some python code in Docker: https://www.civisanalytics.com/blog/using-docker-to-run-python/
Speeding up the rebuild process
The simplest thing you can do is reorder your Dockerfile.
FROM python
RUN pip install quandl
RUN pip install pandas
COPY . /src
CMD ["python", "/src/main.py"]
The reason this helps is that Docker caches the result of each build step and reuses it as long as that step and everything before it are unchanged. Now when you rebuild after modifying your source code, it reuses the cached results of the pip commands, since those come before the changed files, and only reruns the COPY step (and anything after it).
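If the dependency list grows, a common refinement (assuming you keep a requirements.txt next to the Dockerfile) is to copy only that file before the rest of the source, so the pip layer stays cached until the dependencies themselves change:
FROM python
# copy the dependency list first so this layer is cached until it changes
COPY requirements.txt /src/requirements.txt
RUN pip install -r /src/requirements.txt
# copying the source last means code edits only invalidate this layer
COPY . /src
CMD ["python", "/src/main.py"]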
Getting a python shell
You can exec a shell in the running container and run your python command.
docker exec -it <container-id> bash
python <...>
Or, you can run a container with just a shell, and skip running your app entirely (then, run it however you want).
docker run -it <image> bash
python <...>
Writing outside the container
Mount an external directory into the container. Then write to the mounted path.
docker run -v /local/path:/path <.. rest of command ..>
Then when you write in the container to /path/file, the file will show up outside the container at /local/path/file.
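A concrete sketch tied to the example above (the host path and output file name are illustrative):
# mount ./output from the host and have main.py write its CSV there
docker run -v "$PWD/output":/data myapp python /src/main.py
# inside main.py: df.head().to_csv("/data/googl_head.csv")
# the file then shows up on the host at ./output/googl_head.csv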
