How to restart Python Docker Container from inside - python

My Objective: I want to be able to restart a container based on the official Python Image using some command inside the container.
My system: I have my own Docker image, based on the official Python image, which looks like this:
FROM python:3.6.15-buster
WORKDIR /webserver
COPY requirements.txt /webserver
RUN /usr/local/bin/python -m pip install --upgrade pip
RUN pip3 install -r requirements.txt --no-binary :all:
COPY . /webserver
ENTRYPOINT ["./start.sh"]
As you can see, the image does not execute a single python file but it executes a script called start.sh, which looks like this:
#!/bin/bash
echo "Starting"
echo "Env: $ENTORNO"
exec python3 "$PATH_ENTORNO""Script1.py" &
exec python3 "$PATH_ENTORNO""Script2.py" &
exec python3 "$PATH_ENTORNO""Script3.py" &
All of this works perfectly, but if, for example, Script3.py fails, I want the entire container based on this image to be restarted.
My approach: I had two ideas about this problem. First, try to execute a reboot command in the python3 script, something like this:
from subprocess import call
[...]
call(["reboot"])
This does not work inside the Python Debian image, because of error:
reboot: command not found
The other approach was to mount the docker.sock inside the container, but the error this time is:
root@MachineName:/var/run# /var/run/docker.sock docker ps
bash: /var/run/docker.sock: Permission denied
I don't know whether I'm going about these two approaches the right way, or whether anyone has a better idea, but any help would be much appreciated.

Update
After thinking about it, I realised you could send some signal to the PID 1 (your entrypoint), trap it and use a handler to exit with an appropriate code so that docker will reschedule it.
Here's an MRE:
Dockerfile
FROM python:3.9
WORKDIR /app
COPY ./ /app
ENTRYPOINT ["./start.sh"]
start.sh
#!/usr/bin/env bash
python script.py &
# This traps user defined signal and kills the last command
# (`tail -f /dev/null`) before exiting with code 1.
trap 'kill ${!}; echo "Killed by backgrounded process"; exit 1' USR1
# Launches `tail` in the background and sets this program to wait
# for it to finish, so that it does not block execution
tail -f /dev/null & wait $!
script.py
import os
import signal
# Process 1 will be your entrypoint if you declared it in `exec-form`*
print("Sending signal to stop container")
os.kill(1, signal.SIGUSR1)
*exec form
Testing it
> docker build . -t test
> docker run test
Sending signal to stop container
Killed by backgrounded process
> docker inspect $(docker container ls -n 1 -q) --format='{{.State.ExitCode}}'
1
Original post
I think the safest bet would be to instruct Docker to restart your container whenever there is a failure. Then you only have to exit your program with a non-zero code (e.g. run exit 1 from your start.sh) and Docker will restart it from scratch.
Option 1: docker run --restart
Related documentation
docker run --restart on-failure <image>
Option 2: Using docker-compose
Version 3
In your docker-compose.yml you can set the restart_policy directive (under deploy) on the service you want restarted, e.g.:
version: "3"
services:
  app:
    ...
    deploy:
      restart_policy:
        condition: on-failure
    ...
Version 2
Before version 3, the same policy could be applied with the restart directive, which allows for less configuration.
version: "2"
services:
  app:
    ...
    restart: "on-failure"
    ...
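One way to get the "exit with a non-zero code when any script fails" behaviour is to let a small Python supervisor be the container's main process, instead of backgrounding everything from start.sh. The following is only a minimal sketch under that assumption; supervise and its poll_interval parameter are hypothetical names, not part of Docker or of the original setup. It starts every command, waits until the first one exits, stops the others, and returns a code that a restart policy such as on-failure can react to.

```python
import subprocess
import time


def supervise(commands, poll_interval=0.2):
    """Run all commands; return an exit code as soon as any of them stops.

    `commands` is a list of argument lists, for example
    [["python3", "Script1.py"], ["python3", "Script2.py"]].
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for proc in procs:
                code = proc.poll()  # None while the child is still running
                if code is not None:
                    # A negative code means the child was killed by a signal;
                    # report that as a generic failure (1).
                    return code if code >= 0 else 1
            time.sleep(poll_interval)
    finally:
        # Stop the remaining children so the container can exit cleanly.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
        for proc in procs:
            proc.wait()
```

An entrypoint script could end with sys.exit(supervise([...])) so that the container's exit code is the code of the first script that stopped. Note that if that script exits with 0, the supervisor also returns 0, which on-failure does not treat as a failure.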

Is there any reason why you are running 3 processes in the same container? A basic guideline of microservice architecture is to run only one process per container, so you should run 3 containers, one per script. Each script should then contain logic so that, if one of the 3 containers is unreachable, it gets killed.

Well, in the end the solution was much simpler than I expected.
I started from the base where I mount the docker socket inside the container (I know that this practice is not recommended, but in my case, I know that it does not pose security problems), using the command in docker-compose:
volumes:
  - /var/run/docker.sock:/var/run/docker.sock
Then, it was as simple as using the Docker library for python, which gives a complete SDK through that socket that allowed me to restart the container inside the python script in an ultra-simple way.
import docker
[...]
docker_client = docker.DockerClient(base_url='unix://var/run/docker.sock')
docker_client.containers.get("container_name").restart()
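If you would rather not hard-code the container name, something like the following might work. It is a sketch, not a drop-in solution: restart_self is a hypothetical helper, and it assumes the Docker SDK for Python is installed, the socket is mounted as above, and the container's hostname has not been overridden (by default Docker sets it to the short container ID).

```python
import socket


def restart_self(client=None, container_ref=None):
    """Ask the Docker daemon to restart the container this code runs in."""
    if client is None:
        # Third-party Docker SDK for Python (pip install docker); imported
        # lazily so the function can be exercised with a stand-in client.
        import docker
        client = docker.from_env()
    # Assumption: the container hostname was not overridden, so it still
    # equals the short container ID that Docker assigns by default.
    ref = container_ref or socket.gethostname()
    client.containers.get(ref).restart()
    return ref
```

Passing container_ref="container_name" reproduces the behaviour of the two lines above; leaving it out relies on the hostname assumption.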

Related

Docker restart python script on failure

I have a Docker container that executes a Python file, and I want it to restart when the script fails (usually because of memory errors).
Docker is running and the Python file works, but after a script failure the container just exits.
The --restart always policy does not seem to work. What am I doing wrong?
docker command:
sudo docker run --init --gpus all \
--ipc host --privileged --net host \
-p 8888:8888 -p49053:49053
--restart always \
-v /mnt/disks/sde:/home/sliceruser/data \
-v /mnt/disks/sdb:/home/sliceruser/dataOld \
slicerpicai:latest
end of docker file
ENTRYPOINT [ "/bin/bash", "start.sh","-l", "-c" ]
start.sh
cd /home/sliceruser/data/piCaiCode
git pull
python3.8 /home/sliceruser/data/piCaiCode/Three_chan_baseline_hyperParam.py
The restart policy works, you might just not see it. I suggest checking the number of restarts so far on the container:
docker inspect -f "{{ .RestartCount }}" my-container
Also, Docker retries indefinitely (with --restart always), but it waits longer and longer between attempts if the start keeps failing.
If you say your script has memory issues, it would be good to address those before looking at issues with Docker. If the reason the container stops lies outside Docker, that will obviously stop the container from restarting as well. So I recommend checking the container logs and thinking about what you have to do to restart the container manually after a failure.
For more details check also the official reference of docker run
If you want to reproduce what I wrote above do the following:
Open 1 terminal and run:
docker stats
Open a second terminal and run:
docker run -d --name testcontainer --restart always alpine:latest sh -c "sleep 5 && exit 2"
This will start a container that "crashes" every 5s.
In the same terminal run:
# check the status and see how it waits longer and longer to restart
docker container ls --filter name="testcontainer"
# check the number of restarts so far
docker inspect -f "{{ .RestartCount }}" testcontainer
Friendly footnote: I think you are lucky it doesn't restart, because this is such an insecure container. ;)
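The "waits longer and longer" behaviour can be made concrete with a little arithmetic. The sketch below assumes the rule described in the docker run reference as I understand it: the delay starts at 100 ms and doubles after each restart, and it is reset once a container stays up for at least 10 seconds. The function name and parameters are hypothetical, and any upper bound the daemon applies is deliberately not modelled.

```python
def restart_delays_ms(restarts, initial_ms=100):
    """Delay before each of the first `restarts` restarts, in milliseconds."""
    # Assumed rule: 100 ms before the first restart, doubling each time.
    return [initial_ms * 2 ** attempt for attempt in range(restarts)]
```

Under that assumption, restart_delays_ms(5) gives [100, 200, 400, 800, 1600]. The testcontainer example only runs for 5 seconds per attempt, so it never reaches the 10 second threshold and the delay keeps growing.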

running two separate Python-Flask api via a single docker file

I want to run two different python api files running on different ports via a single container.
My docker file looks like:
FROM python:3.7-slim-buster
RUN apt-get update && apt-get install -y libgtk2.0-dev cmake libpoppler-cpp-dev poppler-utils tesseract-ocr
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
RUN chmod a+x run.sh
CMD ["./run.sh"]
And the .sh file looks like:
#!/bin/bash
exec python3 /app1/numberToWord.py &
exec python3 /app2/dollarToGbp.py &
While the docker build succeeds without any error, docker run doesn't throw any error either and simply returns to the command line. I'm curious to know where it is failing; any insight is highly appreciated.
Try using nohup to ignore hangup signal
Ex:
#!/bin/bash
nohup python3 /app1/numberToWord.py &
nohup python3 /app2/dollarToGbp.py &
When you run a container, you can specify the specific command to run. You can run two containers, from the same image, with different commands:
docker run -p 8000:8000 --name spelling -d image /app1/numberToWord.py
docker run -p 8001:8000 --name currency -d image /app2/dollarToGbp.py
The important points here are that each container runs a single process, in the foreground.
If your main command script makes it to the end and exits, the container will exit too. The script you show only launches background processes and then completes, and when it completes the container will exit. There needs to be some foreground process to keep the container running, and the easiest way to do this is to just launch the main server you need to run as the only process in the container.

How to gracefully stop a Dockerized Python ROS2 node when run with docker-compose up?

I have a Python-based ROS2 node running inside a Docker container and I am trying to handle the graceful shutdown of the node by capturing the SIGTERM/SIGINT signals and/or by catching the KeyboardInterrupt exception.
The problem is when I run the node in a container using docker-compose. I cannot seem to catch the "moment" when the container is being stopped/killed. I've explicitly added the STOPSIGNAL in the Dockerfile and the stop_signal in the docker-compose file.
Here is a sample of the node code:
import signal
import sys
import rclpy
def stop_node(*args):
    print("Stopping node..")
    rclpy.shutdown()
    return True
def main():
    rclpy.init(args=sys.argv)
    print("Creating node..")
    node = rclpy.create_node("mynode")
    print("Running node..")
    while rclpy.ok():
        rclpy.spin_once(node)
if __name__ == '__main__':
    try:
        signal.signal(signal.SIGINT, stop_node)
        signal.signal(signal.SIGTERM, stop_node)
        main()
    except:
        stop_node()
Here is a sample Dockerfile to re-create the image:
FROM osrf/ros2:nightly
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C1CF6E31E6BADE8868B172B4F42ED6FBAB17C654
RUN apt-get update && \
apt-get install -y vim
WORKDIR /nodes
COPY mynode.py .
ADD run-node.sh /run-node.sh
RUN chmod +x /run-node.sh
STOPSIGNAL SIGTERM
Here is the sample docker-compose.yml:
version: '3'
services:
  mynode:
    container_name: mynode-container
    image: mynode
    entrypoint: /bin/bash -c "/run-node.sh"
    privileged: true
    stdin_open: false
    tty: true
    stop_signal: SIGTERM
Here is the run-node.sh script:
source /opt/ros/$ROS_DISTRO/setup.bash
python3 /nodes/mynode.py
When I manually run the node inside the container (using python3 mynode.py or by /run-node.sh) or when I do docker run -it mynode /bin/bash -c "/run-node.sh", I get the "Stopping node.." message. But when I do docker-compose up, I never see that message when I stop the container, by Ctrl+C or by docker-compose down.
$ docker-compose up
Creating network "ros-node_default" with the default driver
Creating mynode-container ... done
Attaching to mynode-container
mynode-container | Creating node..
mynode-container | Running node..
^CGracefully stopping... (press Ctrl+C again to force)
Stopping mynode-container ... done
$
I've tried:
moving the calls to signal.signal
using atexit instead of signal
using docker stop and docker kill --signal
I've also checked this Python inside docker container, gracefully stop question but there's no clear solution there, and I'm not sure if using ROS/rclpy makes my setup different (also, my host machine is Ubuntu 18.04, while that user was on Windows).
Is it possible to catch the stopping of the container in my stop_node method?
When your docker-compose.yml file says:
entrypoint: /bin/bash -c "/run-node.sh"
Since that's a bare string, Docker wraps it in a /bin/sh -c wrapper. So your container's main process is something like
/bin/sh -c '/bin/bash -c "/run-node.sh"'
In turn, the bash script stays running. It launches a Python script, and stays running as its parent until that script exits. (The two levels of sh -c wrappers may or may not stay running.)
The important part here is that this wrapper shell, not your script, is the main container process that receives signals, and (it turns out) won't receive SIGTERM unless it's explicitly coded to.
The most important restructuring to do here is to have your wrapper script exec the Python script. That causes it to replace the wrapper, so it becomes the main process and receives signals. If nothing else, changing the last line to
exec python3 /nodes/mynode.py
will likely help.
I would go a little further here and make sure as much of this code is built into your Docker image, and try to minimize the number of explicit shell wrappers. "Do some initialization, then exec something" is an extremely common Docker pattern, and you can write this script and make it your image's entrypoint:
#!/bin/sh
# Do the setup
# ("." is the same as "source", but standard)
. "/opt/ros/$ROS_DISTRO/setup.bash"
# Run the main CMD
exec "$@"
Similarly, your main script should start with a "shebang" line like
#!/usr/bin/env python3
import ...
Your Dockerfile already contains the setup needed to run the wrapper directly; you may need a similar RUN chmod line for the main script. Then you can add
ENTRYPOINT ["/run-node.sh"]
CMD ["/nodes/mynode.py"]
Since both scripts are executable and have the "shebang" lines you can run them directly. Using the JSON syntax keeps Docker from adding an additional shell wrapper. Since your entrypoint script will now run whatever the command is, it's easy to change that separately. For example, if you want an interactive shell that's done the environment variable setup to try to debug your container startup, you can override just the command part
docker run --rm -it mynode sh
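Once the Python process really is the one receiving the signal, a common way to shut down cleanly is to have the handler only set a flag and let the main loop do the cleanup. The following is a minimal sketch without any ROS dependency; stop_requested, request_stop, install_handlers and run are hypothetical names, and step stands in for something like rclpy.spin_once(node).

```python
import signal
import threading

stop_requested = threading.Event()


def request_stop(signum, frame):
    """Signal handler: only record the request, clean up in the main loop."""
    stop_requested.set()


def install_handlers():
    # Must be called from the main thread of the main process.
    signal.signal(signal.SIGINT, request_stop)
    signal.signal(signal.SIGTERM, request_stop)


def run(step, interval=0.1):
    """Call `step()` repeatedly until SIGINT or SIGTERM arrives."""
    iterations = 0
    while not stop_requested.is_set():
        step()
        iterations += 1
        # Sleep between steps, but wake up early if a stop was requested.
        stop_requested.wait(interval)
    return iterations
```

After run returns, the caller can print its "Stopping node.." message and release resources outside the signal handler, which keeps the handler itself trivial.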

Dockerized web-service does not run in background even with detached flag

I'm trying to Dockerize a web service using Tangelo and python.
My project structure is as follows:
test.py
requirements.txt
Dockerfile
test.py
import ...
def run(query):
    ...
    return response
requirements.txt
... # other packages, numpy, open-cv, etc
tangelo
Dockerfile
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y python python-pip git
EXPOSE 9220
ADD . /test
WORKDIR /test
RUN pip install -r requirements.txt
CMD "tangelo --port 9220"
I build this using
docker build -t "test" .
And run in detached mode using
docker run -p 9220:9220 -d "test"
But docker ps shows me that the container stops almost as soon as it has started. I don't know what the problem is, since I cannot inspect the logs.
I have tried a lot of things but I still can't figure this thing out.
Any ideas? If needed, I can provide more info.
EDIT:
When I build, step 8 says
Step 8/8 : ENTRYPOINT tangelo --port 9220
---> Running in 8b54841853ab
Removing intermediate container 8b54841853ab
So it means these are run in an intermediate container. Why is that and how can I prevent it?
TL;DR: Use:
CMD tangelo -np --port 9220
Instead of:
CMD "tangelo --port 9220"
Explanation:
You have two ways to debug the problem:
Inspect the logs of the container:
$ docker run -d test
28684015e519c0c8d644fccf98240d1465acabab6d16c19fd59c5f465b7f18af
$ sudo docker logs 28684015e519c
/bin/sh: 1: tangelo --port 9220: not found
Instead of running in detached mode, run in foreground with -i/--interactive (and optionally also -t/--tty):
$ docker run -ti test
/bin/sh: 1: tangelo --port 9220: not found
As you can see from above, the problem is that tangelo --port 9220 is being interpreted as a single argument. Split it by removing quotes:
CMD tangelo --port 9220 # this will use a shell
or use the "exec" form (preferred, given that you don't need any shell features):
CMD ["tangelo", "--port", "9220"] # this will execute tangelo directly
or even better use ENTRYPOINT + CMD:
ENTRYPOINT ["tangelo"]
CMD ["--port", "9220"] # this will execute tangelo directly
After this change, you'll still have a problem:
$ sudo docker run -ti test
...
[29/Apr/2018:02:43:39] TANGELO no such group 'nobody' to drop privileges to
Tangelo is complaining about the fact that there is no user and group named nobody inside the container. Again, there are two things you can do: add a RUN to create the nobody user and group, or run Tangelo with the -np/--no-drop-privileges option:
ENTRYPOINT ["tangelo"]
CMD ["--no-drop-privileges", "--port", "9220"]
It's fine if during the build you see intermediate containers: Docker creates them for each build step. The commands you specify in ENTRYPOINT or CMD are not executed during build, they're just recorded into the final image.
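To see why the quotes matter, you can mimic the shell's word splitting with Python's shlex module. This is only an approximation of what /bin/sh does with a shell-form CMD (shlex follows POSIX-like quoting rules), but it shows the difference between the quoted and unquoted forms.

```python
import shlex

# Shell form with the whole command quoted: the shell sees one word and
# looks for a program literally named "tangelo --port 9220".
quoted = shlex.split('"tangelo --port 9220"')

# Shell form without the quotes: three separate words.
unquoted = shlex.split("tangelo --port 9220")

print(quoted)    # ['tangelo --port 9220']
print(unquoted)  # ['tangelo', '--port', '9220']
```

The second list is what the exec form spells out explicitly as a JSON array.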

Docker container/image running but there is no port number

I am trying to get a django project that I have built to run on docker and create an image and container for my project so that I can push it to my dockerhub profile.
Now I have everything set up and I've created the initial image of my project. However, when I run it I am not getting any port number attached to the container. I need this to test and see if the container is actually working.
Here is what I have:
Successfully built a047506ef54b
Successfully tagged test_1:latest
(MySplit) omars-mbp:mysplit omarjandali$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
test_1 latest a047506ef54b 14 seconds ago 810MB
(MySplit) omars-mbp:mysplit omarjandali$ docker run --name testing_first -d -p 8000:80 test_1
01cc8173abfae1b11fc165be3d900ee0efd380dadd686c6b1cf4ea5363d269fb
(MySplit) omars-mbp:mysplit omarjandali$ docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
01cc8173abfa test_1 "python manage.py ru…" 13 seconds ago Exited (1) 11 seconds ago testing_first
(MySplit) omars-mbp:mysplit omarjandali$ Successfully built a047506ef54b
You can see there is no port number so I don't know how to access the container through my local machine on my web browser.
dockerfile:
FROM python:3
WORKDIR tab/
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "manage.py", "runserver", "0.0.0.0"]
This line from the question helps reveal the problem;
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
01cc8173abfa test_1 "python manage.py ru…" 13 seconds ago Exited (1) 11 seconds ago testing_first
Exited (1) (from the STATUS column) means that the main process has already exited with a status code of 1 - usually meaning an error. This would have freed up the ports, as the docker container stops running when the main process finishes for any reason.
You need to view the logs in order to diagnose why.
docker logs 01cc will show the logs of the docker container that has the ID starting with 01cc. You should find that reading these will help you on your way. Knowing this command will help you immensely in debugging weirdness in docker, whether the container is running or stopped.
An alternative 'quick' way is to drop the -d in your run command. This will make your container run inline rather than as a daemon.
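If you want to check for this situation in a script rather than by eye, the exit code can be pulled out of the STATUS text. The helper below is a hypothetical sketch that assumes the "Exited (N) ... ago" wording shown in the listing above.

```python
import re

# Matches a STATUS value such as "Exited (1) 11 seconds ago".
_EXITED = re.compile(r"^Exited \((\d+)\)")


def exit_code_from_status(status):
    """Return the exit code from a STATUS value, or None if it is not 'Exited'."""
    match = _EXITED.match(status)
    return int(match.group(1)) if match else None
```

For the container in the question, exit_code_from_status("Exited (1) 11 seconds ago") returns 1, which is the cue to read docker logs.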
Create a seed Django project to Dockerize:
django-admin.py startproject djangoapp
You need a requirements.txt file listing the Python dependencies:
cd djangoapp/
Run the following commands to create the files required for Dockerization:
cat <<EOF > requirements.txt
Django
psycopg2
EOF
Dockerfile
cat <<EOF > Dockerfile
FROM python:3.6
ENV PYTHONUNBUFFERED 1
RUN mkdir /app
WORKDIR /app
ADD requirements.txt /app/
RUN pip install -r requirements.txt
ADD . /app/
EXPOSE 8000
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
EOF
docker-compose.yml
cat <<EOF > docker-compose.yml
version: "3.2"
services:
  web:
    image: djangoapp
    command: python manage.py runserver 0.0.0.0:8000
    ports:
      - "8000:8000"
EOF
Run the application with
docker-compose up -d
When you created the container you published the ports, so it would have been reachable on port 8000 if it had started successfully. However, as Shadow pointed out, your container exited with an error. That is also why you had to add the -a flag to docker container ls: without -a it only shows running containers.
I recommend dropping the detached flag -d to see what is causing the error, and creating a new container once the one you are working on launches successfully. Alternatively, once you fix the issue, run docker stop testing_first, then docker container rm testing_first, and finally the same command you ran before: docker run --name testing_first -d -p 8000:80 test_1
I ran into similar problems with the first docker instances I attempted to run as well.
