Python: how to receive SIGINT in Docker to stop a service?

I'm writing a monitor service in Python that monitors another service, and while the monitoring & scheduling part works fine, I'm having a hard time figuring out how to do a proper shutdown of the service using a SIGINT signal sent to the Docker container. Specifically, the service should catch the SIGINT from either a docker stop or a Kubernetes stop signal, but so far it doesn't. I have reduced the issue to a minimal test case which is easy to replicate in Docker:
import signal
import sys
import time

class MainApp:
    def __init__(self):
        self.shutdown = False
        signal.signal(signal.SIGINT, self.exit_gracefully)
        signal.signal(signal.SIGTERM, self.exit_gracefully)

    def exit_gracefully(self, signum, frame):
        print('Received:', signum)
        self.shutdown = True

    def start(self):
        print("Start app")

    def run(self):
        print("Running app")
        time.sleep(1)

    def stop(self):
        print("Stop app")

if __name__ == '__main__':
    app = MainApp()
    app.start()
    # This boolean flag should flip to True when a SIGINT or SIGTERM comes in...
    while not app.shutdown:
        app.run()
    else:  # However, this code never gets executed ...
        app.stop()
        sys.exit(0)
And the corresponding Dockerfile, again minimalistic:
FROM python:3.8-slim-buster
COPY test/TestGS.py .
STOPSIGNAL SIGINT
CMD [ "python", "TestGS.py" ]
I opted for Docker because the docker stop command is documented to issue the stop signal (SIGINT here, given the STOPSIGNAL line), wait a bit, and then issue a SIGKILL. This should be an ideal test case.
However, when starting the docker container with an interactive shell attached, and stopping the container from a second shell, the stop() code never gets executed. Verifying the issue, a simple:
$ docker inspect -f '{{.State.ExitCode}}' 64d39c3b
Shows exit code 137 instead of exit code 0.
Apparently, one of two things is happening. Either the stop signal isn't propagated to the Python process inside the container; this may well be the case, because the exit_gracefully function apparently isn't called, otherwise we would see the printout of the signal. I know that you have to be careful about how you start your code from within Docker to actually receive a SIGINT, but when adding the STOPSIGNAL line to the Dockerfile, a SIGINT should be delivered to the container, at least to my humble understanding of the docs.
Or, the Python code I wrote isn't catching any signal at all. Either way, I simply cannot figure out why the stop code never gets called. I've spent a fair amount of time researching the web, but at this point I feel I'm running in circles. Any idea how to solve the issue of correctly ending a Python script running inside Docker using a SIGINT signal?
Thank you
Marvin

Solution:
The app must run as PID 1 inside Docker to receive signals such as SIGINT. To achieve that, use ENTRYPOINT instead of CMD. The fixed Dockerfile:
FROM python:3.8-slim-buster
COPY test/TestGS.py .
ENTRYPOINT ["python", "TestGS.py"]
Build the image:
docker build . -t python-signals
Run the image:
docker run -it --rm --name="python-signals" python-signals
And from a second terminal, stop the container:
docker stop python-signals
Then you get the expected output:
Received: 15
Stop app
It seems a bit odd to me that Docker only delivers the stop signal (here SIGTERM, signal 15, the default once STOPSIGNAL is dropped) to PID 1, but thankfully that's relatively easy to fix. The article below was most helpful in solving this issue.
https://itnext.io/containers-terminating-with-grace-d19e0ce34290
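As a related sketch (an addition, not part of the original answer): if you ever need a shell wrapper around the Python process, exec the interpreter from it, so that Python replaces the shell, keeps PID 1, and continues to receive the stop signal. The file name entrypoint.sh is just an illustration:
#!/bin/sh
# entrypoint.sh -- hypothetical wrapper script.
# 'exec' replaces the shell with the Python process, so Python
# stays PID 1 and still receives the signal from docker stop.
exec python TestGS.py
with ENTRYPOINT ["./entrypoint.sh"] in the Dockerfile (the script must be copied in and marked executable).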

Related

Raspberry Pi Flask Server Not Accessible After Shutdown

I followed a YouTube video to create a remotely viewable camera with a Raspberry Pi; the source code for the tutorial is available here. Basically it creates a Flask server to stream a live feed of a Pi Camera, which is available via browser on other devices. The problem I am having is that I cannot get a feed after shutting down and starting the Pi. If I reboot the Pi, debug the app, or manually start the service, everything works just fine. However, if I actually shut down the Pi, unplug it, plug it back in and let it boot, the server seems to fail to start and the video feed cannot be accessed from any device, including the Pi itself, although the service status says it is running. I need this server to start whenever I plug in the Pi, the OS starts, and I connect to a predefined network.
The final portion of the tutorial states that adding sudo python3 /home/pi/pi-camera-stream-flask/main.py at the end of the /etc/profile file is supposed to start the main.py file, which starts the Flask server. This did not work, so I created a service to start the app after there's a network connection, which looks like:
[Unit]
Description=Start Camera Flask
After=systemd-networkd-wait-online.service
Wants=systemd-networkd-wait-online.service
[Service]
User=pi
WorkingDirectory=/home/pi/pi-camera-stream-flask/
ExecStart=sudo python3 /home/pi/pi-camera-stream-flask/main.py
Restart=always
[Install]
WantedBy=multi-user.target
note, I have also tried After=network.target and After=network-online.target
I also enabled NetworkManager-wait-online.service and systemd-networkd-wait-online.service
My Python app looks like:
# Modified by smartbuilds.io
# Date: 27.09.20
# Desc: This web application serves a motion JPEG stream
# main.py

# import the necessary packages
from flask import Flask, render_template, Response, request
from camera import VideoCamera
import time
import threading
import os

pi_camera = VideoCamera(flip=False)  # flip pi camera if upside down.

# App Globals (do not edit)
app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')  # you can customize index.html here

def gen(camera):
    # get camera frame
    while True:
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')

@app.route('/video_feed')
def video_feed():
    return Response(gen(pi_camera),
                    mimetype='multipart/x-mixed-replace; boundary=frame')

if __name__ == '__main__':
    app.run(host='192.168.0.14', port=5000, debug=False)  # have also tried app.run(host='0.0.0.0', debug=False)
You can try to auto-start your code every time the Pi is connected to power by setting it up in your .bashrc:
sudo nano /home/pi/.bashrc
Scroll down to the bottom and add these two lines:
echo running flask
sudo python3 /home/pi/pi-camera-stream-flask/main.py
Try removing your edit in /etc/profile first, and make sure you have a short delay at the start, maybe 5 seconds:
time.sleep(5)
This problem usually occurs when the Flask server is started before the Raspberry Pi is connected to the network. There are a few ways to solve the problem, but here's my approach.
Add a new function to check connection status.
Create a shell script to execute main.py
Make the shell script executable.
Create a cron job to execute the script after reboot.
Check Connection Status
We will create a function that checks the network connection using the subprocess module.
The function will check the network connection periodically until the Pi is properly connected, and then return True. (This assumes the Raspberry Pi has connected to the network before and that the network adapter is always enabled.)
Add the following code snippet to your code and execute it before initializing the Flask server.
from subprocess import check_output
from time import sleep

def initializing_network():
    while True:
        try:
            result = check_output(['hostname', '-I']).decode('utf-8').strip()
            # The conditional below may not be the cleanest solution.
            # Feel free to come up with a better one.
            # It checks whether at least the shortest possible IP is present,
            # i.e. "x.x.x.x" = 7 characters,
            # or checks for '192' in the result string if you are sure
            # your private network always begins with '192'.
            if (len(result) > 6) or ('192' in result):
                return True
        except Exception:
            pass
        # If it fails, wait for 10 seconds, then recheck the connection.
        sleep(10)
Since the only possible outcomes of initializing_network() are returning True or looping indefinitely, the function can be called without any additional condition in the main block; it simply acts as a blocking call. You may want to log the exception or terminate the Python script after some number of retries to prevent an infinite loop.
if __name__ == '__main__':
    initializing_network()
    app.run(host='192.168.0.14', port=5000, debug=False)
Create Shell Script
Assuming that you are in the main directory, create a shell script there, let's say runner.sh.
Type the following in the terminal:
nano runner.sh
Then add the following code snippet.
#!/bin/sh
cd /
sudo python3 /home/pi/pi-camera-stream-flask/main.py
cd /
When you're done, press Ctrl + X and select Yes to save the change.
Make Shell Script Executable
Assuming we are still in the same directory, type the following command in the terminal.
chmod 755 runner.sh
Create a Cron Job
Next, let's add a new cron job for the Flask Server.
Back to terminal and execute the following.
sudo crontab -e
Next, select nano as the text editor, but feel free to use any text editor that you like.
At the very bottom of the file, insert a new line and add this:
@reboot sh /home/pi/runner.sh
Similarly, press Ctrl + X and Yes to save the change.
Final Test
To ensure the shell script runs properly, execute it and check if everything works.
./runner.sh
If it works, then it is time to test the cron job.
Type sudo reboot in the terminal. After the reboot, wait for a while, then check the designated IP address to see whether the server has started. It may take some time, so check periodically. Ideally, it will work without any problem. Otherwise, repeat the steps and make sure you didn't miss anything.
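For completeness, a hedged alternative to the cron job: the systemd service from the question can also work once the ordering target is spelled in lower case and sudo is dropped from ExecStart in favor of the User= directive. A sketch under those assumptions, not a tested unit:
[Unit]
Description=Start Camera Flask
Wants=network-online.target
After=network-online.target

[Service]
User=pi
WorkingDirectory=/home/pi/pi-camera-stream-flask/
ExecStart=/usr/bin/python3 /home/pi/pi-camera-stream-flask/main.py
Restart=always

[Install]
WantedBy=multi-user.target
Note that network-online.target only behaves as intended when a wait-online service (e.g. NetworkManager-wait-online.service) is enabled, which the question already mentions doing.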

What is the most acceptable way to manage (i.e. properly terminate) the MongoDB daemon in Python?

I am using a local ("community") version of MongoDB, Python 3, and pymongo.
First, I had been manually starting the MongoDB daemon (mongod) in a separate shell and then running my Python script. This script uses pymongo to successfully instantiate the MongoDB client (mongo) and interact with my database. When my script terminated, I would manually terminate the daemon by sending the kill command to its PID.
While this is all fine and dandy, the goal of this script is to automate as much as possible. As such, I want to incorporate the starting and terminating of the daemon via script, but it must be asynchronous so as to not keep the main code from running. I tried using subprocess.Popen(), and then a Thread class with the daemon attribute set to true -- yet I still see the mongod process up and running when I call ps after my entire program exits.
Below is the Thread class:
import subprocess
import threading

class MongoDaemonThread(object):
    def __init__(self):
        thread = threading.Thread(target=self.run, args=())
        thread.daemon = True
        thread.start()

    def run(self):
        mongo_daemon = subprocess.Popen('/Users/<me>/.mongodb/bin/mongod')
        return mongo_daemon
And here is the function in which I interact with the database:
def db_write(report_list, args):
    # ...
    client = MongoClient()
    db = client.cbf
    # ...
    reports = db.reports
    for report in report_list:
        report_id = reports.insert_one(report).inserted_id
    # ...
What I'm looking to do is the following, all through Python (be it one script or multiple):
1. enter code
2. start mongod (asynchronously to the rest of the code / in the background, and let it listen for connections from a Mongo client) (#TODO)
3. create a Mongo client
4. interface with my database through said client
5. terminate the Mongo daemon (#TODO)
6. exit code / terminate program
Is threading the right way to do this? If so, what am I doing wrong? If threading is not the answer, what method might you suggest I use instead?
First of all, you shouldn't start the mongod process from Python; mongod should be started and stopped from the shell, because the database must always be ready for connections. But IF you really want to do it from Python, you can use mongod's own daemonizing flag (note that passing "&" as a subprocess argument does not background anything; it would just be handed to mongod as a literal argument):
from subprocess import call
# --fork makes mongod daemonize itself; it requires a log destination.
call(["mongod", "--fork", "--logpath", "/var/log/mongodb/mongod.log"])
To end the process:
from subprocess import call
call(["pkill", "mongod"])
This answer -- using p = Popen() and then p.terminate() -- seemed to be exactly what I was looking for.
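For reference, a minimal sketch of that Popen/terminate approach, assuming mongod is on the PATH and the data directory already exists (both assumptions; real code should also retry until the server accepts connections):
import subprocess
from pymongo import MongoClient

# Start mongod in the background; Popen returns immediately.
mongo_daemon = subprocess.Popen(['mongod', '--dbpath', '/tmp/db'])  # hypothetical dbpath
try:
    # serverSelectionTimeoutMS gives mongod a few seconds to come up.
    client = MongoClient(serverSelectionTimeoutMS=5000)
    client.cbf.reports.insert_one({'status': 'ok'})
    client.close()
finally:
    mongo_daemon.terminate()  # sends SIGTERM; mongod shuts down cleanly on it
    mongo_daemon.wait()       # reap the child so no zombie is left behind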

How to detect system ACPI G2/S5 Soft Off event with python on linux

I am working on an app using Google's Compute Engine and would like to use preemptible instances.
I need my code to respond to the 30s warning Google gives via an ACPI G2 Soft Off signal, which they send when they are going to take away your VM, as described here: https://cloud.google.com/compute/docs/instances/preemptible.
How do I detect this event in my Python code running on the machine and react to it accordingly? (In my case I need to put the job the VM was working on back on a queue of open jobs so that a different machine can take it.)
I am not answering the question directly, but I think that your actual intent is different:
The G2 power button event is generated both by preemption of a VM and by the gcloud compute instances stop command (or the corresponding API, which it calls);
I am assuming that you want to react specially only on instance preemption.
Avoid a common misunderstanding
GCE does not send a "30s termination warning" with the power button event. It just sends the normal, honest power button soft-off event that immediately initiates shutdown of the system.
The "warning" part that comes with it is simple: “Here is your power button event, shutdown the OS ASAP, because you have 30s before we pull the plug off the wall socket. You've been warned!”
You have two system services that you can combine in different ways to get the desired behavior.
1. Use the fact that the system is shutting down upon ACPI G2
The most kosher (and, AFAIK, the only supported) way of handling the ACPI power button event is to let the system handle it and execute what you want in the instance shutdown script. On a systemd-managed machine, the default GCP shutdown script is simply invoked by a Type=oneshot service's ExecStop= command (see systemd.service(5)). The script is run relatively late in the shutdown sequence.
If you must ensure that the shutdown script runs after (or before) some of your services are sent a signal to terminate, you can modify some of the service dependencies. Things to keep in mind:
After and Before are reversed on shutdown: if X is started after Y, then it's stopped before Y.
The After dependency ensures that the service in the sequence is told to terminate before the shutdown script is run. It does not ensure that the service has already terminated.
The shutdown script is run when the google-shutdown-scripts.service is stopped as part of system shutdown.
With all that in mind, you can run sudo systemctl edit google-shutdown-scripts.service. This will create an empty configuration override file and open your $EDITOR, where you can put your After and Before dependencies, for example:
[Unit]
# Make sure that shutdown script is run (synchronously) *before* mysvc1.service is stopped.
After=mysvc1.service
# Make sure that mysvc2.service is sent a command to stop before the shutdown script is run
Before=mysvc2.service
You may specify as many After or Before clauses as you want, 0 or more of each. Read systemd.unit(5) for more information.
2. Use GCP metadata
There is an instance metadatum, v1/instance/preempted. If the instance is preempted, its value becomes TRUE; otherwise it's FALSE.
GCP has a thorough documentation on working with instance metadata. In short, there are two ways you can use this (or any other) metadata value:
Query its value at any time, e. g. in the shutdown script. curl(1) equivalent:
curl -sfH 'Metadata-Flavor: Google' \
'http://169.254.169.254/computeMetadata/v1/instance/preempted'
Run an HTTP request that will complete (200) when the metadatum changes. The only change that can ever happen to it is from FALSE to TRUE, as preemption is irreversible.
curl -sfH 'Metadata-Flavor: Google' \
'http://169.254.169.254/computeMetadata/v1/instance/preempted?wait_for_change=true'
Caveat: the metadata server may return a 503 response if it's temporarily unavailable (this is very rare, but happens), so certain retry logic is required. This is especially true for the long-running second form (with ?wait_for_change=true), as the pending request may return at any time with the code 503. Your code should be ready to handle this and restart the query. curl does not return the HTTP error code directly, but if you are scripting it you can use the fact that the x=$(curl ...) expression returns an empty string on error; your criterion for positive detection of preemption is then [[ $x == TRUE ]].
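To make the retry logic concrete, here is a hedged Python sketch of the long-polling variant (a sketch under the assumptions above, not Google's reference code):
import time
import urllib.error
import urllib.request

METADATA_URL = ('http://metadata.google.internal/computeMetadata/'
                'v1/instance/preempted?wait_for_change=true')

def wait_for_preemption():
    """Block until the 'preempted' metadatum reads TRUE, retrying on 503."""
    while True:
        req = urllib.request.Request(METADATA_URL,
                                     headers={'Metadata-Flavor': 'Google'})
        try:
            with urllib.request.urlopen(req) as resp:
                if resp.read().decode().strip() == 'TRUE':
                    return
        except urllib.error.HTTPError as err:
            if err.code != 503:  # 503 = metadata server briefly unavailable
                raise
        time.sleep(1)  # small pause before restarting the pending request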
Summary
If you want to detect that the VM is shutting down for any reason, use Google-provided shutdown script.
If you also need to distinguish whether the VM was in fact preempted, as opposed to stopped with gcloud compute instances stop <vmname> (which also sends the power button event!), query the preempted metadata in the shutdown script.
Run a pending HTTP request for the metadata change, and react to it accordingly. This will complete successfully only when the VM is preempted (but may also complete with an error at any time).
If the daemon that you run is your own, you can also directly query the preempted metadata from the code path which handles the termination signal, if you need to distinguish between different shutdown reasons.
It is not impossible that the real decision point is whether you have an "active job" that you want to return to the "queue", or not: if your service is requested to stop while holding an active job, just return it, regardless of the reason why you are being stopped. But I cannot comment on this, not knowing your actual design.
I think the simplest way to handle GCP preemption is using SIGTERM.
The SIGTERM signal is a generic signal used to cause program
termination. Unlike SIGKILL, this signal can be blocked, handled, and
ignored. It is the normal way to politely ask a program to terminate. https://www.gnu.org/software/libc/manual/html_node/Termination-Signals.html
This does depend on shutdown scripts, which are run on a "best effort" basis. In practice, shutdown scripts are very reliable for short scripts.
In your shutdown script:
echo "Running shutdown script"
preempted = curl "http://metadata.google.internal/computeMetadata/v1/instance/preempted" -H "Metadata-Flavor: Google"
if $preempted; then
PID="$(pgrep -o "python")"
echo "Send SIGTERM to python"
kill "$PID"
sleep infinity
fi
echo "Shutting down"
In main.py:
import signal
import os

def sigterm_handler(sig, frame):
    print("Got SIGTERM")
    os.environ["IS_PREEMPTED"] = "True"  # environment values must be strings
    # Call cleanup functions

signal.signal(signal.SIGTERM, sigterm_handler)

if __name__ == "__main__":
    print("Main")

sending trigger to my python script

I have a long-running Python script that I would like to be called via a udev rule. The script should run unattended, in the background if you like. The udev RUN is not suitable for long-running commands, which is getting in my way: my script gets killed after a while. So I cannot call my script directly via udev.
I tried disowning it by calling it in udev RUN via a shell script:
#!/bin/bash
/path/to/pythonscript.py & disown
This still got killed.
Now I am thinking of turning my script into a daemon, e.g. using Python's daemon module. Then "all" I need to do is put a short command in my udev RUN statement that sends some sort of trigger to my daemon, and things should be fine. What is less clear to me is the best way to implement this communication. I was thinking of adding a JSON-RPC service to my daemon, listening only on the loopback device. Is there a simpler or better way to achieve what I want?
EDIT
Thanks to the comments below, I came up with the following solution. Main "daemon":
import signal, time
import auto_copy

SIGNAL_RECEIVED = False

def run_daemon():
    global SIGNAL_RECEIVED
    signal.signal(signal.SIGUSR1, signal_handler)
    while True:
        time.sleep(1)
        if SIGNAL_RECEIVED:
            auto_copy.auto_copy()
            SIGNAL_RECEIVED = False

def signal_handler(dump1, dump2):
    global SIGNAL_RECEIVED
    SIGNAL_RECEIVED = True

if __name__ == "__main__":
    auto_copy.setup_logging()
    run_daemon()
This gets started via systemd, the unit file being:
[Unit]
Description=Start auto_copy.py script

[Service]
ExecStart=/home/isaac/bin/auto_copy_daemon.py

[Install]
WantedBy=multi-user.target
Then there is a udev rule:
SUBSYSTEM=="block", KERNEL=="sr0", ACTION=="change", RUN+="/home/isaac/bin/send_siguser1.sh"
and finally the script that sends the signal:
#!/bin/bash
kill -SIGUSR1 $(ps -elf |grep auto_copy_daemon.py |awk '/python/ {print $4}')
If anyone is interested, the project is on github.
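As a hedged aside (not part of the solution above): the one-second polling loop can be avoided by sleeping in signal.pause(), which blocks until a signal arrives. A sketch, assuming the same auto_copy module:
import signal
import auto_copy

def signal_handler(signum, frame):
    pass  # the handler only needs to interrupt pause()

def run_daemon():
    signal.signal(signal.SIGUSR1, signal_handler)
    while True:
        signal.pause()          # block until any signal is delivered
        auto_copy.auto_copy()   # assumes SIGUSR1 was the trigger

if __name__ == "__main__":
    auto_copy.setup_logging()
    run_daemon()
Note that pause() wakes on any delivered signal, so a real daemon would check which signal actually arrived.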

Python subprocess -- close Django server and Docker container with Ctrl-C, return to terminal

I'm trying to figure out how to properly close out my script that's supposed to start up a Django server running in a docker container (boot2docker, on Mac OS X). Here's the pertinent code block:
try:
    init_code = subprocess.check_output('./initdocker.sh', shell=True)
    subprocess.call('./startdockerdjango.sh', shell=True)
except subprocess.CalledProcessError:
    try:
        subprocess.call('./startdockerdjango.sh', shell=True)
    except KeyboardInterrupt:
        return
Where startdockerdjango.sh takes care of setting the environment variables that Docker needs and starts the server up. The script overall is supposed to know whether to do first-time setup and initialization or simply start the container and server; catching the CalledProcessError means that first-time setup was already done and that the container and server can just be started up.
The startup works fine, but when a user presses Ctrl-C to stop the server, the server stops normally but the process that started the server apparently keeps going. If I press return, I get back to the normal terminal command prompt. If I run any sort of shell command, like ls, it is carried out and then I can return to the terminal.
I want to change the code so that, if a user presses Ctrl-C, the server and the container it runs in stop normally, and afterward the process stops and the whole script exits. How can this be done? I don't want to just kill or terminate the process upon KeyboardInterrupt, since then the server and container won't be able to stop normally but will be killed off abruptly.
UPDATE:
I recently tried the following according to Padraic Cunningham's comment:
from signal import SIGTERM

try:
    init_code = subprocess.check_output('./initdocker.sh', shell=True)
    subprocess.call('./startdockerdjango.sh', shell=True)
except subprocess.CalledProcessError:
    try:
        startproc = subprocess.Popen('./startdockerdjango.sh')
    except KeyboardInterrupt:
        startproc.send_signal(SIGTERM)
        startproc.wait()
        return
This was my attempt to send a SIGTERM to the server to shut it down gracefully and then use wait() to wait for the process (startproc) to complete. This, however, results in the container and server ending abruptly, something I was trying to prevent. The same thing happens if I try SIGINT instead. What, if anything, am I doing wrong in this second approach? I still want the same overall thing as before: having a single Ctrl-C end the container and server, then exit the script.
You might want to create the process using Popen. It will give you a little more control over how you manage the child process.
env = {"MY_ENV_VAR": "some value"}
proc = subprocess.Popen("./dockerdjango.sh", env=env)
try:
proc.wait()
except KeyboardInterupt:
proc.terminate() # on linux this gives the a chance to clean up,
# or even ignore the signal entirely
# use proc.send_signal(...) and the module signal to send other signals.
# or proc.kill() if you wish to be kill the process immediately.
If you set the environment variables in Python, it will also result in fewer child processes that need to be killed.
In the end, it wasn't worth the effort to have the script know whether to do first-time initialization or container+server startup. Instead, the script will just try first-time setup and then tell the user to run docker-compose up after a successful setup. This is a better solution for my specific situation than trying to figure out how to have Ctrl-C properly shut down the server and then exit the script.
To reset the Django server subprocess, execute in your terminal:
$ sudo lsof -i tcp:8080
$ sudo lsof -i tcp:8080 | awk '{print $2}' | cut -d/ -f 1 | xargs kill
