This is my bash script used in CMD:
#!/bin/bash
set -eo pipefail
echo "Setting trap"
echo $$
echo $BASHPID
trap 'cleanup' TERM
trap 'cleanup' KILL
cleanup() {
echo "Cleaning up..."
kill -TERM `jobs -p`
}
# To start the essential services
service ntp start
service awslogs start
cd /app
python -m job_manager &
wait
The Dockerfile is not very interesting:
FROM ubuntu:16.04
RUN apt-get update --fix-missing && apt-get install -y \
git \
python \
python-pip \
ntp \
curl
ENV APP_HOME /app
RUN mkdir -p ${APP_HOME}
COPY src/ ${APP_HOME}/
# job-cmd.sh is kept here
COPY docker/helper-files/* /
CMD /job-cmd.sh
The idea is to trap the TERM signal inside job-cmd.sh and then pass it on to the Python task.
I have tried a number of times and it did not work. After I added these calls:
echo $$
echo $BASHPID
I realised the PID of the CMD process is actually 7 instead of 1, as I would have expected.
My questions:
1) Why is the bash process assigned PID 7?
2) How can I fix my job script/Dockerfile?
I think this is happening because you are using the shell form of the CMD instruction. From https://docs.docker.com/engine/reference/builder/#cmd:
If you want to run your command without a shell then you must express the command as a JSON array and give the full path to the executable. This array form is the preferred format of CMD.
So, replace your CMD instruction in Dockerfile with:
CMD ["/job-cmd.sh"]
Then your Bash process will be assigned PID 1. Your TERM handler will work, but you can't trap the KILL signal. From man trap:
Trapping SIGKILL or SIGSTOP is syntactically accepted by some historical implementations, but it has no effect. Portable POSIX applications cannot attempt to trap these signals.
FYI, I explained more about the PID 1 problem here: https://serverfault.com/questions/869543/bash-script-entrypoint-pid-1-kills-tail-sub-process-only-if-a-fake-trap-whi/870872#870872
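Putting it together, a minimal sketch of the revised job-cmd.sh under those constraints (the KILL trap dropped, since it can never fire; trapping INT as well is my own addition for convenience) could look like:

#!/bin/bash
set -eo pipefail

cleanup() {
  echo "Cleaning up..."
  # Forward TERM to the background job(s), then wait for them to exit
  kill -TERM $(jobs -p) 2>/dev/null || true
  wait
}

# KILL cannot be trapped, so only TERM (and INT) are registered
trap 'cleanup' TERM INT

service ntp start
service awslogs start

cd /app
python -m job_manager &
wait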
You could use the trap command in bash to do this.
#!/bin/bash

function gracefulShutdown {
  echo "Shutting down!"
  # do something..
}

trap gracefulShutdown SIGTERM TERM INT

./subprocess.sh &
tail --pid=${!} -f /dev/null &
wait "${!}"
The tail command just waits for the subprocess to complete, while the wait command waits for tail to complete. Now the main process is the one doing the waiting, so any Docker signals reach the trap we set above directly.
Example is available at: https://github.com/iamdvr/docker-trap-subprocess
Related
For one GitLab CI runner:
I have a jar file which needs to run continuously on the Git Linux box, but since this is an application which runs continuously, the Python script on the next line never gets executed. How can I run the jar application and then execute the Python script alongside it?
.gitlab-ci.yml file:
pwd && ls -l
unzip ZAP_2.8.0_Core.zip && ls -l
bash scan.sh
python3 Report.py
The scan.sh file contains the code java -jar app.jar.
Since this application runs continuously, the code on the 4th line, python3 Report.py, is not getting executed.
How do I make both of these run simultaneously without the .jar application stopping?
The immediate solution would probably be:
pwd && ls -l
echo "ls OK"
unzip ZAP_2.8.0_Core.zip && ls -l
echo "unzip + ls OK"
bash scan.sh &
scanpid=$!
echo "started scanpid with pid $scanpid"]
ps axuf | grep $scanpid || true
echo "ps + grep OK"
( python3 Report.py ; echo $? > report_status.txt ) || true
echo "report script OK"
kill $scanpid
echo "kill OK"
echo "REPORT STATUS = $(cat report_status.txt)"
test $(cat report_status.txt) -eq 0
- Start the java process in the background.
- Run your Python code, remember its return status, and always return true.
- Kill the background process after running Python.
- Check the status code of the Python script.
Perhaps this is not necessary, as I never checked how gitlabci deals with background processes spawned by its runners.
I took a conservative approach here:
- I remember the process id of the bash script, so that I can kill it later.
- I ensure that the line running the Python script always returns a 0 exit code, so that gitlabci does not stop executing the next lines, but I remember the status code.
- Then I kill the bash script.
- Then I check whether the exit code of the Python script was 0 or not, so that gitlabci can properly determine whether the runner executed successfully.
Another minor comment (not related to your question)
I don't really understand why you write
unzip ZAP_2.8.0_Core.zip && ls -l
instead of
unzip ZAP_2.8.0_Core.zip ; ls -l
If you expect the unzip command to fail, you could just write
unzip ZAP_2.8.0_Core.zip
ls -l
and gitlabci would abort automatically before executing ls -l.
I also added many echo statements for easier debugging and error analysis; you might remove them in your final solution.
To run the two scripts at the same time, you can add & to the end of the line that is blocking. That will make it run in the background.
Either do
bash scan.sh &, or add & to the end of the line calling the jar file within scan.sh.
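For example, a sketch of the second variant (app.jar taken from the question):

# inside scan.sh: run the jar in the background so the next CI line executes
java -jar app.jar &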
I am trying to execute a Python program as a background process inside a container with kubectl as below (kubectl issued on local machine):
kubectl exec -it <container_id> -- bash -c "cd some-dir && (python xxx.py --arg1 abc &)"
When I log in to the container and check ps -ef, I do not see this process running. Also, there is no output from the kubectl command itself.
1) Is the kubectl command issued correctly?
2) Is there a better way to achieve the same?
3) How can I see the output/logs printed by the background process?
4) If I need to stop this background process after some duration, what is the best way to do this?
The nohup Wikipedia page can help; you need to redirect all three IO streams (stdout, stdin and stderr) - an example with yes:
kubectl exec pod -- bash -c "yes > /dev/null 2> /dev/null &"
nohup is not required in the above case because I did not allocate a pseudo terminal (no -t flag) and the shell was not interactive (no -i flag) so no HUP signal is sent to the yes process on session termination. See this answer for more details.
Redirecting stdin from /dev/null is not required in the above case since stdin already refers to /dev/null (you can see this by running ls -l /proc/YES_PID/fd in another shell).
To see the output you can instead redirect stdout to a file.
To stop the process you'd need to identify the PID of the process you want to stop (pgrep could be useful for this purpose) and send a fatal signal to it (kill PID for example).
If you want to stop the process after a fixed duration, timeout might be a better option.
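For instance, a rough sketch combining output capture and a time limit (the log path and the 300-second duration are arbitrary assumptions):

kubectl exec <container_id> -- bash -c "cd some-dir && timeout 300 python xxx.py --arg1 abc > /tmp/xxx.log 2>&1 &"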
Actually, the best way to do this kind of thing is to add an entrypoint to your container and execute the commands there.
Like:
entrypoint.sh:
#!/bin/bash
set -e
cd some-dir && (python xxx.py --arg1 abc &)
./somethingelse.sh
exec "$#"
This way you don't need to manually go inside every single container and run the command.
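For completeness, a hedged sketch of how such an entrypoint might be wired up in the Dockerfile (the file locations are assumptions):

# copy the entrypoint into the image and make it the container's entrypoint
COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]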
Through Fabric, I am trying to start a celerycam process using the below nohup command. Unfortunately, nothing is happening. Manually using the same command, I could start the process, but not through Fabric. Any advice on how I can solve this?
def start_celerycam():
    '''Start celerycam daemon'''
    with cd(env.project_dir):
        virtualenv('nohup bash -c "python manage.py celerycam --logfile=%scelerycam.log --pidfile=%scelerycam.pid &> %scelerycam.nohup &> %scelerycam.err" &' % (env.celery_log_dir, env.celery_log_dir, env.celery_log_dir, env.celery_log_dir))
I'm using Erich Heine's suggestion to use 'dtach' and it's working pretty well for me:
def runbg(cmd, sockname="dtach"):
    return run('dtach -n `mktemp -u /tmp/%s.XXXX` %s' % (sockname, cmd))
This was found here.
From my experiments, the solution is a combination of two factors:
- run the process as a daemon: nohup ./command &> /dev/null &
- use pty=False for the Fabric run
So, your function should look like this:
def background_run(command):
    command = 'nohup %s &> /dev/null &' % command
    run(command, pty=False)
And you can launch it with:
execute(background_run, your_command)
This is an instance of this issue. Background processes will be killed when the command ends. Unfortunately, CentOS 6 doesn't support pty-less sudo commands.
The final entry in the issue mentions using sudo('set -m; service servicename start'). This turns on Job Control and therefore background processes are put in their own process group. As a result they are not terminated when the command ends.
For even more information see this link.
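In Fabric terms, that workaround could look roughly like this (servicename is a placeholder):

def start_service():
    # set -m enables job control, so the background service gets its own
    # process group and is not terminated when the sudo command ends
    sudo('set -m; service servicename start')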
You just need to run:
run("(nohup yourcommand >& /dev/null < /dev/null &) && sleep 1")
dtach is the way to go. It's software you need to install, like a lite version of screen.
This is a better version of the dtach method found above; it will install dtach if necessary. It can be found here, where you can also learn how to get the output of the process running in the background:
from fabric.api import run
from fabric.api import sudo
from fabric.contrib.files import exists

def run_bg(cmd, before=None, sockname="dtach", use_sudo=False):
    """Run a command in the background using dtach

    :param cmd: The command to run
    :param before: The command to run before the dtach. E.g. exporting
        environment variable
    :param sockname: The socket name to use for the temp file
    :param use_sudo: Whether or not to use sudo
    """
    if not exists("/usr/bin/dtach"):
        sudo("apt-get install dtach")
    if before:
        cmd = "{}; dtach -n `mktemp -u /tmp/{}.XXXX` {}".format(
            before, sockname, cmd)
    else:
        cmd = "dtach -n `mktemp -u /tmp/{}.XXXX` {}".format(sockname, cmd)
    if use_sudo:
        return sudo(cmd)
    else:
        return run(cmd)
May this help you as it helped me to run omxplayer via Fabric on a remote Raspberry Pi!
You can use:
run('nohup /home/ubuntu/spider/bin/python3 /home/ubuntu/spider/Desktop/baidu_index/baidu_index.py > /home/ubuntu/spider/Desktop/baidu_index/baidu_index.py.log 2>&1 &', pty=False)
nohup did not work for me, and I did not have tmux or dtach installed on all the boxes I wanted to use this on, so I ended up using screen like so:
run("screen -d -m bash -c '{}'".format(command), pty=False)
This tells screen to start a bash shell in a detached terminal that runs your command.
You could be running into this issue
Try adding 'pty=False' to the sudo command (I assume virtualenv is calling sudo or run somewhere?)
This worked for me:
sudo('python %s/manage.py celerycam --detach --pidfile=celerycam.pid' % siteDir)
Edit: I had to make sure the pid file was removed first so this was the full code:
# Create new celerycam
sudo('rm celerycam.pid', warn_only=True)
sudo('python %s/manage.py celerycam --detach --pidfile=celerycam.pid' % siteDir)
I was able to circumvent this issue by running nohup ... & over ssh in a separate local shell script. In fabfile.py:
@task
def startup():
    local('./do-stuff-in-background.sh {0}'.format(env.host))
and in do-stuff-in-background.sh:
#!/bin/sh
set -e
set -o nounset
HOST=$1
ssh $HOST -T << HERE
nohup df -h 1>>~/df.log 2>>~/df.err &
HERE
Of course, you could also pass in the command and standard output / error log files as arguments to make this script more generally useful.
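A sketch of that generalization might look like this (the argument order and the /dev/null defaults are my assumptions):

#!/bin/sh
set -e
set -o nounset

HOST=$1
CMD=$2
OUT=${3:-/dev/null}
ERR=${4:-/dev/null}

# The unquoted heredoc delimiter means $CMD/$OUT/$ERR expand locally
ssh $HOST -T << HERE
nohup $CMD 1>>$OUT 2>>$ERR &
HERE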
(In my case, I didn't have admin rights to install dtach, and neither screen -d -m nor pty=False / sleep 1 worked properly for me. YMMV, especially as I have no idea why this works...)
I've created a small web server using werkzeug and I'm able to run it the usual Python way with python my_server.py. Pages load, everything works fine. Now I want to start it when my PC boots. What's the easiest way to do that? I've been struggling with upstart, but it doesn't seem to "live in the background", because after I execute start my_server I immediately receive:

kernel: [ 8799.793942] init: my_server main process (7274) terminated with status 1
my_server.py:
...
if __name__ == '__main__':
    from werkzeug.serving import run_simple
    app = create_app()
    run_simple('0.0.0.0', 4000, app)
upstart configuration file my_server.conf:
description "My service"
author "Some Dude <blah#foo.com>"
start on runlevel [2345]
stop on runlevel [016]
exec /path/to/my_server.py
start on startup
Any ideas how to make it work? Or any better way to daemonize the script?
Update:
I believe the problem lies within my_server.py. It doesn't seem to start the web server (the run_simple() call) in the first place. What steps should be taken to make a .py file runnable by a task handler such as upstart?
- Place a shebang as the first line: #!/usr/bin/env python
- Allow execution permissions: chmod 755
- Start the daemon with superuser rights (to be absolutely sure no permission restrictions prevent it from starting)
- Make sure all Python libraries are there!
- Something else?
Solved:
The problem was missing Python dependencies. When starting the script through a task manager (e.g. upstart or start-stop-daemon), no errors are thrown. You need to be absolutely sure that the PYTHONPATH contains everything you need.
In addition to gg.kaspersky's method, you could also turn your script into a "service", so that you can start or stop it using:
$ sudo service myserver start
* Starting system myserver.py Daemon [ OK ]
$ sudo service myserver status
* /path/to/myserver.py is running
$ sudo service myserver stop
* Stopping system myserver.py Daemon [ OK ]
and define it as a startup service using:
$ sudo update-rc.d myserver defaults
To do this, you must create this file and save it in /etc/init.d/.
#!/bin/sh -e

DAEMON="/path/to/myserver.py"
DAEMONUSER="myuser"
DAEMON_NAME="myserver.py"

PATH="/sbin:/bin:/usr/sbin:/usr/bin"

test -x $DAEMON || exit 0

. /lib/lsb/init-functions

d_start () {
    log_daemon_msg "Starting system $DAEMON_NAME Daemon"
    start-stop-daemon --background --name $DAEMON_NAME --start --user $DAEMONUSER --exec $DAEMON
    log_end_msg $?
}

d_stop () {
    log_daemon_msg "Stopping system $DAEMON_NAME Daemon"
    start-stop-daemon --name $DAEMON_NAME --stop --retry 5
    log_end_msg $?
}

case "$1" in
    start|stop)
        d_${1}
        ;;
    restart|reload|force-reload)
        d_stop
        d_start
        ;;
    force-stop)
        d_stop
        killall -q $DAEMON_NAME || true
        sleep 2
        killall -q -9 $DAEMON_NAME || true
        ;;
    status)
        status_of_proc "$DAEMON_NAME" "$DAEMON" "system-wide $DAEMON_NAME" && exit 0 || exit $?
        ;;
    *)
        echo "Usage: /etc/init.d/$DAEMON_NAME {start|stop|force-stop|restart|reload|force-reload|status}"
        exit 1
        ;;
esac

exit 0
In this example, I assume you have a shebang like #!/usr/bin/python at the head of your python file, so that you can execute it directly.
Last but not least, do not forget to give execution rights to your Python server and to the service script:
$ sudo chmod 755 /etc/init.d/myserver
$ sudo chmod 755 /path/to/myserver.py
Here's the page where I learned this originally (in French).
Cheers.
One simple way to do this is using crontab:
$ crontab -e
A crontab file will appear for editing; write this line at the end:
@reboot python myserver.py
and quit. Now, after each reboot, the cron daemon will run your myserver python script.
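Note that cron jobs run with a minimal environment, so absolute paths are safer; a hedged variant (all paths are placeholders):

@reboot cd /path/to/app && /usr/bin/python myserver.py >> /var/log/myserver.log 2>&1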
If you have a supervisor service that starts at boot, writing a supervisor service is much, much simpler.
You can even set autorestart if your program fails.
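For illustration, a minimal sketch of such a supervisor program definition (the program name and all paths are assumptions):

[program:myserver]
command=/usr/bin/python /path/to/my_server.py
autostart=true
autorestart=true
stdout_logfile=/var/log/myserver.out.log
stderr_logfile=/var/log/myserver.err.log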
Hello,
I'm trying to run a Python script as a service (daemon) on (Ubuntu) Linux.
On the web there are several solutions, like:
http://pypi.python.org/pypi/python-daemon/
A well-behaved Unix daemon process is tricky to get right, but the required steps are much the same for every daemon program. A DaemonContext instance holds the behaviour and configured process environment for the program; use the instance as a context manager to enter a daemon state.
http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
However, as I want to integrate my Python script specifically with Ubuntu Linux, my solution is a combination with an init.d script:
#!/bin/bash

WORK_DIR="/var/lib/foo"
DAEMON="/usr/bin/python"
ARGS="/opt/foo/linux_service.py"
PIDFILE="/var/run/foo.pid"
USER="foo"

case "$1" in
    start)
        echo "Starting server"
        mkdir -p "$WORK_DIR"
        /sbin/start-stop-daemon --start --pidfile $PIDFILE \
            --user $USER --group $USER \
            -b --make-pidfile \
            --chuid $USER \
            --exec $DAEMON $ARGS
        ;;
    stop)
        echo "Stopping server"
        /sbin/start-stop-daemon --stop --pidfile $PIDFILE --verbose
        ;;
    *)
        echo "Usage: /etc/init.d/$USER {start|stop}"
        exit 1
        ;;
esac

exit 0
and in python:
import signal
import time
import multiprocessing

stop_event = multiprocessing.Event()

def stop(signum, frame):
    stop_event.set()

signal.signal(signal.SIGTERM, stop)

if __name__ == '__main__':
    while not stop_event.is_set():
        time.sleep(3)
My question now is whether this approach is correct. Do I have to handle any additional signals? Will it be a "well-behaved Unix daemon process"?
Assuming your daemon has some way of continually running (some event loop, twisted, whatever), you can try to use upstart.
Here's an example upstart config for a hypothetical Python service:
description "My service"
author "Some Dude <blah#foo.com>"
start on runlevel [234]
stop on runlevel [0156]
chdir /some/dir
exec /some/dir/script.py
respawn
If you save this as script.conf in /etc/init, you simply do a one-time
$ sudo initctl reload-configuration
$ sudo start script
You can stop it with stop script. What the above upstart conf says is to start this service on reboots and also restart it if it dies.
As for signal handling - your process should naturally respond to SIGTERM. By default this should be handled unless you've specifically installed your own signal handler.
Rloton's answer is good. Here is a light refinement, just because I spent a ton of time debugging, and I needed a new answer so I can format it properly.
A couple other points that took me forever to debug:
When it fails, first check /var/log/upstart/.log
If your script implements a daemon with python-daemon, you do NOT use the 'expect daemon' stanza. Having no 'expect' works. I don't know why. (If anyone knows why - please post!)
Also, keep checking "initctl status script" to make sure you are up (start/running). (and do a reload when you update your conf file)
Here is my version:
description "My service"
author "Some Dude <blah#foo.com>"
env PYTHON_HOME=/<pathtovirtualenv>
env PATH=$PYTHON_HOME:$PATH
start on runlevel [2345]
stop on runlevel [016]
chdir <directory>
# NO expect stanza if your script uses python-daemon
exec $PYTHON_HOME/bin/python script.py
# Only turn on respawn after you've debugged getting it to start and stop properly
respawn