Proper way to write a daemon in 2019 in Python

TL;DR
I would like to write a daemon in Python, but I feel that PEP 3143 is overkill now that almost everybody uses systemd. I am looking for advice on a good starting point for writing a daemon in Python.
Context
From reading other related questions on SO, it seems there is a before-systemd and an after-systemd era. The articles I have read are more than 10 years old, and I feel that nowadays it is much simpler to achieve what I want to do.
I would like to write a program that can be run in any of these ways:
In the foreground (blocking) ($ ./foo)
In the background ($ ./foo &)
In a detached state ($ ./foo start, $ ./foo stop)
Managed by systemd ($ sudo systemctl start foo)
Being able to start and stop the program by itself would require these commands:
$ daemon start
$ daemon stop
$ daemon status
Also, if the program is able to daemonize itself, it would take care of some side effects (preventing zombies, double fork, pidfile, logging...).
I have not yet figured out how to manage the log. Since a daemon is detached from a TTY, it should redirect stdin, stdout, and stderr to /dev/null and use a logger instead. To use a logger I can see different options:
Use syslog, but this requires write access to /dev/log
Write the log through stdout and let systemd capture it, but this requires systemd
Use a custom log file that will be specified with --log=~/foo.log
Use stdout/stderr because the process is not detached
For the PID file, the traditional location is /var/run/, to which most users have no write access, so the user should be able to configure it with --pidfile.
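To make the trade-offs concrete, here is a minimal sketch (mine, not part of the original question) of how the --log and --pidfile options could be wired up; the flag names, defaults, and fallback order are all assumptions:

import argparse
import logging
import logging.handlers
import os
import sys

parser = argparse.ArgumentParser()
parser.add_argument("--log", help="log file path (hypothetical flag)")
parser.add_argument("--pidfile", default=os.path.expanduser("~/.foo.pid"))
args = parser.parse_args()

logger = logging.getLogger("foo")
if args.log:                              # option 3: custom log file
    handler = logging.FileHandler(os.path.expanduser(args.log))
elif sys.stderr.isatty():                 # option 4: not detached, use stderr
    handler = logging.StreamHandler()
elif os.path.exists("/dev/log"):          # option 1: syslog
    handler = logging.handlers.SysLogHandler(address="/dev/log")
else:                                     # option 2: stdout, captured by systemd
    handler = logging.StreamHandler(sys.stdout)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

with open(args.pidfile, "w") as f:        # user-configurable pidfile location
    f.write(str(os.getpid()))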
From this I realize that building a simple daemon is not an easy task, and I do not know where to start.
One trivial approach would be to have two separate programs: one simple blocking program that performs the task and writes to stdout, and one process manager that can do what systemd does, but at the user level.
If I had to summarize my question in one sentence, it would be:
Is it worth using the PEP 3143 standard daemon process library in 2019 to write a daemon in Python, instead of relying on a daemon manager such as systemd?

Related

Python (2.7) script monitoring and notification system

I've read a lot of other posts about monitoring python scripts, but haven't been able to find anything like what I am hoping to do. Essentially, I have 2 desktops running Linux. Each computer has multiple python scripts running non-stop 24/7. Most of them are web scraping, while a few others are scrubbing and processing data. I have built pretty extensive exception handling into them that sends me an email in the event of any error or crash, but there are some situations that I haven't been able to get emailed about (such as if the script itself just freezes, the computer itself crashes, or the computer loses its internet connection, etc.).
So, I'm trying to build a sort of check-in service where a python script checks in to the service multiple times throughout its run, and if it doesn't check in within X amount of time, the service sends me an email. I don't know if this is something that can be done with the signal or asyncore module(s) and/or sockets, or what a good place would even be to start.
Has anyone had any experience writing anything like this? Or can you point me in the right direction?
Take a look at supervision tools like monit or supervisord.
Those tools are built to do what you described.
For example: create a simple init.d script for your python process:
PID_FILE=/var/run/myscript.pid
LOG_FILE=/mnt/logs/myscript.log
SOURCE=/usr/local/src/myscript

case "$1" in
    start)
        exec /usr/bin/python $SOURCE/main_tread.py >> $LOG_FILE 2>&1 &
        echo $! > $PID_FILE
        ;;
    stop)
        kill `cat ${PID_FILE}`
        ;;
    *)
        echo "Usage: wrapper {start|stop}"
        ;;
esac
exit 0
Then add this to the monit config:
check process myscript pidfile /var/run/myscript.pid
    start program = "/etc/init.d/myscript start"
    stop program = "/etc/init.d/myscript stop"

check file myscript.pid path /var/run/myscript.pid
    if changed checksum then alert
Also check the monit documentation; it has pretty good examples of how to set up alerts and send emails.
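As an illustrative sketch (not from the answer) of the check-in idea in the question: have the script touch a heartbeat file on every pass of its loop, and let monit's timestamp test alert when the file goes stale (e.g. check file heartbeat with path /var/tmp/myscript.heartbeat followed by if timestamp > 10 minutes then alert). The path and interval here are placeholders:

import os
import time

HEARTBEAT = "/var/tmp/myscript.heartbeat"  # hypothetical path

def do_work():
    pass  # the scraping / data-processing step goes here

while True:
    do_work()
    with open(HEARTBEAT, "a"):
        pass                     # make sure the file exists
    os.utime(HEARTBEAT, None)    # bump mtime: "I checked in"
    time.sleep(60)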
Upstart is a good choice, but I'm afraid it is only available for Ubuntu and Red Hat based distros.

Run Python script forever, logging errors and restarting when crashes

I have a python script that continuously processes new data and writes it to a mongodb. The script is a while loop with a sleep, so the code runs continuously.
What is the recommended way to run the Python script forever, logging errors when they occur, and restarting when it crashes?
Will node.js's forever be suitable? I'm also running node/meteor on the same Ubuntu server.
supervisord is perfect for this sort of thing. While I used to check that programs were still running every couple of minutes with a cron job, supervisord runs all programs as child processes, so in the event your program terminates, supervisord will automatically restart it. I no longer need to parse the output of ps to see if a program crashed.
It has a simple declarative config file and configurable logging. By default it creates log files named your-program-name-stderr.log and your-program-name-stdout.log, which are automatically handled by logrotate when supervisord is installed from an OS package manager (Debian for me).
If you don't want to configure supervisord's logging, you should look at logging in Python so you can control what goes into those files.
If you're on a Debian derivative, you should be able to install and start the daemon simply by executing apt-get install supervisor as root.
The config file is very straightforward too:
[program:myprogram]
command=/path/to/my/program/script
directory=/path/to/my/program/base
user=myuser
autostart=true
autorestart=true
redirect_stderr=True
supervisorctl also lets you see what your program is doing interactively, and can start and stop multiple programs with supervisorctl start myprogram, etc.
I recently wrote something similar. The basic pattern I follow is:

import logging
import time

class SpecificError(Exception):
    pass  # placeholder for the errors you expect

while True:
    try:
        pass  # functionality goes here
    except SpecificError:
        logging.exception("expected error")    # log exception
    except Exception:
        logging.exception("unexpected error")  # catch everything else
    finally:
        time.sleep(600)
To handle reboots you can use init.d or cron jobs.
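For the cron route, a sketch of a crontab entry (added via crontab -e; paths are placeholders) that restarts the script at boot:

@reboot /usr/bin/python /path/to/script.py >> /var/log/myscript.log 2>&1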
If you are writing a daemon, you should probably do it with this command:
http://manpages.ubuntu.com/manpages/lucid/man8/start-stop-daemon.8.html
You can spawn this from a System V /etc/init.d/ script, or use Upstart which is slowly replacing it.
Upstart: http://upstart.ubuntu.com/getting-started.html
System V: http://www.cyberciti.biz/tips/linux-write-sys-v-init-script-to-start-stop-service.html
I find System V easier to write, but if this will ever be packaged and distributed as a Debian package, I recommend writing an Upstart conf.
Definitely keep the sleep so it won't hog the CPU.
I don't know if this is still relevant to you, but I have been reading forever about how to do this and want to share what I did.
For me, the goal was to have a python script always running (on my Linux computer). The python script also has a while True loop in it, which should theoretically run forever, but if it crashes for any reason I cannot foresee, I want the script to restart. Also, when I restart the computer it should run the script.
I am not an expert but for me the best and most understandable was to use systemd (assuming you use Linux).
There are two nice examples of how to do this given here and here, showing how to write your .service file in either /etc/systemd/system or /lib/systemd/system. If you want to be completely correct you should choose the former:
"/etc/systemd/system/: units installed by the system administrator"
The documentation of systemd here is actually nice to read, even if you are not an expert.
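For reference, a minimal .service file along those lines might look like this (a sketch; the unit name and paths are placeholders). You would then enable it with sudo systemctl enable myscript.service so it starts at boot:

[Unit]
Description=My always-running Python script

[Service]
ExecStart=/usr/bin/python3 /path/to/script.py
Restart=always

[Install]
WantedBy=multi-user.target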
Hope this helps someone!

Python daemon and systemd service

I have a simple Python script working as a daemon. I am trying to create a systemd service file to start this script during startup.
Current service file:
[Unit]
Description=Text
After=syslog.target
[Service]
Type=forking
User=node
Group=node
WorkingDirectory=/home/node/Node/
PIDFile=/var/run/zebra.pid
ExecStart=/home/node/Node/node.py
[Install]
WantedBy=multi-user.target
node.py:
import daemon

if __name__ == '__main__':
    with daemon.DaemonContext():
        check = Node()  # Node is defined elsewhere in this script
        check.run()
run() contains a while True loop.
I try to run this service with systemctl start zebra-node.service. Unfortunately, the service never finishes the starting sequence, and I have to press Ctrl+C.
The script is running, but the status is activating, and after a while it changes to deactivating.
Now I am using python-daemon (but before, I tried without it and the symptoms were similar).
Should I implement some additional features in my script, or is the systemd file incorrect?
The reason it does not complete the startup sequence is that for Type=forking, your startup process is expected to fork and exit (see $ man systemd.service and search for "forking").
Simply use only the main process, do not daemonize
One option is to do less. With systemd, there is often no need to create daemons and you may directly run the code without daemonizing.
#!/usr/bin/python -u
from somewhere import Node
check = Node()
check.run()
This allows using a simpler Type of service called simple, so your unit file would look like this:
[Unit]
Description=Simplified simple zebra service
After=syslog.target
[Service]
Type=simple
User=node
Group=node
WorkingDirectory=/home/node/Node/
ExecStart=/home/node/Node/node.py
StandardOutput=syslog
StandardError=syslog
[Install]
WantedBy=multi-user.target
Note that the -u in the python shebang is not strictly necessary, but in case you print something to stdout or stderr, the -u makes sure there is no output buffering, so printed lines will be immediately caught by systemd and recorded in the journal. Without it, they would appear with some delay.
For this purpose I added the lines StandardOutput=syslog and StandardError=syslog to the unit file. If you do not care about seeing printed output in your journal, you can omit these lines.
systemd makes daemonization obsolete
While the title of your question explicitly asks about daemonizing, I guess the core of the question is "how do I make my service run?", and using the main process directly is much simpler (you do not have to care about daemons at all), so it can be considered an answer to your question.
I think many people daemonize just because "everybody does it". With systemd, the reasons for daemonizing are often obsolete. There might be some reasons to daemonize, but such cases are rare now.
EDIT: fixed python -p to proper python -u. thanks kmftzg
It is possible to daemonize like Schnouki and Amit describe. But with systemd this is not necessary. There are two nicer ways to initialize the daemon: socket-activation and explicit notification with sd_notify().
Socket activation works for daemons which want to listen on a network port or UNIX socket or similar. systemd would open the socket, listen on it, and then spawn the daemon when a connection comes in. This is the preferred approach because it gives the most flexibility to the administrator. [1] and [2] give a nice introduction, [3] describes the C API, while [4] describes the Python API.
[1] http://0pointer.de/blog/projects/socket-activation.html
[2] http://0pointer.de/blog/projects/socket-activation2.html
[3] http://www.freedesktop.org/software/systemd/man/sd_listen_fds.html
[4] http://www.freedesktop.org/software/systemd/python-systemd/daemon.html#systemd.daemon.listen_fds
Explicit notification means that the daemon opens the sockets itself and/or does any other initialization, and then notifies init that it is ready and can serve requests. This can be implemented with the "forking protocol", but actually it is nicer to just send a notification to systemd with sd_notify().
The Python wrapper is called systemd.daemon.notify and takes one line to use [5].
[5] http://www.freedesktop.org/software/systemd/python-systemd/daemon.html#systemd.daemon.notify
In this case the unit file would have Type=notify, and call
systemd.daemon.notify("READY=1") after it has established the sockets. No forking or daemonization is necessary.
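A rough sketch of that pattern (the socket, port, and request handling are assumptions; requires the python-systemd package and a unit file with Type=notify):

import socket
from systemd import daemon

# Do all initialization first...
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 8765))  # hypothetical port
sock.listen(5)

# ...then tell systemd that startup is complete.
daemon.notify("READY=1")

while True:
    conn, addr = sock.accept()
    conn.close()  # real request handling would go here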
You're not creating the PID file.
systemd expects your program to write its PID in /var/run/zebra.pid. As you don't do it, systemd probably thinks that your program is failing, hence deactivating it.
To add the PID file, install lockfile and change your code to this:
import daemon
import daemon.pidlockfile

pidfile = daemon.pidlockfile.PIDLockFile("/var/run/zebra.pid")
with daemon.DaemonContext(pidfile=pidfile):
    check = Node()
    check.run()
(Quick note: some recent update of lockfile changed its API and made it incompatible with python-daemon. To fix it, edit daemon/pidlockfile.py, remove LinkFileLock from the imports, and add from lockfile.linklockfile import LinkLockFile as LinkFileLock.)
Be careful of one other thing: DaemonContext changes the working dir of your program to /, making the WorkingDirectory of your service file useless. If you want DaemonContext to chdir into another directory, use DaemonContext(pidfile=pidfile, working_directory="/path/to/dir").
I came across this question when trying to convert some python init.d services to systemd under CentOS 7. This seems to work great for me, by placing this file in /etc/systemd/system/:
[Unit]
Description=manages worker instances as a service
After=multi-user.target
[Service]
Type=idle
User=node
ExecStart=/usr/bin/python /path/to/your/module.py
Restart=always
TimeoutStartSec=10
RestartSec=10
[Install]
WantedBy=multi-user.target
I then dropped my old init.d service file from /etc/init.d and ran sudo systemctl daemon-reload to reload systemd.
I wanted my service to auto restart, hence the restart options. I also found using idle for Type made more sense than simple.
Behavior of idle is very similar to simple; however, actual execution
of the service binary is delayed until all active jobs are dispatched.
This may be used to avoid interleaving of output of shell services
with the status output on the console.
More details on the options I used here.
I also experimented with keeping the old init.d service and having systemd restart the service, but I ran into some issues.
[Unit]
# Added this to the above
#SourcePath=/etc/init.d/old-service
[Service]
# Replace the ExecStart from above with these
#ExecStart=/etc/init.d/old-service start
#ExecStop=/etc/init.d/old-service stop
The issue I experienced was that the init.d script was used instead of the systemd service when both had the same name. If you killed the process started by init.d, the systemd service would then take over. But if you ran service <service-name> stop, it would refer to the old init.d service. So I found the best way was to drop the old init.d service, so that the service command referred to the systemd service instead.
Hope this helps!
Also, you most likely need to set detach_process=True when creating the DaemonContext().
This is because, if python-daemon detects that it is running under an init system, it does not detach from the parent process, while systemd expects a daemon process running with Type=forking to do so. Hence you need that setting, or else systemd will keep waiting and finally kill the process.
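In other words (a sketch, assuming python-daemon's DaemonContext API and the Node class from the question):

import daemon

# Force detaching even though python-daemon detects an init-system parent,
# so that systemd's Type=forking expectation is met.
with daemon.DaemonContext(detach_process=True):
    check = Node()  # Node as defined in the question's node.py
    check.run()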
If you are curious, in python-daemon's daemon module, you will see this code:
def is_detach_process_context_required():
    """ Determine whether detaching process context is required.

        Return ``True`` if the process environment indicates the
        process is already detached:

        * Process was started by `init`; or
        * Process was started by `inetd`.

        """
    result = True
    if is_process_started_by_init() or is_process_started_by_superserver():
        result = False
    return result
Hopefully this explains it better.

Check if Twisted Server launched with twistd was started successfully

I need a reliable way to check if a Twisted-based server, started via twistd (and a TAC file), was started successfully. It may fail because some network options are set up wrong. Since I cannot access the twistd log (it goes to /dev/null, because I don't need the log-clutter twistd produces), I need to find out, within a launch script which wraps the twistd call, whether the server was started successfully.
The launch-script is a Bash script like this:
#!/usr/bin/bash
twistd \
    --pidfile "myservice.pid" \
    --logfile "/dev/null" \
    --python \
    myservice.tac
All I found on the net are some hacks using ps or the like, but I don't like such an approach, because I think it's not reliable.
So I'm wondering whether there is a way to access the internals of Twisted and get all currently running Twisted applications. That way I could query the currently running apps for the name of my Twisted application (as I named it in the TAC file).
I'm also thinking about not using the twistd executable but implementing a Python-based launch script which includes the twistd-content, like the answer to this question provides, but I don't know if that helps me in getting the status of the server to run.
So my question is just: is there a reliable not-ugly way to tell if a Twisted Server started with twistd was started successfully, when twistd-logging is disabled?
You're explicitly specifying a PID file. twistd will write its PID into that file. You can check the system to see if there is a process with that PID.
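For example, here is a sketch of such a check from Python (the pidfile path is assumed to match the launch script above):

import errno
import os

def is_running(pidfile="myservice.pid"):
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (IOError, ValueError):
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending anything
    except OSError as e:
        # EPERM means the process exists but belongs to another user
        return e.errno == errno.EPERM
    return True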
You could also re-enable logging with a custom log observer which only logs your startup event and discards all other log messages. Then you can watch the log for the startup event.
Another possibility is to add another server to your application which exposes the internals you mentioned. Then try connecting to that server and looking around to see what you wanted to see (just the fact that the server is running seems like a good indication that the process started up properly, though). If you make it a manhole server then you get the ability to evaluate arbitrary Python code, which lets you inspect any state in the process you want.
You could also just have your application code write out an extra state file that explicitly indicates successful startup. Make sure you delete it before starting the application and you'll have a fine indicator of success vs failure.

Python Daemon Packaging Best Practices

I have a tool which I have written in python and generally should be run as a daemon. What are the best practices for packaging this tool for distribution, particularly how should settings files and the daemon executable/script be handled?
Relatedly, are there any common tools for setting up the daemon to run on boot as appropriate for the given platform (i.e. init scripts on Linux, services on Windows, launchd on OS X)?
The best tool I found for helping with init.d scripts is "start-stop-daemon". It will run any application, monitor run/pid files, create them when necessary, provide ways to stop the daemon, set process user/group ids, and can even background your process.
For example, this is a script which can start/stop a wsgi server:
#! /bin/bash

case "$1" in
    start)
        echo "Starting server"
        # Activate the virtual environment
        . /home/ali/wer-gcms/g-env/bin/activate
        # Run start-stop-daemon; the $DAEMON variable contains the path to
        # the application to run
        start-stop-daemon --start --pidfile $WSGI_PIDFILE \
            --user www-data --group www-data \
            --chuid www-data \
            --exec "$DAEMON"
        ;;
    stop)
        echo "Stopping WSGI Application"
        # start-stop-daemon can also stop the application by sending sig 15
        # (configurable) to the process id contained in the run/pid file
        start-stop-daemon --stop --pidfile $WSGI_PIDFILE --verbose
        ;;
    *)
        # Refuse to do other stuff
        echo "Usage: /etc/init.d/wsgi-application.sh {start|stop}"
        exit 1
        ;;
esac
exit 0
You can also see there an example of how to use it with a virtualenv, which I would always recommend.
To answer one part of your question, there are no tools I know of that will do daemon setup portably even across Linux systems let alone Windows or Mac OS X.
Most Linux distributions seem to be using start-stop-daemon within init scripts now, but you're still going to have minor differences in filesystem layout and big differences in packaging. Using autotools/configure, or distutils/easy_install if your project is all Python, will go a long way toward making it easier to build packages for different Linux/BSD distributions.
Windows is a whole different game and will require Mark Hammond's win32 extensions and maybe Tim Golden's WMI extensions.
I don't know Launchd except that "none of the above" are relevant.
For tips on daemonizing Python scripts, I would look to Python apps that are actually doing it in the real world, for example inside Twisted.
There are many snippets on the internet offering to write a daemon in pure python (no bash scripts)
http://www.jejik.com/articles/2007/02/a_simple_unix_linux_daemon_in_python/
looks clean...
If you want to write your own, the principle is the same as with the bash daemon function.
Basically:
On start:
fork to another process
open a logfile to redirect your stdout and stderr
save the pid somewhere
On stop:
send SIGTERM to the process whose pid is stored in your pidfile; with signal.signal(signal.SIGTERM, sigtermhandler) you can bind a stopping procedure to the SIGTERM signal (sketched below)
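A tiny sketch of that stop-side wiring (the handler body is a placeholder):

import signal
import sys

def sigtermhandler(signum, frame):
    # flush logs, remove the pidfile, close sockets, then exit cleanly
    sys.exit(0)

signal.signal(signal.SIGTERM, sigtermhandler)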
I don't know any widely used package doing this though.
Check Ben Finney's daemon module. He has started to write a PEP targeting Python 3.x:
http://www.python.org/dev/peps/pep-3143/
But an implementation is already available here:
http://pypi.python.org/pypi/python-daemon/
Not a silver bullet for what you're asking, but check out supervisord. It handles all the fun bits of managing processes. I use it heavily in a large production environment. Also, it's written in Python!
I can't remember where I downloaded it... but this is the best daemonizing script that I've found. It works beautifully (on Mac and Linux). (Save it as daemonize.py.)
import sys, os

def daemonize(stdin='/dev/null', stdout='/dev/null', stderr='/dev/null'):
    # Perform first fork.
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)  # Exit first parent.
    except OSError as e:
        sys.stderr.write("fork #1 failed: (%d) %s\n" % (e.errno, e.strerror))
        sys.exit(1)
    # Decouple from parent environment.
    os.chdir("/")
    os.umask(0)
    os.setsid()
    # Perform second fork.
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)  # Exit second parent.
    except OSError as e:
        sys.stderr.write("fork #2 failed: (%d) %s\n" % (e.errno, e.strerror))
        sys.exit(1)
    # The process is now daemonized; redirect standard file descriptors.
    sys.stdout.flush()
    sys.stderr.flush()
    si = open(stdin, 'r')
    so = open(stdout, 'a+')
    se = open(stderr, 'a+')
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())
In your script, you would simply:
from daemonize import daemonize
daemonize()
And you can also specify places to redirect stdin, stdout, and stderr if you need to.
On Linux systems, the system's package manager (Portage for Gentoo, Aptitude for Ubuntu/Debian, yum for Fedora, etc.) usually takes care of installing the program including placing init scripts in the right places. If you want to distribute your program for Linux, you might want to look into bundling it up into the proper format for various distributions' package managers.
This advice is obviously irrelevant on systems which don't have package managers (Windows, and Mac I think).
This blog entry made it clear for me that there are actually two common ways to have your Python program run as a daemon (I hadn't figured that out so clearly from the existing answers):
There are two approaches to writing daemon applications like servers in Python.
The first is to handle all the tasks of starting and stopping daemons in Python code itself. The easiest way to do this is with the python-daemon package, which might eventually make its way into the Python distribution.
Poeljapon's answer is an example of this 1st approach, although it doesn't use the python-daemon package, but links to a custom yet very clean python script.
The other approach is to use the tools
supplied by the operating system. In the case of Debain, this means
writing an init script which makes use of the start-stop-daemon
program.
Ali Afshar's answer is a shell script example of the 2nd approach, using the start-stop-daemon.
The blog entry I quoted has a shell script example, and some additional details on things such as starting your daemon at system startup and restarting your daemon automatically when it stopped for any reason.
Correct me if I'm wrong, but I believe the question is how to deploy the daemon. Set your app to install via pip, make the entry_point a CLI that calls daemon(), then create an init script that simply runs $app_name &.
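A hypothetical setup.py fragment for that approach (all names are placeholders):

from setuptools import setup

setup(
    name="myapp",
    version="0.1",
    py_modules=["myapp"],
    entry_points={
        "console_scripts": [
            # 'myapp' on the command line calls myapp.main(),
            # which would invoke the daemon() CLI described above.
            "myapp = myapp:main",
        ],
    },
)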
"generally should be run as a daemon?"
Doesn't -- on surface -- make a lot of sense. "Generally" isn't sensible. It's either a a daemon or not. You might want to update your question.
For examples of daemons, read up on daemons like Apache's httpd or any database server (they're daemons) or the SMTPD mail daemon.
Or, perhaps, read up on something simpler, like the FTP daemon, SSH daemon, Telnet daemon.
In Linux world, you'll have your application installation directory, some working directory, plus the configuration file directories.
We use /opt/ourapp for the application (it's Python, but we don't install in Python's lib/site-packages)
We use /var/ourapp for working files and our configuration files.
We could use /etc/ourapp for configuration files -- it would be consistent -- but we don't.
We don't -- yet -- use the init.d scripts for startup. But that's the final piece, automated startup. For now, we have sys admins start the daemons.
This is based, partly, on http://www.pathname.com/fhs/ and http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/Linux-Filesystem-Hierarchy.html.
