I would like to have my Python program run in the background as a daemon, on either Windows or Unix. I see that the python-daemon package is for Unix only; is there an alternative for cross platform? If possible, I would like to keep the code as simple as I can.
In Windows it's called a "service" and you could implement it pretty easily e.g. with the win32serviceutil module, part of pywin32. Unfortunately the two "mental models" -- service vs daemon -- are very different in detail, even though they serve similar purposes, and I know of no Python facade that tries to unify them into a single framework.
This question is 6 years old, but I had the same problem, and the existing answers weren't cross-platform enough for my use case. Though Windows services are often used in similar ways as Unix daemons, at the end of the day they differ substantially, and "the devil's in the details". Long story short, I set out to find something that lets me run the exact same application code on both Unix and Windows, while fulfilling the expectations for a well-behaved Unix daemon (which are better explained elsewhere) as well as possible on both platforms:
Close open file descriptors (typically all of them, but some applications may need to protect some descriptors from closure)
Change the working directory for the process to a suitable location to prevent "Directory Busy" errors
Change the file access creation mask (os.umask in the Python world)
Move the application into the background and make it dissociate itself from the initiating process
Completely divorce from the terminal, including redirecting STDIN, STDOUT, and STDERR to different streams (often DEVNULL), and prevent reacquisition of a controlling terminal
Handle signals, in particular, SIGTERM.
The fundamental problem with cross-platform daemonization is that Windows, as an operating system, really doesn't support the notion of a daemon: applications that start from a terminal (or in any other interactive context, including launching from Explorer, etc) will continue to run with a visible window, unless the controlling application (in this example, Python) has included a windowless GUI. Furthermore, Windows signal handling is woefully inadequate, and attempts to send signals to an independent Python process (as opposed to a subprocess, which would not survive terminal closure) will almost always result in the immediate exit of that Python process without any cleanup (no finally:, no atexit, no __del__, etc).
Windows services (though a viable alternative in many cases) were basically out of the question for me: they aren't cross-platform, and they're going to require code modification. pythonw.exe (a windowless version of Python that ships with all recent Windows Python binaries) is closer, but it still doesn't quite make the cut: in particular, it fails to improve the situation for signal handling, and you still cannot easily launch a pythonw.exe application from the terminal and interact with it during startup (for example, to deliver dynamic startup arguments to your script, say, perhaps, a password, file path, etc), before "daemonizing".
In the end, I settled on using subprocess.Popen with the creationflags=subprocess.CREATE_NEW_PROCESS_GROUP keyword to create an independent, windowless process:
import subprocess

independent_process = subprocess.Popen(
    '/path/to/pythonw.exe /path/to/file.py',
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP
)
However, that still left me with the added challenge of startup communications and signal handling. Without going into a ton of detail, for the former, my strategy was (a rough sketch follows the list):
Pickle the important parts of the launching process' namespace
Store that in a tempfile
Add the path to that file in the daughter process' environment before launching
Extract and return the namespace from the "daemonization" function
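A rough sketch of that handoff (all names here are hypothetical: the _DAEMON_STATE_PATH variable, the paths, and the state dictionary are illustrative only, and error handling is omitted):
import os
import pickle
import subprocess
import tempfile

# Hypothetical startup state gathered interactively before daemonizing.
startup_state = {'password': 'hunter2', 'config_path': 'C:/path/to/config.ini'}

# Pickle the state and store it in a tempfile.
with tempfile.NamedTemporaryFile(delete=False, suffix='.pkl') as f:
    pickle.dump(startup_state, f)
    state_path = f.name

# Put the tempfile's path into the daughter process' environment, then launch.
env = dict(os.environ, _DAEMON_STATE_PATH=state_path)
subprocess.Popen(
    'C:/path/to/pythonw.exe C:/path/to/file.py',
    creationflags=subprocess.CREATE_NEW_PROCESS_GROUP,
    env=env
)

# Inside the daughter process, the "daemonization" function extracts and
# returns the namespace:
#     with open(os.environ['_DAEMON_STATE_PATH'], 'rb') as f:
#         namespace = pickle.load(f)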
For signal handling I had to get a bit more creative. Within the "daemonized" process:
Ignore signals in the daemon process, since, as mentioned, they all terminate the process immediately and without cleanup
Create a new thread to manage signal handling
That thread launches daughter signal-handling processes and waits for them to complete
External applications send signals to the daughter signal-handling process, causing it to terminate and complete
Those processes then use the signal number as their return code
The signal handling thread reads the return code, and then calls either a user-defined signal handler, or uses a ctypes API to raise an appropriate exception within the Python main thread (see the sketch after this list)
Rinse and repeat for new signals
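For the last step, the ctypes call in question is CPython's PyThreadState_SetAsyncExc. A minimal sketch of just that piece (Python 3; the helper name is hypothetical):
import ctypes
import threading

def raise_in_main_thread(exc_type):
    # Ask the CPython interpreter to raise exc_type in the main thread.
    main_tid = threading.main_thread().ident
    ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(main_tid),
        ctypes.py_object(exc_type)
    )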
That all being said, for anyone encountering this problem in the future, I've rolled a library called daemoniker that wraps both proper Unix daemonization and the above Windows strategy into a unified facade. The cross-platform API looks like this:
from daemoniker import Daemonizer
with Daemonizer() as (is_setup, daemonizer):
    if is_setup:
        # This code is run before daemonization.
        do_things_here()

    # We need to explicitly pass resources to the daemon; other variables
    # may not be correct
    is_parent, my_arg1, my_arg2 = daemonizer(
        path_to_pid_file,
        my_arg1,
        my_arg2
    )

    if is_parent:
        # Run code in the parent after daemonization
        parent_only_code()

# We are now daemonized, and the parent just exited.
code_continues_here()
Two options come to mind:
Port your program into a windows service. You can probably share much of your code between the two implementations.
Does your program really use any daemon functionality? If not, you could rewrite it as a simple server that runs in the background, manages communications through sockets, and performs its tasks. It will probably consume more system resources than a daemon would, but it would be quite platform independent.
In general the concept of a daemon is Unix specific, in particular expected behaviour with respect to file creation masks, process hierarchy, and signal handling.
You may find PEP 3143 useful; it proposes a standard daemon process library (a continuation of python-daemon) for Python 3.2 and discusses many related daemonizing modules and implementations.
The reason it's Unix only is that daemons are a Unix-specific concept, i.e. a background process initiated by the OS and usually running as a child of the init process (PID 1).
Windows has no direct equivalent of a Unix daemon; the closest thing I can think of is a Windows Service.
There's a program called pythonservice.exe for Windows. I'm not sure whether it's supported on all versions of Python, though.
C supplies the standard function system to run a subprocess using the shell, and many languages provide similar functions, like AWK, Perl (with a single argument), and PHP. Sometimes those functions are criticized as being unsuitable for general use, either on security grounds or because the shell is not portable or is not the one used interactively.
Some other languages seem to agree: they provide only a means of running a process without the shell, like Java (which tokenizes any single string argument itself) and Tcl. Python provides both a direct wrapper and a sophisticated replacement that can avoid using the shell and explicitly recommends the latter (as does the user community).
Certainly the shell is unnecessary complexity for many applications; running an external process at all can bring in issues of deadlock, orphan processes, ambiguous exit statuses, and file descriptor sharing and is unnecessary in cases like running mkdir or echo $VAR. However, assuming that system exists for a reason, when is it the right tool to use?
Even assuming a use case for which it's appropriate to run an external process and in particular to run one via the shell (without being able to filter output as with popen), for C and Python (that uses the actual C system(3)) there are additional caveats. POSIX specifies additional behavior for system: it ignores SIGINT and SIGQUIT and blocks SIGCHLD during its execution. The rationale is that the user (who can send SIGINT and SIGQUIT from the terminal) is interacting with the subprocess, not the parent, during its execution, and that system must handle the SIGCHLD for its child process without the application's interference.
This directly implies the answer to the question: it is appropriate to use system only when
The user has directly asked for a particular shell command to be executed (e.g., with ! in less), and
The application need not react to any other child process exiting during this time (e.g., it should not be multithreaded).
If #1 is not satisfied, the user is likely to send a terminal signal expecting it to kill the whole process and have it kill only the (unexpected if not invisible) child. The Linux man pages caution particularly about using it in a loop that the user cannot then interrupt. It is possible to notice that a child has exited with a signal and reraise it, but this is unreliable because some programs (e.g., Python) exit upon receiving certain signals rather than reraising it to indicate why they exited—and because the shell (mandated by system!) conflates exit statuses with signal-kill statuses.
In Python the error-handling problems are compounded by the fact that os.system follows the C exit-status (read: error code) convention instead of reporting failure as an exception, inviting the user to ignore the exit status of the child.
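To illustrate (a Unix-flavored sketch using the standard false command, which always fails; subprocess.run needs Python 3.5+):
import os
import subprocess

status = os.system('false')             # returns an encoded wait status; easy to ignore
if status != 0:
    print('os.system reported failure:', status)

subprocess.run(['false'], check=True)   # raises CalledProcessError on failure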
The answer is simple (in theory), because it's the same answer that applies to many other programming questions: it's appropriate to use system() when it makes the programmer's life easier, and makes the user's life no harder.
Spotting when this is true, however, requires considerable judgement, and probably we won't always get it right. But, again, that's true of many judgement calls in programming.
Since most shells are written in C, there's no reason in principle why anything done using system() can't be done without it. However, sometimes it requires a whole heap of coding to do what can be done in one line by invoking a shell. The same applies to popen() which, I guess, raises exactly the same kinds of questions.
Using system() raises portability, thread safety, and signal-management concerns.
My experience, unfortunately, is that the situations where system() gives the most benefit (to the programmer) are precisely the ones where it will be least portable.
Sometimes concerns like this will suggest a different approach, and sometimes they won't matter -- it depends on the application.
TL;DR: How can I spawn a different python interpreter (from within python) and create a communication channel between the parent and child when stdin/stdout are unavailable?
I would like my python script to execute a modified python interpreter and through some kind of IPC such as multiprocessing.Pipe communicate with the script that interpreter runs.
Let's say I've got something similar to the following:
subprocess.Popen(args=["/my_modified_python_interpreter.exe",
"--my_additional_flag",
"my_python_script.py"])
This works fine and well: it executes my Python script and all.
I would now like to set up some kind of interprocess communication with that modified python interpreter.
Ideally, I would like to share something similar to one of the returned values from multiprocessing.Pipe(), however I will need to share that object with the modified python process (and I suspect multiprocessing.Pipe won't handle that well even if I do that).
Although sending text and binary will be sufficient (I don't need to share python objects or anything), I do need this to be functional on all major OSes (windows, Linux, Mac).
Some more use-case/business explanation
More specifically, the modified interpreter is the IDAPython interpreter that is shipped with IDA to allow scripting within the IDA tool.
Unfortunately, since stdio is already heavily used for the existing user interface functionalities (provided by IDA), I cannot use stdin/stdout for the communication.
I'm searching for possibilities that are better than the ones I've thought of:
Use two (rx and tx channels) hard-disk files and pass paths to both as the arguments.
Use a local socket and pass a path as an argument.
Use a memory mapped file and the tagname on windows and some other sync method on other OSes.
After some tinkering with the multiprocessing.Pipe function and the multiprocessing.Connection objects it returns, I realized that serialization of Connection objects is far simpler than I originally thought.
A Connection object has three descriptive properties:
fileno - A handle. An arbitrary file descriptor on Unix and a socket on Windows.
readable - A boolean controlling whether the Connection object can be read from.
writable - A boolean controlling whether the Connection object can be written to.
All three properties are accessible as object attributes and are controllable through the Connection class constructor.
It appears that if:
The process calling Pipe spawns a child process and shares the connection.fileno() number.
The child process creates a Connection object using that file descriptor as the handle.
Both interpreters implement the Connection object roughly the same (And this is the risky part, I guess).
then it is possible to Connection.send and Connection.recv between those two processes, although they do not share the same interpreter build and the multiprocessing module was not actually used to instantiate the child process.
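A rough Unix-flavored sketch of the idea, assuming Python 3 (where the class is importable as multiprocessing.connection.Connection); the file names and fd-passing details are illustrative, and, as the next answer notes, handle inheritance on Windows behaves differently:
# parent.py -- spawns the other interpreter and shares the descriptor number
import multiprocessing
import subprocess
import sys

parent_conn, child_conn = multiprocessing.Pipe()
child_fd = child_conn.fileno()

# pass_fds keeps the descriptor open and inheritable in the child (POSIX only)
subprocess.Popen([sys.executable, 'child.py', str(child_fd)],
                 pass_fds=(child_fd,))
parent_conn.send('hello from the parent')

# child.py -- rebuilds a Connection object around the inherited descriptor
import sys
from multiprocessing.connection import Connection

conn = Connection(int(sys.argv[1]), readable=True, writable=True)
print(conn.recv())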
EDIT:
Please note the Connection class is available as multiprocessing.connection.Connection in Python 3 and as _multiprocessing.Connection in Python 2 (which might suggest its usage is discouraged; YMMV).
Going with my other answer turned out to be a mistake. Because of how handles are inherited in Python 2 on Windows, I couldn't get the same solution to work on Windows machines. I ended up using the far superior Listener and Client interfaces, also found in the multiprocessing module.
This question of mine discusses that mistake.
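For completeness, a minimal sketch of the Listener/Client approach (the address and authkey are placeholders); the launching script listens and the script running inside the modified interpreter connects as the client:
# listener_side.py -- run by the launching script
from multiprocessing.connection import Listener

with Listener(('localhost', 6000), authkey=b'secret') as listener:
    with listener.accept() as conn:
        print(conn.recv())

# client_side.py -- run inside the modified interpreter
from multiprocessing.connection import Client

with Client(('localhost', 6000), authkey=b'secret') as conn:
    conn.send('hello from the IDAPython side')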
I am trying to write cross-platform code in Python. The code should be spawning new shells and run code.
This led me to look at Python's subprocess module, and in particular its Popen class. So I read through the documentation for this class (Popen doc) and found too many "if on Unix/if on Windows" statements. Not very cross-platform, unless I have misunderstood the doc.
What is going on? I understand that the two operating systems are different, but really, is there no way to write a common interface? I mean, the same argument "Windows is different from Unix" can be applied to os, system, etc., and they all seem 100% cross-platform.
The problem is that process management is something deeply engrained in the operating system and differs greatly not only in the implementation but often even in the basic functionality.
It's actually often rather easy to abstract code in, for example, the os module. Both C libraries, whether on *nix or Windows, implement reading files as an I/O stream, so you can write even rather low-level file operation functions that work the same on Windows and *nix.
But processes differ greatly. In *nix, for example, processes are all hierarchical: every process has a parent, and all processes go back to the init process running under PID 1. A new process gets created by a process forking itself, checking whether it is the parent or the child, and then continuing accordingly.
In Windows, processes are strictly non-hierarchical and get created by the CreateProcess() system call.
There are a good deal more differences; these were just two examples, but I hope they show that implementing a platform-independent process library is a very daunting task.
I have some tasks [for my RPi] in Python that involve a lot of sleeping: do something that takes a second or two or three, then go wait for several minutes or hours.
I want to pass control back to the OS (Linux) in that sleep time. For this, I should daemonise those tasks. One way is by using Python's Standard daemon process library.
But daemons aren't so easy to understand. As per the Rationale section of PEP 3143, a well-behaved daemon should do the following.
Close all open file descriptors.
Change current working directory.
Reset the file access creation mask.
Run in the background.
Disassociate from process group.
Ignore terminal I/O signals.
Disassociate from control terminal.
Don't reacquire a control terminal.
Correctly handle the following circumstances:
Started by System V init process.
Daemon termination by SIGTERM signal.
Children generate SIGCLD signal.
For a Linux/Unix novice like me, some of this is hardly an explanation. But I want to know why I do what I do. So what is the rationale behind this rationale?
PEP 3143 took these requirements from Unix Network Programming ('UNP') by the late W. Richard Stevens. The explanation below is quoted or summarised from that book. It's not so easily found online, and it may be illegal to download, so I borrowed it from the library. Pages referred to are in the second edition, Volume 1 (1998). (The PEP refers to the first edition, 1990.)
Close all open file descriptors.
"We close any open descriptors inherited from the process that executed the daemon (ie the shell). [..] Some daemons open /dev/null for reading and writing and duplicate the descriptor to standard input, standard output and standard error."
(This 'Howdy World' Python daemon demonstrates this.)
"This guarantees that the common descriptors are open, and a read from any of these descriptors returns 0 (End Of File) and the kernel just discards anything written to any of these three descriptors. The reason for opening these descriptors is so that any library function called by the daemon that assumes it can read from standard input or write to standard output or standard error, will not fail. Alternately, some daemons open a log file that they will write to while running and duplicate its descriptor to standard output and standard error". (UNP p. 337)
Change current working directory
"A printer daemon might change to the printer's spool directory, where it does all its work. [...] The daemon could have been started anywhere in the filesystem, and if it remains there, that filesystem cannot be unmounted." (UNP p 337)
Why would you want to unmount a filesystem? Two reasons:
1. You want to separate (and be able mount and unmount) directories that can fill up with user data from directories dedicated to the OS.
2. If you start a daemon from, say, a USB-stick, you want to be able to unmount that stick without interfering with the daemon.
Reset the file access creation mask.
"So that if the daemon creates its own files, permission bits in the inherited file mode creation mask do not affect the permission bits of the new files." (UNP, p 337)
Run in the background.
By definition,
"a daemon is a process that runs in the background and is independent of control from all terminals". (UNP p 331)
Disassociate from process group.
In order to understand this, you need to understand what a process group is, and that means you need to know what fork does.
What fork does
fork is the only way (in Unix) to create a new process. (In Linux, there is also clone.) The key to understanding fork is that it returns twice per (single) call: once in the calling process (= parent) with the process ID of the newly created process (= child), and once in the child. "All descriptors known by the parent when forking, are shared with the child when fork returns." (UNP p 102)
When a process wants to execute another program, it creates a new process by calling fork, which creates a copy of itself. Then one of them (usually the child) calls the new program. (UNP, p 102)
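In Python the same idiom looks roughly like this (Unix only; error handling omitted):
import os

pid = os.fork()                      # returns twice: 0 in the child, the child's PID in the parent
if pid == 0:
    os.execvp('ls', ['ls', '-l'])    # the child replaces itself with another program
else:
    os.waitpid(pid, 0)               # the parent waits for the child to finish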
Why disassociate from process group
The point is that a session leader may acquire a controlling terminal. A daemon should never do this; it must stay in the background. This is achieved by calling fork twice: the parent forks to create a child, the child forks to create a grandchild. Parent and child are terminated, but the grandchild remains. Because it's a grandchild, it's not a session leader, and therefore can't acquire a controlling terminal. (Summarised from UNP par 12.4 p 335)
The double fork is discussed in more detail here, and in the comments below.
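Put together, a bare-bones version of that double fork looks like this (Unix only; a sketch that omits the error handling and the stdio redirection a real daemon needs):
import os
import sys

def daemonize():
    if os.fork() > 0:    # first fork: the original parent exits
        sys.exit(0)
    os.setsid()          # the child becomes a session leader, detached from the terminal
    if os.fork() > 0:    # second fork: the session leader exits
        sys.exit(0)
    # The grandchild is not a session leader, so it can never reacquire a controlling terminal.
    os.chdir('/')
    os.umask(0)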
Ignore terminal I/O signals.
"Signals generated from terminal keys must not affect any daemons started from that terminal earlier". (UNP p. 331)
Disassociate from control terminal and don't reacquire a control terminal.
By now, the reasons are obvious:
"If the daemon is started from a terminal, we want to be able to use that terminal for other tasks at a later time. For example, if we start the daemon from a terminal, log off the terminal, and someone else logs in on that terminal, we do not want any daemon error messages appearing during the next user's terminal session." (UNP p 331)
Correctly handle the following circumstances:
Started by System V init process
A daemon should be launchable at boot time, obviously.
Daemon termination by SIGTERM signal
SIGTERM means Signal Terminate. At shutdown, the init process normally sends SIGTERM to all processes and waits, usually 5 to 20 seconds, to give them time to clean up and terminate. (UNP, p 135) Also, a child can send SIGTERM to its parent, when its parent should stop doing what it's doing. (UNP p 408)
Children generate SIGCLD signal
Stevens discusses SIGCHLD, not SIGCLD. The difference between them isn't important for understanding daemon behaviour. If a child terminates, it sends SIGCHLD to its parent. If the parent doesn't catch it (and reap the child), the child becomes a zombie (UNP p 118). Oh what fun.
On a final note, when I started to find answers to my question in UNP, it soon struck me I really should read more of it. It's 900+ (!) pages, from 1998 (!) but I believe the concepts and the explanations in UNP stand the test of time, gloriously. Stevens not only knew very well what he was talking about, he also understood what was difficult about it, and made it easier to understand. That's really rare.
I've recently started experimenting with using Python for web development. So far I've had some success using Apache with mod_wsgi and the Django web framework for Python 2.7. However I have run into some issues with having processes constantly running, updating information and such.
I have written a script I call "daemonManager.py" that can start and stop all or individual python update loops (Should I call them Daemons?). It does that by forking, then loading the module for the specific functions it should run and starting an infinite loop. It saves a PID file in /var/run to keep track of the process. So far so good. The problems I've encountered are:
Now and then one of the processes will just quit. I check ps in the morning and the process is just gone. No errors were logged (I'm using the logging module), and I'm covering every exception I can think of and logging them. Also, I don't think these quitting processes have anything to do with my code, because all my processes run completely different code and exit at pretty similar intervals. I could be wrong of course. Is it normal for Python processes to just die after they've run for days/weeks? How should I tackle this problem? Should I write another daemon that periodically checks if the other daemons are still running? What if that daemon stops? I'm at a loss on how to handle this.
How can I programmatically know if a process is still running or not? I'm saving the PID files in /var/run and checking if the PID file is there to determine whether or not the process is running. But if the process just dies of unexpected causes, the PID file will remain. I therefore have to delete these files every time a process crashes (a couple of times per week), which sort of defeats the purpose. I guess I could check if a process is running at the PID in the file, but what if another process has started and was assigned the PID of the dead process? My daemon would think that the process is running fine even if it's long dead. Again I'm at a loss just how to deal with this.
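For reference, the "is it still running" check I'm describing boils down to something like this (Unix-only sketch; it still can't tell a recycled PID from my dead daemon's PID):
import errno
import os

def pid_running(pid):
    try:
        os.kill(pid, 0)    # signal 0 sends nothing; it only checks existence/permissions
    except OSError as e:
        return e.errno == errno.EPERM    # EPERM: a process exists but belongs to someone else
    return True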
I will accept any useful answer on how best to run infinite Python processes, hopefully one that also sheds some light on the above problems.
I'm using Apache 2.2.14 on an Ubuntu machine.
My Python version is 2.7.2
I'll open by stating that this is one way to manage a long running process (LRP) -- not de facto by any stretch.
In my experience, the best possible product comes from concentrating on the specific problem you're dealing with, while delegating supporting tech to other libraries. In this case, I'm referring to the act of backgrounding processes (the art of the double fork), monitoring, and log redirection.
My favorite solution is http://supervisord.org/
Using a system like supervisord, you basically write a conventional python script that performs a task while stuck in an "infinite" loop.
#!/usr/bin/python
import sys
import time

def main_loop():
    while 1:
        # do your stuff...
        time.sleep(0.1)

if __name__ == '__main__':
    try:
        main_loop()
    except KeyboardInterrupt:
        print >> sys.stderr, '\nExiting by user request.\n'
        sys.exit(0)
Writing your script this way makes it simple and convenient to develop and debug (you can easily start/stop it in a terminal, watching the log output as events unfold). When it comes time to throw into production, you simply define a supervisor config that calls your script (here's the full example for defining a "program", much of which is optional: http://supervisord.org/configuration.html#program-x-section-example).
Supervisor has a bunch of configuration options so I won't enumerate them, but I will say that it specifically solves the problems you describe (a minimal config sketch follows the list):
Backgrounding/Daemonizing
PID tracking (can be configured to restart a process should it terminate unexpectedly)
Log normally in your script (a stream handler if you're using the logging module, rather than printing) and let supervisor redirect the output to a file for you.
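As a rough idea, a minimal program section might look like the following (the section name, paths and log location are placeholders; see the linked docs for the full option list):
[program:mytask]
command=/usr/bin/python /opt/mytask/mytask.py
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/mytask.log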
You should consider Python processes as able to run "forever" assuming you don't have any memory leaks in your program, the Python interpreter, or any of the Python libraries / modules that you are using. (Even in the face of memory leaks, you might be able to run forever if you have sufficient swap space on a 64-bit machine. Decades, if not centuries, should be doable. I've had Python processes survive just fine for nearly two years on limited hardware -- before the hardware needed to be moved.)
Ensuring programs restart when they die used to be very simple back when Linux distributions used SysV-style init -- you just add a new line to the /etc/inittab and init(8) would spawn your program at boot and re-spawn it if it dies. (I know of no mechanism to replicate this functionality with the new upstart init-replacement that many distributions are using these days. I'm not saying it is impossible, I just don't know how to do it.)
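For illustration, a hypothetical inittab entry (in the id:runlevels:action:process format) would look something like:
pyd:2345:respawn:/usr/bin/python /opt/daemons/updater.py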
But even the init(8) mechanism of years gone by wasn't as flexible as some would have liked. The daemontools package by DJB is one example of process control-and-monitoring tools intended to keep daemons living forever. The Linux-HA suite provides another similar tool, though it might provide too much "extra" functionality to be justified for this task. monit is another option.
I assume you are running Unix/Linux but you don't really say. I have no direct advice on your issue. So I don't expect to be the "right" answer to this question. But there is something to explore here.
First, if your daemons are crashing, you should fix that. Only programs with bugs should crash. Perhaps you should launch them under a debugger and see what happens when they crash (if that's possible). Do you have any trace logging in these processes? If not, add them. That might help diagnose your crash.
Second, are your daemons providing services (opening pipes and waiting for requests) or are they performing periodic cleanup? If they are periodic cleanup processes you should use cron to launch them periodically rather than have them run in an infinite loop. Cron processes should be preferred over daemon processes. Similarly, if they are services that open ports and service requests, have you considered making them work with inetd? Again, a single daemon (inetd) should be preferred to a bunch of daemon processes.
Third, saving a PID in a file is not very effective, as you've discovered. Perhaps a shared IPC, like a semaphore, would work better. I don't have any details here though.
Fourth, sometimes I need stuff to run in the context of the website. I use a cron job that calls wget with a maintenance URL. You set a special cookie and include the cookie info on the wget command line. If the special cookie doesn't exist, return 403 rather than performing the maintenance process. The other benefit here is that logging in to the database and other environmental concerns are avoided, since the code that serves normal web pages is also serving the maintenance process.
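A hypothetical crontab entry along those lines (the URL, cookie name and schedule are made up):
*/15 * * * * wget -q -O /dev/null --header="Cookie: maint_token=SECRET" http://example.com/maintenance/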
Hope that gives you ideas. I think avoiding daemons if you can is the best place to start. If you can run your python within mod_wsgi that saves you having to support multiple "environments". Debugging a process that fails after running for days at a time is just brutal.