Which is the correct function to call to exit the child process after os.fork()?
The documentation for os._exit() states:
The standard way to exit is sys.exit(n).
_exit() should normally only be used in the child process after a fork().
It does not say whether it's acceptable to terminate the child process using sys.exit(). So:
Is it?
Are there any potential side effects of doing so?
The unix way is that if you are a child of a fork then you call _exit. The main difference between exit and _exit is that exit tidies up more - calls the atexit handlers, flushes stdio etc, whereas _exit does the minimum amount of stuff in userspace, just getting the kernel to close all its files etc.
This translates pretty directly into the python world with sys.exit doing what exit does and doing more of the python interpreter shutdown, where os._exit does the minimum possible.
If you are a child of fork and you call exit rather than _exit then you may end up calling exit handlers that the parent will call again when it exits causing undefined behaviour.
Part of the documentation on os._exit(n) you did not cite is
Exit the process with status n, without calling cleanup handlers, flushing stdio buffers, etc.
So, how I'm reading this, you should use os._exit() as long as you share file handlers (so they will be close()'d by another process and you take care of flushing the buffers yourself (if it matters in your case). Without shared resources (like in "files") - it doesn't matter.
So if your child processes are computation-only, and are fed raw data (not resource handlers), then it's safe to use exit().
Related
Based on the Python documentation, daemon threads are threads that die once the main thread dies. This seems to be the complete opposite behavior of daemon processes which involve creating a child process and terminating the parent process in order to have init take over the child process (aka killing the parent process does NOT kill the child process).
So why do daemon threads die when the parent dies, is this a misnomer? I would think that "daemon" threads would keep running after the main process has been terminated.
It's just names meaning different things in different contexts.
In case you are not aware, like threading.Thread, multiprocessing.Process also can be flagged as "daemon". Your description of "daemon processes" fits to Unix-daemons, not to Python's daemon-processes.
The docs also have a section about Process.daemon:
... Note that a daemonic process is not allowed to create child processes.
Otherwise a daemonic process would leave its children orphaned if it
gets terminated when its parent process exits. Additionally, these are
not Unix daemons or services, they are normal processes that will be
terminated (and not joined) if non-daemonic processes have exited.
The only thing in common between Python's daemon-processes and Unix-daemons (or Windows "Services") is that you would use them for background-tasks
(for Python: only an option for tasks which don't need proper clean up on shutdown, though).
Python imposes it's own abstraction layer on top of OS-threads and processes. The daemon-attribute for Thread and Process is about this OS-independent, Python-level abstraction.
At the Python-level, a daemon-thread is a thread which doesn't get joined (awaited to exit voluntarily) when the main-thread exits and a daemon-process is a process which gets terminated (not joined) when the parent-process exits. Daemon-threads and processes both experience the same behavior in that their natural exit is not awaited in case the main or parent-process is shutting down. That's all.
Note that Windows doesn't even have the concept of "related processes" like Unix, but Python implements this relationship of "child" and "parent" in a cross-platform manner.
I would think that "daemon" threads would keep running after the main
process has been terminated.
A thread cannot exist outside of a process. A process always hosts and gives context to at least one thread.
I have read most of the similar questions in stackoverflow, but none see to solve my problem. I use ctypes to call a function from dll file. Therefore, I can't edit the source codes of the dll file to add any "end looping" conditions. Also, this function may last long (like some printing command). I need to design a "halt" command in case that something of emergency happens while printing is processed. The only way I can do is to kill the thread.
It is never good to forcibly kill a thread. Your program should be designed to cleanly exit from threads.
You can mark it as "daemon" before starting it. If you exit the main thread it will not wait on daemonized threads.
Terminating a thread can still be done in two ways. You can asynchronously raise a Python exception in a thread, via https://docs.python.org/2/c-api/init.html#c.PyThreadState_SetAsyncExc (as stated, this requires building a C module or using ctypes to make it work). The other approach on Windows is to call the Windows API TerminateThread():
TerminateThread is used to cause a thread to exit. When this occurs,
the target thread has no chance to execute any user-mode code. DLLs
attached to the thread are not notified that the thread is
terminating. The system frees the thread's initial stack.
[...]
TerminateThread is a dangerous function that should only be used in
the most extreme cases. You should call TerminateThread only if you
know exactly what the target thread is doing, and you control all of
the code that the target thread could possibly be running at the time
of the termination. For example, TerminateThread can result in the
following problems: ...
I think this should also be doable using ctypes.
You cannot safely terminate a thread without its cooperation. Threads are not isolated within a process, so unsafely terminating a thread contaminates the process. Please, don't go down this road.
If you need this kind of isolation, you need a process. You can safely terminate a process without its cooperation, though it may leave system objects (such as files) that the process was working on in an intermediate state. In your case, that may mean a print job half-done and a page halfway in the printer. Or it may mean temporary files that don't get removed.
In a multi-threaded Python process I have a number of non-daemon threads, by which I mean threads which keep the main process alive even after the main thread has exited / stopped.
My non-daemon threads hold weak references to certain objects in the main thread, but when the main thread ends (control falls off the bottom of the file) these objects do not appear to be garbage collected, and my weak reference finaliser callbacks don't fire.
Am I wrong to expect the main thread to be garbage collected? I would have expected that the thread-locals would be deallocated (i.e. garbage collected)...
What have I missed?
Supporting materials
Output from pprint.pprint( threading.enumerate() ) showing the main thread has stopped while others soldier on.
[<_MainThread(MainThread, stopped 139664516818688)>,
<LDQServer(testLogIOWorkerThread, started 139664479889152)>,
<_Timer(Thread-18, started 139663928870656)>,
<LDQServer(debugLogIOWorkerThread, started 139664437925632)>,
<_Timer(Thread-17, started 139664463103744)>,
<_Timer(Thread-19, started 139663937263360)>,
<LDQServer(testLogIOWorkerThread, started 139664471496448)>,
<LDQServer(debugLogIOWorkerThread, started 139664446318336)>]
And since someone always asks about the use-case...
My network service occasionally misses its real-time deadlines (which causes a total system failure in the worst case). This turned out to be because logging of (important) DEBUG data would block whenever the file-system has a tantrum. So I am attempting to retrofit a number of established specialised logging libraries to defer blocking I/O to a worker thread.
Sadly the established usage pattern is a mix of short-lived logging channels which log overlapping parallel transactions, and long-lived module-scope channels which are never explicitly closed.
So I created a decorator which defers method calls to a worker thread. The worker thread is non-daemon to ensure that all (slow) blocking I/O completes before the interpreter exits, and holds a weak reference to the client-side (where method calls get enqueued). When the client-side is garbage collected the weak reference's callback fires and the worker thread knows no more work will be enqueued, and so will exit at its next convenience.
This seems to work fine in all but one important use-case: when the logging channel is in the main thread. When the main thread stops / exits the logging channel is not finalised, and so my (non-daemon) worker thread lives on keeping the entire process alive.
It's a bad idea for your main thread to end without calling join on all non-daemon threads, or to make any assumptions about what happens if you don't.
If you don't do anything very unusual, CPython (at least 2.0-3.3) will cover for you by automatically calling join on all non-daemon threads as pair of _MainThread._exitfunc. This isn't actually documented, so you shouldn't rely on it, but it's what's happening to you.
Your main thread hasn't actually exited at all; it's blocking inside its _MainThread._exitfunc trying to join some arbitrary non-daemon thread. Its objects won't be finalized until the atexit handler is called, which doesn't happen until after it finishes joining all non-daemon threads.
Meanwhile, if you avoid this (e.g., by using thread/_thread directly, or by detaching the main thread from its object or forcing it into a normal Thread instance), what happens? It isn't defined. The threading module makes no reference to it at all, but in CPython 2.0-3.3, and likely in any other reasonable implementation, it falls to the thread/_thread module to decide. And, as the docs say:
When the main thread exits, it is system defined whether the other threads survive. On SGI IRIX using the native thread implementation, they survive. On most other systems, they are killed without executing try ... finally clauses or executing object destructors.
So, if you manage to avoid joining all of your non-daemon threads, you have to write code that can handle both having them hard-killed like daemon threads, and having them continue running until exit.
If they do continue running, at least in CPython 2.7 and 3.3 on POSIX systems, that the main thread's OS-level thread handle, and various higher-level Python objects representing it, may be still retained, and not get cleaned up by the GC.
On top of that, even if everything were released, you can't rely on the GC ever deleting anything. If your code depends on deterministic GC, there are many cases you can get away with it in CPython (although your code will then break in PyPy, Jython, IronPython, etc.), but at exit time is not one of them. CPython can, and will, leak objects at exit time and let the OS sort 'em out. (This is why writable files that you never close may lose the last few writes—the __del__ method never gets called, and therefore there's nobody to tell them to flush, and at least on POSIX the underlying FILE* doesn't automatically flush either.)
If you want something to be cleaned up when the main thread finishes, you have to use some kind of close function rather than relying on __del__, and you have to make sure it gets triggered via a with block around the main block of code, an atexit function, or some other mechanism.
One last thing:
I would have expected that the thread-locals would be deallocated (i.e. garbage collected)...
Do you actually have thread locals somewhere? Or do you just mean locals and/or globals that are only accessed in one thread?
I need to spawn a background process in django, the view returns immediately, the background process continues make some changes, then update the db. This is done by os.spawnl() function to call a separate .py file.
The problem is after the background process is done, it becames a zombie function [python] <defunct>.
How do I avoid that? I followed this and this example but I still got the child process as zombie after the django render process.
I want to take this chance to practice my *nix process management skills so please do me a favor, don't give me Celery or other mq/async task solutions, and I hate dependencies.
This got to long for a comment-
The wait syscall (which os.wait is a wrapper for) reaps exit codes/pids from dead processes. You will want to os.wait in the process that is a generation above your zombie processes; the parent of the zombies processes. The parent processes will receive a SIGCHLD signal when one of its child processes die. If you insist on doing all of this yourself, you will need to install a signal handler to trap for SIGCHLD and in the signal handler call os.wait. Read some documentation on unix process handling and the Python documentation on the os module as there are variations of the os.wait function that will be non-blocking which maybe helpful.
import signal
signal.signal(signal.SIGCHLD, lambda _x,_y: os.wait())
I had a similar problem. I used active_children() from multiprocessing module.
import multiprocessing
# somewhere in middleware or where appropriate call
active_children()
Please help me in clarifying the concept of these two python statements in terms of difference in functionality:
sys.exit(0)
os._exit(0)
According to the documentation:
os._exit():
Exit the process with status n, without calling cleanup handlers, flushing stdio buffers, etc.
Note The standard way to exit is sys.exit(n). _exit() should normally only be used in the child process after a fork().
os._exit calls the C function _exit() which does an immediate program
termination. Note the statement "can never return".
sys.exit() is identical to raise SystemExit(). It raises a Python
exception which may be caught by the caller.
Original post: http://bytes.com/topic/python/answers/156121-os-_exit-vs-sys-exit
Excerpt from the book "The linux Programming Interface":
Programs generally don’t call _exit() directly, but instead call the exit() library function,
which performs various actions before calling _exit().
Exit handlers (functions registered with at_exit() and on_exit()) are called, in
reverse order of their registration
The stdio stream buffers are flushed.
The _exit() system call is invoked, using the value supplied in status.
Could someone expand on why _exit() should normally only be used in the child process after a fork()?
Instead of calling exit(), the child can call _exit(), so that it doesn’t flush stdio
buffers. This technique exemplifies a more general principle: in an application
that creates child processes, typically only one of the processes (most often the
parent) should terminate via exit(), while the other processes should terminate
via _exit(). This ensures that only one process calls exit handlers and flushes
stdio buffers, which is usually desirable