Exit and clean up a Python fork

I am trying to write a forking socket server in Python: whenever a client connects, a new child process is forked, and that child handles the connection, including send/receive.
I ran this script on Linux (CentOS) and monitored resources with htop/top to see how many forked tasks are shown. The problem is that when a child exits via os._exit(0), htop doesn't change (naturally the task count should decrease as children die), whereas when I close the Python script everything goes back to normal (RAM usage and tasks).
What do I have to do so that a child killed with os._exit(0) actually disappears from htop, in other words releases all its resources immediately instead of lingering until its parent is killed?
Here is the code that creates the forks:
def test(sock):
    # handle the socket, then return
    ...

for i in range(1000):
    sock, addr = socket.accept()   # 'socket' here is the listening socket object
    pid = os.fork()
    if pid == 0:                   # child: handle the connection, then exit
        test(sock)
        os._exit(0)
    elif pid != -1:                # parent: try to reap a finished child
        os.waitpid(-1, os.WNOHANG)

The parent process needs to wait for the child in order for the child process' resources to be released. Until then the process still exists in a "zombie" state, and it will still appear in ps and top etc.
You can call one of os.wait(), os.waitpid(), os.wait3(), or os.wait4().
os.wait3() with the os.WNOHANG option might be most useful to you, as it waits for any child process without blocking the parent until a child terminates (or its state changes: wait will also return child processes that have been stopped or continued).
More details on the underlying system calls can be found in the Linux man page: man 2 wait.
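As a sketch (using an illustrative listening socket named server, which is not in the original code), the accept loop can reap every already-finished child on each iteration without blocking:

import os
import socket

def handle(conn):
    # receive/send on the connection, then close it
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 8000))
server.listen(5)

while True:
    conn, addr = server.accept()
    pid = os.fork()
    if pid == 0:          # child
        server.close()    # the child only needs the accepted connection
        handle(conn)
        os._exit(0)
    conn.close()          # the parent only needs the listening socket
    # reap all children that have already exited, without blocking
    while True:
        try:
            done, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break         # no children left at all
        if done == 0:
            break         # children exist, but none has exited yet

A SIGCHLD handler that calls os.waitpid() in the same kind of loop would reap children even when no new connection arrives.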

Related

Daemon threads vs daemon processes in Python

Based on the Python documentation, daemon threads are threads that die once the main thread dies. This seems to be the complete opposite of daemon processes, which involve creating a child process and terminating the parent process so that init takes over the child process (i.e. killing the parent process does NOT kill the child process).
So why do daemon threads die when the parent dies? Is "daemon" a misnomer here? I would think that "daemon" threads would keep running after the main process has been terminated.
It's just names meaning different things in different contexts.
In case you are not aware, multiprocessing.Process, like threading.Thread, can also be flagged as "daemon". Your description of "daemon processes" fits Unix daemons, not Python's daemon processes.
The docs also have a section about Process.daemon:
... Note that a daemonic process is not allowed to create child processes.
Otherwise a daemonic process would leave its children orphaned if it
gets terminated when its parent process exits. Additionally, these are
not Unix daemons or services, they are normal processes that will be
terminated (and not joined) if non-daemonic processes have exited.
The only thing in common between Python's daemon-processes and Unix-daemons (or Windows "Services") is that you would use them for background-tasks
(for Python: only an option for tasks which don't need proper clean up on shutdown, though).
Python imposes its own abstraction layer on top of OS threads and processes. The daemon attribute for Thread and Process is about this OS-independent, Python-level abstraction.
At the Python level, a daemon thread is a thread that doesn't get joined (awaited to exit voluntarily) when the main thread exits, and a daemon process is a process that gets terminated (not joined) when the parent process exits. Daemon threads and daemon processes share the same behavior in that their natural exit is not awaited when the main thread or parent process shuts down. That's all.
Note that Windows doesn't even have the concept of "related processes" like Unix, but Python implements this relationship of "child" and "parent" in a cross-platform manner.
I would think that "daemon" threads would keep running after the main process has been terminated.
A thread cannot exist outside of a process. A process always hosts and gives context to at least one thread.
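A minimal sketch that shows the difference (the sleep durations are arbitrary): the daemon thread is abandoned and the daemon process is terminated when the main thread exits, so neither ever prints.

import time
from threading import Thread
from multiprocessing import Process

def work():
    time.sleep(10)        # stands in for a long-running background task
    print("finished")     # never reached when run as a daemon here

if __name__ == "__main__":
    t = Thread(target=work, daemon=True)    # abandoned (not joined) at exit
    p = Process(target=work, daemon=True)   # terminated (not joined) at exit
    t.start()
    p.start()
    time.sleep(1)   # main thread ends here: t is abandoned, p is terminated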

Why use os.setsid() in Python?

I know os.setsid() makes the forked process the leader of a new session and process group, but why do we need it?
The answers I can find via Google say:
To keep the child process running while the parent process exits.
But according to my test below, even without os.setsid() the child process keeps running when the parent process exits (or is killed). So why do we need to add os.setsid()? Thanks.
import os
import time
import sys

mainPid = os.getpid()
print("Main Pid: %s" % mainPid)

pid = os.fork()
if pid > 0:
    time.sleep(3)
    print("Main process quit")
    sys.exit(0)

# os.setsid()   # commented out for the test
for x in range(1, 10):
    print("spid: %s, ppid: %s pgid: %s" % (os.getpid(), os.getppid(), os.getpgid(0)))
    time.sleep(1)
Calling setsid is usually one of the steps a process goes through when becoming a so-called daemon process. (We are talking about Linux/Unix OS.)
With setsid the association with the controlling terminal breaks. This means that the process will NOT be affected by a logout.
There are other ways to survive a logout, but the purpose of this 'daemonizing' process is to create a background process as independent from the outside world as possible.
That's why all inherited descriptors are closed; the cwd is set to an appropriate directory, often the root directory; and the process leaves the session it was started from.
A double fork approach is generally recommended. At each fork the parent exits and the child continues. Actually nothing changes except the PID, but that's exactly what is needed here.
The first fork, before the setsid, makes sure the process is not a process group leader. That is required for a successful setsid.
The second fork, after the setsid, makes sure that a new association with a controlling terminal won't be created merely by opening a terminal device.
NOTE: when a daemon process is started from systemd, systemd can arrange everything described above, so the process does not have to.
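A minimal sketch of the double fork described above (the chdir, umask, and stdio redirection are common daemonizing conventions, not requirements of setsid itself):

import os
import sys

def daemonize():
    if os.fork() > 0:    # first fork: the parent exits; the child is
        sys.exit(0)      # guaranteed not to be a process group leader
    os.setsid()          # new session, no controlling terminal
    if os.fork() > 0:    # second fork: the session leader exits; the
        sys.exit(0)      # grandchild can never reacquire a terminal
    os.chdir("/")        # don't keep any mount point busy
    os.umask(0)
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):               # detach stdin/stdout/stderr
        os.dup2(devnull, fd)

daemonize()
# ... the daemon's real work starts here ...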
Well, the double fork to daemonize is a good example. However, it's better to first understand what a process group and a session are.
Session ID (SID)
This is just the PID of the session leader. If PID == SID, then this process is a session leader.
Sessions and process groups are just ways to treat a number of related processes as a unit. All the members of a process group always belong to the same session, but a session may have multiple process groups.
Normally, a shell will be a session leader, and every pipeline executed by that shell will be a process group. This is to make it easy to kill the children of a shell when it exits. (See exit(3) for the gory details.)
Basically, if you log into a machine, your shell starts a session. If you want to keep your process running even when you log out, you should start a new session for the child.
The difference from the double-forked process is that a plain session leader can still acquire a controlling terminal, whereas the daemon created by the double fork cannot be attached to a terminal anymore.
In some cases the child process will be able to continue running even after the parent exits, but this is not foolproof; in some situations the child will exit along with the parent.
As of Python 3.2, you can use subprocess.Popen() and pass start_new_session=True to fully detach the child process from the parent.
The docs state:
If start_new_session is true the setsid() system call will be made in the child process prior to the execution of the subprocess. (POSIX only)
https://docs.python.org/3/library/subprocess.html#subprocess.Popen
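For example (the command and duration are purely illustrative):

import subprocess

# setsid() is called in the child before exec, so the child becomes the
# leader of a new session, detached from this process and its terminal
child = subprocess.Popen(["sleep", "300"], start_new_session=True)
print("detached child pid:", child.pid)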

Kill a created subprocess and all processes created by it

What do I want? To create a script that starts and kills a communication protocol.
What do I have?
I have a Python script that runs a shell script, and this shell script initializes the protocol. When I kill the parent process, everything goes fine (but in the final project the parent process will have to stay alive); however, when I kill the subprocess, it becomes a zombie process and my protocol keeps running.
What I believe the problem to be: I'm killing the shell script, not the protocol (and the protocol is what I actually want to kill).
The line where I start the shell script:
protocolProcess = subprocess.Popen(["sh", arquivo], cwd=localDoArquivo)  # starts the protocol
protocolProcessPID = protocolProcess.pid  # stores the pid of protocolProcess
The line where I kill the shell script: os.kill(protocolPID, signal.SIGTERM)
Well, that's it! If anyone can help me, I'll be very grateful.
Zombie processes are processes that have not yet been reaped by the parent process.
The parent process will hold onto those process handles until the end of time, until it reads the child's exit status, or until it is itself killed.
It sounds like the parent process needs a better handle on how it spawns and reaps its children. Simply killing a child process is not enough: the zombie is only freed once it has been waited on.
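One common fix (a sketch reusing the question's variable names, with placeholder values; start_new_session requires Python 3.2+) is to run the shell script in its own process group and signal the whole group, so that both the shell and the protocol it spawned receive the signal. The parent must still wait() to avoid the zombie:

import os
import signal
import subprocess

arquivo = "protocol.sh"          # placeholder for the question's script name
localDoArquivo = "/path/to/dir"  # placeholder for the question's cwd

# run the shell script as the leader of a new session/process group
protocolProcess = subprocess.Popen(["sh", arquivo], cwd=localDoArquivo,
                                   start_new_session=True)

# signal the whole group (killpg), not just the shell, so the protocol dies too
os.killpg(os.getpgid(protocolProcess.pid), signal.SIGTERM)

# reap the shell so it doesn't linger as a zombie
protocolProcess.wait()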

Python: kill all subprocesses even after the parent has exited

I am trying to implement a job queuing system like Torque PBS on a cluster.
One requirement is to kill all the subprocesses even after the parent has exited. This is important because if someone's job doesn't wait for its subprocesses to end, deliberately or unintentionally, the subprocesses become orphans, get adopted by init, and are then difficult to track down and kill.
However, I figured out a trick to work around the problem: the magic trait is the CPU affinity of the subprocesses, because all subprocesses have the same CPU affinity as their parent. But this is not perfect, because the CPU affinity can also be changed deliberately.
I would like to know whether there is anything else that is shared by a parent process and its offspring and is, at the same time, immutable.
The process table in Linux (as in nearly every other operating system) is simply a data structure in the RAM of a computer. It holds information about the processes that are currently handled by the OS.
This includes general information about each process:
process id
process owner
process priority
environment variables for each process
the parent process
pointers to the executable machine code of a process.
Credit goes to Marcus Gründler
None of the information available will help you out.
But you can perhaps use the fact that the process should stop when its parent process id becomes 1 (init).
#!/usr/local/bin/python
from time import sleep
import os
import sys

# os.getppid() returns the parent's pid
while os.getppid() != 1:
    sleep(1)

# the parent has died and we were reparented to init (pid 1), so exit too
sys.exit()
Would that be a solution to your problem?

Python subprocess script keeps running after it is done

In one of my Django views, I am calling a python script and getting its pid with:
from subprocess import Popen
p = Popen(['python', 'script.py'])
mypid = p.pid
When trying to find out if the process still is running from another page, I use the following function on mypid (thanks to this question):
import os
import errno

def doesProcessExist(pid):
    if pid < 0:
        return False
    try:
        os.kill(pid, 0)           # signal 0 only checks existence/permission
    except OSError as e:
        return e.errno == errno.EPERM
    else:
        return True
No matter how long I wait, the process still shows up as running. The only thing that stops it is if I spawn a new Python script process with Popen. Is there any way I can fix this? I am not sure if this is caused by Django not closing Python properly after the script is finished, or by something else. In Ubuntu's process status manager, the process shows up as [python] <defunct>.
--
The problem is true for all script.py I have tried. I am currently using one as simple as:
from time import sleep
sleep(5)
Really, what you're doing is wrong. When you use a high-level wrapper like subprocess.Popen, you need to manage the process through that object. Just having the PID elsewhere isn't enough to manage it.
If you insist on dealing in PIDs instead of Popen objects, then you should use the low-level APIs in os.
Fortunately, you're not doing anything complicated, like creating pipes to talk to the child process. So, you can just launch it with your favorite spawn variant, then wait for it with waitpid or one of its variants.
I'm assuming you're doing this all in a single-process web server. If you're using a forking web server, where the other page could be in a different process, even using PIDs won't work. The parent process has to reap the child, not some other arbitrary process. If you want to make that work, you'll have to make things more complicated, and you're really going to have to learn about the Unix process model before anyone can explain it to you.
What you see is a zombie process. It doesn't keep running. It can't. It is dead. The only thing that is left is some info that allows the parent process to retrieve its exit status.
To find out whether a subprocess is alive without blocking, call p.poll(). If it returns None then the process is still alive, otherwise you can safely forget about it (it is already reaped by .poll()).
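For instance, instead of storing the bare PID, store the Popen object (here in a module-level dict, one illustrative way to share it between views in a single-process server) and poll it:

from subprocess import Popen

processes = {}   # pid -> Popen, shared within one server process

def start_script():
    p = Popen(["python", "script.py"])
    processes[p.pid] = p
    return p.pid

def is_running(pid):
    p = processes.get(pid)
    if p is None:
        return False
    return p.poll() is None   # None means still running; otherwise it's reaped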
The subprocess module calls a _cleanup() function that reaps zombie processes inside the Popen() constructor, so normally your script won't create many zombie processes anyway.
To see a list of zombie processes:
import os
# NOTE: don't use Popen() here
print(os.popen(r"ps aux | grep Z | grep -v grep").read(), end="")
Processes in Unix stick around until the parent waits for them. Calling wait() on the object returned by Popen will block until the process is done and reap it so that it goes away. Until you do that, it will exist as a zombie process. See this message for info on getting the process to go away in the background while your web server runs, without waiting for it in a foreground thread/view.
So, let's say that you do
p = subprocess.Popen(...)
At some point you need to call
p.wait()
