I know os.setsid() sets the process group ID of the (forked) process to its own PID, but why do we need it?
One answer I found through Google is:
To keep the child process running when the parent process exits.
But according to my test below, the child process keeps running without os.setsid() even if the parent process exits (or is killed). So why do we need to add os.setsid()? Thanks.
import os
import time
import sys

mainPid = os.getpid()
print("Main Pid: %s" % mainPid)
pid = os.fork()
if pid > 0:
    time.sleep(3)
    print("Main process quit")
    sys.exit(0)
#os.setsid()
for x in range(1, 10):
    print("spid: %s, ppid: %s pgid: %s" % (os.getpid(), os.getppid(), os.getpgid(0)))
    time.sleep(1)
Calling setsid is usually one of the steps a process goes through when becoming a so-called daemon process. (We are talking about Linux/Unix here.)
With setsid the association with the controlling terminal is broken. This means that the process will NOT be affected by a logout.
There are other ways to survive a logout, but the purpose of this 'daemonizing' procedure is to create a background process that is as independent from the outside world as possible.
That's why all inherited descriptors are closed; cwd is set to an appropriate directory, often the root directory; and the process leaves the session it was started from.
A double fork approach is generally recommended. At each fork the parent exits and the child continues. Actually nothing changes except the PID, but that's exactly what is needed here.
The first fork, before the setsid, makes sure the process is not a process group leader. That is required for a successful setsid.
The second fork, after the setsid, makes sure the process is no longer a session leader, so it cannot acquire a new controlling terminal merely by opening a terminal device.
NOTE: when a daemon process is started from systemd, systemd can arrange everything described above so the process does not have to.
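The steps above can be sketched roughly as follows. This is a minimal illustration, not a complete daemonizer: the name `run_detached` is made up for this example, only the three standard descriptors are redirected (a real daemon would typically close every inherited descriptor), and the original process is kept alive so it can reap the first child.

```python
import os


def run_detached(func):
    """Run func() in a double-forked grandchild that has left the caller's session.

    Hypothetical helper for illustration. Returns in the original process
    as soon as the short-lived first child has exited.
    """
    pid = os.fork()
    if pid > 0:
        # Original process: reap the first child so it does not become a zombie.
        os.waitpid(pid, 0)
        return

    try:
        # First child: it is not a process group leader, so setsid() succeeds.
        os.setsid()

        if os.fork() > 0:
            # The session leader exits right away.
            os._exit(0)

        # Grandchild: not a session leader, so opening a terminal device
        # cannot give it a controlling terminal again.
        os.chdir("/")
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)
        if devnull > 2:
            os.close(devnull)

        func()
    finally:
        # Never fall back into the caller's code from a forked process.
        os._exit(0)
```

Calling `run_detached(some_function)` returns almost immediately in the caller, while `some_function` keeps running in a process whose session ID differs from the caller's.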
Well, the double fork to daemonize is a good example. However, it's better to first understand what process groups and sessions are.
Session ID (SID)
This is just the PID of the session leader. If PID == SID, then this process is a session leader.
Sessions and process groups are just ways to treat a number of related processes as a unit. All the members of a process group always belong to the same session, but a session may have multiple process groups.
Normally, a shell will be a session leader, and every pipeline executed by that shell will be a process group. This is to make it easy to kill the children of a shell when it exits. (See exit(3) for the gory details.)
Basically, if you log into a machine, your shell starts a session. If you want to keep your process running even when you log out, you should start a new session for the child.
The difference from a double-forked process is that you can still attach a controlling terminal to that process, since it's a session leader, whereas a daemon process created by a double fork cannot be attached to a terminal anymore.
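To see these identifiers in practice, here is a small sketch that forks a child, optionally calls setsid() in it, and reports the child's PID, process group ID and session ID back through a pipe. The helper names `ids` and `child_ids` are invented for this illustration.

```python
import os


def ids():
    """Return (pid, pgid, sid) for the calling process."""
    return os.getpid(), os.getpgid(0), os.getsid(0)


def child_ids(new_session):
    """Fork a child, optionally call setsid() in it, and return its (pid, pgid, sid)."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report its identifiers, then exit without running cleanup handlers.
        try:
            os.close(r)
            if new_session:
                os.setsid()
            os.write(w, ("%d %d %d" % ids()).encode())
        finally:
            os._exit(0)
    os.close(w)
    data = os.read(r, 64).decode()
    os.close(r)
    os.waitpid(pid, 0)  # reap the child
    return tuple(int(x) for x in data.split())


if __name__ == "__main__":
    print("parent              :", ids())
    print("child without setsid:", child_ids(False))
    print("child with setsid   :", child_ids(True))
```

Without setsid() the child shares the parent's process group and session. With setsid() all three of the child's numbers are equal, meaning it leads both a new session and a new process group.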
In some cases, the child process will be able to continue running even after the parent exits, but this is not foolproof. The child will also exit when the parent exits in some situations.
As of Python 3.2, you can use subprocess.Popen() and pass start_new_session=True to fully detach the child process from the parent.
The docs state:
If start_new_session is true the setsid() system call will be made in the child process prior to the execution of the subprocess. (POSIX only)
https://docs.python.org/3/library/subprocess.html#subprocess.Popen
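A short sketch of that option, assuming a POSIX system. The wrapper name `spawn_in_new_session` is hypothetical, and redirecting the standard streams to DEVNULL is an extra choice made here so the child does not keep the parent's terminal or pipes open.

```python
import os
import subprocess
import sys


def spawn_in_new_session(argv):
    """Start argv in its own session and return the Popen object."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # setsid() runs in the child before exec
    )


if __name__ == "__main__":
    child = spawn_in_new_session(
        [sys.executable, "-c", "import time; time.sleep(2)"]
    )
    print("parent session id:", os.getsid(0))
    print("child session id :", os.getsid(child.pid))
    child.wait()
```

Because the child called setsid(), its session ID equals its own PID rather than the parent's session ID.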
Related
I am trying to write a socket server that uses fork in Python. A new fork is created when a client connects, and this forked process handles the connection, including send/receive.
I ran this script on CentOS Linux and monitored resources with htop/top to see how many forks (tasks) are shown.
The problem is that when I kill a fork using os._exit(0), htop does not change (naturally the count should decrease as forks are killed), and when I close the Python script everything goes back to normal (RAM usage and tasks).
So what do I have to do so that killing a fork with os._exit(0) is reflected in htop, in other words so that it releases all its resources and does not wait until its own parent is killed?
Here is the code that creates the forks:
import os

def test(sock):
    # handle the socket, then return
    pass

for i in range(1000):
    # `socket` is assumed to be the already-listening server socket object
    sock, addr = socket.accept()
    pid = os.fork()
    if pid == 0:
        test(sock)
        os._exit(0)
    elif pid != -1:
        os.waitpid(-1, os.WNOHANG)
The parent process needs to wait for the child in order for the child process' resources to be released. Until then the process still exists in a "zombie" state, and it will still appear in ps and top etc.
You can call one of os.wait(), os.waitpid(), os.wait3(), or os.wait4().
os.wait3() with the os.WNOHANG option might be most useful to you, as it will wait for any child process and the parent will not block until a child terminates (or its state changes; wait will return child processes that have been stopped or restarted too).
More details on the underlying system calls can be found in the Linux man page: man 2 wait.
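As an illustration of non-blocking reaping, the sketch below uses os.waitpid(-1, os.WNOHANG) in a loop, so that several children that exited since the last call are all collected. The function name `reap_finished_children` is made up for this example; a server could call something like it once per accept() iteration.

```python
import os
import time


def reap_finished_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, status) pairs for the children that were reaped.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


if __name__ == "__main__":
    for _ in range(3):
        if os.fork() == 0:
            os._exit(0)  # child: exits at once and stays a zombie until reaped
    time.sleep(0.5)  # give the children time to exit
    print("reaped:", reap_finished_children())
```

A single waitpid call per new connection, as in the question, reaps at most one child each time, so zombies can pile up whenever children exit faster than connections arrive.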
I'm wondering if this is the correct way to execute a system process and detach from parent, though allowing the parent to exit without creating a zombie and/or killing the child process. I'm currently using the subprocess module and doing this...
os.setsid()
os.umask(0)
p = subprocess.Popen(['nc', '-l', '8888'],
                     cwd=self.home,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
os.setsid() changes the process group, which I believe is what lets the process continue running when its parent exits, as it no longer belongs to the same process group.
Is this correct and also is this a reliable way of performing this?
Basically, I have a remote control utility that communicates through sockets and allows processes to be started remotely, but I have to ensure that if the remote control dies, the processes it started continue running unaffected.
I was reading about double forks and am not sure whether this is necessary, or whether subprocess.Popen with close_fds somehow takes care of that and all that's needed is to change the process group.
Thanks.
Ilya
For Python 3.8.x, the process is a bit different. Use the start_new_session parameter available since Python 3.2:
import shlex
import subprocess
cmd = "<full filepath plus arguments of child process>"
cmds = shlex.split(cmd)
p = subprocess.Popen(cmds, start_new_session=True)
This will allow the parent process to exit while the child process continues to run. Not sure about zombies.
The start_new_session parameter is supported on all POSIX systems, e.g. Linux, macOS, etc.
Tested on Python 3.8.1 on macOS 10.15.5
Popen on Unix is implemented using fork. That means you'll be safe if:
1. you run Popen in your parent process
2. you immediately exit the parent process
When the parent process exits, the child process is inherited by the init process (launchd on OS X) and will still run in the background.
The first two lines of your Python program are not needed; this works perfectly:
import subprocess
p = subprocess.Popen(['nc', '-l', '8888'],
                     cwd="/",
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
I was reading about double-forks and not sure if this is necessary
This would be needed if your parent process keeps running and you need to protect your children from dying with the parent. This answer shows how this can be done.
How the double fork works:
1. create a child via os.fork()
2. in this child, call Popen(), which launches the long-running process
3. exit the child: the Popen process is inherited by init and runs in the background
Why does the parent have to exit immediately? What happens if it doesn't exit immediately?
If you leave the parent running and the user stops the process e.g. via ctrl-C (SIGINT) or ctrl-\ (SIGQUIT) then it would kill both the parent process and the Popen process.
What if it exits one second after forking?
Then, during this 1s period your Popen process is vulnerable to ctrl-c etc. If you need to be 100% sure then use the double forking.
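The double-fork steps listed above might look like the sketch below. The function name `launch_detached` and the pipe used to hand the PID back are inventions for this example. Passing start_new_session=True is also an addition that is not part of the listed steps: Ctrl-C is delivered to the terminal's foreground process group, so the sketch moves the new process out of that group instead of relying on the fork alone.

```python
import os
import subprocess
import sys


def launch_detached(argv):
    """Start argv from a short-lived intermediate child and return its PID.

    Hypothetical helper: the started process is not a child of the caller,
    so the caller never has to wait for it.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Intermediate child: start the real process, report its PID, exit.
        try:
            os.close(r)
            proc = subprocess.Popen(
                argv,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                start_new_session=True,  # assumption: also leave the terminal's group
            )
            os.write(w, str(proc.pid).encode())
        finally:
            os._exit(0)
    os.close(w)
    data = os.read(r, 32)
    os.close(r)
    os.waitpid(pid, 0)  # reap the intermediate child
    return int(data)  # raises ValueError if the launch failed


if __name__ == "__main__":
    pid = launch_detached([sys.executable, "-c", "import time; time.sleep(5)"])
    print("started detached process", pid)
```

Once the intermediate child has exited, the launched process is re-parented (to init or a subreaper), so it is reaped there when it finishes rather than lingering as a zombie of the caller.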
I am trying to implement a job queuing system like Torque PBS on a cluster.
One requirement is to kill all the subprocesses even after the parent has exited. This is important because if someone's job doesn't wait for its subprocesses to end, deliberately or unintentionally, the subprocesses become orphans and get adopted by the init process, and then it is difficult to track them down and kill them.
However, I figured out a trick to work around the problem: the magic trait is the CPU affinity of the subprocesses, because all subprocesses have the same CPU affinity as their parent. But this is not perfect, because the CPU affinity can be changed deliberately too.
I would like to know if there is anything else that is shared by a parent process and its offspring and is at the same time immutable.
The process table in Linux (as in nearly every other operating system) is simply a data structure in the RAM of a computer. It holds information about the processes that are currently handled by the OS.
This information includes general information about each process:
- process ID
- process owner
- process priority
- environment variables for each process
- the parent process
- pointers to the executable machine code of a process
Credit goes to Marcus Gründler
None of the available information will help you out.
But you could perhaps use the fact that the process should stop when its parent process ID becomes 1 (init).
#!/usr/local/bin/python
from time import sleep
import os
import sys

# os.getppid() returns the parent's PID
while os.getppid() != 1:
    sleep(1)

# now that the parent PID is 1 (we were adopted by init), exit the program
sys.exit()
Would that be a solution to your problem?
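A different approach, offered only as a hedged sketch and not as something the answer above proposes: start each job as the leader of its own session and process group, then signal the whole group with os.killpg(). The names `start_job` and `kill_job` are hypothetical. This is not immutable either, since a subprocess can leave the group by calling setsid() or setpgid() itself, so it only catches cooperative or careless jobs.

```python
import os
import signal
import subprocess
import sys


def start_job(argv):
    """Start a job as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_job(job, sig=signal.SIGTERM):
    """Signal every process still in the job's process group, then reap the leader."""
    try:
        # After setsid() the group ID equals the leader's PID.
        os.killpg(job.pid, sig)
    except ProcessLookupError:
        pass  # the group is already empty
    job.wait()  # blocks if the leader ignores or handles the signal


if __name__ == "__main__":
    script = (
        "import subprocess, sys, time;"
        "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(60)']);"
        "time.sleep(60)"
    )
    job = start_job([sys.executable, "-c", script])
    kill_job(job)
    print("job leader exit status:", job.returncode)
```

Subprocesses that stay in the group receive the signal even if the job's leader has already exited and they have been adopted by init.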
In one of my Django views, I am calling a python script and getting its pid with:
from subprocess import Popen
p = Popen(['python', 'script.py'])
mypid = p.pid
When trying to find out if the process still is running from another page, I use the following function on mypid (thanks to this question):
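import errno
import os

def doesProcessExist(pid):
    if pid < 0:
        return False
    try:
        # signal 0 only checks for existence and permission
        os.kill(pid, 0)
    except OSError as e:
        # EPERM means the process exists but belongs to another user
        return e.errno == errno.EPERM
    else:
        return True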
No matter how long I wait, the process still shows up as running. The only thing that stops it is spawning a new Python script process with Popen. Is there any way I can fix this? I am not sure if this is caused by Django not closing Python properly after the script is finished or by something else. In Ubuntu's process status manager, the process shows up as [python] <defunct>.
--
The problem occurs with every script.py I have tried. I am currently using one as simple as:
from time import sleep
sleep(5)
Really, what you're doing is wrong. When you use a high-level wrapper like subprocess.Popen, you need to manage the process through that object. Just having the PID elsewhere isn't enough to manage it.
If you insist on dealing in PIDs instead of Popen objects, then you should use the low-level APIs in os.
Fortunately, you're not doing anything complicated, like creating pipes to talk to the child process. So, you can just launch it with your favorite spawn variant, then wait for it with waitpid or one of its variants.
I'm assuming you're doing this all in a single-process web server. If you're using a forking web server, where the other page could be in a different process, even using PIDs won't work. The parent process has to reap the child, not some other arbitrary process. If you want to make that work, you'll have to make things more complicated, and you're really going to have to learn about the Unix process model before anyone can explain it to you.
What you see is a zombie process. It doesn't keep running. It can't. It is dead. The only thing that is left is some info that allows for related processes to retrieve its status.
To find out whether a subprocess is alive without blocking, call p.poll(). If it returns None then the process is still alive, otherwise you can safely forget about it (it is already reaped by .poll()).
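A minimal sketch of that check. The wrapper name `is_running` is invented here; the Popen object has to be kept around (for example in a module-level dict keyed by PID, which is an assumption about how the view would store it).

```python
import subprocess
import sys
import time


def is_running(proc):
    """Return True while proc is alive; once it has exited, poll() also reaps it."""
    return proc.poll() is None


if __name__ == "__main__":
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])
    while is_running(proc):
        time.sleep(0.2)
    print("exit status:", proc.returncode)
```

After poll() has returned a non-None value, the child no longer shows up as defunct, because its exit status has been collected.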
The subprocess module calls a _cleanup() function inside the Popen() constructor that reaps zombie processes. So normally your script won't create many zombie processes anyway.
To see a list of zombie processes:
import os

# NOTE: don't use Popen() here
print(os.popen(r"ps aux | grep Z | grep -v grep").read(), end="")
Processes in Unix stick around until the parent waits for them. Calling wait on the object returned by Popen will block until the process is done and then reap it so it goes away. Until you do that, it will exist as a zombie process. See this message for info on getting the process to go away in the background while your web server runs, without waiting for it in a foreground thread/view.
So, let's say that you do
p = subprocess.Popen(...)
At some point you need to call
p.wait()