I'm spawning a script that runs for a long time from a web app, like this:
os.spawnle(os.P_NOWAIT, "../bin/producenotify.py", "producenotify.py", "xx", os.environ)
The script is spawned successfully and it runs, but until it finishes I am not able to free the port that is used by the web app, or in other words I am not able to restart the web app. How do I spawn off a process and make it completely independent of the web app?
This is on Linux.
As @mark clarified, it's a Linux system, so the script could easily make itself fully independent, i.e., a daemon, by following this recipe. (You could also do it in the parent after an os.fork, and only then os.exec... the child process.)
Edit: to clarify some details wrt @mark's comment on my answer: super-user privileges are not needed to "daemonize" a process as per the cookbook recipe, nor is there any need to change the current working directory (though the code in the recipe does do that and more, that's not the crucial part; rather, it's the proper logic sequence of fork, _exit and setsid calls). The various os.exec... variants that do not end in e use the parent process's environment, so that part is easy too; see the Python online docs.
To address suggestions made in other comments and answers: I believe subprocess and multiprocessing per se don't daemonize the child process, which seems to be what @mark needs; the script could do it for itself, but since some code has to be doing the forks and setsid calls, it seems neater to me to keep all of the spawning on that low-level plane rather than mix some high-level and some low-level code in the course of the operation.
Here's a vastly reduced and simplified version of the recipe at the above URL, tailored to be called in the parent to spawn a daemon child; this way, the code can be used to execute non-Python executables just as well. As given, the code should meet the needs @mark explained; of course it can be tailored in many ways. I strongly recommend reading the original recipe and its comments and discussions, as well as the books it recommends, for more information.
import os
import sys

def spawnDaemon(path_to_executable, *args):
    """Spawn a completely detached subprocess (i.e., a daemon).

    E.g. for mark:
    spawnDaemon("../bin/producenotify.py", "producenotify.py", "xx")
    """
    # fork the first time (to make a non-session-leader child process)
    try:
        pid = os.fork()
    except OSError as e:
        raise RuntimeError("1st fork failed: %s [%d]" % (e.strerror, e.errno))
    if pid != 0:
        # parent (calling) process is all done
        return

    # detach from the controlling terminal (to make the child a session leader)
    os.setsid()

    # fork a second time (so the grandchild is not a session leader and
    # can never reacquire a controlling terminal)
    try:
        pid = os.fork()
    except OSError as e:
        raise RuntimeError("2nd fork failed: %s [%d]" % (e.strerror, e.errno))
    if pid != 0:
        # child process is all done
        os._exit(0)

    # grandchild process is now detached from the parent and is not a
    # session leader; it must now close all open files
    try:
        maxfd = os.sysconf("SC_OPEN_MAX")
    except (AttributeError, ValueError):
        maxfd = 1024
    for fd in range(maxfd):
        try:
            os.close(fd)
        except OSError:  # fd wasn't open to begin with (ignored)
            pass

    # redirect stdin, stdout and stderr to /dev/null
    os.open(os.devnull, os.O_RDWR)  # standard input (0)
    os.dup2(0, 1)  # standard output (1)
    os.dup2(0, 2)  # standard error (2)

    # and finally let's execute the executable for the daemon!
    try:
        os.execv(path_to_executable, args)
    except Exception:
        # oops, we're cut off from the world, let's just give up
        os._exit(255)
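For instance, with this helper in place, the spawn from the question could become the following call (same paths and arguments as in the question):

spawnDaemon("../bin/producenotify.py", "producenotify.py", "xx")

Because the grandchild closes every inherited file descriptor, it no longer holds the web app's listening socket, so the port is free to be rebound when the web app restarts.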
You can use the multiprocessing library to spawn processes. A basic example is shown here:
from multiprocessing import Process
def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
I have a python script that does this:
p = subprocess.Popen(pythonscript.py, stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=False)
theStdin = request.input.encode('utf-8')
(outputhere, errorshere) = p.communicate(input=theStdin)
It works as expected; it waits for the subprocess to finish via p.communicate(). However, within pythonscript.py I want to "fire and forget" a "grandchild" process. I'm currently doing this by overriding the join method:
class EverLastingProcess(Process):
    def join(self, *args, **kwargs):
        pass  # Overrides join so that it doesn't block. Otherwise the parent waits.

    def __del__(self):
        pass
And starting it like this:
p = EverLastingProcess(target=nameOfMyFunction, args=(arg1, etc,), daemon=False)
p.start()
This also works fine when I just run pythonscript.py in a bash terminal or a bash script. Control returns, along with a response, while the child process started by EverLastingProcess keeps going. However, when I run pythonscript.py with Popen as shown above, the timings suggest that Popen is waiting on the grandchild to finish.
How can I make it so that the Popen only waits on the child process, and not any grandchild processes?
The solution in the answer below (using the join method override together with the shell=True addition) stopped working when we upgraded our Python recently.
There are many references on the internet about the pieces and parts of this, but it took me some doing to come up with a useful solution to the entire problem.
The following solution has been tested in Python 3.9.5 and 3.9.7.
Problem Synopsis
The names of the scripts match those in the code example below.
A top-level program (grandparent.py):
Uses subprocess.run or subprocess.Popen to call a program (parent.py)
Checks return value from parent.py for sanity.
Collects stdout and stderr from the main process 'parent.py'.
Does not want to wait around for the grandchild to complete.
The called program (parent.py):
Might do some stuff first.
Spawns a very long process (the grandchild - "longProcess" in the code below).
Might do a little more work.
Returns its results and exits while the grandchild (longProcess) continues doing what it does.
Solution Synopsis
The important part isn't so much what happens with subprocess. Instead, the method for creating the grandchild/longProcess is the critical part. It is necessary to ensure that the grandchild is truly emancipated from parent.py.
Subprocess only needs to be used in a way that captures output.
The longProcess (grandchild) needs the following to happen:
It should be started using multiprocessing.
It needs multiprocessing's 'daemon' set to False.
It should also be invoked using the double-fork procedure.
In the double-fork, extra work needs to be done to ensure that the process is truly separate from parent.py. Specifically:
Move the execution away from the environment of parent.py.
Use file handling to ensure that the grandchild no longer uses the file handles (stdin, stdout, stderr) inherited from parent.py.
Example Code
grandparent.py - calls parent.py using subprocess.run()
#!/usr/bin/env python3
import subprocess
p = subprocess.run(["/usr/bin/python3", "/path/to/parent.py"], capture_output=True)
## Comment the following if you don't need reassurance
print("The return code is: " + str(p.returncode))
print("The standard out is: ")
print(p.stdout)
print("The standard error is: ")
print(p.stderr)
parent.py - starts the longProcess/grandchild and exits, leaving the grandchild running. After 10 seconds, the grandchild will write timing info to /tmp/timelog.
#!/usr/bin/env python3
import time

def longProcess():
    time.sleep(10)
    fo = open("/tmp/timelog", "w")
    fo.write("I slept! The time now is: " + time.asctime(time.localtime()) + "\n")
    fo.close()
import os, sys

def spawnDaemon(func):
    # do the UNIX double-fork magic, see Stevens' "Advanced
    # Programming in the UNIX Environment" for details (ISBN 0201563177)
    try:
        pid = os.fork()
        if pid > 0:  # parent process
            return
    except OSError as e:
        print("fork #1 failed. See next.")
        print(e)
        sys.exit(1)

    # Decouple from the parent environment.
    os.chdir("/")
    os.setsid()
    os.umask(0)

    # do second fork
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent
            sys.exit(0)
    except OSError as e:
        print("fork #2 failed. See next.")
        print(e)
        sys.exit(1)

    # Redirect standard file descriptors.
    # Here, they are reassigned to /dev/null, but they could go elsewhere.
    sys.stdout.flush()
    sys.stderr.flush()
    si = open('/dev/null', 'r')
    so = open('/dev/null', 'a+')
    se = open('/dev/null', 'a+')
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    # Run your daemon
    func()

    # Ensure that the daemon exits when complete
    os._exit(os.EX_OK)
import multiprocessing

daemonicGrandchild = multiprocessing.Process(target=spawnDaemon, args=(longProcess,))
daemonicGrandchild.daemon = False
daemonicGrandchild.start()
print("have started the daemon")  # This will get captured as stdout by grandparent.py
References
The code above was mainly inspired by the following two resources.
This reference is succinct about the use of the double-fork but does not include the file handling we need in this situation.
This reference contains the needed file handling, but does many other things that we do not need.
Edit: the code below stopped working after a Python upgrade; see the accepted answer from Lachele.
Working answer from a colleague, change to shell=True like this:
p = subprocess.Popen(pythonscript.py, stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=True)
I've tested this, and the grandchild subprocesses stay alive after the child process returns, without waiting for them to finish.
I have the below code that I am running:
try:
    child = pexpect.spawn(
        ('some command --path {0} somethingmore --args {1}').format(
            <iterator-output>, something),
        timeout=300)
    child.logfile = open(file_name, 'w')
    child.expect('x*')
    child.sendline(something)
    child.expect('E*')
    child.sendline(something)
    #child.read()
    child.interact()
    time.sleep(15)
    print child.status
except Exception as e:
    print "Exception in child process"
    print str(e)
Now, the command in pexpect creates a subprocess by taking one of its inputs from a loop. Every time it spins up a subprocess, I try to capture the logs via child.read; in this case it waits for that subprocess to complete before going through the loop again. How do I keep it running in the background? (I get the logs of the command input/output that I enter dynamically, but not of the process that runs thereafter, unless I use read or interact.) I used How do I make a command to run in background using pexpect.spawn?, but that uses interact, which again waits for the subprocess to complete. Since the loop will be iterated more than 100 times, I cannot wait on one to complete before moving to the next. The command in pexpect is an AWS Lambda call, so all I need is to make sure the command is triggered, but I am not able to capture the process output of that call without waiting for it to complete. Please let me know your suggestions.
If you don't actually want to interact with lots of processes in parallel, but instead want to interact with each process briefly, then just ignore each one while it runs and move on to interacting with the next one…
# Do everything up to the final `interact`. After that, the child
# won't be writing to us anymore, but it will still be running for
# many seconds. So, return the child object so we can deal with it
# later, after we've started up all the other children.
def start_command(path, arg):
    try:
        child = pexpect.spawn(('some command --path {0} somethingmore --args {1}').format(path, arg), timeout=300)
        child.logfile = open(file_name, 'w')
        child.expect('x*')
        child.sendline(something)
        child.expect('E*')
        child.sendline(something)
        # child.read()
        child.interact()
        return child
    except Exception as e:
        print "Exception in child process"
        print str(e)
# First, start up all the children and do the initial interaction
# with each one.
children = []
for path, args in some_iterable:
    children.append(start_command(path, args))
# Now we just need to wait until they're all done. This will get
# them in as-launched order, rather than as-completed, but that
# seems like it should be fine for your use case.
for child in children:
    try:
        child.wait()
        print child.status
    except Exception as e:
        print "Exception in child process"
        print str(e)
A few things:
Notice from the code comments that I'm assuming the child isn't writing anything to us (and waiting for us to read it) after the initial interaction. If that's not true, things are a bit more complicated.
If you want to not only do this, but also spin up 8 children at a time, or even all of them at once, you can (as shown in my other answer) use an executor or just a mess of threads for the initial start_command calls, and have those tasks/threads return the child object to be waited on later. For example, with the Executor version, each future's result() will be a pexpect child process. However, you definitely need to read the pexpect docs on threads in that case; with some versions of Linux, passing child-process objects between threads can break the objects.
Finally, since you will now be seeing things much more out-of-order than the original version, you might want to change your print statements to show which child you're printing for (which also probably means changing children from a list of children to a list of (child, path, arg) tuples or the like).
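A rough sketch of that bookkeeping change (purely illustrative, reusing the names from the code above) might be:

# Keep (child, path, args) together so the final printouts can be labeled.
children = []
for path, args in some_iterable:
    children.append((start_command(path, args), path, args))

for child, path, args in children:
    try:
        child.wait()
        print('status for %s %s: %s' % (path, args, child.status))
    except Exception as e:
        print('Exception in child process %s %s: %s' % (path, args, e))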
If you want to run a process in the background, but at the same time interact with it, the simplest solution is to just kick off a thread to interact with the process.*
In your case, it sounds like you're running hundreds of processes, so you want to run some of them in parallel, but maybe not all of them at once? If so, you should use a thread pool or executor. For example, using concurrent.futures from the stdlib (or pip install the futures backport if your Python is too old):
def run_command(path, arg):
    try:
        child = pexpect.spawn(('some command --path {0} somethingmore --args {1}').format(path, arg), timeout=300)
        child.logfile = open(file_name, 'w')
        child.expect('x*')
        child.sendline(something)
        child.expect('E*')
        child.sendline(something)
        # child.read()
        child.interact()
        time.sleep(15)
        print child.status
    except Exception as e:
        print "Exception in child process"
        print str(e)
import concurrent.futures

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as x:
    fs = []
    for path, arg in some_iterable:
        fs.append(x.submit(run_command, path, arg))
    concurrent.futures.wait(fs)
If you need to return a value (or raise an exception) from the threaded code, you'll probably want a loop over as_completed(fs) instead of just plain wait. But here, you just seem to be printing stuff out and then forgetting it.
If the path, arg really do come straight out of an iterable like this, it's usually simpler to use x.map(run_command, some_iterable).
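A sketch of that map variant, assuming each element of some_iterable is a (path, arg) pair (Executor.map passes one element per call, so the pair is unpacked in a small wrapper):

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as x:
    # map consumes the iterable lazily; list() forces all tasks to run.
    list(x.map(lambda pair: run_command(*pair), some_iterable))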
All of this (and other options, too) is explained pretty nicely in the module docs.
Also see the pexpect FAQ and common problems. I don't think there are any issues that will affect you here in current versions (we're always spawning the child and interacting with it entirely in a single thread-pooled task), but I vaguely remember there used to be an additional problem in the past (something to do with signals?).
* I think asyncio would be a better solution, except that as far as I know none of the attempts to fork or reimplement pexpect in a nonblocking way are complete enough to actually use…
The child process is started with
subprocess.Popen(arg)
Is there a way to ensure it is killed when parent terminates abnormally? I need this to work both on Windows and Linux. I am aware of this solution for Linux.
Edit:
the requirement of starting a child process with subprocess.Popen(arg) can be relaxed, if a solution exists using a different method of starting a process.
Heh, I was just researching this myself yesterday! Assuming you can't alter the child program:
On Linux, prctl(PR_SET_PDEATHSIG, ...) is probably the only reliable choice. (If it's absolutely necessary that the child process be killed, then you might want to set the death signal to SIGKILL instead of SIGTERM; the code you linked to uses SIGTERM, but the child does have the option of ignoring SIGTERM if it wants to.)
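A minimal ctypes sketch of that prctl approach, assuming Linux and using Popen's preexec_fn (which runs in the child between fork and exec); the "sleep 1000" command is just a stand-in:

import ctypes
import signal
import subprocess

libc = ctypes.CDLL("libc.so.6", use_errno=True)
PR_SET_PDEATHSIG = 1  # from <sys/prctl.h>

def set_pdeathsig():
    # Runs in the child after fork, before exec: ask the kernel to
    # deliver SIGKILL to this process when the parent thread dies.
    libc.prctl(PR_SET_PDEATHSIG, signal.SIGKILL)

child = subprocess.Popen(["sleep", "1000"], preexec_fn=set_pdeathsig)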
On Windows, the most reliable option is to use a Job object. The idea is that you create a "Job" (a kind of container for processes), then you place the child process into the Job, and you set the magic option that says "when no-one holds a 'handle' for this Job, then kill the processes that are in it". By default, the only 'handle' to the job is the one that your parent process holds, and when the parent process dies, the OS will go through and close all its handles, and then notice that this means there are no open handles for the Job. So then it kills the child, as requested. (If you have multiple child processes, you can assign them all to the same job.)
This answer has sample code for doing this, using the win32api module. That code uses CreateProcess to launch the child, instead of subprocess.Popen. The reason is that they need to get a "process handle" for the spawned child, and CreateProcess returns this by default. If you'd rather use subprocess.Popen, then here's an (untested) copy of the code from that answer that uses subprocess.Popen and OpenProcess instead of CreateProcess:
import subprocess
import win32api
import win32con
import win32job
hJob = win32job.CreateJobObject(None, "")
extended_info = win32job.QueryInformationJobObject(hJob, win32job.JobObjectExtendedLimitInformation)
extended_info['BasicLimitInformation']['LimitFlags'] = win32job.JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE
win32job.SetInformationJobObject(hJob, win32job.JobObjectExtendedLimitInformation, extended_info)
child = subprocess.Popen(...)
# Convert process id to process handle:
perms = win32con.PROCESS_TERMINATE | win32con.PROCESS_SET_QUOTA
hProcess = win32api.OpenProcess(perms, False, child.pid)
win32job.AssignProcessToJobObject(hJob, hProcess)
Technically, there's a tiny race condition here in case the child dies in between the Popen and OpenProcess calls; you can decide whether you want to worry about that.
One downside to using a job object is that when running on Vista or Win7, if your program is launched from the Windows shell (i.e., by clicking on an icon), then there will probably already be a job object assigned and trying to create a new job object will fail. Win8 fixes this (by allowing job objects to be nested), or if your program is run from the command line then it should be fine.
If you can modify the child (e.g., like when using multiprocessing), then probably the best option is to somehow pass the parent's PID to the child (e.g. as a command line argument, or in the args= argument to multiprocessing.Process), and then:
On POSIX: Spawn a thread in the child that just calls os.getppid() occasionally, and if the return value ever stops matching the pid passed in from the parent, then call os._exit(). (This approach is portable to all Unixes, including OS X, while the prctl trick is Linux-specific.)
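For the POSIX case, a minimal sketch of such a watcher thread, assuming the parent passes its pid to the child as a command-line argument:

import os
import sys
import threading
import time

def watch_parent(parent_pid):
    # When the parent dies we get re-parented (to init or a subreaper),
    # so os.getppid() stops returning the original parent's pid.
    while True:
        if os.getppid() != parent_pid:
            os._exit(1)
        time.sleep(1)

parent_pid = int(sys.argv[1])  # hypothetical: parent passes its pid as argv[1]
threading.Thread(target=watch_parent, args=(parent_pid,), daemon=True).start()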
On Windows: Spawn a thread in the child that uses OpenProcess and os.waitpid. Example using ctypes:
import os
from ctypes import WinDLL, WinError
from ctypes.wintypes import DWORD, BOOL, HANDLE

# Magic value from http://msdn.microsoft.com/en-us/library/ms684880.aspx
SYNCHRONIZE = 0x00100000

kernel32 = WinDLL("kernel32.dll")
kernel32.OpenProcess.argtypes = (DWORD, BOOL, DWORD)
kernel32.OpenProcess.restype = HANDLE

parent_handle = kernel32.OpenProcess(SYNCHRONIZE, False, parent_pid)
# Block until the parent exits
os.waitpid(parent_handle, 0)
os._exit(0)
This avoids any of the possible issues with job objects that I mentioned.
If you want to be really, really sure, then you can combine all these solutions.
Hope that helps!
The Popen object offers the terminate and kill methods.
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.terminate
These send the SIGTERM and SIGKILL signals for you.
You can do something akin to the below:
from subprocess import Popen

p = None
try:
    p = Popen(arg)
    # some code here
except Exception as ex:
    print 'Parent program has exited with the below error:\n{0}'.format(ex)
    if p:
        p.terminate()
UPDATE:
You are correct: the above code will not protect against hard crashes or someone killing your process. In that case you can try wrapping the child process in a class and employ a polling model to watch the parent process.
Be aware psutil is non-standard.
import os
import psutil
from multiprocessing import Process
from time import sleep
class MyProcessAbstraction(object):
    def __init__(self, parent_pid, command):
        """
        :type parent_pid: int
        :type command: str
        """
        self._child = None
        self._cmd = command
        self._parent = psutil.Process(pid=parent_pid)

    def run_child(self):
        """
        Start a child process by running self._cmd.
        Wait until the parent process (self._parent) has died, then kill the
        child.
        """
        print '---- Running command: "%s" ----' % self._cmd
        self._child = psutil.Popen(self._cmd)
        try:
            while self._parent.status == psutil.STATUS_RUNNING:
                sleep(1)
        except psutil.NoSuchProcess:
            pass
        finally:
            print '---- Terminating child PID %s ----' % self._child.pid
            self._child.terminate()

if __name__ == "__main__":
    parent = os.getpid()
    child = MyProcessAbstraction(parent, 'ping -t localhost')
    child_proc = Process(target=child.run_child)
    child_proc.daemon = True
    child_proc.start()

    print '---- Try killing PID: %s ----' % parent
    while True:
        sleep(1)
In this example I run 'ping -t localhost' because that will run forever. If you kill the parent process, the child process (the ping command) will also be killed.
Since, from what I can tell, the PR_SET_PDEATHSIG solution can result in a deadlock when any threads are running in the parent process, I didn't want to use that and figured out another way. I created a separate auto-terminate process that detects when its parent process is done and kills the other subprocess that is its target.
To accomplish this, you need to pip install psutil, and then write code similar to the following:
import subprocess
import sys

def start_auto_cleanup_subprocess(target_pid):
    cleanup_script = f"""
import os
import psutil
import signal
from time import sleep

try:
    # Block until stdin is closed, which means the parent process
    # has terminated.
    input()
except Exception:
    # Should be an EOFError, but if any other exception happens,
    # assume we should respond in the same way.
    pass

if not psutil.pid_exists({target_pid}):
    # Target process has already exited, so nothing to do.
    exit()

os.kill({target_pid}, signal.SIGTERM)
for count in range(10):
    if not psutil.pid_exists({target_pid}):
        # Target process no longer running.
        exit()
    sleep(1)
os.kill({target_pid}, signal.SIGKILL)
# Don't bother waiting to see if this works since if it doesn't,
# there is nothing else we can do.
"""
    return subprocess.Popen(
        [
            sys.executable,  # Python executable
            '-c', cleanup_script,
        ],
        stdin=subprocess.PIPE,
    )
This is similar to https://stackoverflow.com/a/23436111/396373, which I had failed to notice, but I think the approach I came up with is easier to use, because the process that is the target of cleanup is created directly by the parent. Also note that it is not necessary to poll the status of the parent, though it is still necessary to use psutil and to poll the status of the target subprocess during the termination sequence if you want, as in this example, to terminate, monitor, and then kill if terminate didn't work expeditiously.
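Hypothetical usage, with "some-long-command" standing in for the real workload: spawn the target first, then the watcher. If the parent dies, the watcher's inherited stdin pipe closes, its input() raises EOFError, and it terminates the target.

import subprocess

target = subprocess.Popen(["some-long-command"])
watcher = start_auto_cleanup_subprocess(target.pid)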
Hook the exit of your process using SetConsoleCtrlHandler, and kill the subprocess there. I think I do a bit of overkill here, but it works :)
import os
import shlex
import subprocess

import psutil
import win32api

def kill_proc_tree(pid, including_parent=True):
    parent = psutil.Process(pid)
    children = parent.children(recursive=True)
    for child in children:
        child.kill()
    gone, still_alive = psutil.wait_procs(children, timeout=5)
    if including_parent:
        parent.kill()
        parent.wait(5)

def func(x):
    print("killed")
    if anotherproc:
        kill_proc_tree(anotherproc.pid)
    kill_proc_tree(os.getpid())

win32api.SetConsoleCtrlHandler(func, True)

PROCESSTORUN = "your process"
anotherproc = None
cmdline = f"/c start /wait \"{PROCESSTORUN}\" "
anotherproc = subprocess.Popen(executable='C:\\Windows\\system32\\cmd.EXE', args=shlex.split(cmdline, posix=False))
# ... run program ...
Took kill_proc_tree from:
subprocess: deleting child processes in Windows
I have a Python script and I want to launch an independent daemon process from it. I want to call my Python script, launch this system tray daemon, do some Python magic on a database file, and quit, leaving the system tray daemon running.
I have tried os.system, subprocess.call, subprocess.Popen, os.execl, but it always keeps my script alive until I close the system tray daemon.
This sounds like it should be a simple solution, but I can't get anything to work.
You can use a couple nifty Popen parameters to accomplish a truly detached process on Windows (thanks to greenhat for his answer here):
import subprocess
DETACHED_PROCESS = 0x00000008
results = subprocess.Popen(['notepad.exe'],
                           close_fds=True, creationflags=DETACHED_PROCESS)
print(results.pid)
See also this answer for a nifty cross-platform version (make sure to add close_fds though as it is critical for Windows).
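For completeness, a rough POSIX counterpart (my sketch, not from the answers linked above), using start_new_session (Python 3.2+) so the child gets its own session instead of the Windows creation flags; the command is a placeholder:

import subprocess

results = subprocess.Popen(
    ['some-long-running-program'],  # hypothetical command
    close_fds=True,
    start_new_session=True,         # runs setsid() in the child
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
print(results.pid)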
Solution for Windows: os.startfile()
Works as if you had double-clicked an executable, causing it to launch independently. A very handy one-liner.
http://docs.python.org/library/os.html?highlight=startfile#os.startfile
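A sketch, with a hypothetical path:

import os

# Launches the file with its associated application (Windows-only) and
# returns immediately; the new process is not a child of this script.
os.startfile(r"C:\path\to\tray_daemon.exe")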
I would recommend using the double-fork method.
Example:
import os
import sys
import time
def main():
    fh = open('log', 'a')
    while True:
        fh.write('Still alive!')
        fh.flush()
        time.sleep(1)

def _fork():
    try:
        pid = os.fork()
        if pid > 0:
            sys.exit(0)
    except OSError, e:
        print >>sys.stderr, 'Unable to fork: %d (%s)' % (e.errno, e.strerror)
        sys.exit(1)

def fork():
    _fork()
    # remove references from the main process
    os.chdir('/')
    os.setsid()
    os.umask(0)
    _fork()

if __name__ == '__main__':
    fork()
    main()
I'm trying to build a Python daemon that launches other fully independent processes.
The general idea is for a given shell command, poll every few seconds and ensure that exactly k instances of the command are running. We keep a directory of pidfiles, and when we poll we remove pidfiles whose pids are no longer running and start up (and make pidfiles for) however many processes we need to get to k of them.
The child processes also need to be fully independent, so that if the parent process dies the children won't be killed. From what I've read, it seems there is no way to do this with the subprocess module. To this end, I used the snippet mentioned here:
http://code.activestate.com/recipes/66012-fork-a-daemon-process-on-unix/
I made a couple of necessary modifications (you'll see the lines commented out in the attached snippet):
The original parent process can't exit because we need the launcher daemon to persist indefinitely.
The child processes need to start with the same cwd as the parent.
Here's my spawn fn and a test:
import os
import sys
import subprocess
import time
def spawn(cmd, child_cwd):
    """
    do the UNIX double-fork magic, see Stevens' "Advanced
    Programming in the UNIX Environment" for details (ISBN 0201563177)
    http://www.erlenstar.demon.co.uk/unix/faq_2.html#SEC16
    """
    try:
        pid = os.fork()
        if pid > 0:
            # exit first parent
            #sys.exit(0) # parent daemon needs to stay alive to launch more in the future
            return
    except OSError, e:
        sys.stderr.write("fork #1 failed: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    # decouple from parent environment
    #os.chdir("/") # we want the child processes to keep the parent's cwd
    os.setsid()
    os.umask(0)

    # do second fork
    try:
        pid = os.fork()
        if pid > 0:
            # exit from second parent
            sys.exit(0)
    except OSError, e:
        sys.stderr.write("fork #2 failed: %d (%s)\n" % (e.errno, e.strerror))
        sys.exit(1)

    # redirect standard file descriptors
    sys.stdout.flush()
    sys.stderr.flush()
    si = file('/dev/null', 'r')
    so = file('/dev/null', 'a+')
    se = file('/dev/null', 'a+', 0)
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    pid = subprocess.Popen(cmd, cwd=child_cwd, shell=True).pid

    # write pidfile
    with open('pids/%s.pid' % pid, 'w') as f: f.write(str(pid))
    sys.exit(1)
def mkdir_if_none(path):
    if not os.access(path, os.R_OK):
        os.mkdir(path)

if __name__ == '__main__':
    try:
        cmd = sys.argv[1]
        num = int(sys.argv[2])
    except:
        print 'Usage: %s <cmd> <num procs>' % __file__
        sys.exit(1)
    mkdir_if_none('pids')
    mkdir_if_none('test_cwd')

    for i in xrange(num):
        print 'spawning %d...' % i
        spawn(cmd, 'test_cwd')
        time.sleep(0.01) # give the system some breathing room
In this situation, things seem to work fine, and the child processes persist even when the parent is killed. However, I'm still running into a spawn limit on the original parent. After ~650 spawns (not concurrently, the children have finished) the parent process chokes with the error:
spawning 650...
fork #2 failed: 35 (Resource temporarily unavailable)
Is there any way to rewrite my spawn function so that I can spawn these independent child processes indefinitely? Thanks!
Thanks to your list of processes I'm willing to say that this is because you have hit one of a number of fundamental limitations:
rlimit nproc maximum number of processes a given user is allowed to execute -- see setrlimit(2), the bash(1) ulimit built-in, and /etc/security/limits.conf for details on per-user process limits.
rlimit nofile maximum number of file descriptors a given process is allowed to have open at once. (Each new process probably creates three new pipes in the parent, for the child's stdin, stdout, and stderr descriptors.)
System-wide maximum number of processes; see /proc/sys/kernel/pid_max.
System-wide maximum number of open files; see /proc/sys/fs/file-max.
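As a side note, the two per-process rlimits above can be inspected from Python with the resource module (Linux; each call returns a (soft, hard) pair):

import resource

print(resource.getrlimit(resource.RLIMIT_NPROC))   # max user processes
print(resource.getrlimit(resource.RLIMIT_NOFILE))  # max open file descriptors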
Because you're not reaping your dead children, many of these resources are held open longer than they should. Your second children are being properly handled by init(8) -- their parent is dead, so they are re-parented to init(8), and init(8) will clean up after them (wait(2)) when they die.
However, your program is responsible for cleaning up after the first set of children. C programs typically install a signal(7) handler for SIGCHLD that calls wait(2) or waitpid(2) to reap the children's exit status and thus remove its entries from the kernel's memory.
But signal handling in a script is a bit annoying. If you can set the SIGCHLD signal disposition to SIG_IGN explicitly, the kernel will know that you are not interested in the exit status and will reap the children for you.
Try adding:
import signal
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
near the top of your program.
Note that I don't know what this does for Subprocess. It might not be pleased. If that is the case, then you'll need to install a signal handler to call wait(2) for you.
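A minimal sketch of such a handler (my illustration, not from the original answer), reaping with waitpid and WNOHANG:

import os
import signal

def reap_children(signum, frame):
    # Reap every child that has already exited; WNOHANG keeps the
    # call from blocking when no more zombies are waiting.
    try:
        while True:
            pid, status = os.waitpid(-1, os.WNOHANG)
            if pid == 0:
                break
    except OSError:
        pass  # ECHILD: no children left at all

signal.signal(signal.SIGCHLD, reap_children)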
I slightly modified your code and was able to run 5000 processes without any issues. So I agree with @sarnold that you hit some fundamental limitation. My modifications are:
proc = subprocess.Popen(cmd, cwd=child_cwd, shell=True, close_fds=True)
pid = proc.pid
# write pidfile
with open('pids/%s.pid' % pid, 'w') as f: f.write(str(pid))
proc.wait()
sys.exit(1)