Python subprocess script keeps running after it is done - python

In one of my Django views, I am calling a python script and getting its pid with:
from subprocess import Popen
p = Popen(['python', 'script.py'])
mypid = p.pid
When trying to find out from another page whether the process is still running, I use the following function on mypid (thanks to this question):
import os, errno

def doesProcessExist(pid):
    if pid < 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 does not kill; it only checks whether the pid exists
    except OSError as e:
        return e.errno == errno.EPERM
    else:
        return True
No matter how long I wait, the process still shows up as running. The only thing that stops it is spawning a new python script process with Popen. Is there any way I can fix this? I am not sure whether this is caused by Django not closing Python properly after the script finishes, or by something else. In Ubuntu's process status manager, the process shows up as [python] <defunct>.
--
The problem occurs with every script.py I have tried. I am currently using one as simple as:
from time import sleep
sleep(5)

Really, what you're doing is wrong. When you use a high-level wrapper like subprocess.Popen, you need to manage the process through that object. Just having the PID somewhere else isn't enough to manage it.
If you insist on dealing in PIDs instead of Popen objects, then you should use the low-level APIs in os.
Fortunately, you're not doing anything complicated, like creating pipes to talk to the child process. So, you can just launch it with your favorite spawn variant, then wait for it with waitpid or one of its variants.
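For example, a rough sketch of that low-level route (the spawnlp arguments are illustrative, and the check has to run in the same process that spawned the child):
import os

# Launch the script without waiting for it.
pid = os.spawnlp(os.P_NOWAIT, 'python', 'python', 'script.py')

# Later, still in the parent process: reap the child if it has already finished.
finished_pid, status = os.waitpid(pid, os.WNOHANG)
if finished_pid == 0:
    print('still running')
else:
    print('exited with status %s' % os.WEXITSTATUS(status))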
I'm assuming you're doing this all in a single-process web server. If you're using a forking web server, where the other page could be in a different process, even using PIDs won't work. The parent process has to reap the child, not some other arbitrary process. If you want to make that work, you'll have to make things more complicated, and you're really going to have to learn about the Unix process model before anyone can explain it to you.

What you see is a zombie process. It doesn't keep running. It can't. It is dead. The only thing left is some bookkeeping info that allows related processes to retrieve its exit status.
To find out whether a subprocess is alive without blocking, call p.poll(). If it returns None then the process is still alive, otherwise you can safely forget about it (it is already reaped by .poll()).
The subprocess module calls an internal _cleanup() function inside the Popen() constructor, which reaps zombie processes. So normally your script won't accumulate many zombies anyway.
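For example, a minimal sketch of the poll() approach (assuming you keep the Popen object around rather than just its pid):
import subprocess

p = subprocess.Popen(['python', 'script.py'])

# Later, e.g. when the status page is requested in the same process:
if p.poll() is None:
    print('still running')
else:
    # poll() has already reaped the child, so no zombie is left behind
    print('finished with return code %s' % p.returncode)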
To see a list of zombie processes:
import os
#NOTE: don't use Popen() here
print os.popen(r"ps aux | grep Z | grep -v grep").read(),

Processes in Unix stick around until the parent waits for them. Calling wait on the object returned by Popen will block until the process is done and reap it so that it goes away. Until you do that, it will exist as a zombie process. See this message for info on getting the process to go away in the background while your web server runs, without waiting for it in a foreground thread/view.
So, let's say that you do
p = subprocess.Popen(...)
At some point you need to call
p.wait()
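If you don't want a view to block on that call, one option (just a sketch, not tied to any particular web framework) is to reap the child from a background thread:
import subprocess
import threading

p = subprocess.Popen(['python', 'script.py'])

def reaper(proc):
    proc.wait()  # blocks this thread only, then reaps the child
    print('child exited with %s' % proc.returncode)

threading.Thread(target=reaper, args=(p,)).start()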


Why use os.setsid() in Python?

I know os.setsid() changes the (forked) child process's group id to its own pid, but why do we need it?
One answer I found from Google is:
To keep the child process running while the parent process exits.
But according to my test below, without os.setsid() the child process does not exit either, even when the parent process exits (or is killed). So why do we need to add os.setsid()? Thanks.
import os
import time
import sys

mainPid = os.getpid()
print("Main Pid: %s" % mainPid)

pid = os.fork()
if pid > 0:
    time.sleep(3)
    print("Main process quit")
    sys.exit(0)

# os.setsid()
for x in range(1, 10):
    print("spid: %s, ppid: %s pgid: %s" % (os.getpid(), os.getppid(), os.getpgid(0)))
    time.sleep(1)
Calling setsid is usually one of the steps a process goes through when becoming a so-called daemon process. (We are talking about Linux/Unix OS.)
With setsid the association with the controlling terminal is broken. This means that the process will NOT be affected by a logout.
There are other ways to survive a logout, but the purpose of this 'daemonizing' process is to create a background process that is as independent from the outside world as possible.
That's why all inherited descriptors are closed; cwd is set to an appropriate directory, often the root directory; and the process leaves the session it was started from.
A double fork approach is generally recommended. At each fork the parent exits and the child continues. Actually nothing changes except the PID, but that's exactly what is needed here.
The first fork, before the setsid, makes sure the process is not a process group leader. That is required for a successful setsid.
The second fork after the setsid makes sure that a new association with a controlling terminal won't be started merely by opening a terminal device.
NOTE: when a daemon process is started from systemd, systemd can arrange everything described above, so the process does not have to.
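A bare-bones sketch of that double-fork pattern (error handling and the usual umask/descriptor cleanup trimmed for brevity):
import os
import sys

def daemonize():
    if os.fork() > 0:   # first fork: parent exits, child is not a group leader
        sys.exit(0)
    os.setsid()         # new session, no controlling terminal
    if os.fork() > 0:   # second fork: the session leader exits, so the
        sys.exit(0)     # grandchild can never reacquire a controlling terminal
    os.chdir('/')       # don't pin the directory the daemon was started from
    # ... close or redirect stdin, stdout and stderr here ...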
Well, double fork to daemonize is a good example. However, it's better to first understand what process groups and sessions are.
Session ID (SID)
This is just the PID of the session leader. If PID == SID, then this process is a session leader.
Sessions and process groups are just ways to treat a number of related processes as a unit. All the members of a process group always belong to the same session, but a session may have multiple process groups.
Normally, a shell will be a session leader, and every pipeline executed by that shell will be a process group. This is to make it easy to kill the children of a shell when it exits. (See exit(3) for the gory details.)
Basically, if you log into a machine, your shell starts a session. If you want to keep your process running even when you log out, you should start a new session for the child.
The difference from a double-forked process is that a single-fork daemon is still a session leader, so a controlling terminal can still be attached to it, whereas the daemon created by a double fork can no longer be attached to a terminal.
In some cases, the child process will be able to continue running even after the parent exits, but this is not foolproof. The child will also exit when the parent exits in some situations.
As of Python 3.2, you can pass start_new_session=True to subprocess.Popen() to fully detach the child process from the parent.
The docs state:
If start_new_session is true the setsid() system call will be made in the child process prior to the execution of the subprocess. (POSIX only)
https://docs.python.org/3/library/subprocess.html#subprocess.Popen
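For example (a sketch; 'long_running.py' is a placeholder):
import subprocess

# POSIX only: the child calls setsid() before exec, so it gets its own session
# and is no longer tied to the parent's controlling terminal.
child = subprocess.Popen(['python', 'long_running.py'], start_new_session=True)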

Get a signal once a subprocess ends

I am using the subprocess.Popen function to run a command line. Without having to use Popen.wait(), I want to check the subprocess after it has finished using Popen.poll(). Any suggestions on how to do this?
import subprocess
job = subprocess.Popen('command line', shell = True)
print(job.poll())
As it is, I get job.poll() printed before the subprocess starts. I want it to wait until it ends. I don't want to use wait because the rest of the user interface becomes unusable until the process ends. This is in PyQt4.
As Python - wait on a condition without high cpu usage says, there are only two ways to wait for something: polling, or setting up/using a notification system.
Since this is a UI, don't forget about one notification system you always have: the message queue.
Besides, you can always (and, in a UI, should always) perform time-consuming tasks in worker threads. In that case, you are fine with a synchronous call.
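A rough sketch of the worker-thread idea using plain threading and a callback (in PyQt4 you would typically emit a signal from the thread instead, so the UI gets updated in the GUI thread):
import subprocess
import threading

def on_done(returncode):
    print('job finished with code %s' % returncode)

def run_job(cmd, callback):
    job = subprocess.Popen(cmd, shell=True)
    job.wait()                # blocks this worker thread only, not the UI
    callback(job.returncode)  # notify whoever is interested

threading.Thread(target=run_job, args=('command line', on_done)).start()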

Retrieve exit code of processes launched with multiprocessing.Pool.map

I'm using python multiprocessing module to parallelize some computationally heavy tasks.
The obvious choice is to use a Pool of workers and then use the map method.
However, processes can fail. For instance, they may be silently killed by the oom-killer. Therefore I would like to be able to retrieve the exit code of the processes launched with map.
Additionally, for logging purposes, I would like to know the PID of the process launched to execute each value in the iterable.
If you're using multiprocessing.Pool.map you're generally not interested in the exit code of the sub-processes in the pool, you're interested in what value they returned from their work item. This is because under normal conditions, the processes in a Pool won't exit until you close/join the pool, so there's no exit codes to retrieve until all work is complete, and the Pool is about to be destroyed. Because of this, there is no public API to get the exit codes of those sub-processes.
Now, you're worried about exceptional conditions, where something out-of-band kills one of the sub-processes while it's doing work. If you hit an issue like this, you're probably going to run into some strange behavior. In fact, in my tests where I killed a process in a Pool while it was doing work as part of a map call, map never completed, because the killed process didn't complete. Python did, however, immediately launch a new process to replace the one I killed.
That said, you can get the pid of each process in your pool by accessing the multiprocessing.Process objects inside the pool directly, using the private _pool attribute:
pool = multiprocessing.Pool()
for proc in pool._pool:
    print proc.pid
So, one thing you could do is try to detect when a process has died unexpectedly (assuming you don't get stuck in a blocking call as a result). You can do this by examining the list of processes in the pool before and after making a call to map_async:
before = pool._pool[:]  # Make a copy of the list of Process objects in our pool
result = pool.map_async(func, iterable)  # Use map_async so we don't get stuck.
while not result.ready():  # Wait for the call to complete
    if any(proc.exitcode for proc in before):  # Abort if one of our original processes is dead.
        print "One of our processes has exited. Something probably went horribly wrong."
        break
    result.wait(timeout=1)
else:  # We'll enter this block if we don't reach `break` above.
    print result.get()  # Actually fetch the result list here.
We have to make a copy of the list because when a process in the Pool dies, Python immediately replaces it with a new process, and removes the dead one from the list.
This worked for me in my tests, but because it's relying on a private attribute of the Pool object (_pool) it's risky to use in production code. I would also suggest that it may be overkill to worry too much about this scenario, since it's very unlikely to occur and complicates the implementation significantly.

Pausing a process or thread in Python

I've created a little audio player that looks up a chapter of an audiobook at a website I've specified, downloads it, and plays it. The only problem is, I can't pause it. To play the mp3 file I'm using os.system("afplay chapter.mp3")
I've thought about creating a thread with the os.system call in it but I'm pretty sure I can't pause it that way. If the thread was a loop I could just lock a variable it needs to access and unlock when I'm ready to resume. But since this thread would be just one line of code that doesn't seem possible. I've also looked at creating a process and sending SIGSTOP to it. But for some unknown reason that won't work.
import os, signal
from multiprocessing import Process

def play():
    os.system("afplay chapter.mp3")

p = Process(target=play)
p.start()
raw_input("press enter to pause: ")
os.kill(p.pid, signal.SIGSTOP)
The code just executes silently without stopping the process.
I know there are alternatives to afplay, but for now I'm just going to stick with the os.system call. So my question is: how can I pause a one-line thread or process? Instead of creating a new process with the Process() call, do I need to find the process id of afplay? If so, how?
os.system creates a child process and waits for that child process to exit. You can use os.execv to replace a process with another program, or use subprocess.Popen to create a child process whose pid you can find with Popen.pid.
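A sketch of that approach (assumes a POSIX system with the afplay command, and Python 2's raw_input as in the question):
import os
import signal
import subprocess

player = subprocess.Popen(['afplay', 'chapter.mp3'])  # player.pid is afplay itself

raw_input('press enter to pause: ')
os.kill(player.pid, signal.SIGSTOP)   # pause playback

raw_input('press enter to resume: ')
os.kill(player.pid, signal.SIGCONT)   # resume playback

player.wait()  # reap afplay once it finishes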
Why don't you just use the subprocess module instead of multiprocessing and os.system? That should give you better control over the spawned processes.

subprocess.wait() not waiting for Popen process to finish (when using threads)?

I am experiencing some problems when using subprocess.Popen() to spawn several instances of the same application from my python script, using threads to have them running simultaneously. In each thread I run the application using the Popen() call, and then I wait for it to finish by calling wait(). The problem seems to be that the wait() call does not actually wait for the process to finish. I experimented by using only one thread, and by printing out text messages when the process starts and when it finishes. So the thread function would look something like this:
def worker():
    while True:
        job = q.get()  # q is a global Queue of jobs
        print('Starting process %d' % job['id'])
        proc = subprocess.Popen(job['cmd'], shell=True)
        proc.wait()
        print('Finished process %d' % job['id'])
        q.task_done()
But even when I only use one thread, it will print out several "Starting process..." messages, before any "Finished process..." message appears. Are there any cases when wait() does not actually wait? I have several different external applications (C++ console applications), which in turn will have several instances running simultaneously, and for some of them my code works, but for others it won't. Could there be some issue with the external applications that somehow affects the call to wait()?
The code for creating the threads looks something like this:
for i in range(1):
    t = Thread(target=worker)
    t.daemon = True
    t.start()

q.join()  # Wait for the queue to empty
Update 1:
I should also add that for some of the external applications I sometimes get a return code (proc.returncode) of -1073471801. For example, one of the external applications will give that return code the first two times Popen is called, but not the last two (when I have four jobs).
Update 2:
To clear things up, right now I have four jobs in the queue, which are four different test cases. When I run my code, for one of the external applications the first two Popen-calls generate the return code -1073471801. But if I print the exact command which Popen calls, and run it in a command window, it executes without any problems.
Solved!
I managed to solve the issues I was having. I think the problem was my lack of experience in threaded programming. I missed the fact that when I had created my first worker threads, they would keep on living until the python script exits. By mistake I created more worker threads each time I put new items on the queue (I do that in batches for every external program I want to run). So by the time I got to the fourth external application, I had four threads running simultaneously even though I only thought I had one.
You could also use check_call() instead of Popen. check_call() waits for the command to finish, even when shell=True, and raises CalledProcessError if the command exits with a non-zero code.
Sadly, when running your subprocess using shell=True, wait() will only wait for the sh subprocess to finish, and not for the command cmd.
I would suggest not using shell=True if possible. If that is not possible, you can create a process group, as in this answer, and use os.waitpid to wait for the process group, not just the shell process.
Hope this was helpful :)
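A rough sketch of that process-group idea (POSIX only; the command string is a placeholder). Because the shell and everything it starts share the new group, os.killpg(pgid, sig) can also signal them all at once:
import os
import subprocess

# The child calls setsid(), so the shell gets its own process group
# whose id equals proc.pid.
proc = subprocess.Popen('some_command --with-args', shell=True,
                        preexec_fn=os.setsid)

# Wait for a child in that group rather than just the shell process.
os.waitpid(-proc.pid, 0)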
Make sure all applications you are calling return valid system exit codes when they finish.
I was having issues as well, but was inspired by yours.
Mine looks like this, and works beautifully:
startupinfo = subprocess.STARTUPINFO()
startupinfo.dwFlags = subprocess.STARTF_USESHOWWINDOW
startupinfo.wShowWindow = subprocess.SW_HIDE
proc = subprocess.Popen(command, startupinfo=startupinfo)
proc.communicate()
proc.wait()
Notice that this one hides the window as well.
