I have the following simplified code in Python:
import subprocess
import time

proc_args = "gzip --force file; echo this_still_prints > out"
post_proc = subprocess.Popen(proc_args, shell=True)
while True:
    time.sleep(1)
Assume file is big enough to take several seconds to process. If I close the Python process while gzip is still running, gzip will end, but the shell will still execute the line after the gzip command. I'd like to know why this happens, and whether there's a way to make it not continue executing the following commands.
Thank you!
A process exiting does not automatically cause all its child processes to be killed. See this question and its related questions for much discussion of this.
gzip exits because the pipe containing its standard input gets closed when the parent exits; it reads EOF and exits. However, the shell that's running the two commands is not reading from stdin, so it doesn't notice this. So it just continues on and executes the echo command (which also doesn't read stdin).
post_proc.kill() is, I believe, what you are looking for... but AFAIK you must call it explicitly.
see: http://docs.python.org/library/subprocess.html#subprocess.Popen.kill
I use try-finally in such cases (unfortunately you cannot employ with here the way you would with open() on a file):

import subprocess
import time

proc_args = "gzip --force file; echo this_still_prints > out"
post_proc = subprocess.Popen(proc_args, shell=True)
try:
    while True:
        time.sleep(1)
finally:
    post_proc.kill()
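One caveat with shell=True: kill() only signals the shell itself, not necessarily the command it is currently running. A sketch of one way around this (POSIX-only; the sleep command here stands in for the gzip pipeline) is to start the shell in its own session and then signal the whole process group:

```python
import os
import signal
import subprocess
import time

# Illustrative command standing in for "gzip ...; echo ..."
proc_args = "sleep 30; echo this_should_not_print"

# start_new_session=True (Python 3.2+) puts the shell in its own process group
post_proc = subprocess.Popen(proc_args, shell=True, start_new_session=True)
try:
    time.sleep(0.2)  # pretend to do some work
finally:
    # SIGTERM the whole group: the shell and whatever command it is running
    os.killpg(os.getpgid(post_proc.pid), signal.SIGTERM)
post_proc.wait()
```

This way neither sleep nor the echo that follows it survives the parent.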
Related
I have a script that uses a really simple file-based IPC to communicate with another program. I write a tmp file with the new content and mv it onto the IPC file to keep things atomic (the other program listens for rename events).
But now comes the catch: this works two or three times, but then the exchange gets stuck.
time.sleep(10)
# check lsof => target file not opened
subprocess.run(
"mv /tmp/tempfile /tmp/target",
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
universal_newlines=True,
shell=True,
)
# check lsof => target file STILL open
time.sleep(10)
/tmp/tempfile will get prepared for every write
The first run results in:
$ lsof /tmp/target
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 1714 <user> 3u REG 0,18 302 10058 /tmp/target
which leaves it open until I terminate the main Python program. Consecutive runs change the content, the inode and the file descriptor as expected, but the target is still open, which I would not expect from a mv.
The file finally gets closed when the Python program containing the lines above exits.
EDIT:
Found the bug: I was mishandling tempfile.mkstemp(). See: https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp
I created the tempfile like so:
_fd, temp_file_path = tempfile.mkstemp()
where I discarded the file descriptor _fd, which is open by default. I never closed it, so it stayed open even after the move. This resulted in an open target, and since I was only lsofing the target, I did not see that the tempfile was already open. This would be the corrected version (mkstemp returns a raw OS-level descriptor, so it has to be wrapped with os.fdopen or closed via os.close; it has no write method of its own):

import os
import tempfile

fd, temp_file_path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write(content)
# ... mv/rename via shell execution/shutil/pathlib
Thank you all very much for your help and your suggestions!
I wasn't able to reproduce this behavior. I created a file /tmp/tempfile and ran a Python script with the subprocess.run call you give, followed by a long sleep. /tmp/target was not in use, nor did I see any unexpected open files in lsof -p <pid>.
(edit) I'm not surprised at this, because there's no way that your subprocess command is opening the file: mv does not open its arguments (you can check this with ltrace), and subprocess.run does not parse its argument or do anything with it besides pass it along to be exec'ed.
However, when I added some lines to open a file and write to it and then move that file, I see the same behavior you describe. This is the code:
import subprocess
out=open('/tmp/tempfile', 'w')
out.write('hello')
subprocess.run(
"mv /tmp/tempfile /tmp/target",
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
universal_newlines=True,
shell=True,
)
import time
time.sleep(5000)
In this case, the file is still open because it was never closed, and even though it's been renamed the original file handle still exists. My bet would be that you have something similar in your code that's creating this file and leaving open a handle to it.
Is there any reason why you don't use shutil.move? Otherwise it may be necessary to wait for the mv command to finish moving before going on, e.g.

p = subprocess.Popen(...)
p.wait()  # wait for the move to finish
# p.terminate() if it has to be stopped early

Of course terminate() would be a bit harsh. (Note that subprocess.run, unlike Popen, already waits for the command to complete; its CompletedProcess result has no terminate() method.)
Edit: depending on your use case, rsync, which is not part of Python, may be an elegant solution to keep your data synced over the network without writing a single line of code.
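For completeness, here is a pure-Python sketch of the atomic-replace pattern with no shell at all (the paths and content are illustrative); it also avoids the open-descriptor pitfall from the question:

```python
import os
import tempfile

target = "/tmp/target_demo"   # illustrative target path
content = "new payload\n"     # illustrative content

# Create the temp file in the target's directory so the rename
# stays on one filesystem and remains atomic.
fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(target))
with os.fdopen(fd, "w") as f:  # wrapping the raw fd ensures it gets closed
    f.write(content)
os.replace(tmp_path, target)   # atomic rename on POSIX; fires the rename event
```

After this, lsof shows no handle left open on either path by the writing process.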
You say it is still open by mv, but your lsof result shows it open by python. Since mv runs as a subprocess, check whether the PID matches the main Python process; maybe it is another Python process.
I have a piece of code that is starting a process then reading from stdout to see if it has loaded OK.
After that, I'd ideally like to redirect the output to /dev/null or something that discards it. I was wondering (A) what the best practice is in this situation and (B) what will happen to the writing process if the pipe becomes full. Will it ever block when the pipe is full and is not being read/cleared?
If the aim is to redirect to /dev/null would it be possible to show me how to to this with python and subprocess.Popen?
proc = subprocess.Popen(command, stderr=subprocess.PIPE)
while True:
    if init_string in proc.stderr.readline():
        break
proc.stderr.redirect ??
As far as I know, there is no way to close and reopen file descriptors of a child process after it has started executing. And yes, there is a limited buffer in the OS, so if you don't consume anything from the pipe, eventually the child process will block. That means you'll just have to keep reading from the pipe until it's closed from the write end.
If you want your program to continue doing something useful in the meantime, consider moving the data-consuming part to a separate thread (untested):
from threading import Thread

def read_all_from_pipe(pipe):
    for line in pipe:  # assuming it's line-based
        pass

Thread(target=read_all_from_pipe, args=(proc.stderr,)).start()
There may be other ways to solve your problem, though. Why do you need to wait for some particular output in the first place? Shouldn't the child just die with a nonzero exit code if it didn't "load OK"? Can you instead check that the child is doing what it should, rather than that it's printing some arbitrary output?
If you would like to discard all the output:
python your_script.py > /dev/null
However, if you want to do it from Python you can use:
import sys
sys.stdout = open('file', 'w')
print 'this goes to file'
Every time you print, the standard output is redirected to the file "file"; change that to /dev/null or any file you want and you will obtain the desired result.
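If the child's output is not needed at all, a simpler sketch is to discard it at launch time instead of redirecting afterwards (subprocess.DEVNULL requires Python 3.3+; the command here is illustrative):

```python
import subprocess

# Route the child's stdout and stderr straight to /dev/null from the start
proc = subprocess.Popen(
    ["sh", "-c", "echo discarded; echo also_discarded >&2"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL,
)
proc.wait()
```

With no pipe attached, the child can never block on a full buffer either.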
I would like to run a section of code as long as a forked subprocess (rsync) is running. This is how I did it in my code:
rsync_proc = subprocess.Popen(proc_args, stdout=subprocess.PIPE)
while rsync_proc.poll() is None:
    sys.stdout.write('\r' + rsync_progress_report(source_size_kb, dest, start))
    sys.stdout.flush()
    time.sleep(1)
For some reason, this causes the rsync subprocess to get stuck when it's almost finished. The while loop just continues looping with the rsync_proc.poll() returning None.
When I do run this same rsync call without the while loop code, it finishes without a problem.
Thanks in advance.
If you attach strace to your stuck rsync child process, you'll probably see it's blocked writing to stdout.
If it's blocked writing to stdout, it's probably because the pipe is full because you never read from it.
Try reading from the pipe and just discarding the output - or, if you really don't want the output, don't connect it to a pipe in the first place.
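A minimal sketch of the read-and-discard option (the chatty seq command stands in for rsync):

```python
import subprocess

proc = subprocess.Popen(
    ["sh", "-c", "seq 1 100000"],  # stand-in for a verbose rsync
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
# Consume the pipe so the child can never block on a full buffer;
# the lines are simply thrown away.
for _line in proc.stdout:
    pass
proc.wait()
```

Alternatively, pass stdout=subprocess.DEVNULL at launch and skip the reading loop entirely.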
I have a python program which launches subprocesses using Popen and consumes their output nearly real-time as it is produced. The code of the relevant loop is:
def run(self, output_consumer):
    self.prepare_to_run()
    popen_args = self.get_popen_args()
    logging.debug("Calling popen with arguments %s" % popen_args)
    self.popen = subprocess.Popen(**popen_args)
    while True:
        outdata = self.popen.stdout.readline()
        if not outdata and self.popen.returncode is not None:
            # Terminate when we've read all the output and the returncode is set
            break
        output_consumer.process_output(outdata)
        self.popen.poll()  # updates returncode so we can exit the loop
    output_consumer.finish(self.popen.returncode)
    self.post_run()

def get_popen_args(self):
    return {
        'args': self.command,
        'shell': False,  # Just being explicit for security's sake
        'bufsize': 0,  # More likely to see what's being printed as it happens
        # Not guaranteed, since the process itself might buffer its output;
        # run `python -u` to unbuffer output of a python process
        'cwd': self.get_cwd(),
        'env': self.get_environment(),
        'stdout': subprocess.PIPE,
        'stderr': subprocess.STDOUT,
        'close_fds': True,  # Doesn't seem to matter
    }
This works great on my production machines, but on my dev machine, the call to .readline() hangs when certain subprocesses complete. That is, it will successfully process all of the output, including the final output line saying "process complete", but then will again poll readline and never return. This method exits properly on the dev machine for most of the sub-processes I call, but consistently fails to exit for one complex bash script that itself calls many sub-processes.
It's worth noting that popen.returncode gets set to a non-None (usually 0) value many lines before the end of the output. So I can't just break out of the loop when that is set or else I lose everything that gets spat out at the end of the process and is still buffered waiting for reading. The problem is that when I'm flushing the buffer at that point, I can't tell when I'm at the end because the last call to readline() hangs. Calling read() also hangs. Calling read(1) gets me every last character out, but also hangs after the final line. popen.stdout.closed is always False. How can I tell when I'm at the end?
All systems are running python 2.7.3 on Ubuntu 12.04LTS. FWIW, stderr is being merged with stdout using stderr=subprocess.STDOUT.
Why the difference? Is it failing to close stdout for some reason? Could the sub-sub-processes do something to keep it open somehow? Could it be because I'm launching the process from a terminal on my dev box, but in production it's launched as a daemon through supervisord? Would that change the way the pipes are processed and if so how do I normalize them?
The main code loop looks right. It could be that the pipe isn't closing because another process is keeping it open. For example, if the script launches a background process that writes to stdout, then the pipe will not close. Are you sure no other child process is still running?
One idea is to change modes once you see that returncode has been set. Once you know the main process is done, read all of its output from the buffer, but don't get stuck waiting: you can use select to read from the pipe with a timeout. Set a timeout of several seconds and you can clear the buffer without blocking on the child process.
Without knowing the contents of the "one complex bash script" which causes the problem, there's too many possibilities to determine the exact cause.
However, focusing on the fact that you claim it works if you run your Python script under supervisord, then it might be getting stuck if a sub-process is trying to read from stdin, or just behaves differently if stdin is a tty, which (I presume) supervisord will redirect from /dev/null.
This minimal example seems to cope better with cases where my example test.sh runs subprocesses which try to read from stdin...
import os
import subprocess

f = subprocess.Popen(args='./test.sh',
                     shell=False,
                     bufsize=0,
                     stdin=open(os.devnull, 'rb'),
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     close_fds=True)
while 1:
    s = f.stdout.readline()
    if not s and f.returncode is not None:
        break
    print s.strip()
    f.poll()
print "done %d" % f.returncode
Otherwise, you can always fall back to using a non-blocking read, and bail out when you get your final output line saying "process complete", although it's a bit of a hack.
If you use readline() or read(), it should not hang. No need to check returncode or poll(). If it is hanging when you know the process is finished, it is most probably a subprocess keeping your pipe open, as others said before.
There are two things you could do to debug this:
* Try to reproduce with a minimal script instead of the current complex one, or
* Run that complex script with strace -f -e clone,execve,exit_group and see what is that script starting, and if any process is surviving the main script (check when the main script calls exit_group, if strace is still waiting after that, you have a child still alive).
I find that calls to read (or readline) sometimes hang, despite previously calling poll. So I resorted to calling select to find out if there is readable data. However, select without a timeout can hang, too, if the process was closed. So I call select in a semi-busy loop with a tiny timeout for each iteration (see below).
I'm not sure if you can adapt this to readline, as readline might hang if the final \n is missing, or if the process doesn't close its stdout before you close its stdin and/or terminate it. You could wrap this in a generator, and every time you encounter a \n in stdout_collected, yield the current line.
Also note that in my actual code, I'm using pseudoterminals (pty) to wrap the popen handles (to more closely fake user input) but it should work without.
import os
import select
from datetime import datetime

# handle to read from
handle = self.popen.stdout
# how many seconds to wait without data
timeout = 1
begin = datetime.now()
stdout_collected = ""

while self.popen.poll() is None:
    try:
        fds = select.select([handle], [], [], 0.01)[0]
    except select.error, exc:
        print exc
        break
    if len(fds) == 0:
        # select timed out, no new data
        delta = (datetime.now() - begin).total_seconds()
        if delta > timeout:
            return stdout_collected
        # try longer
        continue
    else:
        # have data, timeout counter resets again
        begin = datetime.now()
        for fd in fds:
            if fd == handle:
                # os.read wants the raw descriptor, not the file object
                data = os.read(handle.fileno(), 1024)
                # can handle the bytes as they come in here
                # self._handle_stdout(data)
                stdout_collected += data

# process exited
# if using a pseudoterminal, close the handles here
self.popen.wait()
Why are you setting stderr to STDOUT?
The real benefit of making a communicate() call on a subprocess is that you are able to retrieve a tuple containing the stdout response as well as the stderr message.
Those might be useful if the logic depends on their success or failure.
Also, it would save you the pain of having to iterate through lines. communicate() gives you everything, and there would be no unresolved questions about whether or not the full message was received.
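A sketch of that approach, with separate pipes and an illustrative command:

```python
import subprocess

p = subprocess.Popen(
    ["sh", "-c", "echo out; echo err >&2"],  # writes to both streams
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
# communicate() reads both streams to EOF, then reaps the child,
# so there is no risk of deadlocking on a full pipe.
out, err = p.communicate()
```

After it returns, out and err hold the complete streams and p.returncode is set.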
I wrote a demo with a bash subprocess that can be easily explored.
A closed pipe can be recognized by '' in the output from readline(), while an empty line yields '\n'.
from subprocess import Popen, PIPE, STDOUT

p = Popen(['bash'], stdout=PIPE, stderr=STDOUT)
out = []
while True:
    outdata = p.stdout.readline()
    if not outdata:
        break
    #output_consumer.process_output(outdata)
    print "* " + repr(outdata)
    out.append(outdata)
print "* closed", repr(out)
print "* returncode", p.wait()
Example of input/output, closing the pipe distinctly before terminating the process. This is why wait() should be used instead of poll():
[prompt] $ python myscript.py
echo abc
* 'abc\n'
exec 1>&- # close stdout
exec 2>&- # close stderr
* closed ['abc\n']
exit
* returncode 0
[prompt] $
Your code did output a huge number of empty strings for this case.
Example: Fast terminated process without '\n' on the last line:
echo -n abc
exit
* 'abc'
* closed ['abc']
* returncode 0
Is there a way to check if a subprocess has finished its job? My Python script executes 123.exe from cmd. 123.exe then does some things and at the end writes 1 or 0 to a txt file. The Python script then reads the txt file and continues with the job if it's '1', and stops if it's '0'. At the moment, all I can think of is putting the Python script to sleep for a minute, by which time 123.exe has almost certainly written 1 or 0 to the file. This works, but it is as simple as it is stupid.
So my question is: is there a way to deal with this problem without the need for a timeout? A way to make the Python script wait until 123.exe stops?
Use:
retcode = subprocess.call(["123.exe"])
This will execute the command, wait until it finishes, and put its return code into retcode (this way you could also avoid the need to check the output status file, if the command returns a proper return code).
If you need to instantiate manually the subprocess.Popen, then go for
Popen.wait()
(and then check Popen.returncode).
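A quick sketch of the Popen variant, with the Unix true command standing in for 123.exe:

```python
import subprocess

p = subprocess.Popen(["true"])  # stand-in for ["123.exe"]
p.wait()                        # blocks until the process exits
# p.returncode now holds the exit status
```

subprocess.call is just this pattern bundled into one function.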
I would use the call() shorthand from the subprocess module. From the documentation:
Run command with arguments. Wait for command to complete, then return
the returncode attribute.
It should work this way:
import subprocess
retcode = subprocess.call(["123.exe"])
Of course you won't use the retcode here, but your script will hang anyway until 123.exe has terminated.
Try this:
import os

p = subprocess.Popen(...)
os.waitpid(p.pid, 0)
This will wait until the process is done and your script will continue from here.