Unable to debug python script with subprocess

I have written a python function which runs another python script in a remote desktop using PSTools (psexec). I have run the script successfully several times when the function is called only once. But when I call the function multiple times from another file, the subprocess does not run in the second call. In fact it immediately quits the entire program in the second iteration without throwing any exception or Traceback.
controlpc_clean_command = self.psexecpath + ' -s -i 2 -d -u ' + self.controlPClogin + ' -p ' + self.controlPCpwd + ' \\' + self.controlPCIPaddr + ' cmd.exe /k ' + self.controlPC_clean_code
logfilePath = self.psexeclogs + 'Ctrl_Clean_Log.txt'
logfile = file(logfilePath,'w')
try:
    process = subprocess.Popen(controlpc_clean_command, stdout = subprocess.PIPE,stderr = subprocess.PIPE)
    for line in process.stderr:
        print "** INSIDE SUBPROCESS STDERR TO START PSEXEC **\n"
        sys.stderr.write(line)
        logfile.write(line)
    process.wait()
except OSError:
    print "********COULD NOT FIND PSEXEC.EXE, PLEASE REINSTALL AND SET THE PATH VARIABLE PROPERLY********\n"
The above code runs perfectly once. Even if I run it from a different python file with different parameters, it runs fine. The problem happens when I call the function more than once from one file: in the second call the function quits after printing "** INSIDE SUBPROCESS STDERR TO START PSEXEC **\n", and nothing in the main program is printed after that.
I am unable to figure out how to debug this issue, as I have no idea where the program goes after printing this line. How do I debug this?
Edit:
After doing some search, I added
stdout, stderr = process.communicate()
after the subprocess.Popen line in my script. Now I am able to proceed with the code, but with one problem: nothing is written to the logfile 'Ctrl_Clean_Log.txt' after adding process.communicate(). How can I write to the file as well as proceed with the code?

Maybe your first process is stuck waiting and blocking other processes.
https://docs.python.org/2/library/subprocess.html
Popen.wait()
Wait for child process to terminate. Set and return returncode attribute.
Warning This will deadlock when using stdout=PIPE and/or stderr=PIPE and the
child process generates enough output to a pipe such that it blocks waiting
for the OS pipe buffer to accept more data. Use communicate() to avoid that.
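As a minimal sketch of how communicate() can be combined with logging (assuming Python 3; run_and_log is a hypothetical helper, and the command and log path are whatever the caller passes in, not the psexec invocation from the question), the captured stderr can be written to the file after communicate() returns:

```python
import subprocess


def run_and_log(command, log_path):
    """Run command to completion and write its stderr to log_path.

    Hypothetical helper for illustration only.
    """
    process = subprocess.Popen(command, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE,
                               universal_newlines=True)
    # communicate() reads both pipes until EOF and then waits for the child,
    # so a full pipe buffer cannot block the child the way wait() alone can.
    out, err = process.communicate()
    # The pipes are already consumed at this point, so iterating over
    # process.stderr would yield nothing; log the returned string instead.
    with open(log_path, "w") as logfile:
        logfile.write(err)
    return process.returncode, out, err
```

Because communicate() consumes the pipes itself, a later loop over process.stderr has nothing left to read, which would explain an empty log file.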

Related

How to capture stdout of shell after switching users

I'm making a shell with python. So far I have gotten cd to work (not pretty I know, but it's all I need for now). When I su root (for example) I get a root shell, but I can't capture the output I receive after running a command. However the shell does accept my commands, as when I type exit it exits. Is there a way to capture the output of a 'new' shell?
import os, subprocess
while True:
    command = input("$ ")
    if len(command.split(" ")) >= 2:
        print(command.split(" ")[0]) #This line is for debugging
        if command.split(" ")[0] == "cd" or command.split(" ")[1] == "cd":
            os.chdir(command.split(" ")[command.split(" ").index("cd") + 1])
            continue
    process = subprocess.Popen(command.split(), stdout=subprocess.PIPE, universal_newlines=True)
    output, error = process.communicate()
    print(output.strip("\n"))
EDIT: To make my request a bit more precise, I'd like a way to authenticate as another user from a python script, basically catching the authentication, doing it in the background and then starting a new subprocess.

You really need to understand how subprocess.Popen works. This command executes a new sub-process (on a Unix machine, it calls fork and then exec). The new sub-process is a separate process. Your code just calls communicate once and then discards it.
If you just create a new shell by calling subprocess.Popen and then running su <user> inside of it, the shell will be closed right after that and the next time, you'll be running the command using the same (original) user again.
What you want is probably to create a single subprocess at the beginning of your application and then be a sort of a proxy between the user and the underlying process, and then just keep writing to its stdin and reading from stdout.
Here's an example:
import os, subprocess
process = subprocess.Popen(["bash"], stdin=subprocess.PIPE,
                           stdout=subprocess.PIPE, universal_newlines=True)
while True:
    command = input("$ ")
    process.stdin.write(command + "\n")
    process.stdin.flush()
    output = process.stdout.readline()
    print(output.strip("\n"))
(I removed the cd command parsing bit because it wasn't constructive to understanding the solution here, but you can definitely add specific handlers for specific inputs that wrap the underlying shell)
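The example above reads a single line per command, so multi-line output gets out of step. One possible way around that, sketched here under assumptions (ShellSession and the marker scheme are hypothetical, not part of the answer), is to ask the shell to print a unique marker after every command and read until it appears:

```python
import subprocess
import uuid


class ShellSession:
    """Keep one shell process alive and run several commands through it."""

    def __init__(self, shell_path="bash"):
        self.process = subprocess.Popen(
            [shell_path], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            universal_newlines=True)

    def run(self, command):
        # A unique marker line shows where this command's output ends.
        marker = "__END_" + uuid.uuid4().hex + "__"
        self.process.stdin.write(command + "\n")
        # printf emits a newline first, so the marker always starts a fresh
        # line even when the command's output lacks a trailing newline.
        self.process.stdin.write("printf '\\n%s\\n' " + marker + "\n")
        self.process.stdin.flush()
        lines = []
        for line in iter(self.process.stdout.readline, ""):
            if line.rstrip("\n") == marker:
                # Drop the extra newline that printf put before the marker.
                return "".join(lines)[:-1]
            lines.append(line)
        return "".join(lines)  # EOF: the shell exited before the marker

    def close(self):
        self.process.stdin.close()
        self.process.wait()
```

State such as the working directory persists between run() calls because the same shell handles all of them. This sketch captures stdout only, and a command that reads from stdin would swallow the marker line, so it is not a general answer for password prompts such as the one su shows.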

How to use Popen with an interactive command? nslookup, ftp

Is there any way to use Popen with interactive commands? I mean nslookup, ftp, powershell... I read the whole subprocess documentation several times but I can't find the way.
What I have (removing the parts of the project which aren't of interest here) is:
from subprocess import call, PIPE, Popen
command = raw_input('>>> ')
command = command.split(' ')
process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
execution = process.stdout.read()
error = process.stderr.read()
output = execution + error
process.stderr.close()
process.stdout.close()
print(output)
Basically, when I try to print the output of a command like dir, the output is a string, so I can call .read() on it. But when I try to use nslookup, for example, the output isn't a string, so it can't be read, and the script deadlocks.
I know that I can invoke nslookup in non-interactive mode, but that's not the point. I want to remove every chance of a deadlock, and make it work with every command you can run in a normal cmd.
The real way the project works is through sockets, so the raw_input is a s.recv() and the output is sending back the output, but I have simplified it to focus on the problem.

How to execute a shell script in the background from a Python script

I am working on executing the shell script from Python and so far it is working fine. But I am stuck on one thing.
On my Unix machine I execute one command in the background by using &, like this. This command starts my app server -
david@machineA:/opt/kml$ /opt/kml/bin/kml_http --config=/opt/kml/config/httpd.conf.dev &
Now I need to execute the same thing from my Python script, but as soon as it executes my command, it never reaches the else block and never prints out execute_steps::Successful; it just hangs there.
proc = subprocess.Popen("/opt/kml/bin/kml_http --config=/opt/kml/config/httpd.conf.dev &", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, executable='/bin/bash')
if proc.returncode != 0:
    logger.error("execute_steps::Errors while executing the shell script: %s" % stderr)
    sleep(0.05) # delay for 50 ms
else:
    logger.info("execute_steps::Successful: %s" % stdout)
Am I doing anything wrong here? I want to print out execute_steps::Successful after executing the shell script in the background.
All other commands work fine; only the command which I am trying to run in the background doesn't work.

There's a couple things going on here.
First, you're launching a shell in the background, and then telling that shell to run the program in the background. I don't know why you think you need both, but let's ignore that for now. In fact, by adding executable='/bin/bash' on top of shell=True, you're actually trying to run a shell to run a shell to run the program in the background, although that doesn't actually quite work.*
Second, you're using PIPE for the process's output and error, but then not reading them. This can cause the child to deadlock. If you don't want the output, use DEVNULL, not PIPE. If you want to process the output yourself, use proc.communicate(),** or use a higher-level function like check_output. If you just want it to intermingle with your own output, just leave those arguments off.
* If you're using the shell because kml_http is a non-executable script that has to be run by /bin/bash, then don't use shell=True for that, or executable; just make /bin/bash the first argument in the command line, and /opt/kml/bin/kml_http the second. But this doesn't seem likely; why would you install something non-executable into a bin directory?
** Or you can read it explicitly from proc.stdout and proc.stderr, but that gets more complicated.
At any rate, the whole point of executing something in the background is that it keeps running in the background, and your script keeps running in the foreground. So, you're checking its returncode before it's finished, and then moving on to whatever's next in your code, and never coming back again.
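The point about checking returncode too early can be seen in a small sketch. This assumes Python 3 and uses a short-lived Python child as a stand-in for kml_http; the names early and final are only for illustration:

```python
import subprocess
import sys

# Stand-in for a long-running server: a child that sleeps briefly.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"],
                        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

# Popen returns immediately; returncode stays None until poll() or wait()
# has observed that the child exited.
early = proc.returncode
print("right after Popen:", early)

final = proc.wait()  # blocks until the child exits
print("after wait():", final)
```

Since None != 0 is true, a check like `if proc.returncode != 0` made right after Popen takes the error branch even though nothing has failed yet.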
It seems like you want to wait for it to be finished. In that case, don't run it in the background—use proc.wait, or just use subprocess.call() instead of creating a Popen object. And don't use & either, of course. While we're at it, don't use the shell, either:
retcode = subprocess.call(["/opt/kml/bin/kml_http",
                           "--config=/opt/kml/config/httpd.conf.dev"],
                          stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
if retcode != 0:
    # etc.
Now, you won't get to that if statement until kml_http finishes running.
If you want to wait for it to be finished, but at the same time keep doing other stuff, then you're trying to do two things at once in your program, which means you need a thread to do the waiting:
def run_kml_http():
    retcode = subprocess.call(["/opt/kml/bin/kml_http",
                               "--config=/opt/kml/config/httpd.conf.dev"],
                              stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    if retcode != 0:
        # etc.
t = threading.Thread(target=run_kml_http)
t.start()
# Now you can do other stuff in the main thread, and the background thread will
# wait around until kml_http is finished and execute the `if` statement whenever
# that happens

You're using stderr=PIPE, stdout=PIPE, which means that rather than letting the stdout and stderr of the child process be forwarded to the current process' standard output and error streams, they are being redirected to pipes which you must read from in your python process (via proc.stdout and proc.stderr).
To "background" a process, simply omit the usage of PIPE:
#!/usr/bin/python
from subprocess import Popen
from time import sleep
proc = Popen(
    ['/bin/bash', '-c', 'for i in {0..10}; do echo "BASH: $i"; sleep 1; done'])
for x in range(10):
    print "PYTHON: {0}".format(x)
    sleep(1)
proc.wait()
which will show the process being "backgrounded".

Detecting the end of the stream on popen.stdout.readline

I have a python program which launches subprocesses using Popen and consumes their output nearly real-time as it is produced. The code of the relevant loop is:
def run(self, output_consumer):
    self.prepare_to_run()
    popen_args = self.get_popen_args()
    logging.debug("Calling popen with arguments %s" % popen_args)
    self.popen = subprocess.Popen(**popen_args)
    while True:
        outdata = self.popen.stdout.readline()
        if not outdata and self.popen.returncode is not None:
            # Terminate when we've read all the output and the returncode is set
            break
        output_consumer.process_output(outdata)
        self.popen.poll() # updates returncode so we can exit the loop
    output_consumer.finish(self.popen.returncode)
    self.post_run()
def get_popen_args(self):
    return {
        'args': self.command,
        'shell': False, # Just being explicit for security's sake
        'bufsize': 0,   # More likely to see what's being printed as it happens
                        # Not guaranteed since the process itself might buffer its output
                        # run `python -u` to unbuffer the output of a python process
        'cwd': self.get_cwd(),
        'env': self.get_environment(),
        'stdout': subprocess.PIPE,
        'stderr': subprocess.STDOUT,
        'close_fds': True,  # Doesn't seem to matter
    }
This works great on my production machines, but on my dev machine, the call to .readline() hangs when certain subprocesses complete. That is, it will successfully process all of the output, including the final output line saying "process complete", but then will again poll readline and never return. This method exits properly on the dev machine for most of the sub-processes I call, but consistently fails to exit for one complex bash script that itself calls many sub-processes.
It's worth noting that popen.returncode gets set to a non-None (usually 0) value many lines before the end of the output. So I can't just break out of the loop when that is set or else I lose everything that gets spat out at the end of the process and is still buffered waiting for reading. The problem is that when I'm flushing the buffer at that point, I can't tell when I'm at the end because the last call to readline() hangs. Calling read() also hangs. Calling read(1) gets me every last character out, but also hangs after the final line. popen.stdout.closed is always False. How can I tell when I'm at the end?
All systems are running python 2.7.3 on Ubuntu 12.04LTS. FWIW, stderr is being merged with stdout using stderr=subprocess.STDOUT.
Why the difference? Is it failing to close stdout for some reason? Could the sub-sub-processes do something to keep it open somehow? Could it be because I'm launching the process from a terminal on my dev box, but in production it's launched as a daemon through supervisord? Would that change the way the pipes are processed and if so how do I normalize them?

The main code loop looks right. It could be that the pipe isn't closing because another process is keeping it open. For example, if the script launches a background process that writes to stdout, then the pipe will not close. Are you sure no other child process is still running?
One idea is to change modes once you see that .returncode has been set. Once you know the main process is done, read all its remaining output from the buffer, but don't get stuck waiting. You can use select to read from the pipe with a timeout. Set a timeout of several seconds and you can clear the buffer without getting stuck waiting on a child process.

Without knowing the contents of the "one complex bash script" which causes the problem, there are too many possibilities to determine the exact cause.
However, focusing on the fact that you claim it works if you run your Python script under supervisord, then it might be getting stuck if a sub-process is trying to read from stdin, or just behaves differently if stdin is a tty, which (I presume) supervisord will redirect from /dev/null.
This minimal example seems to cope better with cases where my example test.sh runs subprocesses which try to read from stdin...
import os
import subprocess
f = subprocess.Popen(args='./test.sh',
                     shell=False,
                     bufsize=0,
                     stdin=open(os.devnull, 'rb'),
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     close_fds=True)
while 1:
    s = f.stdout.readline()
    if not s and f.returncode is not None:
        break
    print s.strip()
    f.poll()
print "done %d" % f.returncode
Otherwise, you can always fall back to using a non-blocking read, and bail out when you get your final output line saying "process complete", although it's a bit of a hack.

If you use readline() or read(), it should not hang. No need to check returncode or poll(). If it is hanging when you know the process is finished, it is most probably a subprocess keeping your pipe open, as others said before.
There are two things you could do to debug this:
* Try to reproduce with a minimal script instead of the current complex one, or
* Run that complex script with strace -f -e clone,execve,exit_group and see what that script is starting, and whether any process survives the main script (check when the main script calls exit_group; if strace is still waiting after that, you have a child still alive).

I find that calls to read (or readline) sometimes hang, despite previously calling poll. So I resorted to calling select to find out if there is readable data. However, select without a timeout can hang, too, if the process was closed. So I call select in a semi-busy loop with a tiny timeout for each iteration (see below).
I'm not sure if you can adapt this to readline, as readline might hang if the final \n is missing, or if the process doesn't close its stdout before you close its stdin and/or terminate it. You could wrap this in a generator, and every time you encounter a \n in stdout_collected, yield the current line.
Also note that in my actual code, I'm using pseudoterminals (pty) to wrap the popen handles (to more closely fake user input) but it should work without.
# handle to read from
handle = self.popen.stdout
# how many seconds to wait without data
timeout = 1
begin = datetime.now()
stdout_collected = ""
while self.popen.poll() is None:
    try:
        fds = select.select([handle], [], [], 0.01)[0]
    except select.error, exc:
        print exc
        break
    if len(fds) == 0:
        # select timed out, no new data
        delta = (datetime.now() - begin).total_seconds()
        if delta > timeout:
            return stdout_collected
        # try longer
        continue
    else:
        # have data, timeout counter resets again
        begin = datetime.now()
    for fd in fds:
        if fd == handle:
            # os.read needs a file descriptor, not the file object
            data = os.read(handle.fileno(), 1024)
            # can handle the bytes as they come in here
            # self._handle_stdout(data)
            stdout_collected += data
# process exited
# if using a pseudoterminal, close the handles here
self.popen.wait()
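The generator idea mentioned above could look roughly like the following. This is a sketch under assumptions, not the code from this answer: it targets Python 3 on a POSIX system (select does not work on pipes on Windows), iter_lines is a hypothetical name, and the pipe is assumed to be unbuffered (bufsize=0) so that os.read does not bypass data held in a Python-level buffer.

```python
import os
import select


def iter_lines(pipe, idle_timeout=1.0):
    """Yield complete lines (as bytes) from pipe as they arrive.

    Stops at EOF or after idle_timeout seconds without new data; a trailing
    partial line (no final newline) is yielded last.
    """
    fd = pipe.fileno()
    pending = b""
    while True:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        # os.read returns as soon as any bytes are available; b"" means EOF.
        data = os.read(fd, 1024) if ready else b""
        if not data:
            break  # timed out, or every writer closed the pipe
        pending += data
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            yield line + b"\n"
    if pending:
        yield pending
```

Because the loop gives up after idle_timeout seconds of silence, it also returns when a leftover child keeps the pipe open, at the cost of possibly stopping early if the process merely pauses for longer than the timeout.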

Why are you setting stderr to STDOUT?
The real benefit of making a communicate() call on a subprocess is that you are able to retrieve a tuple containing the stdout response as well as the stderr message.
Those might be useful if the logic depends on their success or failure.
Also, it would save you from the pain of having to iterate through lines. communicate() gives you everything, and there would be no unresolved questions about whether or not the full message was received.

I wrote a demo with a bash subprocess that can be easily explored.
A closed pipe can be recognized by '' in the output from readline(), while the output from an empty line is '\n'.
from subprocess import Popen, PIPE, STDOUT
p = Popen(['bash'], stdout=PIPE, stderr=STDOUT)
out = []
while True:
    outdata = p.stdout.readline()
    if not outdata:
        break
    #output_consumer.process_output(outdata)
    print "* " + repr(outdata)
    out.append(outdata)
print "* closed", repr(out)
print "* returncode", p.wait()
Example of input/output: Closing the pipe distinctly before terminating the process. That is why wait() should be used instead of poll()
[prompt] $ python myscript.py
echo abc
* 'abc\n'
exec 1>&- # close stdout
exec 2>&- # close stderr
* closed ['abc\n']
exit
* returncode 0
[prompt] $
Your code did output a huge number of empty strings for this case.
Example: Fast terminated process without '\n' on the last line:
echo -n abc
exit
* 'abc'
* closed ['abc']
* returncode 0

Reading process output

I'm writing a simple wrapper over python debugger (pdb) and I need to parse pdb output. But I have a problem reading text from process pipe.
Example of my code:
import subprocess, threading, time
def readProcessOutput(process):
    while not process.poll():
        print(process.stdout.readline())
process = subprocess.Popen('python -m pdb script.py', shell=True, universal_newlines=True,
                           stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=subprocess.PIPE)
read_thread = threading.Thread(target=readProcessOutput, args=(process,))
read_thread.start()
while True:
    time.sleep(0.5)
When I execute the given command (python -m pdb script.py) in the OS shell I get results like this:
> c:\develop\script.py(1)<module>()
-> print('hello, world!')
(Pdb)
But when I run my script I get only two lines, and never the pdb prompt. Writing commands to stdin after this has no effect. So my question is:
why can't I read the third line? How can I avoid this problem and get the correct output?
Platform: Windows XP, Python 3.3

The third line cannot be read by readline() because it is not yet terminated by an end of line. You usually see the cursor sitting after "(Pdb) " until you type something and press Enter.
Communicating with processes that show a prompt is usually more complicated. I found it helpful to first move the data writer into its own independent thread as well, which makes the communication easier to test and ensures that the main thread never freezes when too much is written or read at once. Once that works, the code can be simplified again.
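One way to read a prompt that has no trailing newline is to read raw chunks instead of lines. The following is a sketch under assumptions (Python 3, an unbuffered binary pipe created with bufsize=0; start_reader and read_until are hypothetical helpers, not part of pdb or subprocess): a background thread forwards whatever os.read returns into a queue, and the main thread collects chunks until the text ends with the prompt.

```python
import os
import queue
import threading


def start_reader(pipe):
    """Start a thread that forwards raw chunks from pipe into a queue."""
    chunks = queue.Queue()

    def pump():
        fd = pipe.fileno()
        while True:
            data = os.read(fd, 1024)  # returns as soon as any bytes arrive
            if not data:
                break  # EOF: the child closed its end of the pipe
            chunks.put(data)
        chunks.put(None)  # tells the consumer that the stream has ended

    threading.Thread(target=pump, daemon=True).start()
    return chunks


def read_until(chunks, prompt, timeout=5.0):
    """Collect output until it ends with prompt, the stream ends,
    or no data arrives for timeout seconds."""
    collected = b""
    while not collected.endswith(prompt):
        try:
            data = chunks.get(timeout=timeout)
        except queue.Empty:
            break
        if data is None:
            break
        collected += data
    return collected
```

With a process started with stdin=PIPE, stdout=PIPE and bufsize=0, the intended use is to call read_until(chunks, b"(Pdb) "), write a command followed by a newline to stdin, and then call read_until again for the next prompt. The timeout keeps the main thread from blocking forever if the prompt never appears.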
