I'm controlling a child process using pexpect (because subprocess doesn't support ptys and I run into a deadlock with two pipes). The process creates a lot of output on stderr, which I'm not interested in, and apparently pexpect also echoes back anything I write to its stdin:
>>> import pexpect
>>> p = pexpect.spawn('rev')
>>> p.sendline('Hello!')
7
>>> p.readline()
'Hello!\r\n'
>>> p.readline()
'!olleH\r\n'
How can I turn this off?
Using pty's is not quite the same as a pipe. If you don't put in in raw mode the tty driver will echo back the characters and perform other line editing. So to get a clean data path you need to also put the pty/tty in raw mode.
Since you are now dealing with a pseudo device you have only a single I/O stream. There is no distinction there between stdout and stderr (that is a userspace convention). So you will always see stdout and stderr mixed when using a pty/tty.
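For the echo specifically, recent pexpect versions let you request a no-echo pty up front; a minimal sketch, assuming a pexpect version where spawn() takes an echo keyword:
import pexpect

# Ask for a pty with echo disabled, so sent input is not read back.
p = pexpect.spawn('rev', echo=False)
p.sendline('Hello!')
print(p.readline())   # b'!olleH\r\n' -- no echoed b'Hello!\r\n' first

# Alternatively, p.setecho(False) flips echo on an existing spawn,
# though anything sent before that call may still be echoed.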
Related
I have a homework assignment to capture a 4-way handshake between a client and AP using scapy. I'm trying to use "aircrack-ng capture.pcap" to check for valid handshakes in the capture file I created with scapy.
I launch the program using Popen. The program waits for user input, so I have to kill it. When I try to get stdout after killing it, the output is empty.
I've tried stdout.read(), I've tried communicate(), I've tried reading stderr, and I've tried it both with and without a shell:
from subprocess import Popen, PIPE

check = Popen("aircrack-ng capture.pcap", shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
check.kill()
print(check.stdout.read())
While you shouldn't do this (relying on hardcoded delays is inherently race-condition-prone), the claim that the issue is caused by your kill() being delivered while sh is still starting up can be demonstrated by the problem being "solved" (not reliably, but sufficiently for demonstration) by a tiny sleep just long enough to let the shell start up and the echo run:
import time
from subprocess import Popen, PIPE
check=Popen("echo hello && sleep 1000", shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
time.sleep(0.01) # BAD PRACTICE: Race-condition-prone, use one of the below instead.
check.kill()
print(check.stdout.read())
That said, a much better-practice solution would be to close the stdin descriptor so the child's reads immediately return EOF (0-byte results). On Python 3.3+, you can do that with DEVNULL:
from subprocess import Popen, PIPE, DEVNULL
check=Popen("echo hello && read input && sleep 1000",
shell=True, stdin=DEVNULL, stdout=PIPE, stderr=PIPE)
print(check.stdout.read())
...or, with Python 2.x, a similar effect can be achieved by passing an empty string to communicate(), thus close()ing the stdin pipe immediately:
from subprocess import Popen, PIPE
check=Popen("echo hello && read input && sleep 1000",
shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
print(check.communicate('')[0])
Never, and I mean never, kill a process as part of normal operation. There's no guarantee whatsoever how far it has proceeded by the time you kill it, so you cannot expect any specific results from it in such a case.
To explicitly pass nothing to a subprocess as input to prevent hanging when it tries to read stdin:
connect its stdin to /dev/null (nul on Windows), as described in run a process to /dev/null in python:
p=Popen(<...>, stdin=open(os.devnull)) #or stdin=subprocess.DEVNULL in Python 3.3+
or use stdin=PIPE and call <process>.communicate() without arguments -- this passes an empty stream
Use <process>.communicate(), or use subprocess.check_output() instead of Popen to read output reliably
A process, in the general case, is not guaranteed to output any data at any particular moment due to I/O buffering. So you need to read the output stream after the process completes to be sure you've got everything.
At the same time, you need to keep reading the stream in the meantime if the process can produce enough output to fill an I/O buffer¹. Otherwise, it will hang waiting for you to read the buffered data. If both stdout and stderr are PIPEs, you need to read them both, in parallel -- i.e. in different threads.
communicate() and check_output (that uses the former under the hood) achieve this by reading stdout and stderr in two separate threads.
Prefer convenience functions to Popen for common use cases -- in your case, check_output -- as they take care of all the aforementioned caveats for you; see the sketch after the footnote.
¹ Pipes are fully buffered and a typical buffer size is 64 KB.
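Putting it together for the asker's command, a minimal sketch (Python 3.3+ for DEVNULL), assuming aircrack-ng exits on its own once it reads EOF from stdin:
from subprocess import check_output, STDOUT, DEVNULL

# stdin=DEVNULL: the child sees EOF instead of waiting for input.
# stderr=STDOUT: merge the streams so one read captures everything.
out = check_output(['aircrack-ng', 'capture.pcap'],
                   stdin=DEVNULL, stderr=STDOUT)
print(out.decode())
Note that check_output raises CalledProcessError if the command exits with a nonzero status, which you may want to catch.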
I have a linux x86 binary that asks for a password and will print out whether the password is correct or incorrect. I would like to use python to fuzz the input.
When I run the binary and give it the string "asdf", it responds with the string "incorrect" (screenshot omitted).
So far, I have tried to use the Python 3 subprocess module in order to:
run the binary as a subprocess,
receive the prompt for a password,
send a string,
receive the response.
Here is my script:
import subprocess

p = subprocess.Popen("/home/pj/Desktop/L1/lab1", stdin=subprocess.PIPE, stdout=subprocess.PIPE)
print(p.communicate()[0])
The result of running this script is:
b'Please supply the code: \nIncorrect\n'
I am expecting to receive only the prompt; however, the binary is returning a response of "Incorrect" as well, before I have had a chance to send my input.
How can I improve my script in order to interact with this binary successfully?
Read the documentation carefully (emphasis mine):
Popen.communicate(input=None)
Interact with process: Send data to stdin. Read data from stdout and
stderr, until end-of-file is reached. Wait for process to terminate.
The optional input argument should be a string to be sent to the child
process, or None, if no data should be sent to the child.
communicate() returns a tuple (stdoutdata, stderrdata).
Note that if you want to send data to the process’s stdin, you need to
create the Popen object with stdin=PIPE. Similarly, to get anything
other than None in the result tuple, you need to give stdout=PIPE
and/or stderr=PIPE too.
So, you're sending nothing to the process, and reading all of stdout at once.
In your case, you don't really need to wait for the prompt to send data to the process because streams work asynchronously: the process will get your input only when it tries to read its STDIN:
In [10]: p=subprocess.Popen(("bash", "-c","echo -n 'prompt: '; read -r data; echo $data"),stdin=subprocess.PIPE,stdout=subprocess.PIPE)
In [11]: p.communicate('foobar')
Out[11]: ('prompt: foobar\n', None)
If you insist on waiting for the prompt for whatever reason (e.g. your process checks the input before the prompt, too, expecting something else), you need to read STDOUT manually and be VERY careful how much you read: since Python's file.read is blocking, a simple read() will deadlock, because it waits for EOF and the subprocess doesn't close STDOUT (and thus doesn't produce EOF) until it gets input from you. If the input or output length is likely to exceed stdio's buffer size (unlikely in your specific case), you also need to do the stdout reading and stdin writing in separate threads.
Here's an example using pexpect that takes care of that for you (I'm using pexpect.fdpexpect instead of the pexpect.spawn suggested in the docs because it works on all platforms):
In [1]: import pexpect.fdpexpect
In [8]: p=subprocess.Popen(("bash", "-c","echo -n 'prom'; sleep 5; echo 'pt: '; read -r data; echo $data"),stdin=subprocess.PIPE,stdout=subprocess.PIPE)
In [10]: o=pexpect.fdpexpect.fdspawn(p.stdout.fileno())
In [12]: o.expect("prompt: ")
Out[12]: 0
In [16]: p.stdin.write("foobar") #you can communicate() here, it does the same as
# these 3 steps plus protects from deadlock
In [17]: p.stdin.close()
In [18]: p.stdout.read()
Out[18]: 'foobar\n'
To simplify my question, here's a Python script:
from subprocess import Popen, PIPE
proc = Popen(['./mr-task.sh'], shell=True, stdout=PIPE, stderr=PIPE)
while True:
    out = proc.stdout.readline()
    print(out)
Here's mr-task.sh, it starts a mapreduce job:
hadoop jar xxx.jar some-conf-we-don't-need-to-care
When I run ./mr-task.sh, I can see the log printed on the screen, something like:
14/12/25 14:56:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/12/25 14:56:44 INFO snappy.LoadSnappy: Snappy native library loaded
14/12/25 14:57:01 INFO mapred.JobClient: Running job: job_201411181108_16380
14/12/25 14:57:02 INFO mapred.JobClient: map 0% reduce 0%
14/12/25 14:57:28 INFO mapred.JobClient: map 100% reduce 0%
But I can't get this output when running the Python script. I tried removing shell=True and fetching stderr instead, and still got nothing.
Does anyone have any idea why this happens?
You could redirect stderr to stdout:
from subprocess import Popen, PIPE, STDOUT
proc = Popen(['./mr-task.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1)
for line in iter(proc.stdout.readline, b''):
    print line,
proc.stdout.close()
proc.wait()
See Python: read streaming input from subprocess.communicate().
In my real program I redirect stderr to stdout and read from stdout, so bufsize is not needed, is it?
The redirection of stderr to stdout and bufsize are unrelated. Changing bufsize might affect the time performance (the default bufsize=0 i.e., unbuffered on Python 2). Unbuffered I/O might be 10..100 times slower. As usual, you should measure the time performance if it is important.
Calling Popen.wait/communicate after the subprocess has terminated is just for reaping the zombie process, and the two methods have no difference in that case, correct?
The difference is that proc.communicate() closes the pipes before reaping the child process. It releases file descriptors (a finite resource) to be used by other files in your program.
About the buffer: if the output fills the buffer's maximum size, will the subprocess hang? Does that mean that if I use the default bufsize=0 setting, I need to read from stdout as soon as possible so that the subprocess doesn't block?
No. It is a different buffer. bufsize controls the buffer inside the parent that is filled/drained when you call .readline() method. There won't be a deadlock whatever bufsize is.
The code (as written above) won't deadlock no matter how much output the child might produce.
The code in #falsetru's answer can deadlock because it creates two pipes (stdout=PIPE, stderr=PIPE) but it reads only from one pipe (proc.stderr).
There are several buffers between the child and the parent, e.g., C stdio's stdout buffer (a libc buffer inside the child process, inaccessible from the parent) and the child's stdout OS pipe buffer (inside the kernel; the parent process may read the data from here). These buffers are of fixed size; they won't grow if you put more data into them. If stdio's buffer overflows (e.g., during a printf() call), then the data is pushed downstream into the child's stdout OS pipe buffer. If nobody reads from the pipe, then this OS pipe buffer fills up and the child blocks (e.g., on the write() system call) while trying to flush the data.
To be concrete, I've assumed a C-stdio-based program and a POSIXy OS.
The deadlock happens because the parent tries to read from the stderr pipe that is empty because the child is busy trying to flush its stdout. Thus both processes hang.
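For completeness, a minimal sketch of that parallel reading under the same ./mr-task.sh setup -- essentially what communicate() does internally:
from subprocess import Popen, PIPE
from threading import Thread

def drain(pipe, sink):
    # Read lines until EOF so this pipe's OS buffer can never fill up.
    for line in iter(pipe.readline, b''):
        sink.append(line)
    pipe.close()

proc = Popen(['./mr-task.sh'], stdout=PIPE, stderr=PIPE)
out_lines, err_lines = [], []
threads = [Thread(target=drain, args=(proc.stdout, out_lines)),
           Thread(target=drain, args=(proc.stderr, err_lines))]
for t in threads:
    t.start()
for t in threads:
    t.join()
proc.wait()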
One possible reason is that the output is printed to standard error instead of standard output.
Try to replace stdout with stderr:
from subprocess import Popen, PIPE
proc = Popen(['./mr-task.sh'], stdout=PIPE, stderr=PIPE)
while True:
    out = proc.stderr.readline() # <----
    if not out:
        break
    print(out)
I am writing a program to interact with a linux machine through the serial port, and I am using pexpect.spawn as my main communication channel as follows:
proc = pexpect.spawn("cu dir -l /dev/ttyUSB0 -s 115200", logfile = *someFile*)
and I am sending commands to the machine with the sendline("cmd") method, and at the end of each session I parse the log file to see how the commands behaved.
I would like to be able to distinguish between lines that were printed to stdout and stderr from my log file, but currently I have no way of doing that.
Is there a way to globally prepend each line printed to stderr with a given string?
You don't mention how you capture stdout and stderr, but one simple way to distinguish them is to place stdout and stderr in different files. For example:
./command.py >stdout-log 2>stderr-log
I think this is a limitation of pexpect. You're basically dealing with a black-box command prompt, so pexpect has no knowledge of whether a string returned to the console (effectively) is stdout or stderr, just that something came back. Can you safely assume a limited set of message and error formats in your system, so that you could write a regex-based post-processor?
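If so, a sketch of such a post-processor -- the patterns here are pure assumptions, so substitute whatever your system actually prints:
import re

# Hypothetical error patterns; adjust to the real formats in your logs.
ERROR_RE = re.compile(r'(error|warning|fail)', re.IGNORECASE)

def tag_lines(log_path):
    # Yield each log line, prefixing likely-stderr lines with 'ERR| '.
    with open(log_path) as log:
        for line in log:
            yield ('ERR| ' if ERROR_RE.search(line) else '   | ') + line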
I'm trying to write a Python script that starts a subprocess, and writes to the subprocess stdin. I'd also like to be able to determine an action to be taken if the subprocess crashes.
The process I'm trying to start is a program called nuke which has its own built-in version of Python that I'd like to submit commands to, and then tell it to quit after the commands execute. So far I've worked out that if I start Python from the command prompt and then start nuke as a subprocess, I can type commands into nuke interactively. But I'd like to put this all in a script, so that the master Python program can start nuke and then write to its standard input (and thus into its built-in version of Python) and tell it to do snazzy things, so I wrote a script that starts nuke like this:
subprocess.call(["C:/Program Files/Nuke6.3v5/Nuke6.3", "-t", "E:/NukeTest/test.nk"])
Then nothing happens because nuke is waiting for user input. How would I now write to standard input?
I'm doing this because I'm running a plugin with nuke that causes it to crash intermittently when rendering multiple frames. So I'd like this script to be able to start nuke, tell it to do something and then if it crashes, try again. So if there is a way to catch a crash and still be OK then that'd be great.
It might be better to use communicate:
from subprocess import Popen, PIPE, STDOUT
p = Popen(['myapp'], stdout=PIPE, stdin=PIPE, stderr=PIPE)
stdout_data = p.communicate(input='data_to_write')[0]
"Better", because of this warning:
Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
To clarify some points:
As jro has mentioned, the right way is to use subprocess.communicate.
Yet, when feeding the stdin using subprocess.communicate with input, you need to initiate the subprocess with stdin=subprocess.PIPE according to the docs.
Note that if you want to send data to the process’s stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than None in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.
Also qed has mentioned in the comments that for Python 3.4 you need to encode the string, meaning you need to pass Bytes to the input rather than a string. This is not entirely true. According to the docs, if the streams were opened in text mode, the input should be a string (source is the same page).
If streams were opened in text mode, input must be a string. Otherwise, it must be bytes.
So, if the streams were not opened explicitly in text mode, then something like below should work:
import subprocess
command = ['myapp', '--arg1', 'value_for_arg1']
p = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output = p.communicate(input='some data'.encode())[0]
I've left the stderr value above deliberately as STDOUT as an example.
That being said, sometimes you might want the output of another process rather than building it up from scratch. Let's say you want to run the equivalent of echo -n 'CATCH\nme' | grep -i catch | wc -m. This should normally return the number of characters in 'CATCH' plus a newline character, which results in 6. The point of the echo here is to feed the CATCH\nme data to grep. So we can feed the data to grep's stdin in the Python subprocess chain as a variable, and then pass its stdout as the input to the wc process' stdin (getting rid of the extra newline character in between):
import subprocess
what_to_catch = 'catch'
what_to_feed = 'CATCH\nme'
# We create the first subprocess, note that we need stdin=PIPE and stdout=PIPE
p1 = subprocess.Popen(['grep', '-i', what_to_catch], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# We immediately run the first subprocess and get the result
# Note that we encode the data, otherwise we'd get a TypeError
p1_out = p1.communicate(input=what_to_feed.encode())[0]
# Well the result includes an '\n' at the end,
# if we want to get rid of it in a VERY hacky way
p1_out = p1_out.decode().strip().encode()
# We create the second subprocess, note that we need stdin=PIPE
p2 = subprocess.Popen(['wc', '-m'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# We run the second subprocess feeding it with the first subprocess' output.
# We decode the output to convert to a string
# We still have a '\n', so we strip that out
output = p2.communicate(input=p1_out)[0].decode().strip()
This is somewhat different from the response here, where you pipe two processes directly without handling the data in Python.
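For reference, a sketch of that direct-piping variant: p1's stdout is wired straight into p2's stdin, so the data never comes back through Python:
import subprocess

p1 = subprocess.Popen(['grep', '-i', 'catch'],
                      stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p2 = subprocess.Popen(['wc', '-m'], stdin=p1.stdout,
                      stdout=subprocess.PIPE)
p1.stdin.write(b'CATCH\nme')
p1.stdin.close()
p1.stdout.close()  # let p1 receive SIGPIPE if p2 exits first
print(p2.communicate()[0].decode().strip())  # 6: 'CATCH' plus newline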
Hope that helps someone out.
Since Python 3.5, there is the subprocess.run() function, which provides a convenient way to initialize and interact with Popen() objects. run() takes an optional input argument, through which you can pass things to stdin (like you would using Popen.communicate(), but all in one go).
Adapting jro's example to use run() would look like:
import subprocess
p = subprocess.run(['myapp'], input='data_to_write', capture_output=True, text=True)
After execution, p will be a CompletedProcess object. By setting capture_output to True (available since Python 3.7), we make available a p.stdout attribute which gives us access to the output, if we care about it. text=True tells it to work with regular strings rather than bytes. If you want, you might also add the argument check=True to make it throw an error if the exit status (accessible regardless via p.returncode) isn't 0.
This is the "modern"/quick-and-easy way to do this.
One can write data to the subprocess object on-the-fly, instead of collecting all the input in a string beforehand to pass through the communicate() method.
This example sends a list of animals names to the Unix utility sort, and sends the output to standard output.
import sys, subprocess
p = subprocess.Popen('sort', stdin=subprocess.PIPE, stdout=sys.stdout)
for v in ('dog', 'cat', 'mouse', 'cow', 'mule', 'chicken', 'bear', 'robin'):
    p.stdin.write(v.encode() + b'\n')
p.communicate()
Note that writing to the process is done via p.stdin.write(v.encode()). I tried using print(v.encode(), file=p.stdin), but that failed with the message TypeError: a bytes-like object is required, not 'str'. That's because print() always writes str (including its end='\n' separator) to the file, while this pipe was opened in binary mode and only accepts bytes.
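For what it's worth, print() can be made to work by opening the pipe in text mode, so that p.stdin accepts str; a sketch (text= needs Python 3.7+; older 3.x spells it universal_newlines=True):
import subprocess, sys

p = subprocess.Popen('sort', stdin=subprocess.PIPE, stdout=sys.stdout,
                     text=True)  # text mode: the pipe takes str, not bytes
for v in ('dog', 'cat', 'mouse', 'cow'):
    print(v, file=p.stdin)  # no manual encoding needed now
p.communicate()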
You can provide a file-like object to the stdin argument of subprocess.call().
The documentation for the Popen object applies here.
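For example, a minimal sketch -- input.txt is a hypothetical file whose contents become the child's standard input:
import subprocess

with open('input.txt') as f:
    subprocess.call(['sort'], stdin=f)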
To capture the output, you should instead use subprocess.check_output(), which takes similar arguments. From the documentation:
>>> subprocess.check_output(
... "ls non_existent_file; exit 0",
... stderr=subprocess.STDOUT,
... shell=True)
'ls: non_existent_file: No such file or directory\n'