When using subprocess.Popen to run a Windows interactive command-line program with stdout = PIPE or stdout = somefileobject, the output is always cut short (truncated), and prompts and other text are missing.
So my question is: How do I capture all of the output of the subprocess?
More Details below:
I am specifically trying to grab the output from steamcmd. Here are some code examples and their output, which I ran through the Python environment in the terminal.
from subprocess import Popen
#This command opens steamcmd, logs in, and calls licenses_print. It dumps
#all licenses available to the user, and then quits the program.
cmd = (['path to steamcmd', 'arg1', 'arg2',...])
Popen(cmd)
Here I didn't set stdout to anything, so all output is dumped to the terminal and I can see everything, about 70 lines of text. The last few lines look like this, which is what I expect and matches what I get from running steamcmd directly.
License packageID 166844:
- State : Active( flags 0 ) - Purchased : Sat Jun 2 12:43:06 2018 in "US", Wallet
- Apps : 620980, (1 in total)
- Depots : 620981, (1 in total)
But the moment I try to redirect this into a file, like below,
f = open('path to file', 'w+')
Popen(cmd, stdout=f).wait()
f.close()
The output dumped to the file gets cut short, and the last few lines look like this
License packageID 100123:
- State : Active( flags 512 ) - Purchased : Sat Jun 10 19:34:14 2017 in "US", Complimentary
- Apps : 459860, (1 in total)
- Depots : 4598
You can see it didn't make it to package 166844, and it stops in the middle of the line "- Depots : 459861, (1 in total)"
I have read that PIPE has a size limit and can cause the process to hang, but it has never hung for me, and the output isn't nearly big enough for that anyway. I also tried writing straight to the file instead, and that hasn't worked either. I've tried check_output and getoutput, but I assume they use the same machinery under the hood.
So, again, my question is: How do I capture all of the output from the subprocess?
Edit 1: I've tried reading the output line by line, but it still cuts off at the same place. I've tried both PowerShell and the Windows Command Prompt. I also tried the same thing on Linux Mint with the Linux build of steamcmd, and there I was able to capture all of the output from the subprocess and store it in the file. So this may not be a Python issue, but a Windows (or Windows CLI) issue that prevents all of the output from being captured. Whether I tell it to wait, or send it to PIPE instead of a file, the output is always cut short. Somewhere the last of the output gets dropped before making it to my file.
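For reference, this is roughly the line-by-line version I tried (the steamcmd path and arguments are placeholders, as above); it still cuts off at the same place on Windows:
from subprocess import Popen, PIPE, STDOUT

cmd = ['path to steamcmd', 'arg1', 'arg2']  # placeholders

# Line-buffered text mode; stderr merged so prompts are captured too.
proc = Popen(cmd, stdout=PIPE, stderr=STDOUT,
             universal_newlines=True, bufsize=1)
with open('path to file', 'w+') as f:
    for line in proc.stdout:  # read as the child produces output
        f.write(line)
proc.wait()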
Related
I am attempting to write a (Bash) shell script that wraps a third-party Python script and captures all output (errors and stdout) in a log file, and that also restarts the script with a new batch of data each time it completes successfully. I'm doing this on a standard Linux distribution, but hopefully the solution can be platform-independent.
So here's a simplified version of the shell script, omitting everything except the logging:
#!/bin/bash
/home/me/script.py &>> /home/me/logfile
The problem is that the third-party Python script's output is mostly a single line that is refreshed periodically (about every 90 seconds) by means of a carriage return ("\r"). Here's a script that produces the type of output I mean:
#!/usr/bin/env python3
import time

tracker = 1
print("This line is captured in the logfile because it ends with a newline")
while tracker < 5:
    print(" This output isn't captured in the log file. Tracker = " + str(tracker), end="\r")
    tracker += 1
    time.sleep(1)
print("This line does get captured. Script is done. ")
How can I write a simple shell script to capture the output each time it is refreshed, or at least to periodically capture the current output as it would appear on the screen if I were running the script in the terminal?
Obviously I could try to modify the Python script to change its output behavior, but the actual script I'm using is very complex, and I think doing that is beyond my abilities.
The program should have disabled this behavior when output is not a tty.
The output is already captured completely, it's just that you see all the updates at once when you cat the file. Open it in a text editor and see for yourself.
To make the file easier to work with, you can just replace the carriage returns with line feeds:
/home/me/script.py | tr '\r' '\n'
If the process normally produces output right away, but not with this command, you can disable Python's output buffering.
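Putting the two together, the wrapper line might become something like this (a sketch, assuming the script runs under CPython, which honors PYTHONUNBUFFERED):
#!/bin/bash
# Disable Python's output buffering and turn each \r refresh into its own line.
PYTHONUNBUFFERED=1 /home/me/script.py 2>&1 | tr '\r' '\n' >> /home/me/logfile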
I'm looking for a way to monitor a file that is written to by a program on Linux. I found the tail -F command recommended here, along with less +FG. I tested it by running tail -F file in one terminal and this simple Python script:
import time

for i in range(20):
    print i
    time.sleep(0.5)
in another. I redirected the output to the file:
python script.py >> file
I expected tail to track the file contents and update the display at fixed intervals; instead, it only shows what was written to the file after the command terminates.
The same thing happens with less +FG, and also if I watch the output with cat. I've also tried the usual truncating redirect (> instead of >>); tail reports that the file was truncated, but still does not track it in real time.
Any idea why this doesn't work? (It's suggested here that it might be due to buffered writes, but since my script runs over 10 seconds, I suspect this might not be the cause)
Edit: In case it matters, I'm running Linux Mint 18.1
Python's standard output is buffered. If you only see all the output once the script is closed or finishes, that's definitely a buffering issue.
You can use this instead:
import time
import sys

for i in range(20):
    sys.stdout.write('%d\n' % i)
    sys.stdout.flush()
    time.sleep(0.5)
I've tested it, and it prints the values in real time. To overcome the buffering issue, I call .flush() after each .write() to force the buffer to be flushed.
Additional options from the comments:
Use the original print statement and call sys.stdout.flush() afterwards
Run the script with python -u for unbuffered binary stdout and stderr
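On Python 3.3 or later, print itself accepts a flush argument, so a minimal equivalent of the loop above is:
import time

for i in range(20):
    print(i, flush=True)  # flush=True pushes each line out immediately
    time.sleep(0.5)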
Regarding jon1467's answer (sorry, I can't comment on your answer), your understanding of redirection is wrong.
Try this :
dd if=/dev/urandom > test.txt
while watching the file size with:
ls -l test.txt
You'll see the file grow while dd is running.
Vinny's answer is correct: Python's standard output is buffered.
The most common way to avoid the "buffering effect" you noticed is to flush stdout after each write, as Vinny showed you.
You could also use the -u option to disable buffering for the whole Python process, or you could just reopen standard output with a buffer size of 0 as below (in Python 2, at least):
import os
import sys

# Reopen stdout unbuffered (buffer size 0); Python 2 only.
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
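Python 3 doesn't allow unbuffered text streams, but on 3.7+ a rough equivalent is to switch stdout to line buffering:
import sys

# Python 3.7+: flush stdout at every newline.
sys.stdout.reconfigure(line_buffering=True)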
The terminal automatically cut the output as it scrolled up with each iteration.
By default, the macOS Terminal keeps a limited number of lines in its scrollback buffer, say 1,000.
When a program's output crosses 1,000 lines, the earliest lines are dropped from memory, like a FIFO queue.
Basically, the answer to your question,
`Is there a way to store the output of the previously run command in a text file?`, is no. Sorry.
You can re-run the program and preserve the output by redirecting it to a file, or increase the number of lines in the scrollback buffer (maybe make it unlimited).
You can use less to go through the output:
your_command | less
Your Enter key will take you down.
Also, press q to exit.
Or you can redirect the output to a file.
In one terminal tab, run the program and redirect the output to an output.log file like this:
python program.py > output.log
In another tab, you can run tailf on the same log file and watch the output:
tailf output.log
To see the complete output open the log file in any text editor.
You can consider increasing the scrollback buffer.
Or
If you want to see the data on screen and also write it to a file, use tee, e.g.,
spark-shell | tee tmp.out
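If you want errors in the file as well, a common variant (in a Bourne-style shell) is:
# Merge stderr into stdout, display everything, and append to the file.
spark-shell 2>&1 | tee -a tmp.out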
I am working on extracting PDFs from SEC filings. They usually come like this:
SEC Filing Example
For whatever reason, when I save the raw PDF to a text file and then try to run
uudecode -o output_file.pdf input_file.txt
from Python's subprocess.call() function (or any other Python function that executes command-line commands), the PDF files that are generated are corrupted. If I run the same command directly from the command line, there is no corruption.
Taking a closer look at the PDF file produced by the Python script, it looks like the file ends prematurely. Is there some sort of output limit when executing a command-line command from Python?
Thanks!
This script worked fine for me running under Python 3.4.1 on Fedora 21 x86_64 with uudecode 4.15.2:
import subprocess
subprocess.call("uudecode -o output_file.pdf input_file.txt", shell=True)
Using the linked SEC filing (length: 173,141 B; sha1: e4f7fa2cbb3422411c2f2968d954d6bb9808b884), the decoded PDF (length: 124,557 B; sha1: 1676320e1d9923e14d19451c16688198bc93ca0d) appears correct when viewed.
There may be something else in your environment causing the problem. You may want to add additional details to your question.
Is there some sort of output limit when executing a command line command from python?
If by "output limit" you mean the size of the file being written by uudecode, then no. The only type of "output limit" you need to worry about when using the subprocess module is when you pass stdout=PIPE or stderr=PIPE when creating a child process. If the child process writes enough data to either of these streams, and your script does not regularly drain them, the child process will block (see the subprocess module documentation). In my test, uudecode wrote nothing to stdout or stderr.
Turns out it was an error with my C program. I changed my printf to print only a preset string, redirected it to a file, and the extra characters were still there. I still don't know why, though.
Hi, I'm writing a Python script to run analysis on a C program I'm parallelizing. Right now I have the number of processors and the number of iterations I want to pass to my C program in a separate file called tests. I'm extremely new to Python; here's the sample code I wrote to figure out how to write results to a file, which will eventually be a .csv file.
#!/usr/bin/env python
import subprocess

mpiProcess = "runmpi"
piProcess = "picalc"

tests = open("tests.txt")
analysis = open("analysis.txt", "w")

def runPiCalc(numProcs, numIterations):
    numProcs = str(numProcs)
    numIterations = str(numIterations)
    args = (mpiProcess, piProcess, numProcs, numIterations)
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    popen.wait()
    output = popen.stdout.read()
    return output

def runTest(testArgs):
    testProcs = testArgs[0]
    testIterations = testArgs[1]
    output = runPiCalc(testProcs, testIterations)
    appendResults(output)

def appendResults(results):
    print results
    analysis.write(results + '\n')

for testLine in tests:
    testArgs = testLine.split()
    runTest(testArgs)

tests.close()
analysis.close()
My problem right now is that when I "print results" to stdout, the output comes out as expected and I get 3.14blablablablawhatever. When I check the analysis.txt file, though, I get [H[2J (preceded by weirder characters encoded as ESC, which don't show up on the web) at the start of every line, before my pi calculation shows up. I can't figure out why that is. Why would file.write produce different output than print? Again, this is my first time with Python, so I'm probably just missing something easy.
This is on an Ubuntu server I'm SSHing to, by the way.
Here's the tests.txt and a picture of how the characters look on Linux.
The problem was that I had a bash script executing my C program. The bash script was inserting the weird characters before the program's output on its standard output (ESC[H ESC[2J is the ANSI clear-screen sequence, typically emitted by clear). Putting the command I was calling directly in the Python script, instead of calling the bash script, fixed the problem.
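In other words, the fix amounted to something like this (hypothetical, since the wrapper's actual contents aren't shown in the question; mpirun stands in for whatever runmpi invoked):
# Call the underlying MPI command directly instead of the bash wrapper
# ("runmpi") that was emitting the clear-screen escape codes.
args = ("mpirun", "-np", numProcs, piProcess, numIterations)
popen = subprocess.Popen(args, stdout=subprocess.PIPE)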