How to diff file and output stream "on-the-fly"? - python

I need to create a diff file using the standard UNIX diff command with the Python subprocess module. The problem is that I must compare a file and a stream without creating a temporary file. I thought about using named pipes via os.mkfifo, but didn't get any good results. Could you give a simple example of how to solve this? I tried this:
fifo = 'pipe'
os.mkfifo(fifo)
op = popen('cat ', fifo)
print >> open(fifo, 'w'), output
os.unlink(fifo)
proc = Popen(['diff', '-u', dumpfile], stdin=op, stdout=PIPE)
but it seems like diff doesn't see the second argument.

You can use "-" as an argument to diff to mean stdin.
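As a minimal sketch of that idea from Python (the function name diff_file_against_text is made up for illustration, and it assumes a diff executable is on the PATH), the stream can be handed to diff through stdin:

```python
import subprocess


def diff_file_against_text(path, text):
    """Return the unified diff between the file at `path` and `text`.

    `text` is fed to diff on stdin; the "-" argument tells diff to read
    its second input from there, so no temporary file is needed.
    """
    proc = subprocess.run(
        ["diff", "-u", path, "-"],
        input=text,
        stdout=subprocess.PIPE,
        text=True,
    )
    # diff exits with 0 (identical), 1 (differences found) or 2 (trouble),
    # so only 2 and above is treated as an error here.
    if proc.returncode > 1:
        raise RuntimeError(f"diff failed with exit code {proc.returncode}")
    return proc.stdout
```

Note that check=True is deliberately not used: diff returns exit code 1 whenever the inputs differ, which is the normal case here.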

You could consider using Python's difflib module and generating the diff directly rather than relying on the diff command. Functions such as difflib.unified_diff and difflib.context_diff take sequences of lines (for example, the result of str.splitlines) and produce diffs in various formats.
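A rough sketch of the difflib route, assuming both sides fit in memory as strings (the helper name and the default labels "dumpfile" and "output" are only illustrative):

```python
import difflib


def unified_diff_text(old_text, new_text, old_name="dumpfile", new_name="output"):
    """Return a unified diff of two strings, similar to `diff -u`."""
    # unified_diff works on sequences of lines; keepends=True preserves the
    # newlines so the pieces can simply be joined back together.
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile=old_name, tofile=new_name)
    )
```

This avoids the external process entirely, at the cost of holding both inputs in memory.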
Alternatively, you can construct a shell pipeline and use process substitution (a bash feature, so the command has to run under bash rather than plain sh), like so:
diff <(cat pipe) dumpfile # You compare the output of a process and a physical file without explicitly using a temporary file.
For details, check out http://tldp.org/LDP/abs/html/process-sub.html

Related

How do I read from the terminal in Python? [duplicate]

I just want to use os.system("dir") and also be able to save the text outputted to a variable. I tried using sys.stdout.read() but running sys.stdout.readable() returns False. Do you know how I can read from the terminal?
Using the os library:
info = os.popen('dir').read()
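You can use the subprocess.check_output function. Note that dir is a shell builtin on Windows, so it needs shell=True there.
Example:
import subprocess as sp
stdout = sp.check_output("dir", shell=True, text=True)
print(stdout)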
There is a bit of confusion here about the different streams, and possibly a better way to do things.
For the specific case of dir, you can replace the functionality you want with the os.listdir function, or better yet os.scandir.
For the more general case, you will not be able to read from an arbitrary stdout stream, since stdout is write-only from your program's side. If you want the output, you'll have to set up a subprocess whose I/O streams you control. This is not much more complicated than using os.system. You can use subprocess.run, for example:
content = subprocess.run("dir", shell=True, stdout=subprocess.PIPE, check=True).stdout
The object returned by run has a stdout attribute holding the captured output (as bytes, unless you pass text=True).
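To make the capture step concrete, here is a small sketch. The helper name capture_output and the child command are stand-ins chosen so that it runs on any platform; they are not part of the question.

```python
import subprocess
import sys


def capture_output(cmd):
    """Run `cmd` (a list of arguments) and return what it wrote to stdout as text."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout


# A child Python process stands in for the real command here.
listing = capture_output([sys.executable, "-c", "print('file_a.txt')"])
print(listing)
```

For a shell builtin such as dir on Windows, the command would instead be passed as a single string together with shell=True.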
If you just want to read user input from the terminal, use
x = input()
This reads one line from the terminal. x is a string, but you can convert it, say to int, like so:
x = int(x)

Pass file as input to a program and store its output using the sh library in python

I'm confused on how exactly we should use the python sh library, specifically the sh.Command(). Basically, I wish to pass input_file_a to program_b.py and store its output in a different directory as output_file_b, how should I achieve this using the sh library in python?
If you mean input and output redirection, see the sh documentation sections on redirecting stdin and stdout. In particular, to "redirect" stdin it looks like you need to pass the actual data as an argument (e.g. read it beforehand). The following should work according to the documentation (untested, as I don't work with sh; please let me know if it works for you or what needs fixing):
import sh
python3 = sh.Command("python3")
with open(input_file_a, 'r') as ifile:
    python3("program_b.py", _in=ifile.read(), _out=output_file_b)
Note that you may need to pass the search_paths argument to sh.Command so that it can find python3. You may also need to give the full path to program_b.py, or call os.chdir() first.

Capturing output file of ffmpeg with vid.stab in python into a variable

I'm trying to write a python script to stabilize videos using ffmpeg and the vid.stab library.
My problem is that the output file doesn't seem to go through stdout, so using subprocess.Popen() returns an empty variable.
cmd1=["ffmpeg", "-i","./input.MOV", "-vf", "vidstabdetect=stepsize=6:shakiness=10:accuracy=15", "-f","null","pipe:1"]
p = subprocess.Popen(cmd1, stdout=subprocess.PIPE)
vectors, err = p.communicate()
The issue is that vidstabdetect takes an option called result and writes a file to whatever path is specified there, so stdout remains empty. (If no result is specified, it defaults to transforms.trf.)
Is there a way to get the contents of the result file?
When running the script with the code above it executes without error, but the file is created with the default name and the variable remains empty.
You need to send the filter's transform data to stdout, not the transcoded output from ffmpeg, which is what your current -f null pipe:1 does.
However, the vidstabdetect filter uses the POSIX fopen to open the destination for the transform data, unlike most other filters, which use the internal avio_open. fopen does not accept pipe:1, so you have to give a path that maps to stdout: CON on Windows, or /dev/stdout on Linux, as you confirmed.
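Put together, this could look roughly like the following sketch. It is untested against a real ffmpeg build and assumes a Linux system where /dev/stdout is available; the function names are made up for illustration, and the filter settings are copied from the question.

```python
import subprocess


def build_vidstabdetect_cmd(input_path):
    """Build an ffmpeg command that writes the vidstabdetect data to stdout."""
    # Assumption (Linux only): /dev/stdout is a real path that fopen can open.
    vf = "vidstabdetect=stepsize=6:shakiness=10:accuracy=15:result=/dev/stdout"
    # "-f null -" discards the decoded video, so only the filter's
    # transform data ends up on stdout.
    return ["ffmpeg", "-i", input_path, "-vf", vf, "-f", "null", "-"]


def detect_transforms(input_path):
    """Run the detection pass and return the transform data as bytes."""
    proc = subprocess.run(
        build_vidstabdetect_cmd(input_path),
        stdout=subprocess.PIPE,
        check=True,
    )
    return proc.stdout
```

ffmpeg writes its own log messages to stderr, so they should not get mixed into the captured data.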

subprocess, Popen, and stdin: Seeking practical advice on automating user input to .exe

Despite my obviously beginning Python skills, I’ve got a script that pulls a line of data from a 2,000-row CSV file, reads key parameters, and outputs a buffer CSV file organized as an N-by-2 rectangle, and uses the subprocess module to call the external program POVCALLC.EXE, which takes a CSV file organized that way as input. The relevant portion of the code is shown below. I THINK that subprocess or one of its methods should allow me to interact with the external program, but am not quite sure how - or indeed whether this is the module I need.
In particular, when POVCALLC.EXE starts it first asks for the input file, which in this case is buffer.csv. It then asks for several additional parameters, including the name of an output file, which come from outside the snippet below. It then starts computing results, and then asks for further user input, including several carriage returns. Obviously, I would prefer to automate this interaction for the 2,000 rows in the original CSV.
Am I on the right track with subprocess, or should I be looking elsewhere to automate this interaction with the external executable?
Many thanks in advance!
# Begin inner loop to fetch Lorenz curve data for each survey
for i in range(int(L_points_number)):
    index = 3 * i
    line = []
    P = L_points[index]
    line.append(P)
    L = L_points[index + 1]
    line.append(L)
    with open('buffer.csv', 'a', newline='') as buffer:
        writer = csv.writer(buffer, delimiter=',')
        P=1
        line.append(P)
        L=1
        line.append(L)
        writer.writerow(line)
subprocess.call('povcallc.exe')
# TODO: CALL povcallc and compute results
# TODO: USE Regex to interpret results and append them to
# output file
If your program expects these arguments on the standard input (e.g. after running POVCALLC you type csv filenames into the console), you could use subprocess.Popen() [see https://docs.python.org/3/library/subprocess.html#subprocess.Popen ] with stdin redirection (stdin=PIPE), and use the returned object to send data to stdin.
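It would look something like this (text=True is needed in Python 3 so that communicate accepts a string, and the trailing newline plays the role of pressing Enter):
my_proc = subprocess.Popen('povcallc.exe', stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
out, err = my_proc.communicate(input="my_filename_which_is_expected_by_the_program.csv\n")
communicate returns a (stdout, stderr) tuple, which you can use to check the program's output, provided stdout and stderr were also set to PIPE (see the link to the docs for more).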
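For a program that asks several questions in a row, all the answers can be sent in one go. The sketch below uses a small Python child process as a stand-in for the interactive executable. The helper name run_with_answers and the file names are hypothetical, and the approach only works if the program really reads its answers from standard input rather than directly from the console.

```python
import subprocess
import sys


def run_with_answers(cmd, answers):
    """Start `cmd`, type each answer followed by Enter, and return (stdout, stderr)."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    # Each prompt is answered with one line; an empty string is a bare Enter.
    typed = "".join(answer + "\n" for answer in answers)
    return proc.communicate(input=typed)


# Stand-in for the interactive program: it reads two lines and echoes them.
fake_program = [sys.executable, "-c", "a = input(); b = input(); print(a, b)"]
out, err = run_with_answers(fake_program, ["buffer.csv", "results.csv"])
print(out)
```

Because communicate sends everything at once and then waits for the process to exit, this fits batch-style prompts; it is not suited to cases where later answers depend on earlier output.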

Running a command line from python and piping arguments from memory

I was wondering if there is a way to run a command-line executable from Python but pass it the argument values from memory, without having to write the in-memory data to a temporary file on disk. From what I have seen, it seems that subprocess.Popen(args) is the preferred way to run programs from inside Python scripts.
For example, I have a pdf file in memory. I want to convert it to text using the commandline function pdftotext which is present in most linux distros. But I would prefer not to write the in-memory pdf file to a temporary file on disk.
pdfInMemory = myPdfReader.read()
convertedText = subprocess.<method>(['pdftotext', ??]) <- what is the value of ??
What method should I call, and how do I pipe in-memory data into its first input and pipe its output back into another variable in memory?
I am guessing there are other pdf modules that can do the conversion in memory and information about those modules would be helpful. But for future reference, I am also interested about how to pipe input and output to the commandline from inside python.
Any help would be much appreciated.
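With Popen.communicate (stdin must be a pipe as well, otherwise the data passed to communicate is never sent):
import subprocess
out, err = subprocess.Popen(["pdftotext", "-", "-"], stdin=subprocess.PIPE, stdout=subprocess.PIPE).communicate(pdf_data)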
tempfile.TemporaryFile (os.tmpfile in Python 2) is useful if you need a seekable thing. It uses a file, but it's nearly as simple as the pipe approach, with no need for cleanup.
import tempfile
tf = tempfile.TemporaryFile()
tf.write(...)
tf.seek(0)
subprocess.Popen( ... , stdin = tf)
This may not work on Posix-impaired OS 'Windows'.
Popen.communicate from subprocess takes an input parameter that is used to send data to stdin, you can use that to input your data. You also get the output of your program from communicate, so you don't have to write it into a file.
The documentation for communicate explicitly warns that everything is buffered in memory, which seems to be exactly what you want to achieve.
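As a generic sketch of that round trip, the helper below pushes bytes through a command and collects its output, all in memory. The name pipe_through is made up, and a tiny Python child stands in for the real converter so the example runs anywhere. For the question above the command would be ["pdftotext", "-", "-"], assuming the installed pdftotext accepts "-" for stdin.

```python
import subprocess
import sys


def pipe_through(cmd, data):
    """Send `data` (bytes) to the stdin of `cmd` and return its stdout as bytes."""
    # subprocess.run with input= creates the stdin pipe and calls
    # communicate internally, so input and output are both buffered in memory.
    result = subprocess.run(cmd, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


# Stand-in child process that upper-cases whatever arrives on stdin.
upper = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
converted = pipe_through(upper, b"in-memory data")
print(converted)
```

Since everything is held in memory, this suits inputs and outputs of moderate size; for very large data a temporary file may be the safer choice.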
