As an example, I have two scripts, say script1.py:
f = open("output1.txt", "w")
count = 1
for i in range(100):
    f.write(str(count) + "\n")
    print(str(count))
    count += 1
f.close()
This script prints numbers from 1 to 100 to a file and to standard output.
Then I have a second script, say script2.py:
import sys
import time

f1 = open("output2.txt", "w")
for line in sys.stdin:
    if len(line) > 0:
        print(line.strip())
        time.sleep(0.05)
        f1.write(line.strip() + "\n")
f1.close()
which reads data from standard input and writes it to a file. I added a time.sleep() call to ensure the second script consumes data at a far lower rate than the first one produces it.
I run the scripts from the command line as
python3 script1.py | python3 script2.py
thus redirecting the standard output of the first script (i.e., its print() calls) to the standard input of the second.
It works roughly as expected: two files are generated, each containing the numbers from 1 to 100.
I am nevertheless wondering how the data transfer from the first script to the second works.
The first script generates data at a faster rate. Where is this data stored while it waits for the second script to consume it?
Is there some sort of buffer in place between the two processes? Or something else?
Is Python responsible for this, or the OS?
Is the buffer limited in size? Can it be controlled programmatically (e.g., accessed so that data can be directed to another target as well)?
Thanks a bunch
It is because of the pipe "|"; more info here: https://ss64.com/nt/syntax-redirection.html
commandA | commandB Pipe the output from commandA into commandB
So the prints from your script1 are sent to your script2.
My guess on how it works is that every print is saved in memory as a big string and then fed (as text) to the second script; that's why sys.stdin works.
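More precisely, the buffer lives in the OS, not in Python: the kernel gives every pipe a fixed-size buffer (64 KiB by default on modern Linux) and simply blocks the writing process whenever that buffer is full, until the reader has drained some of it. That is where the data sits between the two scripts. A minimal, Linux-specific sketch (the F_GETPIPE_SZ/F_SETPIPE_SZ constants require Python 3.10+):

import fcntl
import os

r, w = os.pipe()         # the same kernel object the shell creates for "|"
os.write(w, b"hello\n")  # the data now sits in the kernel's pipe buffer
print(os.read(r, 1024))  # b'hello\n'

# Query and resize the kernel buffer (Linux only, Python 3.10+):
print(fcntl.fcntl(w, fcntl.F_GETPIPE_SZ))        # typically 65536
fcntl.fcntl(w, fcntl.F_SETPIPE_SZ, 1024 * 1024)  # request a 1 MiB buffer

So in the example above, script1 would only be suspended if it ever got more than about 64 KiB ahead of script2; a hundred short lines fit in the buffer with room to spare, which is why script1 can finish long before script2 does.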
I can successfully redirect my output to a file; however, this appears to overwrite the file's existing data:
import subprocess

outfile = open('test', 'w')  # same with "w" or "a" as opening mode
outfile.write('Hello')
subprocess.Popen('ls', stdout=outfile)
will remove the 'Hello' line from the file.
I guess a workaround is to store the output elsewhere as a string or something (it won't be too long) and append it manually with outfile.write(thestring) - but I was wondering if I am missing something within the module that facilitates this.
You sure can append the output of subprocess.Popen to a file, and I make daily use of it. Here's how I do it:
log = open('some file.txt', 'a') # so that data written to it will be appended
c = subprocess.Popen(['dir', '/p'], stdout=log, stderr=log, shell=True)
(of course, this is a dummy example, I'm not using subprocess to list files...)
By the way, other file-like objects (with a write() method in particular) could replace this log object, so you can buffer the output and do whatever you want with it (write to file, display, etc.) [but this seems not so easy, see my comment below].
Note: what may be misleading is the fact that subprocess, for some reason I don't understand, writes its output before what you want to write. So, here's the way to use this:
log = open('some file.txt', 'a')
log.write('some text, as header of the file\n')
log.flush() # <-- here's something not to forget!
c = subprocess.Popen(['dir', '/p'], stdout=log, stderr=log, shell=True)
So the hint is: do not forget to flush the output!
Well, the problem is that if you want the header to actually be a header, you need to flush before the rest of the output is written to the file :D
Is the data in the file really overwritten? On my Linux host I see the following behavior:
1) running your code in a separate directory gives:
$ cat test
test
test.py
test.py~
Hello
2) if I add outfile.flush() after outfile.write('Hello'), the result is slightly different:
$ cat test
Hello
test
test.py
test.py~
But the output file has Hello in both cases. Without an explicit flush() call, the file object's buffer is flushed when the Python process terminates.
Where is the problem?
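What is actually happening: 'Hello' sits in the Python process's own buffer while the child writes straight to the shared file descriptor, so whichever side flushes first lands first in the file - nothing is truly overwritten. A minimal sketch of the ordering (Python 3, with echo standing in for the real child process):

import subprocess

outfile = open('test', 'w')
outfile.write('Hello\n')   # stays in Python's userspace buffer for now
subprocess.run(['echo', 'child output'], stdout=outfile)  # the child writes directly to the fd
outfile.close()            # only now is 'Hello\n' flushed, so it lands after the child's output

Running cat test afterwards shows "child output" first and "Hello" second, matching the behavior observed above.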
Despite my obviously beginner Python skills, I've got a script that pulls a line of data from a 2,000-row CSV file, reads key parameters, and outputs a buffer CSV file organized as an N-by-2 rectangle, and it uses the subprocess module to call the external program POVCALLC.EXE, which takes a CSV file organized that way as input. The relevant portion of the code is shown below. I THINK that subprocess or one of its methods should allow me to interact with the external program, but am not quite sure how - or indeed whether this is the module I need.
In particular, when POVCALLC.EXE starts, it first asks for the input file, which in this case is buffer.csv. It then asks for several additional parameters, including the name of an output file, which come from outside the snippet below. It then starts computing results and then asks for further user input, including several carriage returns. Obviously, I would prefer to automate this interaction for all 2,000 rows in the original CSV.
Am I on the right track with subprocess, or should I be looking elsewhere to automate this interaction with the external executable?
Many thanks in advance!
import csv
import subprocess

# Begin inner loop to fetch Lorenz curve data for each survey
for i in range(int(L_points_number)):
    index = 3 * i
    line = []
    P = L_points[index]
    line.append(P)
    L = L_points[index + 1]
    line.append(L)
    with open('buffer.csv', 'a', newline='') as buffer:
        writer = csv.writer(buffer, delimiter=',')
        P = 1
        line.append(P)
        L = 1
        line.append(L)
        writer.writerow(line)
    subprocess.call('povcallc.exe')
    # TODO: CALL povcallc and compute results
    # TODO: USE Regex to interpret results and append them to
    # the output file
If your program expects these arguments on the standard input (e.g. after running POVCALLC you type csv filenames into the console), you could use subprocess.Popen() [see https://docs.python.org/3/library/subprocess.html#subprocess.Popen ] with stdin redirection (stdin=PIPE), and use the returned object to send data to stdin.
It would look something like this:
my_proc = subprocess.Popen('povcallc.exe', stdin=subprocess.PIPE, text=True)  # text=True so stdin accepts str
my_proc.communicate(input="my_filename_which_is_expected_by_the_program.csv\n")  # trailing \n simulates pressing Enter
You can also use the tuple returned by communicate to automatically check the program's stdout and stderr (see the link to the docs for more).
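If the program asks several questions in sequence, you can usually answer them all in one shot by joining the responses with newlines. A hedged sketch, assuming (hypothetically) that POVCALLC prompts for the input file, then an output file name, then two bare carriage returns:

import subprocess

# Hypothetical answer sequence - adjust to the order POVCALLC actually asks:
answers = "buffer.csv\nresults.csv\n\n\n"

proc = subprocess.Popen(
    ['povcallc.exe'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,  # exchange str rather than bytes with the child
)
out, err = proc.communicate(input=answers)
print(out)  # everything the program printed while running

Note this only works if POVCALLC reads its answers from standard input rather than directly from the terminal; if it insists on a real terminal, a library like pexpect is the usual fallback.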
I have a simple script that reads values from a device and outputs them via print, and another script, which listens on stdin and interprets each number. The device outputs one number each second. Surprisingly, piping the scripts on my Ubuntu box does not work. However, if the first script is made not to read from the device but to generate random numbers as fast as it can, the second script successfully receives the data.
Below is a simplified example of my situation.
print.py:
#!/usr/bin/env python2
import time
import sys

while True:
    time.sleep(1)  # without this everything works
    print "42"
    sys.stdout.flush()
read.py:
#!/usr/bin/env python2
import sys

while True:
    for line in sys.stdin:
        print line
Command line invocation:
vorac@laptop:~/test$ ./print.py | ./read.py
Here is the end result. The first script reads from the device and the second graphs the data in two separate time frames (what is shown are random numbers).
Ah, now that is a tricky problem. It happens because the iterator method for sys.stdin (which is xreadlines()) is buffered. In other words, when your loop implicitly calls next(sys.stdin) to get the next line of input, Python tries to read from the real under-the-hood standard input stream until its internal buffer is full, and only once the buffer is full does it proceed through the body of the loop. The buffer size is 8 kilobytes, so this takes a while.
You can see this by decreasing the time delay in the sleep() call to 0.001 or some such value, depending on the capabilities of your system. If you hit the time just right, you'll see nothing for a few seconds, and then a whole block of 42s come out all at once.
To fix it, use sys.stdin.readline(), which is unbuffered.
while True:
    line = sys.stdin.readline()
    print line
You might also want to strip off the trailing newline before printing it; otherwise you'll get double line breaks. Use line.rstrip('\n'), or print with a trailing comma (print line,), to suppress the extra newline that gets printed.
I changed your read.py and it worked for me :) - you forgot to call .readline() on stdin.
import sys

while True:
    line = sys.stdin.readline()
    if line:
        print line.strip()
    else:
        continue
The output is:
$ python write.py | python read.py
42
42
42
42
42
42
42
Turns out it was an error with my C program. I changed my printf to print only a preset string and redirected it to a file, and the extra characters were still there. I still don't know why, though.
Hi, I'm writing a Python script to run analysis on a C program I'm parallelizing. Right now I have the number of processors to use and the iterations I want to pass to my C program in a separate file called tests. I'm extremely new to Python; here's the sample code I wrote to figure out how to write results to a file, which will eventually be a .csv file.
#!/usr/bin/env python
import subprocess

mpiProcess = "runmpi"
piProcess = "picalc"
tests = open("tests.txt")
analysis = open("analysis.txt", "w")

def runPiCalc(numProcs, numIterations):
    numProcs = str(numProcs)
    numIterations = str(numIterations)
    args = (mpiProcess, piProcess, numProcs, numIterations)
    popen = subprocess.Popen(args, stdout=subprocess.PIPE)
    popen.wait()
    output = popen.stdout.read()
    return output

def runTest(testArgs):
    testProcs = testArgs[0]
    testIterations = testArgs[1]
    output = runPiCalc(testProcs, testIterations)
    appendResults(output)

def appendResults(results):
    print results
    analysis.write(results + '\n')

for testLine in tests:
    testArgs = testLine.split()
    runTest(testArgs)

tests.close()
analysis.close()
My problem right now is that when I "print results" to stdout, the output comes out as expected and I get 3.14blablablablawhatever. When I check the analysis.txt file, though, I get [H[2J (preceded by ESC characters that don't show up on the web) at the start of every line before my pi calculation shows up. I can't figure out why that is. Why would file.write produce different output than print? Again, this is my first time with Python, so I'm probably just missing something easy.
This is on an Ubuntu server I'm sshing to, btw.
Here's the tests.txt file and a picture of how the characters look on Linux.
The problem was that I had a bash script executing my C program. The bash script was inserting the weird characters before the program output and adding them to its standard output. Calling the command directly from the Python script instead of going through the bash script fixed the problem.
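For reference, those bytes are almost certainly the ANSI escape sequences ESC[H (cursor home) and ESC[2J (clear screen), which is exactly what a clear command in a shell script emits. If you ever cannot fix the producing side, a small sketch that strips such sequences before writing (the regex covers common CSI sequences, not every possible escape code):

import re

# Matches CSI escape sequences such as ESC[H and ESC[2J
ANSI_ESCAPE = re.compile(r'\x1b\[[0-9;]*[A-Za-z]')

def strip_ansi(text):
    return ANSI_ESCAPE.sub('', text)

# e.g. analysis.write(strip_ansi(results) + '\n')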
I'm using a Raspberry Pi with Raspbian (Debian Wheezy, Jan 2014) and Python 3.
I'm starting a Python script from rc.local that captures keyboard input and writes it to a file, without logging in.
If the file that the script is writing to has not been created yet, the first keyboard input registers on the screen but isn't written to the file. All subsequent writes work fine.
My code works fine when I run it from the command line as a logged-in user: the first line is written to the new file as expected.
EDITED CODE FROM MIDNIGHTER
#!/usr/bin/env python3.2
import sys
from datetime import datetime

def main():
    f = open('/home/pi/cards.csv', 'r')
    sim = f.read()
    sim = sim.split('\n')
    simSet = set(sim)
    while True:
        try:
            log = open('logs', 'a')
            puk = input()  # text input, i.e., always a string
            included = "true" if puk in simSet else "false"
            print(included, puk)
            log.write("{included: %s, time: %s, number: %s}, \n" % (included, datetime.now(), puk))
            log.close()
        except ValueError:
            log.close()

main()
And the rc.local:
sudo python3 /home/pi/rf1
I'm just learning this; please excuse the poor execution.
SOLUTION
I realise now I left out an important detail about a cron job closing and copying the file that was being written to.
I found my answer here: What exactly is Python's file.flush() doing?
Instead of file.close() I used file.flush(), and it works.
Code below:
#!/usr/bin/env python3.2
import sys
from datetime import datetime

def main():
    f = open('/home/pi/cards.csv', 'r')
    sim = f.read()
    sim = sim.split('\n')
    simSet = set(sim)
    log = open('logs', 'a')
    while True:
        try:
            puk = input()  # text input, i.e., always a string
            included = "true" if puk in simSet else "false"
            print(included, puk)
            log.write("{included: %s, time: %s, number: %s}, \n" % (included, datetime.now(), puk))
            log.flush()
        except ValueError:
            log.flush()

main()
The problem was that I was running a cron job that copied the data to another file, which meant it was accessing the file being written to by the Python program.
The first write after this was not saved to the file, as the file was being accessed by another program.
These paragraphs seem to be what was happening:
https://stackoverflow.com/a/7127162/1441620
The first, flush, will simply write out any data that lingers in a
program buffer to the actual file. Typically this means that the data
will be copied from the program buffer to the operating system buffer.
Specifically what this means is that if another process has that same
file open for reading, it will be able to access the data you just
flushed to the file. However, it does not necessarily mean it has been
"permanently" stored on disk.
I think @Midnighter also suggested that using a with statement to open and close the file would also have solved it; there's a sketch of that variant below.
Updated code is in the question > solution.
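For completeness, a sketch of what that with-based variant of the loop body might look like (simSet here is a hypothetical stand-in for the set read from cards.csv):

from datetime import datetime

simSet = {'1234', '5678'}  # hypothetical card numbers; the real set comes from cards.csv

while True:
    puk = input()
    included = "true" if puk in simSet else "false"
    print(included, puk)
    with open('logs', 'a') as log:  # the with block guarantees a flush and close per entry
        log.write("{included: %s, time: %s, number: %s}, \n" % (included, datetime.now(), puk))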