sys.stdin does not close on ctrl-d - python

I have the following code in program.py:
from sys import stdin
for line in stdin:
    print line
I run, enter lines, and then press Ctrl+D, but the program does not exit.
This does work:
$ printf "echo" | python program.py
Why does the program not exit when I press Ctrl+d?
I am using the Fedora 18 terminal.

Ctrl+D has a strange effect. It doesn't close the input stream, but only causes a C-level fread() to return an empty result. For regular files such a result means that the file is now at its end, but it's acceptable to read more, e.g. to check if someone else wrote more data to the file in the meantime.
In addition, there are issues of buffering --- three levels of them!
1. Python's iteration over a file uses an internal block buffer; avoid it when reading from interactive streams.
2. The C-level stdin stream is, by default, line-buffered.
3. The terminal itself(!), in its default mode ("cooked mode"), collects a whole line of input before sending it to the process, which explains why typing Ctrl+D has no effect when typed in the middle of a line.
This example avoids the first issue, which is all you need if all you want is detecting Ctrl+D typed as its own line:
import sys
while True:
    line = sys.stdin.readline()
    print repr(line)
You get every line with a final '\n', apart from when the "line" comes from a Ctrl+D, in which case you get just '' (but reading continues, unless of course we add if line == '': break).
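If you do want the loop to stop at that point, here is a minimal sketch of the same loop with only the break mentioned above added:
import sys
while True:
    line = sys.stdin.readline()
    if line == '':   # '' only appears at EOF (Ctrl+D on its own line)
        break
    print repr(line)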

Related

Pipe between python scripts

I have a simple script that reads values from a device and outputs them via print, and another script which listens on stdin and interprets each number. The device outputs one number each second. Surprisingly, piping the scripts together on my Ubuntu box does not work. However, if the first script is made not to read from the device but to generate random numbers as fast as it can, the second script successfully receives the data.
Below is a simplified example of my situation.
print.py:
#!/usr/bin/env python2
import time
import sys
while True:
    time.sleep(1)  # without this everything works
    print "42"
    sys.stdout.flush()
read.py:
#!/usr/bin/env python2
import sys
while True:
    for str in sys.stdin:
        print str
Command line invocation:
vorac@laptop:~/test$ ./print.py | ./read.py
Here is the end result (shown originally as a screenshot): the first script reads from the device and the second graphs the data in two separate time frames (what is shown are random numbers).
Ah, now that is a tricky problem. It happens because the iterator method for sys.stdin (which is xreadlines()) is buffered. In other words, when your loop implicitly calls next(sys.stdin) to get the next line of input, Python tries to read from the real under-the-hood standard input stream until its internal buffer is full, and only once the buffer is full does it proceed through the body of the loop. The buffer size is 8 kilobytes, so this takes a while.
You can see this by decreasing the time delay in the sleep() call to 0.001 or some such value, depending on the capabilities of your system. If you hit the time just right, you'll see nothing for a few seconds, and then a whole block of 42s come out all at once.
To fix it, use sys.stdin.readline(), which is unbuffered.
while True:
    line = sys.stdin.readline()
    print line
You might also want to strip off the trailing newline before printing, otherwise you'll get double line breaks: use line.rstrip('\n'), or the Python 2 trailing-comma form print line, to suppress the extra newline that print adds.
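As a sketch of an equivalent way to write the fixed loop (not from the original answer), the two-argument form of iter() with an empty-string sentinel also stops cleanly at EOF and avoids the read-ahead buffer:
import sys
# readline() is called once per iteration; the loop ends when it returns '' at EOF.
for line in iter(sys.stdin.readline, ''):
    print line.rstrip('\n')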
I changed your read.py and it worked for me :). You forgot to call .readline() on stdin.
import sys
while True:
    line = sys.stdin.readline()
    if line:
        print line.strip()
    else:
        continue
Output is:
$ python write.py | python read.py
42
42
42
42
42
42
42

How do I accept piped input and then user-prompted input in a Python script?

I have a script that is designed to accept input piped in from stdin and then prompt the user for more input. Here is a contrived example illustrating what I mean:
import sys
# Get input from stdin
input_nums = [int(n.strip()) for n in sys.stdin]
# Prompt user
mult = int(raw_input("Enter a number by which to multiply your input: "))
for num in input_nums:
    print num*mult
When I pipe data in from stdin, python interprets stdin as closed before it gets to raw_input and it gives an EOFError: EOF when reading a line:
[user]$ cat nums.txt
2
3
4
5
[user]$ cat nums.txt | python sample.py
Enter a number by which to multiply your input: Traceback (most recent call last):
File "sample.py", line 6, in <module>
mult = int(raw_input("Enter a number by which to multiply your input: "))
EOFError: EOF when reading a line
(Please don't worry about the useless use of cat... its just a minimal example)
What I want to know is if there is a way to somehow separate reading sys.stdin and calling raw_input so that I can both pipe in data and then prompt a user for input.
Updated to make it clearer what I really want by eliminating red herrings, and added the traceback of the EOFError.
Result: TimPeter's solution worked for me, but I had to change "CON:" to "/dev/tty" since I'm on Unix, not Windows.
I suspect you're out of luck, at least for any kind of cross-platform solution. Python uses sys.stdin for raw_input(), and if you invoke Python so that sys.stdin is on the receiving end of a pipe, Python can't do anything to magically change sys.stdin to the terminal when the piped input ends.
Here's a variant of the question with a Unix-specific workaround as the accepted answer. That cleverly worms around some (not all) of the problem by changing the way the program is invoked.
Sorry.
One way
This seems to work fine for Windows:
import sys
print len(sys.stdin.read()) # anything to consume piped input
sys.stdin = open("CON:", "r")
x = raw_input("sdfklj ")
That is, after reading the piped-in input, sys.stdin is rebound to the special file CON: (the Windows name for the console device), opened in read mode.
See your Unix docs for what to try there - perhaps /dev/tty1? There are mounds of terminal control options you may need to fiddle with too, depending on platform specifics. That's why I said (at the start) that I think you're out of luck for any cross-platform solution. Python has no special support for terminal devices; i.e., you're on your own for that.
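For the record, here is a sketch of the same idea on Unix using /dev/tty, which the asker reported working above; terminal settings may still need attention and this is not cross-platform:
import sys
# Consume whatever was piped in first.
piped_data = sys.stdin.read()
print len(piped_data)
# Rebind stdin to the controlling terminal so raw_input() can prompt interactively.
sys.stdin = open("/dev/tty", "r")
answer = raw_input("Enter a number by which to multiply your input: ")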

Allow Rsync to read file open by python process without python process failing

I am trying to set up a mail log parser that will pull specific lines out into another file, which will then get rsync'd to a remote server. The problem I am having is that when rsync reads the file being written, it seems to cause the parser to stop functioning. I believe this is because the parser is emulating a tail -f, as the maillog is being written to continuously.
So: how do I allow rsync to touch the file I'm writing with this code (result_file), while still allowing the code to follow the end of the maillog looking for new lines:
#! /usr/bin/python
import time, re, sys
result_file = open('/var/log/mrp_mail_parsed.log', 'a+')
def tail(logfile):
    logfile.seek(0, 2)
    while True:
        line = logfile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

if __name__ == '__main__':
    logfile = open('/var/log/maillog', 'r')
    logline = tail(logfile)
    for line in logline:
        match = re.search(r'.+postfix-mrp.+', line)
        if match:
            result_file.write(line)
            result_file.flush()
I don't know who's writing the file, or how, so I can't be sure, but I'd give better than even odds that your problem is this:
If the file isn't being appended to in-place, but is instead being rewritten, your code will stop tracking the file. To test this:
import sys
import time
def tail(logfile):
    logfile.seek(0, 2)
    while True:
        line = logfile.readline()
        if not line:
            time.sleep(0.1)
            continue
        yield line

with open(sys.argv[1]) as f:
    for line in tail(f):
        print(line.rstrip())
Now:
$ touch foo
$ python tailf.py foo &
$ echo "hey" >> foo
hey
$ echo "hey" > foo
To see what's happening better, try checking the inode and size via stat. As soon as the path refers to a different file than the one your script has open, your script is now watching a file that nobody else will ever touch again.
It's also possible that someone is truncating and rewriting the file in-place. This won't change the inode, but it will still mean that you won't read anything, because you're trying to read from a position past the end of the file.
I have no idea whether the file being rsync'd is causing this, or whether that's just a coincidence. Without knowing what rsync command you're running, or seeing whether the file is being replaced or the file is being truncated and rewritten when that command runs, all we can do is guess.
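As a rough illustration of the stat check suggested above, here is a sketch (the path and polling interval are placeholders, not from the question) that reports when the log is replaced or truncated:
import os
import time

path = '/var/log/maillog'
previous = os.stat(path)
while True:
    time.sleep(1)
    current = os.stat(path)
    if current.st_ino != previous.st_ino:
        print 'file was replaced (inode changed)'
    elif current.st_size < previous.st_size:
        print 'file was truncated and rewritten in place'
    previous = current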
I don't believe rsync is causing your problems: A separate process reading the file shouldn't affect the writer. You can easily test this by pausing rsync.
I'm guessing the problem is with Python's handling of file reads when you hit end of file. A crude way that's guaranteed to work is to remember the offset at the last EOF (using tell()). For each new read, reopen the file and seek to the remembered offset.
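A minimal sketch of that reopen-and-seek idea (assuming the log is only ever appended to; this code is mine, not the answerer's):
import time

def tail_reopen(path):
    offset = 0
    while True:
        # Reopen on every pass and skip everything already seen.
        with open(path) as f:
            f.seek(offset)
            for line in f:
                yield line
            offset = f.tell()   # remember where EOF was this time
        time.sleep(0.1)

# usage: for line in tail_reopen('/var/log/maillog'): ...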

sys.stdin.readlines() hangs Python script

Every time I execute my Python script, it appears to hang on this line:
lines = sys.stdin.readlines()
What should I do to fix/avoid this?
EDIT
Here's what I'm doing with lines:
lines = sys.stdin.readlines()
updates = [line.split() for line in lines]
EDIT 2
I'm running this script from a git hook, so is there any way around the EOF?
This depends a lot on what you are trying to accomplish. You might be able to do:
for line in sys.stdin:
    pass  # do something with line
Of course, with this idiom, as well as the readlines() method you are using, you need to somehow send the EOF character to your script so that it knows that the input is complete. (On Unix, Ctrl-D usually does the trick.)
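If the script also has to cope with being run with nothing piped in at all (which can happen from a hook), one option is to check sys.stdin.isatty() before reading; this check is my addition, not part of the answer above:
import sys

if sys.stdin.isatty():
    # Interactive terminal: don't block waiting for someone to type an EOF.
    lines = []
else:
    # Piped or redirected input: EOF arrives when the writer closes the stream.
    lines = sys.stdin.readlines()
updates = [line.split() for line in lines]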
Unless you are redirecting something to stdin, that would be expected behavior. That line says to read input from stdin (which would be the console you are running the script from), so it is waiting for your input.
See: "How to finish sys.stdin.readlines() input?"
If you're running the program in an interactive session, then this line causes Python to read from standard input (i.e. your keyboard) until you send the EOF character (Ctrl-D on Unix/Mac, Ctrl-Z on Windows).
>>> import sys
>>> a = sys.stdin.readlines()
Test
Test2
^Z
>>> a
['Test\n', 'Test2\n']
I know this isn't directly answering your question, as others have already addressed the EOF issue, but typically what I've found works best when reading live output from a long-lived subprocess or stdin is the while/if line approach:
while True:
    line = sys.stdin.readline()
    if not line:
        break
    process(line)
In this case, sys.stdin.readline() will return lines of text until an EOF is reached. Once EOF is given, an empty string is returned, which triggers the break from the loop. A hang can still occur here, as long as an EOF isn't provided.
It's worth noting that the ability to process "live output", while the subprocess/stdin is still running, requires the writing application to flush its output.
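For the subprocess case mentioned above, here is a sketch of the same readline loop run against a child process's pipe (the child command is just an example, reusing print.py from the earlier question):
import subprocess

# -u keeps the child's stdout unbuffered so lines arrive as they are printed.
proc = subprocess.Popen(['python', '-u', 'print.py'], stdout=subprocess.PIPE)
while True:
    line = proc.stdout.readline()
    if not line:        # '' means the child closed its stdout (EOF)
        break
    print line.rstrip('\n')
proc.wait()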

Python - Readline Control-D after non-empty line does not work why?

I am new to python, and I am sorry if what I am asking seems odd. I want to loop over each line on standard input and return a modified line to standard output immediately. I have code that works, mostly. However I do not know how to make this work completely.
I have the following code
import sys

while True:
    line = sys.stdin.readline()
    if not line:
        break
    sys.stdout.write(line)
When used interactively this will exit if there is an EOF on a new line; however, if there is text on the line before I type Control-D, I have to press it twice before readline() returns that line, and then once more before the loop will exit.
How do I fix this?
I think my answer from here can be copied immediately:
It has to do with what ^D really does: it just stops the current read(2) call.
If the program does int rdbytes = read(fd, buffer, sizeof buffer); and you press ^D in between, read() returns with the currently read bytes in the buffer, returning their number. The same happens on line termination; the \n at the end is always delivered.
So only a ^D at the start of a line, or one immediately following another ^D, has the desired effect of making read() return 0, signaling EOF.
And this behaviour, of course, affects Python code as well.
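To see that behaviour directly from Python, here is a small sketch using os.read on file descriptor 0 (this demo is mine, not part of the quoted answer); it prints exactly what each underlying read delivers:
import os

while True:
    # Ctrl+D flushes whatever was typed so far, so chunk may be a partial
    # line without a trailing '\n'.
    chunk = os.read(0, 1024)
    if chunk == '':      # zero bytes only when Ctrl+D starts a line (EOF)
        break
    print repr(chunk)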
A strategy suggested in the python docs is:
for line in sys.stdin:
    sys.stdout.write(line)
See the IO Tutorial.
