How to read the most recent line from stdin in Python

Is there a way to read only the current data from stdin?
I would like to pipe some never-ending input data (from a mouse like device) into a python script and grab only the most recent line of data.
The input x,y data looks like this and arrives at 600 lines per second:
0.123,0.123
0.244,0.566
etc.
So far I have tried something like this:
import sys, time
while 1:
    data = sys.stdin.readline()
    my_slow_function(data)
Python seems to buffer the data so nothing is skipped. I would like to skip everything except the current line.

Just spin up a separate thread to read stdin into a global variable. Make it a daemon thread so that you don't have to close it later on. The thread reads the data as it arrives and keeps discarding the old stuff. Have your regular program read last_line when it wants to.
I added an event so that the regular program can wait when no new data is available. If that's not what you want, take it out.
import sys
import threading

last_line = ''
new_line_event = threading.Event()

def keep_last_line():
    global last_line, new_line_event
    for line in sys.stdin:
        last_line = line
        new_line_event.set()

keep_last_line_thread = threading.Thread(target=keep_last_line)
keep_last_line_thread.daemon = True
keep_last_line_thread.start()
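The consuming side is not shown above, so here is a minimal Python 3 sketch of the same idea wrapped in a class. The name LatestLineReader and its get() method are made up for illustration; the lock and the event clearing are additions, assuming you want each call to return only data newer than the previous call.

```python
import threading


class LatestLineReader:
    """Keep only the newest line from a stream (hypothetical helper)."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()
        self._latest = None
        self._fresh = threading.Event()
        # Daemon thread: it will not keep the program alive at exit.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Overwrite the stored line as fast as input arrives.
        for line in self._stream:
            with self._lock:
                self._latest = line
                self._fresh.set()

    def get(self, timeout=None):
        """Return the newest unseen line, or None if none arrives in time."""
        if not self._fresh.wait(timeout):
            return None
        with self._lock:
            self._fresh.clear()
            return self._latest
```

A main loop would then create LatestLineReader(sys.stdin), call start(), and pass the result of get() to my_slow_function whenever it is not None; every line that arrived while the slow function was running is simply dropped.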

Keep the current line, only act on the last line.
import sys
buffer = None
for line in sys.stdin:
    buffer = line
my_slow_function(buffer)

Related

Pipe between python scripts

I have a simple script that reads values from a device and outputs them via print, and another script, which listens on stdin and interprets each number. The device outputs one number each second. Surprisingly, piping the scripts on my ubuntu box does not work. However, if the first script is made not to read from the device but generate random numbers as fast as it can, the second script successfully receives the data.
Below is a simplified example of my situation.
print.py:
#!/usr/bin/env python2
import time
import sys
while True:
    time.sleep(1) # without this everything works
    print "42"
    sys.stdout.flush()
read.py:
#!/usr/bin/env python2
import sys
while True:
    for str in sys.stdin:
        print str
Command line invocation:
vorac@laptop:~/test$ ./print.py | ./read.py
Here is the end result. The first script reads from the device and the second graphs the data in two separate time frames (what is shown are random numbers).
Ah, now that is a tricky problem. It happens because the iterator method for sys.stdin (which is xreadlines()) is buffered. In other words, when your loop implicitly calls next(sys.stdin) to get the next line of input, Python tries to read from the real under-the-hood standard input stream until its internal buffer is full, and only once the buffer is full does it proceed through the body of the loop. The buffer size is 8 kilobytes, so this takes a while.
You can see this by decreasing the time delay in the sleep() call to 0.001 or some such value, depending on the capabilities of your system. If you hit the time just right, you'll see nothing for a few seconds, and then a whole block of 42s come out all at once.
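To fix it, use sys.stdin.readline(), which does not go through the iterator's read-ahead buffer and returns each line as soon as it arrives.
while True:
    line = sys.stdin.readline()
    if not line:  # readline() returns '' at end of input
        break
    print line
You might also want to strip off the trailing newline before printing it, otherwise you'll get double line breaks. Use line.rstrip('\n'), or write print line, (with a trailing comma) to suppress the extra newline that print adds.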
I changed your read.py and it worked for me :), you forgot to call .readline() on stdin.
import sys
while True:
    line = sys.stdin.readline()
    if line:
        print line.strip()
    else:
        break  # readline() returns '' only at end of input
Output is :
$ python write.py | python read.py
42
42
42
42
42
42
42

Identifying end of stream while reading from stdin

I am reading input from sys.stdin in python and I need to perform some extra operations when the last line is encountered. How can I identify if the current line being executed is the last one?
for line in sys.stdin:
    if <line is last line>:
        # do some extra operation
    else:
        # rest of stuff
The only way to know that you're at the end of the stream is to try to read from it and find there's nothing there. Code placed after the for loop runs once the end of the stream has been reached.
If you need to detect end-of-stream in the input stream, before you've finished with the previous record, then your logic can't use a "for"-loop. Instead, you must use "while." You must pre-read the first record, then, "while" the latest-thing-read isn't empty, you must read the next record and then process the current one. Only in this way can you know, before processing the current record, that there will be no records following it.
Before starting the loop, read the first line of the input. Then, in the loop, always process the line previously read. After the loop terminates, you'll still have the last line from sys.stdin in line_prev for your special processing.
import sys
line_prev = sys.stdin.readline()
for line in sys.stdin:
    rest_of_stuff(line_prev)
    line_prev = line
# line_prev now holds the last line, even if the input had only one line
do_some_extra_operation(line_prev)
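The same read-ahead idea can be packaged as a generator, so the loop body learns whether a line is the last one while processing it. This is a minimal Python 3 sketch; the name with_last_flag is made up for illustration.

```python
def with_last_flag(lines):
    """Yield (line, is_last) pairs by reading one line ahead."""
    iterator = iter(lines)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    for current in iterator:
        # A further line exists, so `previous` is not the last one.
        yield previous, False
        previous = current
    # The iterator is exhausted, so `previous` is the final line.
    yield previous, True
```

With this, the loop from the question could be written as for line, is_last in with_last_flag(sys.stdin): and branch on is_last. Note that on an interactive or piped stream the final line is only reported once the writer has closed its end.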
Why not try this:
for a in iter(raw_input, ""):
    # do something with a
The loop will break when the input equals the sentinel (the second argument to iter), which here means an empty line. Note that raw_input raises EOFError at end of input rather than returning the sentinel. You can keep a reference to the last line as well, such as:
last_line = None
for a in iter(raw_input, ""):
    if a == last_line:
        # do stuff
    last_line = a
    # Do more stuff
For your understanding, input in python is taken from sys.stdin, and as a result, you can use functions such as raw_input, and it will read from sys.stdin. Think of it like this:
def raw_input():
    return sys.stdin.readline().rstrip('\n')
It's not exactly like that, but it's similar to that concept.

How to idle file-processing program until new data arrives in the file

I have a text file that is being written by another program every 10 seconds.
My code goes through this file and parses the data I want, but at some point the for loop reaches the end of the file and the program exits.
GOAL: I want the program to wait inside the for loop for more data to arrive so that it parses the new data too.
I tried a while loop with a condition on the number of lines left to read, but for some reason the program stops shortly after exiting the while loop. If I add, say, 25 lines, it processes 9 of them, then exits the for loop and finishes (it does not crash).
QUESTION: Is there a better way to idle the program until new data arrives? What is wrong with this code?
k = -1
with open('epideiksh.txt') as weather_file:
    for line in weather_file:
        k = k+1
        lines_left = count_lines_of('epideiksh.txt') - k
        while ( lines_left <= 10 ):
            print("waiting for more data")
            time.sleep(10)
            pointer = count_lines('epideiksh.txt') - k
        if line.startswith('Heat Index'):
            do_my_thing()
            time.sleep(10)
The simplest, but slightly error-prone way of simulating tail is:
import io
import time
with open("filename") as input:
    while True:
        for line in input:
            if interesting(line):
                do_something_with(line)
        time.sleep(a_little)
        input.seek(0, io.SEEK_CUR)
In my very limited testing, that seemed to work without the seek. But it shouldn't, since normally you have to do something like that in order to clear the eof flag. One thing to keep in mind is that tell() cannot be used on a (text) file while it is being iterated, and seeking from SEEK_CUR invokes tell(). So in the above code snippet, you could not break out of the for loop and fall into the input.seek() call.
The problem with the above is that it is possible that the readline (implicit in the iterator) will only read part of the line currently being written. So you need to be prepared to abandon and reread partial lines:
with open("filename") as input:
    # where is the end of the last complete line read
    where = input.tell()
    # use readline explicitly because next() and tell() are incompatible
    while True:
        line = input.readline()
        if not line or line[-1] != '\n':
            time.sleep(a_little)
            input.seek(where)
        else:
            where = input.tell()
            if interesting(line):
                do_something_with(line)
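The partial-line handling above can be separated from the sleeping so that it is easier to test. The following Python 3 sketch assumes a seekable text file; the function name read_complete_lines is made up for illustration.

```python
def read_complete_lines(f, where):
    """Return (lines, new_where): every complete line after offset `where`.

    `where` must be 0 or a value previously returned by f.tell().
    """
    f.seek(where)
    lines = []
    while True:
        line = f.readline()
        if not line.endswith("\n"):
            # Either '' (end of file) or a line still being written:
            # stop and leave `where` at the end of the last complete line.
            break
        lines.append(line)
        where = f.tell()
    return lines, where
```

A polling loop would call this repeatedly, process the returned lines, pass the returned offset back in on the next call, and sleep whenever the list comes back empty.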

reading from sys.stdin without newline or EOF

I want to receive data from my GPS tracker. It sends data over TCP, so I use xinetd to listen on a TCP port and a Python script to handle the data. This is the xinetd config:
service gps-gprs
{
    disable = no
    flags = REUSE
    socket_type = stream
    protocol = tcp
    port = 57003
    user = root
    wait = no
    server = /path/to/gps.py
    server_args = 3
}
Config in /etc/services
gps-gprs 57003/tcp # Tracking system
And Python script gps.py
#!/usr/bin/python
import sys

def main():
    data = sys.stdin.readline().strip()
    #do something with data
    print 'ok'

if __name__ =='__main__':
    main()
The tracker sends data strings in raw text like
$GPRMC,132017.000,A,8251.5039,N,01040.0065,E,0.00,,010111,0,,A*75+79161234567#
The problem is that sys.stdin in the Python script doesn't receive an end-of-line or end-of-file character, so sys.stdin.readline() blocks forever. I tried to send data from another PC with a Python script
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('', 57003))
s.sendall( u'hello' )
data = s.recv(4024)
s.close()
print 'Received', data
and if the message is 'hello', it fails, but if the message is 'hello\n', everything is fine. But I don't know how to tell the tracker or xinetd to add this '\n' at the end of messages. How can I read the data from sys.stdin without EOF or EOL in it?
Simple:
data=sys.stdin.read().splitlines()
for i in data:
    print i
No newlines
sys.stdin.readline() waits forever until it receives a newline. Then it considers the current line to be complete and returns it in full. If you want to read data that doesn't contain newlines or you don't want to wait until a newline is received before you process (some of) the data, then you're going to have to use something other than readline. Most likely you should call read, which reads arbitrary data up to a given size.
However, your GPS appears to be sending data in the well-known NMEA format, and that format certainly terminates each line with a newline. Actually, it probably terminates each line with CRLF (\r\n) but it is possible that the \r could be getting munged somewhere before it gets to your TCP socket. Either way there's a \n at the very end of each line.
If your readline call is hanging without returning any lines, most likely it's because the sender is buffering lines until it has a full buffer. If you waited long enough for the sender's buffer to fill up, you'd get a whole bunch of lines at once. If that's what's happening, you'll have to change the sender so that it flushes its send buffer after each NMEA sentence.
It seems you are receiving # instead of <CR><LF>, just read until the # sign.
data = ""
while len(data) == 0 or data[-1] != '#':
    data += sys.stdin.read(1)
#do something with data
print 'ok'
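One caveat with the loop above is that read(1) returns an empty string at end of input, so the loop never ends if the sender closes the connection before sending '#'. A minimal Python 3 sketch that also stops at EOF follows; the function name read_until is made up, and treating '#' as the terminator is an assumption based on the sample message in the question.

```python
def read_until(stream, delimiter="#"):
    """Read characters up to and including `delimiter`, or until EOF."""
    chars = []
    while True:
        ch = stream.read(1)
        if not ch:
            # EOF: the sender closed the stream without a delimiter.
            break
        chars.append(ch)
        if ch == delimiter:
            break
    return "".join(chars)
```

In gps.py this would be called as read_until(sys.stdin); the caller can check whether the result ends with '#' to tell a complete message from a truncated one.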
My solution:
var = sys.stdin.readline().replace('\n', '')
It:
finds the newline in the input,
replaces it with '' (that is, removes it),
assigns the result to the variable.

sys.stdin.readlines() hangs Python script

Every time I execute my Python script, it appears to hang on this line:
lines = sys.stdin.readlines()
What should I do to fix/avoid this?
EDIT
Here's what I'm doing with lines:
lines = sys.stdin.readlines()
updates = [line.split() for line in lines]
EDIT 2
I'm running this script from a git hook, so is there any way around the EOF?
This depends a lot on what you are trying to accomplish. You might be able to do:
for line in sys.stdin:
    #do something with line
Of course, with this idiom as well as the readlines() method you are using, you need to somehow send the EOF character to your script so that it knows that the file is ready to read. (On unix Ctrl-D usually does the trick).
Unless you are redirecting something to stdin that would be expected behavior. That says to read input from stdin (which would be the console you are running the script from). It is waiting for your input.
See: "How to finish sys.stdin.readlines() input?"
If you're running the program in an interactive session, then this line causes Python to read from standard input (i. e. your keyboard) until you send the EOF character (Ctrl-D (Unix/Mac) or Ctrl-Z (Windows)).
>>> import sys
>>> a = sys.stdin.readlines()
Test
Test2
^Z
>>> a
['Test\n', 'Test2\n']
I know this isn't directly answering your question, as others have already addressed the EOF issue, but typically what I've found that works best when reading live output from a long lived subprocess or stdin is the while/if line approach:
while True:
    line = sys.stdin.readline()
    if not line:
        break
    process(line)
In this case, sys.stdin.readline() returns each line of text as it arrives. Once EOF is reached, an empty string is returned, which triggers the break from the loop. A hang can still occur here if the writer neither sends more data nor closes the stream.
It's worth noting that the ability to process the "live output", while the subprocess/stdin is still running, requires the writing application to flush its output.
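On the writing side, flushing after every line might look like the following Python 3 sketch. The function name emit and its parameters are made up for illustration.

```python
import sys
import time


def emit(values, out=None, delay=0.0):
    """Write one value per line, flushing after each line."""
    if out is None:
        out = sys.stdout
    for value in values:
        out.write(f"{value}\n")
        # Without this flush, output sent to a pipe is typically held in a
        # block buffer and reaches the reader only in large batches.
        out.flush()
        time.sleep(delay)
```

In Python 3, print(value, flush=True) has the same effect for a single line, and starting the writer with python -u disables the output buffering altogether.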
