Need to read from a long-running OS command with Python 2.4

Firstly, I'm stuck with Python 2.4. This is a large enterprise environment and I'm unable to update to python 2.7 which would be my preference.
I need to read the output of some dtrace scripts that spit out data at intervals, similar to iostat (i.e. iostat 5 100: every 5 seconds, 100 samples).
I'm playing around with Popen and Popen.communicate, but it seems to slurp all the data at once and then print it out as one large string.
I need to enter a loop and read the output one line at a time.
Can someone point me into the right direction for doing this?
Much thx.

import subprocess

p = subprocess.Popen("some_long_command", stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, ""):
    print line
I think at least ...
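If the output still arrives in bursts, it may also help to pass the command as an argument list and request line buffering on your side of the pipe. A minimal Python 2.4-compatible sketch; the dtrace arguments are placeholders for the real invocation:

import subprocess

# Placeholder command; substitute the actual dtrace script and options.
cmd = ["dtrace", "-s", "interval_stats.d"]

# bufsize=1 asks for line buffering on our end of the pipe.
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=1)

while True:
    line = p.stdout.readline()
    if not line:            # empty string means EOF: the child exited
        break
    print line.rstrip()     # handle each interval's output as it arrives

p.wait()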

Related

PwnTools recv() on output that expects input directly after

Hi I have a problem that I cannot seem to find any solution for.
(Maybe I'm just horrible at phrasing searches correctly in English.)
I'm trying to execute a binary from Python using pwntools and read its output completely before sending some input myself.
The output from my binary is as follows:
Testmessage1
Testmessage2
Enter input: <binary expects me to input stuff here>
Where I would like to read the first line, the second line and the output part of the third line (with ':' being the last character).
The third line of the output does not contain a newline at the end and expects the user to make an input directly. However, I'm not able to read the output contents that the third line starts with, no matter what I try.
My current way of trying to achieve this:
from pwn import *
io = process("./testbin")
print io.recvline()
print io.recvline()
print io.recvuntil(":", timeout=1) # this gets stuck if I don't use a timeout
...
# maybe sending data here
# io.send(....)
io.close()
Do I misunderstand something about stdin and stdout? Is "Enter input:" on the third line not part of the output that I should be able to receive before sending any input?
Thanks in advance
I finally figured it out.
I got the hint I needed from
https://github.com/zachriggle/pwntools-glibc-buffering/blob/master/demo.py
It seems that Ubuntu is doing lots of buffering on its own.
When I manually make sure that pwntools uses a pseudoterminal for stdin and stdout, it works!
from pwn import *

pty = process.PTY
p = process("./testbin", stdin=pty, stdout=pty)
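Putting the pieces together, the original script might look like this once the process is started with a PTY. This is a sketch assuming the same ./testbin binary and "Enter input:" prompt from the question; "my input" is a placeholder:

from pwn import *

# Use a pseudoterminal so the target's libc line-buffers its output
pty = process.PTY
io = process("./testbin", stdin=pty, stdout=pty)

print io.recvline()        # Testmessage1
print io.recvline()        # Testmessage2
print io.recvuntil(":")    # "Enter input:" with no trailing newline

io.sendline("my input")    # respond to the prompt
io.close()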
You can use the clean function which is more reliable and which can be used for remote connections: https://docs.pwntools.com/en/dev/tubes.html#pwnlib.tubes.tube.tube.clean
For example:
from pwn import *

def start():
    p = remote("0.0.0.0", 4000)
    return p

io = start()
io.send(b"YYYY")
io.clean()
io.send(b"ZZZ")

Python 3 using many cores for a single threaded script

I am using a simple Python 3 script which I wrote to parse some very large files. It is a single-threaded script (I don't even know how to set up a multithreaded Python script).
HOWEVER, this script is using 30+ cores on our computer cluster when I run the script.
The script only uses the argparse module and another module called pyBigWig. How can this be using 30 cores???
---- EDIT ----
I can't say that I'm surprised to instantly receive downvotes, despite this being a real issue that I'm facing and trying my best to describe the problem.
The entire script depends on this simple loop where I go through each line of the file:
with open(file, 'r') as fh:
    for line in fh:
        # assign some variables from this line
        # calculate some values from these variables
        # write new values to a new line of a new output file
Turns out the numpy module was using all the cores. I have managed to limit this by adding this to the top of my script:
import os
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

tail and less commands not monitoring file in real time

I'm looking for a way to monitor a file that is written to by a program on Linux. I found the tail -F command recommended here, as well as less +FG. I tested it by running tail -F file in one terminal, and a simple Python script:
import time

for i in range(20):
    print i
    time.sleep(0.5)
in another. I redirected the output to the file:
python script.py >> file
I expected tail to track the file contents and update the display at fixed intervals; instead it only shows what was written to the file after the script terminates.
The same thing happens with less +FG, and also if I watch the output from cat. I've also tried the usual redirect (>, which truncates the file) instead of >>. Here tail reports that the file was truncated, but it still does not track it in real time.
Any idea why this doesn't work? (It's suggested here that it might be due to buffered writes, but since my script runs for over 10 seconds, I suspect this might not be the cause.)
Edit: In case it matters, I'm running Linux Mint 18.1
Python's standard output is buffered. If you only see the output once the script is done, that's definitely a buffering issue.
You can use this instead:
import time
import sys

for i in range(20):
    sys.stdout.write('%d\n' % i)
    sys.stdout.flush()
    time.sleep(0.5)
I've tested it and it prints values in real time. To overcome the buffering issue, after each .write() call I use .flush() to force flushing the buffer.
Additional options from the comments:
Use the original print statement with sys.stdout.flush() afterwards
Run the python script with python -u for unbuffered binary stdout and stderr
Regarding jon1467's answer (sorry, I can't comment on your answer), your understanding of redirection is wrong.
Try this :
dd if=/dev/urandom > test.txt
while looking at the file size with:
ls -l test.txt
You'll see the file grow while dd is running.
Vinny's answer is correct: Python's standard output is buffered.
The more common way around the "buffering effect" you noticed is to flush stdout, as Vinny showed you.
You could also use the -u option to disable buffering for the whole Python process, or you could just reopen standard output with a buffer size of 0 as below (in Python 2 at least):
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
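In context, the reopened stdout would look something like this (a Python 2 sketch of the same idea, not taken from the original answer):

import os
import sys
import time

# Reopen stdout unbuffered (buffer size 0); Python 2 only.
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)

for i in range(20):
    print i            # now appears in the redirected file immediately
    time.sleep(0.5)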

How to get the last N lines from an unlimited Popen.stdout object

I'm trying to get the last N lines from an unlimited Popen.stdout object at the current time. And by unlimited I mean unlimited/many log entries which are getting written to stdout.
I tried Popen.stdout.readline() limited by time, but this just produces a whole lot of random issues, especially with little output.
Some sort of snapshot of the current output would help me, but I am unable to find anything like that. Most of the solutions I find are for external processes which terminate, but mine is a server application which should be able to keep writing to stdout after I read the last lines.
Greetings,
Faerbit
On Unix, when you launch your process you can pipe it into tail first:
import subprocess

p = subprocess.Popen("your_process.sh | tail --lines=3",
                     stdout=subprocess.PIPE, shell=True)
r = p.communicate()
print r[0]
Usage of shell=True is the key here.
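If the process should keep running while you look at its recent output, an alternative (not from the original answer) is to read the pipe line by line and keep only the last N lines in a collections.deque; "your_process.sh" is the placeholder command from the answer above:

import collections
import subprocess

N = 3
p = subprocess.Popen("your_process.sh", stdout=subprocess.PIPE, shell=True)

last_lines = collections.deque(maxlen=N)  # automatically drops older lines
for line in iter(p.stdout.readline, ""):
    last_lines.append(line)
    # at any point, list(last_lines) holds the most recent N lines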

Python: Read huge number of lines from stdin

I'm trying to read a huge number of lines from standard input with Python.
more hugefile.txt | python readstdin.py
The problem is that the program freezes as soon as I've read just a single line.
import sys

print sys.stdin.read(8)
exit(1)
This prints the first 8 bytes, but then I expect it to terminate and it never does. I think it's not really reading just the first bytes but trying to read the whole file into memory.
The same problem occurs with sys.stdin.readline().
What I really want to do, of course, is read all the lines, but with a buffer so I don't run out of memory.
I'm using python 2.6
This should work efficiently in a modern Python:
import sys

for line in sys.stdin:
    # do something...
    print line,
You can then run the script like this:
python readstdin.py < hugefile.txt
Back in the day, you had to use xreadlines to get efficient line-at-a-time I/O on huge files; the docs now ask that you use for line in file instead.
Of course, this is of assistance only if you're actually working on the lines one at a time. If you're just reading big binary blobs to pass on to something else, then your other mechanism might be just as efficient.
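For that binary-blob case, a fixed-size chunked read keeps memory bounded without going line by line. A sketch in Python 2, where process_chunk is a hypothetical consumer standing in for whatever you pass the data on to:

import sys

CHUNK_SIZE = 64 * 1024  # 64 KiB per read keeps memory use bounded

while True:
    chunk = sys.stdin.read(CHUNK_SIZE)
    if not chunk:           # empty string means EOF
        break
    process_chunk(chunk)    # hypothetical consumer; replace with real handling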
