python sys.stdin.read() from tail -f - python

How come sys.stdin.read() doesn't read the piped input from tail -f?
#!/usr/bin/env python
import sys
from geoip import geolite2
def iplookup(srcip):
    for ip in srcip.split("\n"):
        try:
            print(geolite2.lookup(ip))
        except:
            pass
source = sys.stdin.read()
iplookup(source)
tail -f /var/log/bleh.log | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])' | python mygeoip.py

You can use fileinput:
import sys
from geoip import geolite2
import fileinput
def iplookup(srcip):
    for ip in srcip.split("\n"):
        try:
            print(geolite2.lookup(ip))
        except:
            pass
for line in fileinput.input():
    iplookup(line)
On the plus side, your script automagically accepts filenames as parameters as well.

None of the other answers (even fileinput) fully addresses the issue of buffering, and so will not work for small outputs of tail -f.
From the python man page:
Note that there is internal buffering in xreadlines(), readlines() and
file-object iterators ("for line in sys.stdin") which is not
influenced by this option. To work around this, you will want to use
"sys.stdin.readline()" inside a "while 1:" loop.
In other words what you want is:
while True:
    line = sys.stdin.readline()
    if not line:  # readline() returns '' only at EOF
        break
    iplookup(line)
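A minimal sketch of the same readline() loop written as a generator, so that it also stops cleanly at end of input (for example when the script is fed by cat rather than tail -f). The helper name read_lines is hypothetical, and the in-memory stream only stands in for the pipe.

```python
import io
import sys


def read_lines(stream=None):
    """Yield lines one at a time via readline(), stopping at EOF."""
    if stream is None:
        stream = sys.stdin
    # iter(callable, sentinel) keeps calling readline() until it returns "",
    # which readline() only does at end of file.
    for line in iter(stream.readline, ""):
        yield line.rstrip("\n")


# Demonstration with an in-memory stream standing in for the pipe.
demo = io.StringIO("10.0.0.1\n10.0.0.2\n")
for ip in read_lines(demo):
    print(ip)
```

In the real script the loop body would call iplookup(ip) instead of print(ip).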

You can use sys.stdin as an iterator, rather than trying to read from it first.
def iplookup(srcip):
    for ip in srcip:
        ip = ip.strip()
        try:
            print(geolite2.lookup(ip))
        except:
            pass
iplookup(sys.stdin)

read() reads until EOF is reached.
A pipe only signals EOF when its writing end is closed, and tail -f never closes it.
So your input never reaches EOF. Modify your program to read blocks of fixed size or iterate over readline() instead.
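One way to read in blocks is sketched below. It assumes reading from the raw file descriptor with os.read(), which returns as soon as some data is available instead of waiting for EOF. The function name lines_from_fd and the block size are illustrative choices, not part of the original answer.

```python
import os


def lines_from_fd(fd, block_size=4096):
    """Yield decoded lines from a file descriptor, reading in blocks."""
    pending = b""
    while True:
        # os.read() returns at most block_size bytes and does not wait for EOF.
        block = os.read(fd, block_size)
        if not block:  # b"" means the writer closed the pipe (EOF)
            break
        pending += block
        # Emit every complete line; keep the unfinished tail for the next block.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.decode("utf-8", errors="replace")
    if pending:
        yield pending.decode("utf-8", errors="replace")
```

With a pipe from tail -f this would be called as lines_from_fd(sys.stdin.fileno()), after importing sys.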

Related

How to append the Output of a command to temporary file

I have been trying to append my output of a command to a temporary file in python and later doing some operations. Not able to append the data to a temporary file. Any help is appreciated! My sample code as follows.
Getting the error like this.
with open(temp1 , 'r') as f:
TypeError: expected str, bytes or os.PathLike object, not _TemporaryFileWrapper
import tempfile
import os
temp1 = tempfile.NamedTemporaryFile()
os.system("echo Hello world | tee temp1")
with open(temp1 , 'r') as f:
    a = f.readlines()[-1]
print(a)
import tempfile
import os
# Opening in update-text mode to avoid encoding the data written to it
temp1 = tempfile.NamedTemporaryFile("w+")
# popen opens a pipe from the command, allowing one to capture its output
output = os.popen("echo Hello world")
# Write the command output to the temporary file
temp1.write(output.read())
# Reset the stream position at the beginning of the file, if you want to read its contents
temp1.seek(0)
print(temp1.read())
Check out subprocess.Popen for more powerful subprocess communication.
Whatever you're trying to do isn't right. It appears that you are trying to have a system call write to a file, and then you want to read that file in your Python code. You're creating a temporary file, but then your system call is writing to a statically named file, named 'temp1' rather than to the temporary file you've opened. So it's unclear if you want/need to use a computed temporary file name or if using temp1 is OK. The easiest way to fix your code to do what I think you want is like this:
import os
os.system("echo Hello world | tee temp1")
with open('temp1' , 'r') as f:
    a = f.readlines()[-1]
print(a)
If you need to create a temporary file name in your situation, then you have to be careful if you are at all concerned about security or thread safety. What you really want to do is have the system create a temporary directory for you, and then create a statically named file in that directory. Here's your code reworked to do that:
import tempfile
import os
with tempfile.TemporaryDirectory() as tmpdir:
    # Use fresh names so the tempfile module and the dir() builtin are not shadowed
    temp_path = os.path.join(tmpdir, "temp1")
    os.system("echo Hello world /tmp > " + temp_path)
    with open(temp_path) as f:
        buf = f.read()
print(buf)
This method has the added benefit of automatically cleaning up for you.
UPDATE: I have now seen @UlisseBordingnon's answer. That's a better solution overall. Using os.system() is discouraged. I would have taken a slightly different route and used the subprocess module, but what they suggest is 100% valid, and is thread-safe and secure. I'll leave my answer here in case you or other readers need to use os.system() or otherwise have the shell process you execute write directly to a file.
As others have suggested, you should use the subprocess module instead of os.system. From subprocess, you can use its most recent interface, subprocess.run (added in Python 3.5).
The neat thing about using .run is that you can pass an open file object to stdout and the command's output will automatically be redirected to that file.
import tempfile
import subprocess
with tempfile.NamedTemporaryFile("w+") as f:
    subprocess.run(["echo", "hello world"], stdout=f)
    # command has finished running, let's check the file
    f.seek(0)
    print(f.read())
    # hello world
If you are using Python 3.7 or later (as most of us are), subprocess.run is better still, because capture_output means you do not need a temporary file:
import subprocess
completed_process = subprocess.run(
    ["echo", "hello world"],
    capture_output=True,
    encoding="utf-8",
)
print(completed_process.stdout)
Notes
The capture_output parameter tells run() to save the output to the .stdout and .stderr attributes
The encoding parameter will convert the output from bytes to string
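The original question wanted the last line of the command's output. A hedged sketch of that on top of subprocess.run follows; it adds check=True so a failing command raises instead of silently returning empty output. The command itself is a stand-in: sys.executable is used in place of echo only so the sketch does not depend on a shell utility.

```python
import subprocess
import sys

# Stand-in command (an assumption for portability): prints two lines.
cmd = [sys.executable, "-c", "print('first line'); print('hello world')"]

try:
    completed = subprocess.run(
        cmd,
        capture_output=True,  # requires Python 3.7+
        encoding="utf-8",
        check=True,           # raise CalledProcessError on non-zero exit
    )
except subprocess.CalledProcessError as exc:
    print("command failed:", exc.returncode, exc.stderr)
else:
    # Equivalent of f.readlines()[-1] on the temporary file
    last_line = completed.stdout.splitlines()[-1]
    print(last_line)
```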
Depending on your needs, if your script prints its output, a quicker way (though maybe not exactly what you are looking for) is to redirect the output to a file at the command-line level.
Example (egfile.py):
import os
os.system("echo Hello world")
At command level you can simply do:
python egfile.py > file.txt
The output of the script will be redirected to the file instead of to the screen.

Reading from stdin with a system argv

I am attempting to cat a CSV file into stdout and then pipe the printed output as input into a python program that also takes a system argument vector with 1 argument. I ran into an issue I think directly relates to how Python's fileinput.input() function reacts with regards to occupying the stdin file descriptor.
generic_user% cat my_data.csv | python3 my_script.py myarg1
Here is a sample Python program:
import sys, fileinput
def main(argv):
    print("The program doesn't even print this")
    data_list = []
    for line in fileinput.input():
        data_list.append(line)
if __name__ == "__main__":
    main(sys.argv)
If I attempt to run this sample program with the above terminal command and no argument myarg1, the program is able to evaluate and parse the stdin for the data output from the CSV file.
If I run the program with the argument myarg1, it will end up throwing a FileNotFoundError directly related to myarg1 not existing as a file.
FileNotFoundError: [Errno 2] No such file or directory: 'myarg1'
Would someone be able to explain in detail why this behavior takes place in Python and how to handle the logic such that a Python program can first handle stdin data before argv overwrites the stdin descriptor?
You can read from the stdin directly:
import sys
def main(argv):
    print("The program doesn't even print this")
    data_list = []
    for line in iter(sys.stdin):
        data_list.append(line)
if __name__ == "__main__":
    main(sys.argv)
You are trying to access a file which has not yet been created, hence fileinput cannot open it, but since you are piping the data you have no need for it.
This is by design. The designers of fileinput thought that there were use cases where reading from stdin would make no sense, so they just provided a way to explicitly add stdin to the list of files. According to the reference documentation:
import fileinput
for line in fileinput.input():
    process(line)
This iterates over the lines of all files listed in sys.argv[1:], defaulting to sys.stdin if the list is empty. If a filename is '-', it is also replaced by sys.stdin.
Just keep your code and use:
generic_user% cat my_data.csv | python3 my_script.py - myarg1
to read stdin before the myarg1 file, or, if you want to read it after:
... python3 my_script.py myarg1 -
fileinput implements a pattern common for Unix utilities:
If the utility is called with commandline arguments, they are files to read from.
If it is called with no arguments, read from standard input.
So fileinput works exactly as intended. It is not clear what you are using commandline arguments for, but if you don't want to stop using fileinput, you should modify sys.argv before you invoke it.
some_keyword = sys.argv[1]
sys.argv = sys.argv[:1] # Retain only argument 0, the command name
for line in fileinput.input():
    ...
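An alternative that avoids mutating sys.argv is to pass the file list to fileinput.input() explicitly through its files parameter. The sketch below assumes the first argument is a keyword and any remaining arguments are input files; the function name read_inputs and that argument layout are illustrative assumptions.

```python
import fileinput


def read_inputs(argv):
    """Split argv into a keyword and input files, then read the files."""
    keyword = argv[1]
    # Passing files explicitly means fileinput never consults sys.argv;
    # "-" stands for standard input when no file names are given.
    files = argv[2:] or ["-"]
    with fileinput.input(files=files) as lines:
        return keyword, [line.rstrip("\n") for line in lines]
```

Called as read_inputs(sys.argv) under cat my_data.csv | python3 my_script.py myarg1, this would treat myarg1 as the keyword and read the CSV data from stdin.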

Using python socket.gethostbyname to accept multiple arguments in the command Prompt

I have a text file with a list of about 50 hostnames and I am looking to script a way to run through them to get each associated IP address in the Command Prompt.
I thought pasting the hostname list in to the following code might be the easiest way but socket.gethostbyname will take no more than 1 argument at a time.
import socket
socket.gethostbyname("***hostnames***")
Is there a way to work around this argument issue, or is there a way to have the hostnames read from the textfile?
The easiest workaround is to pass a filename and iterate through the file:
#!/usr/bin/python
import sys
import socket
file_nm = sys.argv[1]
with open(file_nm, 'r') as f:
    for host in f:
        # One hostname per line; strip the trailing newline before the lookup
        print(socket.gethostbyname(host.strip()))
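With around 50 hostnames, one unresolvable name would stop the loop above with an exception. A minimal sketch that records failures instead is shown here; the function name resolve_hosts and the resolver parameter are hypothetical additions (the latter only makes the function easy to try without network access).

```python
import socket


def resolve_hosts(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its IPv4 address, or None if the lookup fails."""
    results = {}
    for raw in hostnames:
        host = raw.strip()
        if not host:  # skip blank lines
            continue
        try:
            results[host] = resolver(host)
        except socket.gaierror:
            # Name resolution failed; record it and keep going
            results[host] = None
    return results
```

Because an open file iterates line by line, the result of open(file_nm) can be passed straight in as hostnames.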

File following program

I am trying to build a Python program that follows a log file and checks for certain patterns (much like grep).
Part of the testing code, 'test.py', reads stdin:
import fileinput
for line in fileinput.input():
    print(line)
so if I do this in one terminal
tail -f log.txt | python test.py
In another terminal
echo "hello" >> log.txt
I would expect hello to be printed on the first terminal, but it isn't. How should I change the code? I also want to use it like this
cat log.txt | python test.py
with the same test.py.
Echoing sys.stdin directly seems to work on my Mac OS laptop:
import sys
for line in sys.stdin:
    print(line.rstrip())
But interestingly, this didn't work very well on my Linux box. It would print the output from tail -f eventually, but the buffering was definitely making it appear as though the program was not working (it would print out fairly large chunks after several seconds of waiting).
Instead I got more responsive behavior by reading from sys.stdin one byte at a time:
import sys
buf = ''
while True:
    ch = sys.stdin.read(1)
    if not ch:  # '' means EOF, e.g. when fed by cat
        break
    buf += ch
    if buf.endswith('\n'):
        print(buf[:-1])
        buf = ''

Reading from a file with sys.stdin in Pycharm

I am trying to test some simple code that reads a file line by line in PyCharm.
import sys
for line in sys.stdin:
    name, _ = line.strip().split("\t")
    print(name)
I have the file I want to input in the same directory: lib.txt
How can I debug my code in Pycharm with the input file?
You can work around this issue if you use the fileinput module rather than trying to read stdin directly.
With fileinput, if the script receives a filename(s) in the arguments, it will read from the arguments in order. In your case, replace your code above with:
import fileinput
for line in fileinput.input():
    name, _ = line.strip().split("\t")
    print(name)
The great thing about fileinput is that it defaults to stdin if no arguments are supplied (or if the argument '-' is supplied).
Now you can create a run configuration and supply the filename of the file you want to use as stdin as the sole argument to your script.
Read more about fileinput here
I have been trying to find a way to read a file as stdin in PyCharm.
However, most people, including JetBrains, say that there is no way to do it and no support for it, because redirection is a command-line feature unrelated to PyCharm itself.
* https://intellij-support.jetbrains.com/hc/en-us/community/posts/206588305-How-to-redirect-standard-input-output-inside-PyCharm-
For me, reading a file as stdin is essential, because it makes it easier to supply input when solving programming problems from HackerRank or acmicpc.
I found a simple way: reassign sys.stdin to an open file, and input() will read from that file as well!
import sys
sys.stdin = open('input.in', 'r')
sys.stdout = open('output.out', 'w')
print(input())
print(input())
input.in example:
hello world
This is not the world ever I have known
output.out example:
hello world
This is not the world ever I have known
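Reassigning sys.stdin permanently can surprise later code. A small sketch of the same trick wrapped in a context manager that restores the original stream afterwards is given below; stdin_from is a hypothetical helper name, not a standard-library function.

```python
import contextlib
import sys


@contextlib.contextmanager
def stdin_from(path):
    """Temporarily make sys.stdin read from the file at `path`."""
    original = sys.stdin
    with open(path, "r") as handle:
        sys.stdin = handle
        try:
            yield handle
        finally:
            sys.stdin = original  # always restore the real stdin
```

Usage would look like: with stdin_from('input.in'): print(input()).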
You need to create a custom run configuration and then add your file as an argument in the "Script Parameters" box. See PyCharm's online help for a step-by-step guide.
However, even if you do that (as you have discovered), your program won't work since you aren't parsing the command-line arguments.
You need to instead use argparse:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("filename", help="The filename to be processed")
args = parser.parse_args()
if args.filename:
    # The parsed value lives on args, not in a bare 'filename' variable
    with open(args.filename) as f:
        for line in f:
            name, _ = line.strip().split('\t')
            print(name)
For flexibility, you can write your python script to always read from stdin and then use command redirection to read from a file:
$ python myscript.py < file.txt
However, as far as I can tell, you cannot use redirection from PyCharm as Run Configuration does not allow it.
Alternatively, you can accept the file name as a command-line argument:
$ python myscript.py file.txt
There are several ways to deal with this. I think argparse is overkill for this situation. Alternatively, you can access command-line arguments directly with sys.argv:
import sys
filename = sys.argv[1]
with open(filename) as f:
    for line in f:
        name, _ = line.strip().split('\t')
        print(name)
For robust code, you can check that the correct number of arguments are given.
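A minimal sketch of such a check, assuming the script takes exactly one filename; the function name filename_from_argv and the usage text are illustrative.

```python
import sys


def filename_from_argv(argv):
    """Return the single filename argument or exit with a usage message."""
    if len(argv) != 2:
        # sys.exit() with a string prints it to stderr and exits with status 1
        sys.exit("usage: {} FILE".format(argv[0]))
    return argv[1]
```

The script would then start with filename = filename_from_argv(sys.argv) instead of indexing sys.argv directly.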
Here's my hack for google code jam today, wish me luck. Idea is to comment out monkey() before submitting:
def monkey():
    print('Warning, monkey patching')
    global input
    input = iter(open('in.txt')).next
monkey()
T = int(input())
for caseNum in range(1, T + 1):
    N, L = list(map(int, input().split()))
    nums = list(map(int, input().split()))
edit for python3:
def monkey():
    print('Warning, monkey patching')
    global input
    it = iter(open('in.txt'))
    input = lambda : next(it)
monkey()
