Get lines from stdout after timestamp - python

There is a huge log of errors/warnings/info messages printed out on stdout. I am only interested in the lines logged after I start a specific action.
Other information: I am using Python to telnet into a shell environment. I execute the commands on the shell and store the time the action is started. I then call a command to view the log, which dumps it to stdout. I expect to read the grepped lines after that timestamp back into Python. I also store the current time but am not sure how to use it (maybe grep on a date range?)
I can redirect to a file and use find, but the log is huge and I'd rather not read all of it.
I can grep -n to get the line number and then read everything after it, but I'm not sure how to do that.
Concept regex to egrep on is something like: {a-timestamp}*
Any suggestions would be appreciated!

awk '/the-timestamp-I-have/,0' the-log-file
This will print the lines from the-log-file, starting at the first line that matches the-timestamp-I-have and continuing through the last line.
Ref:
http://www.catonmat.net/blog/awk-one-liners-explained-part-three/
http://www.catonmat.net/blog/ten-awk-tips-tricks-and-pitfalls/#awk_ranges
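If you need the filtered lines back in Python (as the question describes), one option is to run the same one-liner from Python and capture its output. A minimal sketch, assuming the log file is reachable locally and the stored timestamp literally appears in a log line; the path and timestamp below are placeholders:

import subprocess

log_file = "/var/log/app.log"     # placeholder path
start_ts = "2013-03-22T11:45:20"  # timestamp stored when the action began (placeholder)

# Run the awk range expression from the answer above and capture everything
# from the first matching line through the end of the file.
result = subprocess.run(
    ["awk", "/" + start_ts + "/,0", log_file],
    capture_output=True, text=True, check=True
)
lines_after_start = result.stdout.splitlines()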

Related

Getting all pods for a container, storing them in text files and then using those files as args in single command

The picture above shows the list of all Kubernetes pods whose logs I need to save to a text file (or multiple text files).
I need a command which:
1. Stores multiple pod logs into text files (or one single text file). So far I have this command, which stores one pod's log in one text file, but this is not enough since I would have to spell out each pod name individually:
$ kubectl logs ipt-prodcat-db-kp-kkng2 -n ho-it-sst4-i-ie-enf > latest.txt
2. Sends these files into a Python script where it will check for various strings. So far this works on its own, but it would be extremely useful if it could be combined with the command above:
python CheckLogs.py latest.txt latest2.txt
Is it possible to do either (1), or both (1) and (2), in a single command?
The simplest solution is to create a shell script that does exactly what you are looking for:
#!/bin/sh
FILE="text1.txt"
for p in $(kubectl get pods -o jsonpath="{.items[*].metadata.name}"); do
    kubectl logs "$p" >> "$FILE"
done
With this script you will get the logs of all the pods in your current namespace collected in the file named by FILE.
You can even add a python CheckLogs.py call at the end of the script.
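For example, a sketch of the same script with the Python check appended; passing the single collected file to CheckLogs.py is an assumption based on the question:

#!/bin/sh
FILE="text1.txt"
# Collect the logs of every pod in the current namespace into one file,
# then hand that file to the checking script from the question.
for p in $(kubectl get pods -o jsonpath="{.items[*].metadata.name}"); do
    kubectl logs "$p" >> "$FILE"
done
python CheckLogs.py "$FILE"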
There are various tools that could help here. Some of these are commonly available, and some of these are shortcuts that I create my own scripts for.
xargs: This is used to run multiple command lines in various combinations, based on the input. For instance, if you piped in text output containing three lines, you could execute three commands using the content of those three lines. There are many possible variations; see the sketch after this list.
arg1: This is a shortcut that I wrote that simply takes stdin and produces the first argument. The simplest form of this would just be "awk '{print $1}'", but I designed mine to take optional parameters, for instance, to override the argument number, separator, and to take a filename instead. I often use "-i{}" to specify a substitution marker for the value.
skipfirstline: Another shortcut I wrote, that simply takes some multiline text input and omits the first line. It is just "sed -n '1!p'".
head/tail: These print some of the first or last lines of stdin. Interesting forms of this take negative numbers. Read the man page and experiment.
sed: Often a part of my pipelines, for making inline replacements of text.
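As an illustration of the xargs item above, a sketch that fetches each pod's log into its own file and then passes all of the files to the checking script in one call (GNU xargs and the CheckLogs.py interface from the question are assumptions):

# One file per pod, named after the pod, then check them all at once.
kubectl get pods -o jsonpath="{.items[*].metadata.name}" | tr ' ' '\n' \
  | xargs -I{} sh -c 'kubectl logs "{}" > "{}.txt"'
python CheckLogs.py *.txt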

How can I see the entire output of a python file when run inside terminal?

The terminal automatically cut off the output as it scrolled up with each iteration.
By default, the macOS Terminal keeps a limited scrollback buffer, say 1,000 lines.
When you run a program and its output exceeds 1,000 lines, the earliest lines are dropped from memory; it works like a FIFO queue.
So the answer to your question, `Is there a way to store the output of the previously run command in a text file?`, is unfortunately no.
You can re-run the program and preserve the output by redirecting it to a file, or increase the number of lines in the buffer (maybe even make it unlimited).
You can use less to go through the output:
your_command | less
The Enter key takes you down; press q to exit.
Or you can redirect the output to a file.
In one terminal tab, run the program and redirect the output to an output.log file like this:
python program.py > output.log
In another tab you can run tail -f on the same log file and watch the output:
tail -f output.log
To see the complete output, open the log file in any text editor.
You can also consider increasing the scrollback buffer.
Or, if you want to see the output and also save it to a file, use tee, e.g.:
spark-shell | tee tmp.out

Passing individual lines from files into a python script using a bash script

This might be a simple question, but I am new to bash scripting and have spent quite a bit of time on this with no luck; I hope I can get an answer here.
I am trying to write a bash script that reads individual lines from a text file and passes them along as argument for a python script. I have a list of files (which I have saved into a single text file, all on individual lines) that I need to be used as arguments in my python script, and I would like to use a bash script to send them all through. Of course I can take the tedious way and copy/paste the rest of the python command to individual lines in the script, but I would think there is a way to do this with the "read line" command. I have tried all sorts of combinations of commands, but here is the most recent one I have:
#!/bin/bash
# Command Output Test
cat infile.txt << EOF
while read line
do
VALUE = $line
python fits_edit_head.py $line $line NEW_PARA 5
echo VALUE+"huh"
done
EOF
When I do this, all I get returned is the individual lines from the input file. I have the extra VALUE there to see if it will print that, but it does not. Clearly there is something simple about the "read line" command that I do not understand but after messing with it for quite a long time, I do not know what it is. I admit I am still a rookie to this bash scripting game, and not a very good one at that. Any help would certainly be appreciated.
You probably meant:
while read line; do
    VALUE=$line ## No spaces allowed around '='
    python fits_edit_head.py "$line" "$line" NEW_PARA 5 ## Quote properly to isolate the arguments
    echo "$VALUE+huh" ## Variables don't expand without $
done < infile.txt
Python may also read stdin, so it could accidentally consume input meant for the loop from infile.txt; to avoid that, use another file descriptor:
while read -u 4 line; do
    ...
done 4< infile.txt
Better yet, if you're using Bash 4.0 or newer, it's safer and cleaner to use readarray:
readarray -t lines < infile.txt
for line in "${lines[@]}"; do
    ...
done
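Putting it together, a minimal sketch of the whole script using the readarray form and the fits_edit_head.py call from the question:

#!/bin/bash
# Read every line of infile.txt into an array, then pass each line
# to the Python script as its arguments.
readarray -t lines < infile.txt
for line in "${lines[@]}"; do
    python fits_edit_head.py "$line" "$line" NEW_PARA 5
done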

grep logfile for a specific timeframe

I need to filter messages out of a log file which has the following format:
2013-03-22T11:43:21.817078+01:00 INFO log msg 1...
...
2013-03-22T11:44:32.817114+01:00 WARNING log msg 2...
...
2013-03-22T11:45:45.817777+01:00 INFO log msg 3...
...
2013-03-22T11:46:59.547325+01:00 INFO log msg 4...
...
(where ... means "more messages")
The filtering must be done based on a timeframe.
This is part of a bash script, and at this point in the code the timeframe is stored as $start_time and $end_time. For example:
start_time="2013-03-22T11:45:20"
end_time="2013-03-22T11:45:50"
Note that the exact value of $start_time or $end_time may never appear in the log file; yet there will be several messages within the timeframe [$start_time, $end_time], which are the ones I'm looking for.
Now, I'm almost convinced I'll need a Python script to do the filtering, but I'd rather use grep (or awk, or any other tool) since it should run much faster (the log files are big).
Any suggestions?
Based on the log content in your question, I think an awk one-liner may help:
awk -F'.' -vs="$start_time" -ve="$end_time" '$1>s && $1<e' logfile
Note: this filters content exclusive of the start and end times. It works because the ISO-8601 timestamps compare correctly as plain strings.
$ start_time="2013-03-22T11:45:20"
$ end_time="2013-03-22T11:45:50"
$ awk -F'.' '$1>s&&$1<e' s=$start_time e=$end_time file
2013-03-22T11:45:45.817777+01:00 INFO log msg 3...
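If you do end up doing this in Python after all, a minimal sketch can lean on the same trick as the awk one-liner: ISO-8601 timestamps compare correctly as plain strings. The file name below is a placeholder:

# Keep only lines whose timestamp (the text before the first '.') falls
# strictly between start_time and end_time.
start_time = "2013-03-22T11:45:20"
end_time = "2013-03-22T11:45:50"

with open("logfile") as f:
    for line in f:
        ts = line.split(".", 1)[0]
        if start_time < ts < end_time:
            print(line, end="")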

How to get the last N lines from an unlimited Popen.stdout object

I'm trying to get the last N lines from an unlimited Popen.stdout object at the current time. By unlimited I mean that many log entries are continually being written to stdout.
I tried Popen.stdout.readline() limited by time, but this just produces a whole lot of random issues, especially with little output.
Some sort of snapshot of the current output would help me, but I am unable to find anything like that. Most of the solutions I find are for external processes which terminate, but mine is a server application which should still be able to write to stdout after I read the last lines.
Greetings,
Faerbit
On Unix, when you launch your process you can pipe it into tail first:
p = subprocess.Popen("your_process.sh | tail --lines=3", stdout=subprocess.PIPE, shell=True)
r = p.communicate()
print(r[0])
Usage of shell=True is the key here.
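Note that p.communicate() only returns once the process exits, so for a long-running server (as in the question) another option is to read stdout continuously in a background thread and keep only the last N lines in a ring buffer. A minimal sketch; the command name is a placeholder:

import collections
import subprocess
import threading

N = 3  # how many trailing lines to keep

# Placeholder command; replace with the actual server process.
p = subprocess.Popen(["./your_process.sh"], stdout=subprocess.PIPE, text=True)

last_lines = collections.deque(maxlen=N)  # ring buffer holding the last N lines

def pump():
    # Read stdout line by line; once the deque holds N entries it silently
    # drops the oldest line for each new one appended.
    for line in p.stdout:
        last_lines.append(line.rstrip("\n"))

threading.Thread(target=pump, daemon=True).start()

# Later, at any point in time, take a snapshot of the current last N lines:
# snapshot = list(last_lines)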
