I'm new to Linux and have a Fedora 20 build for a class. We installed Tripwire using the default, out of the box configs and I want to take the standard errors from the install to fix the config file.
To collect the errors:
tripwire -m i -c tw.cfg 2> errors
To clean the error file up for processing:
cat errors | grep "/" | cut -d " " -f 3 > fixederrors
This gives me a nice clean file with one path per line, e.g.:
/bin/ash
/bin/ash.static
/root/.Xresources
I would like to automate this process by comparing the data in fixederrors to the config file and prepending matching lines with a '#'.
I tried sed, but it commented out the whole config file!
sed 's/^/#/g' fixederrors > commentederrors
Alternatively, I thought about comparing the config file and the fixederrors and creating a third file. Is there a way to take two files, compare them, and remove duplicated data?
Any help is appreciated. I tried bash and python, but I don't know enough and went down the rabbit hole on this one. Again, this is for my growth and not in a production environment.
Suppose you have this "clean" input file, named fixederrors:
/bin/ash
/bin/ash.static
/root/.Xresources
and this configuration file, named config.original:
use /bin/ash
use /bin/bash
do stuff with /bin/ash.static and friends
/root/.Xresources
do other stuff...
With this bash script:
#!/bin/bash

Input_Conf_File="config.original"      # original configuration file
Output_Conf_File="config.new"          # new configuration file
Input_Error_File="fixederrors"         # file of cleaned errors

rm -f "$Output_Conf_File"              # start from an empty output file

while read -r line ; do                             # for each line of the config file
    AddingChar=""                                   # by default, no '#' prefix
    while IFS= read -r line2 ; do                   # for each line of the error file
        # here you can add specific rules for your match, e.g.
        # line2=${line2}" "   # if it always has a space after...
        [[ $line == *$line2* ]] && AddingChar="#"   # if found, prefix the line with '#'
    done < "$Input_Error_File"
    echo "${AddingChar}${line}" >> "$Output_Conf_File"   # write the line to the new file
done < "$Input_Conf_File"

cat "$Output_Conf_File"                # only to check the result; you can remove this line
exit 0
You will get this output:
#use /bin/ash
use /bin/bash
#do stuff with /bin/ash.static and friends
#/root/.Xresources
do other stuff...
Note:
IFS= prevents read from trimming leading and trailing whitespace from each line.
Use this wisely: the match /bin/ash will also catch lines containing /bin/ash.ANYTHING, so if your configuration file has a line with /bin/ash.dynamic, it will be commented out too. Without knowing your configuration file it's not possible to set a more specific rule, but you can start from here.
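A shorter alternative, if it helps: let sed do the commenting in a single pass per path. This is only a sketch using the same file names as above, and it shares the same substring caveat (/bin/ash will also match /bin/ash.static):
#!/bin/bash
cp config.original config.new            # work on a copy, keep the original untouched
while IFS= read -r path; do
    # '|' is used as the sed delimiter so the slashes in the path need no escaping;
    # prepend '#' to every line that mentions this path
    sed -i "\|$path|s|^|#|" config.new
done < fixederrors
And if you would rather produce a third file with the matching lines removed entirely, as asked in the question, grep can do that in one step: grep -vFf fixederrors config.original > config.stripped (-F treats each path as a literal string, -f reads the patterns from a file, -v keeps only the lines that do not match).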
This is my first question here.
I tried to load a data file with Python.
The file demo.txt looks like the one below.
12,23,34.5,56,
78,29,33,
44,55,66,78,59,100
(The number of lines differs between files, and the number of columns may differ between lines. I need to work on many data files.)
numpy.loadtxt("demo.txt",delimiter=",")
gives the error message "could not convert string to float:".
To fix this problem, I tried to use the command
sed -i -e 's/,\n/,/g' demo.txt
to remove the line break at the end of each line and combine all lines into a single line, but it failed.
However, in Vim, it is fine to use ":s/,\n/,/g" to remove the line breaks.
Thus, my questions are:
Is it possible to load the data file in Python without modifying the files?
If not, how can I use a command like "sed" to remove the line breaks at the end of each line and combine all lines into one single line? (I need to put this command into my script to handle a bunch of files, so a shell command like "sed" is necessary.) Without the line breaks, I can read the data with numpy.loadtxt easily.
Best regards,
Yiping
Remove all newlines from a file with tr -d '\n':
$ echo -e "some\nfile\nwith\newlines" > file_with_newlines
$ cat file_with_newlines
some
file
with
ewlines
$ cat file_with_newlines | tr -d '\n' > file_without_newlines
$ cat file_without_newlines
somefilewithewlines$
I don't know if this will actually help you with your numpy problem, but it will remove all the (UNIX) newlines from a file.
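For what it's worth, the sed attempt in the question fails because sed reads its input line by line and strips the trailing newline from each line before applying the commands, so s/,\n/,/g never sees a newline to replace; tr works on the raw character stream, which is why it can delete them. Since you mention a bunch of files, here is a minimal sketch applying the same idea to each one (the *.txt glob and the .oneline suffix are only assumptions about your file names):
#!/bin/bash
# join each data file into a single comma-separated line so numpy.loadtxt can read it
for f in *.txt; do
    tr -d '\n' < "$f" > "${f%.txt}.oneline"
done
numpy.loadtxt("demo.oneline", delimiter=",") should then return one flat array of floats, provided the joined line doesn't end in a stray comma.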
I have a file named a1.txt that contains different filesystem paths listed in it:
//abc/dds
//abc/abc
Now I need to write a script in path //abc that will read the content of a1.txt line by line. For every line read from this file, I need to execute the command ls -lat line_read_from_file | tail -10 > filename.txt
At the same time, I need a different file to be created for every line read from a1.txt.
Can someone write a script for this?
If you want to do something for each line in a text file, you can do something like this:
while read -r line; do ls -lat "$line" | tail -10 > output_file.txt; done < a1.txt
Since I don't clearly understand your requirement, you may have to improve this to fit your needs.
EDIT:
Seems like you are trying to put the last 10 lines of each file listed in a1.txt into separate files.
index=1; while read -r line; do tail -10 "$line" >> "file_${index}"; index=$((index+1)); done < a1.txt
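If the goal is literally the command from the question -- the ten most recent entries of ls -lat for each listed path, each written to its own file -- here is a sketch combining the two snippets above; the listing_N.txt naming scheme is just an assumption:
#!/bin/bash
index=1
while read -r line; do
    # newest 10 entries for this path, written to its own numbered file
    ls -lat "$line" | tail -10 > "listing_${index}.txt"
    index=$((index+1))
done < a1.txt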
Is it possible to allow Python to read from stdin from another source such as a file continually? Basically I'm trying to allow my script to use stdin to echo input and I'd like to use a file or external source to interact with it (while remaining open).
An example might be (input.py):
#!/usr/bin/python
import sys

line = sys.stdin.readline()
while line:
    print line,
    line = sys.stdin.readline()
Executing this directly, I can continuously enter text and it echoes back while the script remains alive. If I use an external source though, such as a file or input from bash, the script exits immediately after receiving the input:
$ echo "hello" | python input.py
hello
$
Ultimately what I'd like to do is:
$ tail -f file | python input.py
Then, as the file updates, input.py should echo back anything that is added to file while remaining open. Maybe I'm approaching this the wrong way, or I'm simply clueless, but is there a way to do it?
Use the -F option to tail to make it reopen the file if it gets renamed or deleted and a new file is created with the original name. Some editors write the file this way, and logfile rotation scripts also usually work this way (they rename the original file to filename.1, and create a new log file).
$ tail -F file | python input.py
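The reason the echo version exits is that the pipe is closed as soon as echo finishes, so readline() returns an empty string; tail -f/-F keeps its end of the pipe open, so the script keeps waiting for more lines. A quick way to see this, using the file and input.py names from the question:
# terminal 1: start the pipeline; it keeps running because tail never closes the pipe
touch file
tail -F file | python input.py

# terminal 2: each appended line should be echoed by input.py in terminal 1
echo "hello" >> file
echo "world" >> file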
I use gsutil in a Linux environment for managing files in GCS. I enjoy being able to use the command
gsutil -m cp -I gs://...
preceded by some other command, to pass a list of files on STDIN to gsutil for uploading; in doing so, I can maintain a local list of files that have been uploaded, or generate specific patterns to upload and hand them off.
I would like to be able to do a similar command like
gsutil -m rm -I gs://...
to scrub files similarly. Presently, I build a big list of files to remove and run it with the following code:
while read line
do
gsutil rm gs://...
done < "$myfile.txt"
This is extraordinarily slow compared to the multithreaded "gsutil -m rm..." command, and enabling the -m flag has no effect when you have to process files one at a time from a list. I also experimented with just running
gsutil -m rm gs://.../* # remove everything
<my command> | gsutil -m cp -I gs://.../ # put back the pieces that I want
but this involves recopying a lot of data and wastes a lot of time; the data is already there and just needs to have some removed. Any thoughts would be appreciated. Also, I don't have a lot of flexibility on either end with renaming files; otherwise, a quick rename before uploading would handle all of this.
As an interim solution, since we don't have a -I option for rm right now, how about just creating a string of all the objects you want to delete in your loop and then using gsutil -m rm to delete it? You could also do this with a simple python script that invokes the gsutil command from within python as a separate process.
Expanding on your earlier example, maybe something like the following (disclaimer: my bash-fu isn't the greatest, and I haven't tested this):
objects=''
while read line
do
objects="$objects gs://$line"
done < "$myfile.txt"
gsutil -m rm $objects
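Another way to get the same effect without building the string by hand, assuming the list file holds one full gs:// URL per line and the object names contain no whitespace, is to let xargs batch the arguments (urls_to_delete.txt is a placeholder name):
# xargs packs as many URLs as fit onto each command line, so very long lists
# are split into several gsutil -m rm invocations automatically
xargs gsutil -m rm < urls_to_delete.txt
If the file holds bare object names, as in the loops above, you can prepend the bucket prefix first, e.g. sed 's|^|gs://my-bucket/|' "$myfile.txt" | xargs gsutil -m rm, where gs://my-bucket/ is a placeholder for your bucket path.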
For anyone wondering, I wound up doing as Zach Wilt indicated above. For reference, I was removing on the order of a couple thousand files from a span of 5 directories, so roughly 10,000 files. Doing this without the "-m" switch was taking upwards of 30 minutes; with the "-m" switch, it takes less than 30 seconds. Zoom!
For a robust example: I am using this to update Google Cloud Storage files to match local files. On the current day, I have a program that dumps lots of files that are incremental, and also a handful that are "rolled up". After a week, the incremental files get scrubbed locally automatically, but the same should happen in GCS to save space. Here's how to do this:
#!/bin/bash
# get the full date strings for touch
start=$(date --date='-9 days' +%x)
end=$(date --date='-8 days' +%x)
# other vars
mon=$(date --date='-9 days' +%b | tr '[:upper:]' '[:lower:]')
day=$(date --date='-9 days' +%d)
# display the date being cleaned
echo "Cleaning files from $start"
# set the start and end timestamps used by find
touch --date="$start" /tmp/start1
touch --date="$end" /tmp/end1
# repeat for all servers
for dr in "dir1" "dir2" "dir3" ...
do
# list files in range and build retention file
find /local/path/$dr/ -newer /tmp/start1 ! -newer /tmp/end1 > "$dr-local.txt"
# get list of all files from appropriate folder on GCS
gsutil ls gs://gcs_path/$mon/$dr/$day/ > "$dr-gcs.txt"
# formatting the host list file
sed -i "s|gs://gcs_path/$mon/$dr/$day/|/local/path/$dr/|" "$dr-gcs.txt"
# build sed command file to delete matches
while read line
do
echo "\|$line|d" >> "$dr-del.txt"
done < "$dr-local.txt"
# run command file to strip lines for files that need to remain
sed -f "$dr-del.txt" <"$dr-gcs.txt" >"$dr-out.txt"
# convert local names to GCS names
sed -i "s|/local/path/$dr/|gs://gcs_path/$mon/$dr/$day/|" "$dr-out.txt"
# new variable to hold string
del=""
# convert newline separated file to one long string
while read line
do
del="$del$line "
done < "$dr-out.txt"
# remove all files matching the final output
gsutil -m rm $del
# cleanup files
rm $dr-local.txt
rm $dr-gcs.txt
rm $dr-del.txt
rm $dr-out.txt
done
You'll need to modify this to fit your needs, but it's a concrete and working method for deleting files locally and then synchronizing the change to Google Cloud Storage. Thanks again to @Zach Wilt.
Using Perforce, I'd like to be able to reject submits which contain files with Windows line endings (\r\n IIRC, maybe just \r anywhere as really we only want files with Unix line endings).
Rather than dos2unix incoming files or similar, to help track down instances where users attempt to submit files with Windows line endings, I'd like to add a trigger to reject text submissions which contain files with non-Unix line endings.
Could someone demonstrate how the trigger itself could be written, perhaps with bash or python?
Thanks
Here's the minimal edit I can think of for the bash example found in the p4 docs:
#!/bin/sh
# Set target (a literal carriage return), files to search, location of p4 executable...
TARGET=$(printf '\r')
DEPOT_PATH="//depot/src/..."
CHANGE=$1
P4CMD="/usr/local/bin/p4 -p 1666 -c copychecker"
XIT=0
echo ""
# For each file, strip off #version and other non-filename info
# Use sed to swap spaces w/"%" to obtain single arguments for "for"
for FILE in `$P4CMD files $DEPOT_PATH#=$CHANGE | \
sed -e 's/\(.*\)\#[0-9]* - .*$/\1/' -e 's/ /%/g'`
do
# Undo the replacement to obtain filename...
FILE="`echo $FILE | sed -e 's/%/ /g'`"
# ...and use #= specifier to access file contents:
# p4 print -q //depot/src/file.c#=12345
if $P4CMD print -q "$FILE#=$CHANGE" | fgrep "$TARGET" > /dev/null
then
echo "Submit fails: '$TARGET' not found in $FILE"
XIT=1
else
echo ""
fi
done
exit $XIT
The original example fails if the target is missing; this one fails if it's present -- just the then and else branches of the if are switched. You could edit it further, of course (e.g. giving grep, or fgrep, the -q flag to suppress output, if your grep supports it, as e.g. GNU's does).
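If you want to sanity-check the detection logic outside of Perforce first, here is a small standalone test (the file names are just examples):
#!/bin/sh
# create one clean file and one with Windows line endings
printf 'unix line\n'  > good.txt
printf 'dos line\r\n' > bad.txt
CR=$(printf '\r')
for f in good.txt bad.txt; do
    if fgrep "$CR" "$f" > /dev/null; then
        echo "$f: carriage return found (submit would be rejected)"
    else
        echo "$f: clean"
    fi
done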