Ignoring a specific file in a shell script for loop


My shell script loops through a given directory and runs a Python script on each YAML file. I want it to ignore one specific file, export_config.yaml. At the moment that file is picked up by the for loop and causes an error to be printed on the terminal; I want the loop to skip it entirely.
yamldir=$1
for yaml in ${yamldir}/*.yaml; do
if [ "$yaml" != "export_config.yaml" ]; then
echo "Running export for $yaml file...";
python jsonvalid.py -p ${yamldir}/export_config.yaml -e $yaml -d ${endDate}
wait
fi
done

The string comparison in your if statement never matches export_config.yaml because, on each iteration of the for loop, $yaml holds the entire relative file path (i.e. "$yamldir/export_config.yaml", not just "export_config.yaml").
First Option: Changing your if statement to reflect that should correct your issue:
if [ "$yaml" != "${yamldir}/export_config.yaml" ]; then
#etc...
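Another option: Use basename to grab only the last path component (i.e. the filename). Note that $yaml then holds only the filename, so the python command needs ${yamldir}/$yaml to find the file.
for yaml in ${yamldir}/*.yaml; do
yaml=$(basename "$yaml")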
if [ "$yaml" != "export_config.yaml" ]; then
#etc...
Third option: You can do away with the if statement entirely by doing your for loop like this instead:
for yaml in $(ls ${yamldir}/*.yaml | grep -v "export_config.yaml"); do
By piping the output of ls to grep -v, you exclude any line containing export_config.yaml from the directory listing in the first place. This relies on word splitting of the ls output, so it breaks on filenames that contain spaces.
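The first two options can also be combined into a loop that compares only the filename and tolerates spaces in paths. This is just a sketch: the function name export_targets is made up for illustration, and it prints the matching paths instead of calling jsonvalid.py.

```shell
#!/bin/bash
# Hypothetical helper: print every *.yaml path in a directory
# except export_config.yaml, one path per line.
export_targets() {
    local yamldir="$1" yaml
    for yaml in "$yamldir"/*.yaml; do
        [ -e "$yaml" ] || continue                             # the glob matched nothing
        [ "${yaml##*/}" = "export_config.yaml" ] && continue   # skip the config file itself
        printf '%s\n' "$yaml"
    done
}
```

The output could then be consumed with something like export_targets "$yamldir" | while IFS= read -r yaml; do ...; done, with the python call from the question inside the loop.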

Related

create a short cut in mac terminal

I'm trying to create a short cut for a mac terminal such that when I write 'jj' the following line of code used in terminal will run:
python 5_7_16.py
My partner can write the program for Linux but he's not able to do it for Mac. He managed to write the path of the code as follows
FPTH="/Users/kylefoley/PycharmProjects/inference_engine2/inference2/Proofs/5_7_16.py"
When I'm using the pycharm software these are the first two lines of code I use before I can use python 5_7_16.py
cd inference2
cd Proofs
We already have the python file 'jj' saved in the right location and we can almost get it to work but not quite.
Also, the software has three modes: output to excel, output to django, output to mysql. For reasons that I don't understand, my partner thought we needed to record in our file which mode is active. I don't see why, because all that information is already stored in the 5_7_16 file. In case it helps, here are the first lines of the python code.
excel = True
mysql = False
if not excel and not mysql:
from inference2.models import Define3, Archives, Input
from inference2 import views
if mysql:
import os
BASE_DIR = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
print BASE_DIR
sys.path.append(BASE_DIR)
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "inference_engine2.settings")
import django
django.setup()
from inference2 import views
from inference2.models import Define3, Archives, Input
So here is what he has written so far, though again, I don't understand why all this is necessary. I would think that all you need is to tell the mac terminal what code you want to run:
FPTH="/Users/kylefoley/PycharmProjects/inference_engine2/inference2/Proofs/5_7_16.py"
vmysql=$(sed -i ‘’ -E ’s/^mysql = \(.*\)/\1/g’ $FPTH)
vexcel=$(sed —i ‘’ E ’s/^excel = \(.*\)/\1/g’ $FPTH)
echo $vexcel
echo $vmysql
if [ "$vexcel" == "True" ] ; then
echo "Excel"
elif [ "$vmysql" = "True" ]
then
echo "Mysql"
else
echo "Django"
fi
if [ "$vexcel" = "True" ] ; then
echo "Excel is set”
python $FPTH
elif [ "$vmysql" = "True" ]
then
echo "Mysql is set”
python $FPTH
else
echo “Django is set”
cd /dUsers/kylefoley/PycharmProjects/inference_engine2
python manage.py runserver
fi

You need to add the alias in your .bash_profile file.
For details, check: About .bash_profile, .bashrc, and where should alias be written in?
Below are the steps to follow:
# Step 1: Go To home directory
cd
# Step 2: Edit the ".bash_profile" file, or create it if it does not exist
vi .bash_profile
# In this file, add the following entry at the end:
# alias jj="python ~/inference2/Proofs/5_7_16.py"
# ^ or whatever the path to the file is
# Now save and close the file
# Step 3: Refresh bash shell environment
source ~/.bash_profile
You can now run jj.
From the bash manpage:
When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable. The --noprofile option may be used when the shell is started to inhibit this behavior.
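If you would rather not open an editor, the alias line can also be appended from the shell. The following is only a sketch: add_alias_once is a hypothetical helper name, and it assumes the alias should be added only when that exact line is not already in the file.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a profile file unless the
# identical line is already present.
add_alias_once() {
    local profile="$1" line="$2"
    touch "$profile"                         # create the file if it does not exist
    # -x: match the whole line, -F: treat the line as a fixed string, not a regex
    grep -qxF -- "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}
```

For example: add_alias_once ~/.bash_profile 'alias jj="python ~/inference2/Proofs/5_7_16.py"', followed by source ~/.bash_profile.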

feed a command a comma separated list of file names in a directory, extract a variable motif from file names for labels

I have a directory containing files that look like this:
1_reads.fastq
2_reads.fastq
89_reads.fastq
42_reads.fastq
I would like to feed a comma-separated list of these file names to a command from a python program, so the input to the python command would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq
Furthermore, I'd like to use the numbers in the file names for a labeling function within the python command such that the input would look like this:
program.py -i 1_reads.fastq,2_reads.fastq,89_reads.fastq,42_reads.fastq -t s1,s2,s89,s42
It's important that the file names and the label IDs are in the same order.

First: This is a very poorly-thought-out calling convention. Don't use it.
However, if you're using software someone else wrote that already has that convention baked in...
#!/bin/bash
IFS=, # use comma as separator
files=( [[:digit:]]*_* )
[[ -e $files || -L $files ]] || { echo "ERROR: No files matching glob exist" >&2; exit 1; }
prefixes=( )
for file in "${files[@]}"; do
prefixes+=( "s${file%%_*}" )
done
# "exec" only if this is the last command in the script; remove otherwise
exec program.py -i "${files[*]}" -t "${prefixes[*]}"
How this works:
IFS=, causes ${array[*]} to put a comma between each expanded element. Thus, expanding ${files[*]} and ${prefixes[*]} creates comma-separated strings with the contents of each array.
${file%%_*} removes everything after the first _ in a filename, allowing the numbers alone to be extracted.
[[ -e $files || -L $files ]] actually only tests whether the first element in that array exists (as a symlink or otherwise); however, this will always be true if the glob being expanded to form the array matched any files (unless files have been deleted between the two lines' invocation).
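The two expansions described above can be tried in isolation. This is a small sketch with made-up function names (join_commas, labels_for); it confines the IFS change to a function so the rest of a script is unaffected.

```shell
#!/bin/bash
# Join all arguments with commas. "local IFS" limits the change to this function.
join_commas() {
    local IFS=,
    printf '%s\n' "$*"
}

# Build the label list from file names such as 89_reads.fastq -> s89,
# keeping the labels in the same order as the arguments.
labels_for() {
    local file labels=()
    for file in "$@"; do
        labels+=( "s${file%%_*}" )   # drop everything from the first _ onward
    done
    join_commas "${labels[@]}"
}
```

Calling both functions with the same argument list, for example join_commas "${files[@]}" and labels_for "${files[@]}", keeps the file names and labels in matching order.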

Try this:
program.py $(cd DIR && var=$(ls) && echo $var | tr ' ' ',')
That passes to program.py the string produced by the command line inside the $(...).
That command line changes into your directory and runs ls, storing the output in a variable. Echoing the unquoted variable replaces the newlines with single spaces without adding a trailing space, and tr then translates the spaces to commas.

It can be done easily in pure Bash. Make sure you run from within the directory that contains the files.
#!/bin/bash
shopt -s extglob nullglob
# Create an array of files
f=( +([[:digit:]])_reads.fastq )
# Check that there are some files...
if ((${#f[@]}==0)); then
echo "No files found. Exiting."
exit
fi
# Create an array of labels, directly from the array f:
# Remove trailing _reads.fastq
l=( "${f[@]%_reads.fastq}" )
# And prepend the letter s
l=( "${l[@]/#/s}" )
# Now the arrays f and l are good: check them:
declare -p f l
# To join the arrays, we'll use eval. Safe because the code is single-quoted!
IFS=, eval 'program.py -i "${f[*]}" -t "${l[*]}"'
Note. The use of eval here is perfectly safe as we're passing a constant string (and it's actually an idiomatic way to join an array without using a subshell or a loop). Don't modify the command, in particular the single quotes.
Thanks to Charles Duffy who convinced me to add healthy comments about the use of eval

bash program with argument from terminal

I am trying to write a bash script that runs a Python program, which takes a file name and prints values to the terminal. My bash script should take three arguments from the terminal: first the Python program name, second the folder name, and third the name of the file where I want to store the output of the Python program.
#!/bin/bash
directoryname = "$1"
programname = "$2"
newfilename ="$3"
for file in directoryname
do
python3 programname "$file" >> newfilename
done
and I am executing the program as follows:
./myscript.sh mypython.py /home/data myfile.txt
but it gives these errors:
./myscript.sh: line 2: directoryname: command not found
./myscript.sh: line 3: programname: command not found
./myscript.sh: line 4: newfilename: command not found
Please help me with this. I am pretty new to bash scripting.

Change to:
#!/bin/bash
programname="$1" # first argument: the Python program
directoryname="$2" # second argument: the directory
newfilename="$3"
for file in $directoryname
do
python3 "$programname" "$file" >> "$newfilename"
done
There must be no spaces around the = in an assignment. A variable is expanded by putting $ before its name. In general, it is safer to quote a variable expansion ("$var" rather than $var).
Also, I assume that you want the list of files inside the directory, but directoryname is only the directory itself (such as /home/user/). If so, you will need:
for file in "$directoryname"/*
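Putting these points together, a fuller version might look like the sketch below. It assumes the argument order described in the question (program, directory, output file). The function name run_all and the PYTHON override variable are hypothetical additions, included only so the loop can be tried without a real Python program.

```shell
#!/bin/bash
# Hypothetical helper: run_all PROGRAM DIRECTORY OUTPUT
# Runs PROGRAM once per regular file in DIRECTORY and appends the output to OUTPUT.
run_all() {
    if [ "$#" -ne 3 ]; then
        echo "usage: run_all PROGRAM DIRECTORY OUTPUT" >&2
        return 2
    fi
    local programname="$1" directoryname="$2" newfilename="$3" file
    for file in "$directoryname"/*; do
        [ -f "$file" ] || continue    # skip subdirectories and an unmatched glob
        # PYTHON is an assumed override; python3 is used when it is unset
        "${PYTHON:-python3}" "$programname" "$file" >> "$newfilename"
    done
}
```

In a script, the last line would be run_all "$@", so that ./myscript.sh mypython.py /home/data myfile.txt works as in the question. The output file should live outside the directory being looped over.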

The unnecessary spaces between the variable names, the = and the values are causing the errors, so remove them. The variables also need a $ wherever they are used, and the loop needs a glob to list the files in the directory:
#!/bin/bash
programname="$1"
directoryname="$2"
newfilename="$3"
for file in "$directoryname"/*
do
python3 "$programname" "$file" >> "$newfilename"
done

How to test in shell if a path is already inside environment $*PATH? [duplicate]

With /bin/bash, how would I detect if a user has a specific directory in their $PATH variable?
For example
if [ -p "$HOME/bin" ]; then
echo "Your path is missing ~/bin, you might want to add it."
else
echo "Your path is correctly set"
fi

Using grep is overkill, and can cause trouble if you're searching for anything that happens to include RE metacharacters. This problem can be solved perfectly well with bash's builtin [[ command:
if [[ ":$PATH:" == *":$HOME/bin:"* ]]; then
echo "Your path is correctly set"
else
echo "Your path is missing ~/bin, you might want to add it."
fi
Note that adding colons around both the expansion of $PATH and the path to search for solves the substring match issue; double-quoting the path avoids trouble with metacharacters.

There is absolutely no need to use external utilities like grep for this. Here is what I have been using, which should be portable back to even legacy versions of the Bourne shell.
case :$PATH: # notice colons around the value
in *:$HOME/bin:*) ;; # do nothing, it's there
*) echo "$HOME/bin not in $PATH" >&2;;
esac
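The same test can be wrapped in a small function so it is reusable with any directory. This is a sketch only; path_contains is a made-up name, and the directory is quoted inside the pattern so that glob characters in it are matched literally.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if directory $1 is an exact entry of $PATH.
path_contains() {
    case ":$PATH:" in
        *:"$1":*) return 0 ;;   # found as a complete, colon-delimited entry
        *)        return 1 ;;
    esac
}
```

Usage would be along the lines of path_contains "$HOME/bin" || echo "$HOME/bin not in PATH" >&2.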

Here's how to do it without grep:
if [[ $PATH == ?(*:)$HOME/bin?(:*) ]]
The key here is to make the colons and wildcards optional using the ?() construct. There shouldn't be any problem with metacharacters in this form, but if you want to include quotes this is where they go:
if [[ "$PATH" == ?(*:)"$HOME/bin"?(:*) ]]
This is another way to do it using the match operator (=~) so the syntax is more like grep's:
if [[ "$PATH" =~ (^|:)"${HOME}/bin"(:|$) ]]

Something really simple and naive:
echo "$PATH"|grep -q whatever && echo "found it"
Where whatever is what you are searching for. Instead of && you can put $? into a variable or use a proper if statement.
Limitations include:
The above will match substrings of larger paths (try matching on "bin" and it will probably find it, despite the fact that "bin" isn't in your path, /bin and /usr/bin are)
The above won't automatically expand shortcuts like ~
Or using a perl one-liner:
perl -e 'exit(!(grep(m{^/usr/bin$},split(":", $ENV{PATH}))) > 0)' && echo "found it"
That still has the limitation that it won't do any shell expansions, but it doesn't fail if a substring matches. (The above matches "/usr/bin", in case that wasn't clear).

Here's a pure-bash implementation that will not pick up false-positives due to partial matching.
if [[ $PATH =~ ^/usr/sbin:|:/usr/sbin:|:/usr/sbin$ ]] ; then
do stuff
fi
What's going on here? The =~ operator uses regex pattern support present in bash starting with version 3.0. Three patterns are being checked, separated by regex's OR operator |.
All three sub-patterns are relatively similar, but their differences are important for avoiding partial-matches.
In regex, ^ matches the beginning of a line and $ matches the end. As written, the first pattern only evaluates to true if the path it's looking for is the first value within $PATH. The third pattern only evaluates to true if the path it's looking for is the last value within $PATH. The second pattern evaluates to true when it finds the path in between other values, since it looks for the delimiter that the $PATH variable uses, :, on either side of the path being searched for.

I wrote the following shell function to report if a directory is listed in the current PATH. This function is POSIX-compatible and will run in compatible shells such as Dash and Bash (without relying on Bash-specific features).
It includes functionality to convert a relative path to an absolute path. It uses the readlink or realpath utilities for this but these tools are not needed if the supplied directory does not have .. or other links as components of its path. Other than this, the function doesn’t require any programs external to the shell.
# Check that the specified directory exists – and is in the PATH.
is_dir_in_path()
{
if [ -z "${1:-}" ]; then
printf "The path to a directory must be provided as an argument.\n" >&2
return 1
fi
# Check that the specified path is a directory that exists.
if ! [ -d "$1" ]; then
printf "Error: ‘%s’ is not a directory.\n" "$1" >&2
return 1
fi
# Use absolute path for the directory if a relative path was specified.
if command -v readlink >/dev/null ; then
dir="$(readlink -f "$1")"
elif command -v realpath >/dev/null ; then
dir="$(realpath "$1")"
else
case "$1" in
/*)
# The path of the provided directory is already absolute.
dir="$1"
;;
*)
# Prepend the path of the current directory.
dir="$PWD/$1"
;;
esac
printf "Warning: neither ‘readlink’ nor ‘realpath’ are available.\n"
printf "Ensure that the specified directory does not contain ‘..’ in its path.\n"
fi
# Check that dir is in the user’s PATH.
case ":$PATH:" in
*:"$dir":*)
printf "‘%s’ is in the PATH.\n" "$dir"
return 0
;;
*)
printf "‘%s’ is not in the PATH.\n" "$dir"
return 1
;;
esac
}
The part using :$PATH: ensures that the pattern also matches if the desired path is the first or last entry in the PATH. This clever trick is based upon this answer by Glenn Jackman on Unix & Linux.

This is a brute force approach but it works in all cases except when a path entry contains a colon. And no programs other than the shell are used.
previous_IFS=$IFS
dir_in_path='no'
export IFS=":"
for p in $PATH
do
[ "$p" = "/path/to/check" ] && dir_in_path='yes'
done
[ "$dir_in_path" = "no" ] && export PATH="$PATH:/path/to/check"
export IFS=$previous_IFS
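A variation on the same loop avoids saving and restoring IFS by hand. In this sketch the function body runs in a subshell (note the parentheses instead of braces), so the IFS change disappears when the function returns. The name path_with is hypothetical, and the function prints the resulting PATH rather than modifying it.

```shell
#!/bin/bash
# Hypothetical helper: print $PATH with directory $1 appended,
# unless $1 is already one of its entries.
path_with() (
    dir=$1
    IFS=:      # split on colons; confined to this subshell
    set -f     # disable globbing while $PATH is expanded unquoted
    for entry in $PATH; do
        if [ "$entry" = "$dir" ]; then
            printf '%s\n' "$PATH"      # already present: unchanged
            exit 0
        fi
    done
    printf '%s\n' "$PATH:$dir"         # not found: append it
)
```

The caller would then write PATH=$(path_with /path/to/check). As with the loop above, this does not handle path entries that themselves contain a colon.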

$PATH is a list of strings separated by : that describes a list of directories. A directory path is a list of strings separated by /. Two different strings may point to the same directory (like $HOME and ~, or /usr/local/bin and /usr/local/bin/), so we must fix the rules for what we want to compare. I suggest comparing the whole strings rather than the physical directories, after removing duplicate and trailing slashes.
First remove duplicate and trailing / from $PATH:
echo $PATH | tr -s / | sed 's/\/:/:/g;s/:/\n/g'
Now suppose $d contains the directory you want to check. Pipe the previous command into grep to check whether $d is in $PATH.
echo $PATH | tr -s / | sed 's/\/:/:/g;s/:/\n/g' | grep -q "^$d$" || echo "missing $d"

A better and faster solution is this:
DIR=/usr/bin
[[ " ${PATH//:/ } " =~ " $DIR " ]] && echo Found it || echo Not found
I personally use this in my bash prompt to add icons when I go to directories that are in $PATH.

Rewrite config file based on standard error output

I'm new to Linux and have a Fedora 20 build for a class. We installed Tripwire using the default, out of the box configs and I want to take the standard errors from the install to fix the config file.
To collect the errors:
tripwire -m i -c tw.cfg 2> errors
To clean the error file up for processing:
cat errors | grep "/" | cut -d " " -f 3 > fixederrors
This gives me a nice clean file with one path per line i.e.:
/bin/ash
/bin/ash.static
/root/.Xresources
I would like to automate this process by comparing the data in fixederrors to the config file and prepend matching strings with a '#'.
I tried sed, but it commented out the whole config file!
sed 's/^/#/g' fixederrors > commentederrors
Alternatively, I thought about comparing the config file and the fixederrors and creating a third file. Is there a way to take two files, compare them, and remove duplicated data?
Any help is appreciated. I tried bash and python, but I don't know enough and went down the rabbit hole on this one. Again, this is for my growth and not in a production environment.

Let's suppose that you have this input file of cleaned errors, named fixederrors:
/bin/ash
/bin/ash.static
/root/.Xresources
and this configuration file, named config.original:
use /bin/ash
use /bin/bash
do stuff with /bin/ash.static and friends
/root/.Xresources
do other stuff...
With this bash script:
#!/bin/bash
Input_Conf_File="config.original" # Original configuration file
Output_Conf_File="config.new" # New configuration file
Input_Error_File="fixederrors" # File of cleaned errors
rm -f "$Output_Conf_File" # Start from a fresh output file
while IFS= read -r line ; do # For each line of the config file
AddingChar="" # Default: no prefix
while IFS= read -r line2 ; do # For each line of the error file
# here you can add specific rules for your match, e.g.
# line2="${line2} " # if the path is always followed by a space...
[[ -n $line2 && $line == *"$line2"* ]] && AddingChar="#" # If found, set the prefix to "#"
done < "$Input_Error_File"
echo "${AddingChar}${line}" >> "$Output_Conf_File" # Write to the new file
done < "$Input_Conf_File"
cat "$Output_Conf_File" # Optional: only to check the results
exit 0
You will have this output
#use /bin/ash
use /bin/bash
#do stuff with /bin/ash.static and friends
#/root/.Xresources
do other stuff...
Note:
IFS= stops read from trimming leading and trailing whitespace.
Use this with care: the match for /bin/ash also catches lines containing /bin/ash.ANYTHING, so a line in your configuration file that mentions /bin/ash.dynamic will be commented out too. Without prior knowledge of your configuration file it is not possible to give a specific rule, but you can build one starting from here.
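One possible way to tighten the match is to compare whole whitespace-separated words instead of substrings. The sketch below does that with awk called from a shell function. It assumes that paths in the config file always appear as separate words, which may not hold for a real Tripwire config. The name comment_matching is made up.

```shell
#!/bin/bash
# Hypothetical helper: comment_matching ERROR_FILE CONFIG_FILE
# Prints CONFIG_FILE with "#" prepended to every line that contains,
# as a whole word, one of the paths listed in ERROR_FILE.
comment_matching() {
    awk '
        # First file: remember each non-empty line as a path to look for.
        FILENAME == ARGV[1] { if ($0 != "") bad[$0] = 1; next }
        # Second file: check every whitespace-separated field of the line.
        {
            prefix = ""
            for (i = 1; i <= NF; i++)
                if ($i in bad) prefix = "#"
            print prefix $0
        }
    ' "$1" "$2"
}
```

With the sample files above, comment_matching fixederrors config.original > config.new comments out the same three lines, while a line such as "use /bin/ash.dynamic" would be left alone.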
