I am new to bash and wanted to learn what this code is trying to do, if it is done poorly or with errors and how it can be improved.
COMMAND=$1
case $COMMAND in
"upgrade")
UPSCRIPT=`ls -t ./assets/upgrade | head -n1`
python ./assets/upgrade/$UPSCRIPT | tee >> biglog.txt
VERSION=$(echo $UPSCRIPT | awk -F. '{print $1}')
echo `date` $VERSION > ./version.txt
test -e ./artifcts && rm -rf ./artifacts
;;
"downgrade")
DOWNSCRIPT=`ls -t ./assets/downgrade | head -n1`
python ./assets/downgrade/$DOWNSCRIPT | tee >> biglog.txt
VERSION=$(echo $UPSCRIPT | awk -F. '{print $1}')
echo `date` $VERSION > ./version.txt
test -e ./artifcts && rm -rf ./artifacts
;;
*)
while read -r UPSCRIPT; do
python $UPSCRIPT | tee >> biglog.txt
VERSION=$(echo $UPSCRIPT | awk -F. '{print $1}')
echo `date` $VERSION > ./version.txt
test -e ./artifcts && rm -rf ./artifacts
done <<< $(find "./assets/update" -type f -name "*.py")
esac
Use lower case variable names. Upper case is recommended for environment and shell internal variables.
Use $() instead of `...`. It nests better.
Use parameter expansion instead of running a command in a subshell where possible. It's much faster.
Where the logic of the script was unclear, I left a comment in the code.
#! /bin/bash
command=$1
artifacts=./artifacts
case "$command" in
upgrade)
upscript=$(ls -t ./assets/upgrade | head -n1)
python ./assets/upgrade/"$upscript" | tee >> biglog.txt
version=${upscript%.*}
echo $(date) "$version" > ./version.txt
test -e "$artifacts" && rm -rf "$artifacts" # artifacts or artifcts?
;;
downgrade)
downscript=$(ls -t ./assets/downgrade | head -n1)
python ./assets/downgrade/"$downscript" | tee >> biglog.txt
version=${downscript%.*} # upscript or downscript?
echo $(date) "$version" > ./version.txt
test -e "$artifacts" && rm -rf "$artifacts"
;;
*)
while read -r upscript; do
python "$upscript" | tee >> biglog.txt
version=${upscript%.*}
echo $(date) "$version" > ./version.txt
test -e "$artifacts" && rm -rf "$artifacts"
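done < <(find "./assets/update" -type f -name '*.py') # process substitution keeps one file per line; an unquoted here-string joins the lines in bash older than 4.4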
esac
I would probably also extract the common logic from upgrade and downgrade to a function to avoid repetition.
Parsing the output of ls or find is suspicious, as file names can contain weird characters. I'd need to understand more what you're trying to do to fix that.
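As a rough sketch of the function extraction suggested above, the shared steps could look like the following. The names newest_file and run_latest are hypothetical, not part of the original script. The sketch assumes that "latest" means the most recently modified file, that ./artifacts is the directory meant to be removed, and that the version is everything before the first dot of the file name (which mirrors the original awk -F. '{print $1}'). The -nt test compares modification times without parsing ls, so unusual file names are not a problem.

```bash
#!/bin/bash

# Print the most recently modified regular file in directory $1.
# Returns non-zero if the directory contains no regular files.
newest_file() {
    local dir=$1 newest= f
    for f in "$dir"/*; do
        [[ -f $f ]] || continue
        # -nt is true when $f was modified more recently than $newest
        if [[ -z $newest || $f -nt $newest ]]; then
            newest=$f
        fi
    done
    [[ -n $newest ]] && printf '%s\n' "$newest"
}

# Steps shared by upgrade and downgrade (hypothetical helper).
run_latest() {
    local dir=$1 script name
    script=$(newest_file "$dir") || return 1
    python "$script" >> biglog.txt
    name=${script##*/}                                    # strip the directory part
    printf '%s %s\n' "$(date)" "${name%%.*}" > ./version.txt
    rm -rf ./artifacts                                    # assumed to be the intended directory
}
```

With this in place, the upgrade branch reduces to run_latest ./assets/upgrade and the downgrade branch to run_latest ./assets/downgrade.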
I want to run a (complex) Bash while loop from a Python3 script.
I know subprocess and subprocess.check_output work in this case, but I can't wrap my head around how to include the while loop inside a Python subprocess call.
while read -r line
do
if [ "$(echo "$line" | cut -d : -f 7)" = "/bin/bash" ] && [ $(printf "$(echo "$line" | cut -d : -f 1)" | wc -c) -gt $mida ]
then
echo $line | cut -d : -f 1
fi
done < /etc/passwd
I've tried the following:
out=subprocess.check_output(""" while read -r line; do; if [ "$(echo "$line" | cut -d : -f 7)" = "/bin/bash" ] && [ $(printf "$(echo "$line" | cut -d : -f 1)" | wc -c) -gt $mida ]; then; echo $line | cut -d : -f 1; fi; done < /etc/passwd """, shell=True)
Just include it normally. Like it is. You are using """ quotes anyway.
out = subprocess.check_output("""
while read -r line
do
if [ "$(echo "$line" | cut -d : -f 7)" = "/bin/bash" ] && [ $(printf "$(echo "$line" | cut -d : -f 1)" | wc -c) -gt $mida ]
then
echo $line | cut -d : -f 1
fi
done < /etc/passwd
""", shell=True)
Notes:
Export the mida environment variable before running this. If $mida is not set, the test fails with messages like [: something expected but not there.
printf "$(stuff)" | wc -c? Just use stuff | wc -c (which also counts the trailing newline), or take the length with ${#var}.
Check your scripts with http://shellcheck.net
Read https://mywiki.wooledge.org/BashFAQ/001
Just split the line on : using IFS when reading instead of using cut
And that said, do not use shell - use python and write the logic in python.
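If the logic does stay in the shell, the IFS suggestion from the notes above might look roughly like this. The function name list_long_bash_users and its two parameters are hypothetical; the minimum length is passed as an argument instead of being read from $mida, and the file defaults to /etc/passwd.

```bash
#!/bin/bash

# Print login names longer than $1 characters whose login shell is /bin/bash.
# $2 is the passwd-format file to read (defaults to /etc/passwd).
list_long_bash_users() {
    local min_len=$1 passwd_file=${2:-/etc/passwd}
    local name _pw _uid _gid _gecos _home shell
    # IFS=: applies only to this read, splitting each line into its seven fields
    while IFS=: read -r name _pw _uid _gid _gecos _home shell; do
        if [ "$shell" = "/bin/bash" ] && [ "${#name}" -gt "$min_len" ]; then
            printf '%s\n' "$name"
        fi
    done < "$passwd_file"
}
```

This avoids starting cut, printf and wc for every line, and ${#name} gives the length directly.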
Running this on osx...
cd ${BUILD_DIR}/mydir && for DIR in $(find ./ '.*[^_].py' | sed 's/\/\//\//g' | awk -F "/" '{print $2}' | sort |uniq | grep -v .py); do
if [ -f $i/requirements.txt ]; then
pip install -r $i/requirements.txt -t $i/
fi
cd ${DIR} && zip -r ${DIR}.zip * > /dev/null && mv ${DIR}.zip ../../ && cd ../
done
cd ../
error:
(env) ➜ sh package_lambdas.sh find: .*[^_].py: No such file or directory
why?
find takes as an argument a list of directories to search. You provided what appears to be a regular expression. Because there is no directory named (literally) .*[^_].py, find returns an error.
Below I have revised your script to correct that mistake (if I understand your intention). Because I see so many ill-written shell scripts these days, I've taken the liberty of "traditionalizing" it. Please see if you don't also find it more readable.
Changes:
use #!/bin/sh, guaranteed to be on a Unix-like system. Faster than bash, unless (as on OS X) it is bash.
use lower case for variable names to distinguish from system variables (and not hide them).
eschew braces for variables (${var}); they're not needed in the simple case
do not pipe output to /usr/bin/true; redirect it to /dev/null if that's what you mean
rm -f does not fail when the file is missing; if you meant || true, it's superfluous
put then and do on separate lines, easier to read, and that's how the Bourne shell language was meant to be used
Let && and || serve as line-continuation, so you can see what's happening step by step
Other changes I would suggest:
Use a subshell when changing the working directory temporarily. When it terminates, the working directory is restored automatically (retained by the parent), saving you the cd .. step, and errors.
Use set -e to cause the script to terminate on error. For expected errors, use || true explicitly.
Change grep -v .py to grep -v '\.py$', just for good measure.
To avoid Tilting Matchstick Syndrome, use something other than / as a sed substitute delimiter, e.g., sed 's://:/:g'. But sed could be avoided altogether with awk -F '/+' '{print $2}'.
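A small sketch of the subshell and set -e suggestions above. The function name package_dirs is hypothetical, and tar stands in for zip only so that the example runs without extra tools installed.

```bash
#!/bin/sh
set -e    # stop at the first command that fails

# Create one archive per subdirectory of $1, placed next to that subdirectory.
package_dirs() {
    root=$1
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # The parentheses start a subshell, so this cd does not affect
        # the caller and no matching "cd .." is needed afterwards.
        ( cd "$dir" && tar -cf "../$name.tar" . )
    done
}
```

Because the cd happens inside the subshell, a failure partway through the loop cannot leave the rest of the script running in the wrong directory.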
Revised version:
#! /bin/sh
src_dir=lambdas
build_dir=bin
mkdir -p $build_dir/lambdas
rm -rf $build_dir/*.zip
cp -r $src_dir/* $build_dir/lambdas
#
# The sed is a bit complicated to be osx / linux cross compatible :
# (.//run.sh vs ./run.sh)
#
cd $build_dir/lambdas &&
for L in $(find . -exec grep -l '.*[^_].py' {} + |
sed 's/\/\//\//g' |
awk -F "/" '{print $2}' |
sort |
uniq |
grep -v .py)
do
if [ -f "$L/requirements.txt" ]
then
echo "Installing requirements"
pip install -r "$L/requirements.txt" -t "$L/"
fi
cd $L &&
zip -r $L.zip * > /dev/null &&
mv $L.zip ../../ &&
cd ../
done
cd ../
The find(1) manpage says its arguments are [path ...] [expression], where "expression" consists of "primaries" and "operands" (-flags). '.*[^_].py' doesn't look like any expression, so it is interpreted as a path, and find reports that there is no file named '.*[^_].py' in the working directory.
Perhaps you meant:
find ./ -regex '.*[^_].py'
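For illustration, here is one way the -regex form could be used to list the directories that contain matching files. This is only a sketch: the function name list_py_dirs is hypothetical, and the dot before py is escaped so that it matches a literal dot. Note that -regex is matched against the whole path, not just the file name.

```bash
#!/bin/bash

# List each directory under $1 that holds a .py file whose name
# does not end in an underscore before the extension.
list_py_dirs() {
    find "$1" -type f -regex '.*[^_]\.py' -exec dirname {} \; | sort -u
}
```

A file such as __init__.py is skipped because the character before .py is an underscore.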
I have written a bash script that consists of multiple Unix commands and Python scripts. The goal is to make a pipeline for detecting long non-coding RNA from a certain input. Ultimately I would like to turn this into an 'app' and host it on some bioinformatics website. One problem I am facing is using the getopt tools in bash. I couldn't find a tutorial that I could follow clearly. Any other comments on the code are also appreciated.
#!/bin/bash
if [ "$1" == "-h" ]
then
echo "Usage: sh $0 cuffcompare_output reference_genome blast_file"
exit
else
wget https://github.com/TransDecoder/TransDecoder/archive/2.0.1.tar.gz && tar xvf 2.0.1.tar.gz && rm 2.0.1.tar.gz
makeblastdb -in $3 -dbtype nucl -out $3.blast.out
grep '"u"' $1 | \
gffread -w transcripts_u.fa -g $2 - && \
python2.7 get_gene_length_filter.py transcripts_u.fa transcripts_u_filter.fa && \
TransDecoder-2.0.1/TransDecoder.LongOrfs -t transcripts_u_filter.fa
sed 's/ .*//' transcripts_u_filter.fa | grep ">" | sed 's/>//' > transcripts_u_filter.fa.genes
cd transcripts_u_filter.fa.transdecoder_dir
sed 's/|.*//' longest_orfs.cds | grep ">" | sed 's/>//' | uniq > longest_orfs.cds.genes
grep -v -f longest_orfs.cds.genes ../transcripts_u_filter.fa.genes > longest_orfs.cds.genes.not.genes
sed 's/^/>/' longest_orfs.cds.genes.not.genes > temp && mv temp longest_orfs.cds.genes.not.genes
python ../extract_sequences.py longest_orfs.cds.genes.not.genes ../transcripts_u_filter.fa longest_orfs.cds.genes.not.genes.fa
blastn -query longest_orfs.cds.genes.not.genes.fa -db ../$3.blast.out -out longest_orfs.cds.genes.not.genes.fa.blast.out -outfmt 6
python ../filter_sequences.py longest_orfs.cds.genes.not.genes.fa.blast.out longest_orfs.cds.genes.not.genes.fa.blast.out.filtered
grep -v -f longest_orfs.cds.genes.not.genes.fa.blast.out.filtered longest_orfs.cds.genes.not.genes.fa > lincRNA_final.fa
fi
Here is how I run it:
sh test.sh cuffcompare_out_annot_no_annot.combined.gtf /mydata/db/Brapa_sequence_v1.2.fa TE_RNA_transcripts.fa
If you wanted the call to be :
test -c cuffcompare_output -r reference_genome -b blast_file
You would have something like :
#!/bin/bash
while getopts ":b:c:hr:" opt; do
case $opt in
b)
blastfile=$OPTARG
;;
c)
comparefile=$OPTARG
;;
h)
echo "USAGE : test -c cuffcompare_output -r reference_genome -b blast_file"
exit 0
;;
r)
referencegenome=$OPTARG
;;
\?)
echo "Invalid option: -$OPTARG" >&2
exit 1
;;
:)
echo "Option -$OPTARG requires an argument." >&2
exit 1
;;
esac
done
In the string ":b:c:hr:",
- the first ":" tells getopts that we'll handle any errors,
- subsequent letters are the allowable flags. If the letter is followed by a ':', then getopts will expect that flag to take an argument, and supply that argument as $OPTARG
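To make the two error branches concrete, here is a compact sketch of the same option string wrapped in a function, with a check that all three options were supplied. The function name parse_args and the "missing option" check are additions for illustration, not part of the answer above.

```bash
#!/bin/bash

# Parse -b, -c and -r (each takes an argument) into global variables.
# Returns non-zero on an unknown option, a missing argument,
# or when any of the three options is absent.
parse_args() {
    local opt OPTIND=1            # reset OPTIND so the function can be called repeatedly
    blastfile= comparefile= referencegenome=
    while getopts ":b:c:r:" opt "$@"; do
        case $opt in
            b) blastfile=$OPTARG ;;
            c) comparefile=$OPTARG ;;
            r) referencegenome=$OPTARG ;;
            # With the leading ":", an unknown flag sets opt to "?" ...
            \?) echo "Invalid option: -$OPTARG" >&2; return 1 ;;
            # ... and a flag without its argument sets opt to ":".
            :) echo "Option -$OPTARG requires an argument." >&2; return 1 ;;
        esac
    done
    if [[ -z $blastfile || -z $comparefile || -z $referencegenome ]]; then
        echo "Options -b, -c and -r are all required." >&2
        return 1
    fi
}
```

In a script this would be called as parse_args "$@" || exit 1, after which the three variables can replace $3, $1 and $2 in the pipeline.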
I tried to make a script that converts image sources from normal links to base64 encoding in HTML files.
But there is a problem: sometimes, sed tells me
script.sh: line 25: /bin/sed: Argument list too long
This is the code:
#!/bin/bash
# usage: ./script.sh file.html
mkdir images_temp
for i in `sed -n '/<img/s/.*src="\([^"]*\)".*/\1/p' $1`;
do echo "######### download the image";
wget -P images_temp/ $i;
#echo "######### convert the image for size saving";
#convert -quality 70 `echo ${i##*/}` `echo ${i##*/}`.temp;
#echo "######### rename temp image";
#rm `echo ${i##*/}` && mv `echo ${i##*/}`.temp `echo ${i##*/}`;
echo "######### encode in base64";
k="`echo "data:image/png;base64,"`$(base64 -w 0 images_temp/`echo ${i##*/}`)";
echo "######### deletion of images_temp pictures";
rm images_temp/*;
echo "######### remplace string in html";
sed -e "s|$i|$k|" $1 > temp.html;
echo "######### remplace final file";
rm -rf $1 && mv temp.html $1;
sleep 5;
done;
I think the $k argument is too long for sed when the image is bigger than ~128 kB; sed can't process it.
How do I make it work?
Thank you in advance !
PS1: and sorry for the very very ugly code
PS2: or how do I do that in python ? PHP ? I'm open !
Your base64 encoded image can be multiple megabytes, while the system may place a limit on the maximum length of parameters (traditionally around 128k). Sed is also not guaranteed to handle lines over 8kb, though versions like GNU sed can deal with much more.
If you want to try with your sed, provide the instructions in a file rather than on the command line. Instead of
sed -e "s|$i|$k|" $1 > temp.html;
use
echo "s|$i|$k|" > foo.sed
sed -f foo.sed "$1" > temp.html
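A slightly fuller sketch of the same idea, under a few assumptions: the function name embed_image is hypothetical, the image has already been downloaded, and the URL contains no | character. printf is a shell builtin, so the long base64 string never passes through the argument list of an external command. Piping through tr -d '\n' is used instead of base64 -w 0 because the -w flag is not available in every base64 implementation.

```bash
#!/bin/bash

# Replace every occurrence of URL $2 in HTML file $1 with a base64
# data URI built from the local image file $3.
embed_image() {
    local html=$1 url=$2 image=$3 script tmp
    script=$(mktemp)
    tmp=$(mktemp)
    # Write the substitution command to a file instead of passing it as an argument
    printf 's|%s|data:image/png;base64,%s|\n' \
        "$url" "$(base64 < "$image" | tr -d '\n')" > "$script"
    sed -f "$script" "$html" > "$tmp" && mv "$tmp" "$html"
    rm -f "$script"
}
```

As in the original script, the URL is treated as a regular expression, so a dot in it matches any character; for typical image URLs that is harmless.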
I have a simple Python script I need to start and stop, and I need to use a start.sh and stop.sh script to do it.
I have start.sh:
#!/bin/sh
script='/path/to/my/script.py'
echo 'starting $script with nohup'
nohup /usr/bin/python $script &
and stop.sh
#!/bin/sh
PID=$(ps aux | grep "/path/to/my/script.py" | awk '{print $2}')
echo "killing $PID"
kill -15 $PID
I'm mainly concerned with the stop.sh script. I think that's an appropriate way to find the PID, but I wouldn't bet much on it. start.sh successfully starts it. When I run stop.sh, I can no longer find the process with "ps aux | grep 'myscript.py'", but the console outputs:
killing 25052
25058
./stop.sh: 5: kill: No such process
so it seems like it works AND gives an error of sorts with "No such process".
Is this actually an error? Am I approaching this in a sane way? Are there other things I should be paying attention to?
EDIT - I actually ended up with something like this:
start.sh
#!/bin/bash
ENVT=$1
COMPONENTS=$2
TARGETS=("/home/user/project/modules/script1.py" "/home/user/project/modules/script2.py")
for target in "${TARGETS[@]}"
do
PID=$(ps aux | grep -v grep | grep $target | awk '{print $2}')
echo $PID
if [[ -z "$PID" ]]
then
echo "starting $target with nohup for env't: $ENVT"
nohup python $target $ENVT $COMPONENTS &
fi
done
stop.sh
#!/bin/bash
ENVT=$1
TARGETS=("/home/user/project/modules/script1.py" "/home/user/project/modules/script2.py")
for target in "${TARGETS[@]}"
do
pkill -f $target
echo "killing process $target"
done
This happens because ps aux | grep SOMETHING also finds the grep SOMETHING process itself, since its command line matches SOMETHING. By the time kill runs, that grep has already exited, so its PID no longer exists.
Add a line: ps aux | grep -v grep | grep YOURSCRIPT
Where -v means exclude. More in man grep.
The "correct" approach would probably be to have your script write its pid to a file in /var/run, and clear it out when you kill the script. If changing the script is not an option, have a look at start-stop-daemon.
If you want to continue with the grep-like approach, have a look at proctools. They're built in on most GNU/Linux machines and readily available on BSD including OS X:
pkill -f /path/to/my/script.py
init-type scripts are useful for this. This is very similar to one I use. You store the pid in a file, and when you want to check if it's running, look into the /proc filesystem.
#!/bin/bash
script_home=/path/to/my
script_name="$script_home/script.py"
pid_file="$script_home/script.pid"
# returns a boolean and optionally the pid
running() {
local status=false
if [[ -f $pid_file ]]; then
# check to see it corresponds to the running script
local pid=$(< "$pid_file")
local cmdline=/proc/$pid/cmdline
# you may need to adjust the regexp in the grep command
if [[ -f $cmdline ]] && grep -q "$script_name" $cmdline; then
status="true $pid"
fi
fi
echo $status
}
start() {
echo "starting $script_name"
nohup "$script_name" &
echo $! > "$pid_file"
}
stop() {
# `kill -0 pid` returns successfully if the pid is running, but does not
# actually kill it.
kill -0 $1 && kill $1
rm "$pid_file"
echo "stopped"
}
read running pid < <(running)
case $1 in
start)
if $running; then
echo "$script_name is already running with PID $pid"
else
start
fi
;;
stop)
stop $pid
;;
restart)
stop $pid
start
;;
status)
if $running; then
echo "$script_name is running with PID $pid"
else
echo "$script_name is not running"
fi
;;
*) echo "usage: $0 <start|stop|restart|status>"
exit
;;
esac
ps aux | grep "/path/to/my/script.py"
will return both the pid for the instance of script.py and also for this instance of grep. That'll probably be why you're getting a no such process: by the time you get around to killing the grep, it's already dead.
I don't have a Unix box at hand at the moment, so I can't test this, but it should be fairly simple to get the idea.
start.sh:
if [ -e ./temp ]
then
pid=`cat temp`
echo "Process already exists; $pid"
else
script='/path/to/my/script.py'
echo "starting $script with nohup"
nohup /usr/bin/python $script &
echo $! > temp
fi
stop.sh:
if [ -e ./temp ]
then
pid=`cat temp`
echo "killing $pid"
kill -15 "$pid"
rm temp
else
echo "Process not started"
fi
Try this out.