Executing a Python script using a Jenkins parameterized pipeline

I have set up a Jenkins parameterized job that executes a Python script via the Execute shell build step. The job has the following parameters:
user_name: string, order_area_names: comma-separated string, country_name: string, country_code: string, and so on...
My use case is to split order_area_names and execute the Python script once for each order_area_name, sequentially. So I wrote a script that looks something like this:
#!/bin/bash
export PYTHONHASHSEED=0
empty_string=""
parameters_list=""
IFS=","
#Checking every parameter if it is present or not
if [ "$user_name" != "$empty_string" ]
then
parameters_list=$parameters_list" --user "$user_name
fi
if [ "$country_code" != "$empty_string" ]
then
parameters_list=$parameters_list" --country_code "$country_code
fi
if [ "$country_category" != "$empty_string" ]
then
parameters_list=$parameters_list" --country_category "$country_category
fi
parameters_list=$parameters_list" --aws_access_key_id "$aws_access_key_id
parameters_list=$parameters_list" --aws_secret_access_key "$aws_secret_access_key
##Checking if the parameter is present then splitting the string and storing it into array
##Then for each order_area_name executing the python script in sequential manner
if [ "$order_area_names" != "$empty_string" ]
then
read -r -a order_area_name_array <<< "$order_area_names"
for order_area in "${order_area_name_array[#]}";
do
final_list=$parameters_list" --order_area_name "$order_area
echo $final_list
python3 ./main.py ${final_list}
done
fi
exit
I am not able to pass the final_list of values to the Python script, which is why the Jenkins job is failing. If I echo final_list, I see that the values are properly initialized:
--user jay@abc.com --mqs_level Q2 --num_parallel_pipelines 13 --sns_topic topicname --ramp_up_time 45 --max_duration_for_task 30 --batch_size 35 --lead_store_db_schema schema --airflow_k8s_web_server_pod_name airflow-web-xyz --aws_access_key_id 12345678 --aws_secret_access_key 12345678 --order_area_name London
The error looks something like this:
main.py: error: the following arguments are required: --user, --sns_topic, --aws_access_key_id, --aws_secret_access_key
Build step 'Execute shell' marked build as failure
Finished: FAILURE
I searched for this in a lot of places but didn't find a concrete answer. Could anyone please help me with this?

Instead of writing my command like:
python3 ./main.py ${final_list}
I used this and it worked very well:
echo $final_list | bash
"final_list" variable has the command which needs to be executed.

Related

Calling a python script from a bash script and getting its return value

I have a bash script doing some tasks, but I need to manipulate a string obtained from configuration (for simplicity, in this test it is hardcoded). This manipulation can be done easily in Python but is not simple in bash, so I've written a script in Python that does this task and returns a string (or ideally an array of strings).
I'm calling this Python script in my bash script. Both scripts are in the same directory, and this directory is added to the environment variables. I'm testing it on Ubuntu 22.04.
My python script below:
#!/usr/bin/python
def Get(input: str) -> list:
    # Doing tasks - arr is an output array
    return ' '.join(arr)  # or ideally return arr
My bash script used to call the above python script
#!/bin/bash
ARR=("$(python -c "from test import Get; Get('val1, val2,val3')")")
echo $ARR
for ELEMENT in "${ARR[@]}"; do
echo "$ELEMENT"
done
When I added a print to the Python script for test purposes I got proper results, so the Python script works correctly. But in the bash script I simply got an empty line. I've also tried something like this: ARR=("$(python -c "from test import Get; RES=Get('val1, val2,val3')")") and then iterated over RES, and got the same response.
It seems like the bash script cannot handle the data returned by Python.
How can I rewrite these scripts to properly get the Python script's response in bash?
Is it possible to get the whole array, or only the string?

How can I rewrite these scripts to properly get the Python script's response in bash?
Serialize the data on the Python side and deserialize it on the bash side. Decide on a proper protocol between the processes, one that preserves any characters.
The best option looks like newline- or zero-separated strings (the protocol). Output delimiter-separated elements from Python (serialize) and read them back with readarray on the bash side (deserialize).
$ tmp=$(python -c 'arr=[1,2,3]; print(*arr)')
$ readarray -t array <<<"$tmp"
$ declare -p array
declare -a array=([0]="1" [1]="2" [2]="3")
Or with a zero-separated stream. Note that bash can't store zero bytes in variables, so we use redirection with process substitution:
$ readarray -d '' -t array < <(python -c 'arr=[1,2,3]; print(*arr, sep="\0", end="")')
$ declare -p array
declare -a array=([0]="1" [1]="2" [2]="3")
I've solved my problem by printing a string with the elements separated by whitespace (one per line).
I've also rewritten the Python code to be a standalone script rather than a function.
import sys

if len(sys.argv) > 1:
    input = sys.argv[1]

# Doing tasks - arr is an output array
for element in arr:
    print(element)
ARRAY=$(python script.py 'val1, val2,val3')
for ELEMENT in $ARRAY; do
echo "$ELEMENT"
done
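Since the rewritten script now prints one element per line, the readarray technique from the answer above can be combined with it to get a genuine bash array (a minimal sketch; the unquoted $ARRAY splitting above would break on elements containing spaces):
# Read one element per line into a real array, then iterate safely.
readarray -t ARRAY < <(python script.py 'val1, val2,val3')
for ELEMENT in "${ARRAY[@]}"; do
    echo "$ELEMENT"
done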

Can a script provide input when prompted by the shell? [duplicate]

This question already has answers here: Have bash script answer interactive prompts (6 answers). Closed 1 year ago.
Suppose I wanted to make a bunch of files full of gibberish.
If I wanted one file of gibberish and then to encrypt it using ccrypt, I can do this:
$ echo "12 ddsd23" > randomfile.txt
Now using ccrypt:
$ ccrypt -e randomfile.txt
Enter encryption key:
Enter encryption key: (repeat)
As you can see, I am prompted to input the key.
I want to automate this and create a bunch of gibberish files.
Script in Python to produce random gibberish:
import random as rd
import string as st

alphs = st.ascii_letters
digits = st.digits
word = ""
while len(word) < 1000:
    word += rd.choice(alphs)
    word += rd.choice(digits)
print(word)
Now running this from a bash script, saving the gibberish to files:
#!/bin/bash
count=1
while [ $count -le 100 ]
do
    python3 /path/r.py > "file$count.txt"
    ccrypt -e "file$count.txt"
    ((count=count+1))
done
Problem, as you can see:
$ bash random.sh
Enter encryption key:
ccrypt does not have an option to provide passphrase as an argument.
Question: is there a way for the bash script to provide the passphrase when the shell prompts for it?
I am aware this can be solved just by doing the encryption in Python, but I'm curious whether something like this can be done with bash.
If it matters: ccrypt has an option to ask for just one prompt.
[Edited]
My original answer suggested to do:
printf "$PASSPHRASE\n$PASSPHRASE\n" | ccrypt -e "file$count.txt"
which is the generic solution that should work with many tools that expect some input passed to their STDIN; but it doesn't seem to work with ccrypt for whatever reason.
However, ccrypt also has options for providing the passphrase in different (non-interactive) ways:
$ ccrypt --help
...
-K, --key key give keyword on command line (unsafe)
-k, --keyfile file read keyword(s) as first line(s) from file
...
Here's an example using -K. Note that it is "unsafe": if you execute this command in your interactive shell, the passphrase may end up in ~/.bash_history, and if you run your script with -x (to print each executed command), it may end up in logs. Dump the passphrase to a file and use -k instead if that matters.
#!/bin/bash
# read the passphrase, do not display it to screen
read -p "Please provide a passphrase:" -s PASSPHRASE
count=1
while [ $count -le 100 ]
do
    python script.py > "file$count.txt"
    ccrypt -e "file$count.txt" -K "$PASSPHRASE"
    ((count=count+1))
done
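For the safer keyfile route mentioned above, a minimal sketch (the temporary-file handling is illustrative): the passphrase is written to a file with restricted permissions and passed via -k, so it never appears on the command line.
#!/bin/bash
# Sketch of the -k variant: ccrypt reads the keyword from the first line of a file.
read -p "Please provide a passphrase:" -s PASSPHRASE
echo
keyfile=$(mktemp)              # mktemp creates the file with mode 0600
trap 'rm -f "$keyfile"' EXIT   # remove the keyfile when the script exits
printf '%s\n' "$PASSPHRASE" > "$keyfile"
count=1
while [ $count -le 100 ]
do
    python script.py > "file$count.txt"
    ccrypt -e "file$count.txt" -k "$keyfile"
    ((count=count+1))
done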
You could use the yes command in your bash code. Basically this command repeatedly writes the same line to its output, so it can provide the inputs for a script (i.e. ccrypt) whenever one is needed; see the yes man page for more info.

Snakemake "Missing files after X seconds" error

I am getting the following error every time I try to run my snakemake script:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 16
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 pear
1
[Wed Dec 4 17:32:54 2019]
rule pear:
input: Unmap_41_1.fastq, Unmap_41_2.fastq
output: merged_reads/Unmap_41.fastq
jobid: 0
wildcards: sample=Unmap_41, extension=fastq
Waiting at most 120 seconds for missing files.
MissingOutputException in line 14 of /faststorage/project/ABR/scripts/antismash.smk:
Missing files after 120 seconds:
merged_reads/Unmap_41.fastq
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
The snakefile is the following:
workdir: config["path_to_files"]

wildcard_constraints:
    separator = config["separator"],
    extension = config["file_extension"],
    sample = '|'.join(config["samples"])

rule all:
    input:
        expand("antismash-output/{sample}/{sample}.txt", sample = config["samples"])

# merging the paired end reads (either fasta or fastq) as prodigal only takes single end reads
rule pear:
    input:
        forward = f"{{sample}}{config['separator']}1.{{extension}}",
        reverse = f"{{sample}}{config['separator']}2.{{extension}}"
    output:
        "merged_reads/{sample}.{extension}"
    #conda:
    #    "/home/lamma/env-export/antismash.yaml"
    run:
        """
        set +u; source activate antismash; set -u ;
        pear -f {input.forward} -r {input.reverse} -o {output} -t 21
        """

# If single end then move them to merged_reads directory
rule move:
    input:
        "{sample}.{extension}"
    output:
        "merged_reads/{sample}.{extension}"
    shell:
        "cp {path}/{sample}.{extension} {path}/merged_reads/"

# Setting the rule order on the 3 above rules which should be treated equally and only one run.
ruleorder: pear > move

# annotating the metagenome with prodigal. Can be done inside antiSMASH but I prefer to do it outside.
rule prodigal:
    input:
        f"merged_reads/{{sample}}.{config['file_extension']}"
    output:
        gbk_files = "annotated_reads/{sample}.gbk",
        protein_files = "protein_reads/{sample}.faa"
    #conda:
    #    "/home/lamma/env-export/antismash.yaml"
    shell:
        """
        set +u; source activate antismash; set -u ;
        prodigal -i {input} -o {output.gbk_files} -a {output.protein_files} -p meta
        """

# running antiSMASH on the annotated metagenome
rule antiSMASH:
    input:
        "annotated_reads/{sample}.gbk"
    output:
        touch("antismash-output/{sample}/{sample}.txt")
    #conda:
    #    "/home/lamma/env-export/antismash.yaml"
    shell:
        """
        set +u; source activate antismash; set -u ;
        antismash --knownclusterblast --subclusterblast --full-hmmer --smcog --outputfolder antismash-output/{wildcards.sample}/ {input}
        """
I am running the pipeline on only one file at the moment, but the YAML config file looks like this in case it is of interest:
file_extension: fastq
path_to_files: /home/lamma/ABR/Each_reads
samples:
- Unmap_41
separator: _
I know the error can occur when you use certain flags in Snakemake, but I don't believe I am using those flags. The command being submitted to run the Snakefile is:
snakemake --latency-wait 120 --rerun-incomplete --keep-going --jobs 99 --cluster-status 'python /home/lamma/ABR/scripts/slurm-status.py' --cluster 'sbatch -t {cluster.time} --mem={cluster.mem} --cpus-per-task={cluster.c} --error={cluster.error} --job-name={cluster.name} --output={cluster.output}' --cluster-config antismash-config.json --configfile yaml-config-files/antismash-on-rawMetagenome.yaml -F --snakefile antismash.smk
I have tried the -F flag to force a rerun, but this seems to do nothing, as does increasing the --latency-wait number. Any help would be appreciated :)
In rule pear I think you want to use the shell directive instead of run. With run you execute Python code, which in this case does nothing: you simply "execute" a string literal, so you get no error but also no file produced.
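A minimal sketch of that fix, keeping everything else from the question's rule and only swapping run: for shell: so the command string is actually executed (untested against the asker's setup):
# Sketch: the shell directive runs the command, with {input}/{output} substituted.
rule pear:
    input:
        forward = f"{{sample}}{config['separator']}1.{{extension}}",
        reverse = f"{{sample}}{config['separator']}2.{{extension}}"
    output:
        "merged_reads/{sample}.{extension}"
    shell:
        """
        set +u; source activate antismash; set -u ;
        pear -f {input.forward} -r {input.reverse} -o {output} -t 21
        """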

Ignoring a specific file in a shell script for loop

My shell script loops through a given directory to execute a Python script for each file. I want it to ignore one specific file, export_config.yaml. This file is currently picked up by the for loop and prints an error on the terminal; I want the loop to ignore it entirely.
yamldir=$1
for yaml in ${yamldir}/*.yaml; do
    if [ "$yaml" != "export_config.yaml" ]; then
        echo "Running export for $yaml file..."
        python jsonvalid.py -p ${yamldir}/export_config.yaml -e $yaml -d ${endDate}
        wait
    fi
done
Your string comparison in the if statement never matches export_config.yaml because on each iteration of the for loop you are assigning the entire relative file path (i.e. "$yamldir/export_config.yaml"), not just "export_config.yaml", to $yaml.
First Option: Changing your if statement to reflect that should correct your issue:
if [ "$yaml" != "${yamldir}/export_config.yaml" ]; then
#etc...
Another option: Use basename to grab only the terminal path string (ie the filename).
for yaml in ${yamldir}/*.yaml; do
    yaml=$(basename "$yaml")
    if [ "$yaml" != "export_config.yaml" ]; then
        #etc...
Third option: You can do away with the if statement entirely by doing your for loop like this instead:
for yaml in $(ls ${yamldir}/*.yaml | grep -v "export_config.yaml"); do
By piping the output of ls to grep -v, you can exclude any line including export_config.yaml from the directory listing in the first place.

Inserting python code in a bash script

I've got the following bash script:
#!/bin/bash
while read line
do
    ORD=`echo $line | cut -c 7-21`
    if [[ -r ../FASTA_SEC/${ORD}.fa ]]
    then
        WCR=`fgrep -o N ../FASTA_SEC/$ORD.fa | wc -l`
        WCT=`wc -m < ../FASTA_SEC/$ORD.fa`
        PER1=`echo print $WCR/$WCT.*100 | python`
        WCTRIN=`fgrep -o N ../FASTA_SEC_EDITED/$ORD"_Trimmed.fa" | wc -l`
        WCTRI=`wc -m < ../FASTA_SEC_EDITED/$ORD"_Trimmed.fa"`
        PER2=`echo print $WCTRIN/$WCTRI.*100 | python`
        PER3=`echo print $PER1-$PER2 | python`
        echo $ORD $PER1 $PER2 $PER3 >> Log.txt
        if [ $PER2 -ge 30 -a $PER3 -lt 10 ]
        then
            mv ../FASTA_SEC/$ORD.fa ./TRASH/$ORD.fa
            mv ../FASTA_SEC_EDITED/$ORD"_Trimmed.fa" ./TRASH/$ORD"_Trimmed.fa"
        fi
    fi
done < ../READ/Data.txt
The $PER variables are floating-point numbers, as you might have noticed, so I cannot use them normally in the nested if conditional. I'd like to do this conditional in Python, but I have no clue how to do it within a bash script; I also don't know how to pass the values of the variables $PER2 and $PER3 into Python. Could I write Python code directly in the same bash script, invoking Python somehow?
Thank you for your help, first time facing this.
You can use python -c CMD to execute a piece of python code from the command line. If you want bash to interpolate your environment variables, you should use double quotes around CMD.
You can return a value by calling sys.exit, but keep in mind that True and False in Python map to the opposite exit statuses in bash: exit status 0 means "true" (success) to bash, so sys.exit(True) reads as failure.
So your code would be:
if python -c "import sys; sys.exit(not($PER2 > 30 and $PER3 < 10 ))"
It is possible to feed Python code to the standard input of python executable with the help of here document syntax:
variable=$(date)
python2.7 <<SCRIPT
print "The current date: %s" % "${variable}"
SCRIPT
In order to avoid parameter substitution (interpretation within the block), quote the first limit string: <<'SCRIPT'.
If you want to assign the output to a variable, use command substitution:
output=$(python2.7 <<SCRIPT
print "The current date: %s" % "${variable}"
SCRIPT
)
Note, it is not recommended to use backquotes for command substitution, as they are hard to nest, and the form $(...) is more readable.
Maybe this helps:
$ X=4; Y=7; Z=$(python -c "print($X * $Y)")
$ echo $Z
28
python -c "str" takes "str" as input and runs it.
but then why not rewrite all in python? bash commands can nicely be executed with subprocess which is included in python or (need to install that) sh.
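As a hedged illustration of that suggestion, a small Python sketch that shells out with subprocess and captures the output (the path is an example borrowed from the question):
import subprocess

# Count characters in a file via `wc -m`, mirroring one of the backtick calls above.
with open("../FASTA_SEC/example.fa") as fh:
    result = subprocess.run(["wc", "-m"], stdin=fh, capture_output=True, text=True)
char_count = int(result.stdout.strip())
print(char_count)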
