Supercomputer: Dead simple example of a program to run on a supercomputer - python

I am learning how to use supercomputers to make good use of resources.
Let's say I have a Python script that creates a text file containing a random number.
myfile.py
# Imports
import os
import random

outdir = 'outputs'
if not os.path.exists(outdir):
    os.makedirs(outdir)

# Write a single random digit to outputs/temp.txt
with open(outdir + '/temp.txt', 'w') as f:
    a = random.randint(0, 9)
    f.write(str(a))
This creates only one text file on the local machine.
Is there any way I can run multiple instances of this program across multiple nodes and get multiple output files?
I have a template for mpiexec with a C program, which looks like this, but I could not find any template for a Python program.
#PBS -N my_job
#PBS -l walltime=0:10:00
#PBS -l nodes=4:ppn=12
#PBS -j oe
cd $PBS_O_WORKDIR
mpicc -O2 mpi-hello.c -o mpi-hello
cp $PBS_O_WORKDIR/* $PFSDIR
cd $PFSDIR
mpiexec ./mpi-hello
cp $PFSDIR/* $PBS_O_WORKDIR
Note: on a single node, using multiple cores, I can write a bash script like this:
for i in `seq 1 10`;
do
    python myfile.py && cp outputs/temp.txt outputs/out$i.txt &
done
But I want to utilize different nodes.
Required output:
outputs/out1.txt, out2.txt, out3.txt, etc.
Some related links:
https://www.osc.edu/sites/osc.edu/files/documentation/Batch%20Training%20-%2020150312%20-%20OSC.pdf
https://www.osc.edu/~kmanalo/multithreadedsubmission

Take a look at this link; it may solve your problem:
http://materials.jeremybejarano.com/MPIwithPython/introMPI.html
So your code may be something like:
from mpi4py import MPI
import os
import random

outdir = 'outputs'
comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # each MPI process gets a unique rank

if not os.path.exists(outdir):
    os.makedirs(outdir)

# Each rank writes its own file: outputs/temp0.txt, outputs/temp1.txt, ...
with open(outdir + '/temp%s.txt' % rank, 'w') as f:
    a = random.randint(0, 9)
    f.write(str(a))
and the PBS file:
#!/bin/bash
################################################################################
#PBS -N myfile.py
#PBS -l nodes=7:ppn=4
#PBS -l walltime=30:30:00:00
#PBS -m bea
##PBS -M mail#mail.mail
###############################################################################
# Count the processor slots allocated to the job (one line per slot in $PBS_NODEFILE)
cores=$(awk 'END {print NR}' $PBS_NODEFILE)
mpirun -np $cores python myfile.py
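Since the question asks for outputs/out1.txt, out2.txt, ..., the rank can also drive that naming directly. A minimal sketch, assuming the same mpi4py setup (the barrier is only there so rank 0 can create the directory before the other ranks write into it):
from mpi4py import MPI
import os
import random

comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # 0 .. (number of MPI processes - 1)

outdir = 'outputs'
if rank == 0 and not os.path.exists(outdir):
    os.makedirs(outdir)  # only rank 0 creates the directory
comm.Barrier()           # all ranks wait until the directory exists

# rank 0 writes outputs/out1.txt, rank 1 writes outputs/out2.txt, ...
with open('%s/out%d.txt' % (outdir, rank + 1), 'w') as f:
    f.write(str(random.randint(0, 9)))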

Related

Bash: For Loop of Job Arrays - Hoffman2

I am pretty new to bash, and I have some basic questions.
I have one job array, job_array_1.sh, which I am running on Hoffman2.
job_array_1.sh is the following:
#!/bin/bash
#$ -cwd
#$ -o test.joblog.$JOB_ID.$TASK_ID
#$ -j y
#$ -l h_data=5G,h_rt=00:20:00
#$ -m n
#$ -t 1-5:1
. /u/local/Modules/default/init/modules.sh
module load anaconda3
#module load python/3.9.6
python3 file1.py $SGE_TASK_ID
If I type qsub job_array_1.sh from the terminal, this produces 5 different files named test.joblog.$JOB_ID.$TASK_ID (with the value of -t as $TASK_ID). Notice that in this way the 5 jobs start in parallel.
I need to create another file, call it loop.sh, that submits job_array_1.sh sequentially (in this case twice). So far I have:
#$ -cwd
#$ -j y
#$ -l h_data=3G,h_rt=01:00:00
#$ -m n
for ((i=1; i<=2; i++)); do
    # job submission scripts or shell scripts
    fname_in1="job_array_1.sh"
    ./$fname_in1 &
    wait
done
When I type qsub loop.sh from the terminal, this does not produce the 5 files that I get with qsub job_array_1.sh. How can I modify loop.sh so that job_array_1.sh produces the 5 files?
I'm guessing wildly here because I don't know anything about your job submission system, but I do know a little about bash and am trying to help. I suspect you need something more like this:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -l h_data=3G,h_rt=01:00:00
#$ -m n
for ((i=1; i<=2; i++)); do
    # job submission scripts or shell scripts
    echo "Loop: $i"
    qsub job_array_1.sh &
done
wait
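If the two submissions really have to run one after the other rather than both being queued at once, Grid Engine can chain them with -hold_jid. A rough sketch, assuming the cluster's qsub supports the -terse flag for printing just the job ID:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -m n
# Submit the first array job; -terse makes qsub print only the job ID (e.g. "12345.1-5:1")
jid=$(qsub -terse job_array_1.sh | cut -d. -f1)
# The second array job is held until every task of the first one has finished
qsub -hold_jid "$jid" job_array_1.sh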

Force flushing from inside bash script code to a stdout file

I'm trying to flush the output of a bioinformatics program written in Python (the ETE Toolkit) to a stdout file. I tried the stdbuf command described in "Force flushing of output to a file while bash script is still running", but it does not work, because as far as I have seen stdbuf can only be applied to a command from the shell, not called on its own inside a bash script (see "How to use stdbuf on a bash function").
Moreover, from Python I discovered the following function that might be interesting:
import sys
sys.stdout.flush()
But I don't know how I can use it inside the bash script attached below.
The point is that if I only use the -o and -e options in the bash script (as you can see), the output is written to logs_40markers in a non-continuous way, which does not let me see the errors. It works when I run the command directly from the shell, but my internet connection is not stable; practically every night there is a power outage and I have to restart the command, which takes a minimum of one week.
#!/bin/bash
#$ -N tree
#$ -o logs_40markers
#$ -e logs_40markers
#$ -q all.q#compute-0-3
#$ -l mf=100G
stdbuf -oL
module load apps/etetoolkit-3.1.2
export QT_QPA_PLATFORM='offscreen'
ete3 build -w mafft_default-none-none-none -m sptree_fasttree_all -o provaflush --cogs coglist_species_filtered.txt -a multifasta_speciesunique.fa --clearall --cpu 40 &> logs_40markers
Thanks in advance if someone can give me some guidance or advice.
Have a nice day,
Thank you,
Maggi
A colleague of mine, an informatician, solved the problem using the PYTHONUNBUFFERED environment variable.
#!/bin/bash
#$ -N tree
#$ -o logs_40markers
#$ -e logs_40markers
#$ -q all.q#compute-0-3
#$ -l mf=100G
module load apps/etetoolkit-3.1.2
export QT_QPA_PLATFORM='offscreen'
# Disable Python's output buffering so the log file is written continuously
export PYTHONUNBUFFERED="TRUE"
ete3 build -w mafft_default-none-none-none -m sptree_fasttree_all -o provaflush --cogs coglist_species_filtered.txt -a multifasta_speciesunique.fa --clearall --cpu 40 --v 4
To check the current state of the process, type in the shell:
tail -f logs_40markers
(-f means follow.)
I hope someone finds this solution helpful.
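For what it's worth, setting PYTHONUNBUFFERED here has the same effect as starting the interpreter with python -u, and stdbuf does work for non-Python programs when it is used as a prefix on the command itself rather than on a line of its own. A rough sketch (the command name is only illustrative):
# Prefix form: line-buffer stdout/stderr of the wrapped command
stdbuf -oL -eL some_long_running_command > output.log 2>&1

# Python equivalent of PYTHONUNBUFFERED=TRUE
python -u my_script.py > output.log 2>&1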

Implement Git hook - prePush and preCommit

Could you please show me how to implement a git hook?
Before committing, the hook should run a Python script, something like this:
cd c:\my_framework & run_tests.py --project Proxy-Tests\Aeries \
--client Aeries --suite <Commit_file_Name> --dryrun
If the dry run fails, then the commit should be stopped.
You need to tell us in what way the dry run will fail. Will there be an output .txt file with errors? Will there be an error displayed on the terminal?
In any case, you must name the pre-commit script pre-commit, save it in the .git/hooks/ directory, and make sure it is executable.
Since your dry run script seems to be in a different path than the pre-commit script, here's an example that finds and runs your script.
I assume from the backslashes in your path that you are on a Windows machine, and I also assume that your dry-run script is contained in the same project where you have git installed, in a folder called tools (of course you can change this to your actual folder).
#!/bin/sh
# Folder (relative to the repo root) that contains your python script
FILE_PATH=tools/
# Get the path of the root directory of the project
rdir=`git rev-parse --git-dir`
rel_path="$(dirname "$rdir")"
# Cd to that folder and run the dry run
cd "$rel_path/$FILE_PATH"
echo "Running dryrun script..."
python run_tests.py
# From this point on you need to handle the dry run error/s.
# For demonstration purposes I'll assume that an output.txt file that holds
# the result is produced.
# Extract the result from the last non-empty line of the output file
final_res="$(tac output.txt | grep -m 1 . | grep 'error')"
echo -e "--------Dry run result---------\n${final_res}"
# If a warning and/or error exists, abort the commit
# (the check runs in the current shell, not a piped subshell, so exit 1 really aborts)
if [ -n "$final_res" ]; then
    echo -e "Dry run failed.\nAborting commit..."
    exit 1
fi
Now every time you run git commit -m "..." the pre-commit script will run the dry run file and abort the commit if any errors have occurred, keeping your files in the staging area.
I have implemented this in my hook. Here is the code snippet.
#!/bin/sh
#Path of your python script
RUN_TESTS="run_tests.py"
FRAMEWORK_DIR="/my-framework/"
CUR_DIR=`echo ${PWD##*/}`
#Get full path of the root directory of the project under RUN_TESTS_PY_FILE
rDIR=`git rev-parse --git-dir --show-toplevel | head -2 | tail -1`
OneStepBack=/../
CD_FRAMEWORK_DIR="$rDIR$OneStepBack$FRAMEWORK_DIR"
#Find the list of modified files to be committed
LIST_OF_FILES=`git status --porcelain | awk -F" " '{print $2}' | grep ".txt"`
for FILE in $LIST_OF_FILES; do
    cd "$CD_FRAMEWORK_DIR"
    python $RUN_TESTS --dryrun --project $CUR_DIR/$FILE
    OUT=$?
    if [ $OUT -eq 0 ]; then
        continue
    else
        # a non-zero exit aborts the commit (return only works inside a function)
        exit 1
    fi
done
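For the pre-push part of the question, the mechanism is the same: git also runs an executable .git/hooks/pre-push script before a push, and a non-zero exit status aborts the push. A minimal sketch, reusing the (assumed) run_tests.py dry run from above purely for illustration:
#!/bin/sh
# .git/hooks/pre-push - run the dry run before anything is pushed
echo "Running dryrun script before push..."
python run_tests.py --dryrun || {
    echo "Dry run failed. Aborting push..."
    exit 1
}
exit 0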

Qsub job using subprocess from worker node on cluster

I am trying to submit a Python job with qsub which in turn submits several other jobs using subprocess and qsub.
I submit these jobs using the two bash scripts shown below. run_test is the first one submitted, and run_script is submitted through subprocess.
$ cat run_test
#$ -cwd
#$ -V
#$ -pe openmpi 1
mpirun -n 1 python test_multiple_submit.py
$ cat run_script
#$ -cwd
#$ -V
#$ -pe openmpi 1
mpirun -n 1 python $1
I am having a problem with the second script, where it seems to hang at the mpirun call. I was getting an error from bash before about 'module' not being found, but that has vanished recently.
A simplified version of the Python script is shown below:
import subprocess

# Run one case on the current node
subprocess.Popen(cmd)
# Submit another case to the queue from inside this job
subprocess.Popen('qsub run_script ' + input, shell=True)

<Some checks to see if jobs are still running>
The first subprocess runs a case on the current node and the second one should outsource a job to another node; then there are some checks to see if the jobs are still running. There are also some other bits that submit further jobs, but I'm pretty sure those aren't the problem with the script.
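As an aside, one rough way such checks might look, assuming an SGE-style queue where qsub prints "Your job <id> ..." and qstat lists pending/running jobs (purely a sketch, not the poster's actual code):
import subprocess
import time

# Submit the second case and grab the job ID from qsub's output
out = subprocess.check_output('qsub run_script other_case.py', shell=True, text=True)
job_id = out.split()[2]  # SGE prints: "Your job 12345 (...) has been submitted"

# Poll qstat until the job no longer appears in the queue
while job_id in subprocess.run(['qstat'], capture_output=True, text=True).stdout:
    time.sleep(30)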
Can anyone shed any light on why the second script is failing?
I found that the compute nodes on the cluster were not submit hosts, which is why I was getting an error. The only submit host was the head node.
qconf -ss
The above lists the submit hosts. To add a node to the submit host list as an admin:
qconf -as <hostname>
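For reference, a quick way to check whether the node a job landed on is allowed to submit (the host name in the last line is only an example):
# List the configured submit hosts and see whether the current node is among them
qconf -ss | grep "$(hostname)" && echo "this node can run qsub"

# As an administrator, add a compute node to the submit host list
qconf -as compute-0-12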

Exit code 191 when running a python script to run a shell file

I'm trying to use a Python script to run a series of OOMMF simulations on a Unix cluster, but I'm getting stuck at the point where I send a command from Python to bash. I'm using the line:
subprocess.check_call('qsub shellfile.sh', shell=True)
which returns exit code 191. What is exit code 191? I can't seem to find it online. It may be a PBS error rather than a Unix error, but I'm not sure. The error doesn't seem to be in the shell file itself, since the only commands in there are:
#!/bin/bash
# This is an example submit script for the hello world program.
# OPTIONS FOR PBS PRO ==============================================================
#PBS -l walltime=1:00:00
# This specifies the job should run for no longer than 1 hour
#PBS -l select=1:ncpus=8:mem=2048mb
# This specifies the job needs 1 'chunk', with 8 CPU cores and 2048 MB of RAM (memory).
#PBS -j oe
# This joins up the error and output into one file rather that making two files
##PBS -o $working_folder/$PBS_JOBID-oommf_log
# This send your output to the file "hello_output" rather than the standard filename
# OPTIONS FOR PBS PRO ==============================================================
#PBS -P HPCA-000987-EFR
#PBS -M ppxsb3#nottingham.ac.uk
#PBS -m abe
# Here we just use Unix command to run our program
echo "Running on hostname"
sleep 20
echo "Finished job now""
This should just print the hostname and 'Finished job now'.
Thanks
Exit code 191 indicates that the project code associated with the job is invalid. That is this line in the submission script:
#PBS -P HPCA-000974-EFG
which tells the cluster which project the job is associated with.
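To see that exit status (and any message qsub prints) from the Python side, the call can capture the result instead of raising; a minimal sketch using subprocess.run:
import subprocess

# Run qsub and capture its exit status and error output
result = subprocess.run(['qsub', 'shellfile.sh'], capture_output=True, text=True)
if result.returncode != 0:
    print('qsub exited with code', result.returncode)  # e.g. 191 for an invalid project code
    print(result.stderr)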
