I am trying to run shell code from a python file to submit another python file to a computing cluster. The shell code is as follows:
#BSUB -J Proc[1]
#BSUB -e ~/logs/proc.%I.%J.err
#BSUB -o ~/logs/proc.%I.%J.out
#BSUB -R "span[hosts=1]"
#BSUB -n 1
python main.py
But when I run it from Python like the following, I can't get it to work:
from os import system
system('bsub -n 1 < #BSUB -J Proc[1];#BSUB -e ~/logs/proc.%I.%J.err;#BSUB -o ~/logs/proc.%I.%J.out;#BSUB -R "span[hosts=1]";#BSUB -n 1;python main.py')
Is there something I'm doing wrong here?
If I understand correctly, all the #BSUB stuff is text that should be fed to the bsub command as input; bsub is run locally, then runs those commands for you on the compute node.
In that case, you can't just do:
bsub -n 1 < #BSUB -J Proc[1];#BSUB -e ~/logs/proc.%I.%J.err;#BSUB -o ~/logs/proc.%I.%J.out;#BSUB -R "span[hosts=1]";#BSUB -n 1;python main.py
That's interpreted by the shell as "run bsub -n 1 and read from a file named OH CRAP A COMMENT STARTED AND NOW WE DON'T HAVE A FILE TO READ!"
You could fix this with MOAR HACKERY (using echo or here strings, taking on further unnecessary dependencies on shell execution). But if you want to feed stdin input, the best solution is to use a more powerful tool for the task: the subprocess module:
import subprocess

# Open a process (no shell wrapper) that we can feed stdin to
proc = subprocess.Popen(['bsub', '-n', '1'], stdin=subprocess.PIPE)
# Feed the command series you needed to stdin, then wait for process to complete
# Per Michael Closson, can't use semi-colons, bsub requires newlines
proc.communicate(b'''#BSUB -J Proc[1]
#BSUB -e ~/logs/proc.%I.%J.err
#BSUB -o ~/logs/proc.%I.%J.out
#BSUB -R "span[hosts=1]"
#BSUB -n 1
python main.py
''')
# Assuming the exit code is meaningful, check it here
if proc.returncode != 0:
    # Handle a failed process launch here
    pass
This avoids a shell launch entirely (removing the need to deal with comment characters at all, along with all the other issues around handling shell metacharacters), and it is significantly more explicit about what is being run locally (bsub -n 1) and which commands are being run in the bsub session (the stdin).
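If you're on Python 3.5 or newer, a roughly equivalent sketch using subprocess.run keeps the same no-shell approach and turns the exit-code check into check=True (the job script below just repeats the directives from the question):
import subprocess

# Same job script as above, fed to bsub on stdin
job_script = b'''#BSUB -J Proc[1]
#BSUB -e ~/logs/proc.%I.%J.err
#BSUB -o ~/logs/proc.%I.%J.out
#BSUB -R "span[hosts=1]"
#BSUB -n 1
python main.py
'''
# check=True raises CalledProcessError if bsub exits with a non-zero status
subprocess.run(['bsub', '-n', '1'], input=job_script, check=True)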
The #BSUB directives are parsed by the bsub binary, which doesn't support ; as a delimiter. You need to use newlines. This worked for me.
#!/usr/bin/python
import subprocess
# Open a process (no shell wrapper) that we can feed stdin to
proc = subprocess.Popen(['bsub', '-n', '1'], stdin=subprocess.PIPE)
# Feed the command series you needed to stdin, then wait for process to complete
input = b"""#!/bin/sh
#BSUB -J mysleep
sleep 101
"""
proc.communicate(input)
So obviously I got the Python code from @ShadowRanger; +1 his answer. I would have posted this as a comment on his answer if SO supported Python code in comments.
I'm trying to flush to stdout the output of a bioinformatics tool written in Python (the Ete-Toolkit software). I tried the stdbuf command detailed in Force flushing of output to a file while bash script is still running, but it does not work because, as far as I have seen, stdbuf can only be applied to a command run from the shell and not to a bash function (How to use stdbuf on a bash function).
Moreover, from Python I discovered the following function that might be interesting:
import sys
sys.stdout.flush()
But I don't know how I can use it from inside the bash script attached below.
The problem is that, if I only use the -o and -e options in the bash script (as you can see), the output is written to logs_40markers in a non-continuous manner, which does not let me see the errors as they happen. It works when I run it directly from the shell, but my internet connection is not stable; practically every night there is a power outage and I have to restart the command, which takes a minimum of one week.
#!/bin/bash
#$ -N tree
#$ -o logs_40markers
#$ -e logs_40markers
#$ -q all.q#compute-0-3
#$ -l mf=100G
stdbuf -oL
module load apps/etetoolkit-3.1.2
export QT_QPA_PLATFORM='offscreen'
ete3 build -w mafft_default-none-none-none -m sptree_fasttree_all -o provaflush --cogs coglist_species_filtered.txt -a multifasta_speciesunique.fa --clearall --cpu 40
&> logs_40markers
Thanks in advance if someone can give me some guidance/advice,
Have a nice day,
Thank you,
Maggi
A colleague of mine (an informatician) solved the problem using the PYTHONUNBUFFERED environment variable.
#!/bin/bash
#$ -N tree
#$ -o logs_40markers
#$ -e logs_40markers
#$ -q all.q#compute-0-3
#$ -l mf=100G
module load apps/etetoolkit-3.1.2
export QT_QPA_PLATFORM='offscreen'
export PYTHONUNBUFFERED="TRUE"
ete3 build -w mafft_default-none-none-none -m sptree_fasttree_all -o provaflush --cogs coglist_species_filtered.txt -a multifasta_speciesunique.fa --clearall --cpu 40 --v 4
To check the current progress of the process, type in the shell:
tail -f output.log
(-f means follow)
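As an aside, if the Python code were your own (here ete3 is a third-party tool, so the environment variable is the right lever), the same line-by-line output can be obtained in the code itself, as in this minimal sketch; running the script with python -u is the command-line equivalent of setting PYTHONUNBUFFERED:
import sys
import time

for step in range(5):
    # flush=True pushes each line out immediately instead of waiting for
    # the stdout buffer to fill; calling sys.stdout.flush() does the same
    print("processing step", step, flush=True)
    time.sleep(1)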
I hope someone finds this solution helpful.
I am trying to execute a Python program as a background process inside a container with kubectl as below (kubectl issued on local machine):
kubectl exec -it <container_id> -- bash -c "cd some-dir && (python xxx.py --arg1 abc &)"
When I log in to the container and check ps -ef, I do not see this process running. Also, there is no output from the kubectl command itself.
Is the kubectl command issued correctly?
Is there a better way to achieve the same?
How can I see the output/logs printed by the background process being run?
If I need to stop this background process after some duration, what is the best way to do this?
The nohup Wikipedia page can help; you need to redirect all three IO streams (stdout, stdin and stderr) - an example with yes:
kubectl exec pod -- bash -c "yes > /dev/null 2> /dev/null &"
nohup is not required in the above case because I did not allocate a pseudo terminal (no -t flag) and the shell was not interactive (no -i flag) so no HUP signal is sent to the yes process on session termination. See this answer for more details.
Redirecting /dev/null to stdin is not required in the above case since stdin already refers to /dev/null (you can see this by running ls -l /proc/YES_PID/fd in another shell).
To see the output you can instead redirect stdout to a file.
To stop the process you'd need to identify the PID of the process you want to stop (pgrep could be useful for this purpose) and send a fatal signal to it (kill PID for example).
If you want to stop the process after a fixed duration, timeout might be a better option.
Actually, the best way to do this kind of thing is to add an entrypoint to your container and execute the commands there.
Like:
entrypoint.sh:
#!/bin/bash
set -e
cd some-dir && (python xxx.py --arg1 abc &)
./somethingelse.sh
exec "$@"
You wouldn't need to go manually inside every single container and run the command.
I am trying to submit a python job with qsub which in turn submits several other jobs using subprocess and qsub.
I submit these jobs using the two bash scripts shown below: run_test is the first one submitted, and run_script is submitted through subprocess.
$ cat run_test
#$ -cwd
#$ -V
#$ -pe openmpi 1
mpirun -n 1 python test_multiple_submit.py
$ cat run_script
#$ -cwd
#$ -V
#$ -pe openmpi 1
mpirun -n 1 python $1
I am having a problem with the second script, where it seems to hang at the mpirun call. I was getting an error from bash before about 'module' not being found, but that has vanished recently.
A simplified version of the python script is shown below
import subprocess
subprocess.Popen(cmd)
subprocess.Popen('qsub run_script '+input)
<Some checks to see if jobs are still running>
The first subprocess runs a case on the current node and the second one should outsource the job to another node, then there are some checks to see if the jobs are still running. There are also some other bits to get other jobs submitted as well but I'm pretty sure this isn't a problem with the script.
Can anyone shed any light on why the second script is failing?
I found that the compute nodes on the cluster were not submit hosts, which is why I was getting an error; the only submit host was the head node.
qconf -ss
The above lists the submit hosts. Adding a node to the submit list as an admin is shown below:
qconf -as <hostname>
I am working on a script in Python where first I set ettercap to ARP poisoning and then start urlsnarf to log the URLs. I want ettercap to start first and then, while it is poisoning, start urlsnarf. The problem is that these jobs must run at the same time, with urlsnarf then showing the output. So I thought it would be nice if I could run ettercap in the background, without waiting for it to exit, and then run urlsnarf. I tried the nohup command, but at the moment urlsnarf was supposed to show the URLs, the script just ended. I run:
subprocess.call(["ettercap",
"-M ARP /192.168.1.254/ /192.168.1.66/ -p -T -q -i wlan0"])
But I get:
ettercap NG-0.7.4.2 copyright 2001-2005 ALoR & NaGA
MITM method ' ARP /192.168.1.254/ /192.168.1.66/ -p -T -q -i wlan0' not supported...
Which means that somehow the arguments were not passed correctly
You could use the subprocess module in the Python standard library to spawn ettercap as a separate process that will run simultaneously with the parent. Using the Popen class from subprocess, you'll be able to spawn your ettercap process, run your other processing, and then kill the ettercap process when you are done. More info here: Python Subprocess Package
import shlex, subprocess
args = shlex.split("ettercap -M ARP /192.168.1.254/ /192.168.1.66/ -p -T -q -i wlan0")
ettercap = subprocess.Popen(args)
# program continues without waiting for ettercap process to finish.
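A rough sketch of the full flow, putting ettercap in the background, running urlsnarf in the foreground, and stopping ettercap afterwards (the urlsnarf arguments here are only illustrative):
import shlex
import subprocess

# Start ettercap in the background; the ARP poisoning keeps running
ettercap = subprocess.Popen(shlex.split(
    "ettercap -M ARP /192.168.1.254/ /192.168.1.66/ -p -T -q -i wlan0"))
try:
    # Run urlsnarf in the foreground; call() blocks until it exits
    subprocess.call(["urlsnarf", "-i", "wlan0"])
finally:
    # Stop the background ettercap process when we're done
    ettercap.terminate()
    ettercap.wait()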
I have a python script to start a process which I want to monitor using Nagios. When I run that script and perform ps -ef on my Ubuntu EC2 instance, it shows the process as python <filename>.py --arguments. For Nagios to monitor that process using check_procs, we need to supply a process name. Here the process name becomes 'python'.
/usr/lib/nagios/plugins/check_procs -C python
It returns output saying that one python process is running. This is fine when I'm running a single Python process. But if I'm running multiple Python scripts and want to monitor only some of them, I have to give a particular process name. If, in the above command, I give the Python script's name instead, it throws an error. So I want to mask the whole python <filename>.py --arguments command as some other name, so that when running check_procs I can supply that new name.
If anyone has any idea, please let me know. I have checked other Stack Overflow questions which suggest changing the Python process name using setproctitle, but I want to do it from the shell.
Regards,
Sanket
You can use the check_procs command to look at arguments, which includes the module name. The following command will let you know if the python module 'module.py' is running.
/usr/lib/nagios/plugins/check_procs -c 1:1 -a module.py -C python
The -c argument lets you set the critical range: 1:1 will trigger a critical status if there are more or fewer than 1 matching processes running.
The -a argument will filter based on processes that contain the args 'module.py' (change it to the name of the module you want to monitor)
The -C argument will make sure that the process is a python process
If you need help figuring out how to create the service definition, I had to figure that out too. Just let me know.
REFERENCE:
check_procs plugin manpage
http://nagiosplugins.org/man/check_procs
You can't change the process name from pure Python, although you can use a wrapper (for example, written in C) to do so.
However, what you should do instead is make your program a daemon and use a pidfile. Have a look at the Python daemon API and its implementation, python-daemon.
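A minimal sketch of that approach, assuming the python-daemon package is installed (the pidfile path here is hypothetical):
import os
import daemon

PIDFILE = "/tmp/myscript.pid"  # hypothetical path; pick one your checks know about

with daemon.DaemonContext():
    # Record the daemon's pid so Nagios (or anything else) can look it up
    with open(PIDFILE, "w") as f:
        f.write(str(os.getpid()))
    # ... long-running work goes here ...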
check_procs already handles this situation.
check_procs can tell the difference between scripts launched as an argument to the interpreter and jobs run directly via a hashbang interpreter, even though both of these look the same in the ps output! The latter case will not be listed by check_procs -C python!
If you run your scripts explicitly via python: python <filename.py>, then you can monitor them with the check_procs -C python -a filename.py.
If you put #!/usr/bin/python in your scripts and run them as ./filename.py, then you can monitor with check_procs -C filename.py.
Example command line session showing this behavior:
#make test.py directly executable. See code below
$ chmod a+x test.py
#launch via python explicitly:
$ /usr/bin/python ./test.py &
[1] 27094
$ check_procs -C python && check_procs -C test.py && check_procs -a test.py
PROCS OK: 1 process with command name 'python'
PROCS OK: 0 processes with command name 'test.py'
PROCS OK: 1 process with args 'test.py'
#launch via python implicitly
$ ./test.py &
[2] 27134
$ check_procs -C python && check_procs -C test.py && check_procs -a test.py
PROCS OK: 1 process with command name 'python'
PROCS OK: 1 process with command name 'test.py'
PROCS OK: 2 processes with args 'test.py'
#PS 'COMMAND' output looks the same
$ ps 27094 27134
PID TTY STAT TIME COMMAND
27094 pts/6 S 0:00 /usr/bin/python ./test.py
27134 pts/6 S 0:00 /usr/bin/python ./test.py
#kill the explicit test
$ kill 27094
[1] - terminated /usr/bin/python ./test.py
$ check_procs -C python && check_procs -C test.py && check_procs -a test.py
PROCS OK: 0 processes with command name 'python'
PROCS OK: 1 process with command name 'test.py'
PROCS OK: 1 process with args 'test.py'
#kill the implicit test
$ kill 27134
[2] + terminated ./test.py
$ check_procs -C python && check_procs -C test.py && check_procs -a test.py
PROCS OK: 0 processes with command name 'python'
PROCS OK: 0 processes with command name 'test.py'
PROCS OK: 0 processes with args 'test.py'
test.py is a python script that sleeps for 2 minutes. It is chmod +x and has a hashbang #! line invoking /usr/bin/python.
#!/usr/bin/python
import time
time.sleep(120)
Create a pid file and use that file for the process lookup with nagios.
I'm not saying this is the best solution (it wouldn't scale well at all), but you can create a symbolic link to the python command and execute your script using this link. e.g.
ln -s `which python` ~/mypython
~/mypython myscript.py
Scripts launched using the link should show up as mypython in ps.
You can use subprocess.Popen to change the executable name, but you'd have to use a wrapper script (or some weird fork magic). The following code causes ps to list the executable as kwyjibo /tmp/test.py instead of /usr/bin/python /tmp/test.py:
import subprocess
p = subprocess.Popen(['kwyjibo', '/tmp/test.py'], executable='/usr/bin/python')