Run a series of external commands in a Python script


I'm trying to run external commands (note the plural) from a Python script. I've been reading about the subprocess module and have started using it. It works when I have a single command or several independent commands to run, whether or not I'm interested in the stdout.
What I want to do is a bit different: I want something persistent. Basically, the first command I run logs in to an application; then I can run other commands which only work if I'm logged in. For some of these commands, I need the stdout.
So when I use subprocess to login, it does it, but then the process is killed and next time I run a command with subprocess, I'm not logged in anymore... I just need to run a series of commands, like I would do it in a terminal.
Any idea how to do that?

You can pass in an arbitrarily complex series of commands with shell=True, though I would generally advise against doing that, not least because it makes your Python script platform-dependent.
import subprocess

result = subprocess.check_output('''
servers=0
for server in one two three four; do
    output=$(printf 'echo moo\necho bar\necho baz\n' | ssh "$server")
    case $output in *"hello"*) echo "$output";; esac
    echo "$output" | grep -q 'ALERT' && echo "$server: Intrusion detected"
    # servers=$((servers++)) would assign the old value back, so the count would stay 0
    servers=$((servers+1))
done
echo "$servers hosts checked"
''', shell=True)
One of the problems with shell script (or I guess Powershell or cmd batch script if you are in that highly unfortunate predicament) is that doing what you are vaguely describing is often hard to do with a bunch of unconnected processes. E.g. curl has a crude way to maintain a session between separate invocations by keeping a "cookie jar" which allows one curl to pass on login credentials etc to an otherwise independent curl call later on, but there is no good, elegant, general mechanism for this. If at all possible, doing these things from within Python would probably make your script more robust as well as simpler and more coherent.
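If the application can be driven from a single long-lived shell, one option is to keep that shell process open and feed it commands one at a time, so later commands see the state left by earlier ones. The following is a minimal sketch, assuming a POSIX shell at /bin/sh; the ShellSession class and its marker scheme are illustrative names, not part of any library.

```python
import subprocess
import uuid


class ShellSession:
    """Keep one shell process alive so later commands see earlier state."""

    def __init__(self, shell="/bin/sh"):
        self.proc = subprocess.Popen(
            [shell],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # interleave stderr with stdout
            text=True,
            bufsize=1,
        )

    def run(self, command):
        """Run one command in the session and return its output as a string."""
        # A unique marker line tells us where this command's output ends.
        marker = f"__done_{uuid.uuid4().hex}__"
        self.proc.stdin.write(f"{command}\necho {marker}\n")
        self.proc.stdin.flush()
        chunks = []
        for line in self.proc.stdout:
            stripped = line.rstrip("\n")
            if stripped.endswith(marker):
                # Keep any output that preceded the marker on the same line.
                chunks.append(stripped[: -len(marker)])
                break
            chunks.append(line)
        return "".join(chunks)

    def close(self):
        """Close stdin so the shell exits, then wait for it."""
        self.proc.stdin.close()
        self.proc.wait()
```

For example, after session.run('TOKEN=abc123'), a later session.run('echo "$TOKEN"') in the same session still sees the variable. The sketch reads until the marker appears, so it will block if a command waits for interactive input.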

Related

bash script that is run from Python reaches sudo timeout

This is a long bash script (400+ lines ) that is originally invoked from a django app like so -
os.system('./bash_script.sh &> bash_log.log')
It stops on a random command in the script. If the order of commands is changed, it hangs on another command in approx. the same location.
sshing to the machine that runs the django app, and running sudo ./bash_script.sh, asks for a password and then runs all the way.
I can't see the message it presents when it hangs in the log file, couldn't make it redirect there. I assume it's a sudo password request.
Tried -
sudo -v in the script - didn't help.
ssh to the machine and manually extend the sudo timeout in /etc/sudoers - didn't help, I think because the django app is already running and uses the previous timeout.
splitting the script in two, and running one in separate thread, like so -
import os
from subprocess import Popen
from threading import Thread

def basher(command, log_path):
    # Open the log for writing so the script's output can be stored in it
    with open(log_path, 'w') as log:
        Popen(command, stdout=log, stderr=log).wait()

script_thread = Thread(target=basher, args=('bash_script_pt1.sh', 'bash_log_pt1.log'))
script_thread.start()
os.system('./bash_script_pt2.sh &> bash_log_pt2.log') # I know it's deprecated, not sure if maybe it's better in this case
script_thread.join()
The logs showed that part 1 ended ok, but part 2 still hangs, albeit later in the code than when they were together.
I thought to edit /etc/sudoers from inside the Python code, and then re-login via su - user. There are snippets of how to pass the password using pty, however I don't understand the mechanics of it and could not get it to work.
I also noted that ps aux | grep bash_script.sh shows that the script is being run twice. As -
/bin/bash bash_script.sh
and as
sh -c bash_script.sh.
I assume os.system has an internal shell=True going on.
I don't understand the Linux entities/mechanics in play to figure out what's happening.

My guess is that the django app runs with different, more limited permissions than you have in your ssh session, and the script inherits those restrictions because the app is what executes it.
You need to find out what permissions the script has when you run it just from bash, and what it has when you run it via django, and then figure out what the difference is.
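One way to compare the two environments is to log the same small report from both places: once from an interactive session and once from inside the django process. The sketch below is only an illustration; describe_privileges is a hypothetical helper name, and it assumes that `sudo -n true` (non-interactive mode) is an acceptable probe for whether sudo would prompt for a password.

```python
import os
import pwd
import shutil
import subprocess


def describe_privileges():
    """Return the effective user, groups, and whether sudo works without a prompt."""
    uid = os.geteuid()
    try:
        user = pwd.getpwuid(uid).pw_name
    except KeyError:
        user = None  # uid has no passwd entry (possible in containers)
    info = {"uid": uid, "user": user, "groups": sorted(os.getgroups())}

    sudo = shutil.which("sudo")
    if sudo is None:
        info["passwordless_sudo"] = None  # sudo is not installed
    else:
        # -n makes sudo fail immediately instead of asking for a password.
        result = subprocess.run(
            [sudo, "-n", "true"],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        info["passwordless_sudo"] = result.returncode == 0
    return info
```

If the report from the django process shows a different user, or passwordless_sudo is False there, a hidden password prompt is a plausible cause of the hang.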

How to run a python or bash script interactively on a webpage?

I am building a website and I would like to show a terminal on my webpage which runs a script (python or bash) interactively.
Something like trinket.io but I would like to use the python interpreter or the bash I have on my server, so I could install pip packages and in general control every aspect of the script.
I was thinking at something like an interactive frame which shows the terminal and what's executed in it, obv with user interaction supported.
A good example is https://create.withcode.uk/, it's exactly what I want but I would like to host it on my own server with my own modules and ecosystem. This seems to be pretty good also on the security side.
Is there anything like that?

If I understand correctly, you are looking for a mechanism that lets you display a terminal on a web page,
and you then want to run an interactive python script in that terminal.
So in the end, the terminal-sharing solution does not necessarily have to be written in python, right? (Though I must admit that I prefer python solutions if I can find them, sometimes being pragmatic isn't a bad idea.)
You might search for web-based terminal emulators.
Perhaps ttyd fits the bill. https://github.com/tsl0922/ttyd
Building on linux could be done with
sudo apt-get install build-essential cmake git libjson-c-dev libwebsockets-dev
git clone https://github.com/tsl0922/ttyd.git
cd ttyd && mkdir build && cd build
cmake ..
make && make install
Usage would be something like:
ttyd -p 8888 yourpythonscript.py
and then you could connect with a web browser with http://hostip:8888
You might of course 'hide' this url behind a reverse proxy and add authentication to it,
or add options like --credential username:password to password-protect the url.
Addendum:
If you want to share multiple scripts with different people, and the sharing is more of an on-the-fly thing, then you might look at tty-share ( https://github.com/elisescu/tty-share ) and tty-server ( https://github.com/elisescu/tty-server )
tty-server can be run in a docker container.
tty-share can be used to run a script on your machine in one of your terminals. It will output a url that you can give to the person you want to share that specific session with.
If you think that's interesting, I might elaborate on this one.

>> Insert security disclaimer here <<
Easiest most hacktastic way to do it is to create a div element where you'll store your output and an input element to enter commands. Then you can ajax POST the command to a back-end controller.
The controller would take the command and run it while capturing the output of the command and sending it back to the web page for it to render it in the div
In python I use this to capture command output:
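from subprocess import Popen, STDOUT, PIPE

def run_command():
    # Merge stderr into stdout so the page shows everything the command printed
    proc = Popen(['ls', '-l'], stdout=PIPE, stderr=STDOUT, cwd='/working/directory')
    # communicate() reads while waiting, so a large output cannot fill the pipe and deadlock
    output, _ = proc.communicate()
    return output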
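As a partial stand-in for the security disclaimer, here is a sketch of a controller-side helper that refuses anything outside a small allow-list and never goes through a shell. The ALLOWED set and the run_allowed name are made up for illustration, and an allow-list alone is not a complete defence for a public site.

```python
import shlex
import subprocess

# Hypothetical allow-list; adjust to the commands the page should expose.
ALLOWED = {"ls", "echo", "date"}


def run_allowed(command_line, timeout=5):
    """Run command_line without a shell if its program is allow-listed."""
    args = shlex.split(command_line)
    if not args or args[0] not in ALLOWED:
        raise ValueError(f"command not allowed: {command_line!r}")
    completed = subprocess.run(
        args,                      # a list, so no shell interprets the input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # show errors in the same output pane
        text=True,
        timeout=timeout,           # raises TimeoutExpired for runaway commands
    )
    return completed.stdout
```

Because the arguments are passed as a list, shell metacharacters such as ; or | in the submitted text are treated as plain arguments rather than being executed.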

How to interact with new shell I logged into from script?

Not able to send commands to shell I logged into
Originally, I wrote a Python script. It was able to send commands like
subprocess.run(['kubectl', 'config', 'get-context'], shell=True)
but when it came time to get to the child shell, in this case bash, the command wouldn't run until I exited that shell and it would say things like it couldn't find the command.
I then tried to do it with the module "sh," but was also unsuccessful
I thought maybe using Python was problem and also realized my ultimate goal was to use a different shell (cypher-shell) and so skipped immediately to that with bash as the parent shell. In there I have a line that is sometimes successful, sometimes not
kubectl run -it --rm cypher-shell --image=gcr.io/cloud-marketplace/neo4j-public/causal-cluster-k8s:3.4 --restart=Never --namespace=default --command -- ./bin/cypher-shell -u neo4j -p "password" -a "domain.name"
But even when it logs in successfully, it just hangs until I manually exit, and only then does it run the next commands.
Note: I saw this and so, perhaps, it's not a child shell? Run shell command from child shell

I can't say I know exactly what you are doing, but if I understand your objective correctly you want the Python program to continue to log while the script continues to run? The problem is that the logger continues to run and holds up your program. The way I would deal with that would be to run the logger as a background process.
With bash, that would be ./script.sh & which would make it run without holding the rest of the program back from running.
Hopefully that may give you an idea! Good luck.
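A hang like the one described often means the child shell is sitting at its prompt waiting for input from a terminal. If the child shell also accepts commands on standard input (an assumption here, and for kubectl it would presumably mean attaching with -i rather than -it), the commands can be handed over in one go. A minimal sketch, using sh as a stand-in for the real shell; run_in_child_shell is a hypothetical helper:

```python
import subprocess


def run_in_child_shell(shell_argv, commands, timeout=60):
    """Start shell_argv, send the commands on stdin, and return its stdout."""
    script = "\n".join(commands) + "\n"
    completed = subprocess.run(
        shell_argv,
        input=script,         # written to the child's stdin, then stdin is closed
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Because stdin is closed after the last command, the child shell sees end-of-file and exits on its own instead of waiting for someone to type exit.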

Checking for dead links locally in a static website (using wget?)

A very nice tool to check for dead links (e.g. links pointing to 404 errors) is wget --spider. However, I have a slightly different use-case where I generate a static website, and want to check for broken links before uploading. More precisely, I want to check both:
Relative links like file.pdf
Absolute links, most likely to external sites like example.
I tried wget --spider --force-html -i file-to-check.html, which reads the local file, treats it as HTML and follows each link. Unfortunately, it can't deal with relative links within the local HTML file (it errors out with Cannot resolve incomplete link some/file.pdf). I tried using file:// but wget does not support it.
Currently, I have a hack based on running a local webserver through python3 -m http.server and checking the local files through HTTP:
python3 -m http.server &
pid=$!
sleep .5
error=0
wget --spider -nd -nv -H -r -l 1 http://localhost:8000/index.html || error=$?
kill $pid
wait $pid
exit $error
I'm not really happy with this for several reasons:
I need this sleep .5 to wait for the webserver to be ready. Without it, the script fails, but I can't guarantee that 0.5 seconds will be enough. I'd prefer having a way to start the wget command when the server is ready.
Conversely, this kill $pid feels ugly.
Ideally, python3 -m http.server would have an option to run a command when the server is ready and would shutdown itself after the command is completed. That sounds doable by writing a bit of Python, but I was wondering whether a cleaner solution exists.
Did I miss anything? Is there a better solution? I'm mentioning wget in my question because it does almost what I want, but using wget is not a requirement for me (nor is python -m http.server). I just need to have something easy to run and automate on Linux.

So I think you are running in the right direction. I would use wget and python as they are two readily available options on many systems. And the good part is that it gets the job done for you. Now what you want is to listen for Serving HTTP on 0.0.0.0 from the stdout of that process.
So I would start the process using something like below
python3 -u -m http.server > ./myserver.log &
Note the -u I have used here for unbuffered output, this is really important
Now next is waiting for this text to appear in myserver.log
timeout 10 awk '/Serving HTTP on 0.0.0.0/{print; exit}' <(tail -f ./myserver.log)
So 10 seconds is your maximum wait time here. And rest is self-explanatory. Next about your kill $pid. I don't think it is a problem, but if you want it to be more like the way a user does it then I would change it to
kill -s SIGINT $pid
This is equivalent to pressing CTRL+C after launching the program. I would also handle SIGINT in the bash script itself, using something like the approach described here:
https://unix.stackexchange.com/questions/313644/execute-command-or-function-when-sigint-or-sigterm-is-send-to-the-parent-script/313648
The above basically adds the following to the top of the bash script, to handle the script being killed with CTRL+C or an external kill signal
#!/bin/bash
exit_script() {
    echo "Printing something special!"
    echo "Maybe executing other commands!"
    trap - SIGINT SIGTERM # clear the trap
    kill -- -$$ # Sends SIGTERM to child/sub processes
}
trap exit_script SIGINT SIGTERM
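An alternative to watching the log for the startup message is to poll the port until it accepts connections. This is a sketch of that idea rather than part of the answer above; wait_for_port is a hypothetical helper, and the default timeout simply mirrors the 10 seconds used with awk.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.1):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not ready yet, retry shortly
    return False
```

Calling wait_for_port("localhost", 8000) before running wget would replace the fixed sleep .5 with a wait that ends as soon as the server is reachable.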

Tarun Lalwani's answer is correct, and by following the advice given there one can write a clean and short shell script (relying on Python and awk). Another solution is to write the script completely in Python, giving a slightly more verbose but arguably cleaner script. The server can be launched in a thread, then the command to check the website is executed, and finally the server is shut down. We no longer need to parse the textual output or send a signal to an external process. The key parts of the script are therefore:
import subprocess
import sys
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler

def start_server(port,
                 server_class=HTTPServer,
                 handler_class=SimpleHTTPRequestHandler):
    server_address = ('', port)
    # The socket is bound and listening as soon as the server object exists
    httpd = server_class(server_address, handler_class)
    thread = threading.Thread(target=httpd.serve_forever)
    thread.start()
    return httpd

def main(cmd, port):
    httpd = start_server(port)
    status = subprocess.call(cmd)
    # shutdown() stops serve_forever(), which lets the server thread finish
    httpd.shutdown()
    sys.exit(status)
I wrote a slightly more advanced script (with a bit of command-line option parsing on top of this) and published it as: https://gitlab.com/moy/check-links
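A variation on the same idea, shown here only as a sketch and not taken from the published script: let the operating system pick a free port and wrap the server's lifetime in a context manager, so it is shut down even if the check raises. The serve_directory name is hypothetical; the directory argument of SimpleHTTPRequestHandler requires Python 3.7 or later.

```python
import contextlib
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


@contextlib.contextmanager
def serve_directory(directory):
    """Serve `directory` over HTTP on a free local port; yield the base URL."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the OS for any free port; the socket listens immediately.
    httpd = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    try:
        yield f"http://127.0.0.1:{httpd.server_address[1]}/"
    finally:
        httpd.shutdown()      # stop serve_forever()
        httpd.server_close()  # release the listening socket
        thread.join()
```

Inside a `with serve_directory("public") as base_url:` block, the link checker can be run against base_url with subprocess.call, and the server goes away when the block exits.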

Is it possible to run SLURM jobs in the background using SRUN instead of SBATCH?

I was trying to run slurm jobs with srun in the background. Unfortunately, because I currently have to run things through docker, it's a bit annoying to use sbatch, so I am trying to find out if I can avoid it altogether.
From my observations, whenever I run srun, say:
srun docker image my_job_script.py
and close the window where I was running the command (to avoid receiving all the print statements) and open another terminal window to see if the command is still running, it seems that my running script is for some reason cancelled or something. Since it isn't through sbatch it doesn't send me a file with the error log (as far as I know) so I have no idea why it closed.
I also tried:
srun docker image my_job_script.py &
to give control back to me in the terminal. Unfortunately, if I do that it still keeps printing things to my terminal screen, which I am trying to avoid.
Essentially, I log into a remote computer through ssh and then do a srun command, but it seems that if I terminate the communication of my ssh connection, the srun command is automatically killed. Is there a way to stop this?
Ideally I would like to essentially send the script to run and not have it be cancelled for any reason unless I cancel it through scancel and it should not print to my screen. So my ideal solution is:
keep running srun script even if I log out of the ssh session
keep running my srun script even if I close the window from which I sent the command
keep running my srun script and let me leave the srun session without printing to my screen (i.e. essentially run in the background)
This would be my ideal solution.
For the curious crowd that want to know the issue with sbatch, I want to be able to do (which is the ideal solution):
sbatch docker image my_job_script.py
however, as people will know it does not work because sbatch receives the command docker which isn't a "batch" script. Essentially a simple solution (that doesn't really work for my case) would be to wrap the docker command in a batch script:
#!/usr/bin/sh
docker image my_job_script.py
Unfortunately, I am actually using my batch script to encode a lot of information (sort of like a config file) about the task I am running. So doing that might affect jobs I run, because their underlying file is changing. That is avoided by sending the job directly to sbatch, since it essentially creates a copy of the batch script (as noted in this question: Changing the bash script sent to sbatch in slurm during run a bad idea?). So the real solution to my problem would be to have my batch script contain all the information that my script requires, and then somehow call docker from python while passing it all that information. Unfortunately, some of the information consists of function pointers and objects, so it's not even clear to me how I would pass such things to a docker command run from python.
Or maybe being able to pass docker directly to sbatch, instead of using a batch script, would also solve the problem.

The outputs can be redirected with the options -o for stdout and -e for stderr.
So, the job can be launched in the background with its outputs redirected:
$ srun -o file.out -e file.err docker image my_job_script.py &
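The same launch can be scripted from Python. The sketch below is an assumption-laden illustration: launch_detached is a hypothetical helper, and start_new_session=True only detaches the process from the controlling terminal (similar in spirit to setsid); whether Slurm keeps the job alive after logout still depends on srun and the cluster configuration.

```python
import subprocess


def launch_detached(argv, stdout_path, stderr_path):
    """Start argv in its own session with output sent to files; return the Popen."""
    out = open(stdout_path, "w")
    err = open(stderr_path, "w")
    try:
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,  # never wait on the terminal for input
            stdout=out,
            stderr=err,
            start_new_session=True,    # detach from the controlling terminal
        )
    finally:
        # The child holds its own copies of the descriptors.
        out.close()
        err.close()
```

For the command above, argv would be something like ["srun", "docker", "image", "my_job_script.py"], with the two paths playing the role of -o and -e.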

Another approach is to use a terminal multiplexer like tmux or screen.
For example, to create a new tmux session, type tmux. In that session, use srun with your script. From there, you can detach from the session, which returns you to your main shell so you can go about your other business, or you can log off entirely. When you want to check in on your script, just reattach to the tmux session. See the tmux documentation (man tmux) for how to detach and reattach on your OS.
Any output redirects using the -o or -e will still work with this technique and you can run multiple srun commands concurrently in different tmux windows. I’ve found this approach useful to run concurrent pipelines (genomics in this case).

I was wondering this too, because the differences between sbatch and srun are not very clearly explained or motivated. I looked at the code and found:
sbatch
sbatch pretty much just sends a shell script to the controller, tells it to run it and then exits. It does not need to keep running while the job is happening. It does have a --wait option to stay running until the job is finished but all it does is poll the controller every 2 seconds to ask it.
sbatch can't run a job across multiple nodes - the code simply isn't in sbatch.c. sbatch is not implemented in terms of srun, it's a totally different thing.
Also its argument must be a shell script. Bit of a weird limitation but it does have a --wrap option so that it can automatically wrap a real program in a shell script for you. Good luck getting all the escaping right with that!
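On the escaping point, Python's shlex module can build the quoted string for --wrap from an argument list. This is only a sketch: build_sbatch_command is a hypothetical helper, it only constructs the command line (it does not submit anything), and shlex.join requires Python 3.8 or later.

```python
import shlex


def build_sbatch_command(argv):
    """Return an sbatch argument list that wraps argv as a single command string."""
    # shlex.join quotes each argument so the shell that sbatch generates
    # splits the string back into exactly the original arguments.
    return ["sbatch", f"--wrap={shlex.join(argv)}"]
```

The resulting list can be passed to subprocess.run without shell=True, so only one layer of quoting is involved.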
srun
srun is more like an MPI runner. It directly starts tasks on lots of nodes (one task per node by default though you can override that with --ntasks). It's intended for MPI so all of the jobs will run simultaneously. It won't start any until all the nodes have a slot free.
It must keep running while the job is in progress. You can send it to the background with & but this is still different to sbatch. If you need to start a million sruns you're going to have a problem. A million sbatchs should (in theory) work fine.
There is no way to have srun exit and leave the job still running like there is with sbatch. srun itself acts as a coordinator for all of the nodes in the job, and it updates the job status etc. so it needs to be running for the whole thing.
