Run Python Code on SSH Target using Airflow - python

There are 2 systems: A and B. Airflow Scheduler, webserver, redis and flower runs on A while an Airflow worker runs on B. Both systems are running Ubuntu 18.04 and uses Airflow 1.10.10 in docker containers.
Is it possible to create a DAG that remotely runs Python code (defined in that DAG) on B?
SSHOperator allows the remote execution of a bash command on B over SSH, but we require a remote execution of Python code over SSH instead.
Thank you!

I don't know if you've gotten your answer already, but I had a very similar (if not the very same) problem until just a few moments ago and thought I could provide the answer here.
The easiest way is to mount a shared folder onto both nodes so they can both access the actual physical DAG files.
More details about my case can be found here.

Related

How do you schedule some python scripts to run regularly on a Windows PC?

I have some python scripts that I look to run daily form a Windows PC.
My current workflow is:
The desktop PC stays all every day except for a weekly restart over the weekend
After the restart I open VS Code and run a little bash script ./start.sh that kicks off the tasks.
The above works reasonably fine, but it is also fairly painful. I need to re-run start.sh if I ever close VS Code (eg. for an update). Also the processes use some local python libraries so I need to stop them if I'm going to update them.
With regards to how to do this properly, 4 tools came to mind:
Windows Scheduler
Airflow
Prefect (https://www.prefect.io/)
Rocketry (https://rocketry.readthedocs.io/en/stable/)
However, I can't quite get my head around the fundamental issue that Prefect/Airflow/Rocketry run on my PC then there is nothing that will restart them after the PC reboots. I'm also not sure they will give me the isolation I'd prefer on these tools.
Docker came to mind, I could put each task into a docker image and run them via some form of docker swarm or something like that. But not sure if I'm re-inventing the wheel.
I'm 100% sure I'm not the first person in this situation. Could anyone point me to a guide on how this could be done well?
Note:
I am not considering running the python scripts in the cloud. They interact with local tools that are only licenced for my PC.
You can definitely use Prefect for that - it's very lightweight and seems to be matching what you're looking for. You install it with pip install prefect, start Orion API server: prefect orion start and once you create a Deployment, and start an agent prefect agent start -q default you can even configure schedule from the UI
For more information about Deployments, check our FAQ section.
It sounds Rocketry could also be suitable. Rocketry can shut down itself using a task. You could do a task that:
Runs on the main thread and process (blocking starting new tasks)
Waits or terminates all the currently running tasks (use the session)
Calls session.shut_down() which sets a flag to the scheduler.
There is also a app configuration shut_cond which is simply a condition. If this condition is True, the scheduler exits so alternatively you can use this.
Then after the line app.run() you simply have a line that runs shutdown -r (restart) command on shell using a subprocess library, for example. Then you need something that starts Rocketry again when the restart is completed. For this, perhaps this could be an answer: https://superuser.com/a/954957, or use Windows scheduler to have a simple startup task that starts Rocketry.
Especially if you had Linux machines (Raspberry Pis for example), you could integrate Rocketry with FastAPI and make a small cluster in which Rocketry apps communicate with each other, just put script with Rocketry as a startup service. One machine could be a backup that calls another machine's API which runs Linux restart command. Then the backup executes tasks until the primary machine answers to requests again (is up and running).
But as the author of the library, I'm possibly biased toward my own projects. But Rocketry very capable on complex scheduling problems, that's the purpose of the project.
You can use schtasks for windows to schedule the tasks like running bash script or python script and it's pretty reliable too.

One-time running Python code when AWS EC2 instance start

I have a two EC2 instances named "aws-example" and "aws-sandbox". At same time, they are docker machines. "aws-example" is a manager and "aws-sandbox" is a worker in a docker-swarm.
I wrote two Python script, when run the script in "aws-example", it stopped to "aws-sandbox" instance and start it again.
When I run the script in "aws-sandbox", worker has to left the swarm and join again.
I do by my hand all of this process. However, I have to automate that. How I do the one-time running Python script in "aws-sandbox" when the "aws-sandbox" instance started? I've had investigate to services AWS Lambda, CloudWatch etc. and I'm very confused. Are here any person who have clear pathway?
Make use of #reboot /path/to/script.py in cron, it should work.
For more info check this out.

How to create a "watchdog" for a python script running on a shared host (no ssh access or shell scripts)

My friends and I have written a simple telegram bot in python. The script is run on a remote shared host. The problem is that for some reason the script stops from time to time, and we want to have some sort of a mechanism to check whether it is running or not and restart it if necessary.
However, we don't have access to ssh, we can't run bash scripts and I couldn't find a way to install supervisord. Is there a way to achieve the same result by using a different method?
P.S. I would appreciate it if you gave detailed a explanation as I'm a newbie hobbyist. However, I have no problem with researching and learning new things.
You can have a small supervisor Python script whose only purpose is to start (and restart) your main application Python script. When your application crashes the supervisor takes care and restarts it.

How to run a Python script remotely

We run many Python scripts for data processing tasks. We have a modeling computer that has been upgraded to provide the best performance for these tasks, but it is shared by many people that all need to run different scripts on it at the same time.
Is it possible for me to run a Python script remotely on that machine from my laptop while others are either directly logged into it or also remotely running a script?
Is SSH a possibility? I haven't ever run any scripts remotely aside from logging in via remote desktop. Ideally, I could start the Python script on that remote machine, but all the messages would be visible to me on my laptop. Does this sound doable?
EDIT:
I forgot to mention all machines are running Windows 7.
SSH is definitely the way to go and also have a look at Fabric.
Regarding your edit. You can use Fabric on Windows. And I think that using SSH on Windows will be a bit easier than dancing with their Powershell's remoting capabilities.
SSH does seem like it should meet your needs.
You could also consider setting up an iPython notebook server that everyone could use.
Its got nice parallel processing capabilities if you are doing some serious number crunching.

Execute remote python script via SSH

I want to execute a Python script on several (15+) remote machine using SSH. After invoking the script/command I need to disconnect ssh session and keep the processes running in background for as long as they are required to.
I have used Paramiko and PySSH in past so have no problems using them again. Only thing I need to know is how to disconnect a ssh session in python (since normally local script would wait for each remote machine to complete processing before moving on).
This might work, or something similar:
ssh user#remote.host nohup python scriptname.py &
Basically, have a look at the nohup command.
On Linux machines, you can run the script with 'at'.
echo "python scriptname.py" ¦ at now
If you are going to perform repetitive tasks on many hosts, like for example deploying software and running setup scripts, you should consider using something like Fabric
Fabric is a Python (2.5 or higher) library and command-line tool for
streamlining the use of SSH for application deployment or systems
administration tasks.
It provides a basic suite of operations for executing local or remote
shell commands (normally or via sudo) and uploading/downloading files,
as well as auxiliary functionality such as prompting the running user
for input, or aborting execution.
Typical use involves creating a Python module containing one or more
functions, then executing them via the fab command-line tool.
You can even use tmux in this scenario.
As per the tmux documentation:
tmux is a terminal multiplexer. It lets you switch easily between several programs in one terminal, detach them (they keep running in the background) and reattach them to a different terminal. And do a lot more
From a tmux session, you can run a script, quit the terminal, log in again and check back as it keeps the session until the server restart.
How to configure tmux on a cloud server

Categories