I'm new to Airflow and I'm trying to run a job on an EC2 instance using Airflow's SSHOperator, as shown below:
t2 = SSHOperator(
    ssh_conn_id='ec2_ssh_connection',
    task_id='execute_script',
    command="nohup python test.py &",
    retries=3,
    dag=dag)
The job takes a few hours and I want Airflow to execute the Python script and end. However, when the command is executed and the DAG completes, the script is terminated on the EC2 instance. I also noticed that the above code doesn't create a nohup.out file.
I'm looking at how to run nohup using SSHOperator. It seems like this might be a Python-related issue, because the script on the EC2 instance reports the following error once nohup has been executed:
[Errno 32] Broken pipe
Thanks!
Airflow's SSHHook uses the Paramiko module for SSH connectivity. There is an SO question regarding Paramiko and nohup. One of the answers suggests adding a sleep after the nohup command. I cannot explain exactly why, but it actually works. It is also necessary to set get_pty=True in SSHOperator.
Here is a complete example that demonstrates the solution:
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.ssh_operator import SSHOperator

default_args = {
    'start_date': datetime(2001, 2, 3, 4, 0),
}

with DAG(
    'a_dag', schedule_interval=None, default_args=default_args, catchup=False,
) as dag:
    op = SSHOperator(
        task_id='ssh',
        ssh_conn_id='ssh_default',
        command=(
            'nohup python -c "import time;time.sleep(30);print(1)" & sleep 10'
        ),
        get_pty=True,  # This is needed!
    )
The nohup.out file is written to the user's $HOME.
I want to build Airflow tasks that use multiple gcloud commands.
A simple example:
import subprocess

def worker(**kwargs):
    exe = subprocess.run(["gcloud", "compute", "instances", "list"],
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    print(exe.returncode)
    for line in exe.stdout.splitlines():
        print(line.decode())

    exe = subprocess.run(["gcloud", "compute", "ssh", "user@host", "--command=pwd"],
                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    print(exe.returncode)
    for line in exe.stdout.splitlines():
        print(line.decode())

dag = DAG("TEST", default_args=default_args, schedule_interval=None)

worker_task = PythonOperator(task_id='sample-task', python_callable=worker, provide_context=True, dag=dag)

worker_task
I have this error:
ERROR: gcloud crashed (AttributeError): 'NoneType' object has no attribute 'isatty'
Apart from Airflow, these commands work fine.
I've already tried disabling gcloud interactive mode with "--quiet", but that doesn't help.
I don't want to use the GcloudOperator from Airflow, because these commands must be integrated into a custom operator.
Thank you in advance for your help.
As I see it, your two commands are independent, so you can run them as two separate tasks using the BashOperator. If you want to access the output of the commands, the output of each one will be available as an XCom, which you can read with ti.xcom_pull(task_ids='<the task id>').
Maybe use BashOperator?
worker_task = BashOperator(task_id="sample-task", bash_command='gcloud compute instances list', dag=dag)
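Putting those two suggestions together, here is a minimal sketch of the two-task approach (the DAG id, task ids, start date and the user@host placeholder are assumptions; xcom_push=True follows the Airflow 1.x BashOperator API used elsewhere in this thread):
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator

with DAG('gcloud_example', start_date=datetime(2019, 1, 1), schedule_interval=None) as dag:
    # Each gcloud call becomes its own task; the last line of stdout is pushed to XCom.
    list_instances = BashOperator(
        task_id='list_instances',
        bash_command='gcloud compute instances list',
        xcom_push=True,
    )

    remote_pwd = BashOperator(
        task_id='remote_pwd',
        bash_command='gcloud compute ssh user@host --command=pwd',
        xcom_push=True,
    )

    def read_outputs(**context):
        # Pull whatever the two bash tasks pushed to XCom.
        ti = context['ti']
        print(ti.xcom_pull(task_ids='list_instances'))
        print(ti.xcom_pull(task_ids='remote_pwd'))

    show_outputs = PythonOperator(
        task_id='show_outputs',
        python_callable=read_outputs,
        provide_context=True,
    )

    [list_instances, remote_pwd] >> show_outputs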
I am a newbie to Airflow and struggling with the BashOperator. I want to run a shell script using the BashOperator in my dag.py.
I checked:
How to run bash script file in Airflow
and
BashOperator doen't run bash file apache airflow
on how to access a shell script through the BashOperator.
This is what I did:
cmd = "./myfirstdag/dag/lib/script.sh "
t_1 = BashOperator(
    task_id='start',
    bash_command=cmd
)
On running my recipe and checking in Airflow, I got the error below:
[2018-11-01 10:44:05,078] {bash_operator.py:77} INFO - /tmp/airflowtmp7VmPci/startUDmFWW: line 1: ./myfirstdag/dag/lib/script.sh: No such file or directory
[2018-11-01 10:44:05,082] {bash_operator.py:80} INFO - Command exited with return code 127
[2018-11-01 10:44:05,083] {models.py:1361} ERROR - Bash command failed
Not sure why this is happening. Any help would be appreciated.
Thanks!
EDIT NOTE: I assume that it's searching in some Airflow tmp location rather than the path I provided. But how do I make it search the right path?
Try this:
bash_operator = BashOperator(
    task_id='task',
    # keep the trailing space: without it Airflow treats a value ending in .sh as a Jinja template file to load
    bash_command='${AIRFLOW_HOME}/myfirstdag/dag/lib/script.sh ',
    dag=your_dag)
For those running a Docker version:
I had this same issue, and it took me a while to realise the problem; the behaviour can be different with Docker. When the DAG is run, Airflow copies the command into a tmp file. If you are not running Airflow in Docker, this stays on the same machine. With my Docker setup, the task is moved to another container to run, which of course does not have the script file on it.
Check the task logs carefully; you should see this happen before the task is run.
This may also depend on your airflow-docker setup.
Try the following. It needs the full file path to your bash file.
cmd = "/home/notebook/work/myfirstdag/dag/lib/script.sh "
t_1 = BashOperator(
    task_id='start',
    bash_command=cmd
)
Are you sure of the path you defined?
cmd = "./myfirstdag/dag/lib/script.sh "
With the leading . it means the path is relative to the directory where your command is executed.
Could you try this?
cmd = "find . -type f"
Try running this:
path = "/home/notebook/work/myfirstdag/dag/lib/script.sh"
copy_script_cmd = 'cp ' + path + ' .; '
# trailing space so Airflow doesn't treat the *.sh string as a Jinja template file
execute_cmd = './script.sh '

t_1 = BashOperator(
    task_id='start',
    bash_command=copy_script_cmd + execute_cmd
)
I am trying to execute 4 commands in a container (it has a MySQL database). If I run them in another terminal they work, but if I create the container and then execute the commands straight away, it does not work. I have this code:
This code creates the container but does not execute commands 1, 2, 3 and 4.
import docker
from docker.types import Mount
from threading import Thread

client = docker.DockerClient(base_url='unix://var/run/docker.sock')

container = client.containers.run(
    "python_base_image:v02",
    detach=True,
    name='201802750001M04',
    ports={'3306/tcp': None, '80/tcp': None},
    mounts=[Mount("/var/lib/mysql", "201802750001M04_backup_db", type='volume')]
)

command1 = "sed -i '/bind/s/^/#/g' /etc/mysql/my.cnf"
command2 = ("mysql --user=\"root\" --password=\"temprootpass\" "
            "--execute=\"GRANT ALL PRIVILEGES ON *.* TO 'macripco'@'172.17.0.1' IDENTIFIED BY '12345';\"")
command3 = ("mysql --user=\"root\" --password=\"temprootpass\" "
            "--execute=\"GRANT ALL PRIVILEGES ON *.* TO 'macripco'@'localhost' IDENTIFIED BY '12345';\"")
command4 = "sudo /etc/init.d/mysql restart"

a = container.exec_run(command1, detach=False, stream=True, stderr=True, stdout=True)
b = container.exec_run(command2, detach=False, stream=True, stderr=True, stdout=True)
c = container.exec_run(command3, detach=False, stream=True, stderr=True, stdout=True)
d = container.exec_run(command4, detach=False, stream=True, stderr=True, stdout=True)
But if I execute the commands later (in another terminal), once the container has been created, they work. I need to create the container and execute the commands together.
Thanks.
It was a problem of execution timing; it was resolved with a time.sleep(10) between the two steps, i.e. after creating the container and before calling exec_run.
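A minimal sketch of that fix (the image, container name and first command are taken from the question; the 10-second delay is the value mentioned above):
import time

import docker

client = docker.DockerClient(base_url='unix://var/run/docker.sock')

container = client.containers.run("python_base_image:v02", detach=True, name='201802750001M04')

# Give MySQL inside the container time to come up before exec'ing commands into it.
time.sleep(10)

result = container.exec_run("sed -i '/bind/s/^/#/g' /etc/mysql/my.cnf")
print(result.exit_code)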
I want to run shell commands (usually file manipulation) on a cluster (a Spark cluster with 1 master and 3 worker nodes).
There is passwordless SSH between all the machines in the cluster.
The file directories are the same on all cluster nodes.
Currently I am handling file-manipulation shell commands like this:
# let's say we copy or move a file from one dir to another
import os, sys

os.system('ssh user@Ip_of_worker-1 "cp directory_1/file1.csv directory_2"')
os.system('ssh user@Ip_of_worker-2 "cp directory_1/file1.csv directory_2"')
os.system('ssh user@Ip_of_worker-3 "cp directory_1/file1.csv directory_2"')
I am looking for a Python package to do that; generally I am trying to avoid a system call every time I want to run a shell command (and I need stdout and stderr for each command run on the different cluster nodes in the running Python script's log).
The shell commands should run in parallel/simultaneously on all target nodes.
Please guide me if you are aware of, or have used, any such package.
You could use a library implementing the SSH protocol, for example paramiko, if you are not happy with system or subprocess: http://docs.paramiko.org/en/2.1/
Hannu
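A minimal paramiko sketch of that idea, run once per worker (the host list and username are assumptions, and it relies on the passwordless SSH keys already being in place; it does not yet run the hosts in parallel):
import paramiko

hosts = ['Ip_of_worker-1', 'Ip_of_worker-2', 'Ip_of_worker-3']
cmd = 'cp directory_1/file1.csv directory_2'

for host in hosts:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username='user')  # uses the existing passwordless SSH keys
    stdin, stdout, stderr = client.exec_command(cmd)
    # exec_command returns file-like objects for the remote stdout/stderr
    print(host, stdout.read().decode(), stderr.read().decode())
    client.close()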
Try looking at pdsh and calling it from Python.
https://linux.die.net/man/1/pdsh
Example
http://www.linux-magazine.com/Issues/2014/166/Parallel-Shells
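A small sketch of calling pdsh from Python via subprocess (it assumes pdsh is installed and reuses the worker hostnames from the question as placeholders):
import subprocess

# -w takes a comma-separated list of target hosts; pdsh runs the command on all of them in parallel
result = subprocess.run(
    ['pdsh', '-w', 'Ip_of_worker-1,Ip_of_worker-2,Ip_of_worker-3',
     'cp directory_1/file1.csv directory_2'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)

print(result.stdout.decode())
print(result.stderr.decode())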
It sounds like you want Fabric - http://www.fabfile.org/
From their basic example:
from fabric.api import run
def host_type():
    run('uname -s')
Gets you:
$ fab -H localhost,linuxbox host_type
[localhost] run: uname -s
[localhost] out: Darwin
[linuxbox] run: uname -s
[linuxbox] out: Linux
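Adapted to the copy command from the question, a sketch using the same Fabric 1.x API (the username and worker IPs are assumptions):
from fabric.api import env, run

env.hosts = ['user@Ip_of_worker-1', 'user@Ip_of_worker-2', 'user@Ip_of_worker-3']

def copy_file():
    # run() executes the command on each host in env.hosts when invoked via `fab copy_file`
    run('cp directory_1/file1.csv directory_2')
Run it with fab copy_file; adding -P (or setting env.parallel = True) should execute it on the hosts in parallel.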
You can do something like this:
#!/usr/bin/python
import thread
import subprocess

# Define a function for the thread
def run_remote(host, delay):
    remote_cmd = 'cp directory_1/file1.csv directory_2'
    ssh = subprocess.Popen(['ssh', '-oStrictHostKeyChecking=no', host, remote_cmd],
                           shell=False,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
    result = ssh.stdout.readlines()
    if result == []:
        error = ssh.stderr.readlines()
        print "ERROR: %s" % error
    else:
        print result

# Create three threads as follows
try:
    thread.start_new_thread(run_remote, ("Ip_of_worker-1", 1))
    thread.start_new_thread(run_remote, ("Ip_of_worker-2", 1))
    thread.start_new_thread(run_remote, ("Ip_of_worker-3", 1))
except:
    print "Error: unable to start thread"
parallel-ssh is a non-blocking parallel ssh client that can do this:
from pssh.pssh2_client import ParallelSSHClient

client = ParallelSSHClient(['host1', 'host2'])
output = client.run_command('cp directory_1/file1.csv directory_2')
client.join(output)
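To also capture stdout and stderr per host, as the question asks, you can iterate over the returned output. A sketch for the pssh2_client shown above (in parallel-ssh 1.x run_command returns a dict keyed by host; 2.x returns a list of HostOutput objects instead):
# Each host's output exposes stdout/stderr as generators of decoded lines.
for host, host_output in output.items():
    for line in host_output.stdout:
        print(host, line)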
When I execute a simple command like "net start", I get the output successfully, as shown below.
Python script:
import os

def test():
    cmd = ' net start '
    output = os.popen(cmd).read()
    print output

test()
Output:
C:\Users\test\Desktop\service>python test.py
These Windows services are started:
Application Experience
Application Management
Background Intelligent Transfer Service
Base Filtering Engine
Task Scheduler
TCP/IP NetBIOS Helper
The command completed successfully.
C:\Users\test\Desktop\service>
But when I execute longer commands (for example: net start "windows search"), I am NOT getting any output.
Python script:
import os

def test():
    cmd = ' net start "windows search" '
    output = os.popen(cmd).read()
    print output

test()
Output:
C:\Users\test\Desktop\service>python test.py
C:\Users\test\Desktop\service>
I have also tried "net start \"windows search\"", but it is the same issue.
Can anyone guide me on this, please?
From the documentation:
Deprecated since version 2.6: This function is obsolete. Use the subprocess module. Check especially the Replacing Older Functions with the subprocess Module section.
subprocess.Popen(['net', 'start', 'windows search'], ...)
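A fuller sketch of that replacement, keeping the service name from the question (passing the arguments as a list avoids the shell-quoting problem around "windows search"):
import subprocess

def test():
    proc = subprocess.Popen(['net', 'start', 'windows search'],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    print(output.decode(errors='replace'))

test()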