I want to run shell commands (usually file manipulation) on a cluster (a Spark cluster with 1 master and 3 worker nodes).
There is passwordless SSH between all the machines in the cluster.
The directory layout is the same on all cluster nodes.
Currently I handle file-manipulation shell commands like this:
# let's say copy or move a file from one dir to another dir
import os
os.system('ssh user@Ip_of_worker-1 "cp directory_1/file1.csv directory_2"')
os.system('ssh user@Ip_of_worker-2 "cp directory_1/file1.csv directory_2"')
os.system('ssh user@Ip_of_worker-3 "cp directory_1/file1.csv directory_2"')
I am looking for a Python package to do that; in general I am trying to avoid a system call every time I want to run a shell command (and I should get stdout and stderr for each command run on the different cluster nodes in the running Python script's log).
The shell commands should run in parallel/simultaneously on all target nodes.
Please advise if you are aware of, or have used, any such package.
You could use a library implementing the SSH protocol, for example paramiko, if you are not happy with os.system or subprocess: http://docs.paramiko.org/en/2.1/
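A minimal sketch of how that could look, reusing the worker IPs and username from the question; it assumes the host keys are already in known_hosts (which passwordless SSH implies) and runs the hosts one after another:
import paramiko

hosts = ['Ip_of_worker-1', 'Ip_of_worker-2', 'Ip_of_worker-3']  # placeholder IPs from the question
cmd = 'cp directory_1/file1.csv directory_2'

for host in hosts:
    client = paramiko.SSHClient()
    client.load_system_host_keys()
    client.connect(host, username='user')            # key-based auth is picked up automatically
    stdin, stdout, stderr = client.exec_command(cmd)
    print host, stdout.read(), stderr.read()         # both streams are available per host
    client.close()
To run the hosts concurrently, this can be combined with threads, as in the threading answer further down.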
Have a look at pdsh and call it from Python:
https://linux.die.net/man/1/pdsh
Example
http://www.linux-magazine.com/Issues/2014/166/Parallel-Shells
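A rough sketch of driving pdsh from Python with subprocess; the host list is a placeholder and it assumes pdsh is installed on the machine running the script:
import subprocess

# pdsh fans the command out to all hosts in parallel and prefixes each output line with the host name
p = subprocess.Popen(
    ['pdsh', '-w', 'Ip_of_worker-1,Ip_of_worker-2,Ip_of_worker-3',
     'cp directory_1/file1.csv directory_2'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()
print out
print err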
It sounds like you want Fabric - http://www.fabfile.org/
From their basic example:
from fabric.api import run

def host_type():
    run('uname -s')
Gets you:
$ fab -H localhost,linuxbox host_type
[localhost] run: uname -s
[localhost] out: Darwin
[linuxbox] run: uname -s
[linuxbox] out: Linux
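If the commands also need to run on all hosts at the same time rather than serially, Fabric 1.x can parallelize tasks, either with the -P command-line flag or the @parallel decorator; a small sketch based on the same example (behaviour assumed from the Fabric 1.x docs):
from fabric.api import run, parallel

@parallel
def host_type():
    run('uname -s')   # fab -H localhost,linuxbox host_type now runs on all hosts concurrently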
You can do something like this:
#!/usr/bin/python
import thread
import subprocess

# Define a function for the thread
def run_remote(host, delay):
    remote_cmd = 'cp directory_1/file1.csv directory_2'
    ssh = subprocess.Popen(['ssh', '-oStrictHostKeyChecking=no', host, remote_cmd],
                           shell=False,
                           stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
    result = ssh.stdout.readlines()
    if result == []:
        error = ssh.stderr.readlines()
        print "ERROR: %s" % error
    else:
        print result

# Create three threads as follows
try:
    thread.start_new_thread(run_remote, ("Ip_of_worker-1", 1, ))
    thread.start_new_thread(run_remote, ("Ip_of_worker-2", 1, ))
    thread.start_new_thread(run_remote, ("Ip_of_worker-3", 1, ))
except:
    print "Error: unable to start thread"
parallel-ssh is a non-blocking parallel ssh client that can do this:
from pssh.pssh2_client import ParallelSSHClient
client = ParallelSSHClient(['host1', 'host2'])
output = client.run_command('cp directory_1/file1.csv directory_2')
client.join(output)
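To get the per-host stdout and stderr into your script's log, the result can be iterated afterwards; a sketch assuming the parallel-ssh 1.x API imported above, where run_command returns a dict keyed by host:
for host, host_output in output.items():
    for line in host_output.stdout:
        print host, line
    print host, 'exit code:', host_output.exit_code   # populated after client.join(output)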
Related
I saw some useful information in this post about how you can't expect to run a process in the background if you are retrieving output from it using subprocess. The problem is ... this is exactly what I want to do!
I have a script which drops commands to various hosts via ssh and I don't want to have to wait on each one to finish before starting the next. Ideally, I could have something like this:
for host in hostnames:
    p[host] = Popen(["ssh", mycommand], stdout=PIPE, stderr=PIPE)
    pout[host], perr[host] = p[host].communicate()
which would have (in the case where mycommand takes a very long time) all of the hosts running mycommand at the same time. As it is now, it appears that the entirety of the ssh command finishes before starting the next. This is (according to the previous post I linked) due to the fact that I am capturing output, right? Other than just cat-ing the output to a file and reading the output later, is there a decent way to make these things happen on various hosts in parallel?
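For what it's worth, one way around this without extra libraries is to start every Popen first and only call communicate() afterwards; a minimal sketch, reusing hostnames and mycommand from the snippet above:
from subprocess import Popen, PIPE

p, pout, perr = {}, {}, {}
for host in hostnames:
    # start all remote commands first, without waiting on any of them
    p[host] = Popen(["ssh", host, mycommand], stdout=PIPE, stderr=PIPE)
for host in hostnames:
    # only now block on each one; the processes have been running concurrently meanwhile
    pout[host], perr[host] = p[host].communicate()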
You may want to use fabric for this.
Fabric is a Python (2.5-2.7) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.
Example file:
from fabric.api import run, env

def do_mycommand():
    mycommand = "ls"  # change to your command
    output = run(mycommand)
    print "Output of %s on %s: %s" % (mycommand, env.host_string, output)
Now to execute on all hosts (host1,host2 ... is where all hosts go):
fab -H host1,host2 ... do_mycommand
You could use threads for achieving parallelism and a Queue for retrieving results in a thread-safe way:
import subprocess
import threading
import Queue

def run_remote_async(host, command, result_queue, identifier=None):
    if isinstance(command, str):
        command = [command]
    if identifier is None:
        identifier = "{}: '{}'".format(host, ' '.join(command))

    def worker(worker_command_list, worker_identifier):
        p = subprocess.Popen(worker_command_list,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        result_queue.put((worker_identifier, ) + p.communicate())

    t = threading.Thread(target=worker,
                         args=(['ssh', host] + command, identifier),
                         name=identifier)
    t.daemon = True
    t.start()
    return t
Then, a possible test case could look like this:
def test():
    data = [('host1', ['ls', '-la']),
            ('host2', 'whoami'),
            ('host3', ['echo', '"Foobar"'])]
    q = Queue.Queue()
    for host, command in data:
        run_remote_async(host, command, q)
    for i in range(len(data)):
        identifier, stdout, stderr = q.get()
        print identifier
        print stdout
Queue.get() is blocking, so at this point you can collect the results one after another as the tasks complete.
I'm making a custom shell in Python for a very limited user on a server, who is logged in via ssh with public key authentication. They need to be able to run ls, find -type d, and cat in specific directories with certain limitations. This works fine if you run something like ssh user@server -i keyfile, because you see the interactive prompt and can run those commands. However, something like ssh user@server -i keyfile "ls /var/log" doesn't: ssh simply hangs with no response. By using the -v switch I've found that the connection is succeeding, so the problem is in my shell. I'm also fairly certain that the script isn't even being started, since a print sys.argv at the beginning of the program does nothing. Here's the code:
#!/usr/bin/env python
import subprocess
import re
import os

with open(os.devnull, 'w') as devnull:
    proc = lambda x: subprocess.Popen(x, stdout=subprocess.PIPE, stderr=devnull)
    while True:
        try:
            s = raw_input('> ')
        except:
            break
        try:
            cmd = re.split(r'\s+', s)
            if len(cmd) != 2:
                print 'Not permitted.'
                continue
            if cmd[0].lower() == 'l':
                # Snip: verify directory
                cmd = proc(['ls', cmd[1]])
                print cmd.stdout.read()
            elif cmd[0].lower() == 'r':
                # Snip: verify directory
                cmd = proc(['cat', cmd[1]])
                print cmd.stdout.read()
            elif cmd[0].lower() == 'll':
                # Snip: verify directory
                cmd = proc(['find', cmd[1], '-type', 'd'])
                print cmd.stdout.read()
            else:
                print 'Not permitted.'
        except OSError:
            print 'Unknown error.'
And here's the relevant line from ~/.ssh/authorized_keys:
command="/path/to/shell $SSH_ORIGINAL_COMMAND" ssh-rsa [base-64-encoded-key] user#host
How can I make the shell work when the command is passed on the command line, so it can be used in scripts without starting an interactive shell?
The problem with ssh not responding is related to the fact that ssh user@host cmd does not open a terminal for the command being run. Try calling ssh -t user@host cmd.
However, even if you pass the -t option, you'd still have another problem with your script: it only works interactively and totally ignores the $SSH_ORIGINAL_COMMAND being passed. A naive solution would be to check sys.argv and, if its length is greater than 1, skip the interactive loop and execute only the command you were given.
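A rough sketch of that sys.argv check; handle_command and run_interactive_loop are hypothetical wrappers around the existing dispatch code and the raw_input loop:
import sys

if len(sys.argv) > 1:
    # non-interactive: ssh user@server -i keyfile "ls /var/log" lands here via $SSH_ORIGINAL_COMMAND
    handle_command(' '.join(sys.argv[1:]))
else:
    # interactive session: keep the existing raw_input('> ') loop
    run_interactive_loop()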
I always use fabric to deploy my processes from my local pc to remote servers.
If I have a python script like this:
test.py:
import time

while True:
    print "Hello world."
    time.sleep(1)
Obviously, this is a continuously running script.
I deploy this script to the remote server and execute it from my fabric script like this:
...
sudo("python test.py")
Fabric will always wait for test.py to return and won't exit. How can I make the fabric script stop at once and ignore the return of test.py?
Celery is usually preferred for this kind of asynchronous task processing.
This explains in detail the use of Celery and Fabric together.
from fabric.api import hosts, env, execute, run
from celery import task

env.skip_bad_hosts = True
env.warn_only = True

@task()
def my_celery_task(testhost):
    host_string = "%s@%s" % (testhost.SSH_user_name, testhost.IP)

    @hosts(host_string)
    def my_fab_task():
        env.password = testhost.SSH_password
        run("ls")

    try:
        result = execute(my_fab_task)
        if isinstance(result.get(host_string, None), BaseException):
            raise result.get(host_string)
    except Exception as e:
        print "my_celery_task -- %s" % e.message
sudo("python test.py 2>/dev/null >/dev/null &")
or redirect the output to some other file instead of /dev/null
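Fabric's FAQ also recommends nohup together with pty=False when leaving a process running in the background, so the remote process is not killed when the SSH session closes; a hedged variant of the call above:
# pty=False plus nohup lets test.py keep running after fabric disconnects
sudo("nohup python test.py > /dev/null 2>&1 &", pty=False)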
This code worked for me:
fabricObj.execute("(nohup python your_file.py > /dev/null < /dev/null &)&")
Where fabricObj is an object of a fabric class (defined internally) which speaks to the fabric code.
Passing a Fabric env.hosts string as a variable does not work in a function.
demo.py
#!/usr/bin/env python
from fabric.api import env, run
def deploy(hosts, command):
    print hosts
    env.hosts = hosts
    run(command)
main.py
#!/usr/bin/env python
from demo import deploy
hosts = ['localhost']
command = 'hostname'
deploy(hosts, command)
python main.py
['localhost']
No hosts found. Please specify (single) host string for connection:
But env.host_string works!
demo.py
#!/usr/bin/env python
from fabric.api import env, run
def deploy(host, command):
    print host
    env.host_string = host
    run(command)
main.py
#!/usr/bin/env python
from demo import deploy
host = 'localhost'
command = 'hostname'
deploy(host, command)
python main.py
localhost
[localhost] run: hostname
[localhost] out: heydevops-workspace
But env.host_string is not enough for us; it's a single host.
Maybe we could use env.host_string within a loop, but that's not good, because we also want to set the number of concurrent tasks and run them in parallel.
Now in ddep (my deployment engine), I only use MySQLdb to get the parameters and then execute the fab command like:
os.system("fab -f service/%s.py -H %s -P -z %s %s" % (project,host,number,task))
This is a simple way but not a good one, because if I use the fab command, I can't catch the exceptions and failed results in Python, so ddep can't "retry" the failed hosts.
If I use "from demo import deploy", I can control and collect them with some code in Python.
So now env.hosts is the trouble. Can somebody give me a solution?
Thanks a lot.
Here's my insight.
According to the docs, if you're calling Fabric tasks from Python scripts, you should use fabric.tasks.execute.
It should be something like this:
demo.py
from fabric.api import run
from fabric.tasks import execute
def deploy(hosts, command):
    execute(execute_deploy, command=command, hosts=hosts)

def execute_deploy(command):
    run(command)
main.py
from demo import deploy
hosts = ['localhost']
command = 'hostname'
deploy(hosts, command)
Then, just run python main.py. Hope that helps.
Finally, I fixed this problem by using execute() and exec.
main.py
#!/usr/bin/env python
from demo import FabricSupport
hosts = ['localhost']
myfab = FabricSupport()
myfab.execute("df",hosts)
demo.py
#!/usr/bin/env python
from fabric.api import env, run, execute
class FabricSupport:
    def __init__(self):
        pass

    def hostname(self):
        run("hostname")

    def df(self):
        run("df -h")

    def execute(self, task, hosts):
        get_task = "task = self.%s" % task
        exec get_task
        execute(task, hosts=hosts)
python main.py
[localhost] Executing task 'hostname'
[localhost] run: hostname
[localhost] out: heydevops-workspace
I'm running a script from crontab that will just ssh and run a command and store the results in a file.
The function that seems to be failing is subprocess.Popen.
Here is the python function:
def _executeSSHCommand(sshcommand, user, node):
    '''
    Simple function to execute an ssh command on a remote node.
    '''
    sshunixcmd = '/usr/bin/ssh %s@%s \'%s\'' % (user, node, sshcommand)
    process = subprocess.Popen([sshunixcmd],
                               shell=True,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    process.wait()
    result = process.stdout.readlines()
    return result
When it's run from the command line it executes correctly; from cron it seems to fail with the error message below.
Here are the crontab entries:
02 * * * * /home/matt/scripts/check-diskspace.py >> /home/matt/logs/disklog.log
Here are the errors:
Sep 23 17:02:01 timmy CRON[13387]: (matt) CMD (/home/matt/scripts/check-diskspace.py >> /home/matt/logs/disklog.log)
Sep 23 17:02:01 timmy CRON[13386]: (CRON) error (grandchild #13387 failed with exit status 2)
I'm going blind trying to find exactly where I have gone so wrong. Any ideas?
The cron PATH is very limited. You should either use the absolute path to ssh (/usr/bin/ssh) or set PATH as the first line of your crontab.
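For example, setting PATH at the top of the crontab (the value below is illustrative; adjust it to your system) while keeping the existing entry unchanged:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
02 * * * * /home/matt/scripts/check-diskspace.py >> /home/matt/logs/disklog.log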
You probably need to pass ssh the -i argument to tell ssh to use a specific key file. The problem is that your environment is not set up to tell ssh which key to use.
The fact that you're using python here is a bit of a red herring.
For everything ssh-related in python, you might consider using paramiko. Using it, the following code should do what you want.
import paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()
client.connect(node, username=user)
# exec_command returns (stdin, stdout, stderr); we want the stdout stream
stdout = client.exec_command(ssh_command)[1]
return stdout.readlines()
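If, as the other answer suggests, a specific key is needed (cron jobs don't see your interactive ssh-agent), the connect call can reference it explicitly; the key path below is a placeholder:
client.connect(node, username=user, key_filename='/home/matt/.ssh/id_rsa')  # placeholder key path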
When running python scripts from cron, the environment PATH can be a hangup, as user1652558 points out.
To expand on this answer with example code to add custom PATH values to the environment for a subprocess call:
import os
import subprocess
#whatever user PATH values you need
my_path = "/some/custom/path1:/some/custom/path2"
#append the custom values to the current PATH settings
my_env = os.environ.copy()
my_env["PATH"] = my_path + ":" + my_env["PATH"]
#subprocess call
resp = subprocess.check_output([cmd], env=my_env, shell=True)