I have the following code where I am running a shell script using subprocess inside a Celery task. It's not working: I don't get an error, any forward progress, or any output from the Celery task.
The following is the code to execute the task:
def run_shell_command(command_line):
    command_line_args = shlex.split(command_line)
    logging.info('Subprocess: "' + command_line + '"')
    try:
        command_line_process = subprocess.Popen(
            command_line_args,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        for line in iter(command_line_process.stdout.readline, b''):
            print line.strip()
        command_line_process.communicate()
        command_line_process.wait()
    except (OSError, subprocess.CalledProcessError) as exception:
        logging.info('Exception occurred: ' + str(exception))
        logging.info('Subprocess failed')
        return False
    else:
        # no exception was raised
        logging.info('Subprocess finished')
        return True
It's called from within a task:
@app.task
def execute(jsonConfig, projectName, tagName, stage, description):
    command = 'python ' + runScript + ' -c ' + fileName
    run_shell_command(command)
Here the Python "runScript" is itself calling subprocesses and executes a long-running task. What could be the problem?
The logging level has been set to INFO:
logging.basicConfig(filename='celery-execution.log', level=logging.INFO)
The Celery worker is started as follows:
celery -A celery_worker worker --loglevel=info
I can see the subprocess being started:
[2016-05-03 01:08:55,126: INFO/Worker-2] Subprocess: "python runScript.py -c data/confs/Demo-demo14-1.conf"
I can also see the subprocess running in the background using ps -ef. However, this is a compute/memory-intensive workload, and it does not seem to be using any CPU or memory, which makes me believe that nothing is really happening and it's stuck.
What is the most pythonic syntax for getting subprocess to successfully manage the running of the following CLI command, which can take a long time to complete?
CLI Command:
The CLI command that subprocess must run is:
az resource invoke-action --resource-group someRG --resource-type Microsoft.VirtualMachineImages/imageTemplates -n somename78686786976 --action Run
The CLI command runs for a long time, for example 11 minutes in this case, but possibly longer at other times.
When run manually from the terminal, it prints the following while the command is waiting to hear back that it has succeeded:
\ Running
The \ spins around while the command runs when the command is manually typed in the terminal.
The response that is eventually given back when the command finally succeeds is the following JSON:
{
    "endTime": "2022-06-23T02:54:02.6811671Z",
    "name": "long-alpha-numerica-string-id",
    "startTime": "2022-06-23T02:43:39.2933333Z",
    "status": "Succeeded"
}
CURRENT PYTHON CODE:
The current python code we are using to run the above command from within a python program is as follows:
def getJsonResponse(self, cmd, counter=0):
    process = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE, text=True)
    data = process.stdout
    err = process.stderr
    logString = "data string is: " + data
    print(logString)
    logString = "err is: " + str(err)
    print(logString)
    logString = "process.returncode is: " + str(process.returncode)
    print(logString)
    if process.returncode == 0:
        print(str(data))
        return data
    else:
        if counter < 11:
            counter += 1
            logString = "Attempt " + str(counter) + " out of 10. "
            print(logString)
            import time
            time.sleep(30)
            data = self.getJsonResponse(cmd, counter)
            return data
        else:
            logString = "Error: " + str(err)
            print(logString)
            logString = "Error: Return Code is: " + str(process.returncode)
            print(logString)
            logString = "ERROR: Failed to return Json response. Halting the program so that you can debug the cause of the problem."
            quit(logString)
            sys.exit(1)
CURRENT PROBLEM:
The problem we are getting with the above is that our current Python code reports a process.returncode of 1 and then recursively calls the function again and again while the CLI command is still running, instead of simply reporting that the CLI command is still running.
Also, our current recursive approach does not take into account what has actually happened since the CLI command was first called; it just blindly retries up to 10 times over roughly 5 minutes, when the actual process might take 10 to 20 minutes to complete.
What is the most pythonic way to rewrite the above code so that it gracefully reports that the CLI command is running for however long it takes to complete, and then returns the JSON given above when the command finally completes?
I'm not sure if my code is pythonic, but I think it's better to run it with Popen.
I can't test the CLI command you need to execute, so I replaced it with the netstat command, which takes a long time to respond.
import subprocess
import time

def getJsonResponse(cmd):
    process = subprocess.Popen(
        cmd,
        encoding='utf-8',
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    while True:
        returncode = process.poll()
        if returncode is None:
            # The process is still running; report on it here.
            # You can print a status message every time some interval elapses, as needed.
            # print("running process")
            time.sleep(0.01)
            data = process.stdout
            if data:
                # If there is any output, handle it here.
                # Use readline() or readlines() as appropriate, depending on how the process responds.
                msg_line = data.readline()
                print(msg_line)
            err = process.stderr
            if err:
                # If there is any error output, handle it here.
                msg_line = err.readline()
                print(msg_line)
        else:
            print(returncode)
            break
    # Handle whatever needs to happen after the process ends.
    print("terminate process")

getJsonResponse(cmd=['netstat', '-a'])
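If you ultimately need the JSON document that the CLI prints at the end, one option is to collect the stdout lines as they arrive and parse them once the process exits. Here is a minimal sketch of that idea, assuming cmd is a list of arguments and that the command prints only the JSON document on stdout (both assumptions are mine, not part of the original answer):

import json
import subprocess

def run_and_parse_json(cmd):
    # Stream stdout while the command runs, then parse the accumulated text as JSON.
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, encoding='utf-8')
    lines = []
    for line in process.stdout:  # yields output line by line as it arrives
        print("still running:", line.rstrip())
        lines.append(line)
    process.wait()  # reap the process once stdout is closed
    if process.returncode != 0:
        raise RuntimeError("command failed with return code %s" % process.returncode)
    return json.loads(''.join(lines))  # e.g. the {"status": "Succeeded", ...} document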
I have a strange issue here - I have an application that I'm attempting to launch from Python, but all attempts to launch it from within a .py script fail without any discernible output. I'm testing from within the VSCode debugger. Here are some additional oddities:
When I swap notepad.exe into the .py instead of my target application's path, notepad launches OK.
When I run the script line by line from the CLI (start by launching python, then type out the next 4-5 lines of Python), the script works as expected.
Examples:
#This works in the .py, and from the CLI
import subprocess
cmd = ['C:\\Windows\\system32\\notepad.exe', 'C:\\temp\\myfiles\\test_24.xml']
pipe = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
pipe.wait()
print(pipe)
#This fails in the .py, but works ok when pasted in line by line from the CLI
import subprocess
cmd = ['C:\\temp\\temp_app\\target_application.exe', 'C:\\temp\\myfiles\\test_24.xml']
pipe = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
pipe.wait()
print(pipe)
The result is no output when running the .py
I've tried several other variants, including the following:
import subprocess
tup = 'C:\\temp\\temp_app\\target_application.exe C:\temp\test\test_24.xml'
proc = subprocess.Popen(tup)
proc.wait()
(stdout, stderr) = proc.communicate()
print(stdout)
if proc.returncode != 0:
    print("The error is: " + str(stderr))
else:
    print("Executed: " + str(tup))
Result:
None
The error is: None
1.082381010055542
Now this method indicates there is an error, because we are returning something other than 0 and printing "The error is: None", and that is because stderr is "None". So - is it throwing an error without giving an error?
stdout is also reporting "None".
So, let's try check_call and see what happens:
print("Trying check_call")
try:
    subprocess.check_call('C:\\temp\\temp_app\\target_application.exe C:\\temp\\test\\test_24.xml', shell=True)
except subprocess.CalledProcessError as error:
    print(error)
Results:
Trying check_call
Command 'C:\temp\temp_app\target_application.exe C:\temp\test\test_24.xml' returned non-zero exit status 1.
I've additionally tried subprocess.run, although it is missing the wait procedure I was hoping to use.
import subprocess
tup = 'C:\\temp\\temp_app\\target_application.exe C:\temp\test\test_24.xml'
proc = subprocess.run(tup, check=True)
proc.wait()
(stdout, stderr) = proc.communicate()
print(stdout)
if proc.returncode != 0:
    print("The error is: " + str(stderr))
else:
    print("Executed: " + str(tup))
What reasons might be worth chasing, or what other ways of trying to catch an error might work here? I don't know how to interpret "`" as an error result.
I saw some useful information in this post about how you can't expect to run a process in the background if you are retrieving output from it using subprocess. The problem is ... this is exactly what I want to do!
I have a script which drops commands to various hosts via ssh and I don't want to have to wait on each one to finish before starting the next. Ideally, I could have something like this:
for host in hostnames:
    p[host] = Popen(["ssh", mycommand], stdout=PIPE, stderr=PIPE)
    pout[host], perr[host] = p[host].communicate()
which would have (in the case where mycommand takes a very long time) all of the hosts running mycommand at the same time. As it is now, it appears that the entirety of the ssh command finishes before the next one starts. This is (according to the previous post I linked) due to the fact that I am capturing output, right? Other than just redirecting the output to a file and reading it later, is there a decent way to make these things happen on various hosts in parallel?
You may want to use fabric for this.
Fabric is a Python (2.5-2.7) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.
Example file:
from fabric.api import run, env

def do_mycommand():
    my_command = "ls"  # change to your command
    output = run(my_command)
    print "Output of %s on %s: %s" % (my_command, env.host_string, output)
Now to execute on all hosts (host1,host2 ... is where all hosts go):
fab -H host1,host2 ... do_mycommand
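By default Fabric 1.x runs the task on the listed hosts one after another; if I remember the option correctly, it also has a parallel mode that you can switch on from the command line, for example:

fab -P -H host1,host2 ... do_mycommand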
You could use threads for achieving parallelism and a Queue for retrieving results in a thread-safe way:
import subprocess
import threading
import Queue

def run_remote_async(host, command, result_queue, identifier=None):
    if isinstance(command, str):
        command = [command]
    if identifier is None:
        identifier = "{}: '{}'".format(host, ' '.join(command))

    def worker(worker_command_list, worker_identifier):
        p = subprocess.Popen(worker_command_list,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
        result_queue.put((worker_identifier, ) + p.communicate())

    t = threading.Thread(target=worker,
                         args=(['ssh', host] + command, identifier),
                         name=identifier)
    t.daemon = True
    t.start()
    return t
Then, a possible test case could look like this:
def test():
    data = [('host1', ['ls', '-la']),
            ('host2', 'whoami'),
            ('host3', ['echo', '"Foobar"'])]
    q = Queue.Queue()
    for host, command in data:
        run_remote_async(host, command, q)
    for i in range(len(data)):
        identifier, stdout, stderr = q.get()
        print identifier
        print stdout
Queue.get() is blocking, so at this point you can collect the results one after another as each task completes.
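On Python 3 you could get the same effect with concurrent.futures instead of managing the threads and the queue yourself. A rough sketch (the host names and the uptime command are just placeholders):

import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_remote(host, command):
    # Run one ssh command and return (host, stdout, stderr).
    p = subprocess.run(['ssh', host] + command,
                       stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    return host, p.stdout, p.stderr

hosts = ['host1', 'host2', 'host3']  # placeholder host names
with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
    futures = [pool.submit(run_remote, h, ['uptime']) for h in hosts]
    for future in as_completed(futures):  # results arrive as each ssh finishes
        host, out, err = future.result()
        print(host, out.strip())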
I always use fabric to deploy my processes from my local pc to remote servers.
If I have a python script like this:
test.py:
import time

while True:
    print "Hello world."
    time.sleep(1)
Obviously, this script is a continuously running script.
I deploy this script to the remote server and execute my fabric script like this:
...
sudo("python test.py")
Fabric will always wait for the return of test.py and won't exit. How can I stop the fabric script at once and ignore the return of test.py?
Usually, Celery is preferred for this kind of asynchronous task processing.
This explains in detail the use of Celery and Fabric together.
from fabric.api import hosts, env, execute, run
from celery import task

env.skip_bad_hosts = True
env.warn_only = True

@task()
def my_celery_task(testhost):
    host_string = "%s@%s" % (testhost.SSH_user_name, testhost.IP)

    @hosts(host_string)
    def my_fab_task():
        env.password = testhost.SSH_password
        run("ls")

    try:
        result = execute(my_fab_task)
        if isinstance(result.get(host_string, None), BaseException):
            raise result.get(host_string)
    except Exception as e:
        print "my_celery_task -- %s" % e.message
sudo("python test.py 2>/dev/null >/dev/null &")
or redirect the output to some other file instead of /dev/null
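For example, if you want to keep the script's output around for later inspection, something along these lines should work (the log path is just an example):

sudo("nohup python test.py > /tmp/test.log 2>&1 &")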
This code worked for me:
fabricObj.execute("(nohup python your_file.py > /dev/null < /dev/null &)&")
Where fabricObj is an object of a fabric class (defined internally) which talks to the fabric code.
I have a server that launches command line apps. They receive a local file path, load a file, export something, then close.
It's working, but I would like to be able to keep track of which tasks are active and which completed.
So with this line:
p = mp.Process(target=subprocess.Popen(mayapy + ' -u ' + job.pyFile), group=None)
I have tried 'is_alive', and it always returns False.
The subprocess closes (I can see that it has closed in Task Manager), but the process and pid still seem queryable.
Your use of mp.Process is wrong. The target should be a function, not the return value of subprocess.Popen(...).
In any case, if you define:
proc = subprocess.Popen(mayapy + ' -u ' + job.pyFile)
Then proc.poll() will be None while the process is working, and will equal a return value (not None) when the process has terminated.
For example (the output is in the comments):
import subprocess
import shlex
import time
PIPE = subprocess.PIPE
proc = subprocess.Popen(shlex.split('ls -lR /'), stdout=PIPE)
time.sleep(1)
print(proc.poll())
# None
proc.terminate()
time.sleep(1)
print(proc.poll())
# -15
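If the goal is to keep track of which launched apps are still active and which have completed, one simple approach is to keep the Popen objects in a dict and poll them. A rough sketch, with placeholder child processes instead of the real mayapy command:

import subprocess
import sys
import time

jobs = {}  # job name -> Popen object

def launch(name, args):
    jobs[name] = subprocess.Popen(args)

def report():
    for name, proc in jobs.items():
        status = proc.poll()  # None while running, return code once finished
        if status is None:
            print(name, "is still running")
        else:
            print(name, "finished with return code", status)

# Placeholder children that just sleep for a few seconds.
launch("job1", [sys.executable, "-c", "import time; time.sleep(2)"])
launch("job2", [sys.executable, "-c", "import time; time.sleep(5)"])
time.sleep(3)
report()  # job1 should be finished by now, job2 still running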