Pythonic way to handle long-running CLI commands with subprocess?

What is the most pythonic way to have subprocess manage the running of the following CLI command, which can take a long time to complete?
CLI Command:
The CLI command that subprocess must run is:
az resource invoke-action --resource-group someRG --resource-type Microsoft.VirtualMachineImages/imageTemplates -n somename78686786976 --action Run
The CLI command runs for a long time, for example 11 minutes in this case, but possibly longer at other times.
When run manually from the terminal, the terminal prints the following while the command waits to hear back that it has succeeded:
\ Running
The \ character spins while the command runs.
The response that is eventually given back when the command finally succeeds is the following JSON:
{
  "endTime": "2022-06-23T02:54:02.6811671Z",
  "name": "long-alpha-numerica-string-id",
  "startTime": "2022-06-23T02:43:39.2933333Z",
  "status": "Succeeded"
}
CURRENT PYTHON CODE:
The current python code we are using to run the above command from within a python program is as follows:
def getJsonResponse(self, cmd, counter=0):
    process = subprocess.run(cmd, shell=True, stdout=subprocess.PIPE, text=True)
    data = process.stdout
    err = process.stderr
    logString = "data string is: " + data
    print(logString)
    logString = "err is: " + str(err)
    print(logString)
    logString = "process.returncode is: " + str(process.returncode)
    print(logString)
    if process.returncode == 0:
        print(str(data))
        return data
    else:
        if counter < 11:
            counter += 1
            logString = "Attempt " + str(counter) + " out of 10. "
            print(logString)
            import time
            time.sleep(30)
            data = self.getJsonResponse(cmd, counter)
            return data
        else:
            logString = "Error: " + str(err)
            print(logString)
            logString = "Error: Return Code is: " + str(process.returncode)
            print(logString)
            logString = "ERROR: Failed to return Json response. Halting the program so that you can debug the cause of the problem."
            quit(logString)
            sys.exit(1)
CURRENT PROBLEM:
The problem is that the code above reports a process.returncode of 1 and then recursively calls the function again and again while the CLI command is still running, instead of simply reporting that the CLI command is still running.
Our recursive approach also takes no account of what has actually happened since the CLI command was first called; it just blindly repeats up to 10 times over roughly 5 minutes, when the actual process might take 10 to 20 minutes to complete.
What is the most pythonic way to rewrite the above code so that it gracefully reports that the CLI command is running for however long it takes to complete, and then returns the JSON given above when the command finally finishes?

I'm not sure if my code is pythonic, but I think it's better to run it with Popen.
I can't test the CLI command you need to execute, so I replaced it with the netstat command, which also takes a long time to respond.
import subprocess
import time

def getJsonResponse(cmd):
    process = subprocess.Popen(
        cmd,
        encoding='utf-8',
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    while True:
        returncode = process.poll()
        if returncode is None:
            # The process is still running; you can report progress here,
            # e.g. every time some interval has elapsed.
            # print("running process")
            time.sleep(0.01)
            data = process.stdout
            if data:
                # Handle any response here. Use readline() or readlines()
                # as appropriate for how the process produces output.
                msg_line = data.readline()
                print(msg_line)
            err = process.stderr
            if err:
                # Handle any error output here. Note that err is None
                # unless stderr=subprocess.PIPE is also passed to Popen.
                msg_line = err.readline()
                print(msg_line)
        else:
            print(returncode)
            break
    # Handle whatever should happen after the process ends.
    print("terminate process")

getJsonResponse(cmd=['netstat', '-a'])
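
For the original az command, the same polling idea might look like the sketch below; json.loads parses the response shown in the question, and overall_timeout and poll_interval are illustrative values invented here, not anything the az CLI requires. Because az prints only a small JSON document, it is safe to poll before draining the pipe; a chattier command would need its stdout read while polling.
import json
import subprocess
import time

def run_az_action(cmd, overall_timeout=1800, poll_interval=15):
    # Run a long CLI command, report progress while it runs,
    # and parse the JSON it prints on success.
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE, text=True)
    start = time.time()
    while process.poll() is None:
        elapsed = int(time.time() - start)
        print("Still running after %d seconds ..." % elapsed)
        if elapsed > overall_timeout:
            process.kill()
            process.communicate()  # reap the child after killing it
            raise TimeoutError("No result within %d seconds" % overall_timeout)
        time.sleep(poll_interval)
    stdout, stderr = process.communicate()  # drain the remaining output
    if process.returncode != 0:
        raise RuntimeError("Command failed (%d): %s" % (process.returncode, stderr))
    return json.loads(stdout)  # e.g. {"status": "Succeeded", ...}

result = run_az_action(["az", "resource", "invoke-action",
                        "--resource-group", "someRG",
                        "--resource-type", "Microsoft.VirtualMachineImages/imageTemplates",
                        "-n", "somename78686786976",
                        "--action", "Run"])
print(result["status"])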

Related

subprocess.PIPE prevents executable from closing

I use the following script to call an executable file with a number of inputs:
import subprocess, time

CREATE_NO_WINDOW = 0x08000000
my_proc = subprocess.Popen("myApp.exe " + ' '.join([str(input1), str(input2), str(input3)]),
                           startupinfo=subprocess.STARTUPINFO(), stdout=subprocess.PIPE,
                           creationflags=CREATE_NO_WINDOW)
Then I monitor if the application has finished within a given time (300 seconds) and if not I just kill it. I also read the output of the application to know whether it failed in doing the required tasks.
proc_wait_time = 300
start_time = time.time()
sol_status = 'Fail'
while time.time() - start_time < proc_wait_time:
    if my_proc.poll() is None:
        time.sleep(1)
    else:
        try:
            sol_status = my_proc.stdout.read().replace('\r\n \r\n','')
            break
        except:
            sol_status = 'Fail'
            break
else:
    try: my_proc.kill()
    except: None
    sol_status = 'Frozen'
if sol_status in ['Fail', 'Frozen']:
    print('Failed running my_proc')
As you can see from the code, I need to wait for myApp.exe to finish; however, sometimes myApp.exe freezes. Since the script above is part of a loop, I need to identify such a situation (by a timer), keep track of it, and kill myApp.exe so that the whole script doesn't get stuck!
Now, the issue is that if I use subprocess.PIPE (which I suppose I have to if I want to read the output of the application), then myApp.exe doesn't close after finishing, and consequently my_proc.poll() is None is always True.
I am using Python 2.7.
There is a pipe buffer limit: when a process writes a large amount of data to subprocess.PIPE and nothing reads it, the buffer fills up and the process blocks on the write, so it never exits. The easiest way to fix it is to pipe the data directly into a file:
_stdoutHandler = open('C:/somePath/stdout.log', 'w')
_stderrHandler = open('C:/somePath/stderr.log', 'w')
my_proc = subprocess.Popen(
    "myApp.exe " + ' '.join([str(input1), str(input2), str(input3)]),
    stdout=_stdoutHandler,
    stderr=_stderrHandler,
    startupinfo=subprocess.STARTUPINFO(),
    creationflags=CREATE_NO_WINDOW
)
...
_stdoutHandler.close()
_stderrHandler.close()
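
If you do want the output in memory rather than in a log file, another option (a sketch for the question's Python 2.7 setup, reusing input1/input2/input3 and the 300-second limit from the question) is to drain the pipe from a background thread, so the child can never block on a full pipe buffer:
import subprocess
import threading
import time

CREATE_NO_WINDOW = 0x08000000

def drain(pipe, chunks):
    # Read until EOF so the child never blocks on a full pipe buffer.
    for line in iter(pipe.readline, ''):
        chunks.append(line)
    pipe.close()

my_proc = subprocess.Popen("myApp.exe " + ' '.join([str(input1), str(input2), str(input3)]),
                           startupinfo=subprocess.STARTUPINFO(), stdout=subprocess.PIPE,
                           creationflags=CREATE_NO_WINDOW)
out_chunks = []
reader = threading.Thread(target=drain, args=(my_proc.stdout, out_chunks))
reader.start()

start_time = time.time()
while time.time() - start_time < 300:
    if my_proc.poll() is not None:
        break
    time.sleep(1)
else:
    my_proc.kill()  # timed out: kill the frozen process

reader.join()
sol_status = ''.join(out_chunks)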

Python script checking if a particular Linux command is still running

I want to write a Python script that checks every minute whether some pre-defined process is still running on a Linux machine, and if it isn't, prints a timestamp showing when it crashed. I have written a script that does exactly that, but unfortunately it works correctly with only one process.
This is my code:
import subprocess
import shlex
import time
from datetime import datetime

proc_def = "top"
grep_cmd = "pgrep -a " + proc_def
try:
    proc_run = subprocess.check_output(shlex.split(grep_cmd)).decode('utf-8')
    proc_run = proc_run.strip().split('\n')
    '''
    Creating a dictionary with key the PID of the process and value
    the command line
    '''
    proc_dict = dict(zip([i.split(' ', 1)[0] for i in proc_run],
                         [i.split(' ', 1)[1] for i in proc_run]))
    check_run = "ps -o pid= -p "
    for key, value in proc_dict.items():
        check_run_cmd = check_run + key
        try:
            # While the output of check_run_cmd isn't an empty line, keep polling
            while subprocess.check_output(
                    shlex.split(check_run_cmd)
                    ).decode('utf-8').strip():
                # This print statement is for debugging purposes only
                print("Running")
                time.sleep(3)
        # If check_run_cmd returns an error, it shows us the time and date
        # of the crash as well as the PID and the command line
        except subprocess.CalledProcessError as e:
            print(f"PID: {key} of command: \"{value}\" stopped at {datetime.now().strftime('%d-%m-%Y %T')}")
            exit(1)
# Check if proc_def is actually running on the machine
except subprocess.CalledProcessError as e:
    print(f"The \"{proc_def}\" command isn't running on this machine")
For example, if there are two top processes, it will show information about the crash time of only one of them and then exit. I want it to stay active as long as any process is still running, exit only when all of them have been killed, and print information as each process crashes.
It should also not be limited to two processes, but support any number of processes started with the same proc_def command.
You have to change the logic a bit: basically you want an infinite loop that alternates checks across all the processes, rather than checking the same one over and over:
import subprocess
import shlex
import time
from datetime import datetime

proc_def = "top"
grep_cmd = "pgrep -a " + proc_def
try:
    proc_run = subprocess.check_output(shlex.split(grep_cmd)).decode('utf-8')
    proc_run = proc_run.strip().split('\n')
    '''
    Creating a dictionary with key the PID of the process and value
    the command line
    '''
    proc_dict = dict(zip([i.split(' ', 1)[0] for i in proc_run],
                         [i.split(' ', 1)[1] for i in proc_run]))
    check_run = "ps -o pid= -p "
    while proc_dict:
        for key, value in proc_dict.items():
            check_run_cmd = check_run + key
            try:
                # ps exits non-zero once the PID is gone, raising CalledProcessError
                subprocess.check_output(shlex.split(check_run_cmd)).decode('utf-8').strip()
                # This print statement is for debugging purposes only
                print("Running")
                time.sleep(3)
            except subprocess.CalledProcessError as e:
                print(f"PID: {key} of command: \"{value}\" stopped at {datetime.now().strftime('%d-%m-%Y %T')}")
                del proc_dict[key]
                break
# Check if proc_def is actually running on the machine at all
except subprocess.CalledProcessError as e:
    print(f"The \"{proc_def}\" command isn't running on this machine")
This suffers from the same problems as the original code: the time resolution is 3 seconds, and a process started while this script runs won't be picked up (though that may be desired).
The first problem can be fixed by sleeping for less time, depending on what you need; the second by re-running the initial lines that create proc_dict inside the while loop.
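
A sketch combining both fixes (a 1-second sleep and re-running the pgrep step on every iteration; pids_of is a helper name invented here, and it assumes pgrep -a always prints "pid command" pairs):
import shlex
import subprocess
import time
from datetime import datetime

proc_def = "top"

def pids_of(name):
    # Return {pid: command line} for every matching process, or {} if none.
    try:
        out = subprocess.check_output(shlex.split("pgrep -a " + name)).decode('utf-8')
    except subprocess.CalledProcessError:
        return {}
    return dict(line.strip().split(' ', 1) for line in out.strip().split('\n'))

known = pids_of(proc_def)
while known:
    time.sleep(1)
    current = pids_of(proc_def)
    for pid in set(known) - set(current):
        print(f"PID: {pid} of command: \"{known[pid]}\" stopped "
              f"at {datetime.now().strftime('%d-%m-%Y %T')}")
    known = current  # also picks up processes started after the script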

How can I start a process and put it to background in python?

I am currently writing my first Python program (in Python 2.6.6). The program facilitates starting and stopping different applications running on a server, providing the user with common commands (like starting and stopping system services on a Linux server).
I am starting the applications' startup scripts with:
p = subprocess.Popen(startCommand, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = p.communicate()
print(output)
The problem is that the startup script of one application stays in the foreground, so p.communicate() waits forever. I have already tried to use "nohup startCommand &" in front of the startCommand, but that did not work as expected.
As a workaround I now use the following bash script to call the application's start script:
#!/bin/bash

LOGFILE="/opt/scripts/bin/logs/SomeServerApplicationStart.log"
nohup /opt/someDir/startSomeServerApplication.sh >${LOGFILE} 2>&1 &
STARTUPOK=$(tail -1 ${LOGFILE} | grep "Server started in RUNNING mode" | wc -l)
COUNTER=0
while [ $STARTUPOK -ne 1 ] && [ $COUNTER -lt 100 ]; do
    STARTUPOK=$(tail -1 logs/SomeServerApplicationStart.log | grep "Server started in RUNNING mode" | wc -l)
    if (( STARTUPOK )); then
        echo "STARTUP OK"
        exit 0
    fi
    sleep 1
    COUNTER=$(( $COUNTER + 1 ))
done
echo "STARTUP FAILED"
The bash script is called from my Python code. This workaround works perfectly, but I would prefer to do it all in Python...
Is subprocess.Popen the wrong way? How could I accomplish my task in Python only?
First, it is easy not to block the Python script in communicate... by not calling communicate! Just read from the output or error output of the command until you find the correct message, and then forget about the command.
# to avoid waiting for an EOF on a pipe ...
def getlines(fd):
    line = bytearray()
    c = None
    while True:
        c = fd.read(1)
        if not c:  # read(1) returns an empty string at EOF, not None
            return
        line += c
        if c == '\n':
            yield str(line)
            del line[:]

p = subprocess.Popen(startCommand, shell=True, stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)  # send stderr to stdout, same as 2>&1 for bash
for line in getlines(p.stdout):
    if "Server started in RUNNING mode" in line:
        print("STARTUP OK")
        break
else:  # end of input without getting the startup message
    print("STARTUP FAILED")

p.poll()  # get the status from the child to avoid a zombie
# other error processing
The problem with the above is that the server is still a child of the Python process and could receive unwanted signals such as SIGHUP. If you want to make it a daemon, you must first start a subprocess that in turn starts your server. That way, when the first child ends, it can be waited on by the caller, and the server gets a PPID of 1 (it is adopted by the init process). You can use the multiprocessing module to ease that part.
Code could be like:
import multiprocessing
import subprocess

# to avoid waiting for an EOF on a pipe ...
def getlines(fd):
    line = bytearray()
    c = None
    while True:
        c = fd.read(1)
        if not c:  # read(1) returns an empty string at EOF, not None
            return
        line += c
        if c == '\n':
            yield str(line)
            del line[:]

def start_child(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                         shell=True)
    for line in getlines(p.stdout):
        print line
        if "Server started in RUNNING mode" in line:
            print "STARTUP OK"
            break
    else:
        print "STARTUP FAILED"

def main():
    # other stuff in program
    p = multiprocessing.Process(target=start_child, args=(server_program,))
    p.start()
    p.join()
    print "DONE"
    # other stuff in program

# protect program startup for multiprocessing module
if __name__ == '__main__':
    main()
One could wonder why the getlines generator is needed, when a file object is itself an iterator that returns one line at a time. The problem is that the file iterator internally calls read, which reads ahead until EOF when the file is not connected to a terminal. As the file here is connected to a PIPE, you would not get anything until the server ends... which is not what is expected.
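
For reference, the standard-library idiom iter(p.stdout.readline, '') behaves the same way as the getlines generator for a text-mode pipe (use b'' as the sentinel for a binary-mode pipe), since readline returns as soon as a full line is available:
for line in iter(p.stdout.readline, ''):
    if "Server started in RUNNING mode" in line:
        print("STARTUP OK")
        break
else:
    print("STARTUP FAILED")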

How to implement a retry mechanism if the shell script execution fails?

I am trying to execute a shell script from Python code, and so far everything is looking good.
Below is my Python script, which will execute a shell script. For the sake of example, it is a simple Hello World shell script.
import json
import subprocess

jsonStr = '{"script":"#!/bin/bash\\necho Hello world 1\\n"}'
j = json.loads(jsonStr)
shell_script = j['script']

print "start"
proc = subprocess.Popen(shell_script, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(stdout, stderr) = proc.communicate()
if stderr:
    print "Shell script gave some error"
    print stderr
else:
    print stdout
    print "end"  # Shell script ran fine.
Now suppose that, for whatever reason, the shell script fails when executed from the Python code; stderr then won't be empty. In that case I want to retry executing the shell script, say after sleeping for a couple of milliseconds.
Is there any way of implementing a retry mechanism for when the shell script execution fails? Can I retry 5 or 6 times, and is that number configurable?
from time import sleep

MAX_TRIES = 6

# ... your other code ...

for i in xrange(MAX_TRIES):
    proc = subprocess.Popen(shell_script, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = proc.communicate()
    if stderr:
        print "Shell script gave some error..."
        print stderr
        sleep(0.05)  # delay for 50 ms
    else:
        print stdout
        print "end"  # Shell script ran fine.
        break
Something like this maybe:
maxRetries = 6
retries = 0
while retries < maxRetries:
    doSomething()
    if errorCondition:
        retries += 1
        continue
    break
How about using a decorator? Seems like a very clear way.
You can read about them here https://wiki.python.org/moin/PythonDecoratorLibrary. (Retry decorator)
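A minimal sketch of such a retry decorator, reusing the question's convention that any output on stderr counts as failure (the names retry, max_tries, delay, and run_script are invented here for illustration):
import subprocess
import time
from functools import wraps

def retry(max_tries=6, delay=0.05):
    # Re-run the decorated function until it reports no error output.
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                stdout, stderr = func(*args, **kwargs)
                if not stderr:
                    return stdout
                print("Attempt %d of %d failed: %s" % (attempt, max_tries, stderr))
                time.sleep(delay)
            raise RuntimeError("Shell script still failing after %d tries" % max_tries)
        return wrapper
    return decorator

@retry(max_tries=6, delay=0.05)
def run_script(shell_script):
    proc = subprocess.Popen(shell_script, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.communicate()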

multiprocessing.Process subprocess.Popen completed?

I have a server that launches command line apps. They receive a local file path, load a file, export something, then close.
It's working, but I would like to be able to keep track of which tasks are active and which completed.
So with this line:
p = mp.Process(target=subprocess.Popen(mayapy + ' -u ' + job.pyFile), group=None)
I have tried 'is_alive', and it always returns False.
The subprocess closes, I see it closed in task manager, but the process and pid still seem queryable.
Your use of mp.Process is wrong. The target should be a function, not the return value of subprocess.Popen(...).
In any case, if you define:
proc = subprocess.Popen(mayapy + ' -u ' + job.pyFile)
Then proc.poll() will be None while the process is working, and will equal a return value (not None) when the process has terminated.
For example (the output is in the comments):
import subprocess
import shlex
import time
PIPE = subprocess.PIPE
proc = subprocess.Popen(shlex.split('ls -lR /'), stdout=PIPE)
time.sleep(1)
print(proc.poll())
# None
proc.terminate()
time.sleep(1)
print(proc.poll())
# -15
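
For completeness, a sketch of the corrected mp.Process usage for the original setup (run_job is a wrapper name invented here; mayapy and job.pyFile come from the question, and the plain command string assumes Windows, which the question's task manager reference suggests):
import multiprocessing as mp
import subprocess

def run_job(cmd):
    subprocess.call(cmd)  # blocks inside the worker until the app exits

if __name__ == '__main__':
    p = mp.Process(target=run_job, args=(mayapy + ' -u ' + job.pyFile,))
    p.start()
    print(p.is_alive())  # True while the command line app is still running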
