Kill process and all its child processes after a Popen - python

So I have a situation with a Flask endpoint A, an endpoint B, and two other scripts, foo.py and bar.py.
When I call endpoint A, it launches foo.py with Popen and stores its PID.
foo.py in turn calls bar.py with Popen, which makes yet another call, again using Popen. The process opened by bar.py is a server (more specifically, a tf-serving server), which hangs forever when I do p.wait(). Later on, I would like to use endpoint B to end the whole process tree triggered by A.
The situation can be something like:
Flask's endpoints:
import os
import json
import signal
from subprocess import Popen
from flask import current_app
from flask import request, jsonify

@app.route('/A', methods=['GET'])
def a():
    p = Popen(['python', '-u', './foo.py'])
    current_app.config['FOO_PID'] = p.pid
    return jsonify({'message': 'Started successfully'}), 200

@inspection.route('/B', methods=['GET'])
def b():
    os.kill(current_app.config['FOO_PID'], signal.SIGTERM)
    return jsonify({'message': 'Stopped successfully'}), 200
foo.py:
from subprocess import Popen

p = Popen(['python', '-u', './bar.py', '--serve'])
while True:
    continue
bar.py:
import sys
import subprocess

command = 'tensorflow_model_server --rest_api_port=8501 --model_name=obj_det --model_base_path=./model'
p = subprocess.Popen(command, shell=True, stderr=sys.stderr, stdout=sys.stdout)
p.wait()
Unfortunately, when I kill foo.py using endpoint B, the process created by bar.py (i.e. the server) does not end. How can I kill the server?
Please consider a solution that is OS-agnostic.

Using a package like psutil allows you to recursively iterate over all child processes of a given PID, which effectively lets you kill every nested process. Documentation for psutil: https://github.com/giampaolo/psutil.
import json
import signal
from subprocess import Popen
from flask import current_app
from flask import request, jsonify
from psutil import Process

@app.route('/A', methods=['GET'])
def a():
    p = Popen(['python', '-u', './foo.py'])
    current_app.config['FOO_PID'] = p.pid
    return jsonify({'message': 'Started successfully'}), 200

@inspection.route('/B', methods=['GET'])
def b():
    pid = current_app.config['FOO_PID']
    parent = Process(pid)
    for child in parent.children(recursive=True):
        child.kill()
    parent.kill()
    return jsonify({'message': 'Stopped successfully'}), 200
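If a graceful shutdown is preferable, psutil also offers wait_procs(), which lets you send SIGTERM to the whole tree first and only SIGKILL whatever survives. A minimal sketch of that approach (the helper name stop_tree and the 5-second timeout are illustrative, not part of the original answer):

import psutil

def stop_tree(pid, timeout=5):
    # Ask the parent and all descendants to terminate gracefully first.
    parent = psutil.Process(pid)
    procs = parent.children(recursive=True) + [parent]
    for proc in procs:
        proc.terminate()
    # Wait up to `timeout` seconds, then force-kill anything still alive.
    gone, alive = psutil.wait_procs(procs, timeout=timeout)
    for proc in alive:
        proc.kill()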

Related

Python - how can I run separate module (not function) as a separate process?

tl;dr: How can I programmatically execute a Python module (not function) as a separate process from a different Python module?
On my development laptop, I have a 'server' module containing a bottle server. In this module, the __name__ == '__main__' clause starts the bottle server.
@bt_app.post("/")
def server_post():
    << Generate response to 'http://server.com/' >>

if __name__ == '__main__':
    serve(bt_app, host='localhost', port=8080)
I also have a 'test_server' module containing pytests. In this module, the __name__ == '__main__' clause runs pytest and displays the results.
def test_something():
    _rtn = some_server_function()
    assert _rtn == desired

if __name__ == '__main__':
    _rtn = pytest.main([__file__])
    print("Pytest returned: ", _rtn)
Currently, I manually run the server module (starting the web server on localhost), then I manually start the pytest module, which issues HTTP requests to the running server module and checks the responses.
Sometimes I forget to start the server module. No big deal, but annoying. So I'd like to know whether I can programmatically start the server module as a separate process from the pytest module (just as I'm doing manually now) so I don't forget to start it by hand.
Thanks
Here is my test case directory tree:
test
├── server.py
└── test_server.py
server.py starts a web server with Flask.
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

if __name__ == '__main__':
    app.run()
test_server.py makes a request to test it.
import sys
import requests
import subprocess
import time

p = None  # server process

def start_server():
    global p
    sys.path.append('/tmp/test')
    # Here you may want to check whether the server is already started,
    # and if so, skip this function.
    kwargs = {}  # here you can pass other args as needed
    p = subprocess.Popen(['python', 'server.py'], **kwargs)

def test_function():
    response = requests.get('http://localhost:5000/')
    print('This is the response body: ', response.text)

if __name__ == '__main__':
    start_server()
    time.sleep(3)  # wait for the server to start
    test_function()
    p.kill()
Then you can run python test_server.py to start the server and run the test cases.
PS: subprocess.run() needs Python 3.5+; on older versions, use Popen instead.
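If you drive the tests with pytest instead of running test_server.py directly, the same idea can be wrapped in a session-scoped fixture so the server starts and stops automatically. This is only a sketch of that variation, assuming the same server.py on port 5000; the fixture and test names are illustrative:

import subprocess
import time

import pytest
import requests

@pytest.fixture(scope="session", autouse=True)
def flask_server():
    # Start server.py as a separate process before the test session...
    proc = subprocess.Popen(['python', 'server.py'])
    time.sleep(3)  # crude wait; in real code, poll the URL until it responds
    yield proc
    # ...and stop it once all tests have run.
    proc.terminate()
    proc.wait()

def test_hello():
    response = requests.get('http://localhost:5000/')
    assert response.text == 'Hello, World!'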
import logging
import threading
import time

def thread_function(name):
    logging.info("Thread %s: starting", name)
    time.sleep(2)
    logging.info("Thread %s: finishing", name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")
    threads = list()
    for index in range(3):
        logging.info("Main : create and start thread %d.", index)
        x = threading.Thread(target=thread_function, args=(index,))
        threads.append(x)
        x.start()
    for index, thread in enumerate(threads):
        logging.info("Main : before joining thread %d.", index)
        thread.join()
        logging.info("Main : thread %d done", index)
With threading you can run multiple tasks at once (within a single process)!
Wim basically answered this question. I looked into the subprocess module. While reading up on it, I stumbled on the os.system function.
In short, subprocess is a highly flexible and feature-rich module for running programs. os.system, on the other hand, is much simpler, with far fewer features.
Just running a Python module is simple, so I settled on os.system.
import os

server_path = "python ../src/server.py"
os.system(server_path)
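For comparison, a rough subprocess-based equivalent might look like the sketch below (the path is the same placeholder and the variable name is mine). Note that os.system blocks until the command exits, whereas Popen returns immediately and leaves the child running in the background:

import subprocess
import sys

# Launch server.py as a separate, non-blocking child process.
server_proc = subprocess.Popen([sys.executable, "../src/server.py"])
# ... run the tests against the server ...
server_proc.terminate()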
Wim, thanks for the pointer. Had it been a full-fledged answer I would have upvoted it. Redo it as a full-fledged answer and I'll do so.
Async to the rescue.
import gevent
from gevent import monkey, spawn
monkey.patch_all()
from gevent.pywsgi import WSGIServer

@bt_app.post("/")
def server_post():
    << Generate response to 'http://server.com/' >>

def test_something():
    _rtn = some_server_function()
    assert _rtn == desired
    print("Pytest returned: ", _rtn)
    gevent.sleep(0)

if __name__ == '__main__':
    spawn(test_something)  # runs async
    server = WSGIServer(("0.0.0.0", 8080), bt_app)
    server.serve_forever()

gevent - hub.loop.reinit() does not work after fork

The do_magic function is called twice in the following example, in both the parent and the child process.
My confusion is that os.fork has been replaced with gevent.fork (via monkey patching), and hub.loop.reinit() is called in the child process. If so, why is do_magic still called in the child process?
import gevent
from gevent import monkey
monkey.patch_all()
import os, time

def do_magic():
    print('magic...')

def main():
    g = gevent.spawn_later(1, do_magic)
    pid = os.fork()
    if pid != 0:  # parent
        g.join()
    else:
        gevent.get_hub().loop.reinit()
        time.sleep(3)

main()

Running asyncio.subprocess.Process from Tornado RequestHandler

I'm trying to write a Tornado web app which runs a local command asynchronously, as a coroutine. This is the stripped down example code:
#! /usr/bin/env python3

import shlex
import asyncio
import logging

from tornado.web import Application, url, RequestHandler
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop

logging.getLogger('asyncio').setLevel(logging.DEBUG)

async def run():
    command = "python3 /path/to/my/script.py"
    logging.debug('Calling command: {}'.format(command))
    process = asyncio.create_subprocess_exec(
        *shlex.split(command),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT
    )
    logging.debug(' - process created')
    result = await process
    stdout, stderr = result.communicate()
    output = stdout.decode()
    return output

def run_sync(self, path):
    command = "python3 /path/to/my/script.py"
    logging.debug('Calling command: {}'.format(command))
    try:
        result = subprocess.run(
            *shlex.split(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            check=True
        )
    except subprocess.CalledProcessError as ex:
        raise RunnerError(ex.output)
    else:
        return result.stdout

class TestRunner(RequestHandler):
    async def get(self):
        result = await run()
        self.write(result)

url_list = [
    url(r"/test", TestRunner),
]

HTTPServer(Application(url_list, debug=True)).listen(8080)
logging.debug("Tornado server started at port {}.".format(8080))
IOLoop.configure('tornado.platform.asyncio.AsyncIOLoop')
IOLoop.instance().start()
When /path/to/my/script.py is called directly, it executes as expected. Also, when I have TestRunner.get implemented as a regular, synchronous method (see run_sync), it executes correctly. However, when running the above app and calling /test, the log shows:
DEBUG:asyncio:Using selector: EpollSelector
DEBUG:asyncio:execute program 'python3' stdout=stderr=<pipe>
DEBUG:asyncio:process 'python3' created: pid 21835
However, ps shows that the process hung:
$ ps -ef | grep 21835
berislav 21835 21834 0 19:19 pts/2 00:00:00 [python3] <defunct>
I have a feeling that I'm not implementing the right loop, or I'm doing it wrong, but all the examples I've seen show how to use asyncio.get_event_loop().run_until_complete(your_coro()), and I couldn't find much about combining asyncio and Tornado. All suggestions welcome!
Subprocesses are tricky because of the singleton SIGCHLD handler. In asyncio, this means that they only work with the "main" event loop. If you change tornado.ioloop.IOLoop.configure('tornado.platform.asyncio.AsyncIOLoop') to tornado.platform.asyncio.AsyncIOMainLoop().install(), then the example works. A few other cleanups were also necessary; here's the full code:
#! /usr/bin/env python3

import shlex
import asyncio
import logging

import tornado.platform.asyncio
from tornado.web import Application, url, RequestHandler
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop

logging.getLogger('asyncio').setLevel(logging.DEBUG)

async def run():
    command = "python3 /path/to/my/script.py"
    logging.debug('Calling command: {}'.format(command))
    process = await asyncio.create_subprocess_exec(
        *shlex.split(command),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT
    )
    logging.debug(' - process created')
    result = await process.wait()
    stdout, stderr = await process.communicate()
    output = stdout.decode()
    return output

tornado.platform.asyncio.AsyncIOMainLoop().install()
IOLoop.instance().run_sync(run)
Also note that tornado has its own subprocess interface in tornado.process.Subprocess, so if that's the only thing you need asyncio for, consider using the Tornado version instead. Be aware that combining Tornado and asyncio's subprocesses interfaces in the same process may produce conflicts with the SIGCHLD handler, so you should pick one or the other, or use the libraries in such a way that the SIGCHLD handler is unnecessary (for example by relying solely on stdout/stderr instead of the process's exit status).
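For reference, a minimal sketch of what the Tornado-native variant might look like with tornado.process.Subprocess (illustrative only, reusing the same placeholder command; this is not the answerer's code):

import shlex

from tornado.process import Subprocess

async def run_with_tornado():
    # STREAM turns stdout/stderr into PipeIOStreams that can be read asynchronously.
    proc = Subprocess(
        shlex.split("python3 /path/to/my/script.py"),
        stdout=Subprocess.STREAM,
        stderr=Subprocess.STREAM,
    )
    output = await proc.stdout.read_until_close()
    await proc.wait_for_exit()
    return output.decode()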

Streaming a response doesn't work with Flask-Restful

I have a scenario where I want to show the output of a long-running script through a Flask API. I followed an example given for Flask and it works: I get a dmesg stream in my browser.
import subprocess
import time

from flask import Flask, Response

app = Flask(__name__)

@app.route('/yield')
def index():
    def inner():
        proc = subprocess.Popen(
            ['dmesg'],  # call something with a lot of output so we can see it
            shell=True,
            stdout=subprocess.PIPE
        )
        for line in iter(proc.stdout.readline, ''):
            time.sleep(1)  # not needed, just shows the text streaming
            yield line.rstrip() + '<br/>\n'
    return Response(inner(), mimetype='text/html')  # text/html is required for most browsers to show this
The thing is, I have been using Flask-RESTful for a long time, so I want to do the streaming with it. I tried, but it's not working.
import subprocess
import time

from flask import Response
from flask_restful import Resource

class CatalogStrings(Resource):
    def get(self):
        return Response(inner(), mimetype='text/html')

def inner():
    proc = subprocess.Popen(
        ['dmesg'],  # call something with a lot of output so we can see it
        shell=True,
        stdout=subprocess.PIPE
    )
    for line in iter(proc.stdout.readline, ''):
        time.sleep(1)  # not needed, just shows the text streaming
        yield line.rstrip() + '<br/>\n'
Please help

Using Popen in a thread blocks every incoming Flask-SocketIO request

I have the following situation:
I receive a request on a SocketIO server. I answer it (socket.emit(..)) and then start something with a heavy computation load in another thread.
If the heavy computation is caused by subprocess.Popen (using subprocess.PIPE), it totally blocks every incoming request for as long as it is being executed, although it happens in a separate thread.
No problem - in this thread it was suggested to asynchronously read the result of the subprocess with a buffer size of 1 so that between these reads other threads have the chance to do something. Unfortunately, this did not help in my case.
I also already monkey-patched eventlet and that works fine - as long as I don't use subprocess.Popen with subprocess.PIPE in the thread.
In the code sample below you can see that it only happens when using subprocess.Popen with subprocess.PIPE. When uncommenting #functionWithSimulatedHeavyLoad() and instead commenting out functionWithHeavyLoad(), everything works like a charm.
from flask import Flask
from flask.ext.socketio import SocketIO, emit
import eventlet
eventlet.monkey_patch()

app = Flask(__name__)
socketio = SocketIO(app)

import time
from threading import Thread

@socketio.on('client command')
def response(data, type=None, nonce=None):
    socketio.emit('client response', ['foo'])

    thread = Thread(target=testThreadFunction)
    thread.daemon = True
    thread.start()

def testThreadFunction():
    # functionWithSimulatedHeavyLoad()
    functionWithHeavyLoad()

def functionWithSimulatedHeavyLoad():
    time.sleep(5)

def functionWithHeavyLoad():
    from datetime import datetime
    import subprocess
    import sys
    from queue import Queue, Empty

    ON_POSIX = 'posix' in sys.builtin_module_names

    def enqueueOutput(out, queue):
        for line in iter(out.readline, b''):
            if line == '':
                break
            queue.put(line)
        out.close()

    # just anything that takes long to be computed
    shellCommand = 'find / test'

    p = subprocess.Popen(shellCommand, universal_newlines=True, shell=True,
                         stdout=subprocess.PIPE, bufsize=1, close_fds=ON_POSIX)
    q = Queue()
    t = Thread(target=enqueueOutput, args=(p.stdout, q))
    t.daemon = True
    t.start()
    t.join()

    text = ''
    while True:
        try:
            line = q.get_nowait()
            text += line
            print(line)
        except Empty:
            break

    socketio.emit('client response', {'text': text})

socketio.run(app)
The client receives the message 'foo' after the blocking work in the functionWithHeavyLoad() function is completed. It should receive the message earlier, though.
This sample can be copied and pasted into a .py file and the behavior can be reproduced instantly.
I am using Python 3.4.3, Flask 0.10.1, Flask-SocketIO 1.2, and eventlet 0.17.4.
Update
If I put this into the functionWithHeavyLoad function it actually works and everything's fine:
import shlex

shellCommand = shlex.split('find / test')
popen = subprocess.Popen(shellCommand, stdout=subprocess.PIPE)
lines_iterator = iter(popen.stdout.readline, b"")
for line in lines_iterator:
    print(line)
    eventlet.sleep()
The problem is: I used find for the heavy load in order to make the sample more easily reproducible for you. However, in my code I actually use tesseract "{0}" stdout -l deu as the shell command. This (unlike find) still blocks everything. Is this a tesseract issue rather than an eventlet one? But still: how can this block if it happens in a separate thread that reads line by line with context switches, when find does not block?
Thanks to this question I learned something new today. Eventlet does offer a greenlet-friendly version of subprocess and its functions, but for some odd reason it does not monkey patch that module in the standard library.
Link to the eventlet implementation of subprocess: https://github.com/eventlet/eventlet/blob/master/eventlet/green/subprocess.py
Looking at the eventlet patcher, the modules that are patched are os, select, socket, thread, time, MySQLdb, builtins and psycopg2. There is absolutely no reference to subprocess in the patcher.
The good news is that I was able to work with Popen() in an application very similar to yours, after I replaced:
import subprocess
with:
from eventlet.green import subprocess
But note that the currently released version of eventlet (0.17.4) does not support the universal_newlines option in Popen; you will get an error if you use it. Support for this option is in master (here is the commit that added it). You will either have to remove that option from your call, or install the master branch of eventlet directly from GitHub.
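To make the swap concrete, here is a minimal sketch of how functionWithHeavyLoad's Popen call might look with the green module (illustrative only; the universal_newlines option is dropped per the note above):

from eventlet.green import subprocess  # cooperative, greenlet-friendly Popen

def functionWithHeavyLoad():
    p = subprocess.Popen(
        'find / test',
        shell=True,
        stdout=subprocess.PIPE,
        bufsize=1,
    )
    # Reading now yields to other greenlets instead of blocking the whole server.
    for line in iter(p.stdout.readline, b''):
        print(line)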
