Using custom signal handlers with gunicorn

I have a Flask application with custom signal handlers to take care of clean up tasks before exiting. When running the application with gunicorn, gunicorn kills the application before it can complete all clean up tasks.

You didn't explain what you mean by custom signal handlers, but I'm not sure that you should be using Flask's signals to capture process-level events, like shutdown. Instead, you can use the signal module from the standard library to hook onto the SIGTERM signal, like so:
# app.py - CREATE THIS FILE
from flask import Flask
from time import sleep, time
import signal
import sys


def create_app():
    signal.signal(signal.SIGTERM, my_teardown_handler)
    app = Flask(__name__)

    @app.route('/')
    def home():
        return 'hi'

    return app


def my_teardown_handler(signal, frame):
    """Sleeps for 3 seconds, then creates/updates a file named app-log.txt with the timestamp."""
    sleep(3)
    with open('app-log.txt', 'w') as f:
        msg = ''.join(['The time is: ', str(time())])
        f.write(msg)
    sys.exit(0)


if __name__ == '__main__':
    app = create_app()
    app.run(port=8888)
# wsgi.py - CREATE THIS FILE, in same folder as app.py
import os
import sys
from werkzeug.wsgi import DispatcherMiddleware
from werkzeug.exceptions import NotFound
from app import create_app
app = DispatcherMiddleware(create_app())
Assuming you have a virtual environment with Flask and Gunicorn installed, you should then be able to launch the app with Gunicorn:
$ gunicorn --bind 127.0.0.1:8888 --log-level debug wsgi:app
Next, in a separate terminal, you can send the TERM signal to your app, like so:
$ kill -s TERM [PROCESS ID OF GUNICORN PROCESS / $(ps ax | grep gunicorn | head -n 1 | awk '{print $1}')]
To observe the results: the contents of the app-log.txt file should be updated, after the three-second delay, when you run that kill command. You could even open a third terminal window in this directory and run watch -n 1 "cat app-log.txt" to watch the file being updated in real time while you cycle between starting the app and sending the TERM signal.
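Putting the test loop in one place, these are the same commands as above, one per terminal:
# terminal 1: start the app under gunicorn
$ gunicorn --bind 127.0.0.1:8888 --log-level debug wsgi:app
# terminal 2: watch the log file for updates
$ watch -n 1 "cat app-log.txt"
# terminal 3: send SIGTERM to trigger the clean-up handler
$ kill -s TERM $(ps ax | grep gunicorn | head -n 1 | awk '{print $1}')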
As for tying that into production, I know that Supervisor has a configuration option to specify the stopsignal, like so:
[program:my-app]
command = /path/to/gunicorn [RUNTIME FLAGS]
stopsignal = TERM
...
But that's a separate topic from the original issue of ensuring that your app's clean up tasks are completely executed.
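One gunicorn setting worth knowing in this context: workers only get a limited window to finish after a graceful shutdown signal, controlled by --graceful-timeout (30 seconds by default). If your clean-up takes longer than that, the timeout can be raised, for example (the 60 is just an illustrative value):
$ gunicorn --bind 127.0.0.1:8888 --graceful-timeout 60 --log-level debug wsgi:app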

Related

How to run Uvicorn FastAPI server as a module from another Python file?

I want to run a FastAPI server using Uvicorn from a different Python file.
uvicornmodule/main.py

import uvicorn
import webbrowser
from fastapi import FastAPI
from fastapi.responses import FileResponse
from fastapi.staticfiles import StaticFiles

app = FastAPI()

import os
script_dir = os.path.dirname(__file__)
st_abs_file_path = os.path.join(script_dir, "static/")
app.mount("/static", StaticFiles(directory=st_abs_file_path), name="static")


@app.get("/")
async def index():
    return FileResponse('static/index.html', media_type='text/html')


def start_server():
    # print('Starting Server...')
    uvicorn.run(
        "app",
        host="0.0.0.0",
        port=8765,
        log_level="debug",
        reload=True,
    )
    # webbrowser.open("http://127.0.0.1:8765")


if __name__ == "__main__":
    start_server()
So, I want to run the FastAPI server from the below test.py file:
from uvicornmodule import main
main.start_server()
Then, I run python test.py.
But I am getting the below error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
What am I doing wrong? I need to run this module as a package.
When spawning new processes from the main process (which is what happens when uvicorn.run() is called with reload enabled or multiple workers), it is important to protect the entry point to avoid recursive spawning of subprocesses, etc. As described in this article:
If the entry point was not protected with an if-statement idiom
checking for the top-level environment, then the script would execute
again directly, rather than run a new child process as expected.
Protecting the entry point ensures that the program is only started
once, that the tasks of the main process are only executed by the main
process and not the child processes.
Basically, your code that creates the new process must be under if __name__ == '__main__':. Hence:
from uvicornmodule import main

if __name__ == "__main__":
    main.start_server()
Additionally, when running uvicorn programmatically with the reload and/or workers flag(s) enabled, you must pass the application as an import string in the "<module>:<attribute>" format. For example:
# main.py
import uvicorn
from fastapi import FastAPI

app = FastAPI()

if __name__ == "__main__":
    uvicorn.run("main:app", host="0.0.0.0", port=8000, reload=True)
As a side note, the following would also work, if the reload and/or workers flags were not used:
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
Also, as per the FastAPI documentation, when running the server from a terminal in the following way (the default port is 8000; have a look at all the available command-line options):
> uvicorn main:app --reload
the command uvicorn main:app refers to:
main: the file main.py (the Python "module").
app: the object created inside of main.py with the line app = FastAPI().
--reload: make the server restart after code changes. Only use for development.

web2py scheduling multiple tasks

In my web2py app I am using the scheduler. So far I have one scheduled task, which runs a subprocess (an external exe file/application) when called from a controller.
Now I want to add another task which will do some background work.
My code in scheduler.py so far is:
def runWoshiEngine(scriptId, path):
    # import os, sys
    # import time
    import subprocess
    print "runWoshiEngine in progress......"
    p = subprocess.Popen(['woshi_engine.exe', scriptId], shell=True, stdout=subprocess.PIPE, cwd=path)
    return dict(status=1)

from gluon.scheduler import Scheduler
scheduler = Scheduler(db, heartbeat=1)
So this way the task is started on the client's request.
After starting my app, I run the scheduler worker with the command:
python web2py.py -K myapp
Now I want to add another function which will run every hour to do some background work.
What would you recommend, and how do I add it to the scheduler? If I add anything to my code, my initial task (the exe app) is not started.
Thank you, best regards
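For reference (a sketch, not from the original thread): web2py's scheduler can queue a task with a repeat period, so an hourly job could be added alongside the existing code, assuming the same db and scheduler objects from scheduler.py above and a hypothetical hourly_job function:
def hourly_job():
    # hypothetical background work goes here
    return dict(status=1)

# Queue it once (for example from a controller or a one-off script).
# period=3600 re-runs it every hour; repeats=0 means "repeat indefinitely".
scheduler.queue_task(hourly_job, period=3600, repeats=0, timeout=300)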

is there any way to run bottle application in daemon mode

I have a web application built on the Bottle (Python) framework and I want to run it in daemon mode. Is there any way to run it in daemon mode?
Thanks
Sure you can. Install BottleDaemon 0.1.0 on your OS and then change your router file like so:
from bottledaemon import daemon_run
from bottle import route

@route("/hello")
def hello():
    return "Hello World"

# The following lines will call the BottleDaemon script and launch a daemon in the background.
if __name__ == "__main__":
    daemon_run()
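BottleDaemon itself comes from PyPI; assuming the distribution name matches the import name used above:
$ pip install bottledaemon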

Celery auto reload on ANY changes

I could make Celery reload itself automatically when there are changes to the modules listed in CELERY_IMPORTS in settings.py.
I tried giving it parent modules so that changes in their child modules would also be detected, but changes in child modules were not picked up. That makes me think the detection is not done recursively by Celery. I searched the documentation but did not find anything addressing this.
It is really tedious to add every Celery-related part of my project to CELERY_IMPORTS just to get changes detected.
Is there a way to tell Celery to "auto-reload yourself when there is any change anywhere in the project"?
Thank You!
Celery --autoreload doesn't work and it is deprecated.
Since you are using Django, you can write a management command for that.
Django has an autoreload utility which is used by runserver to restart the WSGI server when code changes.
The same functionality can be used to reload Celery workers. Create a separate management command called celery. Write a function to kill the existing worker and start a new worker. Now hook this function into autoreload as follows.
import shlex
import subprocess

from django.core.management.base import BaseCommand
from django.utils import autoreload


def restart_celery():
    cmd = 'pkill celery'
    subprocess.call(shlex.split(cmd))
    cmd = 'celery worker -l info -A foo'
    subprocess.call(shlex.split(cmd))


class Command(BaseCommand):

    def handle(self, *args, **options):
        print('Starting celery worker with autoreload...')

        # For Django>=2.2
        autoreload.run_with_reloader(restart_celery)

        # For django<2.1
        # autoreload.main(restart_celery)
Now you can run the Celery worker with python manage.py celery, which will auto-reload when the codebase changes.
This is only for development purposes; do not use it in production. Code taken from my other answer here.
You can manually include additional modules with -I|--include. Combine this with GNU tools like find and awk and you'll be able to find all .py files and include them.
$ celery -A app worker --autoreload --include=$(find . -name "*.py" -type f | awk '{sub("\./",""); gsub("/", "."); sub(".py",""); print}' ORS=',' | sed 's/.$//')
Let's break it down:
find . -name "*.py" -type f
find searches recursively for all files containing .py. The output looks something like this:
./app.py
./some_package/foo.py
./some_package/bar.py
Then:
awk '{sub("\./",""); gsub("/", "."); sub(".py",""); print}' ORS=','
This line takes the output of find as input and removes all occurrences of ./. Then it replaces every / with a .. The last sub() replaces .py with an empty string. ORS replaces all newlines with ,. This outputs:
app,some_package.foo,some_package.bar,
The last command, sed, removes the trailing ,.
So the command that is being executed looks like:
$ celery -A app worker --autoreload --include=app,some_package.foo,some_package.bar
If you have a virtualenv inside your source you can exclude it by adding -path .path_to_your_env -prune -o:
$ celery -A app worker --autoreload --include=$(find . -path .path_to_your_env -prune -o -name "*.py" -type f | awk '{sub("\./",""); gsub("/", "."); sub(".py",""); print}' ORS=',' | sed 's/.$//')
You can use watchmedo
pip install watchdog
Start celery worker indirectly via watchmedo
watchmedo auto-restart --directory=./ --pattern=*.py --recursive -- celery worker --app=worker.app --concurrency=1 --loglevel=INFO
I used the watchdog watchmedo utility; it works great, but for some reason the PyCharm debugger was not able to debug the subprocess spawned by watchmedo.
So if your project has werkzeug as a dependency, you can use the werkzeug._reloader.run_with_reloader function to auto-reload the Celery worker on code change. Plus, it works with the PyCharm debugger.
"""
Filename: celery_dev.py
"""
import sys
from werkzeug._reloader import run_with_reloader
# this is the celery app path in my application, change it according to your project
from web.app import celery_app
def run():
# create copy of "argv" and remove script name
argv = sys.argv.copy()
argv.pop(0)
# start the celery worker
celery_app.worker_main(argv)
if __name__ == '__main__':
run_with_reloader(run)
Sample PyCharm debug configuration.
NOTE:
This is a private werkzeug API and is working as of Werkzeug==2.0.3. It may stop working in future versions. Use at your own risk.
OrangeTux's solution didn't work out for me, so I wrote a little Python script to achieve more or less the same. It monitors file changes using inotify and triggers a Celery restart if it detects an IN_MODIFY, IN_ATTRIB, or IN_DELETE event.
#!/usr/bin/env python
"""Runs a celery worker, and reloads on a file change. Run as ./run_celery [directory]. If
directory is not given, default to cwd."""

import os
import sys
import signal
import time
import multiprocessing
import subprocess
import threading

import inotify.adapters

CELERY_CMD = tuple("celery -A amcat.amcatcelery worker -l info -Q amcat".split())
CHANGE_EVENTS = ("IN_MODIFY", "IN_ATTRIB", "IN_DELETE")
WATCH_EXTENSIONS = (".py",)


def watch_tree(stop, path, event):
    """
    :type stop: multiprocessing.Event
    :type event: multiprocessing.Event
    """
    path = os.path.abspath(path)

    for e in inotify.adapters.InotifyTree(path).event_gen():
        if stop.is_set():
            break

        if e is not None:
            _, attrs, path, filename = e

            if filename is None:
                continue

            # Only react to files with a watched extension (.py)
            if not any(filename.endswith(ename) for ename in WATCH_EXTENSIONS):
                continue

            if any(ename in attrs for ename in CHANGE_EVENTS):
                event.set()


class Watcher(threading.Thread):
    def __init__(self, path):
        super(Watcher, self).__init__()
        self.celery = subprocess.Popen(CELERY_CMD)
        self.stop_event_wtree = multiprocessing.Event()
        self.event_triggered_wtree = multiprocessing.Event()
        self.wtree = multiprocessing.Process(target=watch_tree, args=(self.stop_event_wtree, path, self.event_triggered_wtree))
        self.wtree.start()
        self.running = True

    def run(self):
        while self.running:
            if self.event_triggered_wtree.is_set():
                self.event_triggered_wtree.clear()
                self.restart_celery()
            time.sleep(1)

    def join(self, timeout=None):
        self.running = False
        self.stop_event_wtree.set()
        self.celery.terminate()
        self.wtree.join()
        self.celery.wait()
        super(Watcher, self).join(timeout=timeout)

    def restart_celery(self):
        self.celery.terminate()
        self.celery.wait()
        self.celery = subprocess.Popen(CELERY_CMD)


if __name__ == '__main__':
    watcher = Watcher(sys.argv[1] if len(sys.argv) > 1 else ".")
    watcher.start()

    signal.signal(signal.SIGINT, lambda signal, frame: watcher.join())
    signal.pause()
You should probably change CELERY_CMD, or any other global variables.
There was an issue in @AlexTT's answer; I don't know if I should comment on his answer or put this as an answer.
You can use watchmedo
pip install watchdog
Start celery worker indirectly via watchmedo
watchmedo auto-restart --directory=./ --pattern=*.py --recursive -- celery -A <app> worker --concurrency=1 --loglevel=INFO
This is the way I made it work in Django:
# worker_dev.py (put it next to manage.py)

from django.utils import autoreload


def run_celery():
    from projectname import celery_app

    celery_app.worker_main(["-Aprojectname", "-linfo", "-Psolo"])


print("Starting celery worker with autoreload...")
autoreload.run_with_reloader(run_celery)
Then run python worker_dev.py. This has the advantage of working inside a Docker container.
This is a huge adaptation from Suor's code.
I made a custom Django command which can be called like this:
python manage.py runcelery
So, every time the code changes, celery's main process is gracefully killed and then executed again.
Change the CELERY_COMMAND variable as you wish.
# File: runcelery.py
import os
import signal
import subprocess
import time

import psutil
from django.core.management.base import BaseCommand
from django.utils import autoreload


DELAY_UNTIL_START = 5.0
CELERY_COMMAND = 'celery --config my_project.celeryconfig worker --loglevel=INFO'


class Command(BaseCommand):
    help = ''

    def kill_celery(self, parent_pid):
        os.kill(parent_pid, signal.SIGTERM)

    def run_celery(self):
        time.sleep(DELAY_UNTIL_START)
        subprocess.run(CELERY_COMMAND.split(' '))

    def get_main_process(self):
        for process in psutil.process_iter():
            if process.ppid() == 0:  # PID 0 has no parent
                continue
            parent = psutil.Process(process.ppid())
            if process.name() == 'celery' and parent.name() == 'celery':
                return parent
        return

    def reload_celery(self):
        parent = self.get_main_process()
        if parent is not None:
            self.stdout.write('[*] Killing Celery process gracefully..')
            self.kill_celery(parent.pid)
        self.stdout.write('[*] Starting Celery...')
        self.run_celery()

    def handle(self, *args, **options):
        autoreload.run_with_reloader(self.reload_celery)

Running Flask + Gunicorn + SocketIO

I am running SocketIOServer for my flask app and I have a run.py script which looks like this:
import sys
from idateproto import create_app, run_app

if __name__ == "__main__":
    if len(sys.argv) > 1:
        create_app(sys.argv[1])
    else:
        create_app('dev')
    run_app()
and run_app looks like this:
def run_app():
    # Run the application
    # See flask/app.py run() for the implementation of run().
    # See http://werkzeug.pocoo.org/docs/serving/ for the parameters of Werkzeug's run_simple()
    # If the debug parameter is not set, Flask does not change app.debug, which is set from
    # the DEBUG app config variable, which we've set in create_app().
    #app.run(**app_run_args)
    global app
    global middleware
    SocketIOServer((app.config['HOST'], app.config['PORT']), middleware, namespace="socket.io", policy_server=False).serve_forever()
I run this like: python run.py production/dev/test etc.
Now I want to run it in production using gunicorn and all the tutorials online suggest doing something like:
gunicorn run:app -c gunicorn-config.py
My problem is that I don't want to run just my app any more; I want to tell gunicorn to run the serve_forever method on the SocketIOServer instead. I have done a lot of research online and cannot find a way to achieve this. Please help.
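One approach worth sketching (an assumption-laden sketch, not a confirmed recipe for this code base): gevent-socketio ships a gunicorn worker class, socketio.sgunicorn.GeventSocketIOWorker, which handles serve_forever() for you, so gunicorn only needs a plain WSGI object to point at. Assuming the middleware global used in run_app() is that WSGI object, a wrapper module might look like this (module and attribute names are placeholders):
# wsgi_socketio.py - sketch only; adjust to wherever the "middleware" global actually lives
import idateproto
from idateproto import create_app

create_app('production')              # assumed to populate the app/middleware globals
application = idateproto.middleware   # the object previously handed to SocketIOServer

# Then launch with the gevent-socketio worker class instead of calling serve_forever() yourself:
#   $ gunicorn --worker-class socketio.sgunicorn.GeventSocketIOWorker wsgi_socketio:application -c gunicorn-config.py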
