I have a web application written in Flask. As everyone suggests, I shouldn't use Flask's built-in server in production, so I thought of running Flask under Gunicorn.
In the Flask application I am loading some machine learning models, which total 8GB in size. Concurrency of my web application can go up to 1000 requests, and the machine has 15GB of RAM.
So what is the best way to run this application?
You can start your app with multiple workers or async workers with Gunicorn.
Flask server.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    app.run()
Gunicorn with gevent async worker:
gunicorn server:app -k gevent --worker-connections 1000
Gunicorn 1 worker 12 threads:
gunicorn server:app -w 1 --threads 12
Gunicorn with 4 workers (multiprocessing):
gunicorn server:app -w 4
More information on Flask concurrency in this post: How many concurrent requests does a single Flask process receive?
The best approach is to preload the application (preload_app = True in the Gunicorn config, or the --preload flag). This loads your code once in the master process and then simply forks off worker processes to handle requests. If you are running on Linux and your model is read-only, the OS can share the physical memory among all the processes (copy-on-write) instead of duplicating it per worker.
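A minimal sketch of a Gunicorn config file that turns on preloading. The filename gunicorn.conf.py, the bind address, and the worker count are assumptions for illustration, not tuned recommendations; bind, workers, and preload_app are Gunicorn's own setting names.

```python
# gunicorn.conf.py (hypothetical filename; Gunicorn config files are plain Python)

# Address to listen on. Assumed value for illustration.
bind = "0.0.0.0:8000"

# Number of worker processes. Assumed value; size it to your CPU and memory.
workers = 2

# Import the application in the master process before forking, so that
# module-level objects (such as loaded models) are created once and the
# forked workers start out sharing those memory pages.
preload_app = True
```

It could then be started with something like: gunicorn -c gunicorn.conf.py server:app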
I'm deploying a sample FastAPI app to the cloud on the Google App Engine standard environment. The app is served with gunicorn this way:
gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:80
This command spawns 4 worker processes of my app.
I've read that in FastAPI you can create either sync or async endpoints. If an endpoint is async, all requests run on a single thread with the event loop. If the endpoint is sync, FastAPI runs the function on another thread to prevent it from blocking the server.
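The sync-versus-async behavior described above can be illustrated with the standard library alone. This sketch does not use FastAPI: blocking_handler and async_handler are hypothetical stand-ins for a sync and an async endpoint, and asyncio.to_thread (Python 3.9+) plays the role of the thread pool a framework would use.

```python
import asyncio
import threading
import time


def blocking_handler():
    # Stand-in for a sync endpoint: blocks the thread it runs on.
    time.sleep(0.2)
    return threading.current_thread().name


async def async_handler():
    # Stand-in for an async endpoint: yields to the event loop while waiting.
    await asyncio.sleep(0.2)
    return threading.current_thread().name


async def main():
    # The blocking function is handed to a worker thread, so the event loop
    # stays free to run the coroutine at the same time.
    return await asyncio.gather(
        asyncio.to_thread(blocking_handler),
        async_handler(),
    )


sync_thread, async_thread = asyncio.run(main())
print("sync handler ran on:", sync_thread)
print("async handler ran on:", async_thread)
```

The two handlers report different thread names: the coroutine runs on the thread that owns the event loop, while the blocking call runs on a pool thread.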
I have sync blocking endpoints, so FastAPI should run them on threads, but I also have gunicorn spawning worker processes.
Given that Python only executes one thread at a time, and that the standard App Engine environment is also limited CPU-wise for multiprocessing, I'm confused about the best configuration for a FastAPI application on the cloud.
Should I let gunicorn or FastAPI handle the concurrency?
The number of workers you specify should match the instance class of your App Engine app, so check that 4 workers is appropriate for the instance class you deploy on. Here's an example of an App Engine deployment that uses 4 gunicorn workers for serving apps: entrypoint: gunicorn -b :8080 -w 4 main:app. This example comes from the entrypoint best practices.
Just a note: gunicorn uses sync workers by default. That worker class is compatible with all web applications, but each worker can only handle one request at a time.
Lastly, if you are using Google App Engine Flex, check the recommended gunicorn configurations for further guidance.
In development, Flask-SocketIO (4.1.0) with uWSGI is working nicely with just 1 worker and standard initialization.
Now I'm preparing for production and want to make it work with multiple workers.
I've done the following:
Added a Redis message_queue in init_app:
socketio = SocketIO()
socketio.init_app(app, async_mode='gevent_uwsgi', message_queue=app.config['SOCKETIO_MESSAGE_QUEUE'])
(Sidenote: we are using Redis in the app itself as well.)
Added gevent monkey patching at the top of the file that we run with uWSGI:
from gevent import monkey
monkey.patch_all()
Ran uWSGI with:
uwsgi --http 0.0.0.0:63000 --gevent 1000 --http-websockets --master --wsgi-file rest.py --callable application --py-autoreload 1 --gevent-monkey-patch --workers 4 --threads 1
This doesn't seem to work. The client rapidly alternates between connecting and receiving 400 Bad Request responses. I suspect these correspond to the ' Invalid session ....' errors I see when I enable SocketIO logging.
Initially it was not using Redis at all:
redis-cli > PUBSUB CHANNELS *
returned an empty result even with workers=1.
The following (taken from another SO answer) seemed to fix that:
# https://stackoverflow.com/a/19117266/492148
import gevent
import redis.connection
redis.connection.socket = gevent.socket
After doing so I got a "flask-socketio" pubsub channel with updating data.
But after returning to multiple workers, the issue came back. Given that changing the Redis socket did seem to move things in the right direction, I feel like the monkey patching isn't working properly yet. However, the code I used seems to match all the examples I can find, and it is at the very top of the file that is loaded by uWSGI.
You can run as many workers as you like, but only if you run each worker as a standalone single-worker uwsgi process. Once you have all those workers running each on its own port, you can put nginx in front to load balance using sticky sessions. And of course you also need the message queue for the workers to use when coordinating broadcasts.
Eventually found https://github.com/miguelgrinberg/Flask-SocketIO/issues/535
so it seems you can't have multiple workers with uwsgi either as it needs sticky sessions. Documentation mentions that for gunicorn, but I did not interpret that to extend to uwsgi.
I have gunicorn serving a django application. Nginx is used as a reverse proxy. And supervisord is used to manage gunicorn.
This is the supervisord config:
command = /opt/backend/envs/backend/bin/gunicorn msd.wsgi:application --name backend --bind 13.134.82.143:8030 --workers 5 --timeout 300 --user backend --group backend --log-level info --log-file /opt/backend/logs/gunicorn.log
directory = /opt/backend/backend
user = backend
group = backend
stdout_logfile = /opt/backend/logs/supervisor.log
redirect_stderr = true
Sometimes gunicorn workers time out. After that, I expect gunicorn to automatically restart the dead ones.
Strangely, under heavy load some workers fail to come back up, reporting:
Can't connect to ('13.134.82.143', 8030)
I think the workers that timed out are left as zombies and are still occupying the port.
What can I do in such cases?
As far as I understood, Flask should create one thread and then a second thread to run the app on, but what I see is that there are always two processes, not threads, running.
Even for the simplest app.
from flask import Flask
from flask import render_template, request, flash, session, redirect

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello World!'

app.run(host="192.168.21.73", port=5000, debug=True)
You can see two processes running:
ps -x
5026 ttyO0 S+ 0:01 /usr/bin/python ./test_flask.py
5031 ttyO0 Sl+ 0:45 /usr/bin/python ./test_flask.py
What is happening here?
It's because you're running the dev server with the reloader. The reloader monitors the filesystem for changes and starts the real app in a different process, so there are two total processes.
You can disable the reloader by setting debug=False or use_reloader=False when calling run.
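To see why a reloader leads to two processes, here is a rough standard-library sketch of the pattern, not the dev server's actual code. The parent acts as the watcher and starts the real work in a child process; the RELOADER_CHILD variable and the run_with_reloader function are hypothetical names used only for illustration, and the file watching and restart logic are left out.

```python
import os
import subprocess
import sys

# Code the child process runs: report its own PID and the marker it was given.
CHILD_CODE = "import os; print(os.getpid(), os.environ.get('RELOADER_CHILD'))"


def run_with_reloader():
    """Play the watcher role: launch the real work in a separate process."""
    # Hypothetical marker so the child can tell it is the "real" app process.
    env = dict(os.environ, RELOADER_CHILD="true")
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    child_pid, marker = result.stdout.split()
    return os.getpid(), int(child_pid), marker


parent_pid, child_pid, marker = run_with_reloader()
print("watcher PID:", parent_pid)
print("app PID:", child_pid)
```

The two PIDs differ, which matches the two entries in the ps output above: one watcher process and one process running the app.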
I found this zero-dependency Python WebSocket server on SO: https://gist.github.com/jkp/3136208
I am using gunicorn for my Flask app and I want to run this WebSocket server under gunicorn as well. The last few lines of the code run the server with:
if __name__ == "__main__":
    server = SocketServer.TCPServer(
        ("localhost", 9999), WebSocketsHandler)
    server.serve_forever()
I cannot figure out how to get this websocketserver.py running in gunicorn, because presumably gunicorn would have to run serve_forever() as well as construct the SocketServer.TCPServer(...).
Is this possible?
Gunicorn expects a WSGI application (PEP 333), not just any function. Your app has to accept an environ dictionary and a start_response callback and return an iterable of data (roughly speaking). All the machinery encapsulated by SocketServer.StreamRequestHandler is on Gunicorn's side. I imagine it is a lot of work to modify this gist to become a WSGI application (but that'll be fun!).
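For reference, a minimal sketch of the WSGI callable shape described above, using only the standard library. The name application and the response body are arbitrary choices for illustration; it serves plain HTTP and does not implement WebSockets.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI app: takes environ and start_response, returns an iterable of bytes."""
    body = b"Hello from a bare WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# Call the app directly, roughly the way a WSGI server would.
environ = {}
setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
seen = []


def record(status, headers, exc_info=None):
    # Stand-in for the server's start_response callback.
    seen.append((status, headers))


chunks = list(application(environ, record))
```

If this lived in a module named app.py (an assumed filename), it could be served with: gunicorn app:application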
OR, maybe this library will get the job done for you: https://github.com/CMGS/gunicorn-websocket
If you use the Flask-Sockets extension, it includes a WebSocket implementation for gunicorn directly, which makes it possible to start the app with the following command line:
gunicorn -k flask_sockets.worker app:app
Though I don't know if that's what you want to do.