I have a WSGI application (it's a Flask app, but that should be irrelevant, I think) running under a Gunicorn server at port 9077. The app has a /status endpoint, which is supposed to report 'OK' if the app is running. If it fails to report OK within a reasonable time, the whole container gets killed (by Kubernetes).
The problem is this: when the app is under very heavy load (which does happen occasionally), the /status endpoint can take a while to respond and the container sometimes gets killed prematurely. Is there a way to configure Gunicorn to always serve the /status endpoint in a separate thread? Perhaps even on a different port? I would appreciate any hints or ideas for dealing with this situation.
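For reference, a minimal sketch of such an endpoint, assuming Flask (the handler body is illustrative, not from the question):

from flask import Flask

app = Flask(__name__)

@app.route('/status')
def status():
    # Liveness probe endpoint; Kubernetes kills the container if this
    # fails to answer within its configured timeout.
    return 'OK'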
I've never worked with Gunicorn, and I'm not sure if it supports this feature.
But with uWSGI, when I know that the app is going to be under heavy load,
I run uwsgi with --processes (it can also run in multithreaded mode, or both).
uWSGI just spins up multiple instances of the Flask app and acts as a load balancer; no need for different ports, uWSGI takes care of everything.
You are no longer bound by the GIL, and your app can use all the resources available on the machine.
See the documentation about uWSGI concurrency,
and a quick tutorial on how to set up a Flask app with uWSGI and nginx (you can skip the nginx part).
Here is an example of the config file I use:
[uwsgi]
module = WSGI:app        ; load the "app" callable from WSGI.py
master = true            ; run a master process to manage the workers
processes = 16           ; number of worker processes
die-on-term = true       ; shut down cleanly on SIGTERM
socket = 0.0.0.0:8808    ; listen on all interfaces, port 8808
protocol = http          ; speak plain HTTP on that socket
uwsgi --daemonize /var/log/uwsgi.log --ini my_uwsgi_conf.ini
(Note that --daemonize takes the log file path as its argument; the path above is just an example.)
I can easily achieve 1000 calls/sec when it's running that way.
Hope that helps.
P.S. Another solution for you: just spin up more containers running your app, and put them behind nginx to load-balance.
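For the load-balancing variant, here is a minimal nginx sketch; the upstream hostnames and ports are hypothetical placeholders for your app containers:

upstream flask_app {
    server app1:8808;    # hypothetical container addresses
    server app2:8808;
}

server {
    listen 80;

    location / {
        proxy_pass http://flask_app;
        proxy_set_header Host $host;
    }
}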
Related
I have a Flask app which I'm trying to front with Gunicorn. I want to use the --preload flag since my application has some scheduled jobs using APScheduler which I want to run only in the master and not in the workers.
I also want to use Python's ThreadPoolExecutor to delegate jobs to the background, triggered by a route on my app.
When I use the --preload flag with Gunicorn, any calls to my ThreadPoolExecutor (using executor.submit) seem to fail. The same seems to happen when I programmatically trigger a job through APScheduler.
When I don't use the --preload flag, everything runs smoothly.
Is there some config I can change to get this working, or would this not work with the --preload flag?
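A minimal sketch of one possible workaround, assuming the cause is that --preload imports the module (and creates the executor) in the Gunicorn master before fork(), so the pool's worker threads never exist inside the forked workers; all names here are illustrative, not from the question:

import os
from concurrent.futures import ThreadPoolExecutor

from flask import Flask

app = Flask(__name__)

_executor = None
_executor_pid = None

def get_executor():
    # Create the pool lazily in the current process, i.e. after the fork,
    # so its worker threads actually exist in this Gunicorn worker.
    global _executor, _executor_pid
    if _executor is None or _executor_pid != os.getpid():
        _executor = ThreadPoolExecutor(max_workers=4)
        _executor_pid = os.getpid()
    return _executor

def some_background_job():
    pass  # placeholder for the real work

@app.route('/run-job')
def run_job():
    # submit() now happens in the worker process, not the preloaded master
    get_executor().submit(some_background_job)
    return 'submitted', 202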
I've started working a lot with Flask-SocketIO in Python with eventlet and am looking for a solution to handle concurrent requests/threading. I've seen that it is possible with gevent, but how can I do it if I use eventlet?
The eventlet web server supports concurrency through greenlets, same as gevent. No need for you to do anything; concurrency is always enabled.
You could use gunicorn (or a similar server) to launch the app in production mode with several workers.
As said here:
gunicorn --worker-class eventlet -w 5 module:app
Here the number after -w is the number of workers, module is your Flask-SocketIO server module, and app is the Flask app (app = flask.Flask(__name__)). Each worker is a process busy handling incoming requests, so you will have concurrency. If a task your app performs takes significant time, the worker doing it will be unresponsive while it runs.
Note: if you launch your app this way, the if __name__ == '__main__': block is skipped, because your module is imported rather than run directly. You don't need to call app.run yourself in this case.
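For illustration, a minimal sketch of what module:app refers to here, assuming Flask-SocketIO; the file name module.py and the event handler are placeholders:

# module.py
import flask
from flask_socketio import SocketIO, send

app = flask.Flask(__name__)
socketio = SocketIO(app, async_mode='eventlet')

@socketio.on('message')
def handle_message(data):
    send(data)  # echo the message back to the sender

if __name__ == '__main__':
    socketio.run(app)  # only used when run directly; ignored under gunicorn

This is what the gunicorn command above would load as module:app.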
I have been attempting to push my Flask app running Socket.IO to Heroku, but to no avail. I have narrowed it down to the Procfile. I am constantly getting 503 server errors because my app refuses connections. I tested it locally and it works just fine.
I have had a couple of versions of the Procfile, which are
web: gunicorn -b 0.0.0.0:$PORT app:userchat_manager
and
web: python userchat_manager.py
where the userchat_manager file holds the socketio.run() call that runs the app. What would be the best way to fix this?
EDIT: I changed the Procfile to
web: gunicorn -b 0.0.0.0:$PORT app:app
and it loads. However, whenever I try to send a message, it doesn't send the message and I get a 400 code.
See the Deployment section of the documentation. The gunicorn web server is only supported when used alongside eventlet or gevent, and in both cases you have to use a single worker process.
If you want to drop gunicorn and run the native web server instead, you should code your userchat_manager.py script so that it reads the port on which the server should listen from the PORT environment variable exposed by Heroku. If you go this route, I still think you should look into using eventlet or gevent; without an asynchronous framework the performance is pretty bad (no WebSocket support), and the number of clients that can be connected at the same time is very limited (just one client per worker).
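For example, a minimal sketch of such a userchat_manager.py, assuming it defines app and socketio as in the question; everything beyond reading PORT is illustrative:

import os

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)

if __name__ == '__main__':
    # Heroku assigns the port to bind via the PORT environment variable
    port = int(os.environ.get('PORT', 5000))
    socketio.run(app, host='0.0.0.0', port=port)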
Try this:
web: gunicorn --worker-class eventlet -w 1 your_module:app
You don't need a port to connect the socket; just use your Heroku app URL as the socket connection, without :PORT.
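For instance, with the python-socketio client (a sketch; the URL is a placeholder for your Heroku app):

import socketio

sio = socketio.Client()
sio.connect('https://your-app.herokuapp.com')  # no :PORT needed
sio.send('hello')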
I've got a web app developed in Flask. The setup is simple. The app is running on Gunicorn, and all requests are proxied through nginx. The Flask app itself makes HTTP requests to an external API. The HTTP requests from the Flask app to the external API are initiated by AJAX calls from the JavaScript code in the frontend. The external API returns data in JSON format to the Flask app and then back to the frontend.
The problem is that when I run this app in development mode with the option threaded=True, I can see that the JSON data get returned asynchronously to the server, and I can see the result on the frontend page very quickly.
However, when I try to run the app in production mode with nginx and gunicorn, I see that the JSON data get returned sequentially - quite slowly, one by one. It seems that for some reason the HTTP requests to the external API get blocked.
I use supervisor on Ubuntu Server 16.04. This is how I start gunicorn through supervisor:
command = /path/to/project/env/bin/gunicorn -k gevent --worker-connections 1000 wsgi:app -b localhost:8500
It seems that gunicorn does not handle the requests asynchronously, although it should.
As an experiment, I ran the Flask app using its built-in WSGI server (not gunicorn) in development mode, with debug=True and threaded=True. All requests were still proxied through nginx. The JSON data returned much quicker, i.e. asynchronously (the calls did not seem to block).
I read gunicorn's documentation. It says that if I need to make calls to external APIs, I should use async workers. I use them, but it doesn't work.
Caching has been taken into account: I cleared all caches when I checked the server setups, so I can assume no cache is involved.
What am I missing? How can I make gunicorn run as expected?
Thanks.
I actually solved this problem quite quickly and forgot to post the answer right away. The reason why the gunicorn server did not process the requests asynchronously as I expected was very simple and stupid. Since I was managing gunicorn through supervisor, after I had changed the config to:
command = /path/to/project/env/bin/gunicorn -k gevent --worker-connections 1000 wsgi:app -b localhost:8500
I forgot to run:
sudo supervisorctl reread
sudo supervisorctl update
It's simple, but not obvious. My mistake was that I expected the config to update automatically after restarting my app on gunicorn with this command:
sudo supervisorctl restart my_app
Yes, it restarts the app, but not gunicorn's config.
I have a small server on which I host the wsgi applications I write. This server does not have a lot of ram, and the applications are not frequently used and rarely more than one at once.
Is there a way to configure the server so that the applications are only started when they are needed (when I try to connect to the socket they're served on), somewhat like inetd does?
It depends on the server software you use.
If you use nginx + uWSGI, for example, you can configure the uWSGI workers to be created only on request and destroyed after a certain amount of inactivity.
http://projects.unbit.it/uwsgi/wiki/Doc
Look for the "idle", "cheap", and "cheaper" options; a sketch follows below.
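A minimal sketch of such a config, in the same ini style as the example earlier in this thread; the module name and timings are illustrative, and the option names are the ones to look up in the uWSGI docs linked above:

[uwsgi]
module = wsgi:app        ; illustrative module:callable
socket = 0.0.0.0:8808
master = true
processes = 4            ; upper bound on workers
cheap = true             ; start with no workers; spawn them on the first request
idle = 60                ; drop back to cheap mode after 60 seconds of inactivity
; die-on-idle = true     ; alternatively, exit entirely when idle
; (the "cheaper" family of options gives finer-grained adaptive scaling)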