When you use Django with mod_wsgi, what exactly happens when a user makes a request to the server from a browser? Does Apache load your Django app when it starts and keep it running in a separate process? Or does it create a new Python process for every HTTP request?
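In embedded mode, the Django app is part of the httpd worker. In daemon mode, the Django app is a separate process and the httpd worker communicates with it over a socket. In either case, the WSGI interface is the same, and no new Python process is created per request: the app stays loaded in its process and handles many requests.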
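As a rough illustration of what "the WSGI interface is the same" means, here is a minimal sketch of a WSGI callable plus a small helper that invokes it the way a server would. The names application and call_app are made up for this example; a real Django project exposes its callable from its own wsgi.py module instead.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Smallest useful WSGI callable: the server invokes it once per request."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Drive the app the way a WSGI server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a real server would set
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Whether mod_wsgi runs in embedded or daemon mode, it ends up calling a function with this shape; only the process that hosts the call differs.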
I am building a system that has several components, each running in its own process or thread, and they need to communicate with each other. One of those components is a Django application; internal communication with the Django app will not go through HTTP. While looking for networking libraries I found Twisted (awesome library!). Reading its documentation, I learned that Twisted implements the WSGI specification too, so I thought its web server could serve WSGI applications like Django. Following the docs, I came up with the following script to serve the Django app:
from twisted.web import server
from twisted.internet import reactor, endpoints
from twisted.web.wsgi import WSGIResource
from twisted.python.threadpool import ThreadPool
from mysite.wsgi import application as django_application
# Create and start a thread pool to handle incoming HTTP requests
djangoweb_threadpool = ThreadPool()
djangoweb_threadpool.start()
# Cleanup the threads when Twisted stops
reactor.addSystemEventTrigger('after', 'shutdown', djangoweb_threadpool.stop)
# Setup a twisted Service that will run the Django web app
djangoweb_request_handler = server.Site(WSGIResource(reactor, djangoweb_threadpool, django_application))
djangoweb_server = endpoints.TCP4ServerEndpoint(reactor, 8000)
djangoweb_server.listen(djangoweb_request_handler)
reactor.run()
Save it in a file such as runserver.py in the same directory as manage.py; you can then start the WSGI server by running python runserver.py.
To test it, I wrote a Django view that makes a blocking call to time.sleep(), and it worked fine. Because the server is multithreaded, the call did not block other requests, so I think it works well with synchronous Django code. I could set up another service with a custom protocol as a gateway for internal communication.
1) Does that script load the Django app properly? Will it work the same way as other WSGI servers such as gunicorn and uwsgi?
2) Will those threads run in parallel?
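The time.sleep() test described above can be reproduced outside Django with a plain thread pool. This is only a sketch: blocking_view and run_concurrently are hypothetical names, and it uses the standard library's ThreadPoolExecutor rather than Twisted's ThreadPool. It shows that blocking calls overlap across threads. CPU-bound Python code is a different matter, since the standard CPython build runs only one thread's bytecode at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_view(delay):
    """Stand-in for a view that blocks, like the time.sleep() test above."""
    time.sleep(delay)  # sleeping releases the GIL, so other threads keep running
    return delay


def run_concurrently(n_requests=4, delay=0.2):
    """Run n blocking 'requests' on a thread pool and return the elapsed time."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        results = list(pool.map(blocking_view, [delay] * n_requests))
    return time.monotonic() - start, results
```

Four 0.2 second sleeps run one after another would take about 0.8 seconds; with four worker threads the total stays close to a single sleep.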
hendrix is a project that lets you run Django via Twisted. It looks like it can also run other Twisted services if desired (https://hendrix.readthedocs.io/en/latest/deploying-other-services/).
If you're in the early stages of development, consider klein. It's more akin to Flask than Django, though.
Setting up Flask with uWSGI and Nginx can be difficult. I tried following this DigitalOcean tutorial and still had trouble. Even with buildout scripts it takes time, and I need to write instructions to follow next time.
If I don't expect a lot of traffic, or the app is private, does it make sense to run it without uWSGI? Flask can listen to a port. Can Nginx just forward requests?
Does it make sense to not use Nginx either, just running bare Flask app on a port?
When you "run Flask" you are actually running Werkzeug's development WSGI server, and passing your Flask app as the WSGI callable.
The development server is not intended for use in production. It is not designed to be particularly efficient, stable, or secure, and it does not support all the possible features of an HTTP server.
Replace the Werkzeug dev server with a production-ready WSGI server such as Gunicorn or uWSGI when moving to production, no matter where the app will be available.
The answer is similar for "should I use a web server". WSGI servers happen to include HTTP servers, but these will not be as good as a dedicated production HTTP server (Nginx, Apache, etc.).
Flask documents how to deploy in various ways. Many hosting providers also have documentation about deploying Python or Flask.
First create the app:
import flask
app = flask.Flask(__name__)
Then set up the routes, and then when you want to start the app:
import gevent.pywsgi
app_server = gevent.pywsgi.WSGIServer((host, port), app)
app_server.serve_forever()
Call this script to run the application rather than having to tell gunicorn or uWSGI to run it.
I wanted the utility of Flask to build a web application, but had trouble composing it with other elements. I eventually found that gevent.pywsgi.WSGIServer was what I needed. Because app_server.serve_forever() blocks, call app_server.stop() (for example from a signal handler or another greenlet) when you want to exit the application.
In my deployment, my application is listening on localhost:port using Flask and gevent, and then I have Nginx reverse-proxying HTTPS requests to it.
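The serve-then-stop pattern can be sketched with the standard library alone. To be clear, this swaps in wsgiref and a plain thread in place of gevent, so it is not the gevent API: here server.shutdown() plays a role similar to app_server.stop(). The names app, QuietHandler and serve_once are invented for the example.

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


def app(environ, start_response):
    """Trivial WSGI callable standing in for the Flask app."""
    body = b"ok\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging on stderr


def serve_once():
    """Start the server, make one request against it, then stop it cleanly."""
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    port = server.server_port  # port 0 asks the OS for any free port
    # serve_forever() blocks, so it runs in a background thread.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        resp = conn.getresponse()
        result = (resp.status, resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()      # makes serve_forever() return
        server.server_close()  # releases the listening socket
        thread.join(timeout=5)
```

The point is only that the blocking serve call and the stop call have to come from different places; with gevent the second place would be another greenlet or a signal handler.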
You definitely need a production WSGI server such as Gunicorn, because Flask's development server is meant for ease of development and offers little configuration for fine-tuning and optimization.
For example, Gunicorn offers a variety of configuration options depending on your use case, whereas the Flask development server does not have these capabilities. In addition, development servers show their limitations as soon as you try to scale and handle more requests.
Whether you need a reverse proxy server such as Nginx depends on your use case.
If you are deploying your application behind a modern AWS load balancer such as an Application Load Balancer (not a Classic Load Balancer), that by itself will suffice for most use cases. There is no need to put effort into setting up Nginx if you have that option.
One purpose of a reverse proxy is to handle slow clients, meaning clients that take a long time to send their request. The reverse proxy buffers each request until it has been received in full from the client and only then forwards it to Gunicorn, so workers are not tied up waiting on slow clients. This improves the performance of your application considerably.
I have a web app set up like this:
nginx <--> gunicorn <--> flask
I believe nginx can serve a lot of concurrent connections. But I heard that the path from the WSGI gateway to the Flask app is blocking, i.e. only a single request can be served at a time. I read it here. My question is: why couldn't nginx invoke another instance (not sure if this is the right term) of gunicorn and handle multiple requests in parallel?
This is simply not true. Gunicorn (and all other WSGI app servers) can (and should) be configured to use multiple threads, processes, or green threads (eventlet, gevent), depending on the specific WSGI server's concurrency model. Each thread (or green thread) in each process dispatches one request at a time to the app it is running.
Nginx does nothing to launch the first, or any subsequent, WSGI processes. You start the WSGI server, configured correctly, and it handles the concurrency. Nginx dispatches requests as concurrently as it can to whatever app it is configured to proxy to.
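For Gunicorn specifically, the concurrency settings can live in a Python config file. The sketch below is one possible starting point, not a recommendation for any particular workload: the file name, bind address and numbers are assumptions, although bind, workers, worker_class, threads and timeout are real Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name and values; tune for your workload)
import multiprocessing

# Listen only on localhost so that Nginx is the public entry point.
bind = "127.0.0.1:8000"

# Several worker processes; (2 x cores) + 1 is the starting point
# suggested in the Gunicorn documentation.
workers = multiprocessing.cpu_count() * 2 + 1

# Threaded workers: each process handles up to `threads` requests at once.
worker_class = "gthread"
threads = 4

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30
```

It would be started with something like gunicorn -c gunicorn.conf.py myapp:app, where myapp:app is a placeholder for your own module and Flask object.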
My Django application is running on apache+wsgi. One of the modules in my Django app needs to load a Java library via jpype, and this Java library takes a long time to initialize due to the nature of the application.
The problem is that, for each HTTP request handled by Django in the apache+wsgi setup, this Java library is reloaded. However, this does not happen when I run the same app in the development web server (python manage.py runserver 8000). The development web server loads the Java library only once.
Is there any way to change the apache or mod_wsgi configuration, or my Django app, so that it won't reload my Java library for every HTTP request?
Many thanks.
Andy
You are possibly just getting confused and are actually using a poor Apache/mod_wsgi configuration. Specifically, you are likely using embedded mode with the Apache prefork MPM. That is bad because Apache will use lots of single-threaded processes, so the code has to be loaded in every one of them. That is probably why you think the library is being reloaded on each request against the same process, whereas in reality each request is hitting a different process.
Ensure you are using mod_wsgi daemon mode and that your code is thread safe, then use a single multithreaded process, and you shouldn't have the issue.
Edit your question to add the mod_wsgi configuration snippets from your Apache configuration file and state which Apache MPM you are using.