Apache reverse proxy performance issues - python

I have a Python web app that is called through Apache with mod_wsgi on a server behind our firewall, and we've configured a public-facing web server to act as a reverse proxy to access the web app. I've timed access to a reference page returned by the app via the following routes:
from the machine it is running on (localhost): ~300ms
from my workstation over VPN directly to the server: ~500ms
through the reverse proxy: ~2000ms!
One caveat is that the public-facing proxy is using HTTPS and the others are not, but I find it hard to believe that's causing a 1.5s hit. I'm not really a configuration wizard, so what should I look at to figure out what's causing this poor performance?
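One way to narrow this down is to split a single request into phases and compare the numbers across the three routes. The following is a minimal sketch using only the standard library; the function name `time_phases` and the phase names are made up for illustration, and the URLs you pass in are your own. Note that the "connect" phase includes the TLS handshake for https URLs, so a large gap there on the proxy route would point at HTTPS setup, while a large "first_byte" gap would point at the proxy-to-backend hop.

```python
import http.client
import socket
import time
from urllib.parse import urlsplit


def time_phases(url, timeout=10.0):
    """Return the seconds spent in each phase of one GET request to `url`."""
    parts = urlsplit(url)
    is_https = parts.scheme == "https"
    host = parts.hostname
    port = parts.port or (443 if is_https else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    timings = {}

    # Phase 1: name resolution on its own.
    start = time.perf_counter()
    socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    timings["dns"] = time.perf_counter() - start

    conn_cls = http.client.HTTPSConnection if is_https else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        # Phase 2: TCP connect, plus the TLS handshake for https.
        # (The name is resolved again here; the lookup is usually cached.)
        start = time.perf_counter()
        conn.connect()
        timings["connect"] = time.perf_counter() - start

        # Phase 3: send the request and wait for the status line and headers.
        start = time.perf_counter()
        conn.request("GET", path)
        response = conn.getresponse()
        timings["first_byte"] = time.perf_counter() - start

        # Phase 4: download the response body.
        start = time.perf_counter()
        response.read()
        timings["body"] = time.perf_counter() - start
    finally:
        conn.close()
    return timings
```

Calling `time_phases(...)` several times against the localhost URL, the VPN URL, and the public proxy URL and comparing the dictionaries should show which phase accounts for the extra time.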

Related

Do I need Nginx with Gunicorn if I am not serving any static content?

In a typical Python server setup, it is recommended to have the Nginx web server serve the static content and proxy the dynamic requests to the Gunicorn app server.
If I am not serving any static content through my Python application, do I still need Nginx in front of Gunicorn? What would be the advantages?
A detailed explanation would be really appreciated.
All the static content is served through a CDN, and the backend server will only need to serve the REST APIs. So when I only serve dynamic content, do I need Nginx? Does it have any advantage under high load, for example?
It is recommended in Gunicorn docs to run it behind a proxy server.
Technically, you don't really need Nginx.
BUT it's the Internet: your server will receive plenty of malformed HTTP requests which are made by bots and vulnerability scanner scripts. Now, your Gunicorn process will be busy parsing and dealing with these requests instead of serving genuine clients.
With Nginx in front, it will terminate these requests without forwarding to your Gunicorn backend.
Most of these bots make requests to your IP address instead of your domain name. So it's really easy to configure Nginx to ignore requests made to the IP address and only serve requests made to your domain. This is far more secure and faster than relying on Django's ALLOWED_HOSTS setting.
Also, it's much easier to find resources for Nginx about protecting your server, like blacklisting rogue IP addresses or user agents, etc. Compare these two google searches: nginx ban ip vs gunicorn ban ip. You can see Nginx search has more resources.
If you're worried about performance, then rest assured Nginx will not be the bottleneck. If you really want to optimise performance, database querying will be the first place to start.
No, I no longer deploy Nginx specifically for the Python app. I may have an application load balancer or Nginx in the path to split requests to other apps, but not for load management. If using asyncio-based systems, I typically don't even use an app server (uWSGI/Gunicorn). This includes apps with very high throughput. Every layer of reverse proxying or layer-7 load balancing will add a touch of latency, so don't add it if you don't need it.
Even if you don't use Nginx for serving static assets, putting Gunicorn behind a proxy server is the recommended setup.
For example, putting Gunicorn behind a proxy lets you add some back pressure to your system, which helps protect you from attacks such as Slowloris.

Understanding each component of a web application architecture

Here is a scenario for a system where I am trying to understand what is what:
I'm Joe, a novice programmer and I'm broke. I've got a Flask app and one physical machine. Since I'm broke, I cannot afford another machine for each piece of my system, thus the web server, application and database all live on my one machine.
I've never deployed an app before, but I know that a server can refer to a machine or software. From here on, let's call the physical machine the Rack. I've loaded an instance of MongoDB on my machine and I know that is the Database Server. In order to handle API requests, I need something on the rack that will handle HTTP/S requests, so I install and run an instance of NGINX on it and I know that this is the Web Server. However, my web server doesn't know how to run the app, so I do some research, learn about WSGI, and find out I need another component. So I install and run an instance of Gunicorn and I know that this is the WSGI Server.
At this point I have a rack that is home to a web server to handle API calls (really just acts as a reverse proxy and pushes requests to the WSGI server), a WSGI server that serves up dynamic content from my app and a database server that stores client information used by the app.
I think I've got my head on straight, then my friend asks "Where is your Application Server?"
Is there an application server in this configuration? Do I need one?
Any basic server architecture has three layers. On one end is the web server, which fulfills requests from clients. The other end is the database server, where the data resides.
In between these two is the application server. It consists of the business logic required to interact with the web server to receive the request, and then with the database server to perform operations.
In your configuration, the WSGI server together with the Flask app is the application server.
Most application servers can double up as web servers.

Are a WSGI server and HTTP server required to serve a Flask app?

Setting up Flask with uWSGI and Nginx can be difficult. I tried following this DigitalOcean tutorial and still had trouble. Even with buildout scripts it takes time, and I need to write instructions to follow next time.
If I don't expect a lot of traffic, or the app is private, does it make sense to run it without uWSGI? Flask can listen to a port. Can Nginx just forward requests?
Does it make sense to not use Nginx either, and just run a bare Flask app on a port?
When you "run Flask" you are actually running Werkzeug's development WSGI server, and passing your Flask app as the WSGI callable.
The development server is not intended for use in production. It is not designed to be particularly efficient, stable, or secure. It does not support all the possible features of an HTTP server.
Replace the Werkzeug dev server with a production-ready WSGI server such as Gunicorn or uWSGI when moving to production, no matter where the app will be available.
The answer is similar for "should I use a web server". WSGI servers happen to have HTTP servers but they will not be as good as a dedicated production HTTP server (Nginx, Apache, etc.).
Flask documents how to deploy in various ways. Many hosting providers also have documentation about deploying Python or Flask.
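To make the "WSGI callable" idea above concrete, here is a minimal sketch using only the standard library. The plain function `app` stands in for a Flask application object (which is callable with the same two arguments), and the helper `call_wsgi` is a made-up name that plays the role of the server: it builds the `environ` dictionary, supplies `start_response`, and collects the body. Any WSGI server, whether the Werkzeug development server, Gunicorn, or uWSGI, does essentially this for every request, which is why the server can be swapped without touching the app.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A bare WSGI callable; it stands in for a Flask app in this sketch."""
    body = b"Hello from a WSGI callable\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_wsgi(application, path="/"):
    """Invoke a WSGI callable the way a server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys WSGI requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = application(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        # WSGI asks servers to call close() on the result if it exists.
        close = getattr(chunks, "close", None)
        if close is not None:
            close()
    return captured["status"], captured["headers"], body
```

For example, `call_wsgi(app)` returns the status string, the header list, and the response bytes produced by `app`.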
First create the app:
import flask
app = flask.Flask(__name__)
Then set up the routes, and then when you want to start the app:
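import gevent.pywsgi
host, port = "127.0.0.1", 5000  # placeholder values; use your own address and port
app_server = gevent.pywsgi.WSGIServer((host, port), app)
app_server.serve_forever()  # blocks until the server is stopped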
Call this script to run the application rather than having to tell gunicorn or uWSGI to run it.
I wanted the utility of Flask to build a web application, but had trouble composing it with other elements. I eventually found that gevent.pywsgi.WSGIServer was what I needed. Because app_server.serve_forever() blocks until the server is stopped, call app_server.stop() from elsewhere (for example, a signal handler or another greenlet) when you want to exit the application.
In my deployment, my application is listening on localhost:port using Flask and gevent, and then I have Nginx reverse-proxying HTTPS requests to it.
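The start/stop lifecycle described above can be sketched without any third-party packages. This is not the gevent code from the answer: the standard library's `wsgiref` server stands in for `gevent.pywsgi.WSGIServer` so the example runs anywhere, with `shutdown()` playing the role that `stop()` plays in gevent. The names `app`, `QuietHandler`, and `run_once` are made up for illustration. The point is only the shape: `serve_forever()` blocks, so the stop call has to come from somewhere else.

```python
import http.client
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


def app(environ, start_response):
    """Tiny WSGI app standing in for the Flask app."""
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


class QuietHandler(WSGIRequestHandler):
    """Suppress the per-request log line wsgiref prints to stderr."""

    def log_message(self, format, *args):
        pass


def run_once():
    """Start the server, make one request, then stop the server cleanly."""
    # Port 0 asks the OS for any free port.
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    port = server.server_address[1]

    # serve_forever() blocks, so it runs in its own thread.
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        body = conn.getresponse().read()
        conn.close()
        return body
    finally:
        server.shutdown()      # makes serve_forever() return
        thread.join()
        server.server_close()  # release the listening socket
```

Like the Werkzeug development server, `wsgiref.simple_server` is meant for development and demonstration, not production traffic.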
You definitely need a production WSGI server such as Gunicorn, because Flask's development server is meant for ease of development and offers little configuration for fine-tuning and optimization.
For example, Gunicorn has a variety of configuration options depending on the use case you are trying to solve. The Flask development server does not have these capabilities. In addition, development servers show their limitations as soon as you try to scale and handle more requests.
Whether you need a reverse proxy server such as Nginx depends on your use case.
If you are deploying your application behind a current AWS load balancer such as an Application Load Balancer (not a Classic Load Balancer), that alone will suffice for most use cases. There is no need to put effort into setting up Nginx if you have that option.
One purpose of a reverse proxy is to handle slow clients, meaning clients that take a long time to send their request. The proxy buffers each request until it has been received in full from the client and only then forwards it to Gunicorn, so Gunicorn workers are not tied up waiting on slow connections. This can improve the performance of your application considerably.

Is there any thing needed for https python web page

I have been using Python to create a web app and it has been going well so far. Now I would like to encrypt the data transmitted between client and server using HTTPS. The traffic is generally just form posts and web pages; no money transactions are involved. Is there anything I need to change in the Python code, apart from setting up the server with a certificate and configuring it to use HTTPS? I see a lot of information about SSL for Python and I am not sure whether I need those modules and that Python setup to make HTTPS work.
Thanks
Typically, the SSL part of a Python web app is handled by a frontend web server such as Nginx or Apache.
This does not require any modification of your code (assuming you do not expect users to authenticate with a client-side SSL certificate, which is a rather exotic but possible scenario).
If you want a pure Python solution, I would recommend CherryPy, which provides a fairly reliable and performant web server (though it will very likely be slower than serving behind Nginx or Apache).
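As a rough illustration of why the application code itself does not change, here is a minimal pure-Python sketch. It uses the standard library's `wsgiref` server rather than CherryPy, purely so that it needs no extra packages, and the names `app` and `make_https_server` as well as the certificate paths are assumptions for the example. The WSGI app is identical with or without HTTPS; only the listening socket gets wrapped with TLS.

```python
import ssl
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """The application is unaware of whether TLS is in use."""
    body = b"same app, with or without TLS\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def make_https_server(application, host, port, certfile, keyfile):
    """Build a development HTTPS server around an unchanged WSGI app."""
    # Load the certificate first so a bad path fails before a port is bound.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)

    httpd = make_server(host, port, application)
    # Wrap the listening socket; accepted connections then speak TLS.
    httpd.socket = context.wrap_socket(httpd.socket, server_side=True)
    return httpd
```

With a real certificate and key, something like `make_https_server(app, "0.0.0.0", 8443, "cert.pem", "key.pem").serve_forever()` would serve the app over HTTPS (the port and file names here are placeholders). `wsgiref` is a development server, so for anything public the frontend-server approach described above remains the safer choice.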
