Deploying Django (fastcgi, apache mod_wsgi, uwsgi, gunicorn) - python

Can someone explain the difference between Apache mod_wsgi in daemon mode and Django FastCGI in threaded mode? They both use threads for concurrency, I think.
Suppose that I'm using nginx as a front end to Apache mod_wsgi.
UPDATE:
I'm comparing Django's built-in FastCGI (./manage.py runfcgi method=threaded maxchildren=15) and mod_wsgi in 'daemon' mode (WSGIDaemonProcess example threads=15). They both use threads and are subject to the GIL, am I right?
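For reference, the daemon-mode side of that comparison looks roughly like this in the Apache configuration (the process group name `example` and the script path are placeholders):

```apache
# One daemon process with 15 request-handling threads --
# roughly the counterpart of the threaded runfcgi setup above.
WSGIDaemonProcess example processes=1 threads=15
WSGIProcessGroup example
WSGIScriptAlias / /path/to/mysite/wsgi.py
```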
UPDATE 2:
So if they are both similar, are there any benefits of Apache mod_wsgi over FastCGI? I see these pros of FastCGI:
we don't need Apache
we consume less RAM
I noticed that FastCGI has less overhead
UPDATE 3:
I'm now happy with nginx + uwsgi.
UPDATE 4:
I'm now happy with nginx + gunicorn :)
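For anyone landing here, a minimal gunicorn setup behind nginx can be sketched as a `gunicorn.conf.py` (the bind address is a placeholder for whatever port nginx proxies to; the worker-count rule of thumb is the commonly cited one for synchronous workers):

```python
# gunicorn.conf.py -- a minimal configuration sketch
import multiprocessing

# Bind to a local port that nginx will proxy_pass to.
bind = "127.0.0.1:8000"

# Common rule of thumb for synchronous workers: 2 * cores + 1.
workers = multiprocessing.cpu_count() * 2 + 1
```

Started with something like `gunicorn -c gunicorn.conf.py myproject.wsgi:application` (the module path `myproject.wsgi` is a placeholder for your project).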

Neither has to use threads to be able to handle concurrent requests; it depends on how you configure them. You can use multiple processes, each single-threaded, if you want.
For more background on mod_wsgi process/threading models see:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
The models are similar, albeit that mod_wsgi handles process management itself. What happens in FastCGI as far as process management goes depends on which FastCGI hosting mechanism you are using, and you don't say what that is.
Another difference is that FastCGI still needs a separate FastCGI-to-WSGI bridge such as flup, whereas mod_wsgi doesn't need any sort of bridge as it implements the WSGI interface natively.
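The flup bridge mentioned above can be sketched like this (assumes flup is installed; the socket path is a placeholder, and Django's runfcgi did essentially this internally):

```python
# A minimal WSGI application served over FastCGI via flup's bridge.
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from WSGI behind FastCGI\n']

def serve(socket_path='/tmp/app.sock'):
    # flup speaks the FastCGI protocol to the web server and
    # translates each request into a WSGI call on `application`.
    from flup.server.fcgi import WSGIServer
    WSGIServer(application, bindAddress=socket_path).run()
```

The web server (nginx, lighttpd, Apache mod_fastcgi) is then pointed at the socket; mod_wsgi skips this layer entirely.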
Finally, FastCGI processes are an exec/fork of some supervisor process or of the web server, depending on the hosting mechanism. In mod_wsgi the processes are a fork only of the Apache parent process. In general this doesn't matter too much, but it does have some implications.
There are other differences, but they arise more because mod_wsgi offers a lot more functionality and configurability than a FastCGI hosting mechanism does.
Anyway, the question is a bit vague; can you be more specific about what it is you want to know or contrast between the two, and why? The answer can then perhaps be targeted better.

Related

Openshift Python multiple httpd instances

I have a Python web application (using WSGI) deployed on OpenShift. The application is quite memory-greedy. What I have noticed is that there are several instances of the Apache httpd service running at all times. That means the memory usage of my gear is multiplied by the number of these processes and the application crashes pretty often.
I don't have lots of traffic yet, so there is no need to have multiple httpd processes running.
Is there any way to configure the Python cartridge to limit it to a single httpd process?
If you are using the OpenShift Python cartridge and its default setup, only two of those processes should actually have copies of your application running in them. The other httpd processes are the parent monitor process and the Apache child worker processes, which proxy requests to the processes that are actually running your web application.
If you need to cut that down to one process, then you would need to follow:
http://blog.dscpl.com.au/2015/01/using-alternative-wsgi-servers-with.html
to override the standard setup and use mod_wsgi-express instead. This defaults to using one process for your application and allows you to control both the number of processes and the number of threads for the application processes.
If you are seeing lots of memory use, it could just be your application code, or there is an outside chance you are seeing memory issues due to the use of an older mod_wsgi, as there are some odd corner cases which can cause extra memory usage because of how Apache works. If you use mod_wsgi-express, it will use the latest version and avoid those problems.
So try mod_wsgi-express, and if you still have memory issues, I suggest you get on the mod_wsgi mailing list to get help debugging it.
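As a rough sketch, the override in that post boils down to installing mod_wsgi from PyPI and starting the server yourself (the script name `wsgi.py` and the thread count are placeholders):

```shell
pip install mod_wsgi
mod_wsgi-express start-server wsgi.py --processes 1 --threads 5
```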

Maintaining a (singleton) process with mod_wsgi?

I have a Python web.py app with a long (minutes) start-up time that I'd like to host in Apache with mod_wsgi.
The long-term answer may be "rewrite the app." But in the short term I'd like to configure mod_wsgi to:
Use a single process to serve the app (I can do this with WSGIDaemonProcess processes=1),
and
Keep using that process without killing it off periodically
Is #2 doable? Or, are there other stopgap solutions I can use to host this app?
Thanks!
Easy. Don't restart Apache, don't set maximum-requests and don't change the code in the WSGI script file.
Are you saying that you are seeing restarts even when you leave Apache completely untouched?
And yes, it sounds like you should be re-architecting your system. A web process that takes that long to start up is crazy.
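A configuration along the lines suggested, with one long-lived daemon process and no maximum-requests directive so the process is never recycled, might look like this (the process group name and paths are placeholders):

```apache
# processes=1 keeps a single application process; omitting
# maximum-requests means mod_wsgi never recycles it.
WSGIDaemonProcess webpy-app processes=1 threads=10
WSGIProcessGroup webpy-app
WSGIScriptAlias / /var/www/app/app.wsgi
```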

Could somebody give me a high-level technical overview of WSGI details behind the scenes vs other web interface approaches with Python?

Firstly:
I understand what WSGI is and how to use it
I understand what the "other" methods (Apache mod_python, FastCGI, et al.) are, and how to use them
I understand their practical differences
What I don't understand is how each of the various "other" methods works compared to something like uWSGI, behind the scenes. Does your server (nginx, etc.) route the request to your WSGI application, and does uWSGI create a new Python interpreter for each request routed to it? How different is WSGI from the other, more traditional methods (aside from the different, easier Python interface that WSGI offers)? What light-bulb moment am I missing?
Except for CGI, a new Python interpreter is almost never created per request. Read:
http://blog.dscpl.com.au/2009/03/python-interpreter-is-not-created-for.html
This was written in respect of mod_python but also applies to mod_wsgi and any WSGI hosting mechanism that uses persistent processes.
Also read:
http://www.python.org/dev/peps/pep-0333/#environ-variables
There you will find the 'wsgi.run_once' variable described. It is used to indicate to a WSGI application that a hosting mechanism is in use where a process handles only one request and then exits, i.e., CGI. So, write a test hello-world application which dumps out the WSGI environment and see what it is set to for what you are using.
Also pay attention to the 'wsgi.multiprocess' and 'wsgi.multithread' variables. The 'wsgi.multiprocess' variable tells you if a multi-process server is being used, such that there are multiple instances of your application handling requests at the same time. The 'wsgi.multithread' variable tells you if the process itself is handling multiple requests in concurrent threads in the same process.
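The suggested test application can be as small as this sketch; deploy it under whichever hosting mechanism you are using and see what the three variables report:

```python
# A WSGI app that reports how the hosting mechanism runs it.
def application(environ, start_response):
    lines = [
        'wsgi.run_once: %s' % environ.get('wsgi.run_once'),
        'wsgi.multiprocess: %s' % environ.get('wsgi.multiprocess'),
        'wsgi.multithread: %s' % environ.get('wsgi.multithread'),
    ]
    body = '\n'.join(lines).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [body]
```

Under CGI you would expect run_once to be True; under mod_wsgi daemon mode or a persistent FastCGI process it will be False, with the other two reflecting your process/thread configuration.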
For more on multiprocess and multithread models in relation to Apache embedded systems, such as mod_python and mod_wsgi, and mod_wsgi daemon mode, see:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading

Better webserver performance for Python Django: Apache mod_wsgi or Lighttpd fastcgi

I am currently running a high-traffic Python/Django website using Apache and mod_wsgi. I'm hoping that there's a faster webserver configuration out there, and I've heard a fair number of recommendations for lighttpd and FastCGI. Is this setup faster than Apache + mod_wsgi for serving dynamic Django pages (I'm already convinced that lighttpd can serve static files better)? The benchmarks online are either poorly conducted or inconclusive, so I'm looking for some personal anecdotes. What architectural benefits does lighttpd + FastCGI provide? I understand that lighttpd uses epoll, and that a FastCGI process will be multithreaded. Also, having two separate processes, one for lighttpd and one for the Python interpreter, should be largely beneficial.
I am aware of Tornado and its ability to handle thousands of file descriptors with far fewer threads using epoll and callbacks. However, I'd prefer to stick with Django for now.
Thanks,
Ken
I suggest nginx with superfcgi for web sites with high load. nginx is very fast at serving static files. superfcgi uses multiple processes with multiple threads, which shows high stability for Python applications in spite of the GIL; just set the number of processes to the number of CPU cores on your server.
I don't have thorough benchmarks, but I'm personally convinced that, just like lighttpd can outperform apache on simpler tasks, mod_wsgi gives apache the laurel when it comes to serving Python web apps. (nginx with its own mod_wsgi seems to perform even better than apache, but, hey, you didn't ask about that!-).
Doesn't answer your question, but do you already use caching for your site, like memcached? This might give you a better performance gain than going through the mess of switching web servers.
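On the Django side, the memcached route is mostly a settings entry plus per-view caching; a sketch (the server address is a placeholder, and the exact backend name varies by Django version):

```python
# settings.py fragment: point Django's cache framework at memcached.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

# A whole view can then be cached for five minutes with the
# per-view decorator:
#   from django.views.decorators.cache import cache_page
#   @cache_page(60 * 5)
#   def my_view(request): ...
```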
You can try fcgid (https://github.com/chenyf/fcgid), a C++ FastCGI server.

In production, Apache + mod_wsgi or Nginx + mod_wsgi?

What to use for a medium to large python WSGI application, Apache + mod_wsgi or Nginx + mod_wsgi?
Which combination will need more memory and CPU time?
Which one is faster?
Which is known for being more stable than the other?
I am also thinking of using CherryPy's WSGI server, but I hear it's not very suitable for very high-load applications; what do you know about this?
Note: I didn't use any Python Web Framework, I just wrote the whole thing from scratch.
Note': Other suggestions are also welcome.
For nginx/mod_wsgi, ensure you read:
http://blog.dscpl.com.au/2009/05/blocking-requests-and-nginx-version-of.html
Because nginx is an event-driven system underneath, it has behavioural characteristics which are detrimental to blocking applications, as is the case with WSGI-based applications. The worst-case scenario is that with a multiprocess nginx configuration, you can see user requests blocked even though some nginx worker processes may be idle. Apache/mod_wsgi doesn't have this issue, as Apache processes will only accept requests when they have the resources to actually handle them. Apache/mod_wsgi will thus give more predictable and reliable behaviour.
The author of nginx mod_wsgi explains some differences to Apache mod_wsgi in this mailing list message.
The main difference is that nginx is built to handle large numbers of connections in a much smaller memory space. This makes it very well suited for apps that hold comet-like connections, which can leave many idle connections open. This also gives it quite a small memory footprint.
From a raw performance perspective, nginx is faster, but not so much faster that I would include that as a determining factor.
Apache has the advantage in the area of modules available, and the fact that it is pretty much standard. Any web host you go with will have it installed, and most techs are going to be very familiar with it.
Also, if you use mod_wsgi, it is your WSGI server, so you don't even need CherryPy.
Other than that, the best advice I can give is try setting up your app under both and do some benchmarking, since no matter what any one tells you, your mileage may vary.
One thing that CherryPy's web server has going for it is that it's a pure-Python web server (AFAIK), which may or may not make deployment easier for you. Plus, I could see the benefits of using it if you're just using a server for WSGI and static content.
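That pure-Python server can host any WSGI app in a few lines; a sketch using the CherryPy 3.x API (newer releases ship this server separately as the `cheroot` package; host and port are placeholders):

```python
# Serving a WSGI app with CherryPy's pure-Python HTTP server.
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from CherryPy\n']

def serve(host='0.0.0.0', port=8080):
    # CherryPy 3.x location of the WSGI server; later versions
    # moved it out to the standalone 'cheroot' distribution.
    from cherrypy import wsgiserver
    server = wsgiserver.CherryPyWSGIServer((host, port), application)
    try:
        server.start()
    except KeyboardInterrupt:
        server.stop()
```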
(shameless plug warning: I wrote the WSGI code that I'm about to mention)
Kamaelia will have WSGI support coming in the next release. The cool thing is that you'll likely be able to either use the pre-made one or build your own using the existing HTTP and WSGI code.
(end shameless plug)
With that said, given the current options, I would personally probably go with CherryPy because it seems to be the simplest to configure, and I can understand Python code more readily than I can understand C code.
You may do best to try each of them out and see what the pros and cons of each one are for your specific application though.
