Reducing Django Memory Usage. Low-hanging fruit?

My memory usage increases over time and restarting Django is not kind to users.
I am unsure how to go about profiling the memory usage but some tips on how to start measuring would be useful.
I have a feeling that there are some simple steps that could produce big gains. Ensuring 'debug' is set to 'False' is an obvious biggie.
Can anyone suggest others? How much improvement would caching give on low-traffic sites?
In this case I'm running under Apache 2.x with mod_python. I've heard mod_wsgi is a bit leaner but it would be tricky to switch at this stage unless I know the gains would be significant.
Edit: Thanks for the tips so far. Any suggestions how to discover what's using up the memory? Are there any guides to Python memory profiling?
Also as mentioned there's a few things that will make it tricky to switch to mod_wsgi so I'd like to have some idea of the gains I could expect before ploughing forwards in that direction.
Edit: Carl posted a slightly more detailed reply here that is worth reading: Django Deployment: Cutting Apache's Overhead
Edit: Graham Dumpleton's article is the best I've found on the MPM and mod_wsgi related stuff. I am rather disappointed that no-one could provide any info on debugging the memory usage in the app itself though.
Final Edit: Well I have been discussing this with Webfaction to see if they could assist with recompiling Apache and this is their word on the matter:
"I really don't think that you will get much of a benefit by switching to an MPM Worker + mod_wsgi setup. I estimate that you might be able to save around 20MB, but probably not much more than that."
So! This brings me back to my original question (which I am still none the wiser about). How does one go about identifying where the problem lies? It's a well-known maxim that you don't optimize without testing to see where you need to optimize, but there is very little in the way of tutorials on measuring Python memory usage and none at all specific to Django.
Thanks for everyone's assistance but I think this question is still open!
Another final edit ;-)
I asked this on the django-users list and got some very helpful replies
Honestly the last update ever!
This was just released. Could be the best solution yet: Profiling Django object size and memory usage with Pympler

Make sure you are not keeping global references to data. That prevents the python garbage collector from releasing the memory.
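A minimal sketch of the pattern to watch for (the names are illustrative, not from the question):

# Anti-pattern: a module-level cache that only grows; the garbage collector
# can never free objects that are still referenced here.
_report_cache = {}

def build_report(key):
    return 'report for %s' % key   # stand-in for real, expensive work

def get_report(key):
    if key not in _report_cache:
        _report_cache[key] = build_report(key)
    return _report_cache[key]

# Safer: cap the cache so old entries can actually be collected.
MAX_ENTRIES = 100

def get_report_bounded(key):
    if key not in _report_cache:
        if len(_report_cache) >= MAX_ENTRIES:
            _report_cache.clear()          # crude, but keeps memory flat
        _report_cache[key] = build_report(key)
    return _report_cache[key]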
Don't use mod_python. It loads an interpreter inside apache. If you need to use apache, use mod_wsgi instead. It is not tricky to switch. It is very easy. mod_wsgi is way easier to configure for django than brain-dead mod_python.
If you can remove apache from your requirements, that would be even better for your memory usage. Spawning seems to be the new fast, scalable way to run Python web applications.
EDIT: I don't see how switching to mod_wsgi could be "tricky". It should be a very easy task. Please elaborate on the problem you are having with the switch.

If you are running under mod_wsgi (and presumably Spawning, since it is WSGI compliant), you can use Dozer to look at your memory usage.
Under mod_wsgi just add this at the bottom of your WSGI script:
from dozer import Dozer

# Wrap the existing WSGI application; Dozer serves its memory report alongside it.
application = Dozer(application)
Then point your browser at http://domain/_dozer/index to see a list of all your memory allocations.
I'll also just add my voice of support for mod_wsgi. It makes a world of difference in terms of performance and memory usage over mod_python. Graham Dumpleton's support for mod_wsgi is outstanding, both in terms of active development and in helping people on the mailing list to optimize their installations. David Cramer at curse.com has posted some charts (which I can't seem to find now unfortunately) showing the drastic reduction in cpu and memory usage after they switched to mod_wsgi on that high traffic site. Several of the django devs have switched. Seriously, it's a no-brainer :)

These are the Python memory profiler solutions I'm aware of (not Django related):
Heapy
pysizer (discontinued)
Python Memory Validator (commercial)
Pympler
Disclaimer: I have a stake in the latter.
Each project's documentation should give you an idea of how to use these tools to analyze the memory behavior of Python applications.
The following is a nice "war story" that also gives some helpful pointers:
Reducing the footprint of python applications
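For example, here is a minimal Pympler sketch using its muppy/summary/tracker API; where you call it from (a suspect view, a management command) is up to you:

from pympler import muppy, summary, tracker

# One-off snapshot of everything currently alive in the interpreter.
all_objects = muppy.get_objects()
summary.print_(summary.summarize(all_objects))

# Incremental view: print only what changed since the previous call,
# e.g. at the end of a request you suspect of leaking.
tr = tracker.SummaryTracker()
tr.print_diff()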

Additionally, check that you are not using any known leakers. MySQLdb is known to leak enormous amounts of memory with Django due to a bug in unicode handling. Other than that, the Django Debug Toolbar might help you track down the hogs.

In addition to not keeping around global references to large data objects, try to avoid loading large datasets into memory at all wherever possible.
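For example (Report and process are placeholder names), Django's QuerySet.iterator() streams rows from the database instead of materialising and caching the whole result set:

from myapp.models import Report   # placeholder model

# Loads and caches every row on the queryset before the loop starts.
for report in Report.objects.all():
    process(report)

# Streams rows from the cursor without populating the queryset cache,
# which keeps the peak memory footprint much smaller for large tables.
for report in Report.objects.all().iterator():
    process(report)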
Switch to mod_wsgi in daemon mode, and use Apache's worker mpm instead of prefork. This latter step can allow you to serve many more concurrent users with much less memory overhead.

Webfaction actually has some tips for keeping django memory usage down.
The major points:
Make sure debug is set to false (you already know that).
Use "ServerLimit" in your apache config
Check that no big objects are being loaded in memory
Consider serving static content in a separate process or server.
Use "MaxRequestsPerChild" in your apache config
Find out and understand how much memory you're using
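A sketch of what the ServerLimit / MaxRequestsPerChild tips can look like under Apache's prefork MPM; the numbers are illustrative and need tuning against your own memory budget:

<IfModule mpm_prefork_module>
    StartServers            2
    MinSpareServers         2
    MaxSpareServers         4
    ServerLimit             8      # hard cap on the number of Apache children
    MaxClients              8
    MaxRequestsPerChild   500      # recycle children so slow leaks cannot accumulate
</IfModule>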

Another plus for mod_wsgi: set a maximum-requests parameter in your WSGIDaemonProcess directive and mod_wsgi will restart the daemon process every so often. There should be no visible effect for the user, other than a slow page load the first time a fresh process is hit, as it'll be loading Django and your application code into memory.
But even if you do have memory leaks, that should keep the process size from getting too large, without having to interrupt service to your users.
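A hedged sketch of the relevant directives; the process group name, counts, and path are placeholders:

# Recycle each daemon process after 1000 requests; users only see a slightly
# slower first hit against the freshly started process.
WSGIDaemonProcess mysite processes=2 threads=15 maximum-requests=1000 display-name=%{GROUP}
WSGIProcessGroup mysite
WSGIScriptAlias / /path/to/my/wsgi.py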

Here is the script I use for mod_wsgi (called wsgi.py, and put in the root of my django project):
import os
import sys
from os import path

# mod_wsgi blocks writes to stdout by default, so stray prints are silenced here;
# use the logging module for anything you actually want to keep.
sys.stdout = open('/dev/null', 'a+')
sys.stderr = open('/dev/null', 'a+')

# Make the parent directory of this file importable so 'myproject' resolves.
sys.path.append(path.join(path.dirname(__file__), '..'))
os.environ['DJANGO_SETTINGS_MODULE'] = 'myproject.settings'

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
Adjust myproject.settings and the path as needed. I redirect all output to /dev/null since mod_wsgi by default prevents printing. Use logging instead.
For apache:
<VirtualHost *>
ServerName myhost.com
ErrorLog /var/log/apache2/error-myhost.log
CustomLog /var/log/apache2/access-myhost.log common
DocumentRoot "/var/www"
WSGIScriptAlias / /path/to/my/wsgi.py
</VirtualHost>
Hopefully this should at least help you set up mod_wsgi so you can see if it makes a difference.

Caches: make sure they're being flushed. It's easy for something to land in a cache but never be GC'd because of the cache reference (see the sketch after this list).
Swig'd code: make sure any memory management is being done correctly; it's really easy to miss this in Python, especially with third-party libraries.
Monitoring: If you can, get data about memory usage and hits. Usually you'll see a correlation between a certain type of request and memory usage.
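On the first point, a minimal sketch with Django's cache API: give entries an explicit timeout instead of caching them forever (the key name, timeout, and build_report() are illustrative):

from django.core.cache import cache

def build_report():
    return 'expensive result'      # stand-in for the real work

# With an explicit timeout the entry is dropped after 10 minutes, so the
# cached object does not pin memory indefinitely.
report = cache.get('dashboard:latest-report')
if report is None:
    report = build_report()
    cache.set('dashboard:latest-report', report, timeout=600)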

We stumbled over a bug in Django with big sitemaps (10,000 items). It seems Django tries to load them all into memory when generating the sitemap: http://code.djangoproject.com/ticket/11572 - this effectively kills the Apache process when Google pays the site a visit.
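One mitigation to consider (a hedged sketch, not the fix from the ticket): lower Sitemap.limit and serve the sitemap through the paginated sitemap/index views, so each request only renders one page of items. Article is a placeholder model:

from django.contrib.sitemaps import Sitemap
from myapp.models import Article   # placeholder app/model

class ArticleSitemap(Sitemap):
    limit = 1000                   # items per sitemap page instead of the default 50000

    def items(self):
        return Article.objects.all()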

Related

Django memory leak: possible causes?

I have a Django application that every so often runs into a memory leak.
I am not using large data that could overload the memory; in fact the application 'eats' memory incrementally (in a week it goes from ~70 MB to 4 GB), which is why I suspect the garbage collector is missing something, though I am not sure. Also, it seems this growth does not depend on the number of requests.
Obvious things like DEBUG=True, leaving files open, etc. do not apply here.
I'm using uWSGI 2.0.3 (+ nginx) and Django 1.4.5
I could set up uWSGI to restart the workers when memory exceeds a certain limit, but I wouldn't like to do that since it is not really a solution.
Are there any well-known situations where the garbage collector "doesn't do its work properly"? Could anyone provide some code examples?
Is there any configuration of uWSGI + Django that could cause this?
I haven't found exactly what I was looking for (each project is a world!), but following some clues and advice I managed to solve the issue. I'll share a few links that could help if you are facing a similar problem.
django memory leaks, part I, django memory leaks, part II and Finding and fixing memory leaks in Python
Some useful SO answers/questions:
Which Python memory profiler is recommended?, Is there any working memory profiler for Python3, Python memory leaks and Python: Memory leak debugging
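As a rough starting point that needs nothing beyond the standard library, the sketch below counts live objects by type before and after exercising the suspect code and prints what grew:

import gc
from collections import Counter

def type_counts():
    gc.collect()   # drop anything that is already unreachable
    return Counter(type(o).__name__ for o in gc.get_objects())

before = type_counts()
# ... exercise the suspect views or tasks here ...
after = type_counts()

# Show the ten types whose instance count grew the most.
for name, growth in (after - before).most_common(10):
    print(name, growth)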
Update
pyuwsgimemhog is a new tool that helps to find out where the leak is.
Django doesn't have known memory leak issues.
I had a similar memory issue. I found a slow SQL query causing a high DB CPU percentage. The memory issue was fixed after I fixed the slow query.

Tricks to improve performance of python backend

I am using Python programs for nearly everything:
deploy scripts
nagios routines
website backend (web2py)
The reason I am doing this is that I can reuse the code to provide different kinds of services.
Since a while ago I have noticed that those scripts are putting a high CPU load on my servers. I have taken several steps to mitigate this:
late initialization, using cached_property (see here and here), so that only the objects that are actually needed get initialized (including imports of the related modules)
turning some of my scripts into http services (with a simple web.py implementation, wrapping-up my classes). The services are then triggered (by nagios for example), with simple curl calls.
This has reduced the load dramatically, going from a CPU load of over 20 to well under 1. It seems Python startup is very resource-intensive for complex programs with lots of interdependencies.
I would like to know what other strategies are people here implementing to improve the performance of python software.
An easy one-off improvement is to use PyPy instead of the standard CPython for long-lived scripts and daemons (for short-lived scripts it's unlikely to help and may actually have longer startup times). Other than that, it sounds like you've already hit upon one of the biggest improvements for short-lived system scripts, which is to avoid the overhead of starting the Python interpreter for frequently-invoked scripts.
For example, if you invoke one script from another and they're both in Python you should definitely consider importing the other script as a module and calling its functions directly, as opposed to using subprocess or similar.
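A sketch of the difference, with hypothetical module and function names ('disk_check', 'check') standing in for one of the scripts mentioned above:

import subprocess

# Costly pattern: every invocation pays interpreter startup plus all imports.
subprocess.call(['python', 'disk_check.py', '/var'])

# Cheaper pattern: import once, then call the function directly in-process.
import disk_check
status, message = disk_check.check('/var')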
I appreciate that it's not always possible to do this, since some use-cases rely on external scripts being invoked - Nagios checks, for example, are going to be tricky to keep resident at all times. Your approach of making the actual check script a simple HTTP request seems reasonable enough, but the approach I took was to use passive checks and run an external service to periodically update the status. This allows the service generating check results to be resident as a daemon rather than requiring Nagios to invoke a script for each check.
Also, watch your system to see whether the slowness really is CPU overload or IO issues. You can use utilities like vmstat to watch your IO usage. If you're IO bound then optimising your code won't necessarily help a lot. In this case, if you're doing something like processing lots of text files (e.g. log files) then you can store them gzipped and access them directly using Python's gzip module. This increases CPU load but reduces IO load because you only need to transfer the compressed data from disk. You can also write output files directly in gzipped format using the same approach.
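For the gzipped-log idea, a minimal sketch (paths and the filter condition are placeholders):

import gzip

# Read a compressed log directly: more CPU for decompression, far less disk IO.
with gzip.open('/var/log/myapp/access.log.gz', 'rt') as log:
    error_lines = [line for line in log if ' 500 ' in line]

# Write the results back out in gzipped form the same way.
with gzip.open('/tmp/errors.log.gz', 'wt') as out:
    out.writelines(error_lines)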
I'm afraid I'm not particularly familiar with web2py specifically, but you can investigate whether it's easy to put a caching layer in front if the freshness of the data isn't totally critical. Try and make sure both your server and clients use conditional requests correctly, which will reduce request processing time. If they're using a back-end database, you could investigate whether something like memcached will help. These measures are only likely to give you real benefit if you're experiencing a reasonably high volume of requests or if each request is expensive to handle.
I should also add that generally reducing system load in other ways can occasionally give surprising benefits. I used to have a relatively small server running Apache and I found moving to nginx helped a surprising amount - I believe it was partly more efficient request handling, but primarily it freed up some memory that the filesystem cache could then use to further boost IO-bound operations.
Finally, if overhead is still a problem then carefully profile your most expensive scripts and optimise the hotspots. This could mean improving your Python code, or it could mean pushing code out to C extensions if that's an option for you. I've had some great performance gains by pushing data-path code out into C extensions for large-scale log processing and similar tasks (talking about hundreds of GB of logs at a time). However, this is a heavy-duty and time-consuming approach and should be reserved for the few places where you really need the speed boost. It also depends on whether you have someone available who's familiar enough with C to do it.

Using Django Caching with Mod_WSGI

Is anyone aware of any issues with Django's caching framework when deployed to Apache/Mod_WSGI?
When testing with the caching framework locally with the dev server, using the profiling middleware and either the FileBasedCache or LocMemCache, Django's very fast. My request time goes from ~0.125 sec to ~0.001 sec. Fantastic.
I deploy the identical code to a remote machine running Apache/Mod_WSGI and my request time goes from ~0.155 sec (before I deployed the change) to ~0.400 sec (post deployment). That's right, caching slowed everything down.
I've spent hours digging through everything, looking for something I'm missing. I've tried using FileBasedCache with a location on tmpfs, but that also failed to improve performance.
I've monitored the remote machine with top, and it shows no other processes and it has 6GB available memory, so basically Django should have full rein. I love Django, but it's incredibly slow, and so far I've never been able to get the caching framework to make any noticeable impact in a production environment. Is there anything I'm missing?
EDIT: I've also tried memcached, with the same result. I confirmed memcached was running by telneting into it.
Indeed Django is slow. But I must say most of the slowness comes from the app itself; Django just encourages you (by providing bad examples in the docs) to do lazy things that are slow in production.
First of all: try nginx + uWSGI. It is just the best.
To optimize your app, you need to find out what is causing the slowness. It can be:
slow database queries (a lot of queries or just slow queries)
slow database itself
slow filesystem (nfs for example)
Try logging request queries and watch iostat or iotop or something like that.
I had this scenario with apache+mod_wsgi: the first request from a browser was very slow... then a few requests from the same browser were fast... then, after sitting idle for 2 minutes, it was again very slow. I don't know if that was an improperly configured Apache shutting down the wsgi app and starting it again for each keepalive request. It just put me off - I installed nginx, and with nginx+fcgi everything was a lot faster than apache+mod_wsgi.
I had a similar problem with an app using memcached. The solution was running mod_wsgi in daemon mode instead of embedded mode, and Apache in mpm_worker mode. After that, the application worked much faster.
The same thing happened to me, and I was wondering what was taking so much time.
Each cache get was taking around 100 milliseconds.
So I debugged Django's locmem cache code and found out that pickling was taking a lot of time (I was caching a whole table in the locmem cache).
I wrapped locmem myself since I didn't want anything advanced; if you take the pickle and unpickle step out, you will see a major improvement.
Hope it helps someone.
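A rough sketch of that wrapping idea (not the poster's actual code): a tiny per-process cache that stores object references directly and skips pickle entirely. Only use it for data you treat as read-only, and remember it is per-process, so every worker keeps its own copy:

import threading
import time

_lock = threading.Lock()
_store = {}   # key -> (expires_at, value)

def cache_set(key, value, timeout=300):
    with _lock:
        _store[key] = (time.time() + timeout, value)

def cache_get(key):
    with _lock:
        entry = _store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at < time.time():
            del _store[key]
            return None
        return value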

Tuning mod_wsgi in daemon mode

I'm running a WSGI application on Apache mod_wsgi in daemon mode.
I have these lines in the configuration
WSGIDaemonProcess app processes=2 threads=3 display-name=%{GROUP}
WSGIProcessGroup app
How do I find the optimal combination/tuning of processes and threads?
EDIT:
This link [given in the answer below] was quite useful:
https://serverfault.com/questions/145617/apache-2-2-mpm-worker-more-threads-or-more-processes/146382#146382
Now, my question is this: If my server gives quite good performance for my needs, should I reduce the number of threads to increase stability / reliability? Can I even set it to 1?
You might get more information on ServerFault as well. For example: https://serverfault.com/questions/145617/apache-2-2-mpm-worker-more-threads-or-more-processes
This is another good resource for the topic: http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading#The_mod_wsgi_Daemon_Processes
which briefly describes the options -- including setting threads = 1.
I haven't done this yet but it sounds like it doesn't much matter. Multiple threads as well as multiple processes are both well supported. But for my experience level (and probably yours) it's worthwhile to eliminate threading as an extra source of concern -- even if it is theoretically rock solid.
Your best bet is probably to try different benchmarks. You can use the ApacheBench (ab) command to get a rough estimate of how your configuration is doing. A lot of the tweaking will depend on how CPU/IO bound your web app is. The performance will also depend on the specs of the server you are hosting on, etc.

Stack recommendations for small/medium-sized web application in Python

I'm looking for some recommendations for a python web application. We have some memory restrictions and we try to keep it small and lean.
We thought about using WSGI (and a python webserver) and building the rest ourselves. We already have a template engine we'd like to use, but we are open to suggestions regarding the whole request handling (the controller).
The application has to run in a single process and the requests have to be processed with multiple threads.
We've looked at Django, but we are not sure if it fits into our memory budget.
Your feedback is very welcome!
Cheers,
Reto
I've been using Werkzeug because it's more a small collection of really useful components than a whole framework. It runs behind a wsgi server of your choice (and comes with a built-in one). If you want something even easier, Flask might be worth a look. Also, you might want to bookmark the rather speedy Jinja in case your template engine doesn't pan out. Those folks over at pocoo.org have been releasing some nice stuff.
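For a sense of scale, a complete Werkzeug application can be this small (host and port are placeholders):

from werkzeug.wrappers import Request, Response

@Request.application
def application(request):
    # A plain WSGI callable; mount it under any WSGI server you like.
    return Response('Hello from a minimal Werkzeug app')

if __name__ == '__main__':
    # Built-in development server; use a proper WSGI server in production.
    from werkzeug.serving import run_simple
    run_simple('127.0.0.1', 8000, application)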
You can run a Django application in 20 MB of memory easily; a Django application will probably use even less than that.
I also want to advise you to check out web.py and CherryPy,
but I'm a big fan of Django. If you have 20 MB of memory to run the application, Django will give you everything it has.
I'd go for bottle. It has all the conciseness of web.py but with some nice routing features.
You could take a look at Twisted, which has a module twisted.web. That seems to be fairly light-weight. I'm currently using it, and with a simple app it starts almost instantaneously, so it can't be all that resource intensive :)
I don't know whether Twisted uses different threads.
webpy (http://webpy.org/) is a highly usable framework with a very minimal memory footprint. But it all depends on how complex your application is going to be.
Also please take a look at WHIFF. It's tiny and very flexible: whiff documentation
