Why does running Django's session cleanup command kill my machine's resources?

I have a production site that has been running for a year, configured with the django.contrib.sessions.backends.cached_db session backend on top of a MySQL database. I chose cached_db for its mix of security and read performance.
The problem is that the cleanup command, which is responsible for deleting all expired sessions, was never run, leaving the session table with 2.3GB of data, 6 million rows and 500MB of index.
When I try to run ./manage.py cleanup (in Django 1.3), or ./manage.py clearsessions (its Django 1.5 equivalent), the process never ends (or at least outlasts the 3 hours of patience I have for it).
The code that Django uses to do this is:
Session.objects.filter(expire_date__lt=timezone.now()).delete()
At first I thought that was normal, given that the table has 6M rows, but after inspecting the system monitor I discovered that all the memory and CPU were being consumed by the python process, not mysqld, exhausting my machine's resources. Something seems seriously wrong with this command's code: it looks as if Python loads every expired session row it finds and then deletes them one by one. If that's the case, refactoring the code to issue a raw DELETE FROM statement would solve my problem and help the Django community, right? But then the QuerySet delete() would be behaving strangely and would be poorly optimized, in my opinion. Am I right?
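In the meantime, a workaround sketch I'm considering (the helper name and batch size are my own, not Django's): delete the expired rows in bounded chunks, so that neither the Python process nor MySQL has to hold all 6 million rows at once.

# Hedged workaround sketch, not Django's own cleanup code: delete expired
# sessions in bounded batches.  CHUNK is an arbitrary batch size.
from django.contrib.sessions.models import Session
from django.utils import timezone

CHUNK = 10000

def clear_expired_sessions_in_batches():
    while True:
        # Fetch one batch of expired primary keys (session_key is the pk)...
        batch = list(
            Session.objects.filter(expire_date__lt=timezone.now())
                           .values_list('session_key', flat=True)[:CHUNK]
        )
        if not batch:
            break
        # ...and delete only that batch.
        Session.objects.filter(session_key__in=batch).delete()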

Related

Why does MongoDB return an "errno:24 Too many open files" error

I ran into a problem after running a Python script with a MongoDB aggregation pipeline. The error said:
errno:24 Too many open files, full error: {'ok': 0.0, 'errmsg': 'error opening file "D:\MongoDB\Server\4.4\data/_tmp/extsort-doc-group.463": errno:24 Too many open files', 'code': 16814, 'codeName': 'Location16814'}
The server that hosts the Mongo database is Windows Server 2016.
The problem goes away when I limit the amount of data by reducing the time span from 7 days to 3 days; the script then runs successfully and gives me a result.
The script had been running for a couple of weeks with the 7-day setting and there was no problem.
As per the MongoDB docs, it is recommended to set a high open-file limit. On Ubuntu we generally do so by changing the limits in /etc/security/limits.conf, which are specific to user and limit type. Different distros have different ways of doing this. For checking the current limits, a simple ulimit -a can be very helpful.
IMO, machines running databases should have high limits for open files and process count. There is also a bunch of recommendations from MongoDB related to paging and the disk types to use; I would recommend going through them to use MongoDB to its full potential.
I haven't worked on a Windows machine in a long while, but I'm sure that if you look up how to increase the open-file limit there, you will find it. Also, when you reduced the query from 7 days to 3 days, the number of files WiredTiger had to access to fetch the indexes, and the disk operations, were reduced as well, which may be what allowed the query to run. Please note that, unlike some databases, the file-system organisation of MongoDB's WiredTiger storage is a bit different.
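For checking (and, where the hard limit allows, raising) the limit from inside a Python process, the standard library's resource module can help; a minimal sketch (Unix-only, so it does not apply directly to the Windows Server box above):

# Sketch: inspect the per-process open-file limit from Python (Unix-only).
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d hard=%d" % (soft, hard))

# Raise the soft limit up to the hard limit, for this process only.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))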

Performance of parsing .py files with mod_wsgi / django / rest_framework

I'm using Apache with mod_wsgi on Debian Jessie, with Python 3.4, Django and Django REST framework, to power a REST web service.
I'm currently running performance tests. My server is a KS-2 (http://www.kimsufi.com/fr/serveurs.xml) with 4Gb of RAM and an Atom N2800 processor (1.8GHz, 2c/4t). My server already runs plenty of small services, but my load average does not exceed 0.5 and I usually have 2Gb of free RAM. I'm giving this context because the performance I describe below may simply be normal for this hardware.
I'm quite new to Python-powered web services and don't really know what to expect in terms of performance. I used Firefox's network monitor to measure the duration of a request.
I've set up a test environment with Django REST framework's first example (http://www.django-rest-framework.org/). When I go to the URL http://myapi/users/?format=json I have to wait ~1600ms. If I request it again several times within a short period, the time drops to about 60ms. However, as soon as I wait more than ~5 seconds, the average time goes back to 1600ms.
My application has about 6k lines of Python and includes some Django libraries in INSTALLED_APPS (django-cors-headers, django-filter, django-guardian, django-rest-swagger). When I perform the same kind of test on it (on a comparable view returning a list of my users), I get 6500/90ms.
My data does not require a lot of resources to retrieve (django-debug-toolbar shows me that my SQL queries take <10ms). So I'm not sure what is going on under the hood, but I guess all the .py files need to be periodically parsed, or the .pyc files read. If that's the case, is it possible to get rid of this behaviour in a production environment where I know I won't edit my files often? And if it's not the case, how can I lower the cost of the first call?
Note: I've read Django's documentation about caching (https://docs.djangoproject.com/en/1.9/topics/cache/), but in my application the data (which does not seem to require a lot of resources) is likely to change often. I guess caching does not help with the source code of an application, am I wrong?
Thanks
I guess all .py files need to be periodically parsed or .pyc to be read
.py files are only parsed (and compiled to bytecode .pyc files) when there's no matching .pyc file or the .py file is newer than the .pyc. Also, the .pyc files are only loaded once per process.
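If you want the bytecode to be generated at deploy time rather than on the first request, the standard library's compileall module can pre-compile a whole tree; a minimal sketch (the path is a placeholder for your project directory):

# Sketch: pre-compile every .py file under the project at deploy time,
# so no request pays the compilation cost.  The path is a placeholder.
import compileall

compileall.compile_dir('/path/to/your/project', quiet=True)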
Given your symptoms, chances are the problem is mostly with your server's settings. First make sure you're running in daemon mode (https://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide#Delegation_To_Daemon_Process), then tweak your settings according to your server hardware and your application's needs (https://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess).
It looks like your Apache is removing the Python processes from memory after a while. mod_wsgi loads the Python interpreter and your files into Apache, which is slow, but you should be able to tune it so they are kept in memory.

Resuming a Python script after the server shuts down

I have a python script running on my server which accesses a database, executes a fetch query and runs a learning algorithm to classify the results, updating certain values and means depending on the query.
My concern is that if, for some reason, my server shuts down partway through, my python script would stop and the work from that query would be lost.
How do I know where to continue from once I re-run the script, and how do I carry over the updated means from the queries that have already been processed?
First of all: the question is not really related to Python at all; it's a general problem.
And the answer is simple: keep track of what your script does (in a file or directly in the db). If it crashes, continue from the last step.
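A minimal sketch of that idea, assuming a simple JSON checkpoint file (the state fields are placeholders for whatever your algorithm tracks, e.g. the last processed row id and the running means):

# Sketch: persist progress after each batch so a crash/restart can resume.
import json
import os

CHECKPOINT = "checkpoint.json"

def load_state():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return {"last_id": 0, "means": {}}

def save_state(state):
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.rename(tmp, CHECKPOINT)  # rename is atomic, so the file is never half-written

state = load_state()
# ... fetch rows with id > state["last_id"], update state["means"],
# then call save_state(state) after every batch ...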

GAE Backend fails to respond to start request

This is probably a truly basic thing that I'm simply having an odd time figuring out in a Python 2.5 app.
I have a process that will take roughly an hour to complete, so I made a backend. To that end, I have a backend.yaml that has something like the following:
- name: mybackend
  options: dynamic
  start: /path/to/script.py
(The script is just raw computation. There's no notion of an active web session anywhere.)
On toy data, this works just fine.
This used to be public, so I would navigate to the page, the script would start, and it would time out after about a minute (the HTTP request deadline plus the 30s shutdown grace period, I assume). I figured this was a browser issue, so I repeated the same thing with a cron job. No dice. I then switched to using a push queue and adding a targeted task, since on paper it looks like it would allow 10 minutes. Same thing.
All 3 time out after about a minute, which means I'm not decoupling the request from the backend the way I believe I am.
I'm assuming that I need to write a proper handler for the backend to do the work, but I don't exactly know how to write the handler/webapp2 route. Do I handle /_ah/start/ or make a new endpoint for the backend? How do I handle the subdomain? It still seems like the wrong thing to do (I'm sticking a long-running process directly into a request of sorts), but I'm at a loss otherwise.
So the root cause ended up being this code in the script itself:
models = MyModel.all()
for model in models:
    # Magic happens
I was basically taking it for granted that the query would automatically batch my Query.all() over many entities, but it was dying at around the 1000th entity. I originally wrote that the script was pure computation only because I completely ignored the fact that the reads themselves can fail.
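For reference, a sketch of how the loop could have been batched with query cursors under the old db API (MyModel is the model from the snippet above; the batch size is arbitrary):

# Sketch: page through the entities with query cursors instead of relying
# on a single unbounded iteration.  BATCH_SIZE is arbitrary.
BATCH_SIZE = 200

query = MyModel.all()
batch = query.fetch(BATCH_SIZE)
while batch:
    for model in batch:
        pass  # Magic happens, one bounded batch at a time
    query.with_cursor(query.cursor())
    batch = query.fetch(BATCH_SIZE)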
The actual solution to the problem we wanted to solve ended up being "use the map-reduce library", since we were trying to look at every entity for analysis.

Testing for mysterious load errors in python/django

This is related to Configure Apache to recover from mod_python errors, although I've since stopped assuming that this has anything to do with mod_python. Essentially, I have a problem that I wasn't able to reproduce consistently, and I wanted some feedback on whether the proposed solution seems likely, plus some potential ways to try to reproduce the problem.
The setup: a django-powered site would begin throwing errors after a few days of use. They were always ImportErrors or ImproperlyConfigured errors, which amount to the same thing, since the message always pointed to trouble loading some module referenced in the settings.py file; it was not generally the same class each time. I am using prefork Apache with 8 forked children, and whenever this problem came up, one process would be broken while the other seven were fine. Once broken, the process would display the same traceback on every request it served (with Debug On in the apache conf), even if the failed import was not relevant to that particular request. An httpd restart always made the problem go away in the short run.
Noted problems: installation and updates are performed via svn with some post-update scripts. A few .pyc files were accidentally checked into the repository. Additionally, the project itself was owned by one user (not apache, although apache had permissions on the project), and there was a persistent plugin that ended up being backgrounded as root. I call these noted problems because they would be wrong whether or not I had noticed this error, and hence I have fixed them: the project is now owned by apache and the plugin is backgrounded as apache; all .pyc files are out of the repository, and they are force-recompiled after each checkout while the server and plugin are stopped.
What I want to know is:
1. Do these configuration disasters seem like a likely explanation for sporadic ImportErrors?
2. If there is still a problem somewhere else in my code, how would I best reproduce it?
As for 2, my approach thus far has been to write some stress tests that repeatedly request the same page so as to execute common code paths.
Incidentally, the site has been running without incident for about 2 days since the fix, but the problem was previously observed at intervals of 1 to 10 days.
"Do these configuration disasters seem like a likely explanation for sporadic ImportErrors"
Yes. An old .pyc file is a disaster of the first magnitude.
We develop on Windows, but run production on Red Hat Linux. An accidentally moved .pyc file is an absolute mystery to debug because (1) it usually runs and (2) it has a Windows filename for the original source, making the traceback error absolutely senseless. I spent hours staring at logs -- on linux -- wondering why the file was "C:\This\N\That".
"If there is still a problem somewhere else in my code, how would I best reproduce it?"
Before reproducing errors, you should try to prevent them.
First, create unit tests to exercise everything.
Start with Django's tests.py testing. Then expand to unittest for all non-Django components. Then write yourself a "run_tests" script that runs every test you own. Run this periodically. Daily isn't often enough.
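A minimal sketch of such a "run_tests" script (the imported test module names are placeholders for your own apps):

# Sketch: one script that collects and runs every test you own.
# The imported test modules are placeholders -- list your real ones.
import unittest

import myproject.tests
import plugins.tests

def suite():
    loader = unittest.TestLoader()
    all_tests = unittest.TestSuite()
    for module in (myproject.tests, plugins.tests):
        all_tests.addTests(loader.loadTestsFromModule(module))
    return all_tests

if __name__ == "__main__":
    unittest.TextTestRunner(verbosity=2).run(suite())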
Second, be sure you're using logging. Heavily.
Third, wrap anything that uses external resources in generic exception-logging blocks like this.
try:
    some_external_resource_processing()
except Exception, e:
    logger.exception( e )
    raise
This will help you pinpoint problems with external resources. Files and databases are often the source of bad behavior due to permission or access problems.
At this point, you have prevented a large number of errors. If you want to run cyclic load testing, that's not a bad idea either. Use unittest for this.
import unittest
import urllib2

class SomeLoadtest( unittest.TestCase ):
    def test_something( self ):
        self.connection = urllib2.urlopen( "http://localhost:8000/some/path" )
        results = self.connection.read()
This isn't the best way to do things, but it shows one approach. You might want to start using Selenium to test the web site "from the outside" as a complement to your unittests.
