MySQL and python-mysql (mysqldb) crashing under heavy load - python

I was just putting the finishing touches to a site built using web.py, MySQL and python-mysql (the mysqldb module), feeling good about having protected against SQL injection and the like, when I leant on the refresh button, sending 50 or so simultaneous requests, and it crashed my server! I reproduced the error and found that I get the following two errors interchangeably; sometimes it's one and sometimes the other:
Error 1:
127.0.0.1:60712 - - [12/Sep/2013 09:54:34] "HTTP/1.1 GET /" - 500 Internal Server Error
Exception _mysql_exceptions.ProgrammingError: (2014, "Commands out of sync; you can't run this command now") in <bound method Cursor.__del__ of <MySQLdb.cursors.Cursor object at 0x10b287750>> ignored
Traceback (most recent call last):
Error 2:
python(74828,0x10b625000) malloc: *** error for object 0x7fd8991b6e00: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
Clearly the requests are straining MySQL and causing it to fall over, so my question is: how do I protect against this happening?
My server is set up with Ubuntu 13.04, nginx, MySQL (which I connect to via the mysqldb Python module), web.py and FastCGI.
When the web.py app starts up it connects to the database like so:
def connect():
    global con
    con = mdb.connect(host=HOST, user=USER, passwd=PASSWORD, db=DATABASE)
    if con is None:
        print 'error connecting to database'
and the con object is assigned to a global variable so that various parts of the application can access it.
I access the database like this:
def get_page(name):
    global con
    with con:
        cur = con.cursor()
        cur.execute("SELECT `COLUMN_NAME` FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_SCHEMA`='jt_website' AND `TABLE_NAME`='pages'")
        table_info = cur.fetchall()
One idea I had was to open and close the database connection before and after each request, but that seems like overkill to me. Does anybody have any opinions on this?
What sort of methods do people use to protect their database connections in python and other environments and what sort of best practices should I be following?

I don't use web.py, but its docs and tutorials show a different way to deal with the database.
They suggest using a global object (which you create in .connect) that will probably be a global proxy in the Flask style.
Try organizing your code as in that example (the link is now dead) and see if it happens again.
The error you reported looks like a concurrency problem, which is normally handled automatically by the framework.
About the latter question:
What sort of methods do people use to protect their database connections in python and other environments and what sort of best practices should I be following?
It differs depending on the web framework you use. Django, for example, hides everything and it just works.
Flask lets you choose what you want to do. You can use Flask-SQLAlchemy, which builds on the very good SQLAlchemy ORM and manages the connection proxy for the web application.
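As a hedged illustration of that pattern, here is a minimal Flask-SQLAlchemy sketch; the model, table and connection URI are invented for the example and are not from the question:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
# Placeholder URI; substitute your own MySQL credentials and database name.
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://user:password@localhost/jt_website'
db = SQLAlchemy(app)

class Page(db.Model):
    # Hypothetical mapping of a 'pages' table, only to show per-request session handling.
    __tablename__ = 'pages'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(255))

@app.route('/<name>')
def get_page(name):
    # db.session is scoped to the request, so no global connection object is shared.
    page = Page.query.filter_by(name=name).first_or_404()
    return page.name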

Related

Python loses connection to MySQL database after about a day

I am developing a web-based application using Python, Flask, MySQL, and uWSGI. However, I am not using SQLAlchemy or any other ORM. I am working with a preexisting database from an old PHP application that wouldn't play well with an ORM anyway, so I'm just using mysql-connector and writing queries by hand.
The application works correctly when I first start it up, but when I come back the next morning I find that it has become broken. I'll get errors like mysql.connector.errors.InterfaceError: 2013: Lost connection to MySQL server during query or the similar mysql.connector.errors.OperationalError: 2055: Lost connection to MySQL server at '10.0.0.25:3306', system error: 32 Broken pipe.
I've been researching it and I think I know what the problem is. I just haven't been able to find a good solution. As best as I can figure, the problem is the fact that I am keeping a global reference to the database connection, and since the Flask application is always running on the server, eventually that connection expires and becomes invalid.
I imagine it would be simple enough to just create a new connection for every query, but that seems like a far from ideal solution. I suppose I could also build some sort of connection caching mechanism that would close the old connection after an hour or so and then reopen it. That's the best option I've been able to come up with, but I still feel like there ought to be a better one.
I've looked around, and most people who have been receiving these errors have huge or corrupted tables, or something to that effect. That is not the case here. The old PHP application still runs fine, the tables all have fewer than about 50,000 rows and fewer than 30 columns, and the Python application runs fine until it has sat idle for about a day.
So, here's hoping someone has a good solution for keeping a continually open connection to a MySQL database. Or maybe I'm barking up the wrong tree entirely; if so, hopefully someone can set me straight.
I have it working now. Using pooled connections seemed to fix the issue for me.
mysql.connector.connect(
    host='10.0.0.25',
    user='xxxxxxx',
    passwd='xxxxxxx',
    database='xxxxxxx',
    pool_name='batman',
    pool_size=3
)

def connection():
    """Get a connection and a cursor from the pool"""
    db = mysql.connector.connect(pool_name='batman')
    return (db, db.cursor())
I call connection() before each query function and then close the cursor and connection before returning. Seems to work. Still open to a better solution though.
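For illustration, a hedged sketch of how a query function might use that connection() helper; the table and column names are hypothetical:
def get_user(user_id):
    # Borrow a connection and cursor from the pool, run the query,
    # then hand the connection back to the pool before returning.
    db, cursor = connection()
    try:
        cursor.execute("SELECT name FROM users WHERE id = %s", (user_id,))
        row = cursor.fetchone()
    finally:
        cursor.close()
        db.close()  # for a pooled connection, close() returns it to the pool
    return row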
Edit
I have since found a better solution. (I was still occasionally running into issues with the pooled connections). There is actually a dedicated library for Flask to handle mysql connections, which is almost a drop-in replacement.
From bash: pip install Flask-MySQL
Add MYSQL_DATABASE_HOST, MYSQL_DATABASE_USER, MYSQL_DATABASE_PASSWORD, MYSQL_DATABASE_DB to your Flask config. Then in the main Python file containing your Flask App object:
from flaskext.mysql import MySQL
mysql = MySQL()
mysql.init_app(app)
And to get a connection: mysql.get_db().cursor()
All other syntax is the same, and I have not had any issues since. Been using this solution for a long time now.
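Putting those pieces together, a minimal sketch of that setup might look like this; the host and credentials are placeholders, not values from the answer:
from flask import Flask
from flaskext.mysql import MySQL

app = Flask(__name__)
# Placeholder credentials; set these to your own values.
app.config['MYSQL_DATABASE_HOST'] = 'localhost'
app.config['MYSQL_DATABASE_USER'] = 'user'
app.config['MYSQL_DATABASE_PASSWORD'] = 'password'
app.config['MYSQL_DATABASE_DB'] = 'mydatabase'

mysql = MySQL()
mysql.init_app(app)

@app.route('/ping')
def ping():
    # Flask-MySQL manages the underlying connection; grab a cursor per request.
    cursor = mysql.get_db().cursor()
    cursor.execute("SELECT 1")
    return str(cursor.fetchone()[0])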

Is having a global database connection allowed in WSGI applications?

I need to create a simple project in Flask, and I don't want to use SQLAlchemy. In the code snippet below, everyone who connects to the server uses the same connection object, but a new cursor object is created for each request. I'm asking because I've never used the Python DB API this way before. Is it correct? Should I create a new connection object for each request, use the same connection and cursor object for every request, or use the approach below? Which one is correct?
import mysql.connector
from flask import Flask, request

app = Flask(__name__)

try:
    con = mysql.connector.connect(user='root', password='', host='localhost', database='pywork')
except mysql.connector.Error as err:
    print("Something went wrong")

@app.route('/')
def home():
    cursor = con.cursor()
    cursor.execute("INSERT INTO table_name VALUES(NULL,'test record')")
    con.commit()
    cursor.close()
    return ""
WSGI applications may be served by several worker processes and threads, so you might end up with multiple threads using the same connection. You therefore need to find out whether your library's implementation of the connection is thread safe. Look up the documentation and see whether it claims to provide Level 2 thread safety.
Then you should consider whether you need transactions during your requests. If you do (e.g., a request issues multiple database commands with an inconsistent state in between, or race conditions are possible), you should use separate connections, because transactions are always connection-wide. Note that some database systems or configurations don't support transactions, or don't isolate separate connections from each other.
So if you share a connection, you should assume you are working with autocommit turned on (or better: actually turn it on).
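To make the alternative concrete, here is a hedged sketch of the per-request-connection approach, which sidesteps the thread-safety question entirely; the credentials are the ones from the question:
import mysql.connector
from flask import Flask

app = Flask(__name__)

def get_connection():
    # Open a fresh connection for each request so threads never share one.
    return mysql.connector.connect(user='root', password='', host='localhost', database='pywork')

@app.route('/')
def home():
    con = get_connection()
    try:
        cursor = con.cursor()
        cursor.execute("INSERT INTO table_name VALUES(NULL,'test record')")
        con.commit()
        cursor.close()
    finally:
        con.close()
    return ""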

Taking mongoengine.connect out of settings.py in Django

Most blog posts and examples on the web about connecting to MongoDB using MongoEngine in Python/Django suggest adding these lines to the app's settings.py file:
from mongoengine import connect
connect('project1', host='localhost')
It works fine in most cases, except for one I have faced recently:
When the database is down!
Let's say the database goes down: the process that looks after the web server (in my case, Supervisord) will stop running the app because of the exception that connect throws. It may retry a few more times, but once its timeout is reached it will stop trying.
So even the parts of your app that are not tied to the database will break down as well.
A quick solution to this is adding a try/exception block to the connect code:
try:
    connect('project1', host='localhost')
except Exception as e:
    print(e)
but I am looking for a better and cleaner way to handle this.
Unfortunately, this is not really possible with MongoEngine unless you go with the try/except solution as you did.
You could try connecting with pure PyMongo (version 3.0+) using MongoClient and registering the connection manually in the mongoengine.connection._connection_settings dictionary (quite hacky, but it should work). From the PyMongo documentation:
Changed in version 3.0: MongoClient is now the one and only client class for a standalone server, mongos, or replica set. It includes the functionality that had been split into MongoReplicaSetClient: it can connect to a replica set, discover all its members, and monitor the set for stepdowns, elections, and reconfigs.
The MongoClient constructor no longer blocks while connecting to the server or servers, and it no longer raises ConnectionFailure if they are unavailable, nor ConfigurationError if the user’s credentials are wrong. Instead, the constructor returns immediately and launches the connection process on background threads.
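As a small sketch of the behaviour described in that quote: with PyMongo 3.0+ the constructor returns immediately even if the server is unreachable, so application startup does not fail. The URI and timeout below are placeholders:
from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

# Returns immediately; no exception is raised here even if MongoDB is down.
client = MongoClient('mongodb://localhost:27017/', serverSelectionTimeoutMS=2000)

try:
    # Errors only surface when an operation actually needs the server.
    client.admin.command('ping')
except ServerSelectionTimeoutError:
    print('MongoDB is currently unreachable; db-independent parts can keep running')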

Flask-SQLAlchemy "MySQL server has gone away" when using HAproxy

I've built a small python REST service using Flask, with Flask-SQLAlchemy used for talking to the MySQL DB.
If I connect directly to the MySQL server everything is good, no problems at all. If I use HAproxy (handles HA/failover, though in this dev environment there is only one DB server) then I constantly get MySQL server has gone away errors if the application doesn't talk to the DB frequently enough.
My HAproxy client timeout is set to 50 seconds, so what I think is happening is it cuts the stream, but the application isn't aware and tries to make use of an invalid connection.
Is there a setting I should be using when using services like HAproxy?
Also, it doesn't seem to reconnect automatically; if I issue a request manually I get Can't reconnect until invalid transaction is rolled back, which is odd since it's just a select() call I'm making, so I don't think it's a commit() I'm missing. Or should I be calling commit() after every ORM-based query?
Just to tidy up this question with an answer, I'll post what I (think I) did to solve the issues.
Problem 1: HAproxy
Either increase the HAproxy client timeout value (globally, or in the frontend definition) to a value longer than what MySQL is set to reset on (see this interesting and related SF question)
Or set SQLALCHEMY_POOL_RECYCLE = 30 (30 in my case being less than the HAproxy client timeout) in Flask's app.config so that when the DB is initialised it pulls in this setting and recycles connections before HAproxy cuts them itself. Similar to this issue on SO.
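As a sketch, the relevant settings (with a placeholder URI) might look like this in a Flask config module that Flask-SQLAlchemy reads at init time:
# Flask config values picked up by Flask-SQLAlchemy
SQLALCHEMY_DATABASE_URI = 'mysql://user:password@haproxy-host/dbname'  # placeholder URI
SQLALCHEMY_POOL_RECYCLE = 30  # recycle pooled connections before HAproxy's 50-second client timeout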
Problem 2: Can't reconnect until invalid transaction is rolled back
I believe I fixed this by tweaking the way the DB is initialised and imported across various modules. I basically now have a module that simply has:
from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy()
Then in my main application factory I simply:
from common.database import db
db.init_app(app)
Also, since I wanted to load table structures automatically, I initialised the metadata binds within the app context, and I think it was this that cleanly handled the commit() issue/error I was getting, as I believe the database sessions are now being correctly terminated after each request.
with app.app_context():
    # Setup DB binding
    db.metadata.bind = db.engine
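For readability, the pieces above can be assembled into one hedged sketch; the module layout, URI and factory name are illustrative rather than the poster's exact code:
# common/database.py
from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy()

# application factory, e.g. in app/__init__.py
from flask import Flask
from common.database import db

def create_app():
    app = Flask(__name__)
    app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://user:password@haproxy-host/dbname'  # placeholder
    app.config['SQLALCHEMY_POOL_RECYCLE'] = 30
    db.init_app(app)
    with app.app_context():
        # Bind metadata to the engine so table structures can be loaded automatically.
        db.metadata.bind = db.engine
    return app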

locked SQLite database errors with WSGI python app

I'm having issues making multiple AJAX POST calls to functions that access a database in my web app using SQLite and mod_wsgi. I have no issues making requests to one function, but as soon as I call a different function I start getting "database is locked" errors. I've tried setting the variables as globals and just accessing them in the two functions, as well as opening and closing the database in each function, to no avail.
What's the proper way to interface with a database if you just have one application function in your code? Threads? Persistent connections?
I've used Django before, but wanted something bare-bone for this simple app running on my local machine.
The relevant section of code is:
con = sqlite3.connect("/var/www/Knowledge/eurisko.sqlite")
con.row_factory = sqlite3.Row
cursor = con.cursor()
cursor.execute("update notes_content set c1content=?, c2timestamp=? where c0title=?",
               [content, timestamp, title])
con.commit()
cursor.close()
con.close()
The full file is here: http://pastebin.com/7yuiZFi2
I'm running Apache 2.2 on Ubuntu 10 with libapache2-mod-wsgi and Python 2.7.
See warnings about concurrent access from multiple processes in SQLite documentation.
http://www.sqlite.org/faq.html#q5
This information was provided on the mod_wsgi mailing list, where the question was also asked, but I'm following up here as well.
This can be an issue because Apache/mod_wsgi supports both single-process and multi-process configurations, and the OP is likely using a multi-process configuration. Also see:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
for a description of the Apache/mod_wsgi process/threading model.
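One common mitigation, offered here only as a hedged sketch, is to open a short-lived connection per request with a busy timeout, so writers wait for the lock instead of failing straight away; the path and table are the ones from the question:
import sqlite3

def update_note(title, content, timestamp):
    # Open, use, and close the connection within a single request; the timeout
    # makes SQLite wait up to 10 seconds for a lock held by another process.
    con = sqlite3.connect("/var/www/Knowledge/eurisko.sqlite", timeout=10)
    try:
        with con:  # commits on success, rolls back on exception
            con.execute("update notes_content set c1content=?, c2timestamp=? where c0title=?",
                        [content, timestamp, title])
    finally:
        con.close()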
