I'm migrating a legacy database into a bunch of models I have running locally. I connected to the legacy database and ran inspectdb to recreate the models. Now I'm writing functions to pair the numerous fields with their equivalents in the new models. I've been using shell_plus, and the first minute or so of queries go great, but my connection keeps timing out with the following error:
RemoteArticle.objects.using("remote_mysql").all()
django.db.utils.OperationalError: (2013, 'Lost connection to MySQL server during query')
Is there a command I can run to either a) reconnect to the db before running a query (so I don't have to reopen shell_plus), or ideally b) make it so that all of my queries automatically reconnect each time I run them?
I've seen timeout issues on other platforms, but I wasn't sure if Django had a built-in way of handling such things.
Thanks!
There is a page in the MySQL docs on this. Since you're apparently trying to migrate a big database, this part may apply to you:
Sometimes the “during query” form happens when millions of rows are being sent as part of one or more queries. If you know that this is happening, you should try increasing net_read_timeout from its default of 30 seconds to 60 seconds or longer, sufficient for the data transfer to complete.
The timeout makes sense, because the all() is a single query that retrieves every row. So reconnecting before each query is not the solution. If changing net_read_timeout is not an option, you might want to think about paging.
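For example, one way to page is to walk the remote table in primary-key order in fixed-size batches, so that no single query has to stream the whole table. A sketch; the batch size and the migrate() mapping function are illustrative, not part of the original setup:

BATCH = 1000  # illustrative batch size

qs = RemoteArticle.objects.using("remote_mysql").order_by("pk")
last_pk = 0
while True:
    # Keyset pagination: each query transfers at most BATCH rows.
    batch = list(qs.filter(pk__gt=last_pk)[:BATCH])
    if not batch:
        break
    for article in batch:
        migrate(article)  # hypothetical field-mapping function
    last_pk = batch[-1].pk

Each iteration is a short, independent query, so it stays well under net_read_timeout even for very large tables.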
I believe Lost connection to MySQL server during query happens because you exhaust a MySQL resource such as a timeout, the session, or memory.
If the problem is the timeout, try increasing it on the DB server, e.g. --net_read_timeout=100.
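If you can't change the server default, it may also be possible to raise it per session from the Django side. A sketch, assuming the mysqlclient/MySQLdb backend and its init_command option; the 600-second value and connection details are illustrative:

DATABASES = {
    "remote_mysql": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "legacy",        # illustrative
        "HOST": "legacy-host",   # illustrative
        "USER": "user",
        "PASSWORD": "password",
        "OPTIONS": {
            # Run once on each new connection.
            "init_command": "SET SESSION net_read_timeout=600",
        },
    },
}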
I have a data model defined with Peewee and it is all good. Now I need to bookkeep some form of database context in an xlwings front-end application, but I'm not 100% sure how to proceed. I see a couple of bits and pieces of information here and there, but I'm not 100% sure how to put them together:
https://docs.peewee-orm.com/en/2.10.2/peewee/database.html#connection-pooling
https://docs.peewee-orm.com/en/2.10.2/peewee/database.html#automatic-reconnect
I'm already using a DatabaseProxy to defer tying the model to a specific connection until runtime, since I can target different vendors, e.g. PROD -> Postgres and UT -> SQLite:
database_proxy = DatabaseProxy()
Furthermore, I connect via playhouse.db_url's connect() with a fully qualified URL. I need a "Database Context" that:
Works with DatabaseProxy and playhouse.db_url.connect and supports at least SQLite and Postgres.
Keeps a pool of N connections that are managed internally; if a connection is lost, a retry is attempted until a timeout expires. Indeed, I need a connect timeout rather than waiting forever just to fail.
Handles the connection pooling part: automatic recycling and closing of idle connections.
Retries robustly and automatically on OperationalError during short database outages.
I struggle to put all the pieces together here; for example, playhouse.db_url.connect doesn't take retry or timeout arguments, and it's also not clear how to wrap such a connection with pooling, and so on.
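One way the pieces might fit together, as a sketch rather than a definitive implementation: it assumes peewee 3.x, where the +pool URL schemes map to the playhouse.pool classes and querystring values are passed through as keyword arguments; run_with_retry is a hypothetical helper, not a peewee API:

import time

from peewee import DatabaseProxy, OperationalError
from playhouse.db_url import connect

database_proxy = DatabaseProxy()

def init_database(url):
    # PROD: "postgres+pool://user:pass@host:5432/prod?max_connections=8&stale_timeout=300"
    # UT:   "sqlite:///ut.db"
    database_proxy.initialize(connect(url))

def run_with_retry(fn, retries=3, delay=1.0):
    """Retry fn() on OperationalError, reconnecting between attempts."""
    for attempt in range(retries):
        try:
            return fn()
        except OperationalError:
            if attempt == retries - 1:
                raise
            database_proxy.close()
            time.sleep(delay)
            database_proxy.connect(reuse_if_open=True)

The pooled database classes recycle connections that have been open longer than stale_timeout when they are returned to the pool, which should cover the idle-connection recycling requirement.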
I am developing a web-based application using Python, Flask, MySQL, and uWSGI. However, I am not using SQL Alchemy or any other ORM. I am working with a preexisting database from an old PHP application that wouldn't play well with an ORM anyway, so I'm just using mysql-connector and writing queries by hand.
The application works correctly when I first start it up, but when I come back the next morning I find that it has become broken. I'll get errors like mysql.connector.errors.InterfaceError: 2013: Lost connection to MySQL server during query or the similar mysql.connector.errors.OperationalError: 2055: Lost connection to MySQL server at '10.0.0.25:3306', system error: 32 Broken pipe.
I've been researching it and I think I know what the problem is. I just haven't been able to find a good solution. As best as I can figure, the problem is the fact that I am keeping a global reference to the database connection, and since the Flask application is always running on the server, eventually that connection expires and becomes invalid.
I imagine it would be simple enough to just create a new connection for every query, but that seems like a far from ideal solution. I suppose I could also build some sort of connection caching mechanism that would close the old connection after an hour or so and then reopen it. That's the best option I've been able to come up with, but I still feel like there ought to be a better one.
I've looked around, and most people who receive these errors have huge or corrupted tables, or something to that effect. That is not the case here: the old PHP application still runs fine, the tables all have fewer than about 50,000 rows and fewer than 30 columns, and the Python application runs fine until it has sat idle for about a day.
So, here's hoping someone has a good solution for keeping a continually open connection to a MySQL database. Or maybe I'm barking up the wrong tree entirely; if so, hopefully someone can point me in the right direction.
I have it working now. Using pooled connections seemed to fix the issue for me.
import mysql.connector

# Creating a connection with pool_name and pool_size implicitly creates the pool.
mysql.connector.connect(
    host='10.0.0.25',
    user='xxxxxxx',
    passwd='xxxxxxx',
    database='xxxxxxx',
    pool_name='batman',
    pool_size=3
)

def connection():
    """Get a connection and a cursor from the pool."""
    db = mysql.connector.connect(pool_name='batman')
    return db, db.cursor()
I call connection() before each query function and then close the cursor and connection before returning. Seems to work. Still open to a better solution though.
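For illustration, a query helper using that pattern might look like this (the table and column are hypothetical; closing a pooled connection returns it to the pool rather than disconnecting):

def get_user(user_id):
    db, cursor = connection()
    try:
        cursor.execute("SELECT name FROM users WHERE id = %s", (user_id,))
        return cursor.fetchone()
    finally:
        cursor.close()
        db.close()  # returns the connection to the pool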
Edit
I have since found a better solution. (I was still occasionally running into issues with the pooled connections). There is actually a dedicated library for Flask to handle mysql connections, which is almost a drop-in replacement.
From bash: pip install Flask-MySQL
Add MYSQL_DATABASE_HOST, MYSQL_DATABASE_USER, MYSQL_DATABASE_PASSWORD, MYSQL_DATABASE_DB to your Flask config. Then in the main Python file containing your Flask App object:
from flask import Flask
from flaskext.mysql import MySQL

app = Flask(__name__)  # your existing Flask app object
mysql = MySQL()
mysql.init_app(app)
And to get a connection: mysql.get_db().cursor()
All other syntax is the same, and I have not had any issues since. Been using this solution for a long time now.
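As an illustration, a route using this setup might look like the following; the route and table are hypothetical, continuing from the snippet above:

from flask import jsonify

@app.route('/articles')
def list_articles():
    cursor = mysql.get_db().cursor()
    cursor.execute('SELECT id, title FROM articles LIMIT 10')
    rows = cursor.fetchall()
    cursor.close()
    return jsonify(articles=rows)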
I'm running PostgreSQL 9.6 (in Docker, using the postgres:9.6.13 image) and psycopg2 2.8.2.
My PostgreSQL server (local) hosts two databases. My goal is to create materialized views in one of the databases that use data from the other database using Postgres's foreign data wrappers. I do all this from a Python script that uses psycopg2.
This works well as long as creating the materialized view does not take too long (i.e. if the amount of data being imported isn't too large). However, if the process takes longer than roughly ~250 seconds, psycopg2 throws the exception
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
No error message (or any message concerning this whatsoever) can be found in Postgres's logs.
Materialized view creation completes successfully if I do it from an SQL client (Postico).
This code illustrates roughly what I'm doing in the Python script:
import psycopg2 as pg  # assumed alias, matching the pg.connect() call below

db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port
)
with db.cursor() as c:
    c.execute("""
        CREATE EXTENSION IF NOT EXISTS postgres_fdw;
        CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
        CREATE USER MAPPING FOR CURRENT_USER SERVER fdw OPTIONS (...);
        CREATE SCHEMA "foreign";  -- "foreign" is a reserved word, so it must be quoted
        IMPORT FOREIGN SCHEMA foreign_schema FROM SERVER fdw INTO "foreign";
    """)
    c.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS my_view AS (
            SELECT (...)
            FROM "foreign".foreign_table
        );
    """)
db.commit()  # psycopg2 opens a transaction implicitly; commit to persist everything
Adding the keepalive parameters to the psycopg2.connect call seems to have solved the problem:
db = pg.connect(
dbname=config.db_name,
user=config.db_user,
password=config.db_password,
host=config.db_host,
port=config.db_port,
keepalives=1,
keepalives_idle=30,
keepalives_interval=10,
keepalives_count=5
)
I still don't know why this is necessary. I can't find anyone else describing the need for the keepalive keyword parameters when using Postgres in Docker just to be able to run queries that take longer than 4-5 minutes, but maybe it's obvious enough that nobody has noted it?
We encountered the same issue, and resolved it by adding net.ipv4.tcp_keepalive_time=200 to our docker-compose.yml file:
services:
  myservice:
    image: myimage
    sysctls:
      - net.ipv4.tcp_keepalive_time=200
From what I understand, this makes the kernel send a TCP keepalive probe after 200 seconds of idle time, which is less than the time it takes for the connection to be dropped (300 seconds?), thus preventing it from being dropped.
It might be that PostgreSQL 9.6 kills your connections after the new timeout mentioned at https://stackoverflow.com/a/45627782/1587329. In that case, you could set statement_timeout in postgresql.conf, but it is not recommended.
It might work in Postico because the value has been set there.
For the error to appear in the logs, you need to set log_min_error_statement to ERROR or lower.
In my Python/Django web application, the database sometimes disconnects (problems related to my test environment, which is not very stable...) and my web app gives me this error:
File "/usr/lib/python3.6/site-packages/django/db/backends/postgresql/base.py", line 222, in create_cursor,
django.db.utils.InterfaceError: connection already closed,
cursor = self.connection.cursor()
Now, how can I tell Django to retry opening the connection and continue? It seems that Django remains stuck at this point...
Thanks.
There's no way to tell Django that it should retry on connection error. It's instead designed to simply fail on that one request. From the documentation:
If any database errors have occurred while processing the requests, Django checks whether the connection still works, and closes it if it doesn’t. Thus, database errors affect at most one request; if the connection becomes unusable, the next request gets a fresh connection.
However, this shouldn't be a problem if you follow this advice in the documentation:
If your database terminates idle connections after some time, you should set CONN_MAX_AGE to a lower value, so that Django doesn’t attempt to use a connection that has been terminated by the database server.
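For example, in settings.py (values are illustrative; 0 closes the connection at the end of every request, None keeps it open indefinitely):

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'HOST': 'localhost',
        # Recycle connections after 60 seconds, comfortably below the
        # server's idle-connection timeout.
        'CONN_MAX_AGE': 60,
    }
}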
I've built a small python REST service using Flask, with Flask-SQLAlchemy used for talking to the MySQL DB.
If I connect directly to the MySQL server everything is good, no problems at all. If I use HAproxy (handles HA/failover, though in this dev environment there is only one DB server) then I constantly get MySQL server has gone away errors if the application doesn't talk to the DB frequently enough.
My HAproxy client timeout is set to 50 seconds, so what I think is happening is that HAproxy cuts the stream, but the application isn't aware of this and tries to make use of an invalid connection.
Is there a setting I should be using when using services like HAproxy?
Also, it doesn't seem to reconnect automatically; if I issue a request manually I get Can't reconnect until invalid transaction is rolled back, which is odd since it is just a select() call I'm making, so I don't think it's a commit() I'm missing. Or should I be calling commit() after every ORM-based query?
Just to tidy up this question with an answer I'll post what I (think I) did to solve the issues.
Problem 1: HAproxy
Either increase the HAproxy client timeout value (globally, or in the frontend definition) to a value longer than the interval after which MySQL resets the connection (see this interesting and related SF question),
Or set SQLALCHEMY_POOL_RECYCLE = 30 in Flask's app.config (30 in my case being less than the HAproxy client timeout), so that when the DB is initialised it pulls in this setting and recycles connections before HAproxy cuts them itself. Similar to this issue on SO.
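A sketch of the second option; the URI is illustrative, and 30 must stay below the HAproxy client timeout of 50 seconds:

app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://user:pass@haproxy-host/mydb'
# Recycle pooled connections before HAproxy can cut them.
app.config['SQLALCHEMY_POOL_RECYCLE'] = 30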
Problem 2: Can't reconnect until invalid transaction is rolled back
I believe I fixed this by tweaking the way the DB is initialised and imported across various modules. I basically now have a module that simply has:
from flask.ext.sqlalchemy import SQLAlchemy  # on modern Flask: from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()
Then in my main application factory I simply:
from common.database import db
db.init_app(app)
Also, since I wanted to load table structures automatically, I initialised the metadata binds within the app context, and I think it was this that cleanly handled the commit() issue/error I was getting, as I believe the database sessions are now being correctly terminated after each request.
with app.app_context():
    # Set up the DB binding
    db.metadata.bind = db.engine