I have an old, large project based in Python 2.7 with Tornado framework. To work with MySQL, it initially used Tornado-MySQL with raw SQL queries, and it worked well, but now it must use MySQL 8, and that library is obsolete, unmaintained.
So I switched to the TorMySQL library. It connects to MySQL Server 8 without problems, but I don't fully understand how to use it, and this leads to bugs.
In one project's file I wrote this code to access databases:
from tornado import gen
from tornado.gen import Return
from tornado.ioloop import IOLoop
import tormysql
import settings

POOL = tormysql.ConnectionPool(
    max_connections=20,
    idle_seconds=7200,  # idle timeout in seconds; 0 means no timeout
    wait_connection_timeout=3,
    host='127.0.0.1',
    port=3306,
    user=settings.MYSQL_USER,
    passwd=settings.MYSQL_PASSWORD,
    db='aivanf',
    use_unicode=True,
    charset='utf8mb4',
)

@gen.coroutine
def executePool(query, params):
    with (yield POOL.Connection()) as conn:
        with conn.cursor() as cursor:
            try:
                yield cursor.execute(query, params)
            except Exception as ex:
                print('Exception!\n{}'.format(ex))
                yield conn.rollback()
                raise Return(None)
            else:
                first = query[:10].lower()
                if 'update' in first or 'insert' in first:
                    yield conn.commit()
                if 'select' in first:
                    raise Return(cursor.fetchall())
                else:
                    raise Return(None)
I use the if checks because this single function is called with different types of queries. I know it's ugly, but it works. Similar, even simpler code for Tornado-MySQL worked perfectly, but only with MySQL 5.7.
However, some UPDATE / INSERT queries seem to be skipped, and I get these messages:
(1213, u'Deadlock found when trying to get lock; try restarting transaction')
WARNING:root:Connection maybe not release, used time 25.32s {'port': 3306, 'host': '127.0.0.1', 'user': '...', 'database': '...'} <3,2>.
Also, different clients of the server sometimes see different versions of the data, as if they had separate connections, each with its own uncommitted data.
How can I solve this?
I suspect the problem is with the pool. Maybe I have to close or recreate it? The TorMySQL page also has this line: yield pool.close()
You probably have to call conn.commit() even after a SELECT query. Otherwise every later SELECT on that connection runs within the same transaction as the first one, and under InnoDB's default isolation level it keeps seeing the same snapshot of the data.
I think most users are accustomed to "autocommit" being the default, but that does not seem to be the default mode for TorMySQL.
(I was just as confused as you for my first couple of days with TorMySQL :)
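A minimal sketch of the "end the transaction after every statement" idea from the answer above. It uses the standard-library sqlite3 module as a stand-in so that it runs anywhere (Python 3). run_query is a hypothetical helper, not part of TorMySQL; a TorMySQL version would be a coroutine that yields on execute, commit and rollback. The sketch also checks cursor.description, instead of inspecting the query text, to decide whether there are rows to fetch.

```python
import sqlite3


def run_query(conn, query, params=()):
    """Run one statement and always end the transaction afterwards.

    Hypothetical helper for illustration; it should work with any
    DB-API 2.0 style connection (sqlite3 is only a stand-in here).
    """
    cursor = conn.cursor()
    try:
        cursor.execute(query, params)
        # description is None for statements that return no result set.
        rows = cursor.fetchall() if cursor.description is not None else None
        # Commit after reads too, so the next SELECT starts a fresh
        # transaction instead of reusing an old snapshot.
        conn.commit()
        return rows
    except Exception:
        # Undo partial work and let the caller see the error.
        conn.rollback()
        raise
    finally:
        cursor.close()


conn = sqlite3.connect(":memory:")
run_query(conn, "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
run_query(conn, "INSERT INTO items (name) VALUES (?)", ("first",))
rows = run_query(conn, "SELECT id, name FROM items")
```

Re-raising after the rollback is a deliberate choice: a deadlock error (1213) then reaches the caller, who can decide whether to retry the statement.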
I'm developing a web app using Flask-SQLAlchemy and a PostgreSQL database. My page has a dropdown list that is populated by a SELECT against the database. After selecting different values a couple of times, I get a sqlalchemy.exc.TimeoutError.
My package versions are:
Flask-SQLAlchemy==2.5.1
psycopg2-binary==2.8.6
SQLAlchemy==1.4.15
My parameters for the DB connection are set as:
app.config['SQLALCHEMY_POOL_SIZE'] = 20
app.config['SQLALCHEMY_MAX_OVERFLOW'] = 20
app.config['SQLALCHEMY_POOL_TIMEOUT'] = 5
app.config['SQLALCHEMY_POOL_RECYCLE'] = 10
The error I'm getting is:
sqlalchemy.exc.TimeoutError: QueuePool limit of size 20 overflow 20 reached, connection timed out, timeout 5.00 (Background on this error at: https://sqlalche.me/e/14/3o7r)
After changing the value of the 'SQLALCHEMY_MAX_OVERFLOW' from 20 to 100 I get the following error after some value changes on the dropdown list.
psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: FATAL: sorry, too many clients already
Every time a new value is selected from the dropdown list, four queries are triggered to the database and they are used to populate four corresponding tables in my HTML with the results from that query.
I have a db.session.commit() statement after every single query to the DB, but I still get this error after a few value changes in my dropdown list.
I know I should be managing my sessions correctly, but I'm struggling with this. I set the pool timeout to 5 s instead of the default 30 s, hoping that sessions would be closed and returned to the pool sooner, but it didn't help.
Following a suggestion from @snakecharmerb, I checked the output of:
select * from pg_stat_activity;
I ran the web app through 10 different values before it showed an error. At four queries per selection, that means all 20+20 connections were used and left in an 'idle in transaction' state.
Does anybody have a suggestion on what I should change or look for?
I found a solution to the issue I was facing in another post on Stack Overflow.
When you assign your flask app to your db variable, on top of indicating which Flask app it should use, you can also pass on session options, as below:
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy(app, session_options={'autocommit': True})
The usage of 'autocommit' solved my issue.
Now, as suggested, I'm using:
app.config['SQLALCHEMY_POOL_SIZE'] = 1
app.config['SQLALCHEMY_MAX_OVERFLOW'] = 0
Now everything is working as it should.
The original post which helped me is: Autocommit in Flask-SQLAlchemy
@snakecharmerb, @jorzel, @J_H: thanks for the help!
You are leaking connections.
A little counterintuitively,
you may find you obtain better results with a lower pool limit.
A given python thread only needs a single pooled connection,
for the simple single-database queries you're doing.
Setting the limit to 1, with 0 overflow,
will cause you to notice a leaked connection earlier.
This makes it easier to pin the blame on the source code that leaked it.
As it stands, you have lots of code, and the error is deferred
until after many queries have been issued,
making it harder to reason about system behavior.
I will assume you're using sqlalchemy 1.4.29.
To avoid leaking, try using this:
from contextlib import closing
from sqlalchemy import create_engine, text
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine(some_url, future=True, pool_size=1, max_overflow=0)
get_session = scoped_session(sessionmaker(bind=engine))
...
with closing(get_session()) as session:
    try:
        sql = """yada yada"""
        rows = session.execute(text(sql)).fetchall()
        session.commit()
        ...
        # Do stuff with result rows.
        ...
    except Exception:
        session.rollback()
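To illustrate why a pool limit of 1 surfaces a leak early, here is a toy pool built only on the standard library (Python 3). TinyPool is a hypothetical class written for this sketch, not SQLAlchemy's QueuePool, and sqlite3 stands in for the real database. With a single slot, the very first connection that is not returned makes the next checkout fail.

```python
import queue
import sqlite3
from contextlib import contextmanager


class TinyPool:
    """Toy fixed-size pool (illustration only, not SQLAlchemy's pool)."""

    def __init__(self, size, timeout):
        self._timeout = timeout
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(sqlite3.connect(":memory:"))

    def checkout(self):
        # Wait up to `timeout` seconds for a free connection.
        try:
            return self._free.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("pool exhausted: a connection was not returned")

    def checkin(self, conn):
        # Discard any open transaction before the connection is reused.
        conn.rollback()
        self._free.put(conn)

    @contextmanager
    def connection(self):
        # The finally clause guarantees the check-in, even on errors.
        conn = self.checkout()
        try:
            yield conn
        finally:
            self.checkin(conn)


pool = TinyPool(size=1, timeout=0.05)
with pool.connection() as conn:
    conn.execute("SELECT 1")   # returned to the pool on exit
leaked = pool.checkout()       # never checked in: this is the leak
# A second pool.checkout() here would raise TimeoutError right away.
```

With a limit of 20+20, the same leak would only show up after forty checkouts, far from the code that caused it.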
I am using flask-restful and ran into the same error:
QueuePool limit of size 20 overflow 20 reached, connection timed out, timeout 5.00 (Background on this error at: https://sqlalche.me/e/14/3o7r)
The logs showed that my checked-out connections were not being closed. I found this out by logging the pool status with logger.info(db_session.get_bind().pool.status()).
def custom_decorator(error_message, db_session):
    def api_decorator(func):
        def api_request(self, *args, **kwargs):
            try:
                # Forward all arguments to the wrapped handler.
                response = func(self, *args, **kwargs)
                db_session.commit()
                return response
            except Exception as err:
                db_session.rollback()
                logger.error(error_message.format(err))
                return error_response(
                    message="Internal Server Error",
                    status_code=HTTPStatus.INTERNAL_SERVER_ERROR,
                )
            finally:
                # Always return the connection to the pool.
                db_session.close()
        return api_request
    return api_decorator
So I had to create this decorator, which closes db_session automatically. With it, I no longer see any connections left checked out.
You can apply the decorator to a handler method as follows:
@custom_decorator("blah", db_session)
def example(self):
    "some code"
I have a function that connects to a MySQL database and executes a query that takes quite long (approx. 10 min):
import pandas as pd
import sqlalchemy
from sqlalchemy.pool import QueuePool

def foo(connections_string):  # something like "mysql://user:key@host/db"
    statement = "SELECT * FROM largtable"
    conn = None
    df = None
    try:
        engine = sqlalchemy.create_engine(
            connections_string,
            connect_args={
                "connect_timeout": 1500,
            },
            poolclass=QueuePool,
            pool_pre_ping=True,
            pool_size=10,
            pool_recycle=3600,
            pool_timeout=900,
        )
        conn = engine.connect()
        df = pd.read_sql_query(statement, conn)
    except Exception:
        raise Exception("could not load data")
    finally:
        if conn:
            conn.close()
    return df
When I run this in my local environment, it works and takes about 600 seconds. When I run it via Airflow, it fails after about 5 to 6 minutes with the error (_mysql_exceptions.OperationalError) (2013, 'Lost connection to MySQL server during query')
I have tried the suggestions on Stack Overflow for adjusting SQLAlchemy's timeouts (e.g., this and this) and those from the SQLAlchemy docs, which led to the additional arguments (pool_* and connect_args) for the create_engine() function. However, these didn't seem to have any effect at all.
I've also tried replacing SQLAlchemy with pymysql, which led to the same error on Airflow. I haven't tried flask-sqlalchemy, since I expect the same result.
Since it works in basically the same environment (Python 3.7.x, SQLAlchemy 1.3.3 and pandas 1.3.x) when not run by Airflow but fails when run by Airflow, I think some global setting overrides my timeout settings. But I have no idea where to start looking.
Some additional info, in case it helps someone: I got it running with Airflow twice in off-hours (at 5 am and on a Sunday), but never again since.
PS: unfortunately, pagination as suggested here is not an option, since the query runtime results from transformations and calculations.
I'm using Python to connect to a DB. This code used to work, but something in my environment changed so that the host is no longer present/accessible. That part is expected. What I'm trying to work out is why I can't seem to catch the resulting error. This is my code:
def create_db_connection(self):
    try:
        message('try...')
        DB_HOST = os.environ['DB_HOST']
        DB_USERNAME = os.environ['DB_USERNAME']
        DB_PASSWORD = os.environ['DB_PASSWORD']
        message('connecting...')
        db = mysql.connector.connect(
            host=DB_HOST,
            user=DB_USERNAME,
            password=DB_PASSWORD,
            auth_plugin='mysql_native_password'
        )
        message('connected...')
        return db
    except mysql.connector.Error as err:
        log.info('bad stuff happened...')
        log.info("Something went wrong: {}".format(err))
        message('exception connecting...')
    except Exception as ex:
        log.info('something bad happened')
        message("Exception: {}".format(ex))
    message('returning false connection...')
    return False
I see up to the message('connecting...') call, but nothing afterwards. Also, I don't see any of the except messages/logs at all.
Is there something else I need to catch/check in order to know that a DB connection attempt has failed?
This is running inside an AWS Lambda and was working until I changed some subnets/etc. The key thing is I want to catch it no longer being able to connect.
The issue is most likely that your lambda function is timing out before the database connection is timing out.
First, modify the Lambda function's timeout so it can run for 60 seconds, then test. After about 30 seconds you should see the connection to the database time out.
To resolve the issue, modify the security group on the database instance to include the security group configured for the Lambda function, and use that entry to open the correct port, 3306.
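One way to make the failure visible inside the function's own time budget is to probe the database port with a short timeout before connecting. The sketch below uses only the standard library (Python 3); can_reach is a hypothetical helper name, and it only checks TCP reachability, not credentials or the MySQL handshake.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout`.

    Hypothetical helper for illustration. A blocked security group
    typically shows up as a timeout, a closed port as a refusal; both
    are subclasses of OSError and end up returning False here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage sketch (DB_HOST comes from the question's environment):
# if not can_reach(DB_HOST, 3306, timeout=3.0):
#     log.info('database host is not reachable')
```

As far as I know, mysql.connector.connect() also accepts a connection_timeout argument; setting it below the function's timeout should let the existing except mysql.connector.Error branch run.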
I have been having major trouble connecting my python shell to my postgres. I am doing this on windows. I have downloaded psycopg2 and everything for this to process, however it still is not working.
import psycopg2
conn=psycopg2.connect("dbname = 'test' user ='postgres' host ='localhost' password = 'mypassword'")
It gives me an error telling me that the database "test" does not exist, however it does! If you guys have any advice at all on what I should test out, that would be amazing. Thank you!
You can lay out the connection parameters as a single string and pass it to the connect() function, like this:
conn = psycopg2.connect("dbname=test user=postgres password=postgres")
Or you can pass them as keyword arguments:
conn = psycopg2.connect(host="localhost",database="test", user="postgres", password="postgres")
If it still fails, check the PostgreSQL side. Try connecting to the database in question from the command line and see whether the error reappears. If it does, something is missing on the DB server side.
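If you build the connection string yourself, values that are empty or contain spaces need single quotes, and single quotes or backslashes inside a value need a backslash in front of them. A small sketch of that rule in Python 3 follows; build_dsn and quote_dsn_value are hypothetical helpers, not part of psycopg2, and the password shown is made up.

```python
def quote_dsn_value(value):
    """Quote one value for a keyword=value connection string."""
    text = str(value)
    needs_quotes = (
        text == ""
        or any(ch.isspace() for ch in text)
        or "'" in text
        or "\\" in text
    )
    if not needs_quotes:
        return text
    # Escape backslashes first, then single quotes.
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_dsn(**params):
    """Join keyword arguments into a space-separated connection string."""
    return " ".join(
        "{}={}".format(key, quote_dsn_value(val)) for key, val in params.items()
    )


dsn = build_dsn(dbname="test", user="postgres", host="localhost",
                password="my pass")
# conn = psycopg2.connect(dsn)
```

Passing keyword arguments straight to connect(), as in the second form above, avoids the quoting question entirely.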
I'm creating a RESTful API which needs to access the database. I'm using Restish, Oracle, and SQLAlchemy. However, I'll try to frame my question as generically as possible, without taking Restish or other web APIs into account.
I would like to be able to set a timeout for a connection executing a query. This is to ensure that long running queries are abandoned, and the connection discarded (or recycled). This query timeout can be a global value, meaning, I don't need to change it per query or connection creation.
Given the following code:
import cx_Oracle
import sqlalchemy.pool as pool

conn_pool = pool.manage(cx_Oracle)
conn = conn_pool.connect("username/p4ss@dbname")
conn.ping()
cursor = conn.cursor()
try:
    cursor.execute("SELECT * FROM really_slow_query")
    print(cursor.fetchone())
finally:
    cursor.close()
How can I modify the above code to set a query timeout on it?
Will this timeout also apply to connection creation?
This is similar to what java.sql.Statement's setQueryTimeout(int seconds) method does in Java.
Thanks
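For the query, you can use a timer together with a conn.cancel() call.
Something along these lines:
import threading

t = threading.Timer(timeout, conn.cancel)  # cancel the running call after timeout seconds
t.start()
try:
    cursor = conn.cursor()
    cursor.execute(query)
    res = cursor.fetchall()
finally:
    t.cancel()  # stop the timer once the query has finished or failed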
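The timer pattern above can be wrapped in a small context manager so the timer is always stopped, even when the query raises. This is a sketch in Python 3 using only the standard library; cancel_after is a hypothetical name, and the cancel argument is whatever callable aborts the running call (conn.cancel in the cx_Oracle case described above).

```python
import threading
from contextlib import contextmanager


@contextmanager
def cancel_after(seconds, cancel):
    """Call `cancel()` from a background thread if the block runs too long.

    Hypothetical helper for illustration; `cancel` is any callable that
    aborts the work in progress.
    """
    timer = threading.Timer(seconds, cancel)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    try:
        yield timer
    finally:
        # Stop the timer whether the block finished or raised.
        timer.cancel()


# Usage sketch with the names from the answer above:
# with cancel_after(timeout, conn.cancel):
#     cursor = conn.cursor()
#     cursor.execute(query)
#     res = cursor.fetchall()
```

The cancelled call still has to raise in the main thread for control to leave the block, so the caller should expect a driver error when the timeout fires.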
On Linux, see /etc/oracle/sqlnet.ora:
sqlnet.outbound_connect_timeout = value
Other relevant options are:
tcp.connect_timeout and sqlnet.expire_time. Good luck!
You could look at setting up PROFILEs in Oracle to terminate the queries after a certain number of logical_reads_per_call and/or cpu_per_call
Timing Out with the System Alarm
Here's how to use the operating system's alarm signal to do this. It's generic and works for things other than Oracle, but only on Unix-like systems.
import signal
import cx_Oracle

class TimeoutExc(Exception):
    """This exception is raised when there's a timeout."""

def alarmhandler(signum, frame):
    """SIGALRM handler; raises a TimeoutExc exception."""
    raise TimeoutExc()

nsecs = 5
signal.signal(signal.SIGALRM, alarmhandler)  # set the signal handler function
signal.alarm(nsecs)  # in 5 s, the process receives a SIGALRM
try:
    cx_Oracle.connect(...)  # placeholder: do your thing, connect, query, etc.
    signal.alarm(0)  # if successful, turn off the alarm
except TimeoutExc:
    print("timed out!")
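The same alarm technique can be packaged as a context manager that always clears the alarm and restores the previous handler. This is a sketch in Python 3, assuming a Unix-like system and code running in the main thread (Python only allows signal handlers to be installed there); time_limit is a hypothetical name. Note that Python runs the handler between bytecode instructions, so a driver blocked inside C code may only notice the timeout once that call returns.

```python
import signal
from contextlib import contextmanager


class TimeoutExc(Exception):
    """Raised when the time limit expires."""


@contextmanager
def time_limit(seconds):
    """Raise TimeoutExc inside the block after `seconds` (may be a float)."""

    def on_alarm(signum, frame):
        raise TimeoutExc()

    # Install our handler and remember the one that was there before.
    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Clear the pending alarm and restore the old handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


# Usage sketch:
# try:
#     with time_limit(5):
#         ...  # connect, query, etc.
# except TimeoutExc:
#     print("timed out!")
```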