In Django, how can I set db connection timeout?

OK, I know it's not that simple. I have two DB connections defined in my settings.py: default and cache. I'm using the DatabaseCache backend from django.core.cache. I have a database router defined so I can use a separate database/schema/table for my models and for the cache. Perfect!
Now sometimes my cache DB is not available and there are two cases:
1. Connection to the database was already established when the DB crashed. This is easy: I can use this recipe http://code.activestate.com/recipes/576780-timeout-for-nearly-any-callable/ and wrap my query like this:
try:
    timelimited(TIMEOUT, self._meta.cache.get, cache_key)
except TimeLimitExpired:
    # live without cache
    pass
2. Connection to the database hasn't been established yet, so I need to wrap in timelimited whatever portion of code actually establishes the database connection. But I don't know where that code lives or how to wrap it selectively (i.e. wrap only the cache connection and leave the default connection without a timeout).
Do you know how to do point 2?
Please note, this answer https://stackoverflow.com/a/1084571/940208 is not correct:
grep -R "connect_timeout" /usr/local/lib/python2.7/dist-packages/django/db
gives no results, and the cx_Oracle driver doesn't support this parameter as far as I know.
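For what it's worth, one way to attack point 2 is to force connection establishment yourself under the same time limit, before running the query. A minimal sketch, assuming the timelimited helper from the recipe above and a Django version that exposes ensure_connection() on the entries of django.db.connections (added around Django 1.6):

from django.db import connections

def cache_get_with_timeout(cache, cache_key, timeout):
    # Establishing the TCP connection is the part that can hang, so force
    # it under the time limit first; ensure_connection() is a no-op if the
    # 'cache' alias is already connected.
    try:
        timelimited(timeout, connections['cache'].ensure_connection)
        return timelimited(timeout, cache.get, cache_key)
    except TimeLimitExpired:
        return None  # live without cache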

Related

How to listen for incoming JDBC connections?

Let's say that a partner and I are working on a Python application that needs to connect to a database, for example MySQL:
import mysql.connector

mydb = mysql.connector.connect(host="databaseURL", user="user", passwd="userPassword")
I am trying to hide my credentials from the code, but referencing them in any way (from a file, an environment variable, substitution, etc.) doesn't work, since he can simply print the value or pull it from memory, and clearing memory isn't an option in my use case.
One idea I thought about is using a proxy that sits between the Python app and the database; this way I could connect to the DB with proxy credentials instead. For example:
import mysql.connector
mydb = mysql.connector.connect(host="proxyURL",user="proxyUser",passwd="proxyPassword")
Basically, if the proxy credentials are valid, the proxy requests the actual credentials and uses them to connect to the database.
The difficulty I ran into is how to make a server listen for incoming JDBC connections, i.e. how to create a JDBC proxy. And is the approach even correct in the first place?
Thinking about it, I don't think that's even possible, since each database has its own JDBC driver that communicates with the database over its own wire protocol. This means I would need to write a different proxy for each database.
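One thing worth noting: JDBC is a client-side API, not a wire protocol. What actually travels over the socket is the database's own protocol (MySQL's, here), so a proxy only has to speak that, and protocol-aware proxies such as ProxySQL or MySQL Router already exist for MySQL. The sketch below is only a byte-level pass-through to illustrate the shape of the idea; real credential substitution would require implementing the MySQL handshake. Host names and ports are placeholders.

import socket
import threading

def pipe(src, dst):
    # Copy bytes in one direction until either side closes.
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    dst.close()

def serve(listen_port, db_host, db_port):
    # Accept client connections and relay them to the real database server.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("0.0.0.0", listen_port))
    listener.listen(5)
    while True:
        client, _ = listener.accept()
        upstream = socket.create_connection((db_host, db_port))
        threading.Thread(target=pipe, args=(client, upstream)).start()
        threading.Thread(target=pipe, args=(upstream, client)).start()

serve(3307, "databaseURL", 3306)  # clients point at the proxy's port 3307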

Postgres closes connection during query after a few hundred seconds when using Psycopg2

I'm running PostgreSQL 9.6 (in Docker, using the postgres:9.6.13 image) and psycopg2 2.8.2.
My PostgreSQL server (local) hosts two databases. My goal is to create materialized views in one of the databases that use data from the other database using Postgres's foreign data wrappers. I do all this from a Python script that uses psycopg2.
This works well as long as creating the materialized view does not take too long (i.e. if the amount of data being imported isn't too large). However, if the process takes longer than roughly ~250 seconds, psycopg2 throws the exception
psycopg2.OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
No error message (or any message concerning this whatsoever) can be found in Postgres's logs.
Materialized view creation completes successfully if I do it from an SQL client (Postico).
This code illustrates roughly what I'm doing in the Python script:
import psycopg2 as pg

db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port
)
with db.cursor() as c:
    c.execute("""
        CREATE EXTENSION IF NOT EXISTS postgres_fdw;
        CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
        CREATE USER MAPPING FOR CURRENT_USER SERVER fdw OPTIONS (...);
        CREATE SCHEMA "foreign";  -- quoted because FOREIGN is a reserved word
        IMPORT FOREIGN SCHEMA foreign_schema FROM SERVER fdw INTO "foreign";
    """)
    c.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS my_view AS (
            SELECT (...)
            FROM "foreign".foreign_table
        );
    """)
Adding the keepalive parameters to the psycopg2.connect call seems to have solved the problem:
self.db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port,
    keepalives=1,
    keepalives_idle=30,
    keepalives_interval=10,
    keepalives_count=5
)
I still don't know why this is necessary. I can't find anyone else describing having to use the keepalive keyword parameters when running Postgres in Docker just to be able to run queries that take longer than 4-5 minutes, but maybe it's obvious enough that nobody has noted it?
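For reference, these keepalive settings are standard libpq connection parameters, so the same thing can be expressed as a DSN string instead of keyword arguments (a sketch reusing the config object from above):

dsn = (
    "dbname={} user={} password={} host={} port={} "
    "keepalives=1 keepalives_idle=30 keepalives_interval=10 keepalives_count=5"
).format(config.db_name, config.db_user, config.db_password,
         config.db_host, config.db_port)
db = pg.connect(dsn)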
We encountered the same issue, and resolved it by adding net.ipv4.tcp_keepalive_time=200 to our docker-compose.yml file:
services:
  myservice:
    image: myimage
    sysctls:
      - net.ipv4.tcp_keepalive_time=200
From what I understand, this makes the kernel send a TCP keepalive probe after 200 seconds of idleness, which is less than the time after which the connection would be dropped (300 seconds?), thus preventing it from being dropped.
It might be that PostgreSQL 9.6 kills your connections after the new timeout mentioned at https://stackoverflow.com/a/45627782/1587329. In that case, you could set statement_timeout in postgresql.conf, but that is not recommended.
It might work in Postico because the value has been set there.
To get the error logged, you need to set log_min_error_statement to ERROR or lower.
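If editing postgresql.conf by hand is inconvenient, that logging setting can also be changed from psycopg2 itself (a sketch; it assumes superuser rights, and ALTER SYSTEM must run outside a transaction):

# Raise logging so failed statements show up in Postgres's logs.
db.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
with db.cursor() as c:
    c.execute("ALTER SYSTEM SET log_min_error_statement = 'error';")
    c.execute("SELECT pg_reload_conf();")  # apply without a server restart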

Flask-SQLAlchemy "MySQL server has gone away" when using HAproxy

I've built a small Python REST service using Flask, with Flask-SQLAlchemy used for talking to the MySQL DB.
If I connect directly to the MySQL server everything is good, no problems at all. If I use HAproxy (handles HA/failover, though in this dev environment there is only one DB server) then I constantly get MySQL server has gone away errors if the application doesn't talk to the DB frequently enough.
My HAproxy client timeout is set to 50 seconds, so what I think is happening is that HAproxy cuts the connection, but the application isn't aware of this and tries to use an invalid connection.
Is there a setting I should be using when using services like HAproxy?
Also, it doesn't seem to reconnect automatically; if I then issue a request manually I get Can't reconnect until invalid transaction is rolled back. That's odd, since it's just a select() call I'm making, so I don't think there's a commit() I'm missing - or should I be calling commit() after every ORM-based query?
Just to tidy up this question with an answer, I'll post what I (think I) did to solve the issues.
Problem 1: HAproxy
Either increase the HAproxy client timeout value (globally, or in the frontend definition) to a value longer than the interval after which MySQL resets the connection (see this interesting and related SF question),
or set SQLALCHEMY_POOL_RECYCLE = 30 in Flask's app.config (30 in my case being less than the HAproxy client timeout), so that when the DB is initialised it picks up that setting and recycles connections before HAproxy cuts them itself; see the sketch below. Similar to this issue on SO.
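A minimal sketch of that setting (the value is illustrative; it just needs to stay below HAproxy's client timeout, 50 seconds here):

# Recycle pooled connections before HAproxy's client timeout can cut them.
app.config['SQLALCHEMY_POOL_RECYCLE'] = 30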
Problem 2: Can't reconnect until invalid transaction is rolled back
I believe I fixed this by tweaking the way the DB is initialised and imported across various modules. I basically now have a module that simply has:
from flask.ext.sqlalchemy import SQLAlchemy
db = SQLAlchemy()
Then in my main application factory I simply:
from common.database import db
db.init_app(app)
Also, since I wanted to load table structures automatically, I initialised the metadata bind within the app context. I think it was this that cleanly handled the commit() issue/error I was getting, as I believe the database sessions are now being correctly terminated after each request.
with app.app_context():
    # Setup DB binding
    db.metadata.bind = db.engine
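Putting the pieces together, the application factory ends up looking roughly like this (a sketch; create_app and the config values are assumptions, not the original code):

from flask import Flask
from common.database import db

def create_app():
    app = Flask(__name__)
    app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://...'  # placeholder
    app.config['SQLALCHEMY_POOL_RECYCLE'] = 30  # below the HAproxy timeout
    db.init_app(app)
    with app.app_context():
        # Bind metadata so table structures can be loaded automatically.
        db.metadata.bind = db.engine
    return app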

cx_Oracle-like package for Clojure

I'm coming from a very heavy Python->Oracle development environment and have been playing around with Clojure quite a bit. I love the ease of access that cx_Oracle gives me to the database on the Python end and was wondering if Clojure has something similar.
Specifically what I'm looking for is something that gives me easy access to a database connection, à la cx_Oracle's "username/password@tns_name" format.
The best I've come up with so far is:
(defn get-datasource [user password server service]
  {:datasource (clj-dbcp.core/make-datasource
                 {:adapter :oracle
                  :style :service-name
                  :host server
                  :service-name service
                  :user user
                  :password password})})
This requires the server name, however, and 95% of my users don't know which server they're hitting, just the TNS name from tnsnames.ora.
In addition, I don't understand when I have a database connection and when it disconnects. With cx_Oracle I either had to do a with cx_Oracle.connect()... or a connection.close() to close the connection.
Can someone give me guidance as to how datasources work as far as connections go and the easiest way to connect to a database given a username, password, and tns alias?
Thanks!!
Best to use Clojure's most idiomatic database library, clojure.java.jdbc.
First, because the Oracle JDBC driver isn't available from a public Maven repository, we need to download the latest one and install it in our local repository using the lein-localrepo plugin:
lein localrepo install -r D:\Path\To\Repo\
D:\Path\To\ojdbc6.jar
oracle.jdbc/oracledriver "12.1.0.1"
Now we can reference it in our project.clj, together with clojure.java.jdbc.
(defproject oracle-connect "0.1.0-SNAPSHOT"
:dependencies [[org.clojure/java.jdbc "0.3.3"]
[oracle.jdbc/oracledriver "12.1.0.1"]])
After starting a REPL we can connect to the database through a default host/port/SID connection:
(ns oracle-connect
(:require [clojure.java.jdbc :as jdbc]))
(def db
  {:classname "oracle.jdbc.OracleDriver"
   :subprotocol "oracle:thin"
   :subname "@hostname:port:sid"
   :user "username"
   :password "password"})
(jdbc/query db ["select ? as one from dual" 1])
db is just a basic map, referred to as the db-spec. It is not a real connection, but it has all the information needed to make one. clojure.java.jdbc makes one when needed, for instance in (jdbc/query db ...).
We need to enter the classname manually because clojure.java.jdbc doesn't have a default mapping between the subprotocol and the classname for Oracle. This is probably because the Oracle JDBC driver has both thin and OCI JDBC connection options.
To make a connection with a TNS named database, the driver needs the location of the tnsnames.ora file. This is done by setting a system property called oracle.net.tns_admin.
(System/setProperty "oracle.net.tns_admin"
"D:/oracle/product/12.1.0.1/db_1/NETWORK/ADMIN")
Once this is set, all we need for :subname is the TNS name of the database.
(def db
  {:classname "oracle.jdbc.OracleDriver"
   :subprotocol "oracle:thin"
   :subname "@tnsname"
   :user "username"
   :password "password"})
(jdbc/query db ["select ? as one from dual" 1])
Now on to the 'how do connections work' part. As stated earlier, clojure.java.jdbc creates connections when needed, for instance within the query function.
If all you want to do is transform the results of a query, you can pass two extra optional named parameters: :row-fn and :result-set-fn. Every row is transformed with the :row-fn, after which the whole result set is transformed with the :result-set-fn.
Both of these are executed within the context of the connection, so the connection is guaranteed to be open until all these actions have been performed, unless these functions return lazy sequences.
By default the :result-set-fn is doall, which guarantees that all results are realized, but if you redefine it, be sure to realize all lazy results. Usually, when you get a connection-closed or resultset-closed exception while using results outside of that scope, the problem is that you didn't.
The connection only exists within the scope of the query function; at the end it is closed. This means that every query opens its own connection. If you want multiple queries done over one connection, you can wrap them in a with-db-connection:
(jdbc/with-db-connection [c db]
  (doall (map #(jdbc/query c ["select * from EMP where DEPTNO = ?" %])
              (jdbc/query c ["select * from DEPT"] :row-fn :DEPTNO))))
In the with-db-connection binding you bind the db-spec to a local, and use that local instead of the db-spec in the statements inside the binding's scope. It creates a connection and adds it to the bound spec, so the other statements reuse that connection. This is especially handy when building dynamic queries based on the results of other queries.
The same goes for with-db-transaction. It has the same semantics as with-db-connection, but its scope not only guarantees that the same connection is used; it also guarantees that either all statements succeed or none do, by wrapping them in a transaction block. Both with-db-connection and with-db-transaction are nestable.
There are also more advanced options, like creating connection pools and having query et al. draw a connection from the pool instead of creating or reusing single connections. See the clojure-doc.org documentation for those.

Testing Memcached connection

I want to run a basic service which shows the status of various other services on the system (e.g. Mongo, Redis, Memcached, etc.).
For Memcached, I thought that I might do something like this:
from django.core.cache import cache
host, name = cache._cache._get_server('test')
This seems to return a host and an arbitrary string. Does the host object confirm that I'm connecting to Memcached successfully?
I know that the returned host object has a connect() method. I'm slightly scared to open a new connection in production environment and I don't have an easy Dev setup to test that method. I assume that it's in one of the Python Memcached libraries, but I'm not sure which one is relevant here.
Can I just use the _get_server method to test Memcached connection success, or should I use the connect method?
There are various things you could monitor, like the memcached process being up, the memcached logs moving, etc., that don't require network connectivity. The next level of testing would be to check that you can open a socket on the memcached port. But the real test is of course to set and get a value from memcached. For that kind of testing I would just use the python-memcached package directly to make the connection and set and get values, as sketched below.
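A minimal round-trip check along those lines, assuming the python-memcached package and a memcached instance reachable at the default address (both assumptions, not from the original answer):

import memcache

def memcached_up(server="127.0.0.1:11211"):
    # Round-trip a value; merely opening a socket wouldn't prove that
    # memcached is actually serving requests.
    mc = memcache.Client([server], socket_timeout=2)
    mc.set("healthcheck", "ok", time=10)  # expires after 10 seconds
    alive = mc.get("healthcheck") == "ok"
    mc.disconnect_all()
    return alive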
