IBM Db2 connection gets closed after some time - Python

I'm trying to connect to Db2 (ibm_db). The connection is successful and I'm able to make changes in the database, but after a while the connection gets closed. I'm not closing the connection anywhere.
It throws this error:
[IBM][CLI Driver] CLI0106E Connection is closed. SQLSTATE=08003 SQLCODE=-99999
2019-04-11 03:11:20,558 - INFO - werkzeug - 9.46.72.43 - - [11/Apr/2019 03:11:20] POST 200
Here is my code (not exact, but something similar):
import ibm_db

conn = ibm_db.connect("database", "username", "password")

def update():
    stmt = ibm_db.exec_immediate(conn, "UPDATE employee SET bonus = '1000' WHERE job = 'MANAGER'")
How do I keep the connection alive the whole time, i.e. for as long as the service is running?

Your design of making a single connection when the service starts is unsuitable for long-running services.
There's nothing you can do to stop the other end (i.e. the Db2-server, or any intervening gateway) from closing the connection, and connections can get closed for a variety of reasons. For example, the Db2-server may be configured to discard idle sessions, or sessions that break site-specific workload-management rules; network issues can make connections unavailable; service-management actions can force connections off; and so on.
Check out the ibm_db.pconnect method (a persistent connection) to see if it helps you. Otherwise consider a better design such as connection-pooling or reconnect-on-demand, as sketched below.
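As an illustration, here is a minimal reconnect-on-demand sketch. The DSN is a placeholder, and it uses ibm_db.active() to test the cached connection before each statement; ibm_db.active() only checks the client-side state, so a fully robust version would also catch and retry on execution errors:
import ibm_db

# Hypothetical DSN; substitute your own database, host, and credentials.
DSN = "DATABASE=mydb;HOSTNAME=dbhost;PORT=50000;UID=username;PWD=password;"

_conn = None

def get_connection():
    """Return a live connection, reconnecting if the cached one has died."""
    global _conn
    if _conn is None or not ibm_db.active(_conn):
        _conn = ibm_db.connect(DSN, "", "")
    return _conn

def update():
    stmt = ibm_db.exec_immediate(
        get_connection(),
        "UPDATE employee SET bonus = '1000' WHERE job = 'MANAGER'",
    )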

Related

How to overcome the 2hr connection timeout (OperationalError) using SQLAlchemy and Postgres?

I'm trying to execute some long-running SQL queries using SQLAlchemy against a Postgres database hosted on AWS RDS.
from sqlalchemy import create_engine

conn_str = 'postgresql://user:password@db-primary.cluster-cxf.us-west-2.rds.amazonaws.com:5432/dev'
engine = create_engine(conn_str)

sql = 'UPDATE "Clients" SET "Name" = NULL'

# this takes about 4 hrs to execute if run in pgAdmin
with engine.begin() as conn:
    conn.execute(sql)
After running for exactly 2 hours, the script errors out with
OperationalError: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
(Background on this error at: https://sqlalche.me/e/14/e3q8)
I have tested setting connection timeouts in SQLAlchemy (based on How to set connection timeout in SQLAlchemy). This did not make a difference.
I have looked up the timeout settings on the Postgres server (based on https://dba.stackexchange.com/questions/164419/is-it-possible-to-limit-timeout-on-postgres-server), but both statement_timeout and idle_in_transaction_session_timeout are set to 0, meaning there are no set limits.
I agree with @jjanes. This smells like a TCP connection timeout issue. Somewhere in the network layer, something (be it a NAT or a firewall) probably dropped your TCP connection, leaving the code to wait for the full TCP keepalive timeout before it sees the connection as closed. This usually happens when the network topology between the client and the database is complicated, for example when there is a company firewall or some sort of interconnection in between. pgAdmin may come with a pre-configured TCP keepalive setting, which would explain why it was not affected, but I'm not sure.
The other timeouts didn't kick in because, as I understand it, the TCP timeout operates at layer 4, which overshadows the other timeouts, which live at layer 7, the application layer.
You could try adding the keepalive parameters to your connection string and see if that resolves the issue. For example:
postgresql://user:password@db-primary.cluster-cxf.us-west-2.rds.amazonaws.com:5432/dev?keepalives_idle=1&keepalives_count=1&tcp_user_timeout=1000
Note the keepalive parameters at the end. For your reference, here's the explanation of those parameters:
https://www.postgresql.org/docs/current/runtime-config-connection.html
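Equivalently, the same libpq keepalive parameters can be passed through SQLAlchemy's connect_args rather than the URL query string. A minimal sketch; the values below are examples rather than tuned recommendations:
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:password@db-primary.cluster-cxf.us-west-2.rds.amazonaws.com:5432/dev",
    connect_args={
        "keepalives": 1,            # enable TCP keepalives
        "keepalives_idle": 60,      # seconds of inactivity before the first probe
        "keepalives_interval": 10,  # seconds between unanswered probes
        "keepalives_count": 3,      # unanswered probes before the connection is dropped
    },
)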

Python long idle connection in cx_Oracle getting: DPI-1080: connection was closed by ORA-3113

I have a long-running Python executable that opens an Oracle connection using cx_Oracle on start.
After the connection has been idle for more than 45-60 minutes, it gets this error.
Any ideas, or is any special setup required in cx_Oracle?
Instead of leaving a connection unused in your application, consider closing it when it isn't needed, and then reopening when it is needed. Using a connection pool would be recommended, since pools can handle some underlying failures such as yours and will give you a usable connection.
At application initialization start the pool once:
import cx_Oracle

pool = cx_Oracle.SessionPool("username", pw,
                             "localhost/orclpdb1", min=0, max=4, increment=1)
Then later get the connection and hold it only when you need it:
with pool.acquire() as connection:
    cursor = connection.cursor()
    for result in cursor.execute(
            """select sys_context('userenv','sid') from dual"""):
        print(result)
The end of the with block will release the connection back to the pool. It won't be closed. The next time acquire() is called, the pool can check whether the connection is still usable. If it isn't, it will give you a new one. Because of these checks, the pool is useful even if you only have one connection.
See my blog post Always Use Connection Pools — and How, most of which applies to cx_Oracle.
But if you don't want to change your code, then try setting the Oracle Network parameter EXPIRE_TIME as shown in the cx_Oracle documentation. This can be set in various places. In C-based Oracle clients like cx_Oracle:
With 18c client libraries it can be added as (EXPIRE_TIME=n) to the DESCRIPTION section of a connect descriptor
With 19c client libraries it can additionally be used via Easy Connect: host/service?expire_time=n.
With 21c client libraries it can additionally be used in a client-side sqlnet.ora file
This may not always help, depending on what is closing the connection.
Fundamentally you should fix the root cause, which could be a firewall timeout, a DBA-imposed user resource limit, or a DB idle-time limit.
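For illustration, a minimal sketch of the Easy Connect variant (assuming 19c or later client libraries; the host and service name are placeholders):
import cx_Oracle

# expire_time=2 asks the client to send a keepalive probe every 2 minutes.
conn = cx_Oracle.connect(
    "username", "password",
    "dbhost.example.com/orclpdb1?expire_time=2",
)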

How to verify if a mysql database is reachable quickly in python?

I have a MySQL server running on my local network that isn't reachable off the network, and it needs to stay like this.
When I am on a different network, the following code hangs for about 5-10 seconds; my guess is that it's retrying the connection a number of times:
import mysql.connector

conn = mysql.connector.connect(
    host="Address",
    user="user",
    password="password",
    database="database"
)
Is there a way to "ping" the MySQL server before running this code to verify that the server is reachable, or to limit the number of retries?
At the moment I am having to use a try-except clause to catch the case where the server is not reachable.
Instead of trying to implement specific behavior before connecting, adjust the connect timeout so that you don't have to wait - for your purposes, the server is effectively down if you can't connect within a short timeframe anyway.
You can use connection_timeout to adjust the socket timeout used when connecting to the server.
If you set it to a low value (seems like it's in seconds - so 1 should work fine) you'll get the behavior you're looking for (and it will also help you catch any issues with the user/password/database values).
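A minimal sketch of that approach (the host and credentials are the placeholders from the question):
import mysql.connector
from mysql.connector import Error

try:
    conn = mysql.connector.connect(
        host="Address",
        user="user",
        password="password",
        database="database",
        connection_timeout=1,  # give up on the socket after ~1 second
    )
except Error as err:
    print(f"MySQL server is not reachable: {err}")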

SQLAlchemy SSL SYSCALL timeout coping mechanism

I'm using a combination of SQLAlchemy and Postgres. Every once in a while my database cluster replaces a failing node; circle of life, I guess.
I was under the impression that by configuring my engine in the following manner:
from sqlalchemy import create_engine

engine = create_engine(
    env_config.pg_connection_string,
    echo=False,
    pool_size=env_config.pg_pool_size,
    pool_timeout=1,     # seconds to wait before giving up on getting a connection from the pool
    pool_recycle=3600,  # replace connections on CHECKOUT after 1 hour
    connect_args={
        'connect_timeout': 10,                  # maximum wait for connection
        "options": "-c statement_timeout=30s",  # maximum amount of time allowed per statement
    },
)
my queries would time out after 30 seconds and my connection attempts would time out after 10 seconds.
What I'm noticing in practice is that when a db node is being replaced in my cluster, it sometimes takes 15 minutes (900 s) before an exception like psycopg2.DatabaseError: SSL SYSCALL error: No route to host is raised. If a db transaction is active while the node is being replaced, it can take up to 16 minutes for the SYSCALL exception to be raised. All new transactions are handled well and, I assume, routed to the right host, but existing sessions/transactions seem to block and stall for up to 16 minutes.
My explanation would be that an SSL SYSCALL failure is neither a connection-related nor a statement-related event, so neither configured timeout has any effect. My question remains: how do I stop or time out these SSL SYSCALL stalls? I would rather fail quickly and retry the same query than spend 15 minutes in a blocking call. I'm not sure where to resolve this; I'm guessing either in my DB layer (Postgres, SQLAlchemy, or the db driver) or in some network-layer configuration (CentOS).
Some more digging in my Postgres configuration reveals that the TCP-related settings tcp_keepalives_count and tcp_keepalives_interval are 6 and 10 respectively, which makes me wonder why the connection wasn't killed after 60 seconds. Also, is it even possible to receive TCP ACKs when there is no 'route to host', as the SSL SYSCALL error suggests?
Unless someone else has a more fitting explanation, I'm convinced my issue is caused by a combination of TCP tcp_retries2 and the non-graceful halting of open db connections. Whenever my primary db node is replaced it is nuked from the cluster, and any established connections with that node are left open / in the ESTABLISHED state. With the current default TCP settings it can take up to 15 minutes before such a connection is dropped; I'm not really sure why this manifests as an SSL SYSCALL exception, though.
This problem is covered really well in one of the issues/PRs on the PgBouncer repo: https://github.com/pgbouncer/pgbouncer/issues/138 (TCP connections taking a long time before being marked / considered 'dead'). I suggest reading that page to get a better understanding; my assumption is that my issue is also caused by the default TCP settings.
Long story short, I consider myself to have two options:
Manually tune the TCP settings on my host; this will affect every other TCP-using component on that machine.
Set up something like PgBouncer so the TCP tuning can be done locally to the service, without affecting anything else on that machine.
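As an aside, the tcp_user_timeout libpq parameter mentioned in an earlier answer offers a per-connection middle ground on Linux clients (it ships with PostgreSQL 12+ client libraries). A sketch, with an illustrative value rather than a recommendation:
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:password@dbhost:5432/dev",  # placeholder URL
    connect_args={
        # Fail reads/writes on a dead peer after 60 s instead of waiting
        # out the kernel's tcp_retries2 default (~15 minutes on Linux).
        "tcp_user_timeout": 60000,  # milliseconds
    },
)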

Why does PyMongo throw AutoReconnect?

While researching some strange issues with my Python web application (in particular, issues regarding MongoDB connectivity), I noticed something on the official PyMongo documentation page. My web application uses Flask, but this shouldn't influence the issue I'm facing.
The PyMongo driver does connection pooling, but it also throws an exception (AutoReconnect) when a connection is stale and a reconnect is due.
It states that (regarding the AutoReconnect exception):
In order to auto-reconnect you must handle this exception, recognizing
that the operation which caused it has not necessarily succeeded.
Future operations will attempt to open a new connection to the
database (and will continue to raise this exception until the first
successful connection is made).
I have noticed that this actually happens constantly (and it doesn't seem to be an error). Connections are closed by the MongoDB server after what seems like several minutes of inactivity, and need to be recreated by the web application.
What I don't understand is why the PyMongo driver throws an error when it reconnects (which the user of the driver needs to handle themselves), instead of doing it transparently. (There could even be an option the user could set so that AutoReconnect exceptions do get thrown, but wouldn't a sensible default be that these exceptions don't get thrown at all and the connections are recreated seamlessly?)
I have never encountered this behavior using other database systems, which is why I'm a bit confused.
It's also worth mentioning that my web application's MongoDB connections never fail when connecting to my local development MongoDB server (I assume this has something to do with the fact that it's a local connection made through a UNIX socket instead of a network socket, but I could be wrong).
You're misunderstanding AutoReconnect. It is raised when the driver attempts to communicate with the server (to send a command or other operation) and a network failure or similar problem occurs. The name of the exception is meant to communicate that you do not have to create a new instance of MongoClient, the existing client will attempt to reconnect automatically when your application tries the next operation. If the same problem occurs, AutoReconnect is raised again.
I suspect the reason you are seeing sockets timeout (and AutoReconnect being raised) is that there is a load balancer between the server and your application that closes connections after some period of inactivity. For example, this apparently happens on Microsoft's Azure platform after 13 minutes of no activity on a socket. You might be able to fix this by using the socketKeepAlive option, added in PyMongo 2.8. Note that you will also have to set the keepalive interval on your application server to an appropriate value (the default on Linux is 2 hours). See here for more information.
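To illustrate the handling the documentation asks for, a minimal sketch (the URI, database, and collection names are placeholders; socketKeepAlive follows the PyMongo 2.8-era spelling mentioned above). The operation is retried once, which is safe here because the read is idempotent:
from pymongo import MongoClient
from pymongo.errors import AutoReconnect

client = MongoClient("mongodb://dbhost:27017/", socketKeepAlive=True)

def find_user(user_id):
    try:
        return client.mydb.users.find_one({"_id": user_id})
    except AutoReconnect:
        # The previous attempt may or may not have reached the server.
        # Retrying a read is safe; the client reconnects on this attempt.
        return client.mydb.users.find_one({"_id": user_id})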
