Is it better to do
from pymongo import Connection
conn = Connection()
db = conn.db_name
def contrivedExample(firstName):
    global db
    return db.people.find_one({'first-name': firstName})
or
from pymongo import Connection
def contrivedExample(firstName):
    with Connection() as conn:
        return conn.db_name.people.find_one({'first-name': firstName})
Various basic MongoDB tutorials (Python-oriented or not) imply that an app should connect once at startup; is that actually the case? Does the answer change for non-trivial, long-running applications? Does the answer change for web applications specifically? What are the pros/cons of going single-connection vs. connection-per-request?
Assuming "once at startup" is the right answer, would it be appropriate to start that connection in __init__.py?
The pymongo Connection class supports connection pooling and, as of version 2.2, the auto_start_request option (enabled by default) ensures that the same socket is used for a thread's operations for the lifetime of that thread. There is also built-in support for reconnecting when necessary, although your application code should still handle the immediate exception.
To answer your question, I believe it is preferable to rely on pymongo's own connection pooling and request a new connection per thread. This Stack Overflow thread also discusses some best practices and explains some of the options at play, which you may find helpful. If necessary, you also have the option of sharing the same socket between threads.
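The per-thread socket behavior described above can be pictured with a stdlib-only toy (no pymongo required): threading.local gives each thread its own lazily created resource, which is essentially what pymongo's pool does with sockets when auto_start_request is on. The "socket" here is just an integer id, purely for illustration:

```python
import threading

_local = threading.local()   # per-thread storage
_next_id = [0]
_lock = threading.Lock()

def get_socket():
    # First call in a given thread "checks out" a socket (an int here);
    # later calls in the same thread reuse it, mirroring how pymongo
    # binds one pooled socket per thread for the life of a request.
    if not hasattr(_local, "sock"):
        with _lock:
            _next_id[0] += 1
            _local.sock = _next_id[0]
    return _local.sock

seen = []

def worker():
    # Each thread sees the same value on both calls.
    seen.append((get_socket(), get_socket()))

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each of the three threads ends up with a distinct id, and repeated calls within one thread return the same id, which is the property the pooling behavior relies on.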
Related
I have a long-running Python executable. It opens an Oracle connection using cx_Oracle at startup. After the connection has been idle for more than 45-60 minutes, it gets this error.
Any idea, or is special setup required in cx_Oracle?
Instead of leaving a connection unused in your application, consider closing it when it isn't needed, and then reopening when it is needed. Using a connection pool would be recommended, since pools can handle some underlying failures such as yours and will give you a usable connection.
At application initialization start the pool once:
pool = cx_Oracle.SessionPool("username", pw,
                             "localhost/orclpdb1", min=0, max=4, increment=1)
Then later get the connection and hold it only when you need it:
with pool.acquire() as connection:
    cursor = connection.cursor()
    for result in cursor.execute(
            """select sys_context('userenv','sid') from dual"""):
        print(result)
The end of the with block releases the connection back to the pool; it won't be closed. The next time acquire() is called, the pool can check whether the connection is still usable. If it isn't, it will give you a new one. Because of these checks, the pool is useful even if you only have one connection.
See my blog post "Always Use Connection Pools — and How", most of which applies to cx_Oracle.
But if you don't want to change your code, then try setting an Oracle Network parameter EXPIRE_TIME as shown in the cx_Oracle documentation. This can be set in various places. In C-based Oracle clients like cx_Oracle:
With 18c client libraries it can be added as (EXPIRE_TIME=n) to the DESCRIPTION section of a connect descriptor
With 19c client libraries it can additionally be used via Easy Connect: host/service?expire_time=n.
With 21c client libraries it can additionally be used in a client-side sqlnet.ora file
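For illustration, minimal sketches of the two connect-string forms (assuming a local service named orclpdb1 and a 2-minute probe interval; adjust both to your environment):

```
# Easy Connect form (19c+ client libraries)
localhost/orclpdb1?expire_time=2

# Connect descriptor form (18c+ client libraries)
(DESCRIPTION=
  (EXPIRE_TIME=2)
  (ADDRESS=(PROTOCOL=TCP)(HOST=localhost)(PORT=1521))
  (CONNECT_DATA=(SERVICE_NAME=orclpdb1)))
```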
This may not always help, depending on what is closing the connection.
Fundamentally you should fix the root cause, which could be a firewall timeout, or a DBA-imposed user resource or DB idle-time limit.
I am developing a kind of chat app.
The server side uses Python and PostgreSQL; the clients are iOS (Xcode) and Android (Java), with web as the next phase.
The server program runs continuously on Ubuntu Linux, and I create a thread for every client connection (the server is written in Python). I haven't decided how the DB operations should work:
Should I create one general DB connection and use it for every client's DB operation (insert, update, delete, etc.)? If I do that, I suspect I'll run into locking issues (e.g. when one user fetches the chat message list while another is inserting).
If I create a DB connection whenever a client connects to my server, there will be too many connections, which will hurt performance in the future.
If I create a DB connection before each DB operation, there will be a great deal of connection open/close overhead.
What's your opinion? What's the best way?
The best way would be to maintain a pool of database connections in the server side.
For each request, use the available connection from the pool to do database operations and release it back to the pool once you're done.
This way you will not be creating new db connections for each request, which would be a costly operation.
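To make the acquire/release pattern concrete, here is a toy fixed-size pool built on the stdlib. A real PostgreSQL app would use an existing pool such as psycopg2's ThreadedConnectionPool rather than rolling its own; sqlite3 is used here only so the sketch is self-contained and runnable:

```python
import queue
import sqlite3

class ConnectionPool:
    """Toy fixed-size pool: connections are created once and recycled."""

    def __init__(self, size, factory):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self, timeout=5):
        # Blocks until a connection is free instead of opening a new one,
        # which caps the number of simultaneous DB connections.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        # Return the still-open connection for the next request to reuse.
        self._pool.put(conn)

pool = ConnectionPool(
    4, lambda: sqlite3.connect(":memory:", check_same_thread=False))

# Per-request usage: borrow, query, give back.
conn = pool.acquire()
try:
    result = conn.execute("SELECT 1").fetchone()[0]
finally:
    pool.release(conn)
```

The key design point is that release() returns the connection without closing it, so each client thread pays the connection-setup cost at most once across the whole server lifetime.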
I have an nginx-server with one-hour timeout and a Tornado web-server behind it.
When nginx closes the connection, I have no idea about it in Tornado. I saw this question about closing connections automatically by timeout-event (Implementing and testing WebSocket server connection timeout) and I'm going to use it as a fallback workaround.
My question is: does Tornado have an internal mechanism for invalidating WebSocket connections?
WebSocketHandler has an overridable on_close method, which should be getting called when the connection is closed (most of the time). This method is not 100% reliable (due to the limitations of the underlying network protocols), however, so a timeout-based fallback is recommended. Tornado doesn't have any built-in support for this, though, so you'll have to implement it yourself, perhaps in a manner similar to the answer you linked to.
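As a sketch of such a fallback (generic asyncio, not Tornado-specific; in Tornado you would arm and reset a similar timer from the WebSocketHandler on_message/on_pong callbacks): treat the connection as dead if no frame arrives within a chosen idle window, and close it from your side.

```python
import asyncio

class IdleWatchdog:
    """Fires a callback if reset() isn't called within idle_timeout seconds."""

    def __init__(self, idle_timeout, on_timeout):
        self.idle_timeout = idle_timeout
        self.on_timeout = on_timeout
        self._task = None

    def reset(self):
        # Call on every received frame to push the deadline forward.
        if self._task is not None:
            self._task.cancel()
        self._task = asyncio.create_task(self._expire())

    async def _expire(self):
        await asyncio.sleep(self.idle_timeout)
        self.on_timeout()   # e.g. handler.close() in a real server

closed = []

async def main():
    dog = IdleWatchdog(0.05, lambda: closed.append(True))
    dog.reset()               # connection opened
    await asyncio.sleep(0.01)
    dog.reset()               # a frame arrived: deadline pushed forward
    await asyncio.sleep(0.1)  # silence: the watchdog fires once

asyncio.run(main())
```

The timeout value should comfortably exceed your ping interval so that healthy-but-quiet connections are not killed.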
I'm using redis-py in my Python application to store simple variables or lists of variables in a Redis database, so I thought it would be better to create a connection to the Redis server every time I need to save or retrieve a variable, as this is not done very often and I don't want a permanent connection that might time out.
After reading through some basic tutorials, I created the connections using the Redis class, but have not found a way to close the connection, as this is the first time I'm using Redis. I'm not sure if I'm using the best approach for managing the connections so I would like some advice for this.
This is how I'm setting or getting a variable now:
import redis

def getVariable(variable_name):
    my_server = redis.Redis("10.0.0.1")
    response = my_server.get(variable_name)
    return response

def setVariable(variable_name, variable_value):
    my_server = redis.Redis("10.0.0.1")
    my_server.set(variable_name, variable_value)
I basically use this code to store the last connection time or to get an average of requests per second done to my app and stuff like that.
Thanks for your advice.
Python uses a reference-counting mechanism to deal with objects, so at the end of each function the my_server object will be automatically destroyed and the connection closed. You do not need to close it explicitly.
Now this is not how you are supposed to manage Redis connections. Connecting/disconnecting for each operation is too expensive, so it is much better to maintain the connection opened. With redis-py it can be done by declaring a pool of connections:
import redis

POOL = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)

def getVariable(variable_name):
    my_server = redis.Redis(connection_pool=POOL)
    response = my_server.get(variable_name)
    return response

def setVariable(variable_name, variable_value):
    my_server = redis.Redis(connection_pool=POOL)
    my_server.set(variable_name, variable_value)
Please note connection pool management is mostly automatic and done within redis-py.
@sg1990 what if you have 10,000 users requiring Redis at the same time? They cannot share a single connection, and you've just created yourself a bottleneck.
With a pool of connections you can create an arbitrary number of connections and simply use get_connection() and release(), as described in the redis-py docs.
A connection per user is huge overkill, since every connection needs to maintain an open socket. That way you'd halve the number of, e.g., concurrent WebSocket users your machine can handle.
You can use this to connect to two separate logical databases in Redis:
r1 = redis.StrictRedis(host="localhost", port=6379, db=0, decode_responses=True)
r2 = redis.StrictRedis(host="localhost", port=6379, db=1, decode_responses=True)
I know pymongo is thread safe and has an inbuilt connection pool.
In a web app that I am working on, I am creating a new connection instance on every request.
My understanding is that since pymongo manages the connection pool, it isn't the wrong approach to create a new connection on each request, since at the end of the request the connection instance will be reclaimed and will be available for subsequent requests.
Am I correct here, or should I just create a single instance to use across multiple requests?
The "wrong approach" depends upon the architecture of your application. With pymongo being thread-safe and automatic connection pooling, the actual use of a single shared connection, or multiple connections, is going to "work". But the results will depend on what you expect the behavior to be. The documentation comments on both cases.
If your application is threaded then, per the docs, each thread accessing a connection will get its own socket. So whether you create a single shared connection or request a new one, it comes down to whether your requests are threaded or not.
When using gevent, you can have a socket per greenlet. This means you don't have to have a true thread per request. The requests can be async, and still get their own socket.
In a nutshell:
If your webapp requests are threaded, then it doesn't matter which way you access a new connection. The result will be the same (socket per thread)
If your webapp is async via gevent, then it doesn't matter which way you access a new connection. The result will be the same (socket per greenlet).
If your webapp is async, but NOT via gevent, then you have to take into consideration the notes on the best suggested workflow.