I'm using redis-py in my Python application to store simple variables or lists of variables in a Redis database. I thought it would be better to create a connection to the Redis server every time I need to save or retrieve a variable, since this is not done very often and I don't want a permanent connection that might time out.
After reading through some basic tutorials, I created the connections using the Redis class, but I have not found a way to close the connection, as this is the first time I'm using Redis. I'm not sure whether I'm using the best approach for managing the connections, so I would like some advice on this.
This is how I'm setting or getting a variable now:
import redis

def getVariable(variable_name):
    my_server = redis.Redis("10.0.0.1")
    response = my_server.get(variable_name)
    return response

def setVariable(variable_name, variable_value):
    my_server = redis.Redis("10.0.0.1")
    my_server.set(variable_name, variable_value)
I basically use this code to store the last connection time or to get an average of requests per second made to my app, and things like that.
Thanks for your advice.
Python uses a reference counting mechanism to deal with objects, so at the end of the blocks, the my_server object will be automatically destroyed and the connection closed. You do not need to close it explicitly.
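That said, if you ever want to tear the connection down explicitly, you can disconnect the client's underlying pool by hand. A minimal sketch, assuming a reasonably recent redis-py:

import redis

my_server = redis.Redis("10.0.0.1")
my_server.get("some_variable")

# Normally unnecessary: this closes the sockets held by the
# client's underlying connection pool.
my_server.connection_pool.disconnect()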
Now this is not how you are supposed to manage Redis connections. Connecting/disconnecting for each operation is too expensive, so it is much better to maintain the connection opened. With redis-py it can be done by declaring a pool of connections:
import redis

POOL = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)

def getVariable(variable_name):
    my_server = redis.Redis(connection_pool=POOL)
    response = my_server.get(variable_name)
    return response

def setVariable(variable_name, variable_value):
    my_server = redis.Redis(connection_pool=POOL)
    my_server.set(variable_name, variable_value)
Please note that connection pool management is mostly automatic and handled within redis-py.
@sg1990 what if you have 10,000 users requiring Redis at the same time? They cannot share a single connection, and you've just created yourself a bottleneck.
With a pool of connections you can create an arbitrary number of connections and simply use get_connection() and release(), from redis-py docs.
A connection per user is huge overkill, since every connection needs to maintain an open socket. That way you'd automatically cut the number of, e.g., concurrent websocket users that your machine can handle in half.
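For reference, here is a minimal sketch of checking a raw connection out of a pool with get_connection() and handing it back with release(). The exact signatures vary between redis-py versions, so treat this as an illustration, not the definitive API:

import redis

POOL = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)

# Check a raw connection out of the pool, use it, and return it.
# The Redis client normally does exactly this for you per command.
conn = POOL.get_connection('GET')
try:
    conn.send_command('GET', 'my_variable')
    value = conn.read_response()
finally:
    POOL.release(conn)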
You can use this to connect to two different logical databases on the same Redis server:
r1 = redis.StrictRedis(host="localhost", port=6379, db=0, decode_responses=True)
r2 = redis.StrictRedis(host="localhost", port=6379, db=1, decode_responses=True)
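Keys written through one client are not visible through the other, since each logical database is a separate keyspace. For example (the key name here is just for illustration):

r1.set("my_key", "value from db 0")
print(r2.get("my_key"))  # None: db 1 is a completely separate keyspace
print(r1.get("my_key"))  # 'value from db 0'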
Global variables are not thread-safe or "process-safe" in Flask.
However, I need to open connections to services that each worker will use, such as a PubSub client or a Cloud Storage client. It seems like these still need to be global so that any function in the application can access them. To lazily initialize them, I check if the variable is None, and this needs to be thread-safe. What is the recommended approach for opening connections that each request will use? Should I use a thread lock to synchronize?
The question you linked is talking about data, not connections. Having multiple workers mutating global data is not good because you can't reason about where those workers are in a web application to keep them in sync.
The solution to that question is to use an external data source, like a database, which must be connected to somehow. Your idea to have one global connection is not safe though, since multiple worker threads would interact with it concurrently and either mess with each other's state or wait one at a time to acquire the resource. The simplest way to handle this is to establish a connection in each view when you need it.
This example shows how to have a unique connection per request, without globals, reusing the connection once it's established for the request. The g object, while it looks like a global, is implemented as a thread-local behind the scenes, so each worker gets its own g instance and connection stored on it during one request only.
from flask import Flask, g, jsonify

app = Flask(__name__)

def get_conn():
    """Use this function to establish or get the already established
    connection during a request. The connection is closed at the end
    of the request. This avoids having a global connection by storing
    the connection on the g object per request.
    """
    if "conn" not in g:
        g.conn = make_connection(...)
    return g.conn

@app.teardown_request
def close_conn(e):
    """Automatically close the connection after the request if
    it was opened.
    """
    conn = g.pop("conn", None)
    if conn is not None:
        conn.close()

@app.route("/get_data")
def get_data():
    # If something else has already used get_conn during the
    # request, this will return the same connection. Anything
    # that uses it after this will also use the same connection.
    conn = get_conn()
    data = conn.query(...)
    return jsonify(data)
You might eventually find that establishing a new connection on each request is too expensive once you have many thousands of concurrent requests. One solution is to build a connection pool: store a list of connections globally, with a thread-safe way to acquire and replace a connection in the list as needed. SQLAlchemy (and Flask-SQLAlchemy) uses this technique. Many libraries already provide connection pool implementations, so either use them directly or use them as a reference for your own.
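As a rough sketch of that idea, here is a minimal global pool built on queue.Queue, which is thread-safe out of the box (make_connection is the same hypothetical helper as in the example above):

import queue

POOL_SIZE = 10

# queue.Queue is thread-safe, so workers can acquire and return
# connections concurrently without any extra locking.
_pool = queue.Queue(maxsize=POOL_SIZE)

# Fill the pool once at startup, reusing the hypothetical
# make_connection() helper from the example above.
for _ in range(POOL_SIZE):
    _pool.put(make_connection(...))

def acquire_conn(timeout=5):
    # Blocks until a connection is free, which also bounds the
    # total number of open connections.
    return _pool.get(timeout=timeout)

def release_conn(conn):
    # Hand the connection back for other workers to reuse.
    _pool.put(conn)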
pool = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)
r = redis.Redis(connection_pool=pool)
vs.
r = redis.Redis(host='10.0.0.1', port=6379, db=0)
Both of these work fine.
What's the idea behind using a connection pool? When would you use one?
From the redis-py docs:
Behind the scenes, redis-py uses a connection pool to manage connections to a Redis server. By default, each Redis instance you create will in turn create its own connection pool. You can override this behavior and use an existing connection pool by passing an already created connection pool instance to the connection_pool argument of the Redis class. You may choose to do this in order to implement client side sharding or have finer grain control of how connections are managed.
So, normally this is not something you need to handle yourself, and if you do, then you know!
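As an illustration of the client-side sharding case the docs mention, here is a minimal sketch assuming two hypothetical Redis servers:

import zlib
import redis

# One pool per (hypothetical) Redis server.
pools = [
    redis.ConnectionPool(host='10.0.0.1', port=6379, db=0),
    redis.ConnectionPool(host='10.0.0.2', port=6379, db=0),
]

def client_for(key):
    # crc32 is stable across processes and runs, so a given key
    # always maps to the same shard.
    shard = zlib.crc32(key.encode()) % len(pools)
    return redis.Redis(connection_pool=pools[shard])

client_for('user:42').set('user:42', 'some value')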
How does MongoClient work? Does it create a connection pool or spawn threads?
What are the major resources used if I create multiple connections?
My main reason for asking is this:
I have created multiple classes in Python, each of which represents the functionality of a single collection in MongoDB. In each class I am creating a client:
self.client = MongoClient(hostname, port)
What resources do I need to worry about, and what could the performance issues be?
Is there a way I can share a single client among all classes?
Make one MongoClient. Make it a global variable in a module:
client = MongoClient(host, port)
A MongoClient has a built-in connection pool, and it starts a thread to monitor its connection to your server. For best efficiency, make one MongoClient and share it throughout your program.
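For example, one common layout is to put the client in its own module and import it everywhere (the module, database, and collection names below are hypothetical):

# db.py -- one shared MongoClient for the whole program
from pymongo import MongoClient

client = MongoClient("localhost", 27017)

# users.py -- a per-collection class reusing the shared client
from db import client

class Users:
    def __init__(self):
        # Reuse the module-level client instead of creating one
        # per class; its built-in pool handles concurrency.
        self.collection = client["mydb"]["users"]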
class host_struct(object):
    host_id = dict()
    host_license_id = dict()

def m(a):
    eval(a)

host = host_struct()
m('host.host_id={1:1}')
print host
The above code doesn't work; it is a sample of what I am trying to accomplish. I am trying to solve a problem where I need to pass a reference to a class object into a function as a string, yet inside the function manipulate it as the actual object.
Here is my problem: I have a connection pooler/broker module that maintains a persistent connection to the server. The server sets an inactivity TTL of 30 minutes on all connections, so every 29 minutes the broker needs to touch the server to keep the connection alive. At the same time, the connection broker needs to process client requests, send them to the server, and relay the server's reply back to each client.
The communication with the server goes through a connection class that holds many complex objects, so allowing the client modules to manipulate the class directly would bypass the connection broker entirely, which would result in the server terminating the connection due to the inactivity TTL.
Is this possible? Is there a better way to address this problem?
Here is some additional background. I am opening a connection to VMware vCenter. To initiate the connection, I instantiate the connection class and then call a connection method. Currently I do all of this in my client programs. However, I am running into a problem with vCenter and need to connect once when I start the program and use the same connection for the entire run. At the moment I open a connection to vCenter, do my work, close the connection, sleep for a period of time, then repeat the process. This continual connect/disconnect is causing issues, so I wrote a test to see if I could address them by maintaining a persistent connection, and I was successful.
vcenter = VIServer()
vcenter.connect(*config_values)
At this point, the vcenter object is connected to the server. There are several method calls I need to make to query certain objects. Here are 2 examples of the many I use:
vms = vcenter._retrieve_properties_traversal(property_names=vm_objects, obj_type='VirtualMachine')
or
api_version = vcenter.get_api_version()
The first line retrieves specific VM objects from the server and the second gets the API version. I would like to call these methods from the connection broker, because it is the one keeping the connection to vCenter open.
So in my connection broker I would like to pass 'vcenter.get_api_version()' as a string argument and have the connection broker execute api = vcenter.get_api_version().
Does this help to clarify?
Use exec instead of eval. Example:
class host_struct:  # no need for parentheses if not inheriting from something besides object
    host_id = {}  # use of {} is more idiomatic than dict()
    host_license_id = {}

def m(a):
    exec a

host = host_struct()
m('host.host_id.update({1:1})')  # updating will add to existing dict instead of replacing
print host.host_id[1]
Running this script produces the expected output of 1. (This is Python 2 syntax; under Python 3 you would write exec(a) and print(host.host_id[1]).)
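That said, if the goal is just to have the broker invoke a method named by a string, a safer alternative to executing arbitrary strings is getattr. Here is a minimal sketch (the Broker class and its names are hypothetical, not part of the vCenter connection class):

class Broker(object):
    def __init__(self, connection):
        self.connection = connection  # e.g. the connected VIServer

    def call(self, method_name, *args, **kwargs):
        # Look the method up on the live connection object by name,
        # so clients name what they want instead of passing code.
        method = getattr(self.connection, method_name)
        return method(*args, **kwargs)

# Hypothetical usage with a connected vcenter object:
# broker = Broker(vcenter)
# api_version = broker.call('get_api_version')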
I know pymongo is thread safe and has an inbuilt connection pool.
In a web app that I am working on, I am creating a new connection instance on every request.
My understanding is that since pymongo manages the connection pool, it isn't the wrong approach to create a new connection on each request, as at the end of the request the connection instance will be reclaimed and available for subsequent requests.
Am I correct here, or should I just create a single instance to use across multiple requests?
The "wrong approach" depends upon the architecture of your application. With pymongo being thread-safe and automatic connection pooling, the actual use of a single shared connection, or multiple connections, is going to "work". But the results will depend on what you expect the behavior to be. The documentation comments on both cases.
From the docs: if your application is threaded, each thread accessing a connection will get its own socket. So whether you create a single shared connection or request a new one, it comes down to whether your requests are threaded or not.
When using gevent, you can have a socket per greenlet. This means you don't have to have a true thread per request. The requests can be async, and still get their own socket.
In a nutshell:
If your webapp requests are threaded, then it doesn't matter which way you access a new connection. The result will be the same (socket per thread).
If your webapp is async via gevent, then it doesn't matter which way you access a new connection. The result will be the same (socket per greenlet).
If your webapp is async, but NOT via gevent, then you have to take into consideration the notes on the best suggested workflow.