I am trying to retrieve data from a database for use in an API context. However, I noticed that conn.close() was taking a relatively long time to execute (here conn is a connection from a MySQL connection pool). Since the data can be returned without waiting for the connection to close, I figured I would use asyncio to close the connection asynchronously so it wouldn't block the data being returned.
import asyncio

async def get_data(stuff):
    conn = api.db.get_connection()
    cursor = conn.cursor(dictionary=True)
    data = execute_query(stuff, conn, cursor)
    cursor.close()
    asyncio.ensure_future(close_conn(conn))
    return data

async def close_conn(conn):
    conn.close()

results = asyncio.run(get_data(stuff))
However, despite the fact that asyncio.ensure_future(close_conn(conn)) itself is not blocking (I put timing statements in to see how long everything was taking, and the ones before and after this call differ by about 1 ms), the actual result isn't available until close_conn has completed. (I verified this with timing statements as well: the gap between reaching the return statement in get_data and the line after results = asyncio.run(get_data(stuff)) is about 200 ms.)
So my question is: how do I make this code close the connection in the background, so I am free to go ahead and process the data without having to wait for it?
Since conn.close() is not a coroutine, it blocks the event loop when close_conn runs. If you want to do what you described, use an async SQL client and do await conn.close().
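If switching to an async client isn't an option: because the whole call is wrapped in asyncio.run(), the scheduled close_conn still has to run before asyncio.run() can return, which is why you still see the ~200 ms. One way to get a genuinely background close, with nothing asyncio-specific about it, is a plain daemon thread. A rough sketch only, keeping the names from the question, not a drop-in fix:

import threading

def get_data(stuff):
    conn = api.db.get_connection()
    cursor = conn.cursor(dictionary=True)
    data = execute_query(stuff, conn, cursor)
    cursor.close()
    # Hand the slow, blocking close() to a background thread so the caller
    # can start processing `data` immediately; conn is not used after this.
    threading.Thread(target=conn.close, daemon=True).start()
    return data

results = get_data(stuff)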
You could try using an asynchronous context manager (an async with statement):
import asyncio

async def get_data(stuff):
    async with api.db.get_connection() as conn:   # connection is closed when the block exits
        cursor = conn.cursor(dictionary=True)
        data = execute_query(stuff, conn, cursor)
        cursor.close()
    return data

results = asyncio.run(get_data(stuff))
If that doesn't work with the SQL client you are using, try aiosqlite:
https://github.com/omnilib/aiosqlite
import aiosqlite
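A minimal sketch of the same fetch using aiosqlite (the database file name, table and column here are placeholders, not from the question):

import asyncio
import aiosqlite

async def get_data(stuff):
    async with aiosqlite.connect('app.db') as db:
        async with db.execute('SELECT * FROM some_table WHERE thing = ?', (stuff,)) as cursor:
            rows = await cursor.fetchall()
    return rows  # the connection is closed asynchronously when the block exits

results = asyncio.run(get_data(stuff))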
Related
I need advice on a special case.
I have a program like this:
import psycopg2

data = [...]
multithread.Pool(n, data)   # pseudocode: run slow_function over data across n threads

def slow_function(data):
    db = psycopg2.connect(credentials)
    cursor = db.cursor()
    new_data = realy_slow_func()
    some_query = "some update query"
    cursor.execute(some_query)
Is opening a new connection in each thread safe? It doesn't matter if it's slow or if faster approaches exist.
Threads are necessary because realy_slow_func() is slow.
The database credentials are the same for every thread.
I am using psycopg2
You should be using a connection pool, which creates a pool of connections and reuses them across your threads. I would also suggest using a ThreadPool, so that the number of threads running at a time equals the number of connections available in the DB connection pool; but for the scope of this question, I will talk about the DB connection pool.
I have not tested the code, but this is roughly how it would look. You first create a connection pool, then get a connection from it within your thread, and release the connection once the work is done. You could also manage getting and releasing the connection outside of the thread, pass the connection in as a parameter, and release it once the thread completes.
Note that ThreadedConnectionPool is the class used to create the pool; as the name suggests, it works with threads.
From docs:
A connection pool that works with the threading module.
Note: This pool class can be safely used in multi-threaded applications.
import psycopg2
from psycopg2 import pool

postgreSQL_pool = psycopg2.pool.ThreadedConnectionPool(1, 20,
                                                       user="postgres",
                                                       password="pass##29",
                                                       host="127.0.0.1",
                                                       port="5432",
                                                       database="postgres_db")

data = [...]
multithread.Pool(n, data)   # pseudocode from the question: run slow_function over data in threads

def slow_function(data):
    db = postgreSQL_pool.getconn()   # borrow a connection from the pool
    cursor = db.cursor()
    new_data = realy_slow_func()
    some_query = "some update query"
    cursor.execute(some_query)
    db.commit()                      # commit the update before returning the connection
    cursor.close()
    postgreSQL_pool.putconn(db)      # return the connection to the pool
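For completeness, the ThreadPool suggestion from the start of this answer could look roughly like this, replacing the multithread.Pool(n, data) placeholder; the pool size of 20 simply mirrors the maxconn passed to ThreadedConnectionPool above:

from multiprocessing.pool import ThreadPool

# One thread per available DB connection, so no thread ever waits on getconn().
with ThreadPool(processes=20) as tpool:
    tpool.map(slow_function, data)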
Source: https://pynative.com/psycopg2-python-postgresql-connection-pooling/
Docs: https://www.psycopg.org/docs/pool.html
I have a server that gathers data from a bunch of GPS trackers, and I want to ship this data out in real time to X connected clients via WebSockets. The trackers connect over TCP (each in their own thread) and send data regularly to the server. The data is merged in a thread called data_merger and put in that thread's queue(). This mechanic works nicely and as intended; however, I'm running into issues when I want to send this data to websocket connections.
I tried basing my solution on the websocket synchronization example, as this seemed to apply to my use case. I have a thread called outbound_worker that handles the websocket code. From thread.run():
def run(self):
    self.data_merger.name = 'data_merger'
    self.data_merger.start()
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    start_server = websockets.serve(self.handle_clients, 'localhost', self.port)
    print("WebSocket server started for port %s at %s" % (self.port, datetime.now()))
    loop.run_until_complete(start_server)
    asyncio.get_event_loop().run_forever()
Then the handler method for the websocket server:
async def handle_clients(self, websocket, path):
    while True:
        try:
            # Register the websocket to connected set
            await self.register(websocket)
            data = self.data_merger.queue.get()
            await send_to_clients(data)
            await asyncio.sleep(0.1)
        except websockets.ConnectionClosed:
            print("Connection closed")
            await self.unregister(websocket)
            break

async def send_to_clients(self, data):
    data = json.dumps(data)
    if self.connected:
        await asyncio.wait([ws.send(data) for ws in self.connected])
The register() and unregister() methods are identical to the example I linked above. The client I'm using is a basic loop that prints the data received:
async def hello():
    uri = "ws://localhost:64000"
    while True:
        async with websockets.connect(uri) as websocket:
            print("Awaiting data...")
            data = await websocket.recv()
            #print(f"{data}")
            print(f"{json.loads(data)}")

asyncio.get_event_loop().run_until_complete(hello())
As I am new to asynchronous calls in Python and websockets in general, I'm not sure if my approach is correct here. Since I am trying to push data right after registering a new connection, the code seems to halt at the await send_to_clients(data) line. Should I rather handle this in the data_merger thread and pass the connected set?
Another issue is that if I simply use the client_handler to register() and unregister() the new connections, it seems to just loop over the register() part and I'm unable to connect a second client.
I guess my questions can be condensed into the following:
How do I accept and manage multiple open connections over websocket, similar to a multithreaded socket server?
Is there a way to trigger a function call (for instance register()) only on new websocket connections, similar to socket.listen() and socket.accept()?
I have a server where I need to keep the connection with each client open as long as possible, and I need to allow multiple clients to connect to this server. Code:
class LoginServer(BaseServer):

    def __init__(self, host, port):
        super().__init__(host, port)

    async def handle_connection(self, reader: StreamReader, writer: StreamWriter):
        peername = writer.get_extra_info('peername')
        Logger.info('[Login Server]: Accepted connection from {}'.format(peername))
        auth = AuthManager(reader, writer)
        while True:
            try:
                await auth.process()
            except TimeoutError:
                continue
            finally:
                await asyncio.sleep(1)
        Logger.warning('[Login Server]: closing...')
        writer.close()

    @staticmethod
    def create():
        Logger.info('[Login Server]: init')
        return LoginServer(Connection.LOGIN_SERVER_HOST.value, Connection.LOGIN_SERVER_PORT.value)
The problem: currently only one client can connect to this server. It seems the socket is not being closed properly, and because of this even the previous client cannot reconnect. I think this is because of the infinite loop. How do I fix this?
The while loop is correct.
If you wanted a server that waits on data from a client, you would have the following loop in your handle_connection:
while 1:
    data = await reader.read(100)
    # Do something with the data
See the example echo server here for more details on reading / writing.
https://asyncio.readthedocs.io/en/latest/tcp_echo.html
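Roughly, the pattern from that page looks like this (rewritten here with asyncio.start_server and asyncio.run rather than the older explicit-loop calls):

import asyncio

async def handle_echo(reader, writer):
    data = await reader.read(100)   # awaiting here yields control back to the event loop
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle_echo, '127.0.0.1', 8888)
    async with server:
        await server.serve_forever()

asyncio.run(main())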
Your problem is likely that this function doesn't return and loops without awaiting anything. That would mean the asyncio loop never regains control, so new connections cannot be made.
await auth.process()
I use the Python module mysql.connector to connect to an AWS RDS instance.
Now, as we know, if we do not send a request to the SQL server for a while, the connection is dropped.
To handle this, I reconnect to SQL in case a read/write request fails.
My problem is with the "request fails" part: it takes a significant amount of time to fail, and only then can I reconnect and retry my request. (I have pointed this out with a comment in the code snippet.)
For a real-time application such as mine, this is a problem. How could I solve it? Is it possible to find out whether the disconnection has already happened, so that I can open a new connection without having to wait for a read/write request to fail?
Here is how I handle it in my code right now:
def fetchFromDB(self, vid_id):
    fetch_query = "SELECT * FROM <db>"
    success = False
    attempts = 0
    output = []
    while not success and attempts < self.MAX_CONN_ATTEMPTS:
        try:
            if self.cnx is None:
                self._connectDB_()
            if self.cnx:
                cursor = self.cnx.cursor()  # MY PROBLEM: This step takes too long to fail in case the connection has expired.
                cursor.execute(fetch_query)
                output = []
                for entry in cursor:
                    output.append(entry)
                cursor.close()
                success = True
            attempts = attempts + 1
        except Exception as ex:
            logging.warning("Error")
            if self.cnx is not None:
                try:
                    self.cnx.close()
                except Exception as ex:
                    pass
                finally:
                    self.cnx = None
    return output
In my application I cannot tolerate a delay of more than 1 second while reading from mysql.
When configuring mysql.connector, these are the only settings I use:
SQL.user = '<username>'
SQL.password = '<password>'
SQL.host = '<AWS RDS HOST>'
SQL.port = 3306
SQL.raise_on_warnings = True
SQL.use_pure = True
SQL.database = '<database-name>'
There are some contrivances, like generating an ALARM signal or similar if a function call takes too long. Those can be tricky with database connections, or may not work at all; there are other SO questions that cover them.
One approach would be to set connection_timeout to a known value when you create the connection, making sure it's shorter than the server-side timeout. Then, if you track the age of the connection yourself, you can preemptively reconnect before it gets too old and clean up the previous connection.
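A rough sketch of this age-tracking idea (the 280-second threshold and the helper name are made up for illustration; pick a value comfortably below the server-side wait_timeout):

import time
import mysql.connector

MAX_CONN_AGE = 280  # seconds; keep this below the server-side timeout

_cnx = None
_cnx_created = 0.0

def get_connection():
    # Return a connection, preemptively recycling it once it gets too old.
    global _cnx, _cnx_created
    if _cnx is None or (time.time() - _cnx_created) > MAX_CONN_AGE:
        if _cnx is not None:
            try:
                _cnx.close()
            except Exception:
                pass
        _cnx = mysql.connector.connect(
            user=SQL.user, password=SQL.password, host=SQL.host,
            port=SQL.port, database=SQL.database,
            connection_timeout=5)  # fail fast when connecting
        _cnx_created = time.time()
    return _cnx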
Alternatively you could occasionally execute a no-op query like select now(); to keep the connection open. You would still want to recycle the connection every so often.
But if there are long enough periods between queries (where they might expire) why not open a new connection for each query?
This code is in Python, but it's basically using OCI, so it should be reproducible in any other language:
import cx_Oracle as db
dsn = '(DESCRIPTION =(CONNECT_TIMEOUT=3)(RETRY_COUNT=1)(TRANSPORT_CONNECT_TIMEOUT=3)(ADDRESS_LIST =(ADDRESS = (PROTOCOL = TCP)(HOST = SOME_HOST)(PORT = 1531)))(CONNECT_DATA =(SERVICE_NAME = SOME_NAME)))'
connect_string = "LOGIN/PASSWORD#%s" % dsn
conn = db.connect(connect_string)
conn.ping() # WILL HANG FOREVER!!!
If SOME_HOST is down, this will hang forever!
And it's not related to OCIPing - if I replace:
ping()
with:
cursor = conn.cursor()
cursor.execute('SELECT 1 FROM DUAL') # HANG FOREVER AS WELL
This will hang as well.
I'm using SQL*Plus: Release 11.2.0.3.0 Production on Wed Nov 6 12:17:09 2013.
I tried wrapping this code in a thread, waiting for some time and then killing the thread, but that doesn't work: the call creates a thread itself and it's impossible to kill it from Python. Do you have any ideas how to recover?
The short answer is to use try/except/finally blocks, but if part of your code is truly waiting on a condition that will never be satisfied, what you need to do is implement an internal timeout. There are numerous ways to do this; you can adapt the solution to this problem to your needs.
Hope this helps.
I had the same problem with interrupting conn.ping().
Now I use the following construction:
from threading import Timer

pingTimeout = 10  # sec
# ...

def breakConnection():
    global connection
    conn.cancel()       # interrupt whatever statement is currently running on conn
    connection = False

connection = True
try:
    t = Timer(pingTimeout, breakConnection)
    t.start()
    cursor = conn.cursor()
    cursor.execute('SELECT 1 FROM DUAL')
    t.cancel()          # the query finished in time, so stop the timer
    cursor.close()
except Exception:
    connection = False

if not connection:
    print('Trying to reconnect...')
    # ...
It's a dirty way, but it works.
And the real way to check whether a connection is usable is to execute the actual application statement you want to run (I don't mean SELECT 1 FROM DUAL), and retry if you catch the exception.
If you want to close the connection, try:
conn.close()
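A rough sketch of that retry-on-the-real-statement idea (the function name and shape are mine, not from the answer above; it only shows the retry pattern and does not by itself prevent the hang, so the DSN timeouts or the Timer trick still apply):

import cx_Oracle as db

def run_with_retry(connect_string, sql, retries=1):
    conn = db.connect(connect_string)
    for attempt in range(retries + 1):
        try:
            cursor = conn.cursor()
            cursor.execute(sql)       # the real application statement
            rows = cursor.fetchall()
            cursor.close()
            return conn, rows
        except db.DatabaseError:
            try:
                conn.close()
            except db.DatabaseError:
                pass
            if attempt == retries:
                raise
            conn = db.connect(connect_string)  # rebuild the connection and try again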