I'm using MySQL through SQLAlchemy in a Flask app. Wherever a request needs a db connection, I can see that closing that connection at the end of the request means waiting for a ROLLBACK to be run on the connection against my database, which means waiting for one extra round trip to the database and back. Given my geographical situation, this extra round trip can take >100ms. This seems like a very avoidable cost, as there is really no reason to make a user submitting a request wait for this final back-and-forth with the database.
Is there any reason not to close the connection asynchronously, or somehow after the response has been sent back to the user? Is there an accepted way of doing this?
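For concreteness, here's roughly the pattern I mean (the DSN, route, and names are made up):

from flask import Flask, g
from sqlalchemy import create_engine, text

app = Flask(__name__)
# Hypothetical DSN; the database host is the far-away server.
engine = create_engine("mysql+pymysql://user:pw@far-away-host/mydb")

@app.route("/example")
def example():
    g.db_conn = engine.connect()
    row = g.db_conn.execute(text("SELECT 1")).fetchone()
    return str(row[0])

@app.teardown_appcontext
def close_db(exc):
    # close() returns the connection to the pool, which issues the
    # ROLLBACK on the wire (the extra round trip described above).
    conn = g.pop("db_conn", None)
    if conn is not None:
        conn.close()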
I have a front end web app that allows users to push entries to a DynamoDB database on AWS. I then have a python backend that has a websocket connection to AWS that is sent a message any time a new entry appears in the database.
I'm using the websocket-client module in Python, and am basically just running their "Long-lived Connection" example, which you can see on their GitHub page: https://github.com/websocket-client/websocket-client.
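Stripped down, this is essentially what I'm running (the endpoint is a placeholder for my API Gateway websocket URL):

import websocket

def on_message(ws, message):
    print("New DB entry:", message)

def on_error(ws, error):
    print("Error:", error)

def on_close(ws, close_status_code, close_msg):
    print("Connection closed:", close_status_code, close_msg)

# Placeholder for my API Gateway websocket endpoint.
ws = websocket.WebSocketApp(
    "wss://example123.execute-api.eu-west-1.amazonaws.com/production",
    on_message=on_message,
    on_error=on_error,
    on_close=on_close,
)
ws.run_forever()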
I had expected that using run_forever would just keep the connection going and I would receive updates as and when they occurred. However, after a short period of inactivity I get a "going away" message, the connection closes, and no attempt is made to reconnect.
I've followed this guide:
https://spin.atomicobject.com/2021/01/06/websockets-aws-dynamodb-updates/
for setting up the functionality at AWS. This all works great while the websocket connection is up and running. I've tried looking through the docs on the websocket-client page but can't find anything useful.
I'm quite new to AWS and websockets. I had thought using websockets would be a cheaper way to handle this problem than polling the database every second. Do I need to manually handle the case where AWS disconnects and then reconnect, or is there some option to increase the length of the timeout?
I'm not really sure what the costs of keeping the connection up longer are either, though.
If anyone can provide any advice/tips I would be happy to hear them!
Thanks
This seems like a lot of overhead on your end. Why not use DynamoDB Streams and a Lambda trigger to consume the changes/insertions to your DynamoDB table? It's cost-efficient and performant, and you can even use Event Filters to only consume the items you want, so no wasted compute.
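A minimal sketch of such a Lambda handler (assuming the table's stream is configured with the NEW_IMAGE view type; names are illustrative):

def lambda_handler(event, context):
    # Each invocation receives a batch of stream records.
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue  # an Event Filter could also drop these before invocation
        # NewImage holds the inserted item in DynamoDB's typed-JSON format.
        new_item = record["dynamodb"]["NewImage"]
        print("New entry:", new_item)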
So essentially, what I'm trying to do is remove a user's session id whenever they leave the website or close their tab, because I want them to re-login every time they access the site. I've heard from other Stack Overflow questions that I should try the Flask-SocketIO extension and use the disconnect event to detect when they leave the website, then pop their id from the session. So that's exactly what I did; however, whenever I pop the session, it doesn't actually register. Here's the full code I used to try to implement that.
# Socket IO Events (app and socket_io = SocketIO(app) are set up elsewhere;
# session is flask.session)
@socket_io.on('connect')
def on_connect():
    app.logger.info("Connected!")
    app.logger.info("SESSION INFO: " + str(session))

@socket_io.on('disconnect')
def on_disconnect():
    app.logger.info("Client disconnected!")
    app.logger.info("SESSION INFO: " + str(session))
    if 'id' in session:
        session.pop('id')
So as you can tell, whenever I sign up and head to the home page, I receive a session id, and when this disconnect event fires, my session id gets popped. However, take a look at this output.
[2021-07-30 14:20:39,765] INFO in app: Connected!
[2021-07-30 14:20:39,766] INFO in app: SESSION INFO: {'id': 1}
yjI1iUG5o3YmBgRlAAAA: Sending packet PING data None
9jo8DD7RQ5mEcUWZAAAI: Upgrade to websocket successful
yjI1iUG5o3YmBgRlAAAA: Received packet PONG data
HYTS2jEcmhqp9Vq5AAAE: Sending packet PING data None
KQznMBiop36XZLcTAAAC: Client is gone, closing socket
[2021-07-30 14:20:53,854] INFO in app: Client disconnected!
[2021-07-30 14:20:53,854] INFO in app: SESSION INFO: {}
[2021-07-30 14:21:35,164] INFO in app: Connected!
hBzSZoZ-W7_nesEBAAAK: Received request to upgrade to websocket
[2021-07-30 14:21:35,168] INFO in app: SESSION INFO: {'id': 1}
After connecting to the page, it gives me a session id. Then, when I disconnect, it removes my session id; the log clearly shows that, as there isn't anything in the session dict. However, when I reconnect, it automatically gives me my session id back.
Now, based on what I've read from another question on removing session ids, I cannot use socketio to alter the client's cookies; that's at least what I understood from it. The answerer also said that it'd be better to store the sessions on the server side. But I find that a little troublesome and I don't want to just give up on this. Is there any way I can keep storing client sessions using Flask's built-in session system (which stores cookies on the client's side), but still alter them from a socket-io event handler? I'm just very lost on this. I hope someone can explain how socket-io works, or just provide a good and comprehensive article on it. Thanks in advance :) But the main problem is, Flask-SocketIO isn't popping the session id when I tell it to, and I'm not sure why.
I'm sorry, but you have been misled. Using Flask-SocketIO just so that you can be notified when a person leaves your site is complete overkill, and as you've seen, it doesn't even work for your purposes.
If you want to know why you can't make changes to the user session from a Socket.IO event handler, the reason is that Flask-SocketIO uses WebSocket. The user session is maintained in a cookie, which can only be modified in a server response to an HTTP request initiated by the client. WebSocket has no ability to modify cookies.
So I would forget about using Flask-SocketIO for this; it's really not one of its use cases.
Instead, I suggest you look into adding a beforeunload event handler to your page, where you can delete the session cookie directly in the client. For this to work you will also need to set Flask's SESSION_COOKIE_HTTPONLY configuration option to False, so that the JavaScript in your page can see, modify and delete the cookie. Note that this may have security implications, so think about it very carefully before doing it.
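Server-side, that's a one-line config change (a sketch, assuming Flask's default cookie name "session"):

# Sketch: let client-side JavaScript see and delete the session cookie.
# Security trade-off: any script on the page (including injected ones)
# can now read the cookie.
app.config["SESSION_COOKIE_HTTPONLY"] = False
# The page's beforeunload handler would then expire the cookie, e.g.
# (JavaScript): document.cookie = "session=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/";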
I just want to inquire about reusing the same connection when sending the same POST request in a loop. Assume I have this code:
import requests
import time

r = requests.Session()
url = "http://somenumbers.php"

while True:
    x = r.post(url)
    time.sleep(10)
Now, according to the documentation of the requests library:
Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!
Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set stream to False or read the content property of the Response object
Does this work for the code above? I am trying to prevent sending the same request again in case the server freezes or a read timeout occurs. In "Issue with sending POST requests using the library requests" I go over the whole problem, and one of the suggestions is to reuse the connection, but:
Won't sending the same request on the same connection just mean multiple entries? Or is it going to fix the issue, since the connection is only released back to the pool once one entry is sent, as the documentation states?
Assuming the latter is true, won't that affect performance and cause long delays, since the request is trapped inside the connection?
r.post is a blocking call. The function will only return once the request has been sent and a response is received. As long as you access x.content before the loop terminates, the next loop will re-use the underlying TCP connection.
Won't sending the same request on the same connection just mean multiple
entries? Or is it going to fix the issue, since the connection is only
released back to the pool once one entry is sent, as the documentation states?
requests doesn't cache the response. It will not check if a previous request having the same parameters was made. If you need that, you will have to build something on your own.
won't that affect performance and cause long delays, since the request
is trapped inside the connection
requests will only re-use an available connection. If no free connection exists, a new connection will be established. To control the number of connections in the pool, you can mount a requests.adapters.HTTPAdapter (which wraps urllib3's PoolManager) on the session.
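For example, a sketch that pins the session to a single pooled connection (the sizes are illustrative):

import requests
from requests.adapters import HTTPAdapter

r = requests.Session()
# One pool with at most one connection, so every post() in the loop
# reuses the same TCP connection.
adapter = HTTPAdapter(pool_connections=1, pool_maxsize=1)
r.mount("http://", adapter)
r.mount("https://", adapter)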
I have a web app using Django. The app has a maximum capacity of N RPS, while the client sends M RPS, where M>N. In other words, the app receives more requests than it can handle, and the number of unprocessed requests will grow linearly over time (after t sec, the number of requests that are waiting to be processed is (M-N) * t)
I would like to know what will happen to these requests. Will they accumulate in memory until the memory is full? Will they get "canceled" after some conditions are met?
It's hard to answer your question directly without details about your configuration. Moreover, under extremely high usage of your app it's really hard to determine what will happen. But surely, you can't be sure that all those requests will be handled correctly.
If you are able to count how many requests per second your application can handle, and you want to make it reliable for more than N requests, then maybe it's a good start to think of some kind of load balancer, which will spread your requests over multiple server machines.
To answer your question, I can think of a few possibilities where a request can't be handled correctly:
The client cancelled its request (maybe a browser, which can have a maximum execution time limit).
The execution time of the request exceeded the timeout limit set in the web server configuration (because of lack of resources, too many I/O operations, ...).
Another service timed out (like a blocked PostgreSQL query, or a Memcached server that failed to respond).
Your server machine is overloaded and a TCP connection can't be established.
Your web server can handle only the number of requests / queue length specified in its configuration and rejects those over the limit (in Apache, for example, the ListenBacklog directive: http://httpd.apache.org/docs/2.2/mod/mpm_common.html#listenbacklog).
Try to read something about the C10K problem; it could be useful for thinking about this more deeply.
I have an HTTP API using Flask, and in one particular operation clients use it to retrieve information obtained from a 3rd party API. The retrieval is done with a celery task. Usually, my approach would be to accept the client request for that information and return a 303 See Other response with a URI that can be polled for the result once the background job is finished.
However, some clients require the operation to be done in a single request. They don't want to poll or follow redirects, which means I have to run the background job synchronously, hold on to the connection until it's finished, and return the result in the same response. I'm aware of Flask streaming, but how do I do such long-polling with Flask?
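Concretely, what these clients want is something like this (the task and route names are made up):

from flask import Flask, jsonify
from myapp.tasks import fetch_info  # hypothetical celery task

app = Flask(__name__)

@app.route("/info")
def info():
    # Start the celery task and block until the worker finishes,
    # holding the HTTP connection open the whole time.
    result = fetch_info.delay()
    return jsonify(result.get(timeout=60))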
Tornado would do the trick.
Flask is not designed for asynchronous operation. A Flask instance processes one request at a time in one thread. Therefore, when you hold the connection, it will not proceed to the next request.
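A rough sketch of the Tornado approach (fetch_info_async is a hypothetical coroutine wrapping your 3rd party call):

import tornado.ioloop
import tornado.web

class InfoHandler(tornado.web.RequestHandler):
    async def get(self):
        # Await the slow job without blocking the event loop, so other
        # requests keep being served while this connection stays open.
        result = await fetch_info_async()  # hypothetical coroutine
        self.write(result)

if __name__ == "__main__":
    tornado.web.Application([(r"/info", InfoHandler)]).listen(8888)
    tornado.ioloop.IOLoop.current().start()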