I have an nginx server with a one-hour timeout and a Tornado web server behind it.
When nginx closes the connection, I have no idea about it in Tornado. I saw this question about closing connections automatically on a timeout event (Implementing and testing WebSocket server connection timeout) and I'm going to use it as a fallback workaround.
My question is: does Tornado have an internal mechanism for invalidating websocket connections?
WebSocketHandler has an overridable on_close method, which should be called when the connection is closed (most of the time). It is not 100% reliable, however, due to the limitations of the underlying network protocols, so a timeout-based fallback is recommended. Tornado doesn't have any built-in support for this, though, so you'll have to implement it yourself, perhaps in a manner similar to the answer you linked to.
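For illustration, here is a minimal sketch of such a fallback, assuming a one-hour idle limit to mirror the nginx timeout; the handler name and check interval are made up:

import time
import tornado.ioloop
import tornado.websocket

IDLE_TIMEOUT = 60 * 60  # seconds; assumed to mirror the one-hour nginx timeout

class EchoHandler(tornado.websocket.WebSocketHandler):
    def open(self):
        self.last_seen = time.time()
        # Check for inactivity once a minute.
        self.idle_checker = tornado.ioloop.PeriodicCallback(self._check_idle, 60 * 1000)
        self.idle_checker.start()

    def on_message(self, message):
        self.last_seen = time.time()
        self.write_message(message)

    def on_pong(self, data):
        # Pongs count as activity if you also send periodic pings.
        self.last_seen = time.time()

    def _check_idle(self):
        if time.time() - self.last_seen > IDLE_TIMEOUT:
            self.close()  # on_close still fires for this path

    def on_close(self):
        self.idle_checker.stop()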
Related
In my application I need to "simulate" an HTTP timeout. Simply put, in this scenario:
client -> myapp -> server
client makes an HTTP POST request to myapp, which forwards it to server. However, server does not respond due to network issues or similar problems, and I am stuck with an open TCP session from client that I'll need to drop.
My application uses web.py, nginx and uwsgi.
I cannot return a custom HTTP error such as 418 I am a teapot - it has to be a connection timeout to mirror server's behaviour as closely as possible.
One hacky solution could be (I guess) to just time.sleep() until client disconnects, but this would tie up a uwsgi thread, and I have a feeling it could lead to resource starvation because a server timeout is likely to happen for other connections too. Another approach is pointed out here; however, that solution implies returning something to client, which is not my case.
So my question is: is there an elegant way to kill a uwsgi worker programmatically from python code?
So far I've found:
set_user_harakiri(N), which I could combine with a time.sleep(N+1). However, in this scenario uwsgi detects the harakiri and tries to re-spawn the worker.
worker_id(), but I'm not sure how to handle it - I can't find much documentation on using it
A suggestion to use connection_fd() as explained here
disconnect() which does not seem to do anything, as the code continues and returns to client
suspend() does suspend the instance, but NGINX returns the boilerplate error page
Any other ideas?
UPDATE
Turns out it's more complicated than that. If I just close the socket or disconnect from uwsgi, the nginx web server detects a 'server error' and returns a 500 boilerplate error page. And I do not know how to tell nginx to stop being so useful.
The answer is a combination of the two:
From the Python app, return 444.
Configure nginx as explained in this answer, i.e. using the uwsgi_intercept_errors directive.
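A rough sketch of the web.py side, where forward_to_server and UpstreamTimeout are hypothetical stand-ins for your own forwarding code; the matching nginx location still needs the uwsgi_intercept_errors setup from the linked answer:

import web

urls = ("/proxy", "Proxy")

class Proxy:
    def POST(self):
        try:
            # forward_to_server is a placeholder for whatever performs the real upstream call
            return forward_to_server(web.data())
        except UpstreamTimeout:  # placeholder exception for "server never answered"
            # 444 is the nginx convention for "close the connection without replying"
            raise web.HTTPError("444 No Response")

app = web.application(urls, globals())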
I'm running a Twisted server with the LineReceiver protocol. Sometimes clients will disconnect silently, so Twisted keeps the connection open. And because the server doesn't send anything unless requested of it, there's never a TCP timeout. In other words, some connections are never closed server-side.
How can I have Twisted close a connection that's been inactive for a few hours?
You can schedule timed events using reactor.callLater. Based on this, there's a helper for adding timeouts to protocols, twisted.protocols.policies.TimeoutMixin.
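For example, a minimal sketch of TimeoutMixin combined with LineReceiver; the four-hour figure is just an assumption for "a few hours":

from twisted.internet import reactor
from twisted.internet.protocol import Factory
from twisted.protocols.basic import LineReceiver
from twisted.protocols.policies import TimeoutMixin

class MyLineProtocol(LineReceiver, TimeoutMixin):
    def connectionMade(self):
        # Start the idle timer: 4 hours without traffic and we give up.
        self.setTimeout(4 * 60 * 60)

    def lineReceived(self, line):
        self.resetTimeout()  # any request from the client restarts the timer
        # ... handle the request and (maybe) send a reply ...

    def timeoutConnection(self):
        # Called by TimeoutMixin when the idle period expires.
        self.transport.loseConnection()

reactor.listenTCP(8000, Factory.forProtocol(MyLineProtocol))
reactor.run()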
Another approach is to use TCP keep-alives, which you can enable using the transport's setTcpKeepAlive method.
And another approach is to use application-level keep-alives. Essentially send a "noop" once in a while. It doesn't need a response. If the connection has been lost, the extra data in the send buffer will cause the TCP stack to eventually notice.
See also the FAQ entry.
I know pymongo is thread-safe and has a built-in connection pool.
In a web app that I am working on, I am creating a new connection instance on every request.
My understanding is that since pymongo manages the connection pool, it isn't the wrong approach to create a new connection on each request, as at the end of the request the connection instance will be reclaimed and will be available for subsequent requests.
Am I correct here, or should I just create a single instance to use across multiple requests?
The "wrong approach" depends upon the architecture of your application. With pymongo being thread-safe and automatic connection pooling, the actual use of a single shared connection, or multiple connections, is going to "work". But the results will depend on what you expect the behavior to be. The documentation comments on both cases.
If your application is threaded, from the docs, each thread accessing a connection will get its own socket. So whether you create a single shared connection, or request a new one, it comes down to whether your requests are threaded or not.
When using gevent, you can have a socket per greenlet. This means you don't have to have a true thread per request. The requests can be async, and still get their own socket.
In a nutshell:
If your webapp requests are threaded, then it doesn't matter which way you access a new connection. The result will be the same (socket per thread).
If your webapp is async via gevent, then it doesn't matter which way you access a new connection. The result will be the same (socket per greenlet).
If your webapp is async, but NOT via gevent, then you have to take into consideration the notes on the best suggested workflow.
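If you do go the single-instance route, a minimal sketch looks something like this (a module-level client shared by all request handlers; the database and collection names are made up, and maxPoolSize is the spelling used by recent pymongo versions):

from pymongo import MongoClient

# One client per process; its connection pool is shared by every thread/greenlet.
client = MongoClient("mongodb://localhost:27017", maxPoolSize=50)
db = client["myapp"]

def handle_request(user_id):
    # Each thread checks a socket out of the shared pool just for the
    # duration of the operation, so no per-request client is needed.
    return db.users.find_one({"_id": user_id})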
The fun part of websockets is sending essentially unsolicited content from the server to the browser right?
Well, I'm using django-websocket by Gregor Müllegger. It's a really wonderful early crack at making websockets work in Django.
I have accomplished "hello world." The way this works is: when a request is a websocket, an object, websocket, is appended to the request object. Thus, I can, in the view interpreting the websocket, do something like:
request.websocket.send('We are the knights who say ni!')
That works fine. I get the message back in the browser like a charm.
But what if I want to do that without issuing a request from the browser at all?
OK, so first I save the websocket in the session dictionary:
request.session['websocket'] = request.websocket
Then, in a shell, I go and grab the session by session key. Sure enough, there's a websocket object in the session dictionary. Happy!
However, when I try to do:
>>> session.get_decoded()['websocket'].send('With a herring!')
I get:
Traceback (most recent call last):
File "<console>", line 1, in <module>
error: [Errno 9] Bad file descriptor
Sad. :-(
OK, so I don't know much of anything about sockets, but I know enough to sniff around in a debugger, and lo and behold, I see that the socket in my debugger (which is tied to the genuine websocket from the request) has fd=6, while the one that I grabbed from the session-saved websocket has fd=-1.
Can a socket-oriented person help me sort this stuff out?
I'm the author of django-websocket. I'm not a real expert on the topic of websockets and networking; however, I think I have a decent understanding of what's going on. Sorry for going into great detail. Even if most of the answer isn't specific to your question, it might help you at some other point. :-)
How websockets work
Let me explain briefly what a websocket is. A websocket starts as something that looks very much like a plain HTTP request, established from the browser. It indicates through an HTTP header that it wants to "upgrade" the protocol to a websocket instead of an HTTP request. If the server supports websockets, it agrees to the handshake, and both server and client now know that they will use the established TCP socket formerly used for the HTTP request as a connection to exchange websocket messages.
Besides sending and waiting for messages, both sides of course also have the ability to close the connection at any time.
How django-websocket abuses Python's WSGI request environment to hijack the socket
Now let's get into the details of how django-websocket implements the "upgrading" of the HTTP request in a Django request-response cycle.
Django usually uses the WSGI specification to talk to a web server like Apache or gunicorn etc. This specification was designed with only the very limited communication model of HTTP in mind. It assumes that it gets an HTTP request (only incoming data) and returns a response (only outgoing data). This makes it tricky to force Django into the concept of a websocket, where bidirectional communication is allowed.
What I'm doing in django-websocket to achieve this is to dig very deeply into the internals of WSGI and Django's request object to retrieve the underlying socket. This TCP socket is then used to upgrade the HTTP request to a websocket instance directly.
Now to your original question ...
I hope the above makes it obvious that once a websocket is established, there is no point in returning an HttpResponse. This is why you usually don't return anything in a view that is handled by django-websocket.
However, I wanted to stick close to the concept of a view that holds the logic and returns data based on the input. This is why you should only use the code in your view to handle the websocket.
After you return from the view, the websocket is automatically closed. This is done for a reason: we don't want to keep the socket open for an undefined amount of time and rely on the client (the browser) to close it.
This is why you cannot access a websocket with django-websocket outside of your view. The file descriptor is then of course set to -1, indicating that it is already closed.
Disclaimer
I explained above that I'm digging into the surrounding environment of Django to get, in a very hackish way, access to the underlying socket. This is very fragile and also not supposed to work, since WSGI was not designed for this! I also explained above that the websocket is closed after the view ends. However, after the websocket has closed down (AND closed the TCP socket), Django's WSGI implementation tries to send an HTTP response. It doesn't know about websockets and thinks it is in a normal HTTP request-response cycle, but the socket is already closed and the send will fail. This usually causes an exception in Django.
This didn't affect my testing with the development server. The browser will never notice (you know .. the socket is already closed ;-), but raising an unhandled error in every request is not a very good concept; it may leak memory, doesn't handle database connection shutdown correctly, and many other things will break at some point if you use django-websocket for more than experimenting.
This is why I would really advise you not to use websockets with Django yet. It doesn't work by design. Django, and especially WSGI, would need a total overhaul to solve these problems (see this discussion of websockets and WSGI). Until then, I would suggest using something like eventlet. Eventlet has a working websocket implementation (I borrowed some code from eventlet for the initial version of django-websocket), and since it's just plain Python code you can import your models and everything else from Django. The only drawback is that you need a second web server running just to handle the websockets.
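For reference, eventlet's websocket support looks roughly like this (a minimal echo server along the lines of eventlet's documented pattern; the port is arbitrary, and because it's plain Python you could import your Django models here as well):

import eventlet
from eventlet import wsgi, websocket

@websocket.WebSocketWSGI
def handle(ws):
    # ws.wait() returns None once the client closes the connection.
    while True:
        message = ws.wait()
        if message is None:
            break
        ws.send(message)

wsgi.server(eventlet.listen(('', 7000)), handle)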
As Gregor Müllegger pointed out, Websockets can't be handled properly by WSGI, because that protocol was never designed for such a feature.
uWSGI, since version 1.9.11, can handle Websockets out of the box. Here uWSGI communicates with the application server using raw HTTP rather than the WSGI protocol. A server written that way can therefore handle the protocol internals and keep the connection open over a long period. Having long-lived connections handled by a Django view is not a good idea either, because they would then block a worker thread, which is a limited resource.
The main purpose of Websockets is to have the server push messages to the client in an asynchronous way. This can be a Django view triggered by other browsers (e.g. chat clients, multiplayer games) or an event triggered by, say, django-celery (e.g. sports results). It is therefore fundamental for these Django services to use a message queue for pushing messages to the client.
To handle this in a scalable way, I wrote django-websocket-redis, a Django module which can keep open all those long-lived Websocket connections in one single thread/process, using Redis as the backend message queue.
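A rough sketch of uWSGI's native websocket API mentioned above (a bare echo application; the uwsgi module is only importable when running under uWSGI, and the run command shown is just one possible invocation):

import uwsgi  # provided by uWSGI itself, not installable via pip

def application(env, start_response):
    # Complete the websocket handshake, then pump messages until the peer
    # goes away (websocket_recv raises an error when the connection closes).
    uwsgi.websocket_handshake(env['HTTP_SEC_WEBSOCKET_KEY'],
                              env.get('HTTP_ORIGIN', ''))
    while True:
        message = uwsgi.websocket_recv()
        uwsgi.websocket_send(message)

# e.g. uwsgi --http :8080 --http-websockets --wsgi-file thisfile.py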
You could give stargate a bash: http://boothead.github.com/stargate/ and http://pypi.python.org/pypi/stargate/.
It's built on top of pyramid and eventlet (I also contributed a fair bit of the websocket support and tests to eventlet). The big advantage of pyramid for this sort of thing is that it's got the concept of a resource which the url maps to, rather than just the result of a callable. So you end up with a graph of persistent resources that maps to your url structure and websocket connections are simply routed and connected to those resources.
So you end up only needing to do two things:
class YourView(WebSocketView):
    def handler(self, websocket):
        self.request.context.add_listener(websocket)
        while True:
            msg = websocket.wait()
            # Do something with message
to receive messages,
and
resource.send(some_other_message)
Here resource is an instance of stargate.resource.WebSocketAwareContext (as is self.request.context above), and the send method sends the message to all clients connected with the add_listener method.
To publish a message to all of the connected clients you just call node.send(message)
I'm hopefully going to write up a little example app in the next week or two to demonstrate this a little better.
Feel free to ping me on github if you want some help with it.
request.websocket probably gets closed when you return from the request handler (the view). The simple solution is to keep the handler alive by not returning from the view. If your server is not multi-threaded, you won't be able to accept any other simultaneous requests.
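A rough sketch of that "don't return" approach, assuming django-websocket's decorator and message-iteration API (iteration ends when the client disconnects):

from django_websocket import require_websocket

@require_websocket
def echo(request):
    # The view blocks here for the lifetime of the websocket; returning
    # from the view is what closes the socket.
    for message in request.websocket:
        request.websocket.send(message)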
I wrote a server based on Twisted, and I ran into a problem: some clients disconnect non-gracefully, for example when the user pulls out the network cable.
After a while, the client on Windows notices the disconnection (its connectionLost is called; the client is also written with Twisted). But on the Linux server side, my Twisted connectionLost is never triggered, even when I try to write data to the client after the connection is lost. Why can't Twisted detect these non-graceful disconnections (even when writing data to the client) on Linux? How can I make Twisted detect them? Because of this, I have lots of zombie users on my server.
---- Update ----
I thought it might be a property of sockets on Unix-like OSes, so what is the socket behavior on Unix-like systems for handling a situation like this?
Thanks.
Victor Lin.
You're describing the behavior of TCP connections on an unreliable network. Twisted is merely exposing this behavior: after all, when you set up a TCP connection with Twisted, it is nothing more than a TCP connection.
You're mistaken when you say that the connectionLost callback isn't invoked even if you try to send data over the connection. If you write to a dead connection, then after roughly two minutes of failed retransmissions the underlying TCP connection will be dropped and Twisted will inform you of this by calling connectionLost.
If you need to detect this condition more quickly than that, then you can implement your own timeouts using reactor.callLater.
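A rough sketch of such a callLater-based watchdog (the 30-second value is arbitrary, and abortConnection is used so we don't wait for unsendable data to flush):

from twisted.internet import reactor
from twisted.internet.protocol import Protocol

class WatchdogProtocol(Protocol):
    TIMEOUT = 30  # seconds of silence before the peer is declared dead

    def connectionMade(self):
        self._watchdog = reactor.callLater(self.TIMEOUT, self._timedOut)

    def dataReceived(self, data):
        # Any traffic from the peer pushes the deadline back.
        self._watchdog.reset(self.TIMEOUT)

    def _timedOut(self):
        self.transport.abortConnection()  # drop the zombie connection immediately

    def connectionLost(self, reason):
        if self._watchdog.active():
            self._watchdog.cancel()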
Seconding what Jean-Paul said, if you need more fine-grained TCP connection management, just use reactor.callLater. We have exactly that implementation on a Twisted/wxPython trading platform, and it works a treat. You might also want to tweak the behaviour of ReconnectingClientFactory in order to achieve the results I understand you're looking for.