Twisted deferred vs blocking in web services

Twisted deferred vs blocking in web services - python

I'm struggling to produce the same behavior in web service code that uses Deferred objects as in code that does not. My objective is to write a decorator that will delegate processing of any method (which is decoupled from Twisted) to the Twisted thread pool, so that the reactor is not blocked, without changing any of that method's semantics.
When an instance of class echo below is exposed as a web service, this code:
from twisted.web import server, resource
from twisted.internet import defer, threads
from cgi import escape
from itertools import count
class echo(resource.Resource):
isLeaf = True
def errback(self, failure): return failure
def callback1(self, request, value):
#raise ValueError # E1
lines = ['<html><body>\n',
'<p>Page view #%s in this session</p>\n' % (value,),
'</body></html>\n']
return ''.join(lines)
def callback2(self, request, encoding):
def execute(message):
#raise ValueError # E2
request.write(message.encode(encoding))
#raise ValueError # E3
request.finish()
#raise ValueError # E4
return server.NOT_DONE_YET
return execute
def render_GET(self, request):
content_type, encoding = 'text/html', 'UTF-8'
request.setHeader('Content-Type', '%s; charset=%s' %
tuple(map(str, (content_type, encoding))))
s = request.getSession()
if not hasattr(s, 'counter'):
s.counter = count(1)
d = threads.deferToThread(self.callback1, request, s.counter.next())
d.addCallback(self.callback2(request, encoding))
d.addErrback(self.errback)
#raise ValueError # E5
return server.NOT_DONE_YET
will display an HTML document to the browser when all the raise statements are commented out, and display a nicely formatted stack trace (which Twisted does for me) when the raise statement labelled "E5" is included. That is what I want. Likewise, if I do not use Deferred objects at all and place all the behavior from callback1 and callback2 within render_GET(), an exception raised anywhere within render_GET will produce the desired stack trace.
I am trying to write code that will respond to the browser immediately, not cause resource leaks within Twisted, and cause the browser stack trace to also be displayed in the cases where any of the raise statements "E1" to "E3" is included in the deferred code--though of course I understand that the stack traces themselves will be different. (The "E4" case I don't care about as much.) After reading the Twisted documentation and other questions on this site I am unsure how to achieve this. I would have thought that adding an errback should facilitate this, but evidently not. There must be something about Deferred objects and the twisted.web stack that I'm not understanding.
The effects on logging I document here may be affected by my use of the PythonLoggingObserver to bridge Twisted logging to the standard logging module.
When "E1" is included, the browser waits until the reactor is shut down, at which point the ValueError exception with stack trace is logged and the browser receives an empty document.
When "E2" is included, the ValueError exception with stack trace is logged immediately, but the browser waits until the reactor shuts down at which point it receives an empty document.
When "E3" is included, the ValueError exception with stack trace is logged immediately, the browser waits until the reactor shuts down, and at that point receives the intended document.
When raise statement "E4" is included, the intended document is returned to the browser immediately, and the ValueError exception with stack trace is logged immediately. (Is there any possibility of a resource leak in this case?)

Ok, after reading your question several times, I think I understand what your asking. I have also reworked you code to make a little better than your original answer. This new answer should show off all the powers of deferred's.
from twisted.web import server, resource
from twisted.internet import defer, threads
from itertools import count
class echo(resource.Resource):
isLeaf = True
def errback(self, failure, request):
failure.printTraceback() # This will print the trace back in a way that looks like a python exception.
# log.err(failure) # This will use the twisted logger. This is the best method, but
# you need to import twisted log.
request.processingFailed(failure) # This will send a trace to the browser and close the request.
return None # We have dealt with the failure. Clean it out now.
def final(self, message, request, encoding):
# Message will contain the message returned by callback1
request.write(message.encode(encoding)) # This will write the message and return it to the browser.
request.finish() # Done
def callback1(self, value):
#raise ValueError # E1
lines = ['<html><body>\n',
'<p>Page view #%s in this session</p>\n' % (value,),
'</body></html>\n']
return ''.join(lines)
#raise ValueError # E4
def render_GET(self, request):
content_type, encoding = 'text/html', 'UTF-8'
request.setHeader('Content-Type', '%s; charset=%s' %
tuple(map(str, (content_type, encoding))))
s = request.getSession()
if not hasattr(s, 'counter'):
s.counter = count(1)
d = threads.deferToThread(self.callback1, s.counter.next())
d.addCallback(self.final, request, encoding)
d.addErrback(self.errback, request) # We put this here in case the encoding raised an exception.
#raise ValueError # E5
return server.NOT_DONE_YET
Also I recommend that you read the krondo tutorial. It will teach you everything you need to know about deferred.
Edit:
Have modified the code above to fix some silly bugs. Also improved it. If an exception happens anywhere (except in self.errback, but we need some level of trust) then it will be passed to self.errback which will log or print the error in twisted and then send the trace to the browser and close the request. This should stop any resource leaks.

I figured it out by digging through the Twisted source. The necessary insight is that the reactor and Deferred callback/errback chain logic is decoupled from the request object, which is how data gets back to the browser. The errback is necessary, but cannot merely propagate the Failure object down the chain as in the original code I posted. The errback must report the error to the browser.
The below code meets my requirements (never keeps the browser waiting, always gives the stack trace, does not require a reactor restart to get things going again) and will allow me to decorate blocking methods and thereby delegate them to threads to keep the reactor responsive to other events (such methods will essentially take the place of callback1 here). However, I did find that in the below code, uncommenting the "E4" raise statement produces very strange behavior on subsequent browser requests (partial data from previous requests returned to the browser; deadlock).
Hopefully others will find this to be a useful Deferred example.
from twisted.web import server, resource
from twisted.internet import defer, threads
from itertools import count
class echo(resource.Resource):
isLeaf = True
def errback(self, request):
def execute(failure):
request.processingFailed(failure)
return failure
return execute
def callback1(self, value):
#raise ValueError # E1
lines = ['<html><body>\n',
'<p>Page view #%s in this session</p>\n' % (value,),
'</body></html>\n']
return ''.join(lines)
def callback2(self, request, encoding):
def execute(message):
#raise ValueError # E2
request.write(message.encode(encoding))
#raise ValueError # E3
request.finish()
#raise ValueError # E4
return server.NOT_DONE_YET
return execute
def render_GET(self, request):
content_type, encoding = 'text/html', 'UTF-8'
request.setHeader('Content-Type', '%s; charset=%s' %
tuple(map(str, (content_type, encoding))))
s = request.getSession()
if not hasattr(s, 'counter'):
s.counter = count(1)
d = threads.deferToThread(self.callback1, s.counter.next())
eback = self.errback(request)
d.addErrback(eback)
d.addCallback(self.callback2(request, encoding))
d.addErrback(eback)
#raise ValueError # E5
return server.NOT_DONE_YET

Related

Request timed out: timeout('timed out') in Python's HTTPServer

I am trying to create a simple HTTP server that uses the Python HTTPServer which inherits BaseHTTPServer. [https://github.com/python/cpython/blob/main/Lib/http/server.py][1]
There are numerous examples of this approach online and I don't believe I am doing anything unusual.
I am simply importing the class via:
"from http.server import HTTPServer, BaseHTTPRequestHandler"
in my code.
My code overrides the do_GET() method to parse the path variable to determine what page to show.
However, if I start this server and connect to it locally (ex: http://127.0.0.1:50000) the first page loads fine. If I navigate to another page (via my first page links) that too works fine, however, on occasion (and this is somewhat sporadic), there is a delay and the server log shows a Request timed out: timeout('timed out') error. I have tracked this down to the handle_one_request method in the BaseHTTPServer class:
def handle_one_request(self):
"""Handle a single HTTP request.
You normally don't need to override this method; see the class
__doc__ string for information on how to handle specific HTTP
commands such as GET and POST.
"""
try:
self.raw_requestline = self.rfile.readline(65537)
if len(self.raw_requestline) > 65536:
self.requestline = ''
self.request_version = ''
self.command = ''
self.send_error(HTTPStatus.REQUEST_URI_TOO_LONG)
return
if not self.raw_requestline:
self.close_connection = True
return
if not self.parse_request():
# An error code has been sent, just exit
return
mname = 'do_' + self.command ## the name of the method is created
if not hasattr(self, mname): ## checking that we have that method defined
self.send_error(
HTTPStatus.NOT_IMPLEMENTED,
"Unsupported method (%r)" % self.command)
return
method = getattr(self, mname) ## getting that method
method() ## finally calling it
self.wfile.flush() #actually send the response if not already done.
except socket.timeout as e:
# a read or a write timed out. Discard this connection
self.log_error("Request timed out: %r", e)
self.close_connection = True
return
You can see where the exception is thrown in the "except socket.timeout as e:" clause.
I have tried overriding this method by including it in my code but it is not clear what is causing the error so I run into dead ends. I've tried creating very basic HTML pages to see if there was something in the page itself, but even "blank" pages cause the same sporadic issue.
What's odd is that sometimes a page loads instantly, and almost randomly, it will then timeout. Sometimes the same page, sometimes a different page.
I've played with the http.timeout setting, but it makes no difference. I suspect it's some underlying socket issue, but am unable to diagnose it further.
This is on a Mac running Big Sur 11.3.1, with Python version 3.9.4.
Any ideas on what might be causing this timeout, and in particular any suggestions on a resolution. Any pointers would be appreciated.

After further investigation, this particular appears to be an issue with Safari. Running the exact same code and using Firefox does not show the same issue.

Call to async endpoint gets blocked by another thread

I have a tornado webservice which is going to serve something around 500 requests per minute. All these requests are going to hit 1 specific endpoint. There is a C++ program that I have compiled using Cython and use it inside the tornado service as my processor engine. Each request that goes to /check/ will trigger a function call in the C++ program (I will call it handler) and the return value will get sent to user as response.
This is how I wrap the handler class. One important point is that I do not instantiate the handler in __init__. There is another route in my tornado code that I want to start loading the DataStructure after an authroized request hits that route. (e.g. /reload/)
executors = ThreadPoolExecutor(max_workers=4)
class CheckerInstance(object):
def __init__(self, *args, **kwargs):
self.handler = None
self.is_loading = False
self.is_live = False
def init(self):
if not self.handler:
self.handler = pDataStructureHandler()
self.handler.add_words_from_file(self.data_file_name)
self.end_loading()
self.go_live()
def renew(self):
self.handler = None
self.init()
class CheckHandler(tornado.web.RequestHandler):
async def get(self):
query = self.get_argument("q", None).encode('utf-8')
answer = query
if not checker_instance.is_live:
self.write(dict(answer=self.get_argument("q", None), confidence=100))
return
checker_response = await checker_instance.get_response(query)
answer = checker_response[0]
confidence = checker_response[1]
if self.request.connection.stream.closed():
return
self.write(dict(correct=answer, confidence=confidence, is_cache=is_cache))
def on_connection_close(self):
self.wait_future.cancel()
class InstanceReloadHandler(BasicAuthMixin, tornado.web.RequestHandler):
def prepare(self):
self.get_authenticated_user(check_credentials_func=credentials.get, realm='Protected')
def new_file_exists(self):
return True
def can_reload(self):
return not checker_instance.is_loading
def get(self):
error = False
message = None
if not self.can_reload():
error = True
message = 'another job is being processed!'
else:
if not self.new_file_exists():
error = True
message = 'no new file found!'
else:
checker_instance.go_fake()
checker_instance.start_loading()
tornado.ioloop.IOLoop.current().run_in_executor(executors, checker_instance.renew)
message = 'job started!'
if self.request.connection.stream.closed():
return
self.write(dict(
success=not error, message=message
))
def on_connection_close(self):
self.wait_future.cancel()
def main():
app = tornado.web.Application(
[
(r"/", MainHandler),
(r"/check", CheckHandler),
(r"/reload", InstanceReloadHandler),
(r"/health", HealthHandler),
(r"/log-event", SubmitLogHandler),
],
debug=options.debug,
)
checker_instance = CheckerInstance()
I want this service to keep responding after checker_instance.renew starts running in another thread. But this is not what happens. When I hit the /reload/ endpoint and renew function starts working, any request to /check/ halts and waits for the reloading process to finish and then it starts working again. When the DataStructure is being loaded, the service should be in fake mode and respond to people with the same query that they send as input.
I have tested this code in my development environment with an i5 CPU (4 CPU cores) and it works just fine! But in the production environment (3 double-thread CPU cores) the /check/ endpoint halts requests.

It is difficult to fully trace the events being handled because you have clipped out some of the code for brevity. For instance, I don't see a get_response implementation here so I don't know if it is awaiting something itself that could be dependent on the state of checker_instance.
One area I would explore is in the thread-safety (or seeming absence of) in passing the checker_instance.renew to run_in_executor. This feels questionable to me because you are mutating the state of a single instance of CheckerInstance from a separate thread. While it might not break things explicitly, it does seem like this could be introducing odd race conditions or unanticipated copies of memory that might explain the unexpected behavior you are experiencing
If possible, I would make whatever load behavior you have that you want to offload to a thread be completely self-contained and when the data is loaded, return it as the function result which can then be fed back into you checker_instance. If you were to do this with the code as-is, you would want to await the run_in_executor call for its result and then update the checker_instance. This would mean the reload GET request would wait until the data was loaded. Alternatively, in your reload GET request, you could ioloop.spawn_callback to a function that triggers the run_in_executor in this manner, allowing the reload request to complete instead of waiting.

How to use Tornado.gen.coroutine in TCP Server?

i write a Tcp Server with Tornado.
here is the code:
#! /usr/bin/env python
#coding=utf-8
from tornado.tcpserver import TCPServer
from tornado.ioloop import IOLoop
from tornado.gen import *
class TcpConnection(object):
def __init__(self,stream,address):
self._stream=stream
self._address=address
self._stream.set_close_callback(self.on_close)
self.send_messages()
def send_messages(self):
self.send_message(b'hello \n')
print("next")
self.read_message()
self.send_message(b'world \n')
self.read_message()
def read_message(self):
self._stream.read_until(b'\n',self.handle_message)
def handle_message(self,data):
print(data)
def send_message(self,data):
self._stream.write(data)
def on_close(self):
print("the monitored %d has left",self._address)
class MonitorServer(TCPServer):
def handle_stream(self,stream,address):
print("new connection",address,stream)
conn = TcpConnection(stream,address)
if __name__=='__main__':
print('server start .....')
server=MonitorServer()
server.listen(20000)
IOLoop.instance().start()
And i face some eorror assert self._read_callback is None, "Already reading",i guess the eorror is because multiple commands to read from socket at the same time.and then i change the function send_messages with tornado.gen.coroutine.here is code:
#gen.coroutine
def send_messages(self):
yield self.send_message(b'hello \n')
response1 = yield self.read_message()
print(response1)
yield self.send_message(b'world \n')
print((yield self.read_message()))
but there are some other errors. the code seem to stop after yield self.send_message(b'hello \n'),and the following code seem not to execute.
how should i do about it ? If you're aware of any Tornado tcpserver (not HTTP!) code with tornado.gen.coroutine,please tell me.I would appreciate any links!

send_messages() calls send_message() and read_message() with yield, but these methods are not coroutines, so this will raise an exception.
The reason you're not seeing the exception is that you called send_messages() without yielding it, so the exception has nowhere to go (the garbage collector should eventually notice and print the exception, but that can take a long time). Whenever you call a coroutine, you should either use yield to wait for it to finish, or IOLoop.current().spawn_callback() to run the coroutine in the "background" (this tells Tornado that you do not intend to yield the coroutine, so it will print the exception as soon as it occurs). Also, whenever you override a method you should read the documentation to see whether coroutines are allowed (when you override TCPServer.handle_stream() you can make it a coroutine, but __init__() may not be a coroutine).
Once the exception is getting logged, the next step is to fix it. You can either make send_message() and read_message() coroutines (getting rid of the handle_message() callback in the process), or you can use tornado.gen.Task() to call coroutine-style code from a coroutine. I generally recommend using coroutines everywhere.

Twisted web - write() called after finish()?

I have the following Resource to handle http POST request with twisted web:
class RootResource(Resource):
isLeaf = True
def errback(self, failure):
print "Request finished with error: %s"%str(failure.value)
return failure
def write_response_happy(self, result):
self.request.write('HAPPY!')
self.request.finish()
def write_response_unhappy(self, result):
self.request.write('UNHAPPY!')
self.request.finish()
#defer.inlineCallbacks
def method_1(self):
#IRL I have many more queries to mySQL, cassandra and memcache to get final result, this is why I use inlineCallbacks to keep the code clean.
res = yield dbpool.runQuery('SELECT something FROM table')
#Now I make a decision based on result of the queries:
if res: #Doesn't make much sense but that's only an example
self.d.addCallback(self.write_response_happy) #self.d is already available after yield, so this looks OK?
else:
self.d.addCallback(self.write_response_unhappy)
returnValue(None)
def render_POST(self, request):
self.request = request
self.d = self.method_1()
self.d.addErrback(self.errback)
return server.NOT_DONE_YET
root = RootResource()
site = server.Site(root)
reactor.listenTCP(8002, site)
dbpool = adbapi.ConnectionPool('MySQLdb', host='localhost', db='mydb', user='myuser', passwd='mypass', cp_reconnect=True)
print "Serving on 8002"
reactor.run()
I've used the ab tool (from apache utils) to test 5 POST requests one after another:
ab -n 5 -p sample_post.txt http://127.0.0.1:8002/
Works fine!
Then I tried to run the same 5 POST requests simultaneously:
ab -n 5 -c 5 -p sample_post.txt http://127.0.0.1:8002/
Here I'm getting errors: exceptions.RuntimeError: Request.write called on a request after Request.finish was called. What am I doing wrong?

As Mualig suggested in his comments, you have only one instance of RootResource. When you assign to self.request and self.d in render_POST, you overwrite whatever value those attributes already had. If two requests arrive at around the same time, then this is a problem. The first Request and Deferred are discarded and replaced by new ones associated with the request that arrives second. Later, when your database operation finishes, the second request gets both results and the first one gets none at all.
This is an example of a general mistake in concurrent programming. Your per-request state is kept where it is shared between multiple requests. When multiple requests are handled concurrently, that sharing turns into a fight, and (at least) one request has to lose.
Try keeping your per-request state where it won't be shared between multiple requests. For example, try keeping it on the Deferred:
class RootResource(Resource):
isLeaf = True
def errback(self, failure):
print "Request finished with error: %s"%str(failure.value)
# You just handled the error, don't return the failure.
# Nothing later in the callback chain is doing anything with it.
# return failure
def write_response(self, result, request):
# No "self.request" anymore, just use the argument
request.write(result)
request.finish()
#defer.inlineCallbacks
def method_1(self):
#IRL I have many more queries to mySQL, cassandra and memcache to get final result, this is why I use inlineCallbacks to keep the code clean.
res = yield dbpool.runQuery('SELECT something FROM table')
#Now I make a decision based on result of the queries:
if res: #Doesn't make much sense but that's only an example
# No "self.d" anymore, just produce a result. No shared state to confuse.
returnValue("HAPPY!")
else:
returnValue("UNHAPPY!")
def render_POST(self, request):
# No more attributes on self. Just start the operation.
d = self.method_1()
# Push the request object into the Deferred. It'll be passed to response,
# which is what needs it. Each call to method_1 returns a new Deferred,
# so no shared state here.
d.addCallback(self.write_response, request)
d.addErrback(self.errback)
return server.NOT_DONE_YET

when to use try/except blocks in GAE

I've recently started developing my first web app with GAE and Python, and it is a lot of fun.
One problem I've been having is exceptions being raised when I don't expect them (since I'm new to web apps). I want to:
Prevent users from ever seeing exceptions
Properly handle exceptions so they don't break my app
Should I put a try/except block around every call to put and get?
What other operations could fail that I should wrap with try/except?

You can create a method called handle_exception on your request handlers to deal with un-expected situations.
The webapp framework will call this automatically when it hits an issue
class YourHandler(webapp.RequestHandler):
def handle_exception(self, exception, mode):
# run the default exception handling
webapp.RequestHandler.handle_exception(self,exception, mode)
# note the error in the log
logging.error("Something bad happend: %s" % str(exception))
# tell your users a friendly message
self.response.out.write("Sorry lovely users, something went wrong")

You can wrap your views in a method that will catch all exceptions, log them and return a handsome 500 error page.
def prevent_error_display(fn):
"""Returns either the original request or 500 error page"""
def wrap(self, *args, **kwargs):
try:
return fn(self, *args, **kwargs)
except Exception, e:
# ... log ...
self.response.set_status(500)
self.response.out.write('Something bad happened back here!')
wrap.__doc__ = fn.__doc__
return wrap
# A sample request handler
class PageHandler(webapp.RequestHandler):
#prevent_error_display
def get(self):
# process your page request

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Twisted deferred vs blocking in web services - python

Related

Request timed out: timeout('timed out') in Python's HTTPServer

Call to async endpoint gets blocked by another thread

How to use Tornado.gen.coroutine in TCP Server?

Twisted web - write() called after finish()?

when to use try/except blocks in GAE

Categories

Resources