I am trying to create a simple HTTP server using Python's HTTPServer class from the http.server module (https://github.com/python/cpython/blob/main/Lib/http/server.py).
There are numerous examples of this approach online and I don't believe I am doing anything unusual.
I am simply importing the classes in my code via:

    from http.server import HTTPServer, BaseHTTPRequestHandler

My code overrides the do_GET() method to parse the path variable to determine what page to show.
However, if I start this server and connect to it locally (e.g. http://127.0.0.1:50000), the first page loads fine. If I navigate to another page (via the links on my first page), that works fine too. On occasion, though (and this is somewhat sporadic), there is a delay and the server log shows a "Request timed out: timeout('timed out')" error. I have tracked this down to the handle_one_request method of the BaseHTTPRequestHandler class:

    def handle_one_request(self):
        """Handle a single HTTP request.

        You normally don't need to override this method; see the class
        __doc__ string for information on how to handle specific HTTP
        commands such as GET and POST.

        """
        try:
            self.raw_requestline = self.rfile.readline(65537)
            if len(self.raw_requestline) > 65536:
                self.requestline = ''
                self.request_version = ''
                self.command = ''
                self.send_error(HTTPStatus.REQUEST_URI_TOO_LONG)
                return
            if not self.raw_requestline:
                self.close_connection = True
                return
            if not self.parse_request():
                # An error code has been sent, just exit
                return
            mname = 'do_' + self.command  ## the name of the method is created
            if not hasattr(self, mname):  ## checking that we have that method defined
                self.send_error(
                    HTTPStatus.NOT_IMPLEMENTED,
                    "Unsupported method (%r)" % self.command)
                return
            method = getattr(self, mname)  ## getting that method
            method()  ## finally calling it
            self.wfile.flush()  # actually send the response if not already done.
        except socket.timeout as e:
            # a read or a write timed out.  Discard this connection
            self.log_error("Request timed out: %r", e)
            self.close_connection = True
            return

You can see that the exception is caught and logged in the "except socket.timeout as e:" clause.
I have tried overriding this method by including it in my code but it is not clear what is causing the error so I run into dead ends. I've tried creating very basic HTML pages to see if there was something in the page itself, but even "blank" pages cause the same sporadic issue.
What's odd is that sometimes a page loads instantly and then, almost at random, it times out. Sometimes it is the same page, sometimes a different one.
I've played with the http.timeout setting, but it makes no difference. I suspect it's some underlying socket issue, but am unable to diagnose it further.
This is on a Mac running Big Sur 11.3.1, with Python version 3.9.4.
Any ideas on what might be causing this timeout and, in particular, any suggestions for a resolution? Any pointers would be appreciated.
After further investigation, this appears to be an issue particular to Safari. Running the exact same code and using Firefox does not show the same issue.
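If the sporadic timeouts come from the browser opening a connection and leaving it idle (an assumption; the post only establishes that the behaviour is specific to Safari), a single-threaded HTTPServer would sit in rfile.readline() on that idle socket while real requests wait. A minimal sketch of one possible mitigation is to serve each connection in its own thread with ThreadingHTTPServer (Python 3.7+) and to set a per-connection timeout on the handler. The PAGES table and the PageHandler, start_server and fetch names are hypothetical and not taken from the original code:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlparse

# Hypothetical page table; the original post does not show its pages.
PAGES = {
    "/": b"<html><body><a href='/about'>About</a></body></html>",
    "/about": b"<html><body>About page</body></html>",
}


class PageHandler(BaseHTTPRequestHandler):
    # Socket timeout in seconds, applied to each connection in setup().
    timeout = 5

    def do_GET(self):
        # Parse the path to decide which page to show.
        path = urlparse(self.path).path
        body = PAGES.get(path)
        if body is None:
            self.send_error(404, "Page not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def start_server(host="127.0.0.1", port=0):
    """Start a threaded server in the background; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url):
    """Return (status, body) for a GET request."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.status, response.read()


server = start_server()
try:
    base_url = "http://127.0.0.1:%d" % server.server_address[1]
    status, body = fetch(base_url + "/about")
finally:
    server.shutdown()
    server.server_close()
```

With one thread per connection, an idle socket only ties up its own handler thread until the timeout expires, so other requests are still answered. Whether this removes the Safari delay in the original setup would need to be checked there.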
Having read through http://krondo.com/an-introduction-to-asynchronous-programming-and-twisted/, I have a basic understanding of Twisted. I have built a testing infrastructure that a few other people also use. From time to time, we experience 'Unhandled error in Deferred' type errors if there are bugs or typos in the code. The problem is that there isn't always an accompanying stack trace with these errors making it hard for people to debug.
I have reproduced the problem with the following simple code:

    from twisted.internet import defer, reactor
    from twisted.internet.task import deferLater

    def sleep(delay):
        """Twisted safe sleep.

        When using Twisted, it is not safe to call time.sleep(). So
        we have this function to emulate the behavior.
        """
        return deferLater(reactor, delay, lambda: None)

    @defer.inlineCallbacks
    def run_test():
        print 'run_test: start'
        bug()
        yield sleep(1)
        print 'run_test: stop'

    @defer.inlineCallbacks
    def run_tests():
        def err(arg):
            print 'err', arg
            return arg

        def success(arg):
            print 'success', arg
            return arg

        d = run_test()
        #d.addCallbacks(success, err)
        try:
            yield d
        finally:
            reactor.stop()

    reactor.callWhenRunning(run_tests)
    reactor.run()

When I run this code, I see the following output:

    run_test: start
    Unhandled error in Deferred:

And if I uncomment the addCallbacks() line above, then I get some stack trace information:

    run_test: start
    err [Failure instance: Traceback: <type 'exceptions.NameError'>: global name 'bug' is not defined
    /usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py:1406:unwindGenerator
    /usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py:1260:_inlineCallbacks
    tmp.py:34:run_tests
    /usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py:1406:unwindGenerator
    --- <exception caught here> ---
    /usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py:1260:_inlineCallbacks
    tmp.py:18:run_test
    ]
    Unhandled error in Deferred:

My question is, is there some way to get the stack trace without having to add callbacks at all defer sites?
The Deferred class has tons of try/except magic, to the point that errors are well hidden from the end user and are only easily obtained via addErrback. If errors are occurring, it's a sign you have broken functionality. In synchronous apps, you might wrap the "problem section" in a try/except block. With inlineCallbacks the same technique works, provided the Deferred is yielded inside the try block so that its failure is re-raised as an ordinary exception:

    @defer.inlineCallbacks
    def run_tests():
        try:
            yield run_test()
        except Exception as err:
            # handle the error here, for example by printing or logging it
            print('run_test failed: %r' % (err,))
        finally:
            reactor.stop()

Since the bug occurs in the run_test() function, catch the exception where that function's Deferred is yielded, then handle the error according to your requirements.
However, if you don't plan to "handle" errors, but rather you want a record that an error has occurred, then you should consider using Twisted logger functionality. This will catch your unhandled tracebacks and log them somewhere.
I've made an IRC bot in Python, and for a while now I've been trying to figure out a way to wait for an IRC command and return the message to a calling function. I refuse to use an external library for various reasons, including that I'm trying to learn to make these things from scratch. Also, I've been sifting through the documentation for existing ones and they're way too comprehensive. I'm trying to make a simple one.
For example:

    def who(bot, nick):
        bot.send('WHO %s' % nick)
        response = ResponseWaiter('352')  # 352 - RPL_WHOREPLY
        return response.msg

This would return an object of my Message class (which parses IRC messages) to the calling function:

    def check_host(bot, nick, host):
        who_info = who(bot, nick)
        if who_info.host == host:
            return True
        return False

I have looked at the reactor pattern, observer pattern, and have tried implementing a hundred different event system designs for this to no avail. I'm completely lost.
Please either provide a solution or point me in the right direction. There's got to be a simple way to do this.
So what I've done is grab messages from my generator (a bot method) inside the bot's who method. The generator looks like this:

    def msg_generator(self):
        ''' Provides messages until bot dies '''
        while self.alive:
            for msg in self.irc.recv(self.buffer).split(('\r\n').encode()):
                if len(msg) > 3:
                    try:
                        yield Message(msg.decode())
                    except Exception as e:
                        self.log('%s %s\n' % (except_str, str(e)))

And now the bot's who method looks like this:

    def who(self, nick):
        self.send('WHO %s' % nick)
        for msg in self.msg_generator():
            if msg.command == '352':
                return msg

However, it's now taking control of the messages, so I need some way of relinquishing the messages I'm not using for the who method to their appropriate handlers.
My bot generally handles all messages with this:

    def handle(self):
        for msg in self.msg_generator():
            self.log('◀ %s' % (msg))
            SpaghettiHandler(self, msg)

So any message that my SpaghettiHandler would be handling is not handled while the bot's who method uses the generator to receive messages.
It's working, and it works fast enough that it's hard to lose a message. But if my bot were taking many commands at the same time, this could become a problem. I'm pretty sure I'll find a solution in this direction, but I didn't post this as the answer because I'm not sure it's a good approach, even once I have it set to relinquish messages that don't pertain to the listener.
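One way to relinquish unrelated messages, sketched here with no claim that it is the best design, is to give the bot a single wait_for method that hands every non-matching message to the normal handler while it waits. The Message namedtuple, the Bot class and the handled list are simplified, hypothetical stand-ins for the real Message class, bot and SpaghettiHandler, and the message source is a plain list instead of a socket:

```python
from collections import namedtuple

# Simplified stand-in for the post's Message class.
Message = namedtuple("Message", ["command", "params"])


class Bot:
    def __init__(self, messages, default_handler):
        # One shared iterator plays the role of msg_generator().
        self._messages = iter(messages)
        self._default_handler = default_handler

    def wait_for(self, command):
        """Return the next message with this command.

        Messages that do not match are passed to the default handler
        instead of being dropped.
        """
        for msg in self._messages:
            if msg.command == command:
                return msg
            self._default_handler(msg)
        return None  # the message stream ended without a match

    def who(self, nick):
        # A real bot would send 'WHO <nick>' here before waiting.
        return self.wait_for("352")  # 352 - RPL_WHOREPLY

    def handle(self):
        """Normal processing loop for everything else."""
        for msg in self._messages:
            self._default_handler(msg)


handled = []
incoming = [
    Message("PING", "server"),
    Message("PRIVMSG", "#chan :hello"),
    Message("352", "bot #chan user example.org server nick H :0 Real Name"),
    Message("PRIVMSG", "#chan :later"),
]
bot = Bot(incoming, handled.append)
reply = bot.who("nick")
bot.handle()
```

Because who and handle read from the same iterator, no message is consumed twice or lost. This is still sequential, so a handler that itself calls wait_for would nest; handling many simultaneous commands would need something like per-command queues, which is outside this sketch.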
I am writing a small customized FTP server and I need to suppress printing exceptions (well, one specific type of exception) to the console, but I still want the server to send "550 Requested action not taken: internal server error" or something like that to the client.
However, when I catch the exception using addErrback(), I no longer see the exception in the console, but the client gets an OK status.
What could I do?
When you catch an error in an errback handler, you should inspect the type of the Failure and, based on the internal logic of your application, send the error to the client as an FTP error message.
twisted.protocols.ftp.FTP handles this with self.reply(ERROR_CODE, "description").
So your code could look something like this:

    from twisted.protocols import ftp

    MY_ERROR = ftp.REQ_ACTN_NOT_TAKEN

    def failureCheck(failureInstance):
        #do some magic to establish if we should reply an Error to this failure
        return True

    class myFTP(ftp.FTP):

        def myActionX(self):
            magicResult = self.doDeferredMagic()
            magicResult.addCallback(self.onMagicSuccess)
            magicResult.addErrback(self.onFailedMagic)

        def onFailedMagic(self, failureInstance):
            if failureCheck(failureInstance):
                self.reply(MY_ERROR, 'Add relevant failure information here')
            else:
                #do whatever other logic here
                pass

I'm struggling to produce the same behavior in web service code that uses Deferred objects as in code that does not. My objective is to write a decorator that will delegate processing of any method (which is decoupled from Twisted) to the Twisted thread pool, so that the reactor is not blocked, without changing any of that method's semantics.
When an instance of class echo below is exposed as a web service, this code:

    from twisted.web import server, resource
    from twisted.internet import defer, threads
    from cgi import escape
    from itertools import count

    class echo(resource.Resource):
        isLeaf = True

        def errback(self, failure): return failure

        def callback1(self, request, value):
            #raise ValueError # E1
            lines = ['<html><body>\n',
                     '<p>Page view #%s in this session</p>\n' % (value,),
                     '</body></html>\n']
            return ''.join(lines)

        def callback2(self, request, encoding):
            def execute(message):
                #raise ValueError # E2
                request.write(message.encode(encoding))
                #raise ValueError # E3
                request.finish()
                #raise ValueError # E4
                return server.NOT_DONE_YET
            return execute

        def render_GET(self, request):
            content_type, encoding = 'text/html', 'UTF-8'
            request.setHeader('Content-Type', '%s; charset=%s' %
                tuple(map(str, (content_type, encoding))))
            s = request.getSession()
            if not hasattr(s, 'counter'):
                s.counter = count(1)
            d = threads.deferToThread(self.callback1, request, s.counter.next())
            d.addCallback(self.callback2(request, encoding))
            d.addErrback(self.errback)
            #raise ValueError # E5
            return server.NOT_DONE_YET

will display an HTML document to the browser when all the raise statements are commented out, and display a nicely formatted stack trace (which Twisted does for me) when the raise statement labelled "E5" is included. That is what I want. Likewise, if I do not use Deferred objects at all and place all the behavior from callback1 and callback2 within render_GET(), an exception raised anywhere within render_GET will produce the desired stack trace.
I am trying to write code that will respond to the browser immediately, not cause resource leaks within Twisted, and cause the browser stack trace to also be displayed in the cases where any of the raise statements "E1" to "E3" is included in the deferred code--though of course I understand that the stack traces themselves will be different. (The "E4" case I don't care about as much.) After reading the Twisted documentation and other questions on this site I am unsure how to achieve this. I would have thought that adding an errback should facilitate this, but evidently not. There must be something about Deferred objects and the twisted.web stack that I'm not understanding.
The effects on logging I document here may be affected by my use of the PythonLoggingObserver to bridge Twisted logging to the standard logging module.
When "E1" is included, the browser waits until the reactor is shut down, at which point the ValueError exception with stack trace is logged and the browser receives an empty document.
When "E2" is included, the ValueError exception with stack trace is logged immediately, but the browser waits until the reactor shuts down at which point it receives an empty document.
When "E3" is included, the ValueError exception with stack trace is logged immediately, the browser waits until the reactor shuts down, and at that point receives the intended document.
When raise statement "E4" is included, the intended document is returned to the browser immediately, and the ValueError exception with stack trace is logged immediately. (Is there any possibility of a resource leak in this case?)
OK, after reading your question several times, I think I understand what you're asking. I have also reworked your code to make it a little better than the original. This new version should show off the power of Deferreds.

    from twisted.web import server, resource
    from twisted.internet import defer, threads
    from itertools import count

    class echo(resource.Resource):
        isLeaf = True

        def errback(self, failure, request):
            failure.printTraceback() # This will print the trace back in a way that looks like a python exception.
            # log.err(failure) # This will use the twisted logger. This is the best method, but
            # you need to import twisted log.
            request.processingFailed(failure) # This will send a trace to the browser and close the request.
            return None # We have dealt with the failure. Clean it out now.

        def final(self, message, request, encoding):
            # Message will contain the message returned by callback1
            request.write(message.encode(encoding)) # This will write the message and return it to the browser.
            request.finish() # Done

        def callback1(self, value):
            #raise ValueError # E1
            lines = ['<html><body>\n',
                     '<p>Page view #%s in this session</p>\n' % (value,),
                     '</body></html>\n']
            return ''.join(lines)
            #raise ValueError # E4

        def render_GET(self, request):
            content_type, encoding = 'text/html', 'UTF-8'
            request.setHeader('Content-Type', '%s; charset=%s' %
                tuple(map(str, (content_type, encoding))))
            s = request.getSession()
            if not hasattr(s, 'counter'):
                s.counter = count(1)
            d = threads.deferToThread(self.callback1, s.counter.next())
            d.addCallback(self.final, request, encoding)
            d.addErrback(self.errback, request) # We put this here in case the encoding raised an exception.
            #raise ValueError # E5
            return server.NOT_DONE_YET

Also I recommend that you read the krondo tutorial. It will teach you everything you need to know about deferred.
Edit:
I have modified the code above to fix some silly bugs, and also improved it. If an exception happens anywhere (except in self.errback, but we need some level of trust), it is passed to self.errback, which logs or prints the error in Twisted, sends the trace to the browser and closes the request. This should stop any resource leaks.
I figured it out by digging through the Twisted source. The necessary insight is that the reactor and Deferred callback/errback chain logic is decoupled from the request object, which is how data gets back to the browser. The errback is necessary, but cannot merely propagate the Failure object down the chain as in the original code I posted. The errback must report the error to the browser.
The below code meets my requirements (never keeps the browser waiting, always gives the stack trace, does not require a reactor restart to get things going again) and will allow me to decorate blocking methods and thereby delegate them to threads to keep the reactor responsive to other events (such methods will essentially take the place of callback1 here). However, I did find that in the below code, uncommenting the "E4" raise statement produces very strange behavior on subsequent browser requests (partial data from previous requests returned to the browser; deadlock).
Hopefully others will find this to be a useful Deferred example.

    from twisted.web import server, resource
    from twisted.internet import defer, threads
    from itertools import count

    class echo(resource.Resource):
        isLeaf = True

        def errback(self, request):
            def execute(failure):
                request.processingFailed(failure)
                return failure
            return execute

        def callback1(self, value):
            #raise ValueError # E1
            lines = ['<html><body>\n',
                     '<p>Page view #%s in this session</p>\n' % (value,),
                     '</body></html>\n']
            return ''.join(lines)

        def callback2(self, request, encoding):
            def execute(message):
                #raise ValueError # E2
                request.write(message.encode(encoding))
                #raise ValueError # E3
                request.finish()
                #raise ValueError # E4
                return server.NOT_DONE_YET
            return execute

        def render_GET(self, request):
            content_type, encoding = 'text/html', 'UTF-8'
            request.setHeader('Content-Type', '%s; charset=%s' %
                tuple(map(str, (content_type, encoding))))
            s = request.getSession()
            if not hasattr(s, 'counter'):
                s.counter = count(1)
            d = threads.deferToThread(self.callback1, s.counter.next())
            eback = self.errback(request)
            d.addErrback(eback)
            d.addCallback(self.callback2(request, encoding))
            d.addErrback(eback)
            #raise ValueError # E5
            return server.NOT_DONE_YET