How to debug Protocol.dataReceived in Twisted - python

I'm new to Twisted and I'm having trouble debugging my code inside the dataReceived method of a twisted.internet.protocol.Protocol object.
Given some code like this:
class Printer(Protocol):
    def dataReceived(self, data):
        print data  # Works perfectly
        print toto  # should trigger some error since "toto" is not defined
...
response.deliverBody(Printer())
I couldn't find a way to add an errback to dataReceived. Is there a way? Or another way to debug its behavior?
Thanks in advance for your help.

You can't attach an errback to dataReceived directly, because that method isn't a Deferred that users have control over; you can only call addErrback on Deferred objects. Here is an example of how to catch errors:
from twisted.internet.protocol import Protocol
from twisted.internet.defer import Deferred

class Printer(Protocol):
    def dataReceived(self, data):
        d = Deferred()
        d.addCallback(self.display_data)
        d.addErrback(self.error_func)
        d.callback(data)

    def display_data(self, data):
        print(data)
        print(toto)  # this will raise a NameError

    def error_func(self, error):
        print('[!] Whoops here is the error: {0}'.format(error))
A Deferred is created in dataReceived; its callback, display_data(), prints data and then references the undefined toto variable. An errback (i.e. self.error_func()) is chained on to catch errors that occur in display_data(). You should strive very hard not to have errors in dataReceived itself; that isn't always possible, but one should try. Hope this helps.
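If you'd rather not build the Deferred by hand, a variation (a sketch, not from the answer above) is to let defer.maybeDeferred wrap the handler call, so any exception the handler raises is turned into a Failure that reaches the errback:

from twisted.internet.defer import maybeDeferred
from twisted.internet.protocol import Protocol
from twisted.python import log

class DebuggablePrinter(Protocol):
    # Hypothetical variant of the Printer class above.
    def dataReceived(self, data):
        # maybeDeferred wraps the call; if handle() raises, the
        # exception becomes a Failure delivered to the errback.
        d = maybeDeferred(self.handle, data)
        d.addErrback(log.err)  # log the traceback instead of losing it

    def handle(self, data):
        print(data)
        print(toto)  # NameError ends up in the errback, not swallowed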

Related

Checking Servers with Motor(Mongodb & Tornado)

I need to create a function that checks that the Mongo servers are running, using the ping function. I set up the clients like this (the config file has a dictionary with port numbers):
clientList = []
for value in configuration["mongodbServer"]:
    client = motor.motor_tornado.MotorClient('mongodb://localhost:{}'.format(value))
    clientList.append(client)
and then I run this function:
class MongoChecker(Checker):
    formatter = 'stashboard.formatters.MongoFormatter'

    def check(self):
        for x in clientList:
            if x.ping:
                return x.ping
and the error I get:
yielded unknown object MotorDatabase(Database(MongoClient([]), 'ping'))
I think my issue is that I'm using the ping function wrong. I can't find any other documentation on it, or any other feature that would check whether the servers are still running. If anyone knows of a better way to monitor the status using Motor, I'm open. Thanks!
First, there's no "ping" function, so MotorClient thinks you're trying to access the database named "ping"; that database is what appears in the "unknown object" exception. For MongoDB commands like "ping", just use MotorDatabase's command method.
Second, Motor is asynchronous. You must use Motor methods in a Tornado coroutine with the "yield" statement. For example:
@gen.coroutine
def check():
    try:
        result = yield client.admin.command({'ping': 1})
        print(result)
    except ConnectionFailure as exc:
        print(exc)
If you want to test this out synchronously, you can run the IOLoop just long enough for the coroutine to complete:
from pymongo.errors import ConnectionFailure
from tornado import gen
from tornado.ioloop import IOLoop
import motor.motor_tornado
client = motor.motor_tornado.MotorClient()
IOLoop.current().run_sync(check)
For an introduction to Tornado coroutines, see Refactoring Tornado Coroutines and the Tornado documentation.
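Tying this back to the clientList from the question, a hedged sketch of checking every server in one coroutine might look like the following (check_all is an illustrative name, not from the answer):

from pymongo.errors import ConnectionFailure
from tornado import gen

@gen.coroutine
def check_all(client_list):
    # Ping every Motor client and report which servers respond.
    statuses = []
    for client in client_list:
        try:
            yield client.admin.command({'ping': 1})
            statuses.append(True)
        except ConnectionFailure:
            statuses.append(False)
    raise gen.Return(statuses)  # coroutine-style return value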

How to use Tornado.gen.coroutine in TCP Server?

I wrote a TCP server with Tornado.
Here is the code:
#!/usr/bin/env python
# coding=utf-8
from tornado.tcpserver import TCPServer
from tornado.ioloop import IOLoop
from tornado.gen import *

class TcpConnection(object):
    def __init__(self, stream, address):
        self._stream = stream
        self._address = address
        self._stream.set_close_callback(self.on_close)
        self.send_messages()

    def send_messages(self):
        self.send_message(b'hello \n')
        print("next")
        self.read_message()
        self.send_message(b'world \n')
        self.read_message()

    def read_message(self):
        self._stream.read_until(b'\n', self.handle_message)

    def handle_message(self, data):
        print(data)

    def send_message(self, data):
        self._stream.write(data)

    def on_close(self):
        print("the monitored %d has left", self._address)

class MonitorServer(TCPServer):
    def handle_stream(self, stream, address):
        print("new connection", address, stream)
        conn = TcpConnection(stream, address)

if __name__ == '__main__':
    print('server start .....')
    server = MonitorServer()
    server.listen(20000)
    IOLoop.instance().start()
And I hit the error assert self._read_callback is None, "Already reading". I guess the error is caused by issuing multiple reads on the socket at the same time, so I rewrote send_messages with tornado.gen.coroutine. Here is the code:
@gen.coroutine
def send_messages(self):
    yield self.send_message(b'hello \n')
    response1 = yield self.read_message()
    print(response1)
    yield self.send_message(b'world \n')
    print((yield self.read_message()))
But there are some other errors: the code seems to stop after yield self.send_message(b'hello \n'), and the following code does not seem to execute.
What should I do about it? If you're aware of any Tornado TCP server (not HTTP!) code using tornado.gen.coroutine, please tell me. I would appreciate any links!
send_messages() calls send_message() and read_message() with yield, but these methods are not coroutines, so this will raise an exception.
The reason you're not seeing the exception is that you called send_messages() without yielding it, so the exception has nowhere to go (the garbage collector should eventually notice and print the exception, but that can take a long time). Whenever you call a coroutine, you should either use yield to wait for it to finish, or IOLoop.current().spawn_callback() to run the coroutine in the "background" (this tells Tornado that you do not intend to yield the coroutine, so it will print the exception as soon as it occurs). Also, whenever you override a method you should read the documentation to see whether coroutines are allowed (when you override TCPServer.handle_stream() you can make it a coroutine, but __init__() may not be a coroutine).
Once the exception is getting logged, the next step is to fix it. You can either make send_message() and read_message() coroutines (getting rid of the handle_message() callback in the process), or you can use tornado.gen.Task() to call coroutine-style code from a coroutine. I generally recommend using coroutines everywhere.
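For illustration only, here is a hedged sketch of the coroutine style the answer recommends, assuming Tornado 4.x where IOStream.read_until and write return Futures when called without a callback:

from tornado import gen
from tornado.ioloop import IOLoop
from tornado.tcpserver import TCPServer

class MonitorServer(TCPServer):
    @gen.coroutine
    def handle_stream(self, stream, address):
        # handle_stream may be a coroutine, so drive the whole
        # conversation here instead of in TcpConnection.__init__.
        try:
            yield stream.write(b'hello \n')
            reply = yield stream.read_until(b'\n')  # yields a Future
            print(reply)
            yield stream.write(b'world \n')
            print((yield stream.read_until(b'\n')))
        except Exception as exc:
            print('connection %s closed: %r' % (address, exc))

if __name__ == '__main__':
    server = MonitorServer()
    server.listen(20000)
    IOLoop.current().start()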

Scheduling a coroutine for a later loop iteration

I borrowed this code for a simple chat:
import tornado.ioloop
import tornado.web
import tornado.websocket
import tornado.gen

clients = []

class IndexHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(request):
        request.render("index.html")

class WebSocketChatHandler(tornado.websocket.WebSocketHandler):
    def open(self, *args):
        print("open", "WebSocketChatHandler")
        clients.append(self)

    def check_origin(self, origin):
        return True

    @tornado.gen.coroutine
    def on_message(self, message):
        for client in clients:
            client.write_message(message)

        @tornado.gen.coroutine
        def myroutine(m):
            print "mensaje: "
            c = (yield 123123123)
            print("mensaje", m, c)

        yield myroutine(message)

    def on_close(self):
        clients.remove(self)

app = tornado.web.Application([(r'/chat', WebSocketChatHandler), (r'/', IndexHandler)])
app.listen(8888)
tornado.ioloop.IOLoop.instance().start()
The Chat application works well (i.e. I see the echoes using a websocket client), and I modified it a bit to test some custom code.
And, just for testing purposes, I wanted to insert a presumably heavy function call which I wanted to make asynchronous.
The actual intention here is that myroutine will start a game engine as a parallel task.
Perhaps I am missing something, but the intention in my code is to re-schedule the coroutine in two parts. That is: the coroutine should print "mensaje:", then yield the value 123123123 (an immediate value which I expected to be wrapped into an already-resolved future, with the value as its result), thus rescheduling itself to the next iteration, and (in that later iteration) print the tuple ("mensaje", message, c).
My issue is that the coroutine is never rescheduled (i.e. only "mensaje:" is printed to the console).
What am I doing wrong? This is my first attempt at Tornado (and async programming in general). How can I tell the Tornado loop something like "dude, this value is my coroutine, and those are the arguments for my coroutine. Please start it in parallel by scheduling it in the next loop"?
There are two problems going on: first, you can't yield every kind of object from a coroutine, you must yield a Future or other special yieldable object. So when your coroutine yields 123123123, Tornado throws a "bad yield" exception. Unfortunately, Tornado's websocket code isn't built to catch exceptions from "on_message" if "on_message" is a coroutine, so the exception passes silently. See the warning at the bottom of the coroutine documentation.
The solution for you is to yield a valid object from "myroutine". If you just want to yield for a moment, yield "gen.moment":
print "one"
yield gen.moment
print "two"
If you want "mycoroutine" to run in parallel and not block "on_message", just call it without yielding:
mycoroutine(message)
But! Calling a coroutine this way means no one is listening to see if it throws an exception. Make sure you catch and log all exceptions within "mycoroutine", since otherwise they will pass silently.
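A hedged sketch of that fire-and-forget pattern (spawn_callback is the Tornado 4+ way to run a coroutine in the background; the function below is an illustrative stand-in for the body of on_message):

from tornado import gen
from tornado.ioloop import IOLoop
import traceback

@gen.coroutine
def myroutine(m):
    try:
        print("mensaje:")
        yield gen.moment            # give the loop a chance to run other work
        print("mensaje", m)
    except Exception:
        traceback.print_exc()       # nothing yields us, so log errors ourselves

def schedule_in_background(message):
    # Fire-and-forget: scheduled on the IOLoop, not awaited here.
    IOLoop.current().spawn_callback(myroutine, message)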

Terminate a hung redis pubsub.listen() thread

Related to this question, I have the following code, which subscribes to a Redis pub/sub queue and uses the handler provided in __init__ to feed the messages to the class that processes them:
from threading import Thread
import msgpack

class Subscriber(Thread):
    def __init__(self, redis_connection, channel_name, handler):
        super(Subscriber, self).__init__(name="Receiver")
        self.connection = redis_connection
        self.pubsub = self.connection.pubsub()
        self.channel_name = channel_name
        self.handler = handler
        self.should_die = False

    def start(self):
        self.pubsub.subscribe(self.channel_name)
        super(Subscriber, self).start()

    def run(self):
        for msg in self.pubsub.listen():
            if self.should_die:
                return
            try:
                data = msg["data"]
                unpacked = msgpack.unpackb(data)
            except TypeError:
                # stop non-msgpacked, invalid, messages breaking stuff
                # other validation happens in handler
                continue
            self.handler(unpacked)

    def die(self):
        self.should_die = True
In the linked question above, it is noted that pubsub.listen() never returns if the connection is dropped. Therefore, my die() function, while it can be called, will never actually cause the thread to terminate because it is hanging on the call to listen() inside the thread's run().
The accepted answer on the linked question mentions hacking redis-py's connection pool. I really don't want to do this and have a forked version of redis-py (at least until the fix is hopefully accepted into master), but I've had a look at the redis-py code anyway and don't immediately see where this change would be made.
Does anyone have an idea how to cleanly solve the hanging redis-py listen() call?
What issues will I incur by directly using Thread._Thread__stop?
To close this out after so many years: it ended up being a bug in the redis library. I debugged it and submitted a PR, so it shouldn't occur any more.
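For anyone who still needs a way to break out of a blocked listener, one common workaround (a sketch, not part of the original fix, assuming a redis-py version that provides PubSub.get_message) is to poll with a timeout instead of iterating listen(), so the loop can check the stop flag regularly:

import msgpack

def run(self):
    # Polling variant of Subscriber.run(): get_message() returns None
    # after `timeout` seconds instead of blocking forever like listen().
    while not self.should_die:
        msg = self.pubsub.get_message(timeout=1.0)
        if msg is None:
            continue
        try:
            unpacked = msgpack.unpackb(msg["data"])
        except TypeError:
            continue  # skip non-msgpacked / invalid messages
        self.handler(unpacked)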

Twisted deferred vs blocking in web services

I'm struggling to produce the same behavior in web service code that uses Deferred objects as in code that does not. My objective is to write a decorator that will delegate processing of any method (which is decoupled from Twisted) to the Twisted thread pool, so that the reactor is not blocked, without changing any of that method's semantics.
When an instance of class echo below is exposed as a web service, this code:
from twisted.web import server, resource
from twisted.internet import defer, threads
from cgi import escape
from itertools import count

class echo(resource.Resource):
    isLeaf = True

    def errback(self, failure): return failure

    def callback1(self, request, value):
        #raise ValueError # E1
        lines = ['<html><body>\n',
                 '<p>Page view #%s in this session</p>\n' % (value,),
                 '</body></html>\n']
        return ''.join(lines)

    def callback2(self, request, encoding):
        def execute(message):
            #raise ValueError # E2
            request.write(message.encode(encoding))
            #raise ValueError # E3
            request.finish()
            #raise ValueError # E4
            return server.NOT_DONE_YET
        return execute

    def render_GET(self, request):
        content_type, encoding = 'text/html', 'UTF-8'
        request.setHeader('Content-Type', '%s; charset=%s' %
                          tuple(map(str, (content_type, encoding))))
        s = request.getSession()
        if not hasattr(s, 'counter'):
            s.counter = count(1)
        d = threads.deferToThread(self.callback1, request, s.counter.next())
        d.addCallback(self.callback2(request, encoding))
        d.addErrback(self.errback)
        #raise ValueError # E5
        return server.NOT_DONE_YET
will display an HTML document to the browser when all the raise statements are commented out, and display a nicely formatted stack trace (which Twisted does for me) when the raise statement labelled "E5" is included. That is what I want. Likewise, if I do not use Deferred objects at all and place all the behavior from callback1 and callback2 within render_GET(), an exception raised anywhere within render_GET will produce the desired stack trace.
I am trying to write code that will respond to the browser immediately, not cause resource leaks within Twisted, and cause the browser stack trace to also be displayed in the cases where any of the raise statements "E1" to "E3" is included in the deferred code--though of course I understand that the stack traces themselves will be different. (The "E4" case I don't care about as much.) After reading the Twisted documentation and other questions on this site I am unsure how to achieve this. I would have thought that adding an errback should facilitate this, but evidently not. There must be something about Deferred objects and the twisted.web stack that I'm not understanding.
The effects on logging I document here may be affected by my use of the PythonLoggingObserver to bridge Twisted logging to the standard logging module.
When "E1" is included, the browser waits until the reactor is shut down, at which point the ValueError exception with stack trace is logged and the browser receives an empty document.
When "E2" is included, the ValueError exception with stack trace is logged immediately, but the browser waits until the reactor shuts down at which point it receives an empty document.
When "E3" is included, the ValueError exception with stack trace is logged immediately, the browser waits until the reactor shuts down, and at that point receives the intended document.
When raise statement "E4" is included, the intended document is returned to the browser immediately, and the ValueError exception with stack trace is logged immediately. (Is there any possibility of a resource leak in this case?)
OK, after reading your question several times, I think I understand what you're asking. I have also reworked your code to make it a little better than your original answer. This new answer should show off the power of Deferreds.
from twisted.web import server, resource
from twisted.internet import defer, threads
from itertools import count

class echo(resource.Resource):
    isLeaf = True

    def errback(self, failure, request):
        failure.printTraceback()  # This prints the traceback in a way that looks like a python exception.
        # log.err(failure)        # This uses the twisted logger instead. This is the best method,
        #                         # but you need to import twisted's log.
        request.processingFailed(failure)  # This sends a trace to the browser and closes the request.
        return None  # We have dealt with the failure. Clean it out now.

    def final(self, message, request, encoding):
        # message contains the string returned by callback1.
        request.write(message.encode(encoding))  # Write the message and return it to the browser.
        request.finish()  # Done

    def callback1(self, value):
        #raise ValueError # E1
        lines = ['<html><body>\n',
                 '<p>Page view #%s in this session</p>\n' % (value,),
                 '</body></html>\n']
        return ''.join(lines)
        #raise ValueError # E4

    def render_GET(self, request):
        content_type, encoding = 'text/html', 'UTF-8'
        request.setHeader('Content-Type', '%s; charset=%s' %
                          tuple(map(str, (content_type, encoding))))
        s = request.getSession()
        if not hasattr(s, 'counter'):
            s.counter = count(1)
        d = threads.deferToThread(self.callback1, s.counter.next())
        d.addCallback(self.final, request, encoding)
        d.addErrback(self.errback, request)  # Placed here in case the encoding step raised an exception.
        #raise ValueError # E5
        return server.NOT_DONE_YET
Also, I recommend that you read the krondo tutorial. It will teach you everything you need to know about Deferreds.
Edit:
I have modified the code above to fix some silly bugs and improved it. If an exception happens anywhere (except in self.errback, but we need some level of trust), it will be passed to self.errback, which will log or print the error in Twisted, send the trace to the browser, and close the request. This should stop any resource leaks.
I figured it out by digging through the Twisted source. The necessary insight is that the reactor and Deferred callback/errback chain logic is decoupled from the request object, which is how data gets back to the browser. The errback is necessary, but cannot merely propagate the Failure object down the chain as in the original code I posted. The errback must report the error to the browser.
The below code meets my requirements (never keeps the browser waiting, always gives the stack trace, does not require a reactor restart to get things going again) and will allow me to decorate blocking methods and thereby delegate them to threads to keep the reactor responsive to other events (such methods will essentially take the place of callback1 here). However, I did find that in the below code, uncommenting the "E4" raise statement produces very strange behavior on subsequent browser requests (partial data from previous requests returned to the browser; deadlock).
Hopefully others will find this to be a useful Deferred example.
from twisted.web import server, resource
from twisted.internet import defer, threads
from itertools import count

class echo(resource.Resource):
    isLeaf = True

    def errback(self, request):
        def execute(failure):
            request.processingFailed(failure)
            return failure
        return execute

    def callback1(self, value):
        #raise ValueError # E1
        lines = ['<html><body>\n',
                 '<p>Page view #%s in this session</p>\n' % (value,),
                 '</body></html>\n']
        return ''.join(lines)

    def callback2(self, request, encoding):
        def execute(message):
            #raise ValueError # E2
            request.write(message.encode(encoding))
            #raise ValueError # E3
            request.finish()
            #raise ValueError # E4
            return server.NOT_DONE_YET
        return execute

    def render_GET(self, request):
        content_type, encoding = 'text/html', 'UTF-8'
        request.setHeader('Content-Type', '%s; charset=%s' %
                          tuple(map(str, (content_type, encoding))))
        s = request.getSession()
        if not hasattr(s, 'counter'):
            s.counter = count(1)
        d = threads.deferToThread(self.callback1, s.counter.next())
        eback = self.errback(request)
        d.addErrback(eback)
        d.addCallback(self.callback2(request, encoding))
        d.addErrback(eback)
        #raise ValueError # E5
        return server.NOT_DONE_YET
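To round this off, here is a hedged sketch of the kind of decorator the question describes, delegating a blocking method to the Twisted thread pool; the names (run_in_threadpool, handle_request) are illustrative, not from the original post:

from functools import wraps
from twisted.internet import threads

def run_in_threadpool(method):
    # Run a blocking method in Twisted's thread pool, returning a Deferred.
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        return threads.deferToThread(method, self, *args, **kwargs)
    return wrapper

class Worker(object):
    @run_in_threadpool
    def handle_request(self, value):
        # Blocking work, decoupled from Twisted, now runs off the reactor thread.
        return '<p>Page view #%s</p>' % (value,)

# In render_GET you would then do roughly:
#   d = worker.handle_request(s.counter.next())
#   d.addCallback(self.callback2(request, encoding))
#   d.addErrback(self.errback(request))
#   return server.NOT_DONE_YET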
