Let's say I have this object which is a core part of my system, and most of its work is to communicate with a remote agent - send tasks and get responses:
class AgentAPI:
    ...  # Constructor etc.

    def do_foo(self, args):
        ...  # Send data and wait for response
        return result

    def do_bar(self, args):
        ...  # Send data and wait for response
        return result
The way I would go about unit-testing this object is with a mock, right?
So:
def test_agent_api_foo():
    agent_api = AgentAPI(someargs)
    agent_api.do_foo = Mock()
    ...  # test logic
That means the methods are not really executed, of course...
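For illustration, this is roughly what that mocked test ends up looking like when filled in (a sketch only; the return value and assertions are made up). It shows that only the Mock is exercised:

from unittest.mock import Mock  # on Python 2: from mock import Mock

def test_agent_api_foo():
    agent_api = AgentAPI(someargs)
    agent_api.do_foo = Mock(return_value='fake result')
    # Only the Mock runs here; the real do_foo (and the remote call) never executes.
    assert agent_api.do_foo(args) == 'fake result'
    agent_api.do_foo.assert_called_once_with(args)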
So how do I keep my unit-tests reliable on highly distributed systems?
I have a tornado webservice which is going to serve around 500 requests per minute. All of these requests are going to hit one specific endpoint. There is a C++ program that I have compiled using Cython and use inside the tornado service as my processing engine. Each request that goes to /check/ will trigger a function call in the C++ program (I will call it handler) and the return value will be sent to the user as the response.
This is how I wrap the handler class. One important point is that I do not instantiate the handler in __init__. There is another route in my tornado code (e.g. /reload/) from which I want to start loading the DataStructure after an authorized request hits it.
executors = ThreadPoolExecutor(max_workers=4)

class CheckerInstance(object):
    def __init__(self, *args, **kwargs):
        self.handler = None
        self.is_loading = False
        self.is_live = False

    def init(self):
        if not self.handler:
            self.handler = pDataStructureHandler()
            self.handler.add_words_from_file(self.data_file_name)
            self.end_loading()
            self.go_live()

    def renew(self):
        self.handler = None
        self.init()
class CheckHandler(tornado.web.RequestHandler):
    async def get(self):
        query = self.get_argument("q", None).encode('utf-8')
        answer = query
        if not checker_instance.is_live:
            self.write(dict(answer=self.get_argument("q", None), confidence=100))
            return
        checker_response = await checker_instance.get_response(query)
        answer = checker_response[0]
        confidence = checker_response[1]
        if self.request.connection.stream.closed():
            return
        self.write(dict(correct=answer, confidence=confidence, is_cache=is_cache))

    def on_connection_close(self):
        self.wait_future.cancel()
class InstanceReloadHandler(BasicAuthMixin, tornado.web.RequestHandler):
    def prepare(self):
        self.get_authenticated_user(check_credentials_func=credentials.get, realm='Protected')

    def new_file_exists(self):
        return True

    def can_reload(self):
        return not checker_instance.is_loading

    def get(self):
        error = False
        message = None
        if not self.can_reload():
            error = True
            message = 'another job is being processed!'
        else:
            if not self.new_file_exists():
                error = True
                message = 'no new file found!'
            else:
                checker_instance.go_fake()
                checker_instance.start_loading()
                tornado.ioloop.IOLoop.current().run_in_executor(executors, checker_instance.renew)
                message = 'job started!'
        if self.request.connection.stream.closed():
            return
        self.write(dict(
            success=not error, message=message
        ))

    def on_connection_close(self):
        self.wait_future.cancel()
def main():
    app = tornado.web.Application(
        [
            (r"/", MainHandler),
            (r"/check", CheckHandler),
            (r"/reload", InstanceReloadHandler),
            (r"/health", HealthHandler),
            (r"/log-event", SubmitLogHandler),
        ],
        debug=options.debug,
    )

checker_instance = CheckerInstance()
I want this service to keep responding after checker_instance.renew starts running in another thread. But this is not what happens. When I hit the /reload/ endpoint and the renew function starts working, any request to /check/ halts and waits for the reloading process to finish, and only then does it start working again. While the DataStructure is being loaded, the service should be in fake mode and respond to people with the same query that they send as input.
I have tested this code in my development environment with an i5 CPU (4 CPU cores) and it works just fine! But in the production environment (3 double-thread CPU cores) the /check/ endpoint halts requests.
It is difficult to fully trace the events being handled because you have clipped out some of the code for brevity. For instance, I don't see a get_response implementation here so I don't know if it is awaiting something itself that could be dependent on the state of checker_instance.
One area I would explore is the thread-safety (or seeming absence thereof) of passing checker_instance.renew to run_in_executor. This feels questionable to me because you are mutating the state of a single instance of CheckerInstance from a separate thread. While it might not break things explicitly, it does seem like this could be introducing odd race conditions or unanticipated copies of memory that might explain the unexpected behavior you are experiencing.
If possible, I would make whatever load behavior you want to offload to a thread completely self-contained: when the data is loaded, return it as the function result, which can then be fed back into your checker_instance. If you were to do this with the code as-is, you would want to await the run_in_executor call for its result and then update the checker_instance. This would mean the reload GET request would wait until the data was loaded. Alternatively, in your reload GET request, you could use IOLoop.spawn_callback to schedule a function that triggers the run_in_executor in this manner, allowing the reload request to complete instead of waiting, as sketched below.
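To make that concrete, here is a rough sketch of the spawn_callback variant. It assumes renew() is refactored into a self-contained load_data() (a hypothetical name) that builds and returns the new handler rather than mutating shared state from the worker thread:

async def _reload_in_background():
    # Hypothetical refactoring: load_data() builds and returns a new
    # pDataStructureHandler without touching any shared state.
    new_handler = await tornado.ioloop.IOLoop.current().run_in_executor(
        executors, checker_instance.load_data)
    # Back on the IO loop thread, it is now safe to update the shared instance.
    checker_instance.handler = new_handler
    checker_instance.end_loading()
    checker_instance.go_live()

# In InstanceReloadHandler.get(), instead of calling run_in_executor directly:
tornado.ioloop.IOLoop.current().spawn_callback(_reload_in_background)

That way the shared checker_instance is only ever mutated from the IO loop thread, and the reload request can still return 'job started!' immediately.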
I have a python app written in the Tornado Asynchronous framework. When an HTTP request comes in, this method gets called:
@classmethod
def my_method(cls, my_arg1):
    # Do some Database Transaction #1
    x = get_val_from_db_table1('x', id=1)
    y = get_val_from_db_table2('y', id=7)
    x += x + (2 * y)
    # Do some Database Transaction #2
    set_val_in_db_table1('x', x, id=1)
    return True
The three database operations are interrelated. And this is a concurrent application so multiple such HTTP calls can be happening concurrently and hitting the same DB.
For data-integrity purposes, it's important that the three database operations in this method are all executed without other processes reading from or writing to those database rows in between.
How can I make sure this method has database atomicity? Does Tornado have a decorator for this?
Synchronous database access
You haven't stated how you access your database. If, as is likely, you have synchronous DB access in get_val_from_db_table1 and friends (e.g. with pymysql) and my_method is blocking (doesn't return control to the IO loop), then you block your server (which has implications for the performance and responsiveness of your server) but effectively serialise your clients, and only one can execute my_method at a time. So in terms of data consistency you don't need to do anything, but generally it's a bad design. You can solve both with @xyres's solution in the short term (at the cost of keeping thread-safety concerns in mind, because most of Tornado's functionality isn't thread-safe).
Asynchronous database access
If you have asynchronous DB access in get_val_from_db_table1 and friends (e.g. with tornado-mysql) then you can use tornado.locks.Lock. Here's an example:
from tornado import web, gen, locks, ioloop

_lock = locks.Lock()

def synchronised(coro):
    async def wrapper(*args, **kwargs):
        async with _lock:
            return await coro(*args, **kwargs)
    return wrapper

class MainHandler(web.RequestHandler):
    async def get(self):
        result = await self.my_method('foo')
        self.write(result)

    @classmethod
    @synchronised
    async def my_method(cls, arg):
        # db access
        await gen.sleep(0.5)
        return 'data set for {}'.format(arg)

if __name__ == '__main__':
    app = web.Application([('/', MainHandler)])
    app.listen(8080)
    ioloop.IOLoop.current().start()
Note that the above applies to a normal single-process Tornado application. If you use tornado.process.fork_processes, then you can only go with multiprocessing.Lock.
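For completeness, a minimal sketch of that multi-process variant, assuming the lock is created in the parent process before fork_processes so that every child inherits the same underlying lock:

import multiprocessing

# Must be created before tornado.process.fork_processes() so all children share it.
db_lock = multiprocessing.Lock()

def my_blocking_method(arg):
    # Cross-process critical section. Acquiring blocks the whole process,
    # so in an async handler you would typically run this via an executor
    # (as in the other answer) rather than call it on the IO loop directly.
    with db_lock:
        # ... the three interrelated db operations ...
        return 'data set for {}'.format(arg)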
Since you want to run those three db operations one right after the other, the function my_method must be non-asynchronous.
But this would also mean that my_method will block the server. You definitely don't want that. One way that I can think of is to run this function in another thread. This won't block the server, and it will keep accepting new requests while the operations are running. And since it's going to be non-async, db atomicity is guaranteed.
Here's the relevant code to get you started:
import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
# Don't set `max_workers` more than 1, because then multiple
# threads will be able to perform db operations

class MyHandler(...):
    @gen.coroutine
    def get(self):
        yield executor.submit(MyHandler.my_method, my_arg1)
        # above, `yield` is used to wait for
        # db operations to finish
        # if you don't want to wait and return
        # a response immediately remove the
        # `yield` keyword
        self.write('Done')

    @classmethod
    def my_method(cls, my_arg1):
        # do db stuff ...
        return True
I'm looking for a way to set request level context in Tornado.
This is useful for logging purpose, to print some request attributes with every log line (like user_id).
I'd like to populate the context in web.RequestHandler and then access it in other coroutines that this request called.
class WebRequestHandler(web.RequestHandler):
    @gen.coroutine
    def post(self):
        RequestContext.test_mode = self.application.settings.get('test_mode', False)
        RequestContext.corr_id = self.request.headers.get('X-Request-ID')
        result = yield some_func()
        self.write(result)

@gen.coroutine
def some_func():
    if RequestContext.test_mode:
        print "In test mode"
    # do more async calls
Currently I pass a context object (a dict with values) to every async function call downstream; this way every part of the code can do monitoring and logging with the right context.
I'm looking for a cleaner/simpler solution.
Thanks
Alex
The concept of request context doesn't really hold well in async frameworks (especially if you have high volume traffic) for the simple fact that there could potentially be hundreds of concurrent requests and it becomes difficult to determine which "context" to use. This works for sequential frameworks like Flask, Falcon, Django, etc. because requests are handled one by one and it's simple to determine which request you're dealing with.
The preferred method of handling functionality between a request start and end is to override prepare and on_finish respectively.
class WebRequestHandler(web.RequestHandler):
    def prepare(self):
        print('Logging...prepare')
        if self.application.settings.get('test_mode', False):
            print("In test mode")
        print('X-Request-ID: {0}'.format(self.request.headers.get('X-Request-ID')))

    @gen.coroutine
    def post(self):
        result = yield some_func()
        self.write(result)

    def on_finish(self):
        print('Logging...on_finish')
The simple solution would be to create an object that represents the context of your request and pass that into your log function. Example:
class RequestContext(object):
    """
    Hold request context
    """

class WebRequestHandler(web.RequestHandler):
    @gen.coroutine
    def post(self):
        # create new context obj and fill w/ necessary parameters
        request_context = RequestContext()
        request_context.test_mode = self.application.settings.get('test_mode', False)
        request_context.corr_id = self.request.headers.get('X-Request-ID')
        # pass context object into coroutine
        result = yield some_func(request_context)
        self.write(result)

@gen.coroutine
def some_func(request_context):
    if request_context.test_mode:
        print "In test mode"
    # do more async calls
I am trying to design a test suite for my tornado web socket server.
I am using a client to do this - connect to a server through a websocket, send a request and expect a certain response.
I am using python's unittest to run my tests, so I cannot (and do not really want to) enforce the sequence in which the tests run.
This is how my base test class (which all test cases inherit from) is organized. (The logging and certain parts irrelevant here are stripped.)
class BaseTest(tornado.testing.AsyncTestCase):
    ws_delay = .05

    @classmethod
    def setUpClass(cls):
        cls.setup_connection()
        return

    @classmethod
    def setup_connection(cls):
        # start websocket threads
        t1 = threading.Thread(target=cls.start_web_socket_handler)
        t1.start()
        # websocket opening delay
        time.sleep(cls.ws_delay)
        # this method initiates the tornado.ioloop, sets up the connection
        cls.websocket.connect('localhost', 3333)
        return

    @classmethod
    def start_web_socket_handler(cls):
        # starts tornado.websocket.WebSocketHandler
        cls.websocket = WebSocketHandler()
        cls.websocket.start()
The scheme I came up with is to have this base class which inits the connection once for all tests (although this does not have to be the case - I am happy to set up and tear down the connection for each test case if it solves my problems). What is important is that I do not want to have multiple connections open at the same time.
The simple test case looks like this.
class ATest(BaseTest):
    @classmethod
    def setUpClass(cls):
        super(ATest, cls).setUpClass()

    @classmethod
    def tearDownClass(cls):
        super(ATest, cls).tearDownClass()

    def test_a(self):
        saved_stdout = sys.stdout
        try:
            out = StringIO()
            sys.stdout = out
            message_sent = self.websocket.write_message(
                str({'opcode': 'a_message'})
            )
            output = out.getvalue().strip()
            # the code below is useless
            while (output is None or not len(output)):
                self.log.debug("%s waiting for response." % str(inspect.stack()[0][3]))
                output = out.getvalue().strip()
            self.assertIn(
                'a_response', output,
                "Server didn't send a_response. Instead sent: %s" % output
            )
        finally:
            sys.stdout = saved_stdout
It works fine most of the time, yet it is not fully deterministic (and therefore not reliable). Since the websocket communication is performed asynchronously, and unittest executes tests synchronously, the server responses (which are received on the same websocket) get mixed up with the requests and the tests fail occasionally.
I know it should be callback based, but this won't solve the response-mixing issue. Unless all the tests are artificially sequenced as a series of callbacks (as in: start test_2 inside a test_1_callback).
Tornado offers a testing library to help with synchronous testing, but I cannot seem to get it working with websockets (the tornado.ioloop has its own thread which you cannot block).
I cannot find a Python synchronous websocket client library which would work with the tornado server and be RFC 6455 compliant. PyPI's websocket-client fails to meet the second demand.
My questions are:
Is there a reliable python synchronous websocket client library that meets the demands described above?
If not, what is the best way to organize a test suite like this (the tests cannot really be run in parallel)?
As far as I understand, since we're working with one websocket, the IOStreams for test cases cannot be separated, and therefore there is no way of determining which response belongs to which request (I have multiple tests for requests of the same type with different parameters). Am I wrong in this?
Have you looked at the websocket unit tests included with tornado? They show you how you can do this:
from tornado.testing import AsyncHTTPTestCase, gen_test
from tornado.web import Application
from tornado.websocket import WebSocketHandler, websocket_connect

class MyHandler(WebSocketHandler):
    """This is the server code you're testing."""
    def on_message(self, message):
        # Put whatever response you want in here.
        self.write_message("a_response\n")

class WebSocketTest(AsyncHTTPTestCase):
    def get_app(self):
        return Application([
            ('/', MyHandler),
        ])

    @gen_test
    def test_a(self):
        ws = yield websocket_connect(
            'ws://localhost:%d/' % self.get_http_port(),
            io_loop=self.io_loop)
        ws.write_message(str({'opcode': 'a_message'}))
        response = yield ws.read_message()
        self.assertIn(
            'a_response', response,
            "Server didn't send a_response. Instead sent: %s" % response
        )
The gen_test decorator allows you to run asynchronous testcases as coroutines, which, when run inside tornado's ioloop, effectively makes them behave synchronously for testing purposes.
I have the following Resource to handle http POST request with twisted web:
from twisted.web.resource import Resource
from twisted.web import server
from twisted.internet import defer, reactor
from twisted.internet.defer import returnValue
from twisted.enterprise import adbapi

class RootResource(Resource):
    isLeaf = True

    def errback(self, failure):
        print "Request finished with error: %s" % str(failure.value)
        return failure

    def write_response_happy(self, result):
        self.request.write('HAPPY!')
        self.request.finish()

    def write_response_unhappy(self, result):
        self.request.write('UNHAPPY!')
        self.request.finish()

    @defer.inlineCallbacks
    def method_1(self):
        # IRL I have many more queries to mySQL, cassandra and memcache to get the final
        # result, this is why I use inlineCallbacks to keep the code clean.
        res = yield dbpool.runQuery('SELECT something FROM table')
        # Now I make a decision based on the result of the queries:
        if res:  # Doesn't make much sense but that's only an example
            self.d.addCallback(self.write_response_happy)  # self.d is already available after yield, so this looks OK?
        else:
            self.d.addCallback(self.write_response_unhappy)
        returnValue(None)

    def render_POST(self, request):
        self.request = request
        self.d = self.method_1()
        self.d.addErrback(self.errback)
        return server.NOT_DONE_YET

root = RootResource()
site = server.Site(root)
reactor.listenTCP(8002, site)
dbpool = adbapi.ConnectionPool('MySQLdb', host='localhost', db='mydb', user='myuser', passwd='mypass', cp_reconnect=True)
print "Serving on 8002"
reactor.run()
I've used the ab tool (from apache utils) to test 5 POST requests one after another:
ab -n 5 -p sample_post.txt http://127.0.0.1:8002/
Works fine!
Then I tried to run the same 5 POST requests simultaneously:
ab -n 5 -c 5 -p sample_post.txt http://127.0.0.1:8002/
Here I'm getting errors: exceptions.RuntimeError: Request.write called on a request after Request.finish was called. What am I doing wrong?
As Mualig suggested in his comments, you have only one instance of RootResource. When you assign to self.request and self.d in render_POST, you overwrite whatever value those attributes already had. If two requests arrive at around the same time, then this is a problem. The first Request and Deferred are discarded and replaced by new ones associated with the request that arrives second. Later, when your database operation finishes, the second request gets both results and the first one gets none at all.
This is an example of a general mistake in concurrent programming. Your per-request state is kept where it is shared between multiple requests. When multiple requests are handled concurrently, that sharing turns into a fight, and (at least) one request has to lose.
Try keeping your per-request state where it won't be shared between multiple requests. For example, try keeping it on the Deferred:
class RootResource(Resource):
    isLeaf = True

    def errback(self, failure):
        print "Request finished with error: %s" % str(failure.value)
        # You just handled the error, don't return the failure.
        # Nothing later in the callback chain is doing anything with it.
        # return failure

    def write_response(self, result, request):
        # No "self.request" anymore, just use the argument
        request.write(result)
        request.finish()

    @defer.inlineCallbacks
    def method_1(self):
        # IRL I have many more queries to mySQL, cassandra and memcache to get the final
        # result, this is why I use inlineCallbacks to keep the code clean.
        res = yield dbpool.runQuery('SELECT something FROM table')
        # Now I make a decision based on the result of the queries:
        if res:  # Doesn't make much sense but that's only an example
            # No "self.d" anymore, just produce a result. No shared state to confuse.
            returnValue("HAPPY!")
        else:
            returnValue("UNHAPPY!")

    def render_POST(self, request):
        # No more attributes on self. Just start the operation.
        d = self.method_1()
        # Push the request object into the Deferred. It'll be passed to write_response,
        # which is what needs it. Each call to method_1 returns a new Deferred,
        # so no shared state here.
        d.addCallback(self.write_response, request)
        d.addErrback(self.errback)
        return server.NOT_DONE_YET