Calling function from Tornado async

Calling function from Tornado async - python

just struggling on this. If i have an async request handler that during it's execution calls other functions that do something (for example async db queries) and then they call "finish" on their own, do i have to mark them as async? because if the application is structured like the example, i get errors about multiple calls to "finish". I guess i miss something.
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
#gen.engine
def post(self):
#do some stuff even with mongo motor
self.handleRequest(bla)
#gen.engine
def handleRequest(self,bla):
#do things,use motor call other functions
self.finish(result)
Do all functions have to be marked with async?
thanks

Calling finish ends the HTTP request see docs. Other functions should not call 'finish'
I think you want to do something like this. Note that there is a extra param 'callback' which is added into async functions:
#tornado.web.asynchronous
#gen.engine
def post(self):
query =''
response = yield tornado.gen.Task(
self.handleRequest,
query=query
)
result = response[0][0]
errors = response[1]['error']
# Do stuff with result
def handleRequest(self, callback, query):
self.motor['my_collection'].find(query, callback=callback)
See tornado.gen docs for more info

Related

how to use coroutine in tornado as a global initialize request handler

i want to use a base requestHander to get the global arguments from redis before all request(GET、POST..) ，and i use coroutine in tornado web as async&await. but i use aredis to client redis-server ,and the baseHandler doesn't support to use async in init method,because i need when the Class super the parent Class can get the arguments or the request have been confirm in check params from redis.
code like this.
import aredis
redis_client = aredis.StricRedis('redis://xxxx')
class BaseHandler(tornado.web.requestHander):
def check_request(self):
check_one = await redis_client.get(self.request.body['param'])
if not check_one:
self.finish(dict(msg='refused'))
async def get_params(self):
data = await redis_client.get('xxx')
self.data = data
class UserHander(BaseHander):
async def get(self,*args,**kwargs):
data = self.data
return self.finish(data)

See the RequestHandler.prepare() method. You can create this method in your subclass and it will be called before every request method.
Example:
class BaseHandler(...):
async def prepare(self):
# your code here ...

Using request level context in Tornado

I'm looking for a way to set request level context in Tornado.
This is useful for logging purpose, to print some request attributes with every log line (like user_id).
I'd like to populate the context in web.RequestHandler and then access it in other coroutines that this request called.
class WebRequestHandler(web.RequestHandler):
#gen.coroutine
def post(self):
RequestContext.test_mode = self.application.settings.get('test_mode', False)
RequestContext.corr_id = self.request.header.get('X-Request-ID')
result = yield some_func()
self.write(result)
#gen.coroutine
def some_func()
if RequestContext.test_mode:
print "In test mode"
do more async calls
Currently I pass context object (dict with values) to every async function call down stream, this way every part of the code can do monitoring and logging with right context.
I'm looking for a cleaner/simpler solution.
Thanks
Alex

The concept of request context doesn't really hold well in async frameworks (especially if you have high volume traffic) for the simple fact that there could potentially be hundreds of concurrent requests and it becomes difficult to determine which "context" to use. This works for sequential frameworks like Flask, Falcon, Django, etc. because requests are handled one by one and it's simple to determine which request you're dealing with.
The preferred method of handling functionality between a request start and end is to override prepare and on_finish respectively.
class WebRequestHandler(web.RequestHandler):
def prepare(self):
print('Logging...prepare')
if self.application.settings.get('test_mode', False):
print("In test mode")
print('X-Request-ID: {0}'.format(self.request.header.get('X-Request-ID')))
#gen.coroutine
def post(self):
result = yield some_func()
self.write(result)
def on_finish(self):
print('Logging...on_finish')
The simple solution would be to create an object that represents the context of your request and pass that into your log function. Example:
class RequestContext(object):
"""
Hold request context
"""
class WebRequestHandler(web.RequestHandler):
#gen.coroutine
def post(self):
# create new context obj and fill w/ necessary parameters
request_context = RequestContext()
request_context.test_mode = self.application.settings.get('test_mode', False)
request_context.corr_id = self.request.header.get('X-Request-ID')
# pass context objects into coroutine
result = yield some_func(request_context)
self.write(result)
#gen.coroutine
def some_func(request_context)
if request_context.test_mode:
print "In test mode"
# do more async calls

How to use tornado's asynchttpclient alone?

I'm new to tornado.
What I want is to write some functions to fetch webpages asynchronously. Since no requesthandlers, apps, or servers involved here, I think I can use tornado.httpclient.AsyncHTTPClient alone.
But all the sample codes seem to be in a tornado server or requesthandler. When I tried to use it alone, it never works.
For example:
def handle(self,response):
print response
print response.body
#tornado.web.asynchronous
def fetch(self,url):
client=tornado.httpclient.AsyncHTTPClient()
client.fetch(url,self.handle)
fetch('http://www.baidu.com')
It says "'str' object has no attribute 'application'", but I'm trying to use it alone?
or :
#tornado.gen.coroutine
def fetch_with_coroutine(url):
client=tornado.httpclient.AsyncHTTPClient()
response=yield http_client.fetch(url)
print response
print response.body
raise gen.Return(response.body)
fetch_with_coroutine('http://www.baidu.com')
doesn't work either.
Earlier, I tried pass a callback to AsyncHTTPHandler.fetch, then start the IOLoop, It works and the webpage source code is printed. But I can't figure out what to do with the ioloop.

#tornado.web.asynchronous can only be applied to certain methods in RequestHandler subclasses; it is not appropriate for this usage.
Your second example is the correct structure, but you need to actually run the IOLoop. The best way to do this in a batch-style program is IOLoop.current().run_sync(fetch_with_coroutine). This starts the IOLoop, runs your callback, then stops the IOLoop. You should run a single function within run_sync(), and then use yield within that function to call any other coroutines.
For a more complete example, see https://github.com/tornadoweb/tornado/blob/master/demos/webspider/webspider.py

Here's an example I've used in the past...
from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop
AsyncHTTPClient.configure(None, defaults=dict(user_agent="MyUserAgent"))
http_client = AsyncHTTPClient()
def handle_response(response):
if response.error:
print("Error: %s" % response.error)
else:
print(response.body)
async def get_content():
await http_client.fetch("https://www.integralist.co.uk/", handle_response)
async def main():
await get_content()
print("I won't wait for get_content to finish. I'll show immediately.")
if __name__ == "__main__":
io_loop = IOLoop.current()
io_loop.run_sync(main)
I've also detailed how to use Pipenv with tox.ini and Flake8 with this tornado example so others should be able to get up and running much more quickly https://gist.github.com/fd603239cacbb3d3d317950905b76096

What is the benefit of adding a #tornado.web.asynchronous decorator?

class IndexHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(request):
request.render("../resouce/index.html")
I always read some tornado code as above, confusing that what's the purpose of adding this decorator ? I know that adding this decorator, we should call self.finish()manually, but any benefits of doing that ?
Thanks !

Normally, finish() is called for you when the handler method returns, but if your handler depends on the result of an asynchronous computation (like an HTTP request), then it won't have finished by the time the method returns. Instead it should finish in some callback.
The example from the documentation is instructive:
class MyRequestHandler(web.RequestHandler):
#web.asynchronous
def get(self):
http = httpclient.AsyncHTTPClient()
http.fetch("http://friendfeed.com/", self._on_download)
def _on_download(self, response):
self.write("Downloaded!")
self.finish()
Without the decorator, by the time _on_download is entered the request would have finished already.
If your handler isn't doing anything async, then there's no benefit to adding the decorator.

Tornado memory leak on dropped connections

I've got a setup where Tornado is used as kind of a pass-through for workers. Request is received by Tornado, which sends this request to N workers, aggregates results and sends it back to client. Which works fine, except when for some reason timeout occurs — then I've got memory leak.
I've got a setup which similar to this pseudocode:
workers = ["http://worker1.example.com:1234/",
"http://worker2.example.com:1234/",
"http://worker3.example.com:1234/" ...]
class MyHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def post(self):
responses = []
def __callback(response):
responses.append(response)
if len(responses) == len(workers):
self._finish_req(responses)
for url in workers:
async_client = tornado.httpclient.AsyncHTTPClient()
request = tornado.httpclient.HTTPRequest(url, method=self.request.method, body=body)
async_client.fetch(request, __callback)
def _finish_req(self, responses):
good_responses = [r for r in responses if not r.error]
if not good_responses:
raise tornado.web.HTTPError(500, "\n".join(str(r.error) for r in responses))
results = aggregate_results(good_responses)
self.set_header("Content-Type", "application/json")
self.write(json.dumps(results))
self.finish()
application = tornado.web.Application([
(r"/", MyHandler),
])
if __name__ == "__main__":
##.. some locking code
application.listen()
tornado.ioloop.IOLoop.instance().start()
What am I doing wrong? Where does the memory leak come from?

I don't know the source of the problem, and it seems gc should be able to take care of it, but there's two things you can try.
The first method would be to simplify some of the references (it looks like there may still be references to responses when the RequestHandler completes):
class MyHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def post(self):
self.responses = []
for url in workers:
async_client = tornado.httpclient.AsyncHTTPClient()
request = tornado.httpclient.HTTPRequest(url, method=self.request.method, body=body)
async_client.fetch(request, self._handle_worker_response)
def _handle_worker_response(self, response):
self.responses.append(response)
if len(self.responses) == len(workers):
self._finish_req()
def _finish_req(self):
....
If that doesn't work, you can always invoke garbage collection manually:
import gc
class MyHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def post(self):
....
def _finish_req(self):
....
def on_connection_close(self):
gc.collect()

The code looks good. The leak is probably inside Tornado.
I only stumbled over this line:
async_client = tornado.httpclient.AsyncHTTPClient()
Are you aware of the instantiation magic in this constructor?
From the docs:
"""
The constructor for this class is magic in several respects: It actually
creates an instance of an implementation-specific subclass, and instances
are reused as a kind of pseudo-singleton (one per IOLoop). The keyword
argument force_instance=True can be used to suppress this singleton
behavior. Constructor arguments other than io_loop and force_instance
are deprecated. The implementation subclass as well as arguments to
its constructor can be set with the static method configure()
"""
So actually, you don't need to do this inside the loop. (On the other
hand, it should not do any harm.) But which implementation are you
using CurlAsyncHTTPClient or SimpleAsyncHTTPClient?
If it is SimpleAsyncHTTPClient, be aware of this comment in the code:
"""
This class has not been tested extensively in production and
should be considered somewhat experimental as of the release of
tornado 1.2.
"""
You can try switching to CurlAsyncHTTPClient. Or follow
Nikolay Fominyh's suggestion and trace the calls to __callback().

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Calling function from Tornado async - python

Related

how to use coroutine in tornado as a global initialize request handler

Using request level context in Tornado

How to use tornado's asynchttpclient alone?

What is the benefit of adding a #tornado.web.asynchronous decorator?

Tornado memory leak on dropped connections

Categories

Resources