asyncio.lock in a Tornado application

asyncio.lock in a Tornado application - python

I'm trying to write a asynchronous method for use in a Tornado application. My method needs to manage a connection that can and should be shared among other calls to the function The connection is created by awaiting. To manage this, I was using asyncio.Lock. However, every call to my method would hang waiting for the lock.
After a few hours of experimenting, I found out a few things,
If nothing awaits in the lock block, everything works as expected
tornado.ioloop.IOLoop.configure('tornado.platform.asyncio.AsyncIOLoop') does not help
tornado.platform.asyncio.AsyncIOMainLoop().install() allows it to work, regardless if the event loop is started with tornado.ioloop.IOLoop.current().start() or asyncio.get_event_loop().run_forever()
Here is some sample code that wont work until unless you uncomment AsyncIOMainLoop().install():
import tornado.ioloop
import tornado.web
import tornado.gen
import tornado.httpclient
from tornado.platform.asyncio import AsyncIOMainLoop
import asyncio
import tornado.locks
class MainHandler(tornado.web.RequestHandler):
_lock = asyncio.Lock()
#_lock = tornado.locks.Lock()
async def get(self):
print("in get")
r = await tornado.gen.multi([self.foo(str(i)) for i in range(2)])
self.write('\n'.join(r))
async def foo(self, i):
print("Getting first lock on " + i)
async with self._lock:
print("Got first lock on " + i)
# Do something sensitive that awaits
await asyncio.sleep(0)
print("Unlocked on " + i)
# Do some work
print("Work on " + i)
await asyncio.sleep(0)
print("Getting second lock on " + i)
async with self._lock:
print("Got second lock on " + i)
# Do something sensitive that doesnt await
pass
print("Unlocked on " + i)
return "done"
def make_app():
return tornado.web.Application([
(r"/", MainHandler),
])
if __name__ == "__main__":
#AsyncIOMainLoop().install() # This will make it work
#tornado.ioloop.IOLoop.configure('tornado.platform.asyncio.AsyncIOLoop') # Does not help
app = make_app()
app.listen(8888)
print('starting app')
tornado.ioloop.IOLoop.current().start()
I now know that tornado.locks.Lock() exists and works, but I'm curious why the asyncio.Lock does not work.

Both Tornado and asyncio have a global singleton event loop which everything else depends on (for advanced use cases you can avoid the singleton, but using it is idiomatic). To use both libraries together, the two singletons need to be aware of each other.
AsyncIOMainLoop().install() makes a Tornado event loop that points to the asyncio singleton, then sets it as the tornado singleton. This works.
IOLoop.configure('AsyncIOLoop') tells Tornado "whenever you need an IOLoop, create a new (non-singleton!) asyncio event loop and use that. The asyncio loop becomes the singleton when the IOLoop is started. This almost works, but when the MainHandler class is defined (and creates its class-scoped asyncio.Lock, the asyncio singleton is still pointing to the default (which will be replaced by the one created by AsyncIOLoop).
TL;DR: Use AsyncIOMainLoop, not AsyncIOLoop, unless you're attempting to use the more advanced non-singleton use patterns. This will get simpler in Tornado 5.0 as asyncio integration will be enabled by default.

Related

Why I am not able to do simultaneous requests in Tornado?

Below tornado APP has 2 end points. One(/) is slow because it waits for an IO operation and other(/hello) is fast.
My requirement is to make a request to both end points simultaneously.I observed it takes 2nd request only after it finishes the 1st one. Even though It is asynchronous why it is not able to handle both requests at same time ?
How to make it to handle simultaneously?
Edit : I am using windows 7, Eclipse IDE
****************Module*****************
import tornado.ioloop
import tornado.web
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(self):
self.do_something()
self.write("FINISHED")
self.finish()
def do_something(self):
inp = input("enter to continue")
print (inp)
class HelloHandler(tornado.web.RequestHandler):
def get(self):
print ("say hello")
self.write("Hello bro")
self.finish(
def make_app():
return tornado.web.Application([
(r"/", MainHandler),
(r"/hello", HelloHandler)
])
if __name__ == "__main__":
app = make_app()
app.listen(8888)
tornado.ioloop.IOLoop.current().start()

It is asynchronous only if you make it so. A Tornado server runs in a single thread. If that thread is blocked by a synchronous function call, nothing else can happen on that thread in the meantime. What #tornado.web.asynchronous enables is the use of generators:
#tornado.web.asynchronous
def get(self):
yield from self.do_something()
^^^^^^^^^^
This yield/yield from (in current Python versions await) feature suspends the function and lets other code run on the same thread while the asynchronous call completes elsewhere (e.g. waiting for data from the database, waiting for a network request to return a response). I.e., if Python doesn't actively have to do something but is waiting for external processes to complete, it can yield processing power to other tasks. But since your function is very much running in the foreground and blocking the thread, nothing else will happen.
See http://www.tornadoweb.org/en/stable/guide/async.html and https://docs.python.org/3/library/asyncio.html.

Make a Python asyncio call from a Flask route

I want to execute an async function every time the Flask route is executed. Why is the abar function never executed?
import asyncio
from flask import Flask
async def abar(a):
print(a)
loop = asyncio.get_event_loop()
app = Flask(__name__)
#app.route("/")
def notify():
asyncio.ensure_future(abar("abar"), loop=loop)
return "OK"
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
loop.run_forever()
I also tried putting the blocking call in a separate thread. But it is still not calling the abar function.
import asyncio
from threading import Thread
from flask import Flask
async def abar(a):
print(a)
app = Flask(__name__)
def start_worker(loop):
asyncio.set_event_loop(loop)
try:
loop.run_forever()
finally:
loop.close()
worker_loop = asyncio.new_event_loop()
worker = Thread(target=start_worker, args=(worker_loop,))
#app.route("/")
def notify():
asyncio.ensure_future(abar("abar"), loop=worker_loop)
return "OK"
if __name__ == "__main__":
worker.start()
app.run(debug=False, use_reloader=False)

You can incorporate some async functionality into Flask apps without having to completely convert them to asyncio.
import asyncio
from flask import Flask
async def abar(a):
print(a)
loop = asyncio.get_event_loop()
app = Flask(__name__)
#app.route("/")
def notify():
loop.run_until_complete(abar("abar"))
return "OK"
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
This will block the Flask response until the async function returns, but it still allows you to do some clever things. I've used this pattern to perform many external requests in parallel using aiohttp, and then when they are complete, I'm back into traditional flask for data processing and template rendering.
import aiohttp
import asyncio
import async_timeout
from flask import Flask
loop = asyncio.get_event_loop()
app = Flask(__name__)
async def fetch(url):
async with aiohttp.ClientSession() as session, async_timeout.timeout(10):
async with session.get(url) as response:
return await response.text()
def fight(responses):
return "Why can't we all just get along?"
#app.route("/")
def index():
# perform multiple async requests concurrently
responses = loop.run_until_complete(asyncio.gather(
fetch("https://google.com/"),
fetch("https://bing.com/"),
fetch("https://duckduckgo.com"),
fetch("http://www.dogpile.com"),
))
# do something with the results
return fight(responses)
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)

A simpler solution to your problem (in my biased view) is to switch to Quart from Flask. If so your snippet simplifies to,
import asyncio
from quart import Quart
async def abar(a):
print(a)
app = Quart(__name__)
#app.route("/")
async def notify():
await abar("abar")
return "OK"
if __name__ == "__main__":
app.run(debug=False)
As noted in the other answers the Flask app run is blocking, and does not interact with an asyncio loop. Quart on the other hand is the Flask API built on asyncio, so it should work how you expect.
Also as an update, Flask-Aiohttp is no longer maintained.

Your mistake is to try to run the asyncio event loop after calling app.run(). The latter doesn't return, it instead runs the Flask development server.
In fact, that's how most WSGI setups will work; either the main thread is going to busy dispatching requests, or the Flask server is imported as a module in a WSGI server, and you can't start an event loop here either.
You'll instead have to run your asyncio event loop in a separate thread, then run your coroutines in that separate thread via asyncio.run_coroutine_threadsafe(). See the Coroutines and Multithreading section in the documentation for what this entails.
Here is an implementation of a module that will run such an event loop thread, and gives you the utilities to schedule coroutines to be run in that loop:
import asyncio
import itertools
import threading
__all__ = ["EventLoopThread", "get_event_loop", "stop_event_loop", "run_coroutine"]
class EventLoopThread(threading.Thread):
loop = None
_count = itertools.count(0)
def __init__(self):
self.started = threading.Event()
name = f"{type(self).__name__}-{next(self._count)}"
super().__init__(name=name, daemon=True)
def __repr__(self):
loop, r, c, d = self.loop, False, True, False
if loop is not None:
r, c, d = loop.is_running(), loop.is_closed(), loop.get_debug()
return (
f"<{type(self).__name__} {self.name} id={self.ident} "
f"running={r} closed={c} debug={d}>"
)
def run(self):
self.loop = loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.call_later(0, self.started.set)
try:
loop.run_forever()
finally:
try:
shutdown_asyncgens = loop.shutdown_asyncgens()
except AttributeError:
pass
else:
loop.run_until_complete(shutdown_asyncgens)
try:
shutdown_executor = loop.shutdown_default_executor()
except AttributeError:
pass
else:
loop.run_until_complete(shutdown_executor)
asyncio.set_event_loop(None)
loop.close()
def stop(self):
loop, self.loop = self.loop, None
if loop is None:
return
loop.call_soon_threadsafe(loop.stop)
self.join()
_lock = threading.Lock()
_loop_thread = None
def get_event_loop():
global _loop_thread
if _loop_thread is None:
with _lock:
if _loop_thread is None:
_loop_thread = EventLoopThread()
_loop_thread.start()
# give the thread up to a second to produce a loop
_loop_thread.started.wait(1)
return _loop_thread.loop
def stop_event_loop():
global _loop_thread
with _lock:
if _loop_thread is not None:
_loop_thread.stop()
_loop_thread = None
def run_coroutine(coro):
"""Run the coroutine in the event loop running in a separate thread
Returns a Future, call Future.result() to get the output
"""
return asyncio.run_coroutine_threadsafe(coro, get_event_loop())
You can use the run_coroutine() function defined here to schedule asyncio routines. Use the returned Future instance to control the coroutine:
Get the result with Future.result(). You can give this a timeout; if no result is produced within the timeout, the coroutine is automatically cancelled.
You can query the state of the coroutine with the .cancelled(), .running() and .done() methods.
You can add callbacks to the future, which will be called when the coroutine has completed, or is cancelled or raised an exception (take into account that this is probably going to be called from the event loop thread, not the thread that you called run_coroutine() in).
For your specific example, where abar() doesn't return any result, you can just ignore the returned future, like this:
#app.route("/")
def notify():
run_coroutine(abar("abar"))
return "OK"
Note that before Python 3.8 that you can't use an event loop running on a separate thread to create subprocesses with! See my answer to Python3 Flask asyncio subprocess in route hangs for backport of the Python 3.8 ThreadedChildWatcher class for a work-around for this.

For same reason you won't see this print:
if __name__ == "__main__":
app.run(debug=False, use_reloader=False)
print('Hey!')
loop.run_forever()
loop.run_forever() is never called since as #dirn already noted app.run is also blocking.
Running global blocking event loop - is only way you can run asyncio coroutines and tasks, but it's not compatible with running blocking Flask app (or with any other such thing in general).
If you want to use asynchronous web framework you should choose one created to be asynchronous. For example, probably most popular now is aiohttp:
from aiohttp import web
async def hello(request):
return web.Response(text="Hello, world")
if __name__ == "__main__":
app = web.Application()
app.router.add_get('/', hello)
web.run_app(app) # this runs asyncio event loop inside
Upd:
About your try to run event loop in background thread. I didn't investigate much, but it seems problem somehow related with tread-safety: many asyncio objects are not thread-safe. If you change your code this way, it'll work:
def _create_task():
asyncio.ensure_future(abar("abar"), loop=worker_loop)
#app.route("/")
def notify():
worker_loop.call_soon_threadsafe(_create_task)
return "OK"
But again, this is very bad idea. It's not only very inconvenient, but I guess wouldn't make much sense: if you're going to use thread to start asyncio, why don't just use threads in Flask instead of asyncio? You will have Flask you want and parallelization.
If I still didn't convince you, at least take a look at Flask-aiohttp project. It has close to Flask api and I think still better that what you're trying to do.

The main issue, as already explained in the other answers by #Martijn Pieters and #Mikhail Gerasimov is that app.run is blocking, so the line loop.run_forever() is never called. You will need to manually set up and maintain a run loop on a separate thread.
Fortunately, with Flask 2.0, you don't need to create, run, and manage your own event loop anymore. You can define your route as async def and directly await on coroutines from your route functions.
https://flask.palletsprojects.com/en/2.0.x/async-await/
Using async and await
New in version 2.0.
Routes, error handlers, before request, after request, and teardown
functions can all be coroutine functions if Flask is installed with
the async extra (pip install flask[async]). It requires Python 3.7+
where contextvars.ContextVar is available. This allows views to be
defined with async def and use await.
Flask will take care of creating the event loop on each request. All you have to do is define your coroutines and await on them to finish:
https://flask.palletsprojects.com/en/2.0.x/async-await/#performance
Performance
Async functions require an event loop to run. Flask, as a WSGI
application, uses one worker to handle one request/response cycle.
When a request comes into an async view, Flask will start an event
loop in a thread, run the view function there, then return the result.
Each request still ties up one worker, even for async views. The
upside is that you can run async code within a view, for example to
make multiple concurrent database queries, HTTP requests to an
external API, etc. However, the number of requests your application
can handle at one time will remain the same.
Tweaking the original example from the question:
import asyncio
from flask import Flask, jsonify
async def send_notif(x: int):
print(f"Called coro with {x}")
await asyncio.sleep(1)
return {"x": x}
app = Flask(__name__)
#app.route("/")
async def notify():
futures = [send_notif(x) for x in range(5)]
results = await asyncio.gather(*futures)
response = list(results)
return jsonify(response)
# The recommended way now is to use `flask run`.
# See: https://flask.palletsprojects.com/en/2.0.x/cli/
# if __name__ == "__main__":
# app.run(debug=False, use_reloader=False)
$ time curl -s -XGET 'http://localhost:5000'
[{"x":0},{"x":1},{"x":2},{"x":3},{"x":4}]
real 0m1.016s
user 0m0.005s
sys 0m0.006s
Most common recipes using asyncio can be applied the same way. The one thing to take note of is, as of Flask 2.0.1, we cannot use asyncio.create_task to spawn background tasks:
https://flask.palletsprojects.com/en/2.0.x/async-await/#background-tasks
Async functions will run in an event loop until they complete, at which
stage the event loop will stop. This means any additional spawned
tasks that haven’t completed when the async function completes will be
cancelled. Therefore you cannot spawn background tasks, for example
via asyncio.create_task.
If you wish to use background tasks it is best to use a task queue to
trigger background work, rather than spawn tasks in a view function.
Other than the limitation with create_task, it should work for use-cases where you want to make async database queries or multiple calls to external APIs.

How to use tornado's asynchttpclient alone?

I'm new to tornado.
What I want is to write some functions to fetch webpages asynchronously. Since no requesthandlers, apps, or servers involved here, I think I can use tornado.httpclient.AsyncHTTPClient alone.
But all the sample codes seem to be in a tornado server or requesthandler. When I tried to use it alone, it never works.
For example:
def handle(self,response):
print response
print response.body
#tornado.web.asynchronous
def fetch(self,url):
client=tornado.httpclient.AsyncHTTPClient()
client.fetch(url,self.handle)
fetch('http://www.baidu.com')
It says "'str' object has no attribute 'application'", but I'm trying to use it alone?
or :
#tornado.gen.coroutine
def fetch_with_coroutine(url):
client=tornado.httpclient.AsyncHTTPClient()
response=yield http_client.fetch(url)
print response
print response.body
raise gen.Return(response.body)
fetch_with_coroutine('http://www.baidu.com')
doesn't work either.
Earlier, I tried pass a callback to AsyncHTTPHandler.fetch, then start the IOLoop, It works and the webpage source code is printed. But I can't figure out what to do with the ioloop.

#tornado.web.asynchronous can only be applied to certain methods in RequestHandler subclasses; it is not appropriate for this usage.
Your second example is the correct structure, but you need to actually run the IOLoop. The best way to do this in a batch-style program is IOLoop.current().run_sync(fetch_with_coroutine). This starts the IOLoop, runs your callback, then stops the IOLoop. You should run a single function within run_sync(), and then use yield within that function to call any other coroutines.
For a more complete example, see https://github.com/tornadoweb/tornado/blob/master/demos/webspider/webspider.py

Here's an example I've used in the past...
from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop
AsyncHTTPClient.configure(None, defaults=dict(user_agent="MyUserAgent"))
http_client = AsyncHTTPClient()
def handle_response(response):
if response.error:
print("Error: %s" % response.error)
else:
print(response.body)
async def get_content():
await http_client.fetch("https://www.integralist.co.uk/", handle_response)
async def main():
await get_content()
print("I won't wait for get_content to finish. I'll show immediately.")
if __name__ == "__main__":
io_loop = IOLoop.current()
io_loop.run_sync(main)
I've also detailed how to use Pipenv with tox.ini and Flake8 with this tornado example so others should be able to get up and running much more quickly https://gist.github.com/fd603239cacbb3d3d317950905b76096

How to use Tornado.gen.coroutine in TCP Server?

i write a Tcp Server with Tornado.
here is the code:
#! /usr/bin/env python
#coding=utf-8
from tornado.tcpserver import TCPServer
from tornado.ioloop import IOLoop
from tornado.gen import *
class TcpConnection(object):
def __init__(self,stream,address):
self._stream=stream
self._address=address
self._stream.set_close_callback(self.on_close)
self.send_messages()
def send_messages(self):
self.send_message(b'hello \n')
print("next")
self.read_message()
self.send_message(b'world \n')
self.read_message()
def read_message(self):
self._stream.read_until(b'\n',self.handle_message)
def handle_message(self,data):
print(data)
def send_message(self,data):
self._stream.write(data)
def on_close(self):
print("the monitored %d has left",self._address)
class MonitorServer(TCPServer):
def handle_stream(self,stream,address):
print("new connection",address,stream)
conn = TcpConnection(stream,address)
if __name__=='__main__':
print('server start .....')
server=MonitorServer()
server.listen(20000)
IOLoop.instance().start()
And i face some eorror assert self._read_callback is None, "Already reading",i guess the eorror is because multiple commands to read from socket at the same time.and then i change the function send_messages with tornado.gen.coroutine.here is code:
#gen.coroutine
def send_messages(self):
yield self.send_message(b'hello \n')
response1 = yield self.read_message()
print(response1)
yield self.send_message(b'world \n')
print((yield self.read_message()))
but there are some other errors. the code seem to stop after yield self.send_message(b'hello \n'),and the following code seem not to execute.
how should i do about it ? If you're aware of any Tornado tcpserver (not HTTP!) code with tornado.gen.coroutine,please tell me.I would appreciate any links!

send_messages() calls send_message() and read_message() with yield, but these methods are not coroutines, so this will raise an exception.
The reason you're not seeing the exception is that you called send_messages() without yielding it, so the exception has nowhere to go (the garbage collector should eventually notice and print the exception, but that can take a long time). Whenever you call a coroutine, you should either use yield to wait for it to finish, or IOLoop.current().spawn_callback() to run the coroutine in the "background" (this tells Tornado that you do not intend to yield the coroutine, so it will print the exception as soon as it occurs). Also, whenever you override a method you should read the documentation to see whether coroutines are allowed (when you override TCPServer.handle_stream() you can make it a coroutine, but __init__() may not be a coroutine).
Once the exception is getting logged, the next step is to fix it. You can either make send_message() and read_message() coroutines (getting rid of the handle_message() callback in the process), or you can use tornado.gen.Task() to call coroutine-style code from a coroutine. I generally recommend using coroutines everywhere.

How does long-polling work in Tornado?

In Tornado's chat demo, it has a method like this:
#tornado.web.asynchronous
def post(self):
cursor = self.get_argument("cursor", None)
global_message_buffer.wait_for_messages(self.on_new_messages,
cursor=cursor)
I'm fairly new to this long polling thing, and I don't really understand exactly how the threading stuff works, though it states:
By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections...
My theory was that by making a simple app:
import tornado.ioloop
import tornado.web
import time
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(self):
print("Start request")
time.sleep(4)
print("Okay done now")
self.write("Howdy howdy howdy")
self.finish()
application = tornado.web.Application([
(r'/', MainHandler),
])
That if I made two requests in a row (i.e. I opened two browser windows and quickly refreshed both) I would see this:
Start request
Start request
Okay done now
Okay done now
Instead, I see
Start request
Okay done now
Start request
Okay done now
Which leads me to believe that it is, in fact, blocking in this case. Why is it that my code is blocking, and how do I get some code to do what I expect? I get the same output on Windows 7 with a core i7, and a linux Mint 13 box with I think two cores.
Edit:
I found one method - if someone can provide a method that works cross-platform (I'm not too worried about performance, only that it's non-blocking), I'll accept that answer.

The right way to convert your test app into a form that won't block the IOLoop is like this:
from tornado.ioloop import IOLoop
import tornado.web
from tornado import gen
import time
#gen.coroutine
def async_sleep(timeout):
""" Sleep without blocking the IOLoop. """
yield gen.Task(IOLoop.instance().add_timeout, time.time() + timeout)
class MainHandler(tornado.web.RequestHandler):
#gen.coroutine
def get(self):
print("Start request")
yield async_sleep(4)
print("Okay done now")
self.write("Howdy howdy howdy")
self.finish()
if __name__ == "__main__":
application = tornado.web.Application([
(r'/', MainHandler),
])
application.listen(8888)
IOLoop.instance().start()
The difference is replacing the call to time.sleep with one which won't block the IOLoop. Tornado is designed to handle lots of concurrent I/O without needing multiple threads/subprocesses, but it will still block if you use synchronous APIs. In order for your long-polling solution to handle concurrency the way you'd like, you have to make sure that no long-running calls block.

The problem with the code in original question is that when you call time.sleep(4) you are effectively blocking the execution of event loop for 4 seconds. And accepted answer doesn't solve the problem either (IMHO).
Asynchronous serving in Tornado works on trust. Tornado will call your functions whenever something happens, but it trusts you that you will return control to it as soon as possible. If you block with time.sleep() then this trust is breached - Tornado can't handle new connections.
Using multiple threads only hides the mistake; running Tornado with thousands of threads (so you can serve 1000s of connections simultaneously) would be very inefficient. The appropriate way is running a single thread which only blocks inside Tornado (on select or whatever Tornado's way of listening for events is) - not on your code (to be exact: never on your code).
The proper solution is to just return from get(self) right before time.sleep() (without calling self.finish()), like this:
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(self):
print("Starting")
You must of course remember that this request is still open and call write() and finish() on it later.
I suggest you take a look at chat demo. Once you strip out the authentication you get a very nice example of async long polling server.

Since Tornado 5.0, asyncio is enabled automatically, so pretty much just changing time.sleep(4) to await asyncio.sleep(4) and #tornado.web.asynchronous def get(self): to async def get(self): solves the problem.
Example:
import tornado.ioloop
import tornado.web
import asyncio
class MainHandler(tornado.web.RequestHandler):
async def get(self):
print("Start request")
await asyncio.sleep(4)
print("Okay done now")
self.write("Howdy howdy howdy")
self.finish()
app = tornado.web.Application([
(r'/', MainHandler),
])
app.listen(8888)
tornado.ioloop.IOLoop.current().start()
Output:
Start request
Start request
Okay done now
Okay done now
Sources:
Tornado on asyncio
asyncio usage example

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

asyncio.lock in a Tornado application - python

Related

Why I am not able to do simultaneous requests in Tornado?

Make a Python asyncio call from a Flask route

How to use tornado's asynchttpclient alone?

How to use Tornado.gen.coroutine in TCP Server?

How does long-polling work in Tornado?

Categories

Resources