How to use tornado's asynchttpclient alone? - python

I'm new to tornado.
What I want is to write some functions to fetch webpages asynchronously. Since no requesthandlers, apps, or servers involved here, I think I can use tornado.httpclient.AsyncHTTPClient alone.
But all the sample codes seem to be in a tornado server or requesthandler. When I tried to use it alone, it never works.
For example:
def handle(self,response):
print response
print response.body
#tornado.web.asynchronous
def fetch(self,url):
client=tornado.httpclient.AsyncHTTPClient()
client.fetch(url,self.handle)
fetch('http://www.baidu.com')
It says "'str' object has no attribute 'application'", but I'm trying to use it alone?
or :
#tornado.gen.coroutine
def fetch_with_coroutine(url):
client=tornado.httpclient.AsyncHTTPClient()
response=yield http_client.fetch(url)
print response
print response.body
raise gen.Return(response.body)
fetch_with_coroutine('http://www.baidu.com')
doesn't work either.
Earlier, I tried pass a callback to AsyncHTTPHandler.fetch, then start the IOLoop, It works and the webpage source code is printed. But I can't figure out what to do with the ioloop.

#tornado.web.asynchronous can only be applied to certain methods in RequestHandler subclasses; it is not appropriate for this usage.
Your second example is the correct structure, but you need to actually run the IOLoop. The best way to do this in a batch-style program is IOLoop.current().run_sync(fetch_with_coroutine). This starts the IOLoop, runs your callback, then stops the IOLoop. You should run a single function within run_sync(), and then use yield within that function to call any other coroutines.
For a more complete example, see https://github.com/tornadoweb/tornado/blob/master/demos/webspider/webspider.py

Here's an example I've used in the past...
from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop
AsyncHTTPClient.configure(None, defaults=dict(user_agent="MyUserAgent"))
http_client = AsyncHTTPClient()
def handle_response(response):
if response.error:
print("Error: %s" % response.error)
else:
print(response.body)
async def get_content():
await http_client.fetch("https://www.integralist.co.uk/", handle_response)
async def main():
await get_content()
print("I won't wait for get_content to finish. I'll show immediately.")
if __name__ == "__main__":
io_loop = IOLoop.current()
io_loop.run_sync(main)
I've also detailed how to use Pipenv with tox.ini and Flake8 with this tornado example so others should be able to get up and running much more quickly https://gist.github.com/fd603239cacbb3d3d317950905b76096

Related

Deferred callback not being called using Python requests-threads

I am trying to perform async HTTP requests by using the requests library in Python. I found that the last version of the library does not directly support async requets. To achive it they provide the requests-threads library that makes use of Twisted to handle asynchronicity. I tried modifying the examples provided to use callbacks instead of await/yield, but the callbacks are not being called.
My sample code is:
session = AsyncSession(n=10)
def processResponse(response):
print(response)
def main():
a = session.get('https://reqres.in/api/users')
a.addCallbacks(processResponse, processResponse)
time.sleep(5)
The requests-threads library: https://github.com/requests/requests-threads
I suspect the callbacks are not called because you aren't running Twisted's eventloop (known as the reactor). Remove your sleep function and replace it with reactor.run().
from twisted.internet import reactor
# ...
def main():
a = session.get('https://reqres.in/api/users')
a.addCallbacks(processResponse, processResponse)
#time.sleep(5) # never use blocking functions like this w/ Twisted
reactor.run()
The catch is Twisted's reactor cannot be restarted, so once you stop the event loop (ie. reactor.stop()), an exception will be raised when reactor.run() is executed again. In other words, your script/app will only "run once". To circumvent this issue, I suggest you use crochet. Here's a quick example using a similar example from requests-thread:
import crochet
crochet.setup()
print('setup')
from twisted.internet.defer import inlineCallbacks
from requests_threads import AsyncSession
session = AsyncSession(n=100)
#crochet.run_in_reactor
#inlineCallbacks
def main(reactor):
responses = []
for i in range(10):
responses.append(session.get('http://httpbin.org/get'))
for response in responses:
r = yield response
print(r)
if __name__ == '__main__':
event = main(None)
event.wait()
And just as an FYI requests-thread is not for production systems and is subject to significant change (as of Oct 2017). The end goal of this project is to design an awaitable design pattern for requests in the future. If you need production ready concurrent requests, consider grequests or treq.
I think the only mistake here is that you forgot to run the reactor/event loop.
The following code works for me:
from twisted.internet import reactor
from requests_threads import AsyncSession
session = AsyncSession(n=10)
def processResponse(response):
print(response)
a = session.get('https://reqres.in/api/users')
a.addCallbacks(processResponse, processResponse)
reactor.run()

Checking Servers with Motor(Mongodb & Tornado)

I need to create a function that checks to make sure Mongo servers are running using the ping function. I set up the clients right there (the config file has dictionary with ports numbers)
clientList = []
for value in configuration["mongodbServer"]:
client = motor.motor_tornado.MotorClient('mongodb://localhost:{}'.format(value))
clientList.append(client)
and then i run this function:
class MongoChecker(Checker):
formatter = 'stashboard.formatters.MongoFormatter'
def check(self):
for x in clientList:
if x.ping:
return x.ping
and the error i get:
yielded unknown object MotorDatabase(Database(MongoClient([]), 'ping'))\n",
I think my issue is that i'm using the ping function wrong. I can't find any other documentation on that or any other kind of feature that would check to see if the servers are still running. If anyone knows of a better way to monitor the status using Motor, i'm open. Thanks!
First, there's no "ping" function. Hence MotorClient thinks you're trying to access the database named "ping". The database named "ping" is shown in the "unknown object" exception. For all MongoDB commands like "ping", just use MotorDatabase's command method.
Second, Motor is asynchronous. You must use Motor methods in a Tornado coroutine with the "yield" statement. For example:
#gen.coroutine
def check():
try:
result = yield client.admin.command({'ping': 1})
print(result)
except ConnectionFailure as exc:
print(exc)
If you want to test this out synchronously, you can run the IOLoop just long enough for the coroutine to complete:
from pymongo.errors import ConnectionFailure
from tornado import gen
from tornado.ioloop import IOLoop
import motor.motor_tornado
client = motor.motor_tornado.MotorClient()
IOLoop.current().run_sync(check)
For an introduction to Tornado coroutines, see Refactoring Tornado Coroutines and the Tornado documentation.

How to use Tornado.gen.coroutine in TCP Server?

i write a Tcp Server with Tornado.
here is the code:
#! /usr/bin/env python
#coding=utf-8
from tornado.tcpserver import TCPServer
from tornado.ioloop import IOLoop
from tornado.gen import *
class TcpConnection(object):
def __init__(self,stream,address):
self._stream=stream
self._address=address
self._stream.set_close_callback(self.on_close)
self.send_messages()
def send_messages(self):
self.send_message(b'hello \n')
print("next")
self.read_message()
self.send_message(b'world \n')
self.read_message()
def read_message(self):
self._stream.read_until(b'\n',self.handle_message)
def handle_message(self,data):
print(data)
def send_message(self,data):
self._stream.write(data)
def on_close(self):
print("the monitored %d has left",self._address)
class MonitorServer(TCPServer):
def handle_stream(self,stream,address):
print("new connection",address,stream)
conn = TcpConnection(stream,address)
if __name__=='__main__':
print('server start .....')
server=MonitorServer()
server.listen(20000)
IOLoop.instance().start()
And i face some eorror assert self._read_callback is None, "Already reading",i guess the eorror is because multiple commands to read from socket at the same time.and then i change the function send_messages with tornado.gen.coroutine.here is code:
#gen.coroutine
def send_messages(self):
yield self.send_message(b'hello \n')
response1 = yield self.read_message()
print(response1)
yield self.send_message(b'world \n')
print((yield self.read_message()))
but there are some other errors. the code seem to stop after yield self.send_message(b'hello \n'),and the following code seem not to execute.
how should i do about it ? If you're aware of any Tornado tcpserver (not HTTP!) code with tornado.gen.coroutine,please tell me.I would appreciate any links!
send_messages() calls send_message() and read_message() with yield, but these methods are not coroutines, so this will raise an exception.
The reason you're not seeing the exception is that you called send_messages() without yielding it, so the exception has nowhere to go (the garbage collector should eventually notice and print the exception, but that can take a long time). Whenever you call a coroutine, you should either use yield to wait for it to finish, or IOLoop.current().spawn_callback() to run the coroutine in the "background" (this tells Tornado that you do not intend to yield the coroutine, so it will print the exception as soon as it occurs). Also, whenever you override a method you should read the documentation to see whether coroutines are allowed (when you override TCPServer.handle_stream() you can make it a coroutine, but __init__() may not be a coroutine).
Once the exception is getting logged, the next step is to fix it. You can either make send_message() and read_message() coroutines (getting rid of the handle_message() callback in the process), or you can use tornado.gen.Task() to call coroutine-style code from a coroutine. I generally recommend using coroutines everywhere.

Scheduling a corroutine for a later loop iteration

I borrowed this code of a simple chat:
import tornado.ioloop
import tornado.web
import tornado.websocket
import tornado.gen
clients = []
class IndexHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(request):
request.render("index.html")
class WebSocketChatHandler(tornado.websocket.WebSocketHandler):
def open(self, *args):
print("open", "WebSocketChatHandler")
clients.append(self)
def check_origin(self, origin):
return True
#tornado.gen.coroutine
def on_message(self, message):
for client in clients:
client.write_message(message)
#tornado.gen.coroutine
def myroutine(m):
print "mensaje: "
c = (yield 123123123)
print ("mensaje", m, c)
yield myroutine(message)
def on_close(self):
clients.remove(self)
app = tornado.web.Application([(r'/chat', WebSocketChatHandler), (r'/', IndexHandler)])
app.listen(8888)
tornado.ioloop.IOLoop.instance().start()
The Chat application works well (i.e. I see the echoes using a websocket client), and I modified it a bit to test some custom code.
And, just for testing purposes, I wanted to insert a presumably heavy function call which I wanted to make asynchronous.
The actual intention here, is that myroutine will start a game-engine as a paralell task.
Perhaps I am missing something, but the intention in my code is to re-schedule the corroutine in two parts. This means: the corroutine should print "message", then yield the value 123123123 (actually, this is an immediate value which will be wrapped into an already-resolved future - the value will be in the result), thus rescheduling itself to the next iteration, and (in the latter iteration) print the given tuple ("message", message, c).
My issue is that the function is never rescheduled (i.e. only "message:" is printed by console).
What am I doing wrong? This is my first attempt at Tornado (and async programming in general). How can I tell the tornado loop something like "dude, this value is my corruotine, and those are the arguments for my corroutine. please, start it in paralell by scheduling it in the next loop"?
There are two problems going on: first, you can't yield every kind of object from a coroutine, you must yield a Future or other special yieldable object. So when your coroutine yields 123123, Tornado throws a "bad yield" exception. Unfortunately, Tornado's websocket code isn't built to catch exceptions from "on_message" if "on_message" is a coroutine, so the exception passes silently. See the warning at the bottom of the coroutine documentation.
The solution for you is to yield a valid object from "mycoroutine". If you just want to yield for a moment, yield "gen.moment":
print "one"
yield gen.moment
print "two"
If you want "mycoroutine" to run in parallel and not block "on_message", just call it without yielding:
mycoroutine(message)
But! Calling a coroutine this way means no one is listening to see if it throws an exception. Make sure you catch and log all exceptions within "mycoroutine", since otherwise they will pass silently.

How does long-polling work in Tornado?

In Tornado's chat demo, it has a method like this:
#tornado.web.asynchronous
def post(self):
cursor = self.get_argument("cursor", None)
global_message_buffer.wait_for_messages(self.on_new_messages,
cursor=cursor)
I'm fairly new to this long polling thing, and I don't really understand exactly how the threading stuff works, though it states:
By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections...
My theory was that by making a simple app:
import tornado.ioloop
import tornado.web
import time
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(self):
print("Start request")
time.sleep(4)
print("Okay done now")
self.write("Howdy howdy howdy")
self.finish()
application = tornado.web.Application([
(r'/', MainHandler),
])
That if I made two requests in a row (i.e. I opened two browser windows and quickly refreshed both) I would see this:
Start request
Start request
Okay done now
Okay done now
Instead, I see
Start request
Okay done now
Start request
Okay done now
Which leads me to believe that it is, in fact, blocking in this case. Why is it that my code is blocking, and how do I get some code to do what I expect? I get the same output on Windows 7 with a core i7, and a linux Mint 13 box with I think two cores.
Edit:
I found one method - if someone can provide a method that works cross-platform (I'm not too worried about performance, only that it's non-blocking), I'll accept that answer.
The right way to convert your test app into a form that won't block the IOLoop is like this:
from tornado.ioloop import IOLoop
import tornado.web
from tornado import gen
import time
#gen.coroutine
def async_sleep(timeout):
""" Sleep without blocking the IOLoop. """
yield gen.Task(IOLoop.instance().add_timeout, time.time() + timeout)
class MainHandler(tornado.web.RequestHandler):
#gen.coroutine
def get(self):
print("Start request")
yield async_sleep(4)
print("Okay done now")
self.write("Howdy howdy howdy")
self.finish()
if __name__ == "__main__":
application = tornado.web.Application([
(r'/', MainHandler),
])
application.listen(8888)
IOLoop.instance().start()
The difference is replacing the call to time.sleep with one which won't block the IOLoop. Tornado is designed to handle lots of concurrent I/O without needing multiple threads/subprocesses, but it will still block if you use synchronous APIs. In order for your long-polling solution to handle concurrency the way you'd like, you have to make sure that no long-running calls block.
The problem with the code in original question is that when you call time.sleep(4) you are effectively blocking the execution of event loop for 4 seconds. And accepted answer doesn't solve the problem either (IMHO).
Asynchronous serving in Tornado works on trust. Tornado will call your functions whenever something happens, but it trusts you that you will return control to it as soon as possible. If you block with time.sleep() then this trust is breached - Tornado can't handle new connections.
Using multiple threads only hides the mistake; running Tornado with thousands of threads (so you can serve 1000s of connections simultaneously) would be very inefficient. The appropriate way is running a single thread which only blocks inside Tornado (on select or whatever Tornado's way of listening for events is) - not on your code (to be exact: never on your code).
The proper solution is to just return from get(self) right before time.sleep() (without calling self.finish()), like this:
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(self):
print("Starting")
You must of course remember that this request is still open and call write() and finish() on it later.
I suggest you take a look at chat demo. Once you strip out the authentication you get a very nice example of async long polling server.
Since Tornado 5.0, asyncio is enabled automatically, so pretty much just changing time.sleep(4) to await asyncio.sleep(4) and #tornado.web.asynchronous def get(self): to async def get(self): solves the problem.
Example:
import tornado.ioloop
import tornado.web
import asyncio
class MainHandler(tornado.web.RequestHandler):
async def get(self):
print("Start request")
await asyncio.sleep(4)
print("Okay done now")
self.write("Howdy howdy howdy")
self.finish()
app = tornado.web.Application([
(r'/', MainHandler),
])
app.listen(8888)
tornado.ioloop.IOLoop.current().start()
Output:
Start request
Start request
Okay done now
Okay done now
Sources:
Tornado on asyncio
asyncio usage example

Categories