Tornado Async HTTP returning results incrementally - python

From what I understand from the tornado.gen module docs, tornado.gen.Task is composed of tornado.gen.Callback and tornado.gen.Wait, with each Callback/Wait pair associated with a unique key ...
class GenAsyncHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.engine
    def get(self):
        http_client = AsyncHTTPClient()
        http_client.fetch("http://google.com",
                          callback=(yield tornado.gen.Callback("google")))
        http_client.fetch("http://python.org",
                          callback=(yield tornado.gen.Callback("python")))
        http_client.fetch("http://tornadoweb.org",
                          callback=(yield tornado.gen.Callback("tornado")))
        response = yield [tornado.gen.Wait("google"), tornado.gen.Wait("tornado"), tornado.gen.Wait("python")]
        do_something_with_response(response)
        self.render("template.html")
So the above code will get all responses from the different URLs.
Now what I actually need to accomplish is to return the response as soon as one http_client returns its data. So if 'tornadoweb.org' returns data first, it should do a self.write(response), and a loop in def get() should keep waiting for the other http_clients to complete.
Any ideas on how to write this using the tornado.gen interface?
A very vague (and syntactically incorrect) sketch of what I am trying to do would be something like this:
class GenAsyncHandler2(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.engine
    def get(self):
        http_client = AsyncHTTPClient()
        http_client.fetch("http://google.com",
                          callback=(yield tornado.gen.Callback("google")))
        http_client.fetch("http://python.org",
                          callback=(yield tornado.gen.Callback("python")))
        http_client.fetch("http://tornadoweb.org",
                          callback=(yield tornado.gen.Callback("tornado")))
        while True:
            response = self.get_response()
            if response:
                self.write(response)
                self.flush()
            else:
                break
        self.finish()

    def get_response(self):
        # hypothetical API: availableKeys() and pop() do not exist in tornado.gen
        for key in tornado.gen.availableKeys():
            if key.is_ready:
                value = tornado.gen.pop(key)
                return value
        return None

This is a case where you shouldn't use inline callbacks, i.e. gen.
Also, self.render will only be called after all callbacks have finished. If you want to return the response from the server partially, render it partially.
Think of it this way (it's only an idea, with a lot of room for improvement):
response = []

@tornado.web.asynchronous
def get(self):
    self.render('head.html')
    http_client = AsyncHTTPClient()
    http_client.fetch("http://google.com",
                      callback=self.mywrite)
    http_client.fetch("http://python.org",
                      callback=self.mywrite)
    http_client.fetch("http://tornadoweb.org",
                      callback=self.mywrite)
    self.render('footer.html')
    self.finish()

def mywrite(self, result):
    self.render('body_part.html')
    self.response.append(result)  # response is a list, so append rather than add
    if len(self.response) == 3:
        do_something_with_response(self.response)

In addition to this, there is actually a method WaitAll, which waits for all results and returns only when all HTTP clients have completed and returned their responses.
I have submitted a diff in my tornado branch (https://github.com/pranjal5215/tornado). I have added a class WaitAny, which is an asynchronous WaitAll and returns a result as soon as one HTTPClient has returned it.
The diff is at (https://github.com/pranjal5215/tornado/commit/dd6902147ab2c5cbf2b9c7ee9a35b4f89b40790e), (https://github.com/pranjal5215/tornado/wiki/Add-WaitAny-to-make-WaitAll-return-results-incrementally).
Sample usage:
class GenAsyncHandler2(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    @tornado.gen.engine
    def get(self):
        http_client = AsyncHTTPClient()
        http_client.fetch("http://google.com",
                          callback=(yield tornado.gen.Callback("google")))
        http_client.fetch("http://python.org",
                          callback=(yield tornado.gen.Callback("python")))
        http_client.fetch("http://tornadoweb.org",
                          callback=(yield tornado.gen.Callback("tornado")))
        keys = set(["google", "tornado", "python"])
        while keys:
            key, response = yield tornado.gen.WaitAny(keys)
            keys.remove(key)
            # do something with response
            self.write(str(key) + " ")
            self.flush()
        self.finish()
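For reference, Tornado later gained this behavior natively: tornado.gen.WaitIterator (added in Tornado 4.1) yields results in whatever order they complete, without a custom patch. A minimal sketch on a coroutine-era Tornado:

import tornado.gen
import tornado.web
from tornado.httpclient import AsyncHTTPClient

class WaitIteratorHandler(tornado.web.RequestHandler):
    @tornado.gen.coroutine
    def get(self):
        client = AsyncHTTPClient()
        # WaitIterator takes keyword futures and hands them back as they finish
        wait_iterator = tornado.gen.WaitIterator(
            google=client.fetch("http://google.com"),
            python=client.fetch("http://python.org"),
            tornado=client.fetch("http://tornadoweb.org"))
        while not wait_iterator.done():
            try:
                response = yield wait_iterator.next()
            except Exception:
                # current_index names the future that just completed
                self.write("error from %s " % wait_iterator.current_index)
                continue
            self.write("%s " % wait_iterator.current_index)
            self.flush()
        self.finish()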

Related

How to update request parameters in FastAPI

I am using FastAPI. I want to define a middleware in which I can intercept the encrypted parameters passed by the front end, decrypt them, and replace the original parameters with the decrypted ones. What should I do?
I have tried
body = await request.body()
request._body = body
Also I have tried
from starlette.types import Message

async def set_body(request: Request, body: bytes):
    async def receive() -> Message:
        return {"type": "http.request", "body": body}
    request._receive = receive

async def get_body(request: Request) -> bytes:
    body = await request.body()
    await set_body(request, body)  # set_body is a coroutine, so it must be awaited
    return body
But still no solution. Can anyone give a solution to this problem? Thanks a lot!
=========================================================================
from typing import Callable

from fastapi import Request, Response
from fastapi.routing import APIRoute

class GzipRequest(Request):
    async def body(self) -> bytes:
        # if not hasattr(self, "_body"):
        body = await super().body()
        # if "gzip" in self.headers.getlist("Content-Encoding"):
        #     body = gzip.decompress(body)
        self._body = body
        return self._body

class GzipRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            request = GzipRequest(request.scope, request.receive)
            return await original_route_handler(request)

        return custom_route_handler

app.router.route_class = GzipRoute
I also tested this method, but it still didn't work!
==============================================================
I have solved this problem with app.router.route_class = GzipRoute.
Main tip: when you define routes on a separate APIRouter, you also need to set the route class on that router, for example:
router = APIRouter(route_class=GzipRoute)
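Tying this back to the original question about decrypting parameters: the same custom-request pattern can carry the decryption instead of gzip. A minimal sketch, where decrypt() is a hypothetical helper standing in for your actual decryption logic:

from typing import Callable

from fastapi import FastAPI, Request, Response
from fastapi.routing import APIRoute

def decrypt(payload: bytes) -> bytes:
    # hypothetical: replace with your real decryption of the front-end payload
    return payload

class DecryptRequest(Request):
    async def body(self) -> bytes:
        if not hasattr(self, "_body"):
            body = await super().body()
            self._body = decrypt(body)  # cache the decrypted body for later reads
        return self._body

class DecryptRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            request = DecryptRequest(request.scope, request.receive)
            return await original_route_handler(request)

        return custom_route_handler

app = FastAPI()
app.router.route_class = DecryptRoute  # routes registered after this use DecryptRoute

Endpoints then read the already-decrypted body via request.body() or request.json() as usual.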

python tornado get response of multiple async httprequest

I have a list of URL handlers and I want to make asynchronous HTTP requests to them using Tornado. When the whole response structure has arrived, I need to use it for other purposes.
Here a simple example of my code:
(...)
self.number = 0
self.counter = 0
self.data = {}
(...)

@tornado.web.asynchronous
def post(self):
    list_url = [url_service1, url_service2]
    self.number = len(list_url)
    http_client = AsyncHTTPClient()
    for service in list_url:
        request = tornado.httpclient.HTTPRequest(url=service, method='POST',
                                                 headers={'content-type': 'application/json'},
                                                 body=json.dumps({..params..}))
        http_client.fetch(request, callback=self.handle_response)
    # The for loop is finished. Use self.data, for example, in other functions...
    # but if I print(self.data) here, I get an empty dict...
    # do_something(self.data)

def handle_response(self, response):
    if response.error:
        print("Error")
    else:
        self.counter = self.counter + 1
        print("Response {} / {} from {}".format(self.counter, self.number, response.effective_url))
        self.data[response.effective_url] = json_decode(response.body)
        # number is 2
        if self.counter == self.number:
            print("Finish response")

def do_something(self, data):
    # code with data parameter
I hope my problem is well explained
As you know, AsyncHTTPClient is asynchronous; that means the requests run in the background.
So when the for loop finishes, that does not mean all the requests have also finished; they keep running in the background even after the loop ends.
That is why self.data is empty: the requests haven't completed yet.
How to fix this
As you know, the handle_response callback is called after every request completes. You can call the do_something function from this callback once all the requests are completed. Like this:
def handle_response(...):
    ...
    if self.counter == self.number:
        self.do_something(self.data)
        print("Finish response")

aiohttp: set maximum number of requests per second

How can I set maximum number of requests per second (limit them) in client side using aiohttp?
Although it's not exactly a limit on the number of requests per second, note that since v2.0, when using a ClientSession, aiohttp automatically limits the number of simultaneous connections to 100.
You can modify the limit by creating your own TCPConnector and passing it into the ClientSession. For instance, to create a client limited to 50 simultaneous requests:
import aiohttp
connector = aiohttp.TCPConnector(limit=50)
client = aiohttp.ClientSession(connector=connector)
In case it's better suited to your use case, there is also a limit_per_host parameter (which is off by default) that you can pass to limit the number of simultaneous connections to the same "endpoint". Per the docs:
limit_per_host (int) – limit for simultaneous connections to the same endpoint. Endpoints are the same if they have an equal (host, port, is_ssl) triple.
Example usage:
import aiohttp
connector = aiohttp.TCPConnector(limit_per_host=50)
client = aiohttp.ClientSession(connector=connector)
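A minimal end-to-end sketch of the same idea, assuming an example URL and using async with so the session and its connector are closed cleanly:

import asyncio

import aiohttp

async def main():
    connector = aiohttp.TCPConnector(limit_per_host=50)
    async with aiohttp.ClientSession(connector=connector) as session:
        # every request made through this session shares the connection limit
        async with session.get("https://example.com") as resp:
            print(resp.status)

asyncio.run(main())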
I found one possible solution here: http://compiletoi.net/fast-scraping-in-python-with-asyncio.html
Doing 3 requests at the same time is cool, doing 5000, however, is not so nice. If you try to do too many requests at the same time, connections might start to get closed, or you might even get banned from the website.
To avoid this, you can use a semaphore. It is a synchronization tool that can be used to limit the number of coroutines that do something at some point. We'll just create the semaphore before creating the loop, passing as an argument the number of simultaneous requests we want to allow:
sem = asyncio.Semaphore(5)
Then, we just replace:
page = yield from get(url, compress=True)
by the same thing, but protected by a semaphore:
with (yield from sem):
    page = yield from get(url, compress=True)
This will ensure that at most 5 requests can be done at the same time.
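On current Python the same pattern is written with async with; a sketch assuming the get() coroutine from the linked article:

sem = asyncio.Semaphore(5)

async def bounded_get(url):
    # at most 5 coroutines can hold the semaphore at once
    async with sem:
        return await get(url, compress=True)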
This is an example without aiohttp, but you can wrap any async method (or aiohttp.request) with the Limit decorator:
import asyncio
import time

class Limit(object):
    def __init__(self, calls=5, period=1):
        self.calls = calls
        self.period = period
        self.clock = time.monotonic
        self.last_reset = 0
        self.num_calls = 0

    def __call__(self, func):
        async def wrapper(*args, **kwargs):
            if self.num_calls >= self.calls:
                await asyncio.sleep(self.__period_remaining())
            period_remaining = self.__period_remaining()
            if period_remaining <= 0:
                self.num_calls = 0
                self.last_reset = self.clock()
            self.num_calls += 1
            return await func(*args, **kwargs)
        return wrapper

    def __period_remaining(self):
        elapsed = self.clock() - self.last_reset
        return self.period - elapsed

@Limit(calls=5, period=2)
async def test_call(x):
    print(x)

async def worker():
    for x in range(100):
        await test_call(x + 1)

asyncio.run(worker())
None of the solutions from the other answers worked for me (I've already tried them) in the case where the API rate-limits based on the time since the end of a request, so I'm posting a new one that should work:
import asyncio
import time

class Limiter:
    def __init__(self, calls_limit: int = 5, period: int = 1):
        self.calls_limit = calls_limit
        self.period = period
        self.semaphore = asyncio.Semaphore(calls_limit)
        self.requests_finish_time = []

    async def sleep(self):
        if len(self.requests_finish_time) >= self.calls_limit:
            sleep_before = self.requests_finish_time.pop(0)
            if sleep_before >= time.monotonic():
                await asyncio.sleep(sleep_before - time.monotonic())

    def __call__(self, func):
        async def wrapper(*args, **kwargs):
            async with self.semaphore:
                await self.sleep()
                res = await func(*args, **kwargs)
                self.requests_finish_time.append(time.monotonic() + self.period)
                return res
        return wrapper
Usage:
@Limiter(calls_limit=5, period=1)
async def api_call(url):
    ...

async def main():
    tasks = [asyncio.create_task(api_call(url)) for url in urls]
    await asyncio.gather(*tasks)  # gather must be awaited

if __name__ == '__main__':
    loop = asyncio.get_event_loop_policy().get_event_loop()
    loop.run_until_complete(main())

Converting web.asynchronous code to gen.coroutine in tornado

I want to convert my current Tornado app from using @web.asynchronous to @gen.coroutine. My asynchronous callback is called when a particular variable change happens on an IOLoop iteration. The current example in the Tornado docs solves an I/O problem, but in my case it is a variable that I am interested in. I want the coroutine to wake up on the variable change. My app looks like the code shown below.
Note: I can only use Python2.
# A transaction is a DB change that can happen
# from another process
class Transaction:
    def __init__(self):
        self.status = 'INCOMPLETE'
        self.callback = None

# In this, I am checking the status of the DB
# before responding to the GET request
class MainHandler(web.RequestHandler):
    def initialize(self, app_reference):
        self.app_reference = app_reference

    @web.asynchronous
    def get(self):
        txn = Transaction()
        callback = functools.partial(self.do_something)
        txn.callback = callback
        self.app_reference.monitor_transaction(txn)

    def do_something(self):
        self.write("Finished GET request")
        self.finish()

# MyApp monitors a list of transactions and schedules the callback
# 'transaction.callback' when a transaction's status changes to
# the COMPLETE state.
class MyApp(Application):
    def __init__(self, settings):
        self.settings = settings
        self._url_patterns = self._get_url_patterns()
        self.txn_list = []  # list of all transactions being monitored
        Application.__init__(self, self._url_patterns, **self.settings)
        IOLoop.current().add_callback(self.check_status)

    def monitor_transaction(self, txn):
        self.txn_list.append(txn)

    def check_status(self):
        count = 0
        for transaction in self.txn_list:
            transaction.status = is_transaction_complete()
            if transaction.status == 'COMPLETE':
                IOLoop.current().add_callback(transaction.callback)
                self.txn_list.pop(count)
            count += 1
        if len(self.txn_list):
            IOLoop.current().add_callback(self.check_status)

    # adds 'self' to url_patterns
    def _get_url_patterns(self):
        from urls import url_patterns
        modified_url_patterns = []
        for url in url_patterns:
            modified_url_patterns.append(url + ({'app_reference': self},))
        return modified_url_patterns
If I understand correctly, to write this using gen.coroutine, get should be modified like this:
@gen.coroutine
def get(self):
    txn = Transaction()
    response = yield wake_up_when_transaction_completes()
    # respond to GET here
My issue is that I am not sure how to wake the coroutine only when the status changes, and I cannot use a loop, as it would block the Tornado thread. Basically, I want to notify it from the IOLoop iteration:
def check_status():
    for transaction in txn_list:
        if transaction.status == 'COMPLETE':
            NOTIFY_COROUTINE
Sounds like a job for the new tornado.locks! Released last week with Tornado 4.2:
http://tornado.readthedocs.org/en/latest/releases/v4.2.0.html#new-modules-tornado-locks-and-tornado-queues
Use an Event for this:
from tornado import locks, gen

event = locks.Event()

@gen.coroutine
def waiter():
    print("Waiting for event")
    yield event.wait()
    print("Done")

@gen.coroutine
def setter():
    print("About to set the event")
    event.set()
More info on the Event interface:
http://tornado.readthedocs.org/en/latest/locks.html#tornado.locks.Event
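Mapping that onto the Transaction pattern from the question, a rough sketch (names reused from the question's code, and Python 2 compatible with Tornado 4.2) could look like this: each transaction carries its own Event, the handler waits on it, and check_status sets it instead of scheduling a callback:

from tornado import gen, locks, web

class Transaction(object):
    def __init__(self):
        self.status = 'INCOMPLETE'
        self.event = locks.Event()  # replaces the explicit callback

class MainHandler(web.RequestHandler):
    def initialize(self, app_reference):
        self.app_reference = app_reference

    @gen.coroutine
    def get(self):
        txn = Transaction()
        self.app_reference.monitor_transaction(txn)
        # suspends this coroutine until check_status() calls txn.event.set()
        yield txn.event.wait()
        self.write("Finished GET request")

# ... and in MyApp.check_status, instead of add_callback(transaction.callback):
#         transaction.event.set()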

Django: views and assert like returns

I wonder if there is a hackish Python way to achieve the following.
I have found myself using an assert-like structure in my views a lot:
def view(request):
    if not condition:
        return HttpResponseServerError("error")
    if not condition2:
        return HttpResponseServerError("error2")
    [...]
    return HttpResponse("OK!")
So I thought about using an assert like function:
def view(request):
    def err(msg=None):
        msg = msg if msg else "Illegal Parameters"
        resp = {"msg": msg}
        resp = json.dumps(resp)
        return HttpResponseServerError(resp)

    def verify(exp, msg=None):
        if not exp:
            err(msg)

    verify(condition, "error")
    verify(condition2, "error2")
    return HttpResponse("OK")
Obviously, this does not work, as the result of the err function is never returned. Furthermore, I would need to pass the response all the way back up to the view function and do return verify(), which would of course prevent the rest of the view from executing.
One possible solution would be a decorator that either returns an error or the view function after all asserts went through. However, I would like to prevent that, as I also need some of the values I am establishing (imagine parsing one number after another and then having to pass a list of numbers).
Another solution I could think of is to actually do use a decorator and make my function a generator, yielding the result of verify. The decorator is a loop over that generator and keeps going until a response is yielded.
But in this post I am really looking for a more hackish way to let the nested function return a response instead of the parent function, and therefore stop execution.
I will post my yield "solution" in a separate answer so you can get the picture :)
What about an exception, and a nice decorator to catch it:
class AssertError(Exception):
    pass

def assertLike(view):
    def wrap(request, *args, **kwargs):
        try:
            return view(request, *args, **kwargs)
        except AssertError as e:
            return HttpResponseServerError(...)
    return wrap

@assertLike
def createTask(request):
    import json
    ....
    if not exp:
        raise AssertError()
    ....
    return HttpResponse("Ok")
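For completeness, a small verify helper (hypothetical, not part of the answer above) makes this read exactly like the assert-style code in the question, since raising propagates up no matter how deeply nested the call is. It assumes the same condition placeholders as the question:

def verify(exp, msg=None):
    # raising, unlike returning, stops execution at the call site
    if not exp:
        raise AssertError(msg or "Illegal Parameters")

@assertLike
def createTask(request):
    verify(condition, "error")
    verify(condition2, "error2")
    return HttpResponse("OK")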
Here I present the generator-based solution:
def assertLike(view):
    def wrap(request, *args, **kwargs):
        for response in view(request, *args, **kwargs):
            if response:
                return response
    return wrap

@other_django_views
@another_django_view
@assertLike
def createTask(request):
    import json

    def err(msg=None):
        msg = msg if msg else "Illegal Parameters"
        resp = {"msg": msg}
        resp = json.dumps(resp)
        return HttpResponseServerError(resp)

    def verify(exp, msg=None):
        if not exp:
            return err(msg)

    # only react to ajax requests
    yield verify(True, "This is not an error")
    yield verify(False, "Here it should stop!")
    yield HttpResponse("This is the final response!")
