Sleep in async application, without blocking event loop - python

I have a problem with one of my async tests: tornado does not seem to be up and running (I get 599s), even though it is started as a fixture. I would like to verify that it is indeed running, by doing the following:
Start the test
Before the requests are actually sent to tornado, I would like my code to "sleep" for a long time so that I can manually check with the browser that tornado is indeed running.
How can I tell Python to sleep without blocking the event loop? If I use time.sleep, the whole process (and thus the loop) is unresponsive for the duration of the sleep.

Both of your questions are answered in Tornado's FAQ:
http://www.tornadoweb.org/en/latest/faq.html
Use yield gen.sleep(1) (or await gen.sleep(1) in a native coroutine) to wait 1 second while letting other tasks run on the event loop, or use IOLoop.current().call_later(1, callback) to run a callback one second later.
Don't use your browser to verify that your application handles multiple requests concurrently: the browser itself may not allow concurrent requests. Use curl instead.
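For example, a minimal sketch of the first suggestion inside a native Tornado coroutine (the handler name and the 60-second duration are just for illustration):

from tornado import gen, web

class DebugHandler(web.RequestHandler):
    async def get(self):
        # Pause this coroutine without blocking the IOLoop; other requests
        # (e.g. sent with curl) are still served while we wait.
        await gen.sleep(60)
        self.write("done sleeping")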

Check whether your request method is POST or PUT: I got the 599 response code when I sent a POST without a body, and it went away once I added one. Note that in Tornado tests, self.fetch passes its keyword arguments on to HTTPRequest, which takes body rather than data.
error:
res = self.fetch('/', method='POST')
fixed:
res = self.fetch('/', method='POST', body='')

Related

Is there an alternative process to threading for async process in python?

Please tell me if there is a better way to do this in Python.
I have an endpoint in a Flask app that takes in a request, validates it, starts a thread, and returns a response saying the request is valid and a process has been started.
At the end of the thread, I send a request to the callback URL saying the threaded process has completed.
Is there a way to do this without threading?
What are the other options for an asynchronous endpoint call in Python where the client doesn't wait until the process is complete?
You can try using features from Python 3.4+'s asyncio. This changes the program to cooperative multitasking, so you need to be a bit careful. There are existing answers elsewhere on how to do that.
Maybe websockets would work for you. The Starlette framework is async-first and has built-in websocket support.
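To make the async route concrete, here is a minimal sketch (not from the original answers) of the "return immediately, notify a callback URL later" pattern using Starlette's BackgroundTask and httpx; the route, field names, and do_long_work are placeholders:

import httpx
from starlette.applications import Starlette
from starlette.background import BackgroundTask
from starlette.responses import JSONResponse
from starlette.routing import Route

async def do_long_work(callback_url: str) -> None:
    ...  # the actual long-running processing goes here
    async with httpx.AsyncClient() as client:
        await client.post(callback_url, json={"status": "done"})

async def start_job(request):
    payload = await request.json()
    task = BackgroundTask(do_long_work, payload["callback_url"])
    # The response is sent first; the task runs afterwards on the event loop,
    # so no extra thread is needed.
    return JSONResponse({"status": "accepted"}, background=task)

app = Starlette(routes=[Route("/jobs", start_job, methods=["POST"])])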

Running python function asynchronously without blocking caller function (and without needing the result)

I need to do a lengthy preprocessing step on POST requests to a given route (I'm using Django).
This will read a dataset, change some things, and re-write it to disk (it can take a couple of minutes).
I don't need the result of this function, I just want to execute it asynchronously and send an HTTP response immediately without waiting for it to end.
With the code below, Python warns that the coroutine "preprocess_dataset_async" was never awaited, and it never runs to completion.
@require_POST
def preprocess_dataset(request, f_path=''):
    # ...
    preprocess_dataset_async(f_path, data)
    return HttpResponse('Request is being handled in the background', status=200)

async def preprocess_dataset_async(f_path, preprocess_args):
    # ...
    await stuff
    # ...
What would be the best way to execute this task in the background without blocking the caller function ?
Threading is one possible solution for this. But a better, more future-proof solution is to introduce Celery.
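A rough sketch of the Celery route, assuming a Celery app is already configured for the Django project (broker settings etc.); the task name here is made up to mirror the question's function, and data stands in for whatever arguments you need:

from celery import shared_task
from django.http import HttpResponse
from django.views.decorators.http import require_POST

@shared_task
def preprocess_dataset_task(f_path, preprocess_args):
    ...  # lengthy read / transform / re-write of the dataset

@require_POST
def preprocess_dataset(request, f_path=''):
    data = {}  # build the preprocessing arguments from the request as needed
    preprocess_dataset_task.delay(f_path, data)  # enqueue; a worker runs it
    return HttpResponse('Request is being handled in the background', status=200)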

time.sleep, Flask and I/O wait

When using time.sleep(), will a Flask request be blocked?
One of my Flask endpoints launches a long processing subtask, and in some cases, instead of doing the work asynchronously, it is possible to wait for the completion of the task and return the result in the same request.
In this case, my Flask app starts the process, then waits for it to complete before returning the result. My issue is that while running something like this (simplified):
while True:
    if process_is_done():
        break
    time.sleep(1)
Will Flask block that request until it is done, or will it allow other requests to come in in the meantime?
Yes, that request is entirely blocked. time.sleep() does not inform anything else of the sleep; it simply idles the calling thread for the duration.
Flask itself is not asynchronous; it has no concept of putting a request handler on hold and giving other requests more time. A good WSGI server will use threads and/or multiple worker processes to achieve concurrency, but this one request is blocked and ties up its worker for the whole wait.
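To illustrate (the endpoint names and the process_is_done stub are hypothetical): in the sketch below, /wait blocks its own worker thread in time.sleep, while a threaded WSGI server can still answer /ping from another worker.

import time
from flask import Flask

app = Flask(__name__)

def process_is_done():
    return False  # stand-in for the real completion check

@app.route('/wait')
def wait():
    while True:
        if process_is_done():
            break
        time.sleep(1)  # blocks only this worker thread, not the whole server
    return 'done'

@app.route('/ping')
def ping():
    return 'pong'  # still answered while /wait is sleeping

if __name__ == '__main__':
    app.run(threaded=True)  # dev server: one thread per request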

Mixing tornado and sqlalchemy

I'm trying to write a tornado web application that uses sqlalchemy in some request handlers. These handlers have two parts: one that takes a long time to complete, and another that uses sqlalchemy and is relatively fast.
I would like to make the slow part of the request asynchronous, but not the sqlalchemy part. Can I do something like the following code and be safe?
class ExampleHandler(BaseHandler):
    async def post(self):
        loop = asyncio.get_event_loop()
        await loop.run_in_executor(...)  # very slow (no sqlalchemy here)
        with self.db_session() as s:  # sqlalchemy session
            s.add(...)
            s.commit()
        self.render(...)
The idea is to have sqlalchemy still blocking, but have the computational heavy part not blocking the application.
The Tornado web server uses asynchronous code to serve many connections from a single thread, rather than a thread per request, which in Python is further constrained by the Global Interpreter Lock. The GIL, as it is colloquially known, allows only one thread at a time to execute Python bytecode in the interpreter process. Tornado can answer many requests simultaneously because of its event loop, which performs one small task at a time. Let's take your own post handler to understand this better.
In this handler, when the python interpreter gets to the await keyword, it pauses the execution of the function and queues it for later on its event loop. It then checks the event loop to respond to other events that may have queued up there, like responding to a new connection or servicing another handler.
When you block in an asynchronous function, you freeze the entire event loop as it is unable to pause your function and service anything else. What this actually means for you is that your web server will not accept or service any requests while your async function blocks. It will appear as if your web server is hanging and indeed it is stuck.
To keep the server responsive, you have to find a way to execute your sqlalchemy query in an asynchronous non-blocking manner.
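One way to do that, sketched below by reusing the question's BaseHandler and db_session helpers (which are not shown here), is to push the blocking SQLAlchemy work onto the executor as well, so neither part stalls the event loop:

import asyncio

class ExampleHandler(BaseHandler):
    async def post(self):
        loop = asyncio.get_event_loop()
        # Slow, non-SQLAlchemy work runs on a worker thread.
        await loop.run_in_executor(None, self.heavy_computation)
        # SQLAlchemy still blocks, but only the worker thread, not the IOLoop.
        await loop.run_in_executor(None, self.save_results)
        self.render("done.html")

    def heavy_computation(self):
        ...  # the computationally heavy part

    def save_results(self):
        with self.db_session() as s:
            s.add(...)  # same placeholder as in the question
            s.commit()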

Aborting HTTP request cross-thread

I'm porting one of my projects from C# and am having trouble solving a multithreading issue in Python. The problem relates to a long-lived HTTP request, which is expected (the request will respond when a certain event occurs on the server). Here's the summary:
I send the request using urllib2 on a separate thread. When the request returns or times out, the main thread is notified. This works fine. However, there are cases where I need to abort this outstanding request and switch to a different URL. There are four solutions that I can consider:
1. Abort the outstanding request. C# has WebRequest.Abort(), which I can call cross-thread to abort the request. Python's urllib2.Request appears to be a pure data class, in that instances only store request information; responses are not connected to Request objects. So I can't do this.
2. Interrupt the thread. C# has Thread.Interrupt(), which will raise a ThreadInterruptedException in the thread if it is in a wait state, or the next time it enters such a state. (Waiting on a monitor and file/socket I/O are both waiting states.) Python doesn't seem to have anything comparable; there does not appear to be a way to wake up a thread that is blocked on I/O.
3. Set a low timeout on the request. On a timeout, check an "aborted" flag. If it's false, restart the request.
4. Similar to option 3, add an "aborted" flag to the state object so that when the request does finally end in one way or another, the thread knows that the response is no longer needed and just shuts itself down.
Options 3 and 4 seem to be the only ones supported by Python, but option 3 is a horrible solution and 4 will keep open a connection I don't need. I am hoping to be a good netizen and close this connection when I no longer need it. Is there any way to actually abort the outstanding request, one way or another?
Consider using gevent. Gevent uses non-thread cooperating units of execution called greenlets. Greenlets can "block" on IO, which really means "go to sleep until the IO is ready". You could have a requester greenlet that owns the socket and a main greenlet that decides when to abort. When you want to abort and switch URLs the main greenlet kills the requester greenlet. The requester catches the resulting exception, closes its socket/urllib2 request, and starts over.
Edited to add: Gevent is not compatible with threads, so be careful with that. You'll have to either use gevent all the way or threads all the way. Threads in python are kinda lame anyway because of the GIL.
Similar to Spike Gronim's answer, but even more heavy-handed.
Consider rewriting this in Twisted. You would probably want to subclass twisted.web.http.HTTPClient, in particular implementing handleResponsePart to do your client interaction (or handleResponseEnd if you don't need to see data before the response ends). To close the connection early, you just call loseConnection on the client protocol's transport.
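A rough sketch of that approach, not a drop-in client (the host, path, and aborted flag are illustrative):

from twisted.internet import protocol, reactor
from twisted.web.http import HTTPClient

class LongPollClient(HTTPClient):
    def connectionMade(self):
        self.sendCommand(b"GET", b"/events")
        self.sendHeader(b"Host", b"example.com")
        self.endHeaders()

    def handleResponsePart(self, data):
        # Inspect partial response data as it arrives.
        if self.factory.aborted:
            self.transport.loseConnection()  # abort mid-response
        else:
            print("got %d bytes" % len(data))

    def handleResponseEnd(self):
        self.transport.loseConnection()

class LongPollFactory(protocol.ClientFactory):
    protocol = LongPollClient
    aborted = False

factory = LongPollFactory()
reactor.connectTCP("example.com", 80, factory)
# Later, from reactor code, set factory.aborted = True to drop the connection.
reactor.run()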
Maybe this snippet of a "killable thread" could be useful to you if you have no other choice. But I am of the same opinion as Spike Gronim and recommend using gevent.
I found this question using google and used Spike Gronim's answer to come up with:
from gevent import monkey
monkey.patch_all()

import gevent
import requests

def post(*args, **kwargs):
    # Pull out our extra keyword argument before passing the rest to requests.
    if 'stop_event' in kwargs:
        stop_event = kwargs['stop_event']
        del kwargs['stop_event']
    else:
        stop_event = None
    # Run the real request in a greenlet and poll it in small increments.
    req = gevent.spawn(requests.post, *args, **kwargs)
    while req.value is None:
        req.join(timeout=0.1)
        if stop_event and stop_event.is_set():
            req.kill()  # abort the outstanding request
            break
    return req.value
I thought it might be useful for other people as well.
It works just like a regular requests.post, but takes an extra keyword argument 'stop_event'. This is a threading.Event. The request will abort if the stop_event gets set.
Use with caution: if the greenlet is not waiting on the connection or on communication, it can still hold the GIL (as mentioned). Gevent does seem compatible with threading these days (through monkey patching).
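Hypothetical usage of that wrapper (the URL is a placeholder): set the event from another thread when you want to abort the long poll.

import threading

stop = threading.Event()

# From another thread, when you want to switch URLs:
#     stop.set()

resp = post('https://example.com/long-poll', data={'key': 'value'}, stop_event=stop)
if resp is not None:
    print(resp.status_code)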
