Can AppEngine Python threads last longer than the original request?

We're trying to use the new python 2.7 threading ability in Google App Engine and it seems like the created thread is getting killed before it finishes running. Our scenario:
User sends a message to the server
We update the user's data
We spawn a thread to do some more heavy duty processing
We return a response to the user before waiting for the heavy duty processing to finish
My assumption was that the thread would continue to run after the request had returned, as long as it did not exceed the total request time limit. What we're seeing, though, is that the thread is randomly killed partway through its execution. No exceptions, no errors, nothing. It just stops running.
Are threads allowed to exist after the response has been returned? This does not repro on the dev server, only on live servers.
We could of course use a task queue instead, but that's a real pain since we'd have to set up a url for the action and serialize/deserialize the data.

The 'Sandboxing' section of this page:
http://code.google.com/appengine/docs/python/python27/using27.html#Sandboxing
indicates that threads cannot run past the end of the request.

Deferred tasks are the way to do this. You don't need a URL or serialization to use them:
from google.appengine.ext import deferred
deferred.defer(myfunction, arg1, arg2)
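If you need more control, defer also accepts scheduling and queue options. A small sketch (process_message and its arguments are illustrative, not from the question):

from google.appengine.ext import deferred

def process_message(user_id, message):
    # Heavy-duty processing runs in its own task request, with the
    # task queue's 10-minute deadline instead of the user-facing one.
    ...

# Enqueue from the request handler; the response can return immediately.
deferred.defer(process_message, user_id, message,
               _countdown=10,     # optional: delay execution by 10 seconds
               _queue="default")  # optional: run on a named queue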

Related

Python function with request might hang, how to timeout?

I'm building a script to process messages using the O365 module (https://pypi.org/project/O365/).
The script runs great, but after a random amount of time (usually about 20 hours) it gets stuck on a request without a response, and the script just hangs there waiting.
It's not a server throttling issue, as I've slowed my script down to one request per minute and it still hangs.
I think it might be a bug in the O365 module where it doesn't time out its requests, so I'm thinking of making the calls on a separate thread and, if it doesn't return within a certain amount of time, killing it.
But from what I understand, if I just try to join the thread it will try to wait until it finishes (which is never), is there a way to avoid this?
Thanks!
You can use the threading module and the join() method. As the documentation explains: "This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs."
The join() call will return either because the request completed or because the timeout elapsed. Note that a timed-out join() does not kill the thread; it only stops waiting for it, so check is_alive() afterwards to see which case you hit.
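A minimal sketch of that pattern (fetch_messages stands in for whichever O365 call hangs):

import threading

result = {}

def worker():
    result['messages'] = fetch_messages()  # the call that may hang

t = threading.Thread(target=worker)
t.daemon = True     # a stuck thread won't keep the process alive at exit
t.start()
t.join(timeout=30)  # block for at most 30 seconds

if t.is_alive():
    # Timed out: join() stopped waiting, but the thread is still running;
    # Python has no safe way to kill it outright.
    print("request timed out, moving on")
else:
    print(result.get('messages'))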

time.sleep, Flask and I/O wait

When using time.sleep(), will a Flask request be blocked?
One of my Flask endpoints launches a long processing subtask, and in some cases, instead of doing the work asynchronously, it is possible to wait for the task to complete and return the result in the same request.
In this case, my Flask app starts the process, then waits for it to complete before returning the result. My issue here, is that while calling something like (simplified):
while True:
    if process_is_done():
        break
    time.sleep(1)
Will Flask block that request until it is done, or will it allow other requests to be handled in the meantime?
Yes, that request is entirely blocked. time.sleep() does not inform anything of the sleep; it simply suspends the current thread for the duration.
Flask itself is not asynchronous; it has no concept of putting a request handler on hold and giving other requests more time. A good WSGI server will use threads and/or multiple worker processes to achieve concurrency, but this one request is blocked and tying up a worker all the same.
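To make that concrete, here is a minimal sketch of the blocking pattern (process_is_done is the question's own placeholder):

import time
from flask import Flask

app = Flask(__name__)

@app.route("/wait")
def wait_for_task():
    # This loop holds one worker thread for the whole duration; that
    # thread can serve no other request until the handler returns.
    while not process_is_done():
        time.sleep(1)
    return "done"

Concurrency then comes from the WSGI server, not from Flask: for example, gunicorn --workers 4 --threads 8 app:app keeps other requests flowing while one handler sleeps.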

Scraping website using Celery

Currently, my structure is Flask, Redis, RabbitMQ and Celery. In my scraping, I am using requests and BeautifulSoup.
My Flask app runs on Apache with WSGI in production, with app.run(threaded=True).
I have 25 APIs. 10 of them scrape the URL (headers, etc.), and the rest use a 3rd-party API for that URL.
I am using chord for processing my APIs and getting data from the APIs using requests.
For my chord header I have 3 workers, while on my callback I only have 1.
I am hitting a bottleneck with ConnectTimeoutError and MaxRetryError. Threads I've read suggest adding a timeout to every process, because these errors mean you are overloading the remote server.
The problem is that since I am using a chord, there is no sense in using a time sleep, because the 25 API calls will run at the same time. Has anyone encountered this? Or am I doing this wrong?
The threads I read seem to suggest switching from requests to pycurl, or using Scrapy, but I don't think that's the issue here, since ConnectTimeoutError indicates my host is overloading a specific URL's server.
My chord process:
callback = create_document.s(url, company_logo, api_list)
header = [api_request.s(key) for key in api_list.keys()]
result = chord(header)(callback)
In api_request task requests is used.
If you want to limit the number of scrapes running at the same time, you can create an enqueue task that checks whether another task with the same properties is already running. If one is, have it sleep for a few seconds and check again; once nothing matching is running, queue the task you actually want to run. This lets you combine sleeps with asynchronous tasks. You can even count the running tasks and only start more while fewer than a certain number are active: run 5 at a time, see if that is throttled enough, and queue another as soon as one finishes (see the sketch after the documentation link below).
::EDIT::
Documentation for Celery Inspect
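A rough sketch of that gatekeeper idea (assuming app is your Celery instance and api_request is the scrape task from the question; the 5-at-a-time limit is arbitrary):

from celery import current_app

def count_active(task_name):
    # Use Celery's inspect API to count tasks currently executing on workers.
    active = current_app.control.inspect().active() or {}
    return sum(1 for tasks in active.values()
                 for t in tasks if t["name"] == task_name)

@app.task(bind=True, max_retries=None)
def enqueue_scrape(self, url):
    # Hypothetical gatekeeper: only let 5 scrapes run at once; otherwise
    # re-check in a few seconds instead of hammering the remote server.
    if count_active("tasks.api_request") >= 5:
        raise self.retry(countdown=5)
    api_request.delay(url)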

How do I return an HTTP response in a callback in Flask, or does it even matter?

I am familiar with evented servers but not threaded ones. A common feature of REST APIs implemented in evented systems like node.js+express or Tornado is, in a handler, to do some I/O work asynchronously in a callback, then return the actual HTTP response from the callback. In Express, we have things like:
app.post('/products', function (req, res) {
    Product.create(req.body, function (err, product) {
        if (err) return res.json(400, err);
        res.send(201, product);
    });
});
where Product.create hits the database and calls the callback after the product is persisted. The response, whether 201 or 400, is sent in the callback. This keeps the server free to do other things while the database is working, but from the point of view of the client, the request just seems to take a while.
Suppose I wanted to do the same thing in Flask (which is not an evented server). If I have a POST handler to create several objects that needs to make several database writes that could take several seconds to complete, it seems I have two choices:
I could immediately return a 202 ACCEPTED but then this burdens the client with having to check back to see whether all the writes were committed.
I could just implement all the database writes directly inside the handler. The client will have to wait the few seconds for the reply, but it's synchronous and simple from the client perspective.
My question is: if I do #2, will Flask block only the current request's thread, so that other requests can still be handled during the database writes? I would hope the server doesn't block entirely.
BTW, I have done long polling before, but this is for a public REST API where clients expect simple requests and responses, so I think either approach 1 or 2 is best. Option 1 seems rare to me, and I am worried about #2 blocking the server. Am I right to be concerned, or are Flask (and threaded servers) smart enough that I need not worry?
Blocking vs. non-blocking
Flask itself (much like Express) is not inherently blocking or non-blocking; it relies on the underlying container to provide the features necessary for operation (reading data from the user and writing responses to the user). If the server does not provide an event loop (e.g. mod_wsgi) then Flask will block. If the server is a non-blocking one (e.g. gunicorn with async workers) then Flask will not block.
On the other end of things, if the code that you write in your handlers is blocking Flask will block, even if it is run on a non-blocking container.
Consider the following:
app.post('/products', function (req, res) {
    var response = Product.createSync(req.body);
    // Event loop is blocked until the product is created
    if (response.isError) return res.json(400, response.error);
    res.send(201, response.product);
});
If you run that on a node server you will quickly bring everything to a screeching halt. Even though node itself is non-blocking your code is not and it blocks the event loop preventing you from handling any other request from this node until the loop is yielded at res.json or res.send. Node's ecosystem makes it easy to find non-blocking IO libraries - in most other common environments you have to make a conscious choice to use non-blocking libraries for the IO you need to do.
Threaded servers and how they work
Most non-evented containers use multiple threads to manage the workload of a concurrent system. The container accepts requests in the main thread and then farms off the handling of the request and the serving of the response to one of its worker threads. The worker thread executes the (most often blocking) code necessary to handle the request and generate a response. While the handling code is running that thread is blocked and cannot take on any other work. If the request rate exceeds the total thread pool count then clients start backing up, waiting for a thread to complete.
What's the best thing to do with a long-running request in a threaded environment?
Knowing that blocking IO ties up one of your workers, the question becomes: "how many concurrent users are you expecting to have?" (where "concurrent" means "arriving over the span of time it takes to accept and process one request"). If the answer is "fewer than the total number of threads in my worker thread pool", then you are golden: your server can handle the load, and its blocking nature is in no way a threat to stability. Choosing between #1 and #2 is largely a matter of taste.
On the other hand, if the answer is "more than the total number of workers in my thread pool", then you will need to handle requests by passing the user's data off to another worker pool (generally via a queue of some kind) and responding with a 202 (option #1 in your list). That lowers the per-request response time, which in turn lets you handle more users.
TL;DR
Flask is not blocking or non-blocking as it does no direct IO
Threaded servers block on the request / response handling thread, not the accept request thread
Depending on the expected traffic, you will almost certainly want to go with option #1 (return a 202 and push the work onto a queue handled by a different thread pool or an evented solution).
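A minimal sketch of option #1 in Flask (the in-memory results dict and the endpoint names are illustrative; a real deployment would use a durable store and a proper task queue):

import uuid
from concurrent.futures import ThreadPoolExecutor
from flask import Flask, jsonify, request

app = Flask(__name__)
executor = ThreadPoolExecutor(max_workers=4)  # separate pool for the slow writes
results = {}  # illustrative only; use Redis or a database in production

def persist_products(job_id, payload):
    # ... the slow database writes happen here ...
    results[job_id] = "done"

@app.route("/products", methods=["POST"])
def create_products():
    job_id = str(uuid.uuid4())
    results[job_id] = "pending"
    executor.submit(persist_products, job_id, request.get_json())
    # 202 Accepted: the client polls the status endpoint for completion.
    return jsonify(job=job_id), 202

@app.route("/products/status/<job_id>")
def job_status(job_id):
    return jsonify(status=results.get(job_id, "unknown"))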

AppEngine Timeout with Task Queues

I'm trying to execute a task in AppEngine through the task queues, but I still seem to be hitting a 60-second timeout. I'm unsure what I'm doing incorrectly, as the limit should be 10 minutes, as advertised.
I have a call to urlfetch.fetch() that appears to be the culprit. My call is:
urlfetch.fetch(url, payload=query_data, method=method, deadline=300)
The tail end of my stack trace shows the method that triggers the url fetch call right before the DeadlineExceededError:
File "/base/data/home/apps/s~mips-conversion-scheduler/000-11.371629749593131630/views.py", line 81, in _get_mips_updated_data
policies_changed = InquiryClient().get_changed_policies(company_id, initial=initial).json()
When I look at the task queue information it shows:
Method/URL: POST /tasks/queue-initial-load
Dispatched time (UTC): 2013/11/14 15:18:49
Seconds late: 0.18
Seconds to process task: 59.90
Last http response code: 500
Reason to retry: AppError
My View that processes the task looks like:
class QueueInitialLoad(webapp2.RequestHandler):
    def post(self):
        company = self.request.get("company")
        if company:
            company_id = self.request.get("company")
            queue_policy_load(company_id, queue_name="initialLoad", initial=True)
with the queue_policy_load being the method that triggers the urlfetch call.
Is there something obvious I'm missing that limits me to the 60-second timeout instead of 10 minutes?
Might be a little too general, but here are some thoughts that might help close the loop. There are 2 kinds of task queues, push queues and pull queues. Push queue tasks execute automatically, and they are only available to your App Engine app. On the other hand, pull queue tasks wait to be leased, are available to workers outside the app, and can be batched.
If you want to configure your queue, you can do it in the queue config file. In Java, that happens in the queue.xml file, and in Python that happens in the queue.yaml file. In terms of push queues specifically, push queue tasks are processed by handlers (URLs) as POST requests. They:
Are executed ASAP
May cause new instances (Frontend or Backend)
Have a task duration limit of 10 minutes
But, they have an unlimited duration if the tasks are run on the backend
Here is a quick Python code example showing how you can add tasks to a named push queue. Have a look at the Google developers page for Task Queues if you need more information: https://developers.google.com/appengine/docs/python/taskqueue/
Adding Tasks to a Named Push Queue:
from google.appengine.api import taskqueue

queue = taskqueue.Queue("Qname")
task = taskqueue.Task(url='/handler', params=args)  # POSTs to /handler
queue.add(task)
On the other hand, let's say that you wanted to use a pull queue. You could add tasks in Python to a pull queue using the following:
queue = taskqueue.Queue("Qname")
task = taskqueue.Task(payload=load, method='PULL')  # load is your serialized data
queue.add(task)
You can then lease these tasks out using the following approach in Python:
queue = taskqueue.Queue("Qname")
tasks = queue.lease_tasks(lease_seconds, max_tasks)  # e.g. lease_tasks(3600, 100)
Remember that with pull queues, a leased task that isn't deleted becomes available again when its lease expires, so a failed attempt is effectively retried until a worker processes and deletes it, as in the sketch below.
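A small sketch of that lease-and-delete cycle (process is a stand-in for your real handler):

from google.appengine.api import taskqueue

queue = taskqueue.Queue("Qname")
# Lease up to 100 tasks for an hour; delete each one only after it has
# been processed, otherwise it becomes leasable again when the lease ends.
tasks = queue.lease_tasks(3600, 100)
for task in tasks:
    process(task.payload)  # stand-in for your real processing
    queue.delete_tasks(task)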
Hope that helps in terms of providing a general perspective!
Task queues have a 10-minute deadline, but a URLFetch call has a 1-minute maximum deadline:
maximum deadline (request handler) 60 seconds
UPDATE: the intended behaviour is a maximum URLFetch deadline of 10 minutes when running in a task queue; see this bug.
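Until that is resolved, you can at least request the longer deadline explicitly. A small sketch reusing the question's own call (whether the full 10 minutes is honoured depends on where the fetch runs):

from google.appengine.api import urlfetch

# Raise the default deadline for subsequent urlfetch.fetch() calls in this
# request; task-queue requests themselves allow up to 10 minutes.
urlfetch.set_default_fetch_deadline(600)

result = urlfetch.fetch(url, payload=query_data, method=method)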
As GAE has evolved, this answer reflects the present state of things, where the old idea of "backend" instances is deprecated. GAE apps can now be configured as services (aka modules) and run with a manual scaling policy, which allows longer timeouts to be set. If you run your app with an autoscaling policy, urlfetch calls are capped at 60 seconds and queued tasks at 10 minutes:
https://cloud.google.com/appengine/docs/python/an-overview-of-app-engine
