Getting statistics on an individual task from Locust - python

I have a Locust task that requires many HTTP requests. At the end of the task, I have conditions to check for success or failure. The statistics gathered are very informative with regard to the individual HTTP requests, but I would like to know more about each invocation of the task itself: for example, how long it took to run the function, whether it completed successfully, etc.
I can't find a good way to do this. It seems like each HTTP request makes a log entry, but I don't know how to manually create one. Can anyone give me some guidance?

You can create an entry manually by triggering the request_success event.
from locust import events

events.request_success.fire(
    request_type="task",
    name="my_task",
    response_time=1337,
    response_length=0,
)
You could also create a decorator that automatically fires the above event and tracks the execution time for the tasks that it's applied to.
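For example, here is a minimal sketch of such a decorator, built on the same events.request_success hook shown above (the decorator name track_task and the millisecond conversion are illustrative choices, not part of Locust itself):

import time
from functools import wraps

from locust import events

def track_task(name):
    # Illustrative decorator: report each call of the wrapped task as a Locust statistics entry.
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = func(*args, **kwargs)
            events.request_success.fire(
                request_type="task",
                name=name,
                response_time=int((time.time() - start) * 1000),  # Locust expects milliseconds
                response_length=0,
            )
            return result
        return wrapper
    return decorator

Applying @track_task("my_task") to a task function then produces one entry per invocation; a failed run could be reported in the same way through the companion request_failure event.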

Related

Queue in django/python

I have developed a Django REST API. I send a request/data to it to perform a task and it does so nicely. I can also send it multiple requests to perform the task on each of them. The issue is that the server where the task gets performed has limited memory, and I need to perform these tasks one by one. So I am thinking of adding a queue system to the Django pipeline that can hold requests until the task at the front of the queue is done.
I am not sure if I am on the right path, and not sure if Celery is the option to solve my issue.
It seems like a simple task, and I don't understand whether Celery is what I need. Can you point me to what I should be looking at?
If you want to keep your API as is, then Celery will not help you. It's a good idea to keep API calls as short as possible. If some longer job is done during an API call (sending emails, for example), then you are better off using Celery. But then the only thing you can get as a response from your API is that the task was queued.
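As a rough illustration of that pattern with Django and Celery (the module layout, perform_task, and the view are assumptions for the sketch, not part of the original answer):

# tasks.py
from celery import shared_task

@shared_task
def perform_task(data):
    # the slow, memory-hungry work runs here, outside the request/response cycle
    ...

# views.py
from django.http import JsonResponse
from .tasks import perform_task

def submit(request):
    perform_task.delay(request.POST.dict())  # enqueue the work and return immediately
    return JsonResponse({"status": "queued"}, status=202)

Starting the worker with celery -A proj worker --concurrency=1 makes it process the queued tasks one at a time, which addresses the memory constraint without changing what the API returns.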

Requeuing a successful task in Celery

I have a certain type of task that does something that I would like refreshed a few minutes after it originally ran, if a certain condition is met.
As far as I can see, there's no way to rerun a task that has previously run, since the information about the task request (args, kwargs, priority, ...) is not saved anywhere.
I can see that it appears in Flower, but I assume that's because it uses Celery events.
Is there any way to accomplish what I want? I could add a post-task hook which saves the request info, but that seems a bit off.
I'm using RabbitMQ as the broker and MongoDB as the results backend.
As per the docs, apply_async has a countdown option allowing you to delay the execution for a certain number of seconds.
You could just make a recursive task:
from celery import Celery

app = Celery("tasks")  # broker/backend configuration omitted; the question uses RabbitMQ and MongoDB

@app.task
def my_task(an_arg):
    # do the work, then re-queue this task to run again in two minutes
    my_task.apply_async(countdown=120, kwargs={"an_arg": an_arg})

Running functions automatically when certain criteria are met, without user interaction

I am using Flask.
I am currently using a fabfile to check which users should get a bill and I set up a cron job to run the fabfile every morning at 5am. This automatically creates bills in Stripe and in my database and sends out emails to the users to inform them. This could be used for birthday reminders or anything else similar.
Is setting up a cronjob the standard way of doing this sort of thing? Is there a better way/standard?
I would define "this sort of thing" as anything that needs to happen automatically in the app when certain criteria are met, without a user interacting with said app.
I could not find much when I googled this.
Using cron is in effect the most straightforward way of doing it. However, there are other kinds of services that trigger tasks on a periodic basis and offer some additional control, for instance Celery's scheduler. There seems to be a tutorial about building periodic tasks with Celery here.
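For reference, a minimal sketch of what a Celery beat schedule for the 5am billing run could look like (the app name, broker URL, and task body are assumptions):

from celery import Celery
from celery.schedules import crontab

app = Celery("billing", broker="redis://localhost:6379/0")  # assumed broker

@app.task
def send_due_bills():
    # check which users should get a bill, create the Stripe charges, send the emails
    ...

app.conf.beat_schedule = {
    "bill-users-every-morning": {
        "task": "billing.send_due_bills",  # adjust to the real module path of the task
        "schedule": crontab(hour=5, minute=0),
    },
}

Running celery -A billing beat alongside a worker then takes over the role of the cron job.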
What I think you have to ask yourself is:
Is a cron job the most reliable way of billing your customers?
I've written small/simple apps that use an internal timer, e.g. https://bitbucket.org/prologic/irclogger, which rotates its IRC log files once per day. Is this any better or more reliable? Not really: if the daemon/bot were to die prematurely or the system were to crash, what happens then? In this case it just gets started again and the logs continue to rotate at the next "day" interval.
I think two things are important here:
Reliability
Robustness

Better ways to handle AppEngine requests that time out?

Sometimes, with requests that do a lot, Google AppEngine returns an error. I have been handling this by some trickery: memcaching intermediate processed data and just requesting the page again. This often works because the memcached data does not have to be recalculated and the request finishes in time.
However... this hack requires seeing an error, going back, and clicking again. Obviously less than ideal.
Any suggestions?
inb4: "optimize your process better", "split your page into sub-processes", and "use taskqueue".
Thanks for any thoughts.
Edit - To clarify:
Long wait for requests is ok because the function is administrative. I'm basically looking to run a data-mining function. I'm searching over my datastore and modifying a bunch of objects. I think the correct answer is that AppEngine may not be the right tool for this. I should be exporting the data to a computer where I can run functions like this on my own. It seems AppEngine is really intended for serving with lighter processing demands. Maybe the quota/pricing model should offer the option to increase processing timeouts and charge extra.
If interactive user requests are hitting the 30 second deadline, you have bigger problems: your user has almost certainly given up and left anyway.
What you can do depends on what your code is doing. There's a lot to be optimized by batching datastore operations, or reducing them by changing how you model your data; you can offload work to the Task Queue; for URLFetches, you can execute them in parallel. Tell us more about what you're doing and we may be able to provide more concrete suggestions.
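Since the clarified goal is an administrative data-mining pass over the datastore, one common option is to hand the work to the Task Queue, where it runs under the longer task deadline rather than the interactive one. A sketch under that assumption, using the Python 2.7 runtime's deferred library (the handler and reprocess_entities are illustrative names):

import webapp2
from google.appengine.ext import deferred

def reprocess_entities(cursor=None):
    # fetch a batch of entities starting at `cursor`, modify them, and
    # re-defer with the next cursor until the whole datastore has been processed
    ...

class AdminHandler(webapp2.RequestHandler):
    def get(self):
        deferred.defer(reprocess_entities)  # the work now runs outside the user-facing request
        self.response.write("Reprocessing job queued")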
I have been handling something similar by building a custom automatic retry dispatcher on the client. Whenever an AJAX call to the server fails, the client retries it.
This works very well if your page is AJAX-heavy. If your app serves entire HTML pages, you can use a two-pass process: first send an empty page containing only an AJAX request. Then, when AppEngine receives that AJAX request, it outputs the same HTML you had before. If the AJAX call succeeds, it fills the DOM with the result. If it fails, it retries once.

Stopping long-running requests in Pylons

I'm working on an application using Pylons and I was wondering if there was a way to make sure it doesn't spend way too much time handling one request. That is, I would like to find a way to put a timer on each request such that when too much time elapses, the request just stops (and possibly returns some kind of error).
The application is supposed to allow users to run some complex calculations but I would like to make sure that if a calculation starts taking too much time, we stop it to allow other calculations to take place.
Rather than terminate a request with an error, a better approach might be to perform long-running calculations in a separate thread (or threads) or process (or processes):
When the calculation request is received, it is added to a queue and identified with a unique id. You redirect to a results page referencing the unique ID, which can have a "Please wait, calculating" message and a refresh button (or auto-refresh via a meta tag).
The thread or process which does the calculation pops requests from the queue, updates the final result (and perhaps progress information too), which the results page handler will present to the user when refreshed.
When the calculation is complete, the returned refresh page will have no refresh button or refresh tag, but just show the final result.
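A bare-bones, framework-agnostic sketch of that queue-plus-worker arrangement (the in-memory result store and run_calculation are stand-ins for illustration):

import queue
import threading
import uuid

jobs = queue.Queue()
results = {}  # job_id -> result; use a database or cache in a real deployment

def run_calculation(payload):
    # placeholder for the complex calculation
    ...

def worker():
    while True:
        job_id, payload = jobs.get()
        results[job_id] = run_calculation(payload)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(payload):
    # called from the request handler: enqueue the job and return the ID to redirect to
    job_id = str(uuid.uuid4())
    jobs.put((job_id, payload))
    return job_id

def poll(job_id):
    # called from the results page: None means "still calculating", so keep refreshing
    return results.get(job_id)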
