I'm creating Celery tasks in a situation where there are more task producers than consumers (workers). Since my queues are filling up and the workers consume in FCFS order, can I execute a specific task (given its task_id) immediately?
For example:
My tasks are filled in the following fashion: [1,2,3,4,5,6,7,8,9,0]. Tasks are fetched from the zeroth index. Now a situation arises where I want to execute task 8 before all the others. How can I do this?
A worker need not execute that task (there can be a situation where every worker is already occupied); it could be run directly from the application. And when the task is completed (either by a worker or directly by the application), it should be deleted from the queue.
I know how to forcefully revoke a task (given a task_id), but how can I execute a task given an id?
how can I execute a task given an id?
The short answer is: you can't. Celery workers pull tasks off the broker as they become available.
Why not?
Note that this is not a limitation of Celery as such; rather, it is a characteristic of message queuing systems (MQS) in general. The point of an MQS is to decouple an application's components so that the producer can go on to do other work while workers execute the tasks asynchronously. In other words, once a task has been sent off it cannot be modified (though it can be removed, as long as it has not started yet).
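For instance, removing a not-yet-started task by id is just a revoke call; a minimal sketch, assuming an app instance named app, a made-up broker URL and a made-up task id:

    from celery import Celery

    app = Celery("proj", broker="redis://localhost:6379/0")  # hypothetical broker URL

    # Revoke only works before a worker has started the task; the worker
    # discards the message when it reaches it.
    app.control.revoke("4f8e9a2c-0000-0000-0000-hypothetical")  # hypothetical task id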
What options are there?
Celery offers several options for dealing with lower- vs. higher-priority or short- vs. long-running tasks at task submission time (a short sketch of the first three options follows the list):
Routing - tasks can be routed to different workers. So if your tasks [0 .. 9] are all long-running, except for task 8, you could route task 8 to a worker, or a set of workers, that deal with short-running tasks.
Timed execution - specify a countdown or estimated time of arrival (eta) for each task. That's a good option if you know that some tasks can be delayed for later execution, i.e. when the system will be less busy. This leaves workers ready for those tasks that need to be executed immediately.
Task expiry - specify an expiry countdown or time, with a callback. This way the task will be revoked if it didn't execute within the time allotted to it, and the callback can start an alternative course of action.
Check on task results periodically, revoke a task if it didn't start executing within some time. Note this is different from task expiry where the revoking only happens once a worker has fetched the task from the queue - if the queue is full the revoking may happen too late for your use case. Checking results periodically means you have another component in your system that does this and determines an alternate course of action.
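A rough sketch of the first three options at submission time; my_task, fallback_task, and the queue name are hypothetical:

    # Routing: send this invocation to a queue served by short-task workers.
    my_task.apply_async(queue="short_tasks")

    # Timed execution: don't run before 10 minutes from now.
    my_task.apply_async(countdown=600)

    # Task expiry: revoke the task if no worker started it within 60 seconds,
    # letting an error callback start an alternative course of action.
    my_task.apply_async(expires=60, link_error=fallback_task.s())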
Related
I'm working on a new monitoring system that can measure Celery queue throughput and help alert the team when the queue is getting backed up. In the course of this work, I've come across some peculiar behaviors that I don't understand (and that are not well documented in the Celery docs).
For testing purposes, I've set up an endpoint that will populate the queue with 16 long-running tasks that can be used to simulate a backed-up queue. The framework is Flask and the queue broker is Redis. Celery is configured so that each worker works on up to 4 tasks in parallel, and I have 2 workers running.
api/health.py
from flask import Blueprint, make_response

from jobs import sleepy_job


def health():
    health = Blueprint("health", __name__)

    @health.route("/api/debug/create-long-queue", methods=["GET"])
    def long_queue():
        for i in range(16):
            sleepy_job.delay()
        return make_response({}, 200)

    return health
jobs.py
import time

# `celery` (the app instance) and HIGH_PRIORITY are defined elsewhere.
@celery.task(priority=HIGH_PRIORITY)
def sleepy_job(*args, **kwargs):
    time.sleep(30)
Here's what I do to simulate a backed-up production queue:
I call /api/debug/create-long-queue to simulate a backlog in my queue. Based on the math above, the workers should be busy for about 1 minute (together they can handle 8 tasks concurrently; each task sleeps for 30 seconds, and there are 16 tasks total).
I make another API call shortly after (< 5 s), which kicks off a different job with real business logic (processing of an inbound webhook API call). We'll call this job handle_incoming_message.
Here's what I see using Flower to inspect the queue:
While all workers are blocked by the first 8 sleepy_job tasks, I see no sign of the new handle_incoming_message on the queue, even though I am certain handle_incoming_message.delay() has been called as a result of the 2nd API call.
After the first 8 sleepy_job tasks have been completed (~30 s), I see the new handle_incoming_message on the queue with state RECEIVED.
After the second (and final) 8 sleepy_job tasks have been completed, I now see handle_incoming_message has state STARTED (and I can confirm this as the UI updates with the new data that was received and processed in that task.)
Questions
So it seems clear that when the workers are momentarily unblocked after handling the first 8 sleepy_job tasks, they are doing something to mark/acknowledge the new handle_incoming_message task in a way that is visible to flower. But this leaves several unanswered questions:
What is the state of the new handle_incoming_message task when the workers are blocked?
What changes after workers are unblocked that makes it so flower now has visibility into the new handle_incoming_message task?
What does the "RECEIVED" state actually mean?
(Bonus: How can I get visibility into tasks that are queued while workers are blocked?)
When all worker processes are blocked, SOME tasks can still be in the RECEIVED state because of prefetching (see the documentation for that). So chances are very high that your task is simply in the queue, waiting to be received by a Celery worker (the coordinating process - not an actual worker process).
Flower is a simple service that is built upon a Celery feature called "task events". In simple terms, it (Flower) subscribes itself as a receiver of all events (received, succeeded, started, failed, etc.) and then visually represents them to web clients. More about it here. So when a task gets received by a Celery worker, a "task-received" event is sent. Flower picks up this event and changes the state of that task in the dashboard.
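If events never show up at all, they may simply be switched off; a minimal sketch, assuming an app instance named app:

    # Workers emit task-received, task-started, task-succeeded, ... events.
    app.conf.worker_send_task_events = True
    # Producers additionally emit task-sent when .delay()/.apply_async() runs,
    # which is the only event available before any worker picks the task up.
    app.conf.task_send_sent_event = True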
When a task is "received", it means that the particular Celery worker took the task off the queue and it may be executed immediately (if there is a free worker process to execute it), or the Celery worker will wait for a worker process to become ready to run it. I have already mentioned prefetching - Celery workers will often take more tasks than there are available worker processes.
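To make Flower's view line up more closely with actual execution, prefetching can be dialed down; a sketch, assuming an app instance named app:

    # Each worker process reserves at most one task beyond what it is running.
    app.conf.worker_prefetch_multiplier = 1
    # Acknowledge (and thus dequeue) tasks after completion, not on receipt.
    app.conf.task_acks_late = True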
Celery does not give users a way to list what is in a particular queue. That is why you will see many similar questions - including this one, which offers answers. You will see my short answer there among others. In short, it depends on your broker of choice. If it is Redis, then you simply go through the list of objects. If it is RabbitMQ, then you can use their tools to inspect queues. I think the decision not to provide this is a good one, as this information is never reliable: by the time you list all the tasks in a particular queue, there may be thousands of new ones...
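For the Redis case (and the bonus question above), this amounts to reading the list that backs the queue; a rough sketch, assuming the default queue name "celery" and message protocol 2:

    import json

    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)
    print(r.llen("celery"))  # number of messages waiting in the default queue
    for raw in r.lrange("celery", 0, -1):
        message = json.loads(raw)
        # With message protocol 2 the task name travels in the headers.
        print(message["headers"]["task"])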
Celery will send tasks to idle workers.
I have a task that runs every 5 seconds, and I want this task to be sent to only one specific worker.
Other tasks can share the remaining workers.
Can Celery do this?
And I want to know what this parameter is: CELERY_TASK_RESULT_EXPIRES
Does it mean that the task will not be sent to a worker in the queue?
Or does it stop the task if it runs too long?
Sure, you can. The best way to do it is to separate Celery workers using different queues. You just need to make sure that the task you care about goes to a separate queue, and that a worker is listening to that particular queue.
The long story is here: http://docs.celeryproject.org/en/latest/userguide/routing.html
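A minimal sketch with the old-style setting names used elsewhere in this question; the module, task, and queue names are made up:

    # celeryconfig.py: route the periodic task to its own queue.
    CELERY_ROUTES = {
        "tasks.every_five_seconds": {"queue": "dedicated"},
    }

    # Then start one worker that consumes only that queue, e.g.:
    #   celery -A proj worker -Q dedicated --concurrency=1
    # and let the other workers consume the default queue:
    #   celery -A proj worker -Q celery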
Just to answer your second question: CELERY_TASK_RESULT_EXPIRES is the time in seconds that the result of a task is persisted. After a task finishes, its result is saved in your result backend and kept there for the amount of time specified by that parameter. That is useful when a task's result might be accessed by different callers.
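For example, to keep results around for one hour before the backend's periodic cleanup removes them:

    # Old-style name; newer Celery versions call this `result_expires`.
    CELERY_TASK_RESULT_EXPIRES = 3600  # seconds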
So it probably has nothing to do with your problem. As for the first question, as already stated, you have to use multiple queues. However, be aware that you cannot assign a task to a specific worker process, only to a specific worker, which will then hand it to one of its worker processes.
I'm working on a Python-based system for enqueuing long-running tasks to workers.
The tasks originate from an outside service that generates a "token", but once they're created based on that token, they should run continuously, and be stopped only when explicitly removed by code.
The task starts a WebSocket and loops on it. If the socket is closed, it reopens it. Basically, the task should never run to completion.
My goals in architecting this solution are:
When gracefully restarting a worker (for example to load new code), the task should be re-added to the queue, and picked up by some worker.
The same should happen when an ungraceful shutdown occurs.
2 workers shouldn't work on the same token.
Other processes may create more tasks that should be directed to the same worker that's handling a specific token. This will be resolved by sending those tasks to a queue named after the token, which the worker should start listening to after starting the token's task. I'm listing this requirement as an explanation of why a task engine is even required here.
Independent servers, fast code reload, etc. - Minimal downtime per task.
All our server side is Python, and it looks like Celery is the best platform for it.
Are we using the right technology here? Any other architectural choices we should consider?
Thanks for your help!
According to the docs
When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates, so if these tasks are important you should wait for it to finish before doing anything drastic (like sending the KILL signal).
If the worker won’t shutdown after considerate time, for example because of tasks stuck in an infinite-loop, you can use the KILL signal to force terminate the worker, but be aware that currently executing tasks will be lost (unless the tasks have the acks_late option set).
You may get something like what you want by using retry or acks_late.
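A rough sketch of the acks_late route, assuming an app instance named app and a hypothetical run_socket helper that blocks while looping on the WebSocket:

    @app.task(bind=True, acks_late=True, max_retries=None)
    def websocket_task(self, token):
        try:
            run_socket(token)  # hypothetical: loops on the socket, rarely returns
        except ConnectionError as exc:
            # Socket closed unexpectedly: retry the task to reopen it.
            raise self.retry(exc=exc, countdown=5)

    # With acks_late=True an ungraceful shutdown leaves the message
    # unacknowledged, so the broker redelivers it to another worker.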
Overall I reckon you'll need to implement some extra application-side job control, plus, maybe, a lock service.
But, yes, overall you can do this with celery. Whether there are better technologies... that's out of the scope of this site.
I'm using Celery 3.1.x with 2 tasks. The first task (TaskOne) is enqueued when Celery starts up through the celeryd_after_setup signal:
from celery.signals import celeryd_after_setup


@celeryd_after_setup.connect
def celeryd_after_setup(*args, **kwargs):
    TaskOne().apply_async(countdown=5)
When TaskOne is run, it does some calculations and then enqueues TaskTwo. Imagine the following workflow:
I start celery, thus the signal is fired and TaskOne is enqueued
after the countdown (5 s) TaskOne runs and TaskTwo is enqueued
then I stop Celery (TaskTwo remains in the queue)
afterwards I restart celery
the workflow is run again and TaskTwo is enqueued again
So we have 2 TaskTwo in the queue. That is a problem for my workflow because I only want one TaskTwo in the queue and want to avoid a second one being enqueued.
My question: How can I achieve this?
With celery.app.control.Inspect.scheduled() (Docs) I can get a list of which tasks are scheduled, hidden in a combination of lists and dicts. This may be a way, but digging through its result does not feel right. Is there a better way?
An easy-to-implement solution would be to add the --purge switch to your worker command. It will clear the queue, and the worker starts with no scheduled jobs.
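For example, assuming an app module named proj:

    celery -A proj worker --purge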
But beware: that is a global, unrecoverable action. If there are other queued jobs you depend on, this is not your solution.
After considering several options I chose to use app.control.inspect.
It's not a really beautiful solution, but it works:
import logging

# Celery 3.1 location of inspect; TaskTwo is imported from wherever it lives.
from celery.task.control import inspect

logger = logging.getLogger(__name__)


def task_two_is_scheduled():
    # fetch all scheduled tasks
    scheduled_tasks = inspect().scheduled()
    # iterate the scheduled task values, see http://docs.celeryproject.org/en/latest/userguide/workers.html?highlight=revoke#dump-of-scheduled-eta-tasks
    for task_values in iter(scheduled_tasks.values()):
        # task_values is a list of dicts
        for task in task_values:
            if task['request']['name'] == '{}.{}'.format(TaskTwo.__module__, TaskTwo.__name__):
                logger.info('TaskTwo is already scheduled, skipping additional run')
                return True
    return False
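A check like this can then be called wherever TaskTwo would be enqueued (for example from TaskOne), skipping the apply_async call when it returns True.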
I'm running Django, Celery and RabbitMQ. What I'm trying to achieve is to ensure that tasks related to one user are executed in order (specifically, one at a time; I don't want task concurrency per user).
Whenever a new task is added for a user, it should depend on the most recently added task. Additional functionality might include not adding a task to the queue if a task of this type is already queued for this user and has not yet started.
I've done some research and:
I couldn't find a way to link a newly created task with an already queued one in Celery itself; chains seem to be able to link only new tasks.
I think both functionalities could be implemented with a custom RabbitMQ message handler, though that might be hard to code.
I've also read about celery-tasktree, and this might be the easiest way to ensure execution order, but how do I link a new task with an already applied_async task tree or queue? Is there any way I could implement the additional no-duplicate functionality using this package?
Edit: There is also this "lock" example in the Celery cookbook. While the concept is fine, I can't see a way to make it work as intended in my case: if I can't acquire the lock for a user, the task would have to be retried, but that means pushing it to the end of the queue.
What would be the best course of action here?
If you configure the Celery workers so that they can only execute one task at a time (see the worker_concurrency setting), then you can enforce the concurrency that you need on a per-user basis. Using a method like
NUMBER_OF_CELERY_WORKERS = 10


def get_task_queue_for_user(user):
    return "user_queue_{}".format(user.id % NUMBER_OF_CELERY_WORKERS)
to get the task queue based on the user id, every task for a given user will be assigned to the same queue. Each worker would need to be configured to consume tasks from a single queue only.
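Dispatch would then look something like this; process_user_event is a hypothetical task:

    def enqueue_for_user(user, payload):
        # All of this user's tasks land on the same queue, so they run in
        # FIFO order, one at a time.
        process_user_event.apply_async(
            args=[payload],
            queue=get_task_queue_for_user(user),
        )

    # Each queue is consumed by exactly one single-process worker, e.g.:
    #   celery -A proj worker -Q user_queue_9 --concurrency=1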
It would play out like this:
User 49 triggers a task
The task is sent to user_queue_9
When the one and only celery worker that is listening to user_queue_9 is ready to consume a new task, the task is executed
This is a hacky answer though, because
requiring just a single celery worker for each queue is a brittle system -- if the celery worker stops, the whole queue stops
the workers are running inefficiently