Celery resume consuming from a queue - python

I have several Celery queues, and at a certain point I want workers to stop consuming from them:
celery_app.control.cancel_consumer(consumer_queue)
Later I want to resume the consumers, which I do with the following command:
celery.control.add_consumer(
    consumer_queue,
    routing_key=consumer_queue,
    destination=['worker-name'],
)
At this point I expect worker-name to start fetching tasks from consumer_queue, to which my custom router directs messages by routing key. Instead, I get this output from celery inspect:
celery.control.inspect().active_queues()
{'celery@worker-name': []}
Some details
Celery: celery==3.1.23
Kombu: kombu==3.0.35
billiard: billiard==3.3.0.23
Note: adding the consumer via Celery Flower (flower==0.8.4) works, even though the command is the same.
What am I doing wrong, and how do I re-enable consuming properly?

OK, it was a premature question with a simple solution: I provided the wrong worker name. Instead of worker-name, I should use the full celery@worker-name identifier.
For debugging it's also useful to pass the reply=True argument:
response = celery.control.add_consumer(
    consumer_queue,
    routing_key=consumer_queue,
    destination=['celery@{}'.format(consumer)],
    reply=True,
)
print(response)
and you'll see whether the operation succeeded:
[{u'celery@worker-name': {u'ok': u'add consumer consumer-queue'}}]

Related

Celery chain stops in the middle without any reason (AWS Redis db, RabbitMQ broker)

I found strange behavior when implementing a task pipeline on Celery. Most of the time the task chain executes fully, but sometimes it just stops silently in the middle after a successful run of the previous task.
Example pipeline:
pipelines = []
for task in Factory.gen_tasks(request):
    pipelines.append(
        chain(
            first_process_step.si(task),
            second_process_step.si(task),
            group(
                chain(
                    post_process_first_step.s(task.id),
                    post_process_second_step.s(task.id),
                ),
                notify_user.s(task.id),
            ),
        )
    )
async_task = group(pipelines).apply_async()
Some config options:
celery_app.conf.update(task_acks_late=True)
celery_app.conf.update(task_reject_on_worker_lost=True)
celery_app.conf.update(worker_proc_alive_timeout=20)
celery_app.conf.update(worker_lost_wait=10)
Most of the time (99%, probably) everything is fine and each task executes after the previous one. But sometimes execution stops after first_process_step, or sometimes after second_process_step.
In both cases I see in the logs that the previous task completed, but the next task is never received by a worker. The queue is also empty, so I started to think that the message was never delivered to the broker.
The app is deployed to Kubernetes and works with AWS Redis and RabbitMQ.
UPDATE:
So, if anyone runs into similar behavior where the pipeline looks perfectly regular, you use RabbitMQ, and messages are sometimes lost silently, consider adding this undocumented option:
celery_app.conf.update(
    broker_transport_options={"confirm_publish": True},
)
This enables RabbitMQ publisher confirms, ensuring the broker acknowledges every published message so that a message is delivered at least once.
Issue about updating the documentation:
https://github.com/celery/celery/issues/5410

Tasks linger in celery amqp when publisher is terminated

I am using Celery with a RabbitMQ server. I have a publisher, which could potentially be terminated by a SIGKILL, and since that signal cannot be caught, I cannot revoke the tasks. What would be a common approach to revoking tasks whose publisher is no longer alive?
I experimented with an interval on the worker side, but the publisher is obviously not registered as a worker, so I don't know how to detect a timeout.
There's nothing built into Celery to monitor the producer/publisher status -- only the worker/consumer status. One alternative you can consider is an expiring Redis key that the publisher must refresh periodically, serving as a proxy for whether the publisher is alive. The task then checks whether the key still exists in Redis, and if it doesn't, returns without doing anything.
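A minimal sketch of that expiring-key idea, assuming the standard redis-py client and a Redis instance reachable by both sides; the key pattern, TTL, and function names below are invented for illustration:

import redis
from celery import Celery

app = Celery("proj", broker="amqp://")  # placeholder broker URL
r = redis.Redis()                       # shared Redis instance

HEARTBEAT_KEY = "publisher:alive:{}"    # hypothetical key pattern
HEARTBEAT_TTL = 30                      # seconds; publisher refreshes well before expiry

def publisher_heartbeat(publisher_id):
    # Called periodically by the publisher (e.g. every 10 s) while it is alive.
    r.setex(HEARTBEAT_KEY.format(publisher_id), HEARTBEAT_TTL, b"1")

@app.task
def guarded_task(publisher_id, payload):
    # Worker side: skip the work if the publisher's heartbeat key has expired.
    if not r.exists(HEARTBEAT_KEY.format(publisher_id)):
        return None
    # ... do the real work with payload here ...
    return payload

If the publisher is SIGKILLed it simply stops refreshing the key, the key expires, and any of its still-queued tasks become no-ops.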
I am pretty sure what you want is not possible with Celery, so I suggest you shift your logic around and redesign everything as a Celery workflow (or several Celery canvases, depending on the actual use case). My experience with Celery is that you can build literally any workflow you can imagine with the Celery primitives and/or custom Celery signatures.
Another solution, which works in my case, is to enqueue the next task only once the currently processing ones are finished, so the queue doesn't fill up (see the sketch below).
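One way to read that suggestion, as a sketch (the task and helper names are made up here, not from the answer): each task processes one item and only then enqueues the next, so at most one pending step sits in the queue at a time.

from celery import Celery

app = Celery("proj", broker="amqp://")  # placeholder broker URL

def handle_item(item_id):
    print("processing", item_id)        # stand-in for the real per-item work

@app.task
def process_next(item_ids):
    # Process one item, then enqueue the follow-up task only after it finished.
    if not item_ids:
        return
    current, rest = item_ids[0], item_ids[1:]
    handle_item(current)
    if rest:
        process_next.delay(rest)

# kick off the sequence with the full list of work:
# process_next.delay([1, 2, 3])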

Revoke a task from celery

I want to explicitly revoke a task from Celery. This is how I'm currently doing it:
from celery.task.control import revoke
revoke(task_id, terminate=True)
where task_id is a string (I have also tried converting it to a UUID with uuid.UUID(task_id).hex).
After doing the above, when I start Celery again with celery worker -A proj, it still consumes the same message and starts processing it. Why?
When viewed via Flower, the message is still there in the broker section. How do I delete the message so that it can't be consumed again?
How does revoke work?
When you call the revoke method, the task doesn't get deleted from the queue immediately; all it does is tell Celery (not your broker!) to save the task_id in an in-memory set (look here if you like reading source code like me).
When the task gets to the top of the queue, Celery checks whether it is in the revoked set; if it is, the task won't be executed.
It works this way to avoid an O(n) search of the queue for each revoke call; checking whether the task_id is in the in-memory set is just O(1).
Why do your revoked tasks execute after restarting Celery?
Understanding how things work, you now know that the set is just a normal Python set kept in memory. That means when you restart the worker, you lose this set, but the task is (of course) persistent, and when its turn comes it will be executed as normal.
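A rough conceptual sketch of that behavior (illustrative only, not Celery's actual implementation):

revoked = set()  # in-memory only, so it is lost when the worker restarts

def revoke(task_id):
    # revoke() does not touch the broker; it just remembers the id
    revoked.add(task_id)

def on_message(task_id, run_task):
    # O(1) membership check when the message finally reaches the worker
    if task_id in revoked:
        return  # discard instead of executing
    run_task()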
What can you do?
You will need a persistent set, which you get by starting your worker like this:
celery worker -A proj --statedb=/var/run/celery/worker.state
This will save the set on the filesystem.
References:
Celery source code of the in-memory set
Revoke doc
Persistent revokes docs

Strategy for worker addressing

I have an application that delegates some operations to Celery tasks. The operations must be performed by different workers, depending on some parameters. I have thought about implementing this using queues. My idea is the following:
The client requests actions on a specific queue, queue1
If worker1 (exclusively responsible for queue1) is already active, it will process the request
If no worker is listening to queue1, a catch-all worker (worker-main) will instantiate worker1. The request will be forwarded to worker1.
worker1 will shut itself down after some time without being used
My understanding of celery is limited, and I have several questions.
How do I implement worker-main in Celery? This is a worker listening to all queues, but with lower priority than any other worker. That is, it only acts if the request is not taken by any other worker.
How does worker-main create worker1? Once created, worker1 must be associated with queue1, with higher precedence than worker-main?
Can a request be forwarded from worker-main to worker1? The reply should be sent to the client directly.
Can worker1 shut itself down?
You can see a graphical description of the architecture that I am trying to implement in the image below:
You could link together "worker main" and "worker1" in a sequential workflow so that "worker main" always handles the job as step 1, but simply returns and does nothing if it detects that "worker1" is already up.
So the task hits "worker main" first; "worker main" checks whether the server worker1 runs on is up, and if it is not, brings it up, waits for it to be fully up, and then returns. Here is a proof of concept I tested to see how link works in Celery for building a sequential workflow (sketched after the next paragraph); somebody with more real-world experience may have better solutions. It also contains error handling, for the case where bringing the worker up fails, which I suppose applies in your case.
Note that there is no concept of queues in this approach. Furthermore, you could give worker1 and worker2 different task names instead of differentiating on parameters; the client can parse the parameters and then select which Celery task to execute.
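A sketch of that link-based flow (the original proof of concept was not included here; the task names, provisioning helpers, and error callback below are all invented for illustration):

from celery import Celery

app = Celery("proj", broker="amqp://")   # placeholder broker URL

def worker1_is_up():
    return False                         # stub: replace with a real health check

def start_worker1():
    pass                                 # stub: e.g. call your orchestrator / cloud API

@app.task
def ensure_worker1_up():
    # Step 1, always handled by "worker main": bring worker1 up if needed.
    if not worker1_is_up():
        start_worker1()

@app.task
def real_work(params):
    # Step 2, intended to run once worker1 is available.
    return params

@app.task
def handle_startup_error(request, exc, traceback):
    # Error callback, in case bringing worker1 up fails.
    print("could not start worker1:", exc)

# Client side: run the two steps sequentially via link / link_error.
ensure_worker1_up.apply_async(
    link=real_work.si({"some": "params"}),
    link_error=handle_startup_error.s(),
)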

how to track revoked tasks in across multiple celeryd processes

I have a reminder-type app that schedules tasks in Celery using the "eta" argument. If the parameters in the reminder object change (e.g. the time of the reminder), then I revoke the previously sent task and queue a new one.
I was wondering if there's any good way of keeping track of revoked tasks across celeryd restarts. I'd like to have the ability to scale celeryd processes up/down on the fly, and it seems that any celeryd processes started after the revoke command was sent will still execute that task.
One way of doing it is to keep a list of revoked task ids, but this method will result in the list growing arbitrarily. Pruning this list requires guarantees that the task is no longer in the RabbitMQ queue, which doesn't seem to be possible.
I've also tried using a shared --statedb file for each of the celeryd workers, but it seems that the statedb file is only updated on termination of the workers and thus not suitable for what I would like to accomplish.
Thanks in advance!
Interesting problem, I think it should be easy to solve using broadcast commands.
When a new worker starts up, it can request that all the other workers dump their revoked
tasks to it. This means adding two new remote control commands;
you can easily add new commands using @Panel.register.
Module control.py:
from celery.worker import state
from celery.worker.control import Panel

@Panel.register
def bulk_revoke(panel, ids):
    state.revoked.update(ids)

@Panel.register
def broadcast_revokes(panel, destination):
    panel.app.control.broadcast("bulk_revoke", arguments={
        "ids": list(state.revoked)},
        destination=destination)
Add it to CELERY_IMPORTS:
CELERY_IMPORTS = ("control", )
The only remaining problem is to connect this so that the new worker
triggers broadcast_revokes at startup. I guess you could use the worker_ready
signal for this:
from celery import current_app as celery
from celery.signals import worker_ready

def request_revokes_at_startup(sender=None, **kwargs):
    celery.control.broadcast("broadcast_revokes",
                             destination=sender.hostname)

# connect the handler so it runs when the worker is ready
worker_ready.connect(request_revokes_at_startup)
I had to do something similar in my project and used celerycam with django-admin-monitor. The monitor takes a snapshot of tasks and saves them to the database periodically, and there is a nice user interface to browse and check the status of all tasks. You can use it even if your project is not Django based.
I implemented something similar to this some time ago, and the solution I came up with was very similar to yours.
The way I solved this problem was to have the worker fetch the Task object from the database when the job ran (by passing it the primary key, as the documentation recommends). In your case, before the reminder is sent the worker should perform a check to ensure that the task is "ready" to be run. If not, it should simply return without doing any work (assuming that the ETA has changed and another worker will pick up the new job).
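A sketch of that check, assuming a Django-style Reminder model with a remind_at field and a deliver() helper (all invented here for illustration):

from celery import Celery

app = Celery("proj", broker="amqp://")   # placeholder broker URL

@app.task
def send_reminder(reminder_pk, scheduled_for):
    # Runs at the ETA; re-reads the database row before doing anything.
    reminder = Reminder.objects.get(pk=reminder_pk)   # hypothetical model lookup
    if reminder.remind_at != scheduled_for:
        # The reminder was rescheduled after this task was queued;
        # the task queued with the new ETA will handle it, so just return.
        return
    deliver(reminder)                                 # hypothetical delivery call

# scheduling side:
# send_reminder.apply_async(args=[r.pk, r.remind_at], eta=r.remind_at)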
