Is there a way for me to configure Celery to just drop tasks in case of a non-graceful shutdown of a worker? It's more critical to me that tasks are never repeated than that they are always delivered.
As mentioned in the docs:
If a task isn’t acknowledged within the Visibility Timeout the task will be redelivered to another worker and executed.
This causes problems with ETA/countdown/retry tasks where the time to execute exceeds the visibility timeout; in fact if that happens it will be executed again, and again in a loop.
So you have to increase the visibility timeout to match the time of the longest ETA you’re planning to use.
My use case is that I am using a visibility_timeout of 1 day, but in some cases even that is not enough, because I want to schedule tasks even further in the future. A power failure or any other event causing a non-graceful shutdown is very rare, and I'm fine with tasks being dropped in, say, 0.01% of cases. Moreover, a task executed 1 day later than it was supposed to be is as bad as the task not being run at all.
One obvious, hacky, way is to set visibility_timeout to 100 years. Is there a better way?
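For reference, a minimal sketch of where that value lives, assuming a Redis broker and the same project.celery app module used later in this thread:

from project.celery import app

# visibility_timeout is a broker transport option, given in seconds
# (86400 = 1 day, as in the question).
app.conf.broker_transport_options = {'visibility_timeout': 86400}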
There's an acks_late configuration option, but its default value is false (so make sure you didn't enable it):
The acks_late setting would be used when you need the task to be
executed again if the worker (for some reason) crashes mid-execution.
It’s important to note that the worker isn’t known to crash, and if it
does it’s usually an unrecoverable error that requires human
intervention (bug in the worker, or task code).
(quote from here)
The definition of task_acks_late (it seems the name has changed in the latest version, or there is some mismatch) can be found here.
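For clarity, a hedged sketch of the setting under both spellings (the old upper-case name and the new-style lower-case one); both default to false:

# Old-style (pre-4.0) configuration name:
CELERY_ACKS_LATE = False

# New-style name, set on the app (assuming `app` is your Celery instance):
app.conf.task_acks_late = False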
After playing with some "defect" scenarios in Celery (with Redis as the broker, for what it's worth), we came to the understanding that there is effectively no point in setting acks_late=true without also setting task_reject_on_worker_lost=true, because the task won't be rescheduled (at least in our tests) -- it stays in the "unacked" category forever.
At the same time, everybody says that acks_late makes the task subject to rescheduling on the same or another worker, so the question is: when does that actually happen?
The official docs say that
Note that the worker will acknowledge the message if the child process
executing the task is terminated (either by the task calling
sys.exit(), or by signal) even when acks_late is enabled. This
behavior is intentional as…
We don’t want to rerun tasks that forces the kernel to send a SIGSEGV (segmentation fault) or similar signals to the process.
We assume that a system administrator deliberately killing the task does not want it to automatically restart.
A task that allocates too much memory is in danger of triggering the kernel OOM killer, the same may happen again.
A task that always fails when redelivered may cause a high-frequency message loop taking down the system.
If you really want a task to be redelivered in these scenarios you
should consider enabling the task_reject_on_worker_lost setting.
What are possible examples of "something went wrong" that don't fall into the "worker terminated deliberately or due to a signal caught" category?
Reboot, power outage, hardware failure. n.b., all of your examples assume that the prefetch multiplier is 1.
Note that there is a difference between the Celery worker process and the child processes that actually execute the tasks.
By default, when you create a Celery worker, it will create one "parent" process and x child processes that execute the tasks, where x is the number of CPUs you have (you can read more about this, and how to configure it, in the docs).
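As a hedged illustration (using the current lower-case setting name), the number of child processes can also be pinned explicitly instead of defaulting to the CPU count:

from project.celery import app

# Run exactly 4 child (pool) processes; the same can be done with the
# worker's --concurrency command-line option.
app.conf.worker_concurrency = 4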
I have tested all the different scenarios, these are my conclusions:
acks_late is about what happens when the worker dies. task_reject_on_worker_lost is about the actual process executing the task.
For example, if I have a k8s pod running a Celery worker process: if I send SIGKILL (cold shutdown) to the pod, having acks_late set to true will make sure that the task is picked up by a different worker.
But if I somehow kill the child process executing the task (for example, by going inside the pod and killing the child process, or if the process exits by itself), the task will not be picked up, even if acks_late is true.
If you set task_reject_on_worker_lost to true, the task will be picked up again.
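Putting those conclusions together, a minimal sketch of the combination that was tested (with the caveat that a task configured this way may run more than once, so it should be idempotent):

from project.celery import app

app.conf.task_acks_late = True               # ack only after the task has run
app.conf.task_reject_on_worker_lost = True   # requeue if the child process is killed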
Hope that clarifies everything.
I am trying to limit the rate of one celery task. Here is how I am doing it:
from project.celery import app
app.control.rate_limit('task_a', '10/m')
It is working well. However, there is a catch. Other tasks that this worker is responsible for are being blocked as well.
Let's say, 100 of task_a have been scheduled. As it is rate-limited, it will take 10 minutes to execute all of them. During this time, task_b has been scheduled as well. It will not be executed until task_a is done.
Is it possible to not block task_b?
By the looks of it, this is just how it works. I just didn't get that impression after reading the documentation.
Other options include:
Separate worker and queue only for this task
Adding an eta to task_a so that all of them are scheduled to run during the night
What is the best practice in such cases?
This should be part of the task declaration to work on a per-task basis. The way you are doing it via control is probably why it has this side effect on other tasks:
@app.task(rate_limit='10/m')
def task_a():
    ...
After more reading
Note that this is a per worker instance rate limit, and not a global rate limit. To enforce a global rate limit (e.g., for an API with a maximum number of requests per second), you must restrict to a given queue.
You will probably have to do this with a separate queue.
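A hedged sketch of what that routing could look like, assuming task_a lives in project.tasks (the queue name is illustrative):

from project.celery import app

# Route the rate-limited task to its own queue...
app.conf.task_routes = {'project.tasks.task_a': {'queue': 'task_a_queue'}}

# ...and serve that queue with a dedicated worker, e.g.:
#   celery -A project worker -Q task_a_queue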
The easiest (no coding required) way is separating the task into its own queue and running a dedicated worker just for this purpose.
There's no shame in that; it is totally fine to have many Celery queues and workers, each dedicated to a specific type of work. As an added bonus, you may get some more control over execution: you can easily turn workers on/off to pause certain processes if needed, etc.
On the other hand, having lots of specialized workers idle most of the time (waiting for a specific job to be queued) is not particularly memory-efficient.
Thus, if you need to rate-limit more tasks and expect the dedicated workers to be idle most of the time, you may consider improving efficiency by implementing a token bucket. With that, all your workers can be general-purpose and you can scale them naturally as your overall load increases, knowing that work distribution will no longer be crippled by a single task's rate limit.
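For illustration only, a sketch of that idea using a fixed-window counter in Redis as a crude stand-in for a proper token bucket (the Redis client, key, and limits are assumptions, not part of the answer above):

import redis

from project.celery import app

redis_client = redis.Redis()  # assumed shared Redis instance

@app.task(bind=True, max_retries=None)
def task_a(self):
    # Allow at most 10 executions per 60-second window across all workers.
    used = redis_client.incr('bucket:task_a')
    if used == 1:
        redis_client.expire('bucket:task_a', 60)
    if used > 10:
        # No token available: retry later instead of blocking the worker.
        raise self.retry(countdown=60)
    ...  # the actual work goes here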
It seems that Celery (v4.1) can either be used with some prefetching of tasks, or with CELERY_ACKS_LATE=True (discussed here).
We currently work with CELERY_ACKS_LATE=False and CELERYD_PREFETCH_MULTIPLIER=1
In both cases, there are unacknowledged messages in Rabbit.
At times we suffer from network issues that cause Celery to lose the connection to Rabbit for a few seconds, getting these warnings: consumer: Connection to broker lost. Trying to re-establish the connection...
When this happens, the unacknowledged messages turn back to Ready (which seems to be the standard behaviour) and are consumed by another consumer.
This causes multiple executions of the tasks, as the consumer had started a prefetched task in the worker process but couldn't ack it to Rabbit.
As it seems that it's impossible to guarantee that tasks will get executed exactly once in Celery without external tools, how is it possible to ensure that tasks are executed at most once?
----- Edit ----
One approach I'm considering is to use the task's self.request.delivery_info['redelivered'] and fail tasks that were redelivered.
While this achieves the goal of "executing at most once", it will have a high rate of false positives (tasks dropped even though they weren't already executed).
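A hedged sketch of that approach (the attribute path is the one from the question; whether redelivered is populated depends on the broker):

from project.celery import app

@app.task(bind=True)
def at_most_once_task(self, *args, **kwargs):
    if self.request.delivery_info.get('redelivered'):
        # Drop redelivered copies to keep at-most-once semantics,
        # accepting the false positives mentioned above.
        return
    ...  # the actual work goes here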
There is a difference between a task being executed exactly once and a task producing extra side effects when executed multiple times, i.e. if your tasks are not idempotent, then executing the same task twice will lead to bugs.
What I recommend is to allow tasks to execute several times, but to make them idempotent so that they have no effect if they were already executed.
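A hedged sketch of that recommendation, using a Redis key as a deduplication marker (a redelivered copy of a task keeps the same task id, so the second run becomes a no-op; the client and key format are assumptions):

import redis

from project.celery import app

dedup = redis.Redis()  # assumed deduplication store

@app.task(bind=True)
def idempotent_task(self, payload):
    # SET ... NX fails if the key already exists, i.e. this task id
    # has already been processed within the last day.
    if not dedup.set('done:%s' % self.request.id, 1, nx=True, ex=86400):
        return
    do_work(payload)  # hypothetical function carrying the real side effect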
I am not sure whether I am misunderstanding something about Celery's result backend settings, but this is the problem I am facing: I have a bunch of tasks whose results I am not interested in, but I do want to check on their state from time to time.
Hence I have set:
ignore_result = True
Despite this setting, it seems that if I have the result backend configured to use AMQP, the result queues are still being created - one per task - why?
So I tried to make sure I won't have too many (result) queues at the same time (which would affect the performance of RabbitMQ) by setting:
CELERY_TASK_RESULT_EXPIRES = 60
The weird thing is that if I don't manage to check:
AsyncResult(task_id).state
within the first 60s after a task has finished, then the state of the task will be forever PENDING, even though for some tasks I manually call
update_state(state='SUCCESS')
in the task. What is even more weird is that whenever I try to get the task state, despite the fact that all the states are PENDING (though they should be one of the FINISHED_STATES), the result queues reappear in the list of queues, but the task state never changes - I list all the queues using rabbitmqadmin before checking on the task state and after.
Can anybody explain me what am I doing wrong or what is going on?
Why is the state of a task depending on the result expiry time, if I ignore the result entirely?
What settings should I use if I want to ignore the result of the tasks, but still be able to check on the task's state?
p.s. I am using Celery 3.0.23 now, but the behaviour was the same with 3.0.13.
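For reference, a minimal sketch of the pieces described above (Celery 3.0-era API; names and broker URLs are illustrative):

from celery import Celery

app = Celery('project', broker='amqp://', backend='amqp')
app.conf.update(CELERY_TASK_RESULT_EXPIRES=60)

@app.task(ignore_result=True)
def my_task():
    ...

# Checking on the state later, as described in the question:
state = app.AsyncResult('some-task-id').state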
I just found out about the configuration option CELERYD_PREFETCH_MULTIPLIER (docs). The default is 4, but (I believe) I want prefetching off, or as low as possible. I have set it to 1 for now, which is close enough to what I'm looking for, but there are still some things I don't understand:
Why is this prefetching a good idea? I don't really see a reason for it, unless there's a lot of latency between the message queue and the workers (in my case, they are currently running on the same host and at worst might eventually run on different hosts in the same data center). The documentation only mentions the disadvantages, but fails to explain what the advantages are.
Many people seem to set this to 0, expecting to be able to turn off prefetching that way (a reasonable assumption in my opinion). However, 0 means unlimited prefetching. Why would anyone ever want unlimited prefetching? Doesn't that entirely eliminate the concurrency/asynchronicity you introduced a task queue for in the first place?
Why can prefetching not be turned off? It might not be a good idea for performance to turn it off in most cases, but is there a technical reason for this not to be possible? Or is it just not implemented?
Sometimes, this option is connected to CELERY_ACKS_LATE. For example, Roger Hu writes «[…] often what [users] really want is to have a worker only reserve as many tasks as there are child processes. But this is not possible without enabling late acknowledgements […]». I don't understand how these two options are connected and why one is not possible without the other. Another mention of the connection can be found here. Can someone explain why the two options are connected?
Prefetching can improve performance: workers don't need to wait for the next message from the broker before processing. Communicating with the broker once and fetching a batch of messages gives a performance gain, since getting a message from a broker (even a local one) is expensive compared to local memory access. Workers are also allowed to acknowledge messages in batches.
Prefetching set to zero means "no specific limit" rather than unlimited
Setting prefetching to 1 is documented to be equivalent to turning it off, but this may not always be the case (see https://stackoverflow.com/a/33357180/71522)
Prefetching allows acking messages in batches. CELERY_ACKS_LATE=True prevents acknowledging messages when they reach a worker.
Old question, but still adding my answer in case it helps someone. My understanding from some initial testing was the same as in David Wolever's answer. I tested this some more in Celery 3.1.19, and -Ofair does work. It's just that it is not meant to disable prefetching at the worker node level; that will continue to happen. Using -Ofair has a different effect, at the pool worker level. In summary, to disable prefetching completely, do this:
Set CELERYD_PREFETCH_MULTIPLIER = 1
Set CELERY_ACKS_LATE = True at a global level or task level
Use -Ofair while starting the workers
If you set concurrency to 1, then step 3 is not needed. If you want a
higher concurrency, then step 3 is essential to avoid tasks getting
backed up in a node that could be running long-running tasks.
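A hedged sketch of that recipe in configuration form (old-style setting names, matching the Celery 3.1 versions discussed in this thread):

# In the Celery configuration module:
CELERYD_PREFETCH_MULTIPLIER = 1
CELERY_ACKS_LATE = True

# Then start the workers with the fair scheduling option, e.g.:
#   celery worker -A project -Ofair --concurrency=4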
Adding some more details:
I found that the worker node will always prefetch by default. You can only control how many tasks it prefetches by using CELERYD_PREFETCH_MULTIPLIER. If set to 1, it will only prefetch as many tasks as the number of pool workers (concurrency) in the node. So if you had concurrency = n, the max tasks prefetched by the node will be n.
Without the -Ofair option, what happened for me was that if one of the pool worker processes was executing a long-running task, the other workers in the node would also stop processing the tasks already prefetched by the node. With -Ofair, that changed: even though one of the workers in the node was executing a long-running task, the others would not stop processing and would continue to process the tasks prefetched by the node. So I see two levels of prefetching: one at the worker node level, the other at the individual worker level. Using -Ofair seemed to disable it at the worker level for me.
How is ACKS_LATE related? ACKS_LATE = True means that the task will be acknowledged only when it succeeds; otherwise, I suppose, it is acknowledged when it is received by a worker. In the case of prefetching, the task is first received by the worker (confirmed from the logs) but is executed later. I just realized that prefetched messages show up under "unacknowledged messages" in RabbitMQ, so I'm not sure whether setting it to True is absolutely needed. We had our tasks set that way (late ack) anyway, for other reasons.
Just a warning: as of my testing with the redis broker + Celery 3.1.15, all of the advice I've read pertaining to CELERYD_PREFETCH_MULTIPLIER = 1 disabling prefetching is demonstrably false.
To demonstrate this:
Set CELERYD_PREFETCH_MULTIPLIER = 1
Queue up 5 tasks that will each take a few seconds (e.g., time.sleep(5); a minimal example task is sketched after this list)
Start watching the length of the task queue in Redis: watch redis-cli -c llen default
Start celery worker -c 1
Notice that the queue length in Redis will immediately drop from 5 to 3
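For concreteness, a minimal sketch of the kind of task used in step 2 (names are illustrative):

import time

from project.celery import app

@app.task
def sleepy():
    # Long enough to watch the queue length change while the worker runs.
    time.sleep(5)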
CELERYD_PREFETCH_MULTIPLIER = 1 does not prevent prefetching; it simply limits prefetching to 1 task per queue.
-Ofair, despite what the documentation says, also does not prevent prefetching.
Short of modifying the source code, I haven't found any method for entirely disabling prefetching.
I cannot comment on David Wolever's answers, since my stackcred isn't high enough. So, I've framed my comment as an answer since I'd like to share my experience with Celery 3.1.18 and a Mongodb broker. I managed to stop prefetching with the following:
add CELERYD_PREFETCH_MULTIPLIER = 1 to the celery config
add CELERY_ACKS_LATE = True to the celery config
Start celery worker with options: --concurrency=1 -Ofair
With CELERY_ACKS_LATE left at its default, the worker still prefetches. Just like the OP, I don't fully grasp the link between prefetching and late acks. I understand what David says ("CELERY_ACKS_LATE=True prevents acknowledging messages when they reach a worker"), but I fail to understand why late acks would be incompatible with prefetching. In theory, prefetching would still allow acking late, right - even if it's not coded that way in Celery?
I experienced something a little bit different with SQS as broker.
The setup was:
CELERYD_PREFETCH_MULTIPLIER = 1
ACKS_ON_FAILURE_OR_TIMEOUT=False
CELERY_ACKS_LATE = True
CONCURRENCY=1
After a task failed (exception raised), the worker became unavailable, since the message was not acked in either the local or the remote queue.
The solution that made the workers continue consuming work was setting
CELERYD_PREFETCH_MULTIPLIER = 0
I can only speculate that acks_late was not taken into consideration when the SQS transport was written.