I just found out about the configuration option CELERYD_PREFETCH_MULTIPLIER (docs). The default is 4, but (I believe) I want prefetching off or as low as possible. I set it to 1 now, which is close enough to what I'm looking for, but there are still some things I don't understand:
Why is this prefetching a good idea? I don't really see a reason for it, unless there's a lot of latency between the message queue and the workers (in my case, they are currently running on the same host and at worst might eventually run on different hosts in the same data center). The documentation only mentions the disadvantages, but fails to explain what the advantages are.
Many people seem to set this to 0, expecting to be able to turn off prefetching that way (a reasonable assumption in my opinion). However, 0 means unlimited prefetching. Why would anyone ever want unlimited prefetching? Doesn't that entirely eliminate the concurrency/asynchronicity you introduced a task queue for in the first place?
Why can prefetching not be turned off? It might not be a good idea for performance to turn it off in most cases, but is there a technical reason for this not to be possible? Or is it just not implemented?
Sometimes, this option is connected to CELERY_ACKS_LATE. For example, Roger Hu writes «[…] often what [users] really want is to have a worker only reserve as many tasks as there are child processes. But this is not possible without enabling late acknowledgements […]». I don't understand how these two options are connected and why one is not possible without the other. Another mention of the connection can be found here. Can someone explain why the two options are connected?
Prefetching can improve performance: workers don't need to wait for the next message from the broker before processing. Communicating with the broker once and fetching many messages in one go gives a performance gain, since getting a message from a broker (even a local one) is expensive compared to local memory access. Workers are also allowed to acknowledge messages in batches.
Prefetching set to zero means "no specific limit" rather than unlimited
Setting prefetching to 1 is documented to be equivalent to turning it off, but this may not always be the case (see https://stackoverflow.com/a/33357180/71522)
Prefetching allows messages to be acknowledged in batches. CELERY_ACKS_LATE=True prevents acknowledging messages when they reach a worker; they are acknowledged only after the task completes.
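As a rough illustration (Celery 3.x style uppercase setting names; values are just examples), the number of messages a worker reserves at once is the prefetch multiplier times the number of pool processes:

# celeryconfig.py -- illustrative sketch, not a complete configuration
# With -c 4 (four pool processes) and the default multiplier of 4, a worker
# may reserve up to 4 * 4 = 16 messages from the broker in one go.
CELERYD_PREFETCH_MULTIPLIER = 1  # reserve at most one message per pool process
CELERY_ACKS_LATE = True          # ack after the task finishes, not on receipt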
Old question, but still adding my answer in case it helps someone. My understanding from some initial testing was the same as in David Wolever's answer. I just tested this further in Celery 3.1.19, and -Ofair does work; it's just not meant to disable prefetch at the worker node level. That will continue to happen. Using -Ofair has a different effect, at the pool worker level. In summary, to disable prefetch completely, do this:
Set CELERYD_PREFETCH_MULTIPLIER = 1
Set CELERY_ACKS_LATE = True at a global level or task level
Use -Ofair while starting the workers
If you set concurrency to 1, then step 3 is not needed. If you want higher concurrency, then step 3 is essential to avoid tasks getting backed up in a node that could be running long-running tasks (see the sketch below).
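A minimal sketch of the three steps, assuming Celery 3.x style (uppercase) setting names and an app called proj:

# celeryconfig.py -- sketch of the "disable prefetch completely" recipe above
CELERYD_PREFETCH_MULTIPLIER = 1  # step 1: prefetch no more than one task per pool process
CELERY_ACKS_LATE = True          # step 2: acknowledge only after the task completes

# Step 3: start the worker with the fair scheduling strategy, e.g.:
#   celery -A proj worker -c 4 -Ofair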
Adding some more details:
I found that the worker node will always prefetch by default. You can only control how many tasks it prefetches by using CELERYD_PREFETCH_MULTIPLIER. If set to 1, it will only prefetch as many tasks as the number of pool workers (concurrency) in the node. So if you had concurrency = n, the max tasks prefetched by the node will be n.
Without the -Ofair option, what happened for me was that if one of the pool worker processes was executing a long-running task, the other workers in the node would also stop processing the tasks already prefetched by the node. With -Ofair, that changed: even though one of the workers in the node was executing a long-running task, the others would not stop and would continue to process the tasks prefetched by the node. So I see two levels of prefetching: one at the worker node level, the other at the individual worker level. Using -Ofair seemed to disable it at the worker level for me.
How is ACKS_LATE related? ACKS_LATE = True means that the task will be acknowledged only when the task succeeds. Otherwise, I suppose it happens when the task is received by a worker. In the case of prefetch, the task is first received by the worker (confirmed from logs) but will be executed later. I just realized that prefetched messages show up under "unacknowledged messages" in RabbitMQ. So I'm not sure if setting it to True is absolutely needed. We had our tasks set that way (late ack) anyway, for other reasons.
Just a warning: as of my testing with the redis broker + Celery 3.1.15, all of the advice I've read pertaining to CELERYD_PREFETCH_MULTIPLIER = 1 disabling prefetching is demonstrably false.
To demonstrate this:
Set CELERYD_PREFETCH_MULTIPLIER = 1
Queue up 5 tasks that will each take a few seconds (e.g., time.sleep(5); see the sketch after this list)
Start watching the length of the task queue in Redis: watch redis-cli -c llen default
Start celery worker -c 1
Notice that the queue length in Redis will immediately drop from 5 to 3
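For reference, the task used in step 2 could be as simple as the following sketch (module, broker URL, and task name are illustrative; the Redis list to watch is named after your queue):

# tasks.py -- minimal sketch to reproduce the observation above
import time

from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')
app.conf.CELERYD_PREFETCH_MULTIPLIER = 1

@app.task
def sleep_task():
    time.sleep(5)

if __name__ == '__main__':
    # Queue up five tasks, then watch the Redis list length for your queue
    # while a single worker (celery -A tasks worker -c 1) consumes them.
    for _ in range(5):
        sleep_task.delay()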
CELERYD_PREFETCH_MULTIPLIER = 1 does not prevent prefetching, it simply limits the prefetching to 1 task per queue.
-Ofair, despite what the documentation says, also does not prevent prefetching.
Short of modifying the source code, I haven't found any method for entirely disabling prefetching.
I cannot comment on David Wolever's answers, since my stackcred isn't high enough. So I've framed my comment as an answer, since I'd like to share my experience with Celery 3.1.18 and a MongoDB broker. I managed to stop prefetching with the following:
add CELERYD_PREFETCH_MULTIPLIER = 1 to the celery config
add CELERY_ACKS_LATE = True to the celery config
Start celery worker with options: --concurrency=1 -Ofair
Leaving CELERY_ACKS_LATE at the default, the worker still prefetches. Just like the OP, I don't fully grasp the link between prefetching and late acks. I understand what David says ("CELERY_ACKS_LATE=True prevents acknowledging messages when they reach a worker"), but I fail to understand why late acks would be incompatible with prefetching. In theory, prefetching would still allow acking late, right, even if it's not coded that way in Celery?
I experienced something a little bit different with SQS as broker.
The setup was:
CELERYD_PREFETCH_MULTIPLIER = 1
ACKS_ON_FAILURE_OR_TIMEOUT = False
CELERY_ACKS_LATE = True
CONCURRENCY=1
After a task failed (an exception was raised), the worker became unavailable, since the message was not acked in either the local or the remote queue.
The solution that made the workers continue consuming work was setting
CELERYD_PREFETCH_MULTIPLIER = 0
I can only speculate that acks_late was not taken into consideration when the SQS transport was written.
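For reference, a rough reconstruction of that setup using the new-style lowercase setting names (the post above uses older uppercase names; the broker URL and values are illustrative):

# celeryconfig.py -- sketch of the SQS setup described above
broker_url = 'sqs://'                     # AWS credentials taken from the environment
worker_prefetch_multiplier = 0            # the workaround that kept workers consuming
worker_concurrency = 1
task_acks_late = True
task_acks_on_failure_or_timeout = False   # only acknowledge tasks that succeed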
Related
Is there a way for me to configure Celery to just drop the tasks in case of a non-graceful shutdown of a worker? It's more critical for me that tasks are not repeated than that they are always delivered.
As mentioned in the docs:
If a task isn’t acknowledged within the Visibility Timeout the task will be redelivered to another worker and executed.
This causes problems with ETA/countdown/retry tasks where the time to execute exceeds the visibility timeout; in fact if that happens it will be executed again, and again in a loop.
So you have to increase the visibility timeout to match the time of the longest ETA you’re planning to use.
My use case is that I am using a visibility_timeout of 1 day, but in some cases even that is not enough: I want to schedule tasks even further in the future. "Power failure" or any other event causing a non-graceful shutdown is very rare, and I'm fine with tasks being dropped in, say, 0.01% of the cases. Moreover, a task executed 1 day later than it was supposed to be is as bad as the task not being run at all.
One obvious, hacky, way is to set visibility_timeout to 100 years. Is there a better way?
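For context, the visibility timeout is set through the broker transport options; the current setup is roughly the following sketch (values are illustrative):

# celeryconfig.py -- current setting: a 1-day visibility timeout
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 86400}  # seconds (1 day)

# The hacky workaround mentioned above would be an absurdly large value:
# BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 60 * 60 * 24 * 365 * 100}  # ~100 years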
There's an acks_late configuration, but the default value is false (so make sure you didn't enable it):
The acks_late setting would be used when you need the task to be
executed again if the worker (for some reason) crashes mid-execution.
It’s important to note that the worker isn’t known to crash, and if it
does it’s usually an unrecoverable error that requires human
intervention (bug in the worker, or task code).
(quote from here)
The definition of task_acks_late (it seems the name has changed in the latest version, or there is some mismatch) can be found here.
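To make the intent explicit, acks_late can also be set (or left disabled) per task rather than globally; a sketch, with an illustrative task name:

from celery import Celery

app = Celery('proj')

# acks_late=False (the default): the message is acknowledged just before
# the task is executed; acks_late=True would delay the ack until after the
# task completes (and a crash mid-execution would then cause redelivery).
@app.task(acks_late=False)
def send_report():
    ...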
What are the implications of disabling gossip, mingle, and heartbeat on my celery workers?
In order to reduce the number of messages sent to CloudAMQP to stay within the free plan, I decided to follow these recommendations. I therefore used the options --without-gossip --without-mingle --without-heartbeat. Since then, I have been using these options by default for all my celery projects but I am not sure if there are any side-effects I am not aware of.
Please note:
we have now moved to a Redis broker and no longer have such strict limitations on the number of messages sent to the broker
we have several instances running multiple celery workers with multiple queues
This is the base documentation which doesn't give us much info
heartbeat
Is related to communication between the worker and the broker (in your case the broker is CloudAMQP).
See explanation
With --without-heartbeat, the worker won't send heartbeat events.
mingle
It only asks for "logical clocks" and "revoked tasks" from other workers on startup.
Taken from whatsnew-3.1
The worker will now attempt to synchronize with other workers in the same cluster.
Synchronized data currently includes revoked tasks and logical clock.
This only happens at startup and causes a one second startup delay to collect broadcast responses from other workers.
You can disable this bootstep using the --without-mingle argument.
Also see docs
gossip
Workers send events to all other workers, and this is currently used for "clock synchronization", but it's also possible to write your own handlers for events such as on_node_join. See docs.
Taken from whatsnew-3.1
Workers are now passively subscribing to worker related events like heartbeats.
This means that a worker knows what other workers are doing and can detect if they go offline. Currently this is only used for clock synchronization, but there are many possibilities for future additions and you can write extensions that take advantage of this already.
Some ideas include consensus protocols, reroute task to best worker (based on resource usage or data locality) or restarting workers when they crash.
We believe that although this is a small addition, it opens amazing possibilities.
You can disable this bootstep using the --without-gossip argument.
Celery workers started up with the --without-mingle option, as #ofirule mentioned above, will not receive synchronization data from other workers, particularly revoked tasks. So if you revoke a task, all workers currently running will receive that broadcast and store it in memory so that when one of them eventually picks up the task from the queue, it will not execute it:
https://docs.celeryproject.org/en/stable/userguide/workers.html#persistent-revokes
But if a new worker starts up before that task has been dequeued by a worker that received the broadcast, it doesn't know to revoke the task. If it eventually picks up the task, then the task is executed. You will see this behavior if you're running in an environment where you are dynamically scaling in and out celery workers constantly.
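For illustration, a revoke is broadcast only to the workers that are online at that moment; persisting it across restarts requires the state file described in the linked persistent-revokes docs (the task id and path below are illustrative):

from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')

# Broadcast a revoke to every worker that is currently connected.
app.control.revoke('d9078da5-9915-40a0-bfa1-392c7bde42ed')

# Workers started with a state file reload the revoked-task list on restart:
#   celery -A proj worker --statedb=/var/run/celery/worker.state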
I wanted to know if the --without-heartbeat flag would impact the worker's ability to detect a broker disconnect and attempt to reconnect. The documentation referenced above only opaquely refers to these heartbeats acting at the application layer rather than the TCP/IP layer. What I really want to know is: does eliminating these messages affect my worker's ability to function, specifically to detect a broker disconnect and then try to reconnect appropriately?
I ran a few quick tests myself and found that with the --without-heartbeat flag passed, workers still detect broker disconnect very quickly (initiated by me shutting down the RabbitMQ instance), and they attempt to reconnect to the broker and do so successfully when I restart the RabbitMQ instance. So my basic testing suggests the heartbeats are not necessary for basic health checks and functionality. What's the point of them anyway? It's unclear to me, but they don't appear to have an impact on worker functionality.
This question is regarding the use of multiple remote Celery workers on separate machines. The implementation of the App can be conceptualized as:
My App (Producer) will be adding multiple tasks (say 50) to the queue every 5 mins (imagine a python for loop iterating over a list of tasks to be performed asynchronously at every 5 min interval). I want the celery workers (which will be remote machines) to pick these tasks up as soon as they are pushed.
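For concreteness, the producer side might look roughly like this sketch (module, task, and helper names are illustrative):

# producer.py -- sketch of the app pushing a batch of tasks every 5 minutes
import time

from tasks import process_item  # a Celery task defined in the workers' code base


def get_pending_items():
    """Hypothetical helper returning the ~50 items to process this interval."""
    return range(50)


while True:
    for item in get_pending_items():
        process_item.delay(item)  # published to the broker immediately
    time.sleep(300)               # wait 5 minutes before the next batch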
My question is: will Celery/RabbitMQ automatically handle task distribution (so that no worker picks up a task from the queue that has already been picked up by another worker, i.e. work is not duplicated) and distribute the tasks evenly so that no worker is left idle while other workers are working hard, or do these have to be configured/programmed in the settings?
I would most appreciate it if someone could forward me relevant documentation (I was checking out Celery docs but couldn't find this specific info regarding remote celery workers in this context.)
Automatically, but you need to be aware of the prefetching feature, which is described here: http://docs.celeryproject.org/en/latest/userguide/optimizing.html#prefetch-limits (read until the end of the page).
In short, prefetching works on two levels: worker level and process level, since a worker may have multiple processes. To disable prefetch on worker level you need to specify worker_prefetch_multiplier = 1 in celery settings, to disable on the process level you need to specify -Ofair option in worker's command line.
So after digging around in the RabbitMQ docs, it seems that the default exchange type is the direct exchange (ref https://www.rabbitmq.com/tutorials/amqp-concepts.html), which means that tasks will be distributed to workers in a round-robin manner.
I just upgraded to Celery 3.1 and now I see this in my logs:
on_node_lost - INFO - missed heartbeat from celery@queue_name, for every queue/worker in my cluster.
According to the docs BROKER_HEARTBEAT is off by default and I haven't configured it.
Should I explicitly set BROKER_HEARTBEAT=0 or is there something else that I should be checking?
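For reference, explicitly disabling it would just be a one-line setting (old-style name, as used in Celery 3.1):

# celeryconfig.py
BROKER_HEARTBEAT = 0  # disable AMQP heartbeats between the worker and the broker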
Celery 3.1 added in the new mingle and gossip procedures. I too was getting a ton of missed heartbeats and passing --without-gossip to my workers cleared it up.
https://docs.celeryproject.org/en/3.1/whatsnew-3.1.html#mingle-worker-synchronization
Mingle: Worker synchronization
The worker will now attempt to synchronize with other workers in the
same cluster.
Synchronized data currently includes revoked tasks and logical clock.
This only happens at startup and causes a one second startup delay to
collect broadcast responses from other workers.
You can disable this bootstep using the --without-mingle argument.
https://docs.celeryproject.org/en/3.1/whatsnew-3.1.html#gossip-worker-worker-communication
Gossip: Worker <-> Worker communication
Workers are now passively subscribing to worker related events like
heartbeats.
This means that a worker knows what other workers are doing and can
detect if they go offline. Currently this is only used for clock
synchronization, but there are many possibilities for future additions
and you can write extensions that take advantage of this already.
Some ideas include consensus protocols, reroute task to best worker
(based on resource usage or data locality) or restarting workers when
they crash.
We believe that although this is a small addition, it opens amazing
possibilities.
You can disable this bootstep using the --without-gossip argument.
Saw the same thing, and noticed a couple of things in the log files.
1) There were messages about time drift at the start of the log and occasional missed heartbeats.
2) At the end of the log file, the drift messages went away and only the missed heartbeat messages were present.
3) There were no changes to the system when the drift messages went away... They just stopped showing up.
I figured that the clock drift itself was likely the problem.
After syncing the time on all the servers involved, these messages went away. On Ubuntu, run ntpdate as a cron job, or run ntpd.
I'm having a similar issue and have found the reason in my case. I have two servers running workers. When I ping one server from the other, I found that whenever the ping time is larger than 2 seconds, the log shows "missed heartbeat from celery#". The default heartbeat interval is 2 seconds. The reason is my poor network.
http://docs.celeryproject.org/en/latest/internals/reference/celery.worker.heartbeat.html
Add --without-mingle when you start Celery.
I only created the last 2 queue names shown in the RabbitMQ management web UI table below.
The rest of the table contains hash-like queue names, which I don't recognize:
1- Who created them? (I know it is Celery, but which process, task, etc.)
2- Why are they created, and what are they created for?
I notice that as the number of pushed messages increases, the number of those hash-like queues increases.
When using Celery, RabbitMQ is used as the default result backend, and also stores the errors of failing tasks (those that raised exceptions).
Every new task creates a new queue on the server. With thousands of tasks, the broker may be overloaded with queues, and this will affect performance negatively.
Each queue in RabbitMQ is a separate Erlang process, so if you're planning to keep many results simultaneously you may have to increase the Erlang process limit and the maximum number of file descriptors your OS allows.
Old results will not be cleaned up automatically, so we have to tell RabbitMQ to do so.
The configuration line below dictates the time to live of the temporary queues. The default is 1 day:
CELERY_AMQP_TASK_RESULT_EXPIRES = Number of seconds
OR, we can move the result store out of RabbitMQ entirely by configuring a different result backend; the default CELERY_BACKEND = "amqp" is what creates these queues.
We may also ignore it:
CELERY_IGNORE_RESULT = True.
Also, when ignoring the result, we can keep the errors stored for later usage, which means one more queue for the failing tasks:
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = True.
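Putting the quoted options together, a sketch of the relevant settings (old-style names; CELERY_TASK_RESULT_EXPIRES is the general name for what the quote calls CELERY_AMQP_TASK_RESULT_EXPIRES; values and the Redis URL are illustrative):

# celeryconfig.py -- taming the per-result queues created by the amqp result backend
CELERY_TASK_RESULT_EXPIRES = 3600           # let the temporary result queues expire after 1 hour
# ...or stop storing results altogether:
CELERY_IGNORE_RESULT = True
CELERY_STORE_ERRORS_EVEN_IF_IGNORED = True  # still keep the errors of failed tasks
# ...or move results off RabbitMQ entirely, for example to Redis:
# CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'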
I will not mark this question as answered; I'm waiting for a better answer.
References:
This SO link
Celery documentation
RabbitMQ documentation