How to make Celery worker consume single task and exit - python

How do I make the celery -A app worker command consume only a single task and then exit?
I want to run Celery workers as a Kubernetes Job that finishes after handling a single task.
I'm using KEDA to autoscale workers according to the number of queued messages.
I want to run Celery workers as jobs for long-running tasks, as suggested in the documentation:
KEDA long running execution

There's not really anything built in for this. You would have to hack in your own driver program, probably via a custom concurrency module. Are you trying to use KEDA ScaledJobs or something similar? If so, you could just use a ScaledObject instead.
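
To illustrate what such a driver program has to do, here is a minimal sketch of the "take one message, run it, exit" pattern. It does not use Celery at all: a standard-library queue.Queue stands in for the broker, and run_one, the handlers mapping, and the (name, args) message shape are hypothetical names chosen for this example only.

```python
import queue
import sys
from typing import Callable, Dict, Tuple

# Hypothetical message shape for this sketch: (task name, positional args).
Message = Tuple[str, tuple]


def run_one(
    source: "queue.Queue[Message]",
    handlers: Dict[str, Callable],
    timeout_s: float = 5.0,
) -> int:
    """Take at most one message, run its handler, and return an exit code."""
    try:
        name, args = source.get(timeout=timeout_s)
    except queue.Empty:
        return 0  # nothing queued: exit cleanly so the Job can complete

    try:
        handlers[name](*args)
    except Exception as exc:  # unknown task name or a failing handler
        print(f"task {name!r} failed: {exc!r}", file=sys.stderr)
        return 1  # non-zero exit lets the Job be marked as failed
    return 0
```

A real driver would end with something like sys.exit(run_one(...)), with the in-memory queue replaced by a client for the actual broker, so that the container exits after exactly one task.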

Related

Restart Celery if the Celery worker is down on Windows

I wanted to know if there is a way to restart a Celery worker automatically and programmatically when it goes down due to some error or issue.
Check out this SO thread.
As you are using Windows, look into running Celery as a service, as explained right here on SO.

Celery: how to make a worker run only when other workers are broken?

I have two servers, with one Celery worker on each. I use Redis as the broker to coordinate the workers.
My question is: how can I make only one worker run most of the time, and have another worker turn on as a backup once the first one breaks?
Basically, I just want to keep one worker as a backup.
I know how to route a task to a certain worker by giving each worker its own queue, after reading the docs [http://docs.celeryproject.org/en/latest/userguide/routing.html#redis-message-priorities]
This is, in my humble opinion, completely against the point of having a distributed system to offload CPU-heavy or long-running tasks, or to run thousands of small tasks that you can't run elsewhere...
- You are running two servers anyway, so why keep the other one idle? More workers mean you will be able to process more tasks concurrently.
If you are not convinced and still want to do this, you need to write a tiny service on the machine with the idle Celery worker. This service will periodically check the health of the active worker, and if that check fails, it will start the Celery worker on the backup server.
Here is a question for you: why doesn't this service simply restart the Celery worker on the active server? It is entirely possible to do that, so again, I see no justification for having a completely idle machine doing nothing. If you are on a cloud platform, you can easily spin up a new instance from an existing image of your Celery worker. This is the scenario I use in production.
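
The tiny watchdog service described above could look roughly like the sketch below. It is only an illustration under assumptions: watch_and_failover and start_backup_worker are hypothetical names, the health check is passed in as a callable because the answer does not say how to check the active worker, and "proj" is a placeholder app name.

```python
import subprocess
import time
from typing import Callable, Optional


def watch_and_failover(
    primary_is_healthy: Callable[[], bool],
    start_backup: Callable[[], object],
    interval_s: float = 30.0,
    max_failures: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    max_checks: Optional[int] = None,
) -> bool:
    """Poll the active worker; start the backup after repeated failed checks.

    Returns True if the backup was started, False if max_checks ran out first.
    """
    failures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if primary_is_healthy():
            failures = 0  # a good check resets the failure streak
        else:
            failures += 1
            if failures >= max_failures:
                start_backup()
                return True
        sleep(interval_s)
    return False


def start_backup_worker() -> subprocess.Popen:
    # Assumption: the backup worker runs on this machine; "proj" is a placeholder.
    return subprocess.Popen(["celery", "-A", "proj", "worker", "-l", "info"])
```

Requiring several consecutive failures before starting the backup avoids reacting to a single slow or dropped health check. The same loop could just as well restart the active worker instead, which is the alternative the answer argues for.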

Celery - how to stop a running task when using a distributed RabbitMQ backend?

I am running Celery on (say) a bank of 50 machines, all using a distributed RabbitMQ cluster.
If I have a task that is running and I know the task ID, how in the world can Celery figure out which machine it's running on in order to terminate it?
Thanks.
I am not sure you can actually do it. When you spawn a task, a worker somewhere on your 50 boxes executes it, and you technically have no control over it, as it is a separate process. The only things you can control are the AsyncResult or the AMQP message on the queue.

Celery launches more processes than configured

I'm running a Celery machine, using Redis as the broker, with the following configuration:
celery -A project.tasks:app worker -l info --concurrency=8
When I check the number of running Celery processes, I see more than 8.
Is there something that I am missing? Is there a limit for max concurrency?
This problem causes huge memory allocation, and is killing the machine.
With the default settings, Celery will always start one more process than the number you ask for. This additional process is a kind of bookkeeping process that coordinates the other processes that are part of the worker. It communicates with the rest of Celery and dispatches tasks to the processes that actually run them.
Switching to a pool implementation other than the default "prefork" might reduce the number of processes created, but that opens a new can of worms.
For the concurrency problem, I have no suggestion.
For the memory problem, you can look at the Redis configuration in ~/.redis/redis.conf. There is a maxmemory attribute which fixes a limit upon tasks…
See the Redis configuration

How to load balance Celery tasks across several servers?

I'm running celery on multiple servers, each with a concurrency of 2 or more and I want to load balance celery tasks so that the server that has the lowest CPU usage can process my celery tasks.
For example, let's say I have 2 servers (A and B), each with a concurrency of 2. If I have 2 tasks in the queue, I want A to process one task and B to process the other. But currently it's possible that the first process on A will execute one task and the second process on A will execute the second task while B sits idle.
Is there a simple way, by means of celery extensions or config, that I can route tasks to the server with lowest CPU usage?
The best option is to use celery.send_task from the producing server, then deploy the workers onto n instances. The workers can then be run as @ealeon mentioned, using celery -A proj worker -l info -Ofair.
This way, load will be distributed across all servers, and the producing server does not need the task code, since send_task refers to tasks by name.
Try:
"http://docs.celeryproject.org/en/latest/userguide/optimizing.html#guide-optimizing
You can disable this prefetching behavior by enabling the -Ofair worker option:
$ celery -A proj worker -l info -Ofair"
