celery launches more processes than configured - python

I'm running a Celery machine, using Redis as the broker, with the following configuration:
celery -A project.tasks:app worker -l info --concurrency=8
When checking the number of running Celery processes, I see more than 8.
Is there something that I am missing? Is there a limit for max concurrency?
This problem causes huge memory allocation, and is killing the machine.

With the default settings, Celery will always start one more process than the number you ask for. This additional process is a kind of bookkeeping process: it coordinates the other processes that are part of the worker, communicates with the rest of Celery, and dispatches the tasks to the processes that actually run them.
Switching to a different pool implementation than the default "prefork" might reduce the number of processes created, but that opens a new can of worms.
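For illustration, here is a rough sketch of how you could count the worker's process tree and see that extra parent process; this uses the third-party psutil package, which is an assumption on my part, not anything Celery itself provides:

import psutil

def count_worker_processes(worker_pid):
    # worker_pid is the PID shown in the worker's startup banner; that is the
    # coordinating parent process, not one of the pool processes
    parent = psutil.Process(worker_pid)
    children = parent.children(recursive=True)
    # with --concurrency=8 and the default prefork pool this typically reports
    # 9: the parent plus its 8 pool children
    return 1 + len(children)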

For the concurrency problem, I have no suggestion.
For the memory problem, you can look at the Redis configuration in ~/.redis/redis.conf. It has a maxmemory attribute, which sets a memory limit…
See the Redis configuration
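For reference, a minimal sketch of what that setting looks like in redis.conf; the 2gb value and the policy line are placeholders, not recommendations:

# redis.conf excerpt (placeholder values)
maxmemory 2gb
# with noeviction Redis rejects new writes instead of growing past the limit
maxmemory-policy noeviction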

Related

Celery - What pool should I use for windows heavy cpu process and redis backend for status tracking?

I am running Python 3.9, Windows 10, Celery 4.3, Redis as the backend, and AWS SQS as the broker (I wasn't intending on using the backend, but it became more and more apparent to me that, due to the library's restrictions on Windows, I'd be better off using it if I could get it to work; otherwise I would've just used Redis as the broker and backend).
To give you some context, I have a webpage that a user interacts with to allow them to do a resource intensive task. If the user has a task running and decides to resend the task, I need it to kill the task, and use the new information sent by the user to create the new task.
The problem for me arrives after this line of thinking:
Me: "Hmmm, the prefork pool is used for heavy cpu background tasks... I want to use that..."
Me: Goes and configures settings.py,
updates the celery library,
sets the environment variable to allow windows to run prefork pool -
os.environ.setdefault('FORKED_BY_MULTIPROCESSING', '1'),
sets a few other configuration settings, etc,
runs the worker and it works.
Me: "Hey, hey. It works... Oh, I still can't revoke a task DESPITE RUNNING THE PREFORK POOL!?!?!
Oh, that's okay... I can just set a session variable to let me know if the user already started a task,
and if they have, just have celery tell me if the task that they started is finished
before I allow the user to request to run a task again."
Me: Goes and configures django sessions,
configures redis,
updates the views to include the session variable, etc,
Me: "Great! Everything is working, so far..."
Me: Runs a test to see if the redis server returns the status...
Celery: "PENDING"
Me: "Yo! Is my task done, yet!?"
Celery: "No - PENDING"
Celery: "PENDING"
Celery: "PENDING"
Celery: "PENDING"
Celery: "PENDING"
Celery: "PENDING"
Me: Searches Stack Overflow for why it's only PENDING...
Me: Finds out that you must use --pool=solo for the worker...
Me: Dies on the inside.
Ideally - I'd like to be able to use the prefork pool to do intense processing and to kill the task if need be. The thing is that everything that I read tells me prefork is what I want, but solo is the only way I can think of to get it to work.
Questions:
How bad is it for me to compromise these desires and just go with solo, expecting that the tasks will be heavy on CPU and there will be many users? Assume 100s if not 1000s submitting tasks at once.
What other solutions should I consider?
In my experience on Windows I cannot use anything other than --pool=solo.
What other solutions should I consider?
The way I do it is to use the solo pool for Windows development and more processes in production (Linux); at least in my case, using the solo pool for development is fine.
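For illustration, that split amounts to launching the worker differently per environment (the app name and the concurrency value below are placeholders):
For Windows development:
celery -A proj worker -l info --pool=solo
For production (Linux):
celery -A proj worker -l info --pool=prefork --concurrency=8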

How to make Celery worker consume single task and exit

How do I make the celery -A app worker command consume only a single task and then exit?
I want to run celery workers as a kubernetes Job that finishes after handling a single task.
I'm using KEDA for autoscaling workers according to queue messages.
I want to run celery workers as jobs for long running tasks, as suggested in the documentation:
KEDA long running execution
There's not really anything specific for this. You would have to hack in your own driver program, probably via a custom concurrency module. Are you trying to use Keda ScaledJobs or something? You would just use a ScaledObject instead.
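For what it's worth, here is a rough sketch of what such a hack could look like; it is not a supported Celery feature, and the app name, broker URL, and task are placeholders. The idea is to run the worker with --concurrency=1 and have it ask itself to shut down once its first task finishes, so the Kubernetes Job can complete:

import socket

from celery import Celery
from celery.signals import task_postrun

app = Celery("proj", broker="redis://localhost:6379/0")

@app.task
def long_running_job(payload):
    ...  # the actual work the Kubernetes Job exists for

@task_postrun.connect
def shutdown_after_first_task(**kwargs):
    # target only this pod's worker; assumes the default nodename celery@<hostname>
    app.control.shutdown(destination=[f"celery@{socket.gethostname()}"])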

Celery How to make a worker run only when other workers are broken?

I have two servers, there is one celery worker on each server. I use Redis as the broker to collaborate with workers.
My question is: how can I make only one worker run most of the time, and once this worker breaks, have another worker turn on as a backup?
Basically, just take one worker as a back-up.
I know how to route a task to a certain worker via a dedicated queue on that worker, after reading the docs [http://docs.celeryproject.org/en/latest/userguide/routing.html#redis-message-priorities].
This is, in my humble opinion, completely against the point of having a distributed system to off-load CPU-heavy or long-running tasks, or to run thousands of small tasks that you can't run elsewhere...
- You are running two servers anyway, so why keep the other one idle? More workers mean you will be able to process more tasks concurrently.
If you are not convinced and still want to do this, you need to write a tiny service on the machine with the idle Celery worker. This service will periodically check the health of the active worker, and if that check fails, it will start the Celery worker on the backup server.
Here is a question for you: why does this service not simply restart the Celery worker on the active server? It is entirely possible to do that, so again, I see no justification for having a completely idle machine doing nothing. If you are on a cloud platform, you can easily spin up a new instance from an existing image of your Celery worker. This is the scenario I use in production.
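A very rough sketch of that tiny service (hostnames, app name, and timings are placeholders; app.control.ping is used only as a liveness probe here):

import subprocess
import time

from celery import Celery

app = Celery("proj", broker="redis://broker-host:6379/0")
backup = None

while True:
    # ping only the primary worker's nodename; an empty reply list means it is down
    replies = app.control.ping(destination=["celery@primary-host"], timeout=5.0)
    if not replies and backup is None:
        backup = subprocess.Popen(["celery", "-A", "proj", "worker", "-l", "info"])
    elif replies and backup is not None:
        backup.terminate()  # the primary is back, so stand the backup down
        backup = None
    time.sleep(30)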

Using maxtasksperchild with eventlet

We have a python application with some celery workers.
We use the next command to start celery worker:
python celery -A proj worker --queue=myqueue -P prefork --maxtasksperchild=500
We have two issues with our celery workers.
We have a memory leak
We have pretty big load and we need a lot of workers to process everything fast
We're still looking into the memory leak, but since it's legacy code it's pretty hard to find the cause and it will take some time to resolve. To prevent leaks we're using --maxtasksperchild, so each worker process restarts itself after processing 500 tasks. And it works OK; memory only grows to a certain level.
The second issue is a bit harder. To process all events from our Celery queue we have to start more workers. But with prefork each process eats a lot of memory (about 110M in our case), so we either need a lot of servers to start the right number of workers or we have to switch from prefork to eventlet:
python celery -A proj worker --queue=myqueue -P eventlet --concurrency=10
In this case we'll use the same amount of memory (about 110M per process), but each process will run 10 workers, which is much more memory-efficient. The issue with this is that we still have issue #1 (the memory leak), and we can't use --maxtasksperchild because it doesn't work with eventlet.
Any thoughts on how to use something like --maxtasksperchild with eventlet?
Upgrade Celery; I've just quickly scanned the master branch, and they promise max-memory-per-child. Hopefully it works with all concurrency models. I haven't tried it yet.
Set up process monitoring and send a graceful terminate signal to workers above a memory threshold. Works for me.
Run Celery in a control group (cgroup) with limited memory. Works for me.
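If the max-memory-per-child option has landed in the version you upgrade to, it slots into the question's own command line (the value is in KiB, and 200000 below is just a placeholder):
celery -A proj worker --queue=myqueue -P prefork --maxtasksperchild=500 --max-memory-per-child=200000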

How to load balance celery tasks across several servers?

I'm running Celery on multiple servers, each with a concurrency of 2 or more, and I want to load-balance Celery tasks so that the server with the lowest CPU usage processes them.
For example, let's say I have 2 servers (A and B), each with a concurrency of 2. If I have 2 tasks in the queue, I want A to process one task and B to process the other. But currently it's possible that the first process on A will execute one task and the second process on A will execute the second task while B is sitting idle.
Is there a simple way, by means of Celery extensions or config, that I can route tasks to the server with the lowest CPU usage?
The best option is to use celery.send_task from the producing server, then deploy the workers onto n instances. The workers can then be run as #ealeon mentioned, using celery -A proj worker -l info -Ofair.
This way, load will be distributed across all servers without having to have the codebase present on the consuming servers.
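A minimal sketch of that producer side (broker URL, task name, and queue are placeholders; send_task dispatches by task name, so the task's code only has to exist on the workers):

from celery import Celery

app = Celery(broker="redis://broker-host:6379/0")

# the producing server never imports the task's code; it just names it
result = app.send_task("tasks.process", args=[42], queue="myqueue")
print(result.id)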
Try http://docs.celeryproject.org/en/latest/userguide/optimizing.html#guide-optimizing:
"You can disable this prefetching behavior by enabling the -Ofair worker option:
$ celery -A proj worker -l info -Ofair"