I have configured django-celery with RabbitMQ on my server. Currently I have only one node for my tasks.
I have tried celery-flower, events, celerycam, etc. for monitoring the worker/task status and it worked well.
My problem is:
I want to send a mail notification if a worker goes down for some reason.
I thought of creating a cron job that runs every 5 minutes and checks the status of the worker (not sure this is the correct way).
Is there any other extension or way to do this without cron?
Run your workers using supervisor. There's an example in the documentation. Then, take a look at this answer for how to send an email when the worker process goes down.
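To make that concrete, here is a minimal sketch (not taken from the linked answer) of a supervisor event listener that mails you when a supervised process enters the FATAL state; you would register it in supervisord.conf as an eventlistener subscribed to PROCESS_STATE_FATAL. The SMTP host and addresses are placeholders.
# fatal_mailer.py -- sketch of a supervisor event listener; addresses and
# SMTP host are placeholders, not part of the original answer.
import smtplib
import sys
from email.message import EmailMessage
from supervisor import childutils  # ships with the supervisor package

def send_alert(payload):
    msg = EmailMessage()
    msg["Subject"] = "Celery worker went down"
    msg["From"] = "alerts@example.com"   # placeholder
    msg["To"] = "ops@example.com"        # placeholder
    msg.set_content(payload)
    with smtplib.SMTP("localhost") as smtp:  # assumes a local MTA
        smtp.send_message(msg)

def main():
    while True:
        # Block until supervisor delivers the next event on stdin.
        headers, payload = childutils.listener.wait(sys.stdin, sys.stdout)
        if headers["eventname"] == "PROCESS_STATE_FATAL":
            send_alert(payload)
        childutils.listener.ok(sys.stdout)

if __name__ == "__main__":
    main()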
I am using Celery with a RabbitMQ server. I have a publisher, which could potentially be terminated by a SIGKILL, and since this signal cannot be watched, I cannot revoke the tasks. What would be a common approach to revoking the tasks when the publisher is not alive anymore?
I experimented with an interval on the worker side, but the publisher is obviously not registered as a worker, so I don't know how I can detect a timeout.
There's nothing built into Celery to monitor the producer/publisher status -- only the worker/consumer status. One alternative you can consider is an expiring Redis key that is updated periodically by the publisher and serves as a proxy for whether the publisher is alive. The task then checks whether the flag for that publisher still exists in Redis, and if it doesn't, it returns without doing anything.
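A minimal sketch of that expiring-key idea, assuming Redis on localhost and a 30-second TTL (the key name and numbers are illustrative):
import redis
from celery import shared_task

r = redis.Redis(host="localhost", port=6379, db=0)

# Publisher side: refresh the heartbeat key more often than its TTL,
# e.g. every 10 seconds for a 30-second expiry.
def publisher_heartbeat(publisher_id):
    r.set(f"publisher-alive:{publisher_id}", "1", ex=30)

# Worker side: bail out early if the publisher's heartbeat key has
# expired, i.e. the publisher is presumed dead.
@shared_task
def do_work(publisher_id, payload):
    if not r.exists(f"publisher-alive:{publisher_id}"):
        return None  # publisher gone, do nothing
    return payload.upper()  # placeholder for the real processing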
I am pretty sure what you want is not possible with Celery, so I suggest you shift your logic around and redesign everything to be part of a Celery workflow (or several Celery canvases, depending on the actual use case). My experience with Celery is that you can build practically any workflow you can imagine with the Celery primitives and/or custom Celery signatures.
Another solution, which works in my case, is to add the next task only if the currently processed ones are finished. In this case the queue doesn't fill up.
Celery seems to be a great tool, but I have a hard time understanding how the various Celery components work together:
The workers
The apps
The tasks
The message Broker (like RabbitMQ)
From what I understand, the command line:
celery -A not-clear-what-this-option-is worker
should run some sort of celery "worker server" which would itself need to connect to a broker server (I'm not so sure why so many servers are needed).
Then in any Python code, a task may be sent to the worker by instantiating an app:
app = Celery('my_module', broker='pyamqp://guest@localhost//')
and then by decorating functions with this app in the following way:
@app.task
def my_func():
    ...
so that "my_func()" can now be called as "my_func.delay()" to be run in an asynchronous way.
Here are my questions:
What happens when my_func.delay() is called? Which server talks to which first, and what is sent where?
What is the option to put after the "-A" of the celery command? Is it really needed?
Suppose I have a process X which instantiates a Celery app to launch task A, and suppose I have another process Y that wants to know the status of task A launched by X. I assume there is a way for Y to do so, but I don't know how. I suppose that Y should create its own instance of a Celery app. But then:
What function should Y call on its Celery app to get this information (and what is the "identifier" of task A inside process Y)?
How does this work in terms of communication, that is, when does the request go through the broker, and when does it go to the worker(s)?
If anyone has some information about these questions, I would be grateful. I intend to use Celery in a Django project, where some requests to the server can trigger various time-consuming tasks, and/or inquire about the status of previously launched tasks (pending, finished, error, etc.).
About the broker:
The main role of the broker is to mediate communication between the client and the worker
basically a lot of information is being generated and processed while your worker is running
taking care of this information is the broker's role
e.g. you can configure redis so that no information is lost if the server is shut down while running a process
The worker:
you can think of the worker as an instance independent of your application, which will only execute those tasks that you delegate to it
About the state of a task:
there are ways to consult celery to find out the status of a task, but I would not recommend building your application logic depending on this
if you want to get the output of one process and turn it into the input of another one, using tasks, I would recommend you use a queue (see the sketch below)
run task A, and before it finishes, insert its result objects into the queue
task B will listen to the queue and process whatever comes up
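A rough sketch of that hand-off, using a Redis list as the queue (the key name and the use of Redis are my assumptions; any queue with blocking reads would do):
import json
import redis
from celery import shared_task

r = redis.Redis()

@shared_task
def task_a(item):
    result = item * 2  # placeholder computation
    # Hand the result off before finishing, as described above.
    r.rpush("results-queue", json.dumps(result))

@shared_task
def task_b():
    # Block until something shows up on the queue, then process it.
    _, raw = r.blpop("results-queue")
    return json.loads(raw)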
The command:
on the terminal you can see in more detail what each argument means by running celery -h or celery --help
but the argument basically specifies which Celery app instance you intend to run. So normally this argument will indicate where the instance you have configured and intend to execute can be found (see the example module sketched after the usage text below)
usage: celery [-h] [-A APP] [-b BROKER] [--result-backend RESULT_BACKEND]
[--loader LOADER] [--config CONFIG] [--workdir WORKDIR]
[--no-color] [--quiet]
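For illustration, the -A argument typically points at a module like the following (the project name proj is just an example):
# proj/celery.py -- then you start a worker with: celery -A proj worker
from celery import Celery

app = Celery('proj', broker='pyamqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y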
I hope this can provide an initial overview for those who get here
Celery is used to make functions run in the background. Imagine you have a web API that does a job and returns a response. You know that job would seriously affect the response time of the API. So you'll transfer that particular job to Celery, and your API will respond instantly. Examples of jobs that affect the performance of an API are:
Routing to email servers
Routing to SMS Gateways
Database backup
Chained database operations
File conversion
Now, let's cover each component of Celery.
The workers
Celery workers execute the job (function). They are asynchronous. A common rule of thumb is to run twice as many worker processes as you have processor cores. You can assign a name and tasks to a Celery worker.
The apps
The app is the name of the project you're working on. You'll have to specify that name in the Celery instance.
The tasks
The functions you need to be executed in the background. Every task Celery executes will have a task id, a state (and more). You can get those by inspecting a particular task.
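As a small sketch of that inspection (assuming a task named my_func and a result backend configured on the app instance; both are assumptions, not part of the answer):
from celery.result import AsyncResult

result = my_func.delay(1, 2)  # returns an AsyncResult immediately
print(result.id)              # the task id
print(result.state)           # e.g. PENDING, STARTED, SUCCESS, FAILURE

# Another process can rebuild the handle from the id alone:
same_result = AsyncResult(result.id, app=app)
print(same_result.state)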
The message Broker
The tasks that will be executed in the background have to be moved from your Python project to the Celery workers. Message brokers act as the medium here. So functions with their arguments are transferred to the broker, and the Celery workers fetch them from there to execute.
Some commands
celery -A project_name worker -n worker_name
celery -A project_name inspect active
More in documentation
docs.celeryproject.org
I am trying to figure out how to deal with Celery in my Django project when it goes offline for some reason.
I found the worker-offline event in the documentation, so I am guessing I can somehow catch this event when Celery goes offline and email myself that my Celery worker is down.
My question is: how do I implement this behaviour? Are there any examples or a Django app? Is this how I am supposed to deal with these situations?
In my production setup, I use supervisor to daemonize and control the Celery workers. You can have a script running on the system that monitors the state of the supervisor-controlled processes (fire an alert any time a process enters the FATAL state).
Assuming your workers continuously write text to a log file, you can set up a log freshness check. If you detect that the log has not been updated for X seconds/minutes, you fire an alert.
There's also Celery Flower, a tool for monitoring and managing Celery remotely. I have not used it in production, so I cannot tell whether it meets your specific needs.
You can also process events in real time: write a consumer that reacts as events come in. An example can be found in the Monitoring and Management Guide; replace the task-failed event with the worker-offline event and write your own handler to deal with the event you captured.
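Adapted as described (a worker-offline handler instead of task-failed; the broker URL and the mail hook are placeholders), the monitoring example looks roughly like this:
from celery import Celery

app = Celery(broker='amqp://guest@localhost//')  # placeholder broker URL

def my_monitor(app):
    state = app.events.State()

    def announce_offline_worker(event):
        state.event(event)
        print('Worker %s went offline' % event.get('hostname'))
        # plug in django.core.mail.send_mail or smtplib here

    with app.connection() as connection:
        recv = app.events.Receiver(connection, handlers={
            'worker-offline': announce_offline_worker,
            '*': state.event,
        })
        recv.capture(limit=None, timeout=None, wakeup=True)

if __name__ == '__main__':
    my_monitor(app)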
I ended up creating this Django Reusable App with a REST API to monitor my celery workers from an external service or machine:
https://github.com/psychok7/django-celery-inspect
I'm working on a Python-based system to enqueue long-running tasks to workers.
The tasks originate from an outside service that generates a "token", but once they're created based on that token, they should run continuously and stop only when explicitly removed by code.
The task starts a WebSocket and loops on it. If the socket is closed, it reopens it. Basically, the task shouldn't reach conclusion.
My goals in architecting this solution are:
When gracefully restarting a worker (for example to load new code), the task should be re-added to the queue, and picked up by some worker.
The same thing should happen when an ungraceful shutdown happens.
Two workers shouldn't work on the same token.
Other processes may create more tasks that should be directed to the same worker that's handling a specific token. This will be resolved by sending those tasks to a queue named after the token, which the worker should start listening to after starting the token's task. I am listing this requirement as an explanation of why a task engine is even required here.
Independent servers, fast code reload, etc. - Minimal downtime per task.
All our server side is Python, and looks like Celery is the best platform for it.
Are we using the right technology here? Any other architectural choices we should consider?
Thanks for your help!
According to the docs
When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates, so if these tasks are important you should wait for it to finish before doing anything drastic (like sending the KILL signal).
If the worker won’t shutdown after considerate time, for example because of tasks stuck in an infinite-loop, you can use the KILL signal to force terminate the worker, but be aware that currently executing tasks will be lost (unless the tasks have the acks_late option set).
You may get something like what you want by using retry or acks_late (see the sketch below).
Overall I reckon you'll need to implement some extra application-side job control, plus, maybe, a lock service.
But, yes, overall you can do this with celery. Whether there are better technologies... that's out of the scope of this site.
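A hedged sketch of the acks_late/retry angle mentioned above: with acks_late=True the message is only acknowledged after the task returns, so a killed worker leaves it on the queue to be redelivered. The task body and retry policy here are illustrative assumptions, not your actual code.
from celery import Celery

app = Celery('tokens', broker='amqp://guest@localhost//')  # placeholder URL

def run_websocket_loop(token):
    # Placeholder for the real websocket loop described in the question.
    raise ConnectionError("socket closed")

@app.task(bind=True, acks_late=True, max_retries=None)
def watch_token(self, token):
    try:
        run_websocket_loop(token)
    except ConnectionError as exc:
        # Re-queue ourselves instead of dying if the socket drops.
        raise self.retry(exc=exc, countdown=5)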
My workers are stopping after a few (<50) tasks.
I have a very simple client/worker setup. The client posts the tasks via func.delay(...), then enters a while loop to wait for the completion of all the tasks (i.e. checking the ready() method of each AsyncResult). I use RabbitMQ for both the broker and the result backend.
The setup works... for a while. After a few tasks, the client doesn't receive anything and the worker seems to be idle (there is no output in the console anymore).
(The machine I work on is a bit old, so a resource problem is not impossible. Still, with 50 tasks that run for 2 seconds each, I cannot say the system is under heavy load.)
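For reference, a minimal sketch of the client loop described above (assuming a Celery task named func; the task name and count are illustrative):
import time

results = [func.delay(i) for i in range(50)]  # func is the Celery task

while not all(r.ready() for r in results):
    time.sleep(1)  # poll until every AsyncResult reports ready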