Is Celery mostly just a high-level interface to message queues like RabbitMQ? I am trying to set up a system with multiple scheduled workers doing concurrent HTTP requests, but I am not sure whether I need either of them. My other question: if I am using Celery or RabbitMQ, where do I write the actual task code for the workers to execute?
RabbitMQ is indeed a message queue, and Celery uses it to send messages to and from workers. But Celery is more than just an interface for RabbitMQ: it is what you use to create workers, kick off tasks, and define your tasks. Your use case sounds like a good fit for Celery/RabbitMQ. You create a task using the @app.task decorator; check the docs for more info. In previous projects, I've set up a module for Celery where I define any tasks I need. You can then pull in functions from other modules to use in your tasks.
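For example, a minimal task module might look like this (a sketch; the module name, broker URL, and task are illustrative placeholders):

# tasks.py -- a minimal Celery app with one task
from celery import Celery
import requests  # third-party HTTP client; any HTTP library works here

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task
def fetch_status(url):
    # runs inside a worker process, so many of these can run concurrently
    return requests.get(url, timeout=10).status_code

You would start a worker with celery -A tasks worker --loglevel=INFO and enqueue work from any other code with fetch_status.delay('https://example.com').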
Celery is the task management framework: the API you use to schedule jobs, the code that gets those jobs started, and the management tools (e.g. Flower) you use to monitor what's going on.
RabbitMQ is one of several "brokers" (message transports) that Celery can use. It's an oversimplification to say that Celery is a high-level interface to RabbitMQ: RabbitMQ is not actually required for Celery to run and do its job properly. But in practice they are often paired together, and Celery is a higher-level way of accomplishing things that you could do at a lower level with just RabbitMQ (or another message broker).
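As a concrete sketch, switching brokers is usually just a configuration change; the URLs here are illustrative:

from celery import Celery

# RabbitMQ as the message broker
app = Celery('proj', broker='amqp://guest@localhost//')

# ...or Redis instead, with no change to the task code
# app = Celery('proj', broker='redis://localhost:6379/0')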
Related
The debate I am in currently is whether we should stick with our RabbitMQ implementation using Pika or move to Celery, and what advantages there are if we go with Celery. From what I have understood, Celery is a distributed task queue that simplifies the management of task distribution. It uses a broker (RabbitMQ, Redis, and so on) for sending and receiving messages between client and worker, and it can optionally use a backend such as Redis to store the results.
RabbitMQ, on the other hand, is a message queue which can be used to perform jobs asynchronously. Ultimately, if we use RabbitMQ and implement it ourselves with Pika in Python, it will do the same job, which is to execute long-running processes in the background.
The few advantages that I see in using Celery are:
Can store the result of each task, using a backend (such as Redis).
Easier to implement.
Allows adding retries (see the sketch after this list).
Is a distributed task queue that can run on multiple nodes/clusters.
Packages like Flower can be used to monitor each task, its state, result, time taken, and other metadata.
Task chaining.
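For illustration, retries and chaining look roughly like this (a minimal sketch; the broker/backend URLs and tasks are placeholders):

from celery import Celery, chain
import requests

app = Celery('tasks',
             broker='amqp://guest@localhost//',
             backend='redis://localhost:6379/0')  # the backend stores each task's result

@app.task(bind=True, max_retries=3, default_retry_delay=5)
def fetch(self, url):
    try:
        return requests.get(url, timeout=10).text
    except requests.RequestException as exc:
        raise self.retry(exc=exc)  # re-queue the task, up to max_retries times

@app.task
def count_words(text):
    return len(text.split())

# chaining: fetch's return value is passed on to count_words
result = chain(fetch.s('https://example.com'), count_words.s())()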
But on the other side, it seems to restrict us from using some of RabbitMQ's features, and it has some limitations of its own, such as connecting to the broker synchronously (see this GitHub issue: https://github.com/celery/celery/issues/3884).
I am familiar with the question already asked here, Why use Celery instead of RabbitMQ?, but the answer there does not seem clear to me.
Any help would be highly appreciated.
Erm... it seems to me like you are comparing mosquitoes and elephants here. RabbitMQ+Pika is not a replacement for Celery. However, RabbitMQ+Pika can help you implement a (miniature) service such as Celery, if that is really what you want.
If you use RabbitMQ as the broker, Celery (actually Kombu) will use something similar to Pika, namely the py-amqp project maintained under the Celery organization, to communicate with the broker.
I am new to Celery. In this example, I am unable to figure out how to separate the logic of publisher and consumer. Does the command celery -A tasks worker --loglevel=INFO start a worker for publishing or for consuming?
If add.delay(4, 4) pushes data onto a queue, how do I connect to the same queue from a separate code file and consume it?
Publishers are typically either Celery beat (scheduler), custom scripts that you develop, or other tasks executed by Celery workers in your cluster.
Consumers are EXCLUSIVELY Celery workers. Unless you dig really deep into Celery/Kombu and implement your own consumer, you are pretty much unable to write one easily.
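To make the split concrete, here is a minimal sketch (file names and broker/backend URLs are illustrative). The task definition is shared; the worker command is the consumer, and any process that calls .delay() is a publisher:

# tasks.py -- shared by publisher and consumer
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//', backend='rpc://')

@app.task
def add(x, y):
    return x + y

# publish.py -- the publisher: pushes a message, does not run the function itself
from tasks import add

result = add.delay(4, 4)
print(result.get(timeout=10))  # blocks until a worker (the consumer) has run it

Running celery -A tasks worker --loglevel=INFO starts the consuming side; running publish.py in a separate process is the publishing side.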
In my system, users are allowed to set a notification schedule: they can choose any date and time at which they want to receive messages. I have discovered a Python mechanism called Celery that executes tasks asynchronously. This leads to a few questions:
How do I integrate Celery with a user interface?
Are there any Celery alternatives?
Is it a panacea?
What you are looking for is something to process background tasks submitted to a queue from your web server. To that end, Celery is a good option and easy to configure. A more comprehensive list can be found here. None of these options integrates with a user interface directly; they integrate with your web server. They can queue jobs based on what is sent from the client side, which could be included as part of handling the request-response flow.
Also, this article provides a good reference for how to schedule periodic tasks using Celery.
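As a sketch of per-user scheduling, a task can be queued for a specific datetime with apply_async and its eta argument (send_notification and its module are hypothetical names):

from datetime import datetime, timezone
from tasks import send_notification  # hypothetical task module

# run the task at the exact date/time the user chose
when = datetime(2025, 6, 1, 9, 30, tzinfo=timezone.utc)
send_notification.apply_async(args=['user-42', 'Your reminder'], eta=when)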
I have a service that needs a sort of coordinator component. The coordinator will manage entities that need to be assigned to users, taken away from users if they do not respond in a timely manner, and also handle user responses when they do respond. The coordinator will also need to contact messaging services to notify users that they have something to handle.
I want the coordinator to be a single-threaded process, as the load is not expected to be too much for the first few years of usage, and I'd much rather postpone all the concurrency issues to when I really need to handle them (if at all).
The coordinator will receive new entities and user responses from a Django webserver. I thought the easiest way to handle this is with Celery tasks - the webserver just starts a task that the coordinator consumes on its own time.
For this to happen, I need the coordinator to contain a Celery worker and replace the current worker mainloop with my own version (one that checks the broker for a new message and handles the scheduling).
How feasible is it? The alternative is to avoid Celery and use RabbitMQ directly. I'd rather not do that.
Replace these names: coordinator with RabbitMQ (or some other broker Kombu supports), and users with Celery workers.
I am pretty sure you can do all you need (and much more) just by configuring Celery/Kombu and RabbitMQ, without writing too many (if any) lines of code.
Small note: Celery features scheduled tasks.
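For instance, the "take it away if the user does not respond in time" flow could be sketched with Celery's countdown scheduling alone; all task and helper names here are hypothetical:

from celery import Celery

app = Celery('coordinator', broker='amqp://guest@localhost//')

# hypothetical helpers, stubbed for the sketch
def notify_user(user_id, entity_id): ...
def has_responded(user_id, entity_id): return False
def reassign(entity_id): ...

@app.task
def assign_entity(entity_id, user_id):
    notify_user(user_id, entity_id)
    # schedule a timeout check instead of blocking the coordinator
    check_response.apply_async(args=[entity_id, user_id], countdown=3600)

@app.task
def check_response(entity_id, user_id):
    if not has_responded(user_id, entity_id):
        reassign(entity_id)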
I am using Celery + Kombu with Amazon SQS.
The goal is to be able to remove a task already scheduled for some specific datetime.
I've tried
from celery.task.control import revoke
revoke(task_id)
but that didn't change anything. Is revoke not implemented for the SQS transport? Is there some design decision behind it, or is it just a missing feature that should be implemented by some "DeleteMessage" line of code?
Unless you're using RabbitMQ, it's better to come up with a custom solution for revoking tasks. E.g., instead of scheduling the tasks themselves, build a system of two components: a table of pending tasks, and a scheduler task that scans that table and executes tasks when their time comes. Then there is no need to revoke anything; you simply decide not to execute a task when needed.
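A minimal sketch of that two-component approach (the persistence helpers are hypothetical stubs):

from celery import Celery

app = Celery('tasks', broker='sqs://')  # SQS broker; credentials come from the environment

# run the scanning task every minute via celery beat
app.conf.beat_schedule = {
    'dispatch-due-jobs': {'task': 'tasks.dispatch_due', 'schedule': 60.0},
}

# hypothetical persistence helpers, stubbed for the sketch
def fetch_due_jobs(): return []   # rows whose scheduled time has arrived
def mark_dispatched(job_id): ...

@app.task
def dispatch_due():
    for job in fetch_due_jobs():
        if not job.cancelled:     # "revoking" is just flipping this flag in the table
            run_job.delay(job.id)
        mark_dispatched(job.id)

@app.task
def run_job(job_id):
    ...                           # the actual work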