Celery not processing tasks every time - Python

I have the following configuration for Celery:
celery = Celery(__name__,
                broker=os.environ.get('CELERY_BROKER_URL', 'redis://'),
                backend=os.environ.get('CELERY_BROKER_URL', 'redis://'))
celery.config_from_object(APP_SETTINGS)
ssl = celery.conf.get('REDIS_SSL', True)
r = redis.StrictRedis(REDIS_BROKER, int(REDIS_BROKER_PORT), 0,
                      charset='utf-8', decode_responses=True, ssl=ssl)
db_uri = celery.conf.get('SQLALCHEMY_DATABASE_URI')

@celery.task
def process_task(data):
    # some code here
I am calling process_task inside an API endpoint like:
process_task.delay(data)
Sometimes it processes the tasks and sometimes it does not.
Can someone help me resolve this issue?
I am running the worker like this: celery worker -A api.celery --loglevel=DEBUG --concurrency=10

Once all the worker processes are busy, new tasks will just sit in the queue waiting for the next idle worker process to pick them up. This is most likely why you perceive this as "not processing tasks every time". Go through the monitoring and management section of the Celery documentation to find out how to monitor your Celery cluster. For starters, run celery -A api.celery inspect active to check the currently running tasks.
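If you prefer to check this from Python instead of the command line, the same information is available through the app's control/inspect API. A minimal sketch, assuming the celery app instance from the question can be imported (the import path below is a guess based on -A api.celery):
from api.celery import celery  # assumed import path

insp = celery.control.inspect()
print(insp.active())     # tasks each worker process is currently executing
print(insp.reserved())   # tasks prefetched by workers but not yet started
print(insp.scheduled())  # tasks waiting on an ETA/countdown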

Related

How do I get a Celery worker to consume an 'outside' RabbitMQ queue?

I have the following scripts:
celery_tasks.py
from celery import Celery
app = Celery(broker='amqp://guest:guest@localhost:5672//')
app.conf.task_default_queue = 'test_queue'

@app.task(acks_late=True)
def test(a):
    return a
publish.py
from celery_tasks import test
test.delay('abc')
When I run publish.py and start the worker (celery -A celery_tasks worker --loglevel=DEBUG), the 'abc' content is published to 'test_queue' and consumed by the worker.
Is there a way for the worker to consume something from a queue that was not posted by Celery? For example, when I put something in test_queue directly through RabbitMQ, without going through the Celery publisher, and run the Celery worker, I get the following warning:
[WARNING/MainProcess] Received and deleted unknown message. Wrong destination?!?
The full contents of the message body was: body: 'abc' (3b)
{content_type:None content_encoding:None
delivery_info:{'exchange': '', 'redelivered': False, 'delivery_tag': 1, 'consumer_tag': 'None2', 'routing_key': 'test_queue'} headers={}}
Is there a way to solve this?
Celery messages have a specific format and a set of headers that need to be present for a message to be valid. You would therefore have to reverse engineer that format to produce a Celery-compliant message outside of Celery.
Keep in mind that Celery is not really made to send plain messages across the broker, but to send tasks, which are enhanced messages and therefore carry extra information in the header part of the AMQP message.
It's a late answer, but custom consumers might help you. I'm using this to consume messages from RabbitMQ that are published by another app with pika:
http://docs.celeryproject.org/en/latest/userguide/extending.html#custom-message-consumers
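For reference, a custom consumer boils down to a ConsumerStep registered on the app, roughly as in the sketch below (adapted from the linked guide; the queue, exchange and broker URL are placeholders):
from celery import Celery, bootsteps
from kombu import Consumer, Exchange, Queue

outside_queue = Queue('test_queue', Exchange('test_queue'), 'test_queue')
app = Celery(broker='amqp://guest:guest@localhost:5672//')

class OutsideMessageConsumer(bootsteps.ConsumerStep):
    # Attach an extra kombu Consumer to the worker for the external queue.
    def get_consumers(self, channel):
        return [Consumer(channel,
                         queues=[outside_queue],
                         callbacks=[self.handle_message],
                         accept=['json', 'text/plain'])]

    def handle_message(self, body, message):
        print('Received message: {0!r}'.format(body))
        message.ack()

app.steps['consumer'].add(OutsideMessageConsumer)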

Celery tasks on multiple machines

I have a server where I installed a RabbitMQ broker and two Celery consumers (main1.py and main2.py), both connected to the same broker.
In the first consumer (main1.py), I implemented a Celery beat schedule that repeatedly sends a different task to a specific queue:
app = Celery('tasks', broker=..., backend=...)
app.conf.task_routes = ([
    ('tasks.beat', {'queue': 'print-queue'}),
],)
app.conf.beat_schedule = {
    'beat-every-10-seconds': {
        'task': 'tasks.beat',
        'schedule': 10.0
    },
}

@app.task(name='tasks.beat', bind=True)
def beat(self):
    for i in range(10):
        app.send_task("tasks.print", args=[i], queue="print-queue")
    return None
In the second consumer (main2.py), I implemented the task mentioned above:
app = Celery('tasks', broker=..., backend=...)
app.conf.task_routes = ([
    ('tasks.print', {'queue': 'print-queue'}),
],)

@app.task(name='tasks.print', bind=True)
def print(self, name):
    return name
When I start the two Celery workers:
consumer1: celery worker -A main1 -Q print-queue --beat
consumer2: celery worker -A main2 -Q print-queue
I get these errors:
[ERROR/MainProcess] Received unregistered task of type 'tasks.print'
on the first consumer
[ERROR/MainProcess] Received unregistered task of type 'tasks.beat'
on the second consumer
Is it possible to split tasks across different Celery applications that are both connected to the same broker?
Thanks in advance!
Here's what is happening. You have two workers, A and B, one of which also happens to be running celery beat (say that one is B).
celery beat submits tasks.beat to the queue. All this does is enqueue a message in RabbitMQ with some metadata, including the name of the task.
One of the two workers reads the message. Both A and B are listening to the same queue, so either may read it.
a. If A reads the message, it will try to find the task called tasks.beat. This blows up because A doesn't define that task.
b. If B reads the message, it will successfully find the task called tasks.beat (since it does have that task) and run the code. tasks.beat will then enqueue a new message in RabbitMQ containing the metadata for tasks.print.
The same problem occurs again because only one of A and B defines tasks.print, but either may get the message.
In practice, Celery may be doing some checks that surface the error earlier, but I'm fairly certain this is the underlying problem.
In short, all workers (including beat) on a queue should be running the same code.
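If you do want to keep the task definitions in separate applications, one way out (a sketch, not the only option; the queue names are illustrative) is to give each task its own queue and start each worker only on the queue whose task it defines:
from celery import Celery

app = Celery('tasks', broker='amqp://guest:guest@localhost:5672//')

# Route each task name to its own queue so a message is only ever
# delivered to a worker that actually defines that task.
app.conf.task_routes = {
    'tasks.beat':  {'queue': 'beat-queue'},
    'tasks.print': {'queue': 'print-queue'},
}

# consumer1 (defines tasks.beat):  celery worker -A main1 -Q beat-queue --beat
# consumer2 (defines tasks.print): celery worker -A main2 -Q print-queue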

celery apply_async choking rabbit mq

I am using celery's apply_async method to queue tasks. I expect about 100,000 such tasks to run every day (and that number will only go up). I am using RabbitMQ as the broker. I ran the code a few days back and RabbitMQ crashed after a few hours. I noticed that apply_async creates a new queue for each task with x-expires set at 1 day. My hypothesis is that RabbitMQ chokes when so many queues are being created. How can I stop celery from creating these extra queues for each task?
I also tried passing the queue parameter to apply_async and assigned an x-message-ttl to that queue. Messages did go to this new queue, however they were immediately consumed and never reached the 30-second TTL I had set. And this did not stop celery from creating those extra queues.
Here's my code:
views.py
from celery import task, chain
chain(
    task1.s(a),
    task2.s(b),
).apply_async(link_error=error_handler.s(a), queue="async_tasks_queue")
tasks.py
from celery import shared_task
from celery.result import AsyncResult

@shared_task
def error_handler(uuid, a):
    # Handle error
    ...

@shared_task
def task1(a):
    # Do something
    return a

@shared_task
def task2(a, b):
    # Do something more
    ...
celery.py
app = Celery(
    'app',
    broker=settings.QUEUE_URL,
    backend=settings.QUEUE_URL,
)
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
app.amqp.queues.add("async_tasks_queue",
                    queue_arguments={'durable': True, 'x-message-ttl': 30000})
From the celery logs:
[2016-01-05 01:17:24,398: INFO/MainProcess] Received task:
project.tasks.task1[615e094c-2ec9-4568-9fe1-82ead2cd303b]
[2016-01-05 01:17:24,834: INFO/MainProcess] Received task:
project.decorators.wrapper[bf9a0a94-8e71-4ad6-9eaa-359f93446a3f]
RabbitMQ had 2 new queues by the names "615e094c2ec945689fe182ead2cd303b" and "bf9a0a948e714ad69eaa359f93446a3f" when these tasks were executed
My code is running on Django 1.7.7, celery 3.1.17 and RabbitMQ 3.5.3.
Any other suggestions for executing tasks asynchronously are also welcome.
Try using a different result backend - I recommend Redis. When we tried using RabbitMQ as both broker and backend, we found it ill suited to the backend role: the AMQP result backend creates a new, short-lived queue per task result, which matches the extra queues you are seeing.
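A minimal sketch of that change (the URLs below are placeholders): keep RabbitMQ as the broker and point only the result backend at Redis, so results no longer create a queue per task.
from celery import Celery

app = Celery(
    'app',
    broker='amqp://guest:guest@localhost:5672//',  # RabbitMQ stays the broker
    backend='redis://localhost:6379/0',            # results are stored in Redis instead
)
If you don't need the task results at all, setting CELERY_IGNORE_RESULT = True (or ignore_result=True on the tasks) also avoids storing results entirely.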

Celery run worker with -Ofair from python

I have a Celery setup with RabbitMQ. The issue is that Celery moves tasks to the reserved state while a long task is running, and does not execute them until the long-running task is completed.
I want to accomplish this without using routing, and enabling the -Ofair flag does the job (see "Prefork pool prefetch settings" in the Celery docs).
How do I enable the flag from Python? Thanks.
I am using celery 3.1.19
$ celery report
software -> celery:3.1.19 (Cipater) kombu:3.0.32 py:3.4.3
billiard:3.3.0.22 py-amqp:1.4.8
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled
I am using Celery as follows and concurrency is set to 4:
app = celery.Celery()
app.conf.update(
    BROKER_URL=broker,
    CELERY_RESULT_BACKEND=backend,
    CELERY_TASK_SERIALIZER='json',
    CELERY_IMPORTS=imports or [],
    CELERYD_CONCURRENCY=concurrency,
    CELERYD_HIJACK_ROOT_LOGGER=False
)
Here is how I start the worker:
worker = app.Worker(
    hostname=hostname,
    queues=[hostname]
)
worker.start()
You should be able to run it this way:
worker = app.Worker(
    hostname=hostname,
    queues=[hostname],
    optimization='fair'
)
worker.start()
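If you would rather pass the flag exactly as on the command line, another option that should work (behaviour may vary between Celery versions) is to forward command-line style arguments through worker_main:
# `app` is the celery.Celery() instance configured above
app.worker_main(['worker', '--loglevel=INFO', '-Ofair'])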

How to list the queued items in celery?

I have a Django project on an Ubuntu EC2 node, which I have been using to set up an asynchronous task queue using Celery.
I am following http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/ along with the docs.
I've been able to get a basic task working at the command line, using:
(env1)ubuntu@ip-172-31-22-65:~/projects/tp$ celery --app=myproject.celery:app worker --loglevel=INFO
I just realized that I have a bunch of tasks in my queue that have not executed:
[2015-03-28 16:49:05,916: WARNING/MainProcess] Restoring 4 unacknowledged message(s).
(env1)ubuntu@ip-172-31-22-65:~/projects/tp$ celery -A tp purge
WARNING: This will remove all tasks from queue: celery.
There is no undo for this operation!
(to skip this prompt use the -f option)
Are you sure you want to delete all tasks (yes/NO)? yes
Purged 81 messages from 1 known task queue.
How do I get a list of the queued items from the command line?
If you want to get all scheduled tasks,
celery inspect scheduled
To find all active queues
celery inspect active_queues
For status
celery inspect stats
For all commands
celery inspect
If you want to inspect it explicitly: since you are using Redis as the queue, open the Redis CLI
redis-cli
> KEYS *        # list all keys
Then find the key related to Celery (the default queue is a Redis list named "celery")
> LLEN celery   # gives the length of the list, i.e. the number of queued messages
Here is a copy-paste solution for Redis:
def get_celery_queue_len(queue_name):
    from yourproject.celery import app as celery_app
    with celery_app.pool.acquire(block=True) as conn:
        return conn.default_channel.client.llen(queue_name)


def get_celery_queue_items(queue_name):
    import base64
    import json
    from yourproject.celery import app as celery_app
    with celery_app.pool.acquire(block=True) as conn:
        tasks = conn.default_channel.client.lrange(queue_name, 0, -1)
        decoded_tasks = []
        for task in tasks:
            j = json.loads(task)
            body = json.loads(base64.b64decode(j['body']))
            decoded_tasks.append(body)
    return decoded_tasks
It works with Django. Just don't forget to change yourproject.celery.
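For example, a hypothetical call against the default queue (named "celery", as seen in the purge output above):
items = get_celery_queue_items('celery')
print('%d queued messages' % len(items))
for body in items:
    print(body)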
