celery task not sent or executed - python

I'm new to Celery. I followed some tutorials and set up Celery with Docker, but I'm having trouble sending and executing a task.
I have 4 Docker containers: one for the RabbitMQ server, one for the Celery producer, and 2 workers.
Celery tasks file:
"""
CELERY MAIN FILE
"""
from celery import Celery
from time import sleep
celery_obj = Celery()
celery_obj.config_from_object('celery_config')  # config file we created in the same folder

@celery_obj.task
def add(num1, num2):
    print("executing add function")
    sleep(5)
    return num1 + num2
My celery config file for Producer:
"""
CELERY CONFIGURATION FILE
"""
from kombu import Exchange, Queue
broker_url = "pyamqp://rabbitmq_user:123@172.17.0.2/res_opt_rabbitmq_vhost"
result_backend = 'rpc://'
#celery_result_backend = ""
celery_imports = ('res_opt_code.tasks')
task_queues = (
    Queue('worker_A_kombu_queue', Exchange('celery', type='direct'), routing_key='worker_A_rabbitmq_queue'),
    Queue('worker_B_kombu_queue', Exchange('celery', type='direct'), routing_key='worker_B_rabbitmq_queue')
)
Config file for worker_A:
"""
CELERY CONFIGURATION FILE
"""
from kombu import Exchange, Queue
broker_url = "pyamqp://rabbitmq_user:123@172.17.0.2/res_opt_rabbitmq_vhost"
result_backend = 'rpc://'
#celery_result_backend = ""
celery_imports = ('worker_code.tasks')
task_queues = (
    Queue('worker_A_kombu_queue', Exchange('celery', type='direct'), routing_key='worker_A_rabbitmq_queue'),
    Queue('worker_B_kombu_queue', Exchange('celery', type='direct'), routing_key='worker_B_rabbitmq_queue')
)
Command for starting celery on producer:
celery -A tasks worker --loglevel=DEBUG -f log_file.txt
command for starting celery on worker:
celery -A tasks worker -n celery_worker_A -Q worker_A_kombu_queue --loglevel=DEBUG
Function call from producer:
from tasks import add
add.apply_async([4,4],routing_key='worker_A_rabbitmq_queue')
# also tried executing the function locally, but there are no logs from the function and the task stays pending
add.delay(4,4)
Could you please help me figure out what I'm doing wrong here?
In the logs I can see that worker_A connected, but there are no log entries for the function.

After further troubleshooting, I changed the apply_async argument from routing_key to queue, and it works with the queue argument.
I was following this tutorial:
https://www.youtube.com/watch?v=TM1a3m65zaA
old:
add.apply_async([4,4],routing_key='worker_A_rabbitmq_queue')
new:
add.apply_async([4,4],queue='worker_A_rabbitmq_queue')
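The difference between a queue name and a routing key is easier to see with a toy model of a direct exchange. The sketch below is plain Python, not Celery or RabbitMQ code, and the `DirectExchange` class with its `bind` and `publish` methods is a hypothetical name made up for illustration. It only assumes the standard AMQP rule that a direct exchange delivers a message to the queues bound with exactly the message's routing key. The queue and key names are the ones from the config above.

```python
class DirectExchange:
    """Toy stand-in for an AMQP direct exchange (illustration only)."""

    def __init__(self):
        self.bindings = {}  # routing key -> list of bound queue names
        self.queues = {}    # queue name -> list of delivered messages

    def bind(self, queue_name, routing_key):
        """Bind a queue to this exchange under the given routing key."""
        self.queues.setdefault(queue_name, [])
        self.bindings.setdefault(routing_key, []).append(queue_name)

    def publish(self, message, routing_key):
        """Deliver to every queue whose binding key equals routing_key."""
        targets = list(self.bindings.get(routing_key, []))
        for queue_name in targets:
            self.queues[queue_name].append(message)
        return targets


exchange = DirectExchange()
exchange.bind("worker_A_kombu_queue", routing_key="worker_A_rabbitmq_queue")
exchange.bind("worker_B_kombu_queue", routing_key="worker_B_rabbitmq_queue")

message = {"task": "add", "args": [4, 4]}
hit = exchange.publish(message, routing_key="worker_A_rabbitmq_queue")
miss = exchange.publish(message, routing_key="worker_A_kombu_queue")

print(hit)   # ['worker_A_kombu_queue']
print(miss)  # []
```

In this model the queue name and the routing key are separate identifiers, so a message only arrives when the key used at publish time matches a binding. How Celery itself resolves `queue=` and `routing_key=` also depends on settings such as the default queue, so treat this as a mental model rather than a description of Celery internals.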

Related

How to determine the name of a task in celery?

I have a FastAPI app from which I want to call a Celery task.
I cannot import the task because the two live in different code bases, so I have to call it by name.
In tasks.py:
imagery = Celery(
    "imagery", broker=os.getenv("BROKER_URL"), backend=os.getenv("REDIS_URL")
)
...
@imagery.task(bind=True, name="filter")
def filter_task(self, **kwargs) -> Dict[str, Any]:
    print('running task')
The celery worker is running with this command:
celery worker -A worker.imagery -P threads --loglevel=INFO --queues=imagery
Now in my FastAPI code base I want to run the filter task.
So my understanding is I have to use the celery.send_task() function
In app.py I have
import os

from celery import Celery, states
from celery.execute import send_task
from fastapi import FastAPI
from starlette.responses import JSONResponse, PlainTextResponse
from app import models

app = FastAPI()
tasks = Celery(broker=os.getenv("BROKER_URL"), backend=os.getenv("REDIS_URL"))

@app.post("/filter", status_code=201)
async def upload_images(data: models.FilterProductsModel):
    """
    TODO: use a celery task(s) to query the database and upload the results to S3
    """
    data = ['ok', 'un test']
    result = tasks.send_task('workers.imagery.filter', args=list(data))
    return PlainTextResponse(f"here is the id: {str(result.ready())}")
After calling the /filter endpoint, I don't see any task being picked up by the worker.
So I tried different names in send_task():
filter
imagery.filter
worker.imagery.filter
Why does my task never get picked up by the worker, and why does nothing show in the log?
Is my task name wrong?
Edit:
The worker process runs in Docker. Here is the full path of the file on its disk.
tasks.py : /workers/worker.py
So if I follow the import scheme, the name of the task would be workers.worker.filter, but this does not work: nothing gets printed in the Docker logs. Is a print supposed to appear in the STDOUT of the celery CLI?
Your Celery worker is subscribed to the imagery queue only. On the other hand, you are sending the task to the default queue (if you did not change the configuration, that queue is named celery) with result = tasks.send_task('workers.imagery.filter', args=list(data)). It is not surprising that the worker never executes the task, since you have been sending tasks to the default queue the whole time.
To fix this, try the following:
result = tasks.send_task('workers.imagery.filter', args=list(data), queue='imagery')
OP Here.
This is the solution I used.
task = signature("filter", kwargs=data.dict(), queue="imagery")
res = task.delay()
As mentioned by @DejanLekic, I had to specify the queue.

Reload celery beat config

I'm using celery and celery-beat without Django and I have a task which needs to modify celery-beat schedule when run.
Now I have the following code (module called celery_tasks):
# __init__.py
from .celery import app as celery_app
__all__ = ['celery_app']

# celery.py
from celery import Celery
import config
celery_config = config.get_celery_config()
app = Celery(
    __name__,
    include=[
        'celery_tasks.tasks',
    ],
)
app.conf.update(celery_config)

# tasks.py
from celery_tasks import celery_app
from celery import shared_task

@shared_task
def start_game():
    celery_app.conf.beat_schedule = {
        'process_round': {
            'task': 'celery_tasks.tasks.process_round',
            'schedule': 5,
        },
    }
I start celery with the following command:
celery worker -A celery_tasks -E -l info --beat
start_game executes and exits normally, but the beat process_round task never runs.
How can I force-reload the beat schedule (restarting all workers doesn't seem like a good idea)?
The problem with the default scheduler is that when the celerybeat process starts, it creates a schedule file and writes all tasks and schedules into that file, so the schedule cannot change dynamically.
You can use the celerybeat-sqlalchemy-scheduler package, which lets you edit the schedule in the database so that celerybeat picks up the new schedule from there.
There is also another package, celery-redbeat, which uses a Redis server as its backend.
Using the schedule config for this also seems like a bad idea. What if the process_round task is always active and, when the game has not started, simply does nothing?
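The last suggestion (keep process_round scheduled all the time and let it return early until the game starts) can be sketched without touching the beat schedule at all. The example below is plain Python with hypothetical names: `game_state` is an in-memory dict standing in for whatever shared store (a database or Redis, for instance) both `start_game` and the workers can reach, which the original post does not specify.

```python
# Hypothetical shared state. In a real deployment this would live in a
# database or cache that every worker process can read and write.
game_state = {"started": False, "round": 0}


def start_game():
    """Flip a flag instead of editing the beat schedule at runtime."""
    game_state["started"] = True


def process_round():
    """Body of the periodic task: a no-op until the game has started."""
    if not game_state["started"]:
        return None
    game_state["round"] += 1
    return game_state["round"]


before = process_round()  # beat fires, game not started: nothing happens
start_game()
after = [process_round(), process_round()]

print(before, after)  # None [1, 2]
```

With this shape the schedule entry for process_round can stay static in the config, so beat never needs to be reloaded.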

Django celerybeat periodic task only runs once

I am trying to schedule a task that runs every 10 minutes using Django 1.9.8, Celery 4.0.2, RabbitMQ 2.1.4, Redis 2.10.5. These are all running within Docker containers in Linux (Fedora 25). I have tried many combinations of things that I found in Celery docs and from this site. The only combination that has worked thus far is below. However, it only runs the periodic task initially when the application starts, but the schedule is ignored thereafter. I have absolutely confirmed that the scheduled task does not run again after the initial time.
My (almost-working) setup that only runs one-time:
settings.py:
INSTALLED_APPS = (
    ...
    'django_celery_beat',
    ...
)
BROKER_URL = 'amqp://{user}:{password}@{hostname}/{vhost}/'.format(
    user=os.environ['RABBIT_USER'],
    password=os.environ['RABBIT_PASS'],
    hostname=RABBIT_HOSTNAME,
    vhost=os.environ.get('RABBIT_ENV_VHOST', '')
)
# We don't want to have dead connections stored on rabbitmq, so we have to negotiate using heartbeats
BROKER_HEARTBEAT = '?heartbeat=30'
if not BROKER_URL.endswith(BROKER_HEARTBEAT):
    BROKER_URL += BROKER_HEARTBEAT
BROKER_POOL_LIMIT = 1
BROKER_CONNECTION_TIMEOUT = 10
# Celery configuration
# configure queues, currently we have only one
CELERY_DEFAULT_QUEUE = 'default'
CELERY_QUEUES = (
    Queue('default', Exchange('default'), routing_key='default'),
)
# Sensible settings for celery
CELERY_ALWAYS_EAGER = False
CELERY_ACKS_LATE = True
CELERY_TASK_PUBLISH_RETRY = True
CELERY_DISABLE_RATE_LIMITS = False
# By default we will ignore result
# If you want to see results and try out tasks interactively, change it to False
# Or change this setting on tasks level
CELERY_IGNORE_RESULT = True
CELERY_SEND_TASK_ERROR_EMAILS = False
CELERY_TASK_RESULT_EXPIRES = 600
# Set redis as celery result backend
CELERY_RESULT_BACKEND = 'redis://%s:%d/%d' % (REDIS_HOST, REDIS_PORT, REDIS_DB)
CELERY_REDIS_MAX_CONNECTIONS = 1
# Don't use pickle as serializer, json is much safer
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
CELERY_ACCEPT_CONTENT = ['application/json']
CELERYD_HIJACK_ROOT_LOGGER = False
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_MAX_TASKS_PER_CHILD = 1000
celeryconf.py
# coding=UTF8
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "web_portal.settings")
app = Celery('web_portal')
CELERY_TIMEZONE = 'UTC'
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
tasks.py
from celery.schedules import crontab
from .celeryconf import app as celery_app

@celery_app.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
    # Calls email_scanner every 10 minutes
    sender.add_periodic_task(
        crontab(hour='*',
                minute='*/10',
                second='*',
                day_of_week='*',
                day_of_month='*'),
        email_scanner.delay(),
    )

@app.task
def email_scanner():
    dispatch_list = scanning.email_scan()
    for dispatch in dispatch_list:
        validate_dispatch.delay(dispatch)
    return
run_celery.sh -- Used to start celery tasks from docker-compose.yml
#!/bin/sh
# wait for RabbitMQ server to start
sleep 10
cd web_portal
# run Celery worker for our project myproject with Celery configuration stored in Celeryconf
su -m myuser -c "celery beat -l info --pidfile=/tmp/celerybeat-web_portal.pid -s /tmp/celerybeat-schedule &"
su -m myuser -c "celery worker -A web_portal.celeryconf -Q default -n default@%h"
I have also tried using CELERYBEAT_SCHEDULE in settings.py in lieu of the @celery_app.on_after_finalize.connect decorator and block in tasks.py, but the scheduler never ran even once.
settings.py (not working at all scenario)
(same as before except also including the following)
CELERYBEAT_SCHEDULE = {
    'email-scanner-every-5-minutes': {
        'task': 'tasks.email_scanner',
        'schedule': timedelta(minutes=10)
    },
}
The Celery 4.0.2 documentation online presumes that I should instinctively know many givens, but I am new in this environment. If anybody knows where I can find a tutorial OTHER THAN docs.celeryproject.org and http://django-celery-beat.readthedocs.io/en/latest/ which both assume that I am already a Django master, I would be grateful. Or let me know of course if you see something obviously wrong in my setup. Thanks!
I found a solution that works. I could not get CELERYBEAT_SCHEDULE or the celery task decorators to work, and I suspect that it may be at least partially due to the manner in which I started the Celery beat task.
The working solution goes the whole 9 yards to utilize Django Database Scheduler. I downloaded the GitHub project "https://github.com/celery/django-celery-beat" and incorporated all of the code as another "app" in my project. This enabled Django-Admin access to maintain the cron / interval / periodic task(s) tables via a browser. I also modified my run_celery.sh as follows:
#!/bin/sh
# wait for RabbitMQ server to start
sleep 10
# run Celery worker for our project myproject with Celery configuration stored in Celeryconf
celery beat -A web_portal.celeryconf -l info --pidfile=/tmp/celerybeat-web_portal.pid -S django --detach
su -m myuser -c "celery worker -A web_portal.celeryconf -Q default -n default@%h -l info "
After adding a scheduled task via the django-admin web interface, the scheduler started working fine.

Celery tasks don't work

The Celery docs say that Celery 3.1 works with Django out of the box, but my tasks are not running. I have this tasks.py:
from celery import task
from datetime import timedelta

@task.periodic_task(run_every=timedelta(seconds=20), ignore_result=True)
def disable_not_confirmed_users():
    print "start"
Configs:
from kombu import Exchange, Queue
CELERY_SEND_TASK_ERROR_EMAILS = True
BROKER_URL = 'amqp://guest@localhost//'
CELERY_DEFAULT_QUEUE = 'project-queue'
CELERY_DEFAULT_EXCHANGE = 'project-queue'
CELERY_DEFAULT_ROUTING_KEY = 'project-queue'
CELERY_QUEUES = (
    Queue('project-queue', Exchange('project-queue'), routing_key='project-queue'),
)
project/celery.py
from __future__ import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
from django.conf import settings
app = Celery('project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Run celery: celery -A project worker --loglevel=INFO
But nothing happens.
You should use celery beat to run periodic tasks.
celery -A project worker --loglevel=INFO
starts the worker, which does the actual work.
celery -A project beat
starts the beat service, which asks the worker to do the job.

How to list the queued items in celery?

I have a Django project on an Ubuntu EC2 node, which I have been using to set up an asynchronous task queue using Celery.
I am following http://michal.karzynski.pl/blog/2014/05/18/setting-up-an-asynchronous-task-queue-for-django-using-celery-redis/ along with the docs.
I've been able to get a basic task working at the command line, using:
(env1)ubuntu@ip-172-31-22-65:~/projects/tp$ celery --app=myproject.celery:app worker --loglevel=INFO
I just realized that I have a bunch of tasks in my queue that had not executed:
[2015-03-28 16:49:05,916: WARNING/MainProcess] Restoring 4 unacknowledged message(s).
(env1)ubuntu@ip-172-31-22-65:~/projects/tp$ celery -A tp purge
WARNING: This will remove all tasks from queue: celery.
There is no undo for this operation!
(to skip this prompt use the -f option)
Are you sure you want to delete all tasks (yes/NO)? yes
Purged 81 messages from 1 known task queue.
How do I get a list of the queued items from the command line?
If you want to get all scheduled tasks,
celery inspect scheduled
To find all active queues
celery inspect active_queues
For status
celery inspect stats
For all commands
celery inspect
If you want to look at the queue explicitly: since you are using Redis as the broker, run
redis-cli
>KEYS * #find all keys
Then look for a key related to celery and run
>LLEN KEY # returns the length of the list
Here is a copy-paste solution for Redis:
def get_celery_queue_len(queue_name):
    from yourproject.celery import app as celery_app
    with celery_app.pool.acquire(block=True) as conn:
        return conn.default_channel.client.llen(queue_name)

def get_celery_queue_items(queue_name):
    import base64
    import json
    from yourproject.celery import app as celery_app
    with celery_app.pool.acquire(block=True) as conn:
        tasks = conn.default_channel.client.lrange(queue_name, 0, -1)
    decoded_tasks = []
    for task in tasks:
        j = json.loads(task)
        body = json.loads(base64.b64decode(j['body']))
        decoded_tasks.append(body)
    return decoded_tasks
It works with Django. Just don't forget to change yourproject.celery.
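The decoding step in get_celery_queue_items can be tried without a Redis server. The sketch below runs the same two-stage decode (JSON envelope, then base64-encoded JSON body) on a hand-built sample. The sample envelope is an assumption and heavily simplified: real messages carry more fields, and their exact shape depends on the Celery and kombu versions and on the configured serializer (JSON is assumed here).

```python
import base64
import json


def decode_queue_item(raw):
    """Decode one raw list entry the same way get_celery_queue_items does."""
    envelope = json.loads(raw)
    return json.loads(base64.b64decode(envelope["body"]))


# Hand-built sample (assumption): the body holds [args, kwargs, embed].
sample_body = [[4, 4], {}, {"callbacks": None}]
sample_raw = json.dumps({
    "body": base64.b64encode(json.dumps(sample_body).encode()).decode(),
    "content-type": "application/json",
    "headers": {"task": "tasks.add"},
})

decoded = decode_queue_item(sample_raw)
print(decoded[0])  # [4, 4]
```

If the worker is configured with a different serializer (pickle, for example), the inner `json.loads` call would need to be replaced accordingly.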
