How to start remote celery workers from django - python

I'm trying to use django in combination with celery.
In the process I came across autodiscover_tasks() and I'm not fully sure how to use it. The celery workers get tasks added by other applications (in this case a node backend).
So far I used this to start the worker:
celery worker -Q extraction --hostname=extraction_worker
which works fine.
Now I'm not sure what the general idea of the django-celery integration is. Should workers still be started externally (e.g. with the command above), or should they be managed and started from the django application?
My celery.py looks like:
import os
from celery import Celery
from django.conf import settings

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'main.settings')
app = Celery('app')
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Then I have 2 apps, each containing a tasks.py file with:
from celery import shared_task

@shared_task
def extraction(total):
    return 'Task executed'
How can I now get django to register the worker for those tasks?

You just start the worker process as documented; you don't need to register anything else.
In a production environment you’ll want to run the worker in the
background as a daemon - see Daemonization - but for testing and
development it is useful to be able to start a worker instance by
using the celery worker manage command, much as you’d use Django’s
manage.py runserver:
celery -A proj worker -l info
For a complete listing of the command-line options available, use the
help command:
celery help
The celery worker collects/registers tasks when it starts, and it also consumes the tasks it has discovered.
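For example, a minimal sketch based on the question's setup (assuming the project package is main, the queue is extraction, and "myapp" stands in for one of your Django apps): once the worker is running, any Django view or shell session can queue the autodiscovered task.

# queue the task from Django code or the shell, assuming the worker was
# started with: celery -A main worker -Q extraction -l info
from myapp.tasks import extraction  # "myapp" is a placeholder for one of your apps

# route the call to the "extraction" queue that the worker consumes
result = extraction.apply_async(args=[10], queue='extraction')
print(result.id)  # the task id; the worker picks the message up from the broker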

Related

How do celery workers communicate in Heroku

I have some celery workers in a Heroku app. My app is using Python 3.6 and django; these are the relevant dependencies and their versions:
celery==3.1.26.post2
redis==2.10.3
django-celery==3.2.2
I do not know if they are useful to this question, but just in case. On Heroku we are running the Heroku-18 stack.
As is usual, we have our workers declared in a Procfile, with the following content:
web: ... our django app ....
celeryd: python manage.py celery worker -Q celery --loglevel=INFO -O fair
one_type_of_worker: python manage.py celery worker -Q ... --maxtasksperchild=3 --loglevel=INFO -O fair
another_type: python manage.py celery worker -Q ... --maxtasksperchild=3 --loglevel=INFO -O fair
So, my current understanding of this process is the following:
Our celery queues run on multiple workers, each worker runs as a dyno on Heroku (not a server, but a “worker process” kind of thing, since servers aren’t a concept on Heroku). We also have multiple dynos running the same celery worker with the same queue, which results in multiple parallel “threads” for that queue to run more tasks simultaneously (scalability).
The web workers, celery workers, and celery queues can talk to each other because celery manages the orchestration between them. I think it's specifically the broker that handles this responsibility. But for example, this lets our web workers schedule a celery task on a specific queue and it is routed to the correct queue/worker, or a task running in one queue/worker can schedule a task on a different queue/worker.
Now here is where my question comes in: how do the workers communicate? Do they use an API endpoint on localhost with a port? RPC? Do they use the broker URL? Magic?
I'm asking this because I'm trying to replicate this setup in ECS and I need to know how to set it up for celery.
Here is an explanation of how celery works on Heroku: https://devcenter.heroku.com/articles/celery-heroku
You can't run celery on Heroku without getting a Heroku dyno for celery. Also, make sure you have Redis configured in your Django celery settings.
To run celery on Heroku, you just add this line to your Procfile:
worker: celery -A YOUR-PROJECT_NAME worker -l info -B
Note: the above celery command will run both the celery worker and celery beat.
If you want to run them separately, you can use separate commands, but a single command is recommended.
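To make the broker side of the question concrete, here is a minimal settings sketch (hypothetical, assuming a Redis broker exposed via the REDIS_URL config var, as in the question's dependencies): every dyno, web or worker, reads the same broker URL, and all "communication" is just messages published to and consumed from that broker, never a direct connection between dynos.

# settings.py sketch (assumption: Redis broker via the REDIS_URL config var)
import os

BROKER_URL = os.environ.get('REDIS_URL', 'redis://localhost:6379/0')
CELERY_RESULT_BACKEND = os.environ.get('REDIS_URL', 'redis://localhost:6379/0')

# the web dyno only calls .delay()/.apply_async(), i.e. it publishes a message;
# whichever worker dyno listens on that task's queue consumes it from Redis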

Flask Error: Unable to load celery application

Please help me get out of this problem. I am getting this error when I run:
celery -A app.celery worker --loglevel=info
Error:
Unable to load celery application.
The module app.celery was not found.
My code is:
# Celery Configuration
from celery import Celery
from app import app

print("App Name=", app.import_name)
celery = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)

@celery.task
def download_content():
    return "hello"
Directory structure:
newYoutube/app/auth/routes.py, and this function is present inside routes.py.
auth is blueprint.
When invoking celery via
celery -A app.celery ...
celery will look for the name celery in the app namespace, expecting it to hold an instance of Celery. If you put that elsewhere (say, in app.auth.routes), then celery won't find it.
I have a working example you can crib from at https://github.com/davewsmith/flask-celery-starter
Or, refer to chapter 22 of the Flask Mega-Tutorial, which uses rq instead of celery, but the general approach to structuring the code is similar.
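For instance, a minimal layout sketch (assuming the package is named app and a Redis broker; this is not the asker's exact code) that makes celery -A app.celery work, because the Celery instance is a module-level name called celery inside the app package:

# app/__init__.py
from celery import Celery
from flask import Flask

app = Flask(__name__)
app.config['CELERY_BROKER_URL'] = 'redis://localhost:6379/0'  # hypothetical broker URL

# `celery -A app.celery worker` imports the "app" package and looks up this name
celery = Celery(app.import_name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)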

django with celery: how to set up periodic tasks with admin interface

I have a problem with setting up periodic tasks with celery.
I got the scheduler running with:
celery -A myproject beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
It seems as if the scheduler is up and running my task.
In the admin interface, I can see/edit the task.
But it does nothing. IMHO, the file myproject.backend.tasks.importnewvideo.py should be executed.
But it is not.
In the celery manual, I could not find any further information on how to set up a task with the admin interface.
Any ideas?
Thanks in advance.
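One thing worth checking, as a hedged note: with the DatabaseScheduler, beat only sends the task to the broker at its scheduled time; a separate worker process still has to be running to execute it. A minimal sketch of the pieces, assuming the project is called myproject as in the beat command above:

# settings.py sketch (assumption: django_celery_beat is installed and migrated)
INSTALLED_APPS = [
    # ...
    'django_celery_beat',
]
# with the CELERY settings namespace, this is equivalent to passing
# --scheduler django_celery_beat.schedulers:DatabaseScheduler on the command line
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'

# beat only *schedules*; a worker process must also consume the queue, e.g.:
#   celery -A myproject worker -l info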

Running two celery workers on a server for two django applications

I have a server on which two django applications are running: appone and apptwo.
For them, two celery workers are started with the commands:
celery worker -A appone -B --loglevel=INFO
celery worker -A apptwo -B --loglevel=INFO
Both point to the same BROKER_URL = 'redis://localhost:6379'
Redis is set up with db 0 and 1.
I can see the tasks configured in these two apps in both apps' logs, which is leading to warnings and errors.
Can we configure the django settings so that each celery worker works exclusively on its own tasks, without interfering with the other's?
You can route tasks to different queues: start each Celery worker with a different -Q myqueueX, and then use a different CELERY_DEFAULT_QUEUE in each of your two Django projects.
Depending on your Celery configuration, your Django setting should look something like:
CELERY_DEFAULT_QUEUE = 'myqueue1'
You can also have more fine-grained control with:
@celery.task(queue="myqueue3")
def some_task(...):
    pass
More options here:
How to keep multiple independent celery queues?
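Concretely, a sketch of the split (appone_queue and apptwo_queue are placeholder queue names, not from the question):

# appone/settings.py
CELERY_DEFAULT_QUEUE = 'appone_queue'

# apptwo/settings.py
CELERY_DEFAULT_QUEUE = 'apptwo_queue'

# then start each worker so it only consumes its own queue:
#   celery worker -A appone -B -Q appone_queue --loglevel=INFO
#   celery worker -A apptwo -B -Q apptwo_queue --loglevel=INFO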

Celery Tasks Not Being Processed

I'm trying to process some tasks using celery, and I'm not having too much luck. I'm running celeryd and celerybeat as daemons. I have a tasks.py file that looks like this, with a simple app and task defined:
from celery import Celery

app = Celery('tasks', broker='amqp://user:pass@hostname:5672/vhostname')

@app.task
def process_file(f):
    # do some stuff
    # and log results
    pass
And this file is referenced from another file, process.py, which I use to monitor for file changes and which looks like:
from tasks import process_file
file_name = '/file/to/process'
result = process_file.delay(file_name)
result.get()
And with that little bit of code, celery is unable to see the tasks and process them. I can execute similar code in the python interpreter and celery processes them:
>>> from tasks import process_file
>>> process_file.delay('/file/to/process')
<AsyncResult: 8af23a4e-3f26-469c-8eee-e646b9d28c7b>
When I run the tasks from the interpreter, however, beat.log and worker1.log don't show any indication that the tasks were received, but using logging I can confirm that the task code was executed. There are also no obvious errors in the .log files. Any ideas what could be causing this problem?
My /etc/default/celerybeat looks like:
CELERY_BIN="/usr/local/bin/celery"
CELERYBEAT_CHDIR="/opt/dirwithpyfiles"
CELERYBEAT_OPTS="--schedule=/var/run/celery/celerybeat-schedule"
And /etc/default/celeryd:
CELERYD_NODES="worker1"
CELERY_BIN="/usr/local/bin/celery"
CELERYD_CHDIR="/opt/dirwithpyfiles"
CELERYD_OPTS="--time-limit=300 --concurrency=8"
CELERYD_USER="celery"
CELERYD_GROUP="celery"
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
CELERY_CREATE_DIRS=1
So I figured out my issue here by running celery from the CLI instead of as a daemon, which let me see more detailed output of the errors that happened. I did this by running:
user@hostname /opt/dirwithpyfiles $ su celery
celery@hostname /opt/dirwithpyfiles $ celery -A tasks worker --loglevel=info
There I could see that a permissions issue was occurring as the celery user, which did not happen when I ran the commands from the python interpreter as my normal user. I fixed this by changing the permissions of /file/to/process so that both users could read it.
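A small follow-up sketch (not from the original answer; the path is taken from the question) for confirming readability as the worker's user before dispatching:

# run this as the "celery" user (e.g. after `su celery`) to confirm it can read the file
import os

path = '/file/to/process'
print('readable by current user:', os.access(path, os.R_OK))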
