I have a Django application that I've deployed with Heroku. I'm trying to use Celery to create a periodic task that runs every minute. However, when I observe the logs for the worker using the following command:
heroku logs -t -p worker
I don't see my task being executed. Perhaps there is a step I'm missing? This is my configuration below...
Procfile
web: gunicorn activiist.wsgi --log-file -
worker: celery worker --app=trending.tasks.app
Tasks.py
import celery
app = celery.Celery('activiist')
import os
from celery.schedules import crontab
from celery.task import periodic_task
from django.conf import settings
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
app.conf.update(BROKER_URL=os.environ['REDIS_URL'],
                CELERY_RESULT_BACKEND=os.environ['REDIS_URL'])
os.environ['DJANGO_SETTINGS_MODULE'] = 'activiist.settings'
from trending.views import *
@periodic_task(run_every=crontab())
def add():
    getarticles(30)
One thing to add. When I run the task using the python shell and the "delay()" command, the task does indeed run (it shows in the logs) -- but it only runs once and only when executed.
You need a separate process for beat (which is responsible for sending periodic tasks to the queue on schedule):
web: gunicorn activiist.wsgi --log-file -
worker: celery worker --app=trending.tasks.app
beat: celery beat --app=trending.tasks.app
The worker is still needed, because beat only sends the periodic tasks to the queue and a worker has to execute them. The other possibility is to embed beat inside the worker:
web: gunicorn activiist.wsgi --log-file -
worker: celery worker --app=trending.tasks.app -B
but to quote the celery documentation:
You can also start embed beat inside the worker by enabling workers -B option, this is convenient if you will never run more than one worker node, but it’s not commonly used and for that reason is not recommended for production use
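A quick way to check a Procfile for this problem is to look for a process that starts beat, either as its own command or embedded in a worker with -B. The sketch below uses only the standard library; parse_procfile, starts_beat and procfile_runs_beat are hypothetical helper names, not part of Celery or Heroku, and the check is a simple heuristic on the command line.

```python
import shlex


def parse_procfile(text):
    """Return {process_name: argv list} for each 'name: command' line."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, command = line.partition(":")
        processes[name.strip()] = shlex.split(command)
    return processes


def starts_beat(argv):
    """True if a celery command line runs beat, standalone or embedded."""
    if "celery" not in argv:
        return False
    return "beat" in argv or "-B" in argv or "--beat" in argv


def procfile_runs_beat(text):
    """True if at least one process in the Procfile starts beat."""
    return any(starts_beat(argv) for argv in parse_procfile(text).values())


if __name__ == "__main__":
    procfile = (
        "web: gunicorn activiist.wsgi --log-file -\n"
        "worker: celery worker --app=trending.tasks.app\n"
    )
    print(procfile_runs_beat(procfile))
```

With the Procfile from the question the check reports that nothing starts beat, which matches the symptom: the worker is running, but nothing ever sends it the periodic task.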
I have some Celery workers in a Heroku app. My app uses Python 3.6 and Django; these are the relevant dependencies and their versions:
celery==3.1.26.post2
redis==2.10.3
django-celery==3.2.2
I do not know if they are relevant to this question, but I'm including them just in case. On Heroku we are running the Heroku-18 stack.
As usual, our workers are declared in a Procfile with the following content:
web: ... our django app ....
celeryd: python manage.py celery worker -Q celery --loglevel=INFO -O fair
one_type_of_worker: python manage.py celery worker -Q ... --maxtasksperchild=3 --loglevel=INFO -O fair
another_type: python manage.py celery worker -Q ... --maxtasksperchild=3 --loglevel=INFO -O fair
So, my current understanding of this process is the following:
Our celery queues run on multiple workers, each worker runs as a dyno on Heroku (not a server, but a “worker process” kind of thing, since servers aren’t a concept on Heroku). We also have multiple dynos running the same celery worker with the same queue, which results in multiple parallel “threads” for that queue to run more tasks simultaneously (scalability).
The web workers, celery workers, and celery queues can talk to each other because celery manages the orchestration between them. I think it's specifically the broker that handles this responsibility. But for example, this lets our web workers schedule a celery task on a specific queue and it is routed to the correct queue/worker, or a task running in one queue/worker can schedule a task on a different queue/worker.
Now here is my question: how do the workers communicate? Do they use an API endpoint on localhost with a port? RPC? Do they use the broker URL? Magic?
I'm asking this because I'm trying to replicate this setup in ECS and I need to know how to set it up for celery.
This article explains how Celery works on Heroku: https://devcenter.heroku.com/articles/celery-heroku
You can't run Celery on Heroku without a separate Heroku dyno for Celery. Also, make sure Redis is configured in your Django Celery settings.
To run Celery on Heroku, just add this line to your Procfile:
worker: celery -A YOUR-PROJECT_NAME worker -l info -B
Note: the command above runs both the Celery worker and Celery beat.
If you want to run them separately, you can use separate commands, but one command is recommended.
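On the question of how the processes talk to each other: as far as I know there is no localhost endpoint or direct worker-to-worker connection involved. Every web process and worker connects out to the broker named in the broker URL, publishes task messages there, and consumes from the queues it was started with. So when moving to ECS, the thing each container needs is the same broker URL and network access to that host and port. A small standard-library sketch for inspecting the URL (broker_endpoint is a hypothetical helper, and the fallback address is made up for illustration):

```python
import os
from urllib.parse import urlparse


def broker_endpoint(url):
    """Split a broker URL into the pieces a container needs to reach."""
    parts = urlparse(url)
    return {
        "transport": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # For Redis URLs the path holds the database number, e.g. "/0".
        "database": parts.path.lstrip("/") or None,
    }


if __name__ == "__main__":
    # REDIS_URL is the variable Heroku sets; the fallback is an assumed
    # local address used only so the example runs anywhere.
    url = os.environ.get("REDIS_URL", "redis://localhost:6379/0")
    print(broker_endpoint(url))
```

If every container resolves the same host, port and database from its environment, the web containers and all worker types are sharing one broker, which is what makes routing between queues work.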
I'm trying to use django in combination with celery.
That is how I came across autodiscover_tasks(), and I'm not fully sure how to use it. The Celery workers get tasks added by other applications (in this case a Node backend).
So far I used this to start the worker:
celery worker -Q extraction --hostname=extraction_worker
which works fine.
Now I'm not sure what the general idea of the django-celery integration is. Should workers still be started externally (e.g. with the command above), or should they be managed and started from the Django application?
My celery.py looks like:
import os
from celery import Celery
from django.conf import settings
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'main.settings')
app = Celery('app')
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Then I have two apps, each containing a tasks.py file with:
@shared_task
def extraction(total):
    return 'Task executed'
How can I now get Django to register the worker for those tasks?
You just start the worker process as documented; you don't need to register anything else:
In a production environment you’ll want to run the worker in the
background as a daemon - see Daemonization - but for testing and
development it is useful to be able to start a worker instance by
using the celery worker manage command, much as you’d use Django’s
manage.py runserver:
celery -A proj worker -l info
For a complete listing of the command-line options available, use the
help command:
celery help
The celery worker collects and registers the tasks when it starts, and then consumes the tasks it has discovered.
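Roughly speaking, autodiscovery means "for each installed app, try to import its tasks module". Importing the module runs the task decorators, and that is what registers the tasks with the app. The sketch below shows only that import step with the standard library; discover_task_modules is a hypothetical name and this is not Celery's actual implementation, which handles more edge cases.

```python
import importlib


def discover_task_modules(packages, related_name="tasks"):
    """Import '<package>.<related_name>' for every package that has one."""
    loaded = []
    for package in packages:
        module_name = f"{package}.{related_name}"
        try:
            importlib.import_module(module_name)
        except ModuleNotFoundError:
            # This package has no such module, so there is nothing to register.
            continue
        loaded.append(module_name)
    return loaded


if __name__ == "__main__":
    # Demonstrated on standard-library packages so it runs anywhere:
    # json has a 'decoder' submodule, email does not.
    print(discover_task_modules(["json", "email"], related_name="decoder"))
```

In a Django project the package list would be the installed apps and related_name would stay at "tasks", which is why an app only needs a tasks.py for its tasks to be picked up when the worker starts.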
I have a task:
class BasecrmSync(PeriodicTask):
    run_every = schedules.crontab(minute='*/1')
    def run(self, **kwargs):
        bc = basecrm.Client(access_token=settings.BASECRM_AUTH_TOKEN)
        sync = basecrm.Sync(client=bc, device_uuid=settings.BASECRM_DEVICE_UUID)
        sync.fetch(synchronize)
And a Celery config with the database broker:
CELERY_RESULT_BACKEND='djcelery.backends.database:DatabaseBackend'
BROKER_URL = 'django://'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
I run
celery -A renuval_api worker -B --loglevel=debug
But it doesn't run the task...
I've also tried running it with:
python3 manage.py celery worker --loglevel=DEBUG -E -B -c 1 --settings=renuval_api.settings.local
But then it uses the amqp transport and I can't understand why.
I run a separate process for the beat function itself. I could never get periodic tasks to fire otherwise. Of course, I may have this completely wrong, but it works for me and has for some time.
For example, I have the celery worker with its app running in one process:
celery worker --app=celeryapp:app -l info --logfile="/var/log/celery/worker.log"
And I have the beat pointed to the same app in its own process:
celery --app=celeryapp:app beat
They are pointed at the same app and settings, and beat fires off the task which the worker picks up and does. This app is in the same code tree as my Django apps, but the processes are not running in Django. Perhaps you could run something like:
python3 manage.py celery beat --loglevel=DEBUG --settings=renuval_api.settings.local
I hope that helps.
I'm stuck running Celery 3.1.17 on Windows 7 (and later on 2013 server) using Redis as the backend.
In my celery.py file I defined an app with one scheduled task:
from datetime import timedelta
from celery import Celery
app = Celery('myapp',
             backend='redis://localhost',
             broker='redis://localhost',
             include=['tasks']
             )
app.conf.update(
    CELERYBEAT_SCHEDULE={
        'dumdum': {
            'task': 'tasks.dumdum',
            'schedule': timedelta(seconds=5),
        }
    }
)
The task writes a line to a file:
@app.task
def dumdum():
    with open('c:/src/dumdum.txt','w') as f:
        f.write('dumdum actually ran !')
Running the beat service from the command line
(venv) celery beat -A tasks
celery beat v3.1.17 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> redis://localhost:6379/1
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> now (0s)
[2015-03-15 10:50:33,265: INFO/MainProcess] beat: Starting...
[2015-03-15 10:50:35,496: INFO/MainProcess] Scheduler: Sending due task dumdum (tasks.dumdum)
[2015-03-15 10:50:40,513: INFO/MainProcess] Scheduler: Sending due task dumdum (tasks.dumdum)
Looks promising, BUT NOTHING HAPPENS. Nothing is being written to the file.
The Celery documentation on running beat on Windows references this article from 2011. The article explains how to run celeryd as a scheduled task on Windows. celeryd has since been deprecated and the command stated in the article no longer works (there is no celery.bin.celeryd module).
So, what is the solution here?
Thanks.
I used the following command to run celery beat on Windows:
python manage.py celery beat
after following these steps for installation:
Run celery beat on windows
It worked perfectly fine for me!
Celery beat and the Celery worker cannot be run from the same project on Windows, since Celery 4.0 stopped supporting Windows for the worker and beat.
One thing you can do: suppose your project is named recommendation_system and your project hierarchy is as below:
recommendation_system
--your_app_dir
--main.py #or manage.py
--app.py
and you define the scheduler (for beat) and the worker functions in main.py. Then you have to make a copy of this project, let's say named recommendation_system_beat.
Now, to run the workers, go inside the recommendation_system directory and run this command:
python.exe -m celery -A main worker --pool=solo --concurrency=5 --loglevel=info -n main.%h --queues=recommendation
where recommendation is the queue name. Set the concurrency according to your needs.
This will run your workers, but beat will not run. To run beat,
go to recommendation_system_beat and run the following command:
python.exe -m celery -A main beat --loglevel=info
This will run your beat (scheduler).
So ultimately you need to run the worker and beat from two different copies of the project.
How do you diagnose why manage.py celerybeat won't execute any tasks?
I'm running celerybeat via supervisord with the command:
/usr/local/myapp/src/manage.py celerybeat --schedule=/tmp/celerybeat-schedule-myapp --pidfile=/tmp/celerybeat-myapp.pid --loglevel=INFO
Supervisord appears to run celerybeat just fine, and the log file shows:
[2013-06-12 13:17:12,540: INFO/MainProcess] Celerybeat: Starting...
[2013-06-12 13:17:12,571: WARNING/MainProcess] Reset: Account for new __version__ field
[2013-06-12 13:17:12,571: WARNING/MainProcess] Reset: Account for new tz field
[2013-06-12 13:17:12,572: WARNING/MainProcess] Reset: Account for new utc_enabled field
I have several periodic tasks showing as enabled on http://localhost:8000/admin/djcelery/periodictask which should run every few minutes. However, the celerybeat log never shows anything being executed. Why would this be?
celerybeat only schedules tasks; it won't execute them.
To execute the tasks you also need to start a worker. You can start celery beat and the worker together.
I use "celeryd -B".
In your case it should look like:
/usr/local/myapp/src/manage.py celery worker --beat --schedule=/tmp/celerybeat-schedule-myapp --pidfile=/tmp/celerybeat-myapp.pid --loglevel=INFO
or
/usr/local/myapp/src/manage.py celeryd -B --schedule=/tmp/celerybeat-schedule-myapp --pidfile=/tmp/celerybeat-myapp.pid --loglevel=INFO
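To make the division of labour concrete, here is a minimal standard-library sketch. It is not Celery code: Beat, Worker and the heartbeat task are hypothetical names, and a deque stands in for the broker. The beat side only decides that a task is due and puts its name on the queue; nothing runs until a worker takes the message off the queue.

```python
from collections import deque
from datetime import datetime, timedelta


class Beat:
    """Decides which tasks are due and enqueues their names; runs nothing."""

    def __init__(self, broker, schedule):
        self.broker = broker
        self.schedule = schedule      # task name -> timedelta interval
        self.last_sent = {}           # task name -> datetime of last send

    def tick(self, now):
        for name, interval in self.schedule.items():
            last = self.last_sent.get(name)
            if last is None or now - last >= interval:
                self.broker.append(name)
                self.last_sent[name] = now


class Worker:
    """Takes task names off the queue and actually calls the functions."""

    def __init__(self, broker, tasks):
        self.broker = broker
        self.tasks = tasks            # task name -> callable

    def drain(self):
        results = []
        while self.broker:
            name = self.broker.popleft()
            results.append(self.tasks[name]())
        return results


def demo():
    broker = deque()                  # stand-in for Redis or RabbitMQ
    beat = Beat(broker, {"heartbeat": timedelta(minutes=1)})
    worker = Worker(broker, {"heartbeat": lambda: "heartbeat ran"})

    start = datetime(2013, 6, 12, 13, 17, 0)
    for minutes in (0, 1, 2):
        beat.tick(start + timedelta(minutes=minutes))

    queued = len(broker)              # messages wait here until a worker runs
    return queued, worker.drain()


if __name__ == "__main__":
    print(demo())
```

If the worker part is never started, the queue just keeps growing, which is the situation where the beat log looks healthy but no task output ever appears.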
We recently upgraded from celery 4 to celery 5.
Apparently the -l flag has been removed, or re-named?
This works in Celery 4, but not in Celery 5:
celery -A pm -l info beat
Remove -l, or move it after the beat subcommand, which is where Celery 5 expects it:
celery -A pm beat
celery -A pm beat -l info
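For anyone with many old command lines to update, here is a small sketch of a helper that moves a -l/--loglevel option placed before the subcommand to just after it, which as far as I know is where Celery 5 accepts it. The function name and the list of subcommands are my own assumptions, and it does not handle an app or option value that happens to be named like a subcommand.

```python
import shlex

# Subcommands this sketch knows about (an assumed, incomplete list).
SUBCOMMANDS = {"worker", "beat", "events"}


def move_loglevel_after_subcommand(command):
    """Rewrite 'celery ... -l LEVEL ... beat' as 'celery ... beat -l LEVEL'."""
    args = shlex.split(command)
    sub_index = next((i for i, a in enumerate(args) if a in SUBCOMMANDS), None)
    if sub_index is None:
        return command  # no known subcommand, leave the command untouched

    head, moved = [], []
    i = 0
    while i < sub_index:
        arg = args[i]
        if arg in ("-l", "--loglevel") and i + 1 < sub_index:
            moved += [arg, args[i + 1]]   # option plus its separate value
            i += 2
        elif arg.startswith("--loglevel="):
            moved.append(arg)
            i += 1
        else:
            head.append(arg)
            i += 1
    return shlex.join(head + [args[sub_index]] + moved + args[sub_index + 1:])


if __name__ == "__main__":
    print(move_loglevel_after_subcommand("celery -A pm -l info beat"))
```

Commands that already have the option after the subcommand, or that have no log level at all, come back unchanged.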