I have a Django app setup with some scheduled tasks. The app is deployed on Heroku with Redis. The task runs if invoked synchronously in the console, or locally when I also have redis and celery running. However, the scheduled jobs are not running on Heroku.
My task:
#shared_task(name="send_emails")
def send_emails():
.....
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from celery.schedules import crontab
# set the default Django settings module for the 'celery' program.
# this is also used in manage.py
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')
# Get the base REDIS URL, default to redis' default
BASE_REDIS_URL = os.environ.get('REDIS_URL', 'redis://localhost:6379')
app = Celery('my_app')
# Using a string here means the worker don't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
app.conf.broker_url = BASE_REDIS_URL
# this allows you to schedule items in the Django admin.
app.conf.beat_scheduler = 'django_celery_beat.schedulers.DatabaseScheduler'
# These are the scheduled jobs
app.conf.beat_schedule = {
'send_emails_crontab': {
'task': 'send_emails',
'schedule': crontab(hour=9, minute=0),
'args': (),
}
}
In Procfile:
worker: celery -A my_app worker --beat -S django -l info
I've spun up the worker with heroku ps:scale worker=1 -a my-app.
I can see the registered tasks under [tasks] in the worker logs.
However, the scheduled tasks are not running at their scheduled time. Calling send_emails.delay() in the production console does work.
How do I get the worker to stay alive and / or run the job at the scheduled time?
I have a workaround using a command and heroku scheduler. Just unsure if that's the best way to do it.
If you're on free demo, you should know that heroku server sleeps and if your scheduled task becomes due when your server is sleeping, it won't run.
I share you any ideas.
Run console and get the datetime of Dyno. The Dyno use a localtime US.
The DynoFree sleeps each 30 minutes and only 450 hours/month.
Try change celery to BackgroundScheduler,
you need add a script clock.py as:
from myapp import myfunction
from datetime import datetime
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.blocking import BlockingScheduler
from time import monotonic, sleep, ctime
import os
sched = BlockingScheduler()
hour = int(os.environ.get("SEARCH_HOUR"))
minutes = int(os.environ.get("SEARCH_MINUTES"))
#sched.scheduled_job('cron', day_of_week='mon-sun', hour=hour, minute = minutes)
def scheduled_job():
print('This job: Execute myfunction every at ', hour, ':', minutes)
#My function
myfunction()
sched.start(
)
In Procfile:
clock: python clock.py
and run:
heroku ps:scale clock=1 --app thenameapp
Regards.
Related
I've problem to creating and adding beat schedule task to the celery worker and running that when the celery worker is currently running
well this is my code
from celery import Celery
from celery.schedules import crontab
app = Celery('proj2')
app.config_from_object('proj2.celeryconfig')
#app.task
def hello() -> None:
print(f'Hello')
app.add_periodic_task(10.0, hello)
so when i am running the celery
$ celery -A proj2 worker --beat -E
this works very well and the task will run every 10 seconds
But
I need to create a task with schedule and sending that to the worker to run it automatically
Imagine i have run the worker celery and every things is ok
well i go to the python shell and do following code
>>> from proj2.celery import app
>>>
>>> app.conf.beat_schedule = {
... 'hello-every-5': {
... 'tasks': 'proj2.tasks.bye' # i have a task called bye in tasks module
... 'schedule': 5.0,
... }
...}
>>>
it's not working. Also there is no error
but it seems does not send to the worker
p.s: i have also used add_periodic_task method. still not working
I'm creating a web app that takes username as input and perform some tasks with it. I'm using flask for the web and heroku to deploy it.I've two main python scripts these run the app. app.py and task.py. Everything is okey into my app.py file, where i used flask code for making the app but into my task.py file I've a
Initialisation step
from instabot import Bot
bot = Bot()
bot.login(username="myusername", password="mypassword")
now I deployed these script into the Procfile like
worker: python task.py
web: gunicorn app:app
but after deploying when i check it's log I'm getting
logs
2022-01-28T18:30:06.376795+00:00
heroku[worker.1]: Starting process
with command `python task.py`
2022-01-
28T18:30:07.081356+00:00
heroku[worker.1]: State changed
from starting to up
2022-01-
28T18:30:08.213687+00:00
heroku[worker.1]: Process exited
with status 0
2022-01-
28T18:30:08.274498+00:00
heroku[worker.1]: State changed
from up to crashed
as you can see it's getting crashed!! i don't know what's my mistake is can you please help me to figure out : (
The code in my worker is almost stolen from the redis-rq documentation, with some changes to make it work also in our local development enviroment:
import os
from rq import Queue, Connection
from tools import get_redis_store
# we need to use a different worker when we are in Heroku
if 'DYNO' in os.environ:
from rq.worker import HerokuWorker as Worker
else:
from rq import Worker
redis_url = os.getenv('REDIS_URL')
if not redis_url:
raise RuntimeError('Set up Redis first.')
listen = ['default']
conn = get_redis_store()
if __name__ == '__main__':
with Connection(conn):
worker = Worker(map(Queue, listen))
worker.work()
On tools.py
_redis_store = None
def get_redis_store():
'''
Get a connection pool to redis based on the url configured
on env variable REDIS_URL
Returns
-------
redis.ConnectionPool
'''
global _redis_store
if not _redis_store:
redis_url = os.getenv('REDIS_URL')
if redis_url:
logger.debug('starting redis: %s ' % redis_url)
if redis_url.startswith('rediss://'):
# redis 6 encripted connection
# heroku needs disable ssl verification
_redis_store = redis.from_url(
redis_url, ssl_cert_reqs=None)
else:
_redis_store = redis.from_url(redis_url)
else:
logger.debug('redis not configured')
return _redis_store
On the main app, to queue a task:
from rq import Queue
from tools import get_redis_store
redis_store = get_redis_store()
queue = Queue(connection=redis_store)
def _slow_task(param1, param2, paramn):
# here I execute the slow task
my_slow_code(param1, param2)
# when I need to execute the slow task
queue.enqueue(_slow_task, param1, param2)
There are multiple options to use redis on Heroku, depending on what you use, you could need to change the env variable REDIS_URL. The environment variable DYNO is checked to know if the code is running on Heroku or in a developer machine.
I'm using celery and celery-beat without Django and I have a task which needs to modify celery-beat schedule when run.
Now I have the following code (module called celery_tasks):
# __init__.py
from .celery import app as celery_app
__all__ = ['celery_app']
#celery.py
from celery import Celery
import config
celery_config = config.get_celery_config()
app = Celery(
__name__,
include=[
'celery_tasks.tasks',
],
)
app.conf.update(celery_config)
# tasks.py
from celery_tasks import celery_app
from celery import shared_task
#shared_task
def start_game():
celery_app.conf.beat_schedule = {
'process_round': {
'task': 'celery_tasks.tasks.process_round',
'schedule': 5,
},
}
I start celery with the following command:
celery worker -A celery_tasks -E -l info --beat
start_game executes and exists normally, but beat process_round task never runs.
How can I force-reload beat schedule (restarting all workers doesn't seem as a good idea)?
the problem with normal celery backend when you start the celerybeat process. it will create a config file and write all tasks and schedules in to that file so it cannot change dynamically
you can use the package
celerybeat-sqlalchemy-scheduler so you can edit schedule on DB itself so that celerybeat will pickup the new schedule from DB itself
also there is another package celery-redbeat which using redis-server as backend
you can refer this this also
Using schedule config also seems bad idea. What if initially process_round task will be active and check if game is not started task just do nothing.
I am trying to schedule a task that runs every 10 minutes using Django 1.9.8, Celery 4.0.2, RabbitMQ 2.1.4, Redis 2.10.5. These are all running within Docker containers in Linux (Fedora 25). I have tried many combinations of things that I found in Celery docs and from this site. The only combination that has worked thus far is below. However, it only runs the periodic task initially when the application starts, but the schedule is ignored thereafter. I have absolutely confirmed that the scheduled task does not run again after the initial time.
My (almost-working) setup that only runs one-time:
settings.py:
INSTALLED_APPS = (
...
'django_celery_beat',
...
)
BROKER_URL = 'amqp://{user}:{password}#{hostname}/{vhost}/'.format(
user=os.environ['RABBIT_USER'],
password=os.environ['RABBIT_PASS'],
hostname=RABBIT_HOSTNAME,
vhost=os.environ.get('RABBIT_ENV_VHOST', '')
# We don't want to have dead connections stored on rabbitmq, so we have to negotiate using heartbeats
BROKER_HEARTBEAT = '?heartbeat=30'
if not BROKER_URL.endswith(BROKER_HEARTBEAT):
BROKER_URL += BROKER_HEARTBEAT
BROKER_POOL_LIMIT = 1
BROKER_CONNECTION_TIMEOUT = 10
# Celery configuration
# configure queues, currently we have only one
CELERY_DEFAULT_QUEUE = 'default'
CELERY_QUEUES = (
Queue('default', Exchange('default'), routing_key='default'),
)
# Sensible settings for celery
CELERY_ALWAYS_EAGER = False
CELERY_ACKS_LATE = True
CELERY_TASK_PUBLISH_RETRY = True
CELERY_DISABLE_RATE_LIMITS = False
# By default we will ignore result
# If you want to see results and try out tasks interactively, change it to False
# Or change this setting on tasks level
CELERY_IGNORE_RESULT = True
CELERY_SEND_TASK_ERROR_EMAILS = False
CELERY_TASK_RESULT_EXPIRES = 600
# Set redis as celery result backend
CELERY_RESULT_BACKEND = 'redis://%s:%d/%d' % (REDIS_HOST, REDIS_PORT, REDIS_DB)
CELERY_REDIS_MAX_CONNECTIONS = 1
# Don't use pickle as serializer, json is much safer
CELERY_TASK_SERIALIZER = "json"
CELERY_RESULT_SERIALIZER = "json"
CELERY_ACCEPT_CONTENT = ['application/json']
CELERYD_HIJACK_ROOT_LOGGER = False
CELERYD_PREFETCH_MULTIPLIER = 1
CELERYD_MAX_TASKS_PER_CHILD = 1000
celeryconf.py
coding=UTF8
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "web_portal.settings")
app = Celery('web_portal')
CELERY_TIMEZONE = 'UTC'
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
tasks.py
from celery.schedules import crontab
from .celeryconf import app as celery_app
#celery_app.on_after_finalize.connect
def setup_periodic_tasks(sender, **kwargs):
# Calls email_scanner every 10 minutes
sender.add_periodic_task(
crontab(hour='*',
minute='*/10',
second='*',
day_of_week='*',
day_of_month='*'),
email_scanner.delay(),
)
#app.task
def email_scanner():
dispatch_list = scanning.email_scan()
for dispatch in dispatch_list:
validate_dispatch.delay(dispatch)
return
run_celery.sh -- Used to start celery tasks from docker-compose.yml
#!/bin/sh
# wait for RabbitMQ server to start
sleep 10
cd web_portal
# run Celery worker for our project myproject with Celery configuration stored in Celeryconf
su -m myuser -c "celery beat -l info --pidfile=/tmp/celerybeat-web_portal.pid -s /tmp/celerybeat-schedule &"
su -m myuser -c "celery worker -A web_portal.celeryconf -Q default -n default#%h"
I have also tried using a CELERYBEAT_SCHEDULER in the settings.py in lieu of the #celery_app.on_after finalize_connect decorator and block in tasks.py, but the scheduler never ran even once.
settings.py (not working at all scenario)
(same as before except also including the following)
CELERYBEAT_SCHEDULE = {
'email-scanner-every-5-minutes': {
'task': 'tasks.email_scanner',
'schedule': timedelta(minutes=10)
},
}
The Celery 4.0.2 documentation online presumes that I should instinctively know many givens, but I am new in this environment. If anybody knows where I can find a tutorial OTHER THAN docs.celeryproject.org and http://django-celery-beat.readthedocs.io/en/latest/ which both assume that I am already a Django master, I would be grateful. Or let me know of course if you see something obviously wrong in my setup. Thanks!
I found a solution that works. I could not get CELERYBEAT_SCHEDULE or the celery task decorators to work, and I suspect that it may be at least partially due with the manner in which I started the Celery beat task.
The working solution goes the whole 9 yards to utilize Django Database Scheduler. I downloaded the GitHub project "https://github.com/celery/django-celery-beat" and incorporated all of the code as another "app" in my project. This enabled Django-Admin access to maintain the cron / interval / periodic task(s) tables via a browser. I also modified my run_celery.sh as follows:
#!/bin/sh
# wait for RabbitMQ server to start
sleep 10
# run Celery worker for our project myproject with Celery configuration stored in Celeryconf
celery beat -A web_portal.celeryconf -l info --pidfile=/tmp/celerybeat- web_portal.pid -S django --detach
su -m myuser -c "celery worker -A web_portal.celeryconf -Q default -n default#%h -l info "
After adding a scheduled task via the django-admin web interface, the scheduler started working fine.
Celery docs say that Celery 3.1 can work with django out of box. But tasks not working. I have tasks.py:
from celery import task
from datetime import timedelta
#task.periodic_task(run_every=timedelta(seconds=20), ignore_result=True)
def disable_not_confirmed_users():
print "start"
Configs:
from kombu import Exchange, Queue
CELERY_SEND_TASK_ERROR_EMAILS = True
BROKER_URL = 'amqp://guest#localhost//'
CELERY_DEFAULT_QUEUE = 'project-queue'
CELERY_DEFAULT_EXCHANGE = 'project-queue'
CELERY_DEFAULT_ROUTING_KEY = 'project-queue'
CELERY_QUEUES = (
Queue('project-queue', Exchange('project-queue'), routing_key='project-queue'),
)
project/celery.py
from future import absolute_import
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')
from django.conf import settings
app = Celery('project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
Run celery: celery -A project worker --loglevel=INFO
But nothing not happend.
you should use celery beat to run periodic task.
celery -A project worker --loglevel=INFO
starts the worker, which does the actually work.
celery -A proj beat
starts the beat service, which asks the work to do the job.