Celery Beat - Crontab checks both UTC and local time - python

I've been having issues in getting crontab tasks to run in my local time so what I did was create 24 tasks like the following for each hour of the day.
app.conf.beat_schedule = {
'crontab-test-8am': {
'task': 'celery.tasks.test_crontab',
'schedule': crontab(minute="0", hour="8"),
'args':('8am',)
},
}
#app.task()
def test_crontab(message):
print('CRONTAB HAS RUN - ' + message)
I eventually figured out how to have crontab correctly run in my local time with the following
app.conf.enable_utc = False
app.conf.update(timezone = "Australia/Perth") #8am AWST = 12am UTC
The problem is it's now running the tasks in both local time and UTC time.
[2022-07-22 08:00:00,000: INFO/MainProcess] Scheduler: Sending due task crontab-test-8am (celery.tasks.test_crontab)
[2022-07-22 08:00:00,010: INFO/MainProcess] Scheduler: Sending due task crontab-test-12am (celery.tasks.test_crontab)
[2022-07-22 08:00:00,015: WARNING/ForkPoolWorker-51] CRONTAB HAS RUN - 8am
[2022-07-22 08:00:00,223: WARNING/ForkPoolWorker-51] CRONTAB HAS RUN - 12am
From what I could find, celery beat stores the schedules in the celerybeat-schedule file. I deleted it and restarted celery beat but it did the same thing.
I'd like any suggestions as to how I could try fix this.

Related

How to send a task with schedule to celery while worker is running?

I've problem to creating and adding beat schedule task to the celery worker and running that when the celery worker is currently running
well this is my code
from celery import Celery
from celery.schedules import crontab
app = Celery('proj2')
app.config_from_object('proj2.celeryconfig')
#app.task
def hello() -> None:
print(f'Hello')
app.add_periodic_task(10.0, hello)
so when i am running the celery
$ celery -A proj2 worker --beat -E
this works very well and the task will run every 10 seconds
But
I need to create a task with schedule and sending that to the worker to run it automatically
Imagine i have run the worker celery and every things is ok
well i go to the python shell and do following code
>>> from proj2.celery import app
>>>
>>> app.conf.beat_schedule = {
... 'hello-every-5': {
... 'tasks': 'proj2.tasks.bye' # i have a task called bye in tasks module
... 'schedule': 5.0,
... }
...}
>>>
it's not working. Also there is no error
but it seems does not send to the worker
p.s: i have also used add_periodic_task method. still not working

Celery jobs not running on heroku (python/django app)

I have a Django app setup with some scheduled tasks. The app is deployed on Heroku with Redis. The task runs if invoked synchronously in the console, or locally when I also have redis and celery running. However, the scheduled jobs are not running on Heroku.
My task:
#shared_task(name="send_emails")
def send_emails():
.....
celery.py:
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
from celery.schedules import crontab
# set the default Django settings module for the 'celery' program.
# this is also used in manage.py
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'my_app.settings')
# Get the base REDIS URL, default to redis' default
BASE_REDIS_URL = os.environ.get('REDIS_URL', 'redis://localhost:6379')
app = Celery('my_app')
# Using a string here means the worker don't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
app.conf.broker_url = BASE_REDIS_URL
# this allows you to schedule items in the Django admin.
app.conf.beat_scheduler = 'django_celery_beat.schedulers.DatabaseScheduler'
# These are the scheduled jobs
app.conf.beat_schedule = {
'send_emails_crontab': {
'task': 'send_emails',
'schedule': crontab(hour=9, minute=0),
'args': (),
}
}
In Procfile:
worker: celery -A my_app worker --beat -S django -l info
I've spun up the worker with heroku ps:scale worker=1 -a my-app.
I can see the registered tasks under [tasks] in the worker logs.
However, the scheduled tasks are not running at their scheduled time. Calling send_emails.delay() in the production console does work.
How do I get the worker to stay alive and / or run the job at the scheduled time?
I have a workaround using a command and heroku scheduler. Just unsure if that's the best way to do it.
If you're on free demo, you should know that heroku server sleeps and if your scheduled task becomes due when your server is sleeping, it won't run.
I share you any ideas.
Run console and get the datetime of Dyno. The Dyno use a localtime US.
The DynoFree sleeps each 30 minutes and only 450 hours/month.
Try change celery to BackgroundScheduler,
you need add a script clock.py as:
from myapp import myfunction
from datetime import datetime
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.blocking import BlockingScheduler
from time import monotonic, sleep, ctime
import os
sched = BlockingScheduler()
hour = int(os.environ.get("SEARCH_HOUR"))
minutes = int(os.environ.get("SEARCH_MINUTES"))
#sched.scheduled_job('cron', day_of_week='mon-sun', hour=hour, minute = minutes)
def scheduled_job():
print('This job: Execute myfunction every at ', hour, ':', minutes)
#My function
myfunction()
sched.start(
)
In Procfile:
clock: python clock.py
and run:
heroku ps:scale clock=1 --app thenameapp
Regards.

Why does Celery periodic tasks fire a function only once

I've built a small web scraper function to get some data from the web and populate it to my db which works just well.
Now I would like to fire this function periodically every 20 seconds using Celery periodic tasks.
I walked through the docu and everything seems to be set up for development (using redis as broker).
This is my tasks.py file in project/stocksapp where my periodically fired functions are:
# Celery imports
from celery.task.schedules import crontab
from celery.decorators import periodic_task
from celery.utils.log import get_task_logger
from datetime import timedelta
logger = get_task_logger(__name__)
# periodic functions
#periodic_task(
run_every=(timedelta(seconds=20)),
name="getStocksDataDax",
ignore_result=True
)
def getStocksDataDax():
print("fired")
Now when I start the worker, the function seems to be fired once and only once (the database gets populated). But after that, the function doesn't get fired anymore, although the console suggests it:
C:\Users\Jonas\Desktop\CFD\CFD>celery -A CFD beat -l info
celery beat v4.4.2 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-05-15 23:06:29
Configuration ->
. broker -> redis://localhost:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]#%INFO
. maxinterval -> 5.00 minutes (300s)
[2020-05-15 23:06:29,990: INFO/MainProcess] beat: Starting...
[2020-05-15 23:06:30,024: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:06:50,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:10,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:30,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:50,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:10,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:30,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:50,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
project/project/celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'CFD.settings')
app = Celery('CFD',
broker='redis://localhost:6379/0',
backend='amqp://',
include=['CFD.tasks'])
app.conf.broker_transport_options = {'visibility_timeout': 3600}
# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')
# Load task modules from all registered Django app configs.
app.autodiscover_tasks()
#app.task(bind=True)
def debug_task(self):
print('Request: {0!r}'.format(self.request))
The function itself runs about 1 second totally.
Where could basically be an issue in this setup to make the worker/celery fire the function every 20 seconds as supposed to?
celery -A CFD beat -l info only starts the Celery beat process. You should have a separate Celery worker process - in a different terminal run something like celery -A CFD worker -c 8 -O fair -l info.

Duplicated tasks after time change

I don't know exactly why, but I am getting duplicated tasks. I thing this may be related with time change of the last weekend (The clock was delayed for an hour in the system).
The first task should not be executed, since I say explicitly hour=2. Any idea why this happens?
[2017-11-01 01:00:00,001: INFO/Beat] Scheduler: Sending due task every-first-day_month (app.users.views.websites_down)
[2017-11-01 02:00:00,007: INFO/Beat] Scheduler: Sending due task every-first-day_month (app.users.views.websites_down)
from celery.schedules import crontab
CELERYBEAT_SCHEDULE = {
'every-first-day_month': {
'task': 'app.users.views.websites_down',
'schedule': crontab(hour=2, minute=0, day_of_month=1),
}
}
CELERY_TIMEZONE = "Europe/Lisbon"

Celery multi workers unexpected task execution order

I run celery:
celery multi start --app=myapp fast_worker
slow_worker
-Q:fast_worker fast-queue
-Q:slow_worker slow-queue
-c:fast_worker 1 -c:slow_worker 1
--logfile=%n.log --pidfile=%n.pid
And celerybeat:
celery beat -A myapp
Task:
#task.periodic_task(run_every=timedelta(seconds=5), ignore_result=True)
def test_log_task_queue():
import time
time.sleep(10)
print "test_log_task_queue"
Routing:
CELERY_ROUTES = {
'myapp.tasks.test_log_task_queue': {
'queue': 'slow-queue',
'routing_key': 'slow-queue',
},
}
I use rabbitMQ. When I open rabbitMQ admin panel, I see that my tasks are in slow-queue, but when I open logs I see task output for both workers. Why do both workers execute my tasks, even when task not in worker queue?
It looks like celery multi creates something like shared queues. To fix this problem, I added -X option:
celery multi start --app=myapp fast_worker
slow_worker
-Q:fast_worker fast-queue
-Q:slow_worker slow-queue
-X:fast_worker slow-queue
-X:slow_worker fast-queue
-c:fast_worker 1 -c:slow_worker 1
--logfile=%n.log --pidfile=%n.pid

Categories