I am starting celery via supervisord, see the entry below.
[program:celery]
user = foobar
autostart = true
autorestart = true
directory = /opt/src/slicephone/cloud
command = /opt/virtenvs/django_slice/bin/celery beat --app=cloud -l DEBUG -s /home/foobar/run/celerybeat-schedule --pidfile=/home/foobar/run/celerybeat.pid
priority = 100
stdout_logfile_backups = 0
stderr_logfile_backups = 0
stdout_logfile_maxbytes = 10MB
stderr_logfile_maxbytes = 10MB
stdout_logfile = /opt/logs/celery.stdout.log
stderr_logfile = /opt/logs/celery.stderr.log
pip freeze | grep celery
celery==3.1.0
But any usage of:
@celery.task
def test_rabbit_running():
    import logging
    from celery.utils.log import get_task_logger
    logger = get_task_logger(__name__)
    logger.setLevel(logging.DEBUG)
    logger.info("foobar")
doesn't show up in the logs. Instead I get entries like the following.
celery.stdout.log
celery beat v3.1.0 (Cipater) is starting.
__ - ... __ - _
Configuration ->
. broker -> redis://localhost:6379//
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> /home/foobar/run/celerybeat-schedule
. logfile -> [stderr]#%DEBUG
. maxinterval -> now (0s)
celery.stderr.log
[2013-11-12 05:42:39,539: DEBUG/MainProcess] beat: Waking up in 2.00 seconds.
INFO Scheduler: Sending due task test_rabbit_running (retail.tasks.test_rabbit_running)
[2013-11-12 05:42:41,547: INFO/MainProcess] Scheduler: Sending due task test_rabbit_running (retail.tasks.test_rabbit_running)
DEBUG retail.tasks.test_rabbit_running sent. id->34268340-6ffd-44d0-8e61-475a83ab3481
[2013-11-12 05:42:41,550: DEBUG/MainProcess] retail.tasks.test_rabbit_running sent. id->34268340-6ffd-44d0-8e61-475a83ab3481
DEBUG beat: Waking up in 6.00 seconds.
What do I have to do to make my logging calls appear in the log files?
It doesn't log anything because celery beat doesn't execute any tasks; it only schedules (sends) them, and that's OK.
See also Celerybeat not executing periodic tasks
I'd put the logging call inside a task, as the name of the utility function (get_task_logger) implies, or just start with a simple print, or set up your own logging as suggested in Django Celery Logging Best Practice (the best way to go, IMO).
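A minimal sketch of the first suggestion, assuming the app instance behind --app=cloud is importable (the cloud.celery import path below is an assumption based on the question's layout):

# retail/tasks.py (sketch) - the logging call lives inside the task body,
# so the record is emitted by the worker that executes it, not by beat
from celery.utils.log import get_task_logger

from cloud.celery import app  # assumption: the instance passed to --app=cloud

logger = get_task_logger(__name__)

@app.task
def test_rabbit_running():
    logger.info("foobar")  # appears in the worker's log, not in beat's

Remember that beat only sends the task messages; you also need a worker process consuming them before anything gets executed and logged.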
I've built a small web scraper function to get some data from the web and populate it to my db, which works just fine.
Now I would like to fire this function periodically every 20 seconds using Celery periodic tasks.
I walked through the docs and everything seems to be set up for development (using redis as the broker).
This is my tasks.py file in project/stocksapp where my periodically fired functions are:
# Celery imports
from celery.task.schedules import crontab
from celery.decorators import periodic_task
from celery.utils.log import get_task_logger
from datetime import timedelta

logger = get_task_logger(__name__)

# periodic functions
@periodic_task(
    run_every=(timedelta(seconds=20)),
    name="getStocksDataDax",
    ignore_result=True
)
def getStocksDataDax():
    print("fired")
Now when I start the worker, the function seems to be fired once and only once (the database gets populated). But after that, the function doesn't get fired anymore, although the console output suggests it is being sent:
C:\Users\Jonas\Desktop\CFD\CFD>celery -A CFD beat -l info
celery beat v4.4.2 (cliffs) is starting.
__ - ... __ - _
LocalTime -> 2020-05-15 23:06:29
Configuration ->
. broker -> redis://localhost:6379/0
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]#%INFO
. maxinterval -> 5.00 minutes (300s)
[2020-05-15 23:06:29,990: INFO/MainProcess] beat: Starting...
[2020-05-15 23:06:30,024: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:06:50,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:10,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:30,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:07:50,015: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:10,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:30,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
[2020-05-15 23:08:50,016: INFO/MainProcess] Scheduler: Sending due task getStocksDataDax (getStocksDataDax)
project/project/celery.py
from __future__ import absolute_import, unicode_literals
import os
from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'CFD.settings')

app = Celery('CFD',
             broker='redis://localhost:6379/0',
             backend='amqp://',
             include=['CFD.tasks'])

app.conf.broker_transport_options = {'visibility_timeout': 3600}

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
#   should have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django app configs.
app.autodiscover_tasks()

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
The function itself runs for about 1 second in total.
Where could the issue be in this setup that prevents the worker/Celery from firing the function every 20 seconds as it is supposed to?
celery -A CFD beat -l info only starts the Celery beat process. You should have a separate Celery worker process - in a different terminal run something like celery -A CFD worker -c 8 -O fair -l info.
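During development you can also run both in a single process with something like celery -A CFD worker -B -l info; the -B/--beat option embeds the beat scheduler in the worker (not recommended for production, where beat and the workers should stay separate).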
After reviewing many many threads on the same issue, I have found no answer yet.
I'm running a Django app with the following stack:
Python: 3.6
Django: 3.0.5
Celery: 4.0.2
Kombu: 4.2.0
I'm running the whole stack with docker-compose; celery runs in a separate container.
Apparently my task is registering within celery, because if I inspect the registered tasks of my application I get a list with a single element, the task itself:
$ celery -A apps.meals.tasks inspect registered
-> celery@7de3143ddcb2: OK
* apps.meals.tasks.send_slack_notification
myproj/settings.py
INSTALLED_APPS = [
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'apps',
    'apps.meals',
]
myproj/celery.py:
from __future__ import absolute_import, unicode_literals
from celery import Celery
import os
from django.conf import settings

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproj.settings')

app = Celery('myproj')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
app.conf.update(
    BROKER_URL=settings.BROKER_URL,
)
My task is in a different place, within the meals app, in a file named tasks.py.
apps/meals/tasks.py
from django.conf import settings
from slackclient import SlackClient
from celery.utils.log import get_task_logger
from myproj import celery_app
from json import loads, dumps

logger = get_task_logger(__name__)

slack_markdown_text = "Hola!\n El menu de hoy es:\n {content}\n Pueden enviar su pedido aca: {link}\n Saludos!"

@celery_app.task(name="apps.meals.tasks.send_slack_notification")
def send_slack_notification(serial_menu, serial_options):
    ...
My file structure is like:
technical-tests/
|
|--apps/
|----*snap
|----meals/
|------*snap
|------tasks.py
|--myproj
|----*snap
|----celery.py
|----settings.py
Finally, docker-compose.yml goes like this:
version: '3.5'
services:
  backend: ....
  celery:
    build:
      context: ..
      dockerfile: ./deploy/Dockerfile
    volumes:
      - ../code/:/opt/somelocation
    environment:
      - SECRET_KEY
      - SLACK_TOKEN
      - SLACK_CHANNEL
      - SLACK_URL_PATTERN
      - BROKER_URL
    command: celery -A apps.meals.tasks worker -l info
    depends_on:
      - backend
      - redis
Celery worker console:
When the Celery console starts, it DOESN'T show that pretty colored banner where the registered tasks appear, so that's suspicious. It only shows:
celery_1_aa9c50e916ae | /usr/local/lib/python3.6/site-packages/celery/platforms.py:796: RuntimeWarning: You're running the worker with superuser privileges: this is
celery_1_aa9c50e916ae | absolutely not recommended!
celery_1_aa9c50e916ae |
celery_1_aa9c50e916ae | Please specify a different user using the --uid option.
celery_1_aa9c50e916ae |
celery_1_aa9c50e916ae | User information: uid=0 euid=0 gid=0 egid=0
celery_1_aa9c50e916ae |
celery_1_aa9c50e916ae | uid=uid, euid=euid, gid=gid, egid=egid,
celery_1_aa9c50e916ae | [2020-04-08 14:05:34,024: INFO/MainProcess] Connected to redis://redis:6379/0
celery_1_aa9c50e916ae | [2020-04-08 14:05:34,034: INFO/MainProcess] mingle: searching for neighbors
celery_1_aa9c50e916ae | [2020-04-08 14:05:35,053: INFO/MainProcess] mingle: all alone
celery_1_aa9c50e916ae | [2020-04-08 14:05:35,068: WARNING/MainProcess] /usr/local/lib/python3.6/site-packages/celery/fixups/django.py:200: UserWarning: Using settings.DEBUG leads to a memory leak, never use this setting in production environments!
celery_1_aa9c50e916ae | warnings.warn('Using settings.DEBUG leads to a memory leak, never '
celery_1_aa9c50e916ae | [2020-04-08 14:05:35,069: INFO/MainProcess] celery@7de3143ddcb2 ready.
I know the task is not running because when I call send_slack_notification without delay in the Python shell it runs immediately, but when I use delay the call hangs forever and the Celery console doesn't move, as if it doesn't receive any information.
Any insight is much appreciated!
UPDATE
The delay call is being made in apps/meals/views.py
from apps.meals.tasks import send_slack_notification
@login_required
def notify_menu(request, uuid):
    if is_staff(request.user):
        menu = Menu.objects.filter(uuid=uuid)
        send_slack_notification.delay(
            serialize(menu),
            serialize(menu.first().list_options())
        )
In your celery config, add the environment variable C_FORCE_ROOT with a value of 1.
Use CELERY_BROKER_URL in the Django settings to set the Celery broker properly, and do not pass BROKER_URL in from the environment (Celery can get confused because the environment variable may override the settings variable).
Don't set the BROKER_URL using app.conf.update; use the CELERY_BROKER_URL setting as discussed above.
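A rough sketch of what that could look like, assuming the question's project layout (the redis://redis:6379/0 URL is an assumption based on the redis service in docker-compose.yml):

# myproj/settings.py (sketch)
CELERY_BROKER_URL = 'redis://redis:6379/0'  # assumption: the compose service name

# myproj/celery.py (sketch) - no app.conf.update(BROKER_URL=...) needed
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproj.settings')

app = Celery('myproj')
app.config_from_object('django.conf:settings', namespace='CELERY')  # picks up CELERY_BROKER_URL
app.autodiscover_tasks()

The C_FORCE_ROOT=1 variable from the first suggestion would go under environment: in the celery service of docker-compose.yml.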
How do I prevent duplicate celery logs in an application like this?
# test.py
from celery import Celery
import logging
import logging.handlers

app = Celery('tasks', broker='redis://localhost:6379/0')

app.logger = logging.getLogger("new_logger")
file_handler = logging.handlers.RotatingFileHandler("app.log", maxBytes=1024*1024, backupCount=1)
file_handler.setFormatter(logging.Formatter('custom_format %(message)s'))
app.logger.addHandler(file_handler)

@app.task
def foo(x, y):
    app.logger.info("log info from foo")
I start the application with: celery -A test worker --loglevel=info --logfile celery.log
Then I cause foo to be run with python -c "from test import foo; print foo.delay(4, 4)"
This results in the "log info from foo" being displayed in both celery.log and app.log.
Here is app.log contents:
custom_format log info from foo
And here is celery.log contents:
[2017-07-26 21:17:24,962: INFO/MainProcess] Connected to redis://localhost:6379/0
[2017-07-26 21:17:24,967: INFO/MainProcess] mingle: searching for neighbors
[2017-07-26 21:17:25,979: INFO/MainProcess] mingle: all alone
[2017-07-26 21:17:25,991: INFO/MainProcess] celery@jd-t430 ready.
[2017-07-26 21:17:38,224: INFO/MainProcess] Received task: test.foo[e2c5e6aa-0d2d-4a16-978c-388a5e3cf162]
[2017-07-26 21:17:38,225: INFO/ForkPoolWorker-4] log info from foo
[2017-07-26 21:17:38,226: INFO/ForkPoolWorker-4] Task test.foo[e2c5e6aa-0d2d-4a16-978c-388a5e3cf162] succeeded in 0.000783085000876s: None
I considered removing the custom logger handler from the Python code, but I don't want to just use celery.log because it doesn't support rotating files. I considered starting celery with --logfile /dev/null, but then I would lose the mingle and other logs that don't show up in app.log.
Can I prevent "log info from foo" from showing up in celery.log? Given that I created the logger from scratch and only setup logging to app.log why is "log info from foo" showing up in celery.log anyway?
Is it possible to get the celery MainProcess and Worker logs (e.g. Connected to redis://localhost:6379/0) to be logged by a RotatingFileHandler (e.g. go in my app.log)?
Why is "log info from foo" showing up in celery.log?
The logging system is basically a tree of logging.Logger objects, with the main logging.Logger at the root of the tree (you get the root by calling logging.getLogger() without parameters).
When you call logging.getLogger("child") you get a reference to the logging.Logger that processes the "child" logs. The problem is that when you call logging.getLogger("child").info(), the info message is delivered to "child" but also to its parent, and to that parent's parent, until it arrives at the root.
To avoid sending logs to the parent, set logging.getLogger("child").propagate = False.
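Applied to the test.py from the question, a minimal sketch would be:

import logging
import logging.handlers

my_logger = logging.getLogger("new_logger")
my_logger.propagate = False  # keep records from bubbling up to Celery's handlers on the root logger

file_handler = logging.handlers.RotatingFileHandler("app.log", maxBytes=1024*1024, backupCount=1)
file_handler.setFormatter(logging.Formatter('custom_format %(message)s'))
my_logger.addHandler(file_handler)

With propagate disabled, "log info from foo" should end up only in app.log.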
I'm looking for the best way to keep track of my workers and queues, and I'm looking into logging.
I've seen examples in the celery documentation that suggest setting up logging as follows:
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@app.task
def add(x, y):
    logger.info('Adding {0} + {1}'.format(x, y))
    return x + y
Where does the logging file go? Also, what information is stored in the log file? Is it just the information passed to the logger.info call?
Does the logfile store the results returned by the workers, or is that separate?
Where does the logging file go?
As far as I can see, you don't have any FileHandlers. That means the logger writes messages to the console.
Let's check. Here is an example tasks.py:
# celery 4.0.2
import celery
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

app = celery.Celery(
    __name__,
    broker='redis://localhost:6379/0',
    backend='redis://localhost:6379/0',
)

@app.task(name='add')
def add(x, y):
    logger.info('Adding {0} + {1}'.format(x, y))
    return x + y

app.conf.beat_schedule = {
    # run task each 2 seconds
    'add-every-2-seconds': {
        'task': 'add',
        'schedule': 2.0,
        'args': (1, 2)
    },
}
Run Celery (celery worker -A tasks.app --loglevel=info --beat) and check the console. You will see something like this:
[2017-04-08 18:18:55,924: INFO/Beat] Scheduler: Sending due task add-every-2-seconds (add)
[2017-04-08 18:18:55,930: INFO/MainProcess] Received task: add[44a6877c-84a2-4a26-815e-1f637fdf9c0c]
[2017-04-08 18:18:55,932: INFO/PoolWorker-2] add[44a6877c-84a2-4a26-815e-1f637fdf9c0c]: Adding 1 + 2
[2017-04-08 18:18:55,934: INFO/PoolWorker-2] Task add[44a6877c-84a2-4a26-815e-1f637fdf9c0c] succeeded in 0.00191404699945s: 3
[2017-04-08 18:18:57,924: INFO/Beat] Scheduler: Sending due task add-every-2-seconds (add)
[2017-04-08 18:18:57,928: INFO/MainProcess] Received task: add[c386d360-57d3-4352-8a89-f86bb2376e4e]
[2017-04-08 18:18:57,930: INFO/PoolWorker-3] add[c386d360-57d3-4352-8a89-f86bb2376e4e]: Adding 1 + 2
[2017-04-08 18:18:57,931: INFO/PoolWorker-3] Task add[c386d360-57d3-4352-8a89-f86bb2376e4e] succeeded in 0.00146738500007s: 3
This means the logger works and writes our messages. Now let's try adding a FileHandler for our tasks:

import logging
from logging import FileHandler
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)
task_handler = FileHandler('task.log')
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
task_handler.setFormatter(formatter)
logger.addHandler(task_handler)
Run Celery and check the folder where tasks.py is stored. You should see a new file (task.log). Example content:
2017-04-08 18:35:02,052 - tasks - INFO - Adding 1 + 2
...
Does the logfile store the results returned by the workers?
By default, the information is just printed to the console. But you can register specific loggers and handlers and customize the behavior using signals or a custom Task/Loader class.
You can also pass the -f LOGFILE / --logfile=LOGFILE argument when running Celery.
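For instance, a hedged sketch of the signal approach (assuming you want the worker's own messages, such as the startup and scheduler lines, to go through a rotating file handler; the celery_app.log name is just illustrative) would hook after_setup_logger in a module the worker imports:

import logging
import logging.handlers

from celery.signals import after_setup_logger

@after_setup_logger.connect
def add_rotating_handler(logger, **kwargs):
    # attach a rotating file handler to the logger Celery has just configured
    handler = logging.handlers.RotatingFileHandler('celery_app.log', maxBytes=1024 * 1024, backupCount=1)
    handler.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
    logger.addHandler(handler)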
Hope this helps.
I have a celery setup with rabbitmq. The issue is that celery moves tasks to the reserved state while running a long task, and does not execute them until the long-running task is completed.
I want to accomplish that without using routing, and enabling the "-Ofair" flag does the job (see Prefork pool prefetch settings).
How do I enable this flag in Celery from Python? Thanks.
I am using celery 3.1.19
$ celery report
software -> celery:3.1.19 (Cipater) kombu:3.0.32 py:3.4.3
billiard:3.3.0.22 py-amqp:1.4.8
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled
I am using Celery as follows and concurrency is set to 4:
app = celery.Celery()
app.conf.update(
    BROKER_URL=broker,
    CELERY_RESULT_BACKEND=backend,
    CELERY_TASK_SERIALIZER='json',
    CELERY_IMPORTS=imports or [],
    CELERYD_CONCURRENCY=concurrency,
    CELERYD_HIJACK_ROOT_LOGGER=False
)
Here is how I start the worker:
worker = app.Worker(
    hostname=hostname,
    queues=[hostname]
)
worker.start()
You should be able to run it this way.
worker = app.Worker(
    hostname=hostname,
    queues=[hostname],
    optimization='fair'
)
worker.start()