I am having challenges starting Flower using Supervisor.
The following command works on the console in my development environment:
celery --app=celery_conf.celeryapp flower --conf=flowerconfig
but when I move to production and run it under Supervisor, I get all sorts of errors.
/supervisor/conf.d/flower.conf
[program:flower]
command=/opt/apps/venv/my_app/bin/celery flower --app=celery_conf.celeryapp --conf=flowerconfig
directory=/opt/apps/my_app
user=www-data
autostart=true
autorestart=false
redirect_stderr=true
stderr_logfile=/var/log/celery/flower.err.log
stdout_logfile=/var/log/celery/flower.out.log
With the above configuration there is no error, but all Celery gives me is help-like output. It's as if it doesn't acknowledge the arguments passed.
Type 'celery <command> --help' for help using a specific command.
Usage: celery <command> [options]
Show help screen and exit.
Options:
-A APP, --app=APP app instance to use (e.g. module.attr_name)
-b BROKER, --broker=BROKER
url to broker. default is 'amqp://guest@localhost//'
--loader=LOADER name of custom loader class to use.
etc...
Supervisor, on the other hand, logs INFO exited: flower (exit status 64; not expected).
I have other Supervisor-managed apps, including celery beat, that use the sample configuration files from GitHub, and they work well with the same directory paths as above.
The flowerconfig is as below:
flowerconfig.py
# Broker settings
BROKER_URL = 'amqp://guest:guest@localhost:5672//'
# RabbitMQ management API
broker_api = 'http://guest:guest@localhost:15672/api/'
# Port
port = 5555
# Logging level
logging = 'INFO'
Solution:
Well, not really a solution, so I haven't posted it as an answer. It turned out there was a problem with my virtual environment, so I removed Flower and installed it again using pip3.4, as I am on Python 3.4.
Something to note, though: for Flower to use your flowerconfig file, you need to add a directory=/path/to/your/celery_config/folder/ entry in Supervisor's /etc/supervisor/conf.d/flower.conf file, otherwise Flower will launch with default settings.
/etc/supervisor/conf.d/flower.conf
; ==================================
; Flower: For monitoring Celery
; ==================================
[program:flower]
command=/opt/apps/venv/my_app/bin/celery flower --app=celery_conf.celeryapp --conf=flowerconfig
; this is key, as my configuration file lives in the celery_conf folder
directory=/opt/apps/my_app/celery_conf
user=www-data
autostart=true
autorestart=false
redirect_stderr=true
stderr_logfile=/var/log/celery/flower.err.log
stdout_logfile=/var/log/celery/flower.out.log
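Note that after adding or editing a file under /etc/supervisor/conf.d/, Supervisor has to re-read its configuration before the change takes effect. Assuming the program name above, something like the following should do it:
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl restart flower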
Thanks.
Supervisor is unable to locate your celery app. Maybe your Supervisor configuration file supervisor.conf is in a different path.
You can pass the directory option to the Supervisor program, so you can try:
[program:flower]
directory = /opt/apps/venv/my_app/
command = celery --app=celery_conf.celeryapp flower
This starts a new flower instance.
Also note that the Celery config and the Flower config are different things.
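To illustrate the difference, here is a rough sketch of the two kinds of file; the names and values are illustrative, not taken from the question:
# celeryconfig.py -- loaded by the Celery app itself, e.g. app.config_from_object('celeryconfig')
BROKER_URL = 'amqp://guest:guest@localhost:5672//'
CELERY_IMPORTS = ('tasks',)
# flowerconfig.py -- read only by Flower via --conf=flowerconfig
port = 5555
basic_auth = ['admin:changeme']  # hypothetical credentials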
Related
I'm using Django Dynamic Scraper to build a basic web scraper. I have it 99% of the way finished. It works perfectly in development alongside Celery and Scrapyd. Tasks are sent and fulfilled perfectly.
As for production I'm pretty sure I have things set up correctly:
I'm using Supervisor to run Scrapyd and Celery on my VPS. They are both pointing at the correct virtualenv installations etc...
Here's how I know they're both set up fine for the project: When I SSH into my server and use the manage.py shell to execute a celery task, it returns an Async task which is then executed. The results appear in the database and both my scrapyd and celery log show the tasks being processed.
The issue is that my scheduled tasks are not being fired automatically, despite working perfectly fine in development.
# django-celery settings
import djcelery
djcelery.setup_loader()
BROKER_URL = 'django://'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
And my Supervisor configs:
Celery Config:
[program:IG_Tracker]
command=/home/dean/Development/IG_Tracker/venv/bin/celery --app=IG_Tracker.celery:app worker --loglevel=INFO -n worker.%%h
directory=/home/dean/Development/IG_Tracker/
user=root
numprocs=1
stdout_logfile=/home/dean/Development/IG_Tracker/celery-worker.log
stderr_logfile=/home/dean/Development/IG_Tracker/celery-worker.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
killasgroup=true
priority=998
Scrapyd Config:
[program:scrapyd]
directory=/home/dean/Development/IG_Tracker/instagram/ig_scraper
command=/home/dean/Development/IG_Tracker/venv/bin/scrapyd
environment=MY_SETTINGS=/home/dean/Development/IG_Tracker/IG_Trackersettings.py
user=dean
autostart=true
autorestart=true
redirect_stderr=true
numprocs=1
stdout_logfile=/home/dean/Development/IG_Tracker/scrapyd.log
stderr_logfile=/home/dean/Development/IG_Tracker/scrapyd.log
startsecs=10
I have followed the docs as closely as I could and used the recommended tools for deployment (e.g. scrapyd-deploy). Additionally, when I run celery and scrapyd manually on the server (as one would in development), things work fine. It's only when the two are run under Supervisor that they don't.
I'm probably missing some setting or other which is preventing my celery tasks stored in the SQLite DB from being picked up and run automatically by celery/scrapyd in production.
Okay, so I eventually got this working. Maybe this can help someone else. My issue was that I only had ONE supervisor process for Celery, whereas it needs two: one for actually running the tasks (the worker) and another for handling the scheduling (beat). I only had the worker. This explains why everything worked fine when I fired off a task using the Django shell (essentially manually passing a task to the worker).
Here is my conf file for the 'scheduler' celery process:
[program:celery_beat]
command=/home/dean/Development/IG_Tracker/venv/bin/celery beat -A IG_Tracker --loglevel=INFO
directory=/home/dean/Development/IG_Tracker/
user=root
numprocs=1
stdout_logfile=/home/dean/Development/IG_Tracker/celery-worker.log
stderr_logfile=/home/dean/Development/IG_Tracker/celery-worker.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs = 600
killasgroup=true
priority=998
I added that and ran:
supervisorctl reread
supervisorctl update
supervisorctl restart all
My tasks began running right away.
I have configured Supervisor on the server like this:
[program:myproject]
command = /home/mydir/myproj/venv/bin/python /home/mydir/myproj/venv/bin/gunicorn manage:app -b <ip_address>:8000
directory = /home/mydir
I have installed gevent in my virtual environment, but I don't know how to wire it into the Supervisor command variable. I can run it manually from the terminal like this:
gunicorn manage:app -b <ip_address>:8000 --worker-class gevent
I tried to include a path when calling gevent in the Supervisor command, just like with python and gunicorn, but it's not working. Honestly, I don't know what the correct directory/file to execute gevent is, and I am also not sure this is the right way to specify a worker class in Supervisor. I am running Ubuntu 14.04.
Anyone? Thanks.
I already made a solution for this, though I am not 100% sure it is correct. After searching a hundred times, I finally came up with a working solution :)
Got this from here: I've created a gunicorn.conf.py file in my project directory containing:
worker_class = 'gevent'
And referenced this file in the Supervisor config:
[program:myproject]
command = /home/mydir/myproj/venv/bin/python /home/mydir/myproj/venv/bin/gunicorn -c /home/mydir/myproj/gunicorn.conf.py manage:app -b <ip_address>:8000
directory = /home/mydir
And started the program via supervisorctl:
sudo supervisorctl start <my_project>
And poof! It's already working!
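For what it's worth, gunicorn also accepts the worker class directly on the command line (-k / --worker-class), so an entry without the separate gunicorn.conf.py should also work. A sketch using the same paths as above (untested on this exact setup; gevent must be installed in the same virtualenv):
[program:myproject]
command = /home/mydir/myproj/venv/bin/gunicorn manage:app -b <ip_address>:8000 --worker-class gevent
directory = /home/mydir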
I have tasks (for Celery) defined in /var/tasks/tasks.py.
I have a virtualenv at /var/tasks/venv which should be used to run /var/tasks/tasks.py.
I can manually start a worker to process tasks like this:
cd /var/tasks
. venv/bin/activate
celery worker -A tasks -Q queue_1
Now, I want to daemonize this.
I copied the init.d script from GitHub and am using the following config file in /etc/default/celeryd:
# name(s) of nodes to start
CELERYD_NODES="worker1"
# absolute or relative path to celery binary
CELERY_BIN="/var/tasks/venv/bin/celery"
# app instance
CELERY_APP="tasks"
# change to directory on upstart
CELERYD_CHDIR="/var/tasks"
# options
CELERYD_OPTS="-Q queue_1 --concurrency=8"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# unprivileged user/group
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# create pid and log directories, if missing
CELERY_CREATE_DIRS=1
When I start the service (via the init.d script), it says:
celery init v10.1.
Using config script: /etc/default/celeryd
But, it does not process any tasks from the queue, nor is there anything in the log file.
What am I doing wrong?
Supervisor might be a good option, but if you want to use the Celery init.d script, I recommend copying it from the Celery GitHub source.
sudo vim /etc/init.d/celeryd
Copy the code from https://github.com/celery/celery/blob/master/extra/generic-init.d/celeryd into the file. See the daemonizing tutorial for details.
sudo chmod 755 /etc/init.d/celeryd
sudo chown root:root /etc/init.d/celeryd
sudo nano /etc/default/celeryd
Copy and paste the config below and change it accordingly.
# Where your Celery binary is present
CELERY_BIN="/home/shivam/Desktop/deploy/bin/celery"
# App instance to use
CELERY_APP="app.celery"
# Where to chdir at start
CELERYD_CHDIR="/home/shivam/Desktop/Project/demo/"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# A user/group combination that already exists (e.g., nobody).
CELERYD_USER="shivam"
CELERYD_GROUP="shivam"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
export SECRET_KEY="foobar"
Save and exit
sudo /etc/init.d/celeryd start
sudo /etc/init.d/celeryd status
This will auto start Celery on Boot
sudo update-rc.d celeryd defaults
If you use systemd instead, you should enable a Celery service; it will activate your Celery daemon on boot.
sudo systemctl enable yourcelery.service
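For reference, a minimal unit sketch using the paths from the config above; the file name and layout here are illustrative, not a verbatim copy of the official Celery systemd example:
# /etc/systemd/system/celery.service
[Unit]
Description=Celery worker
After=network.target

[Service]
Type=simple
User=shivam
Group=shivam
WorkingDirectory=/home/shivam/Desktop/Project/demo/
# Environment variables such as SECRET_KEY can be set here as well
Environment=SECRET_KEY=foobar
ExecStart=/home/shivam/Desktop/deploy/bin/celery -A app.celery worker --loglevel=INFO
Restart=on-failure

[Install]
WantedBy=multi-user.target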
I ended up using Supervisor and a script at /etc/supervisor/conf.d/celery.conf similar to this:
https://github.com/celery/celery/blob/3.1/extra/supervisord/celeryd.conf
This handles daemonization, among other things, quite well and automatically.
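For reference, a minimal sketch along those lines, adapted to the paths from this question rather than copied verbatim from the linked file:
; /etc/supervisor/conf.d/celery.conf
[program:celery]
command=/var/tasks/venv/bin/celery worker -A tasks -Q queue_1 --loglevel=INFO
directory=/var/tasks
user=celery
numprocs=1
stdout_logfile=/var/log/celery/worker.log
stderr_logfile=/var/log/celery/worker.log
autostart=true
autorestart=true
startsecs=10
stopwaitsecs=600
killasgroup=true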
By following this tutorial, I now have a Celery/Django app that works fine if I launch the worker with this command:
celery -A myapp worker -n worker1.%h
In my Django settings.py, I set all the parameters for Celery (IP of the message broker, etc.). Everything works well.
My next step is to run this app as a daemon. So I followed this second tutorial, and everything is simple, except that now my Celery parameters from settings.py are not loaded. For example, the message broker IP ends up set to 127.0.0.1, even though in my settings.py I set it to another IP address.
In the tutorial, they say:
make sure that the module that defines your Celery app instance also sets a default value for DJANGO_SETTINGS_MODULE as shown in the example Django project in First steps with Django.
So I made sure of that. In /etc/default/celeryd I had this:
export DJANGO_SETTINGS_MODULE="myapp.settings"
Still not working... So I also put this line in /etc/init.d/celeryd; again, not working.
I don't know what to do anymore. Does someone have a clue?
EDIT:
Here is my celery.py:
from __future__ import absolute_import
import os
from django.conf import settings
from celery import Celery
# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myapp.settings')
app = Celery('myapp')
# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
EDIT #2:
Here is my /etc/default/celeryd:
# Names of nodes to start
# most will only start one node:
CELERYD_NODES="worker1.%h"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/usr/local/bin/celery"
# App instance to use
# comment out this line if you don't use an app
CELERY_APP="myapp"
# Where to chdir at start.
CELERYD_CHDIR="/home/ubuntu/myapp-folder/"
# Extra command-line arguments to the worker
CELERYD_OPTS=""
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists, e.g. nobody).
CELERYD_USER="ubuntu"
CELERYD_GROUP="ubuntu"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE=myapp.settings
export PYTHONPATH=$PYTHONPATH:/home/ubuntu/myapp-folder
All the answers here could be part of the solution, but in the end it was still not working.
But I finally managed to make it work.
First of all, in /etc/init.d/celeryd, I have changed this line:
CELERYD_MULTI=${CELERYD_MULTI:-"celeryd-multi"}
by:
CELERYD_MULTI=${CELERYD_MULTI:-"celery multi"}
The first one was tagged as deprecated, which could have been the problem.
Moreover, I added this option:
CELERYD_OPTS="--app=myapp"
And don't forget to export some environments variables:
# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="myapp.settings"
export PYTHONPATH="$PYTHONPATH:/home/ubuntu/myapp-folder"
With all of this, it's now working on my side.
The problem is most likely that celeryd can't find your Django settings file because myapp.settings isn't on the $PYTHONPATH when the application runs.
From what I recall, Python will look in the $PYTHONPATH as well as the local folder when importing modules. When celeryd runs, it likely checks the path for the module, doesn't find it, then looks in the current folder for a folder with an __init__.py (i.e. a Python package).
I think that all you should need to do is add this to your /etc/default/celeryd file:
export PYTHONPATH=$PYTHONPATH:/path/to/your/app
The method below does not help to run celeryd; rather, it runs the Celery worker as a service that is started at boot time.
Commands like sudo service celery status also work.
celery.conf
# This file sits in /etc/init
description "Celery for example"
start on runlevel [2345]
stop on runlevel [!2345]
# Send KILL after 10 seconds
kill timeout 10
script
# Project (working_ecm) and virtualenv (working_ecm/env) settings
chdir /home/hemanth/working_ecm
exec /home/hemanth/working_ecm/env/bin/python manage.py celery worker -B -c 2 -f /var/log/celery-ecm.log --loglevel=info >> /tmp/upstart-celery-job.log 2>&1
end script
respawn
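With that file in place, the job can be controlled with the usual Upstart commands (the job name comes from the file name, celery.conf):
sudo start celery
sudo status celery
sudo stop celery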
In your second tutorial, they set the DJANGO_SETTINGS_MODULE variable to:
export DJANGO_SETTINGS_MODULE="settings"
This could be a reason why your settings are not found, since the script changes to the directory
"/home/ubuntu/myapp-folder/"
Then you defined your app as "myapp", and you say the settings are in "myapp.settings".
This could mean it searches for the settings file in
"/home/ubuntu/myapp-folder/myapp/myapp/settings"
So my suggestion is to remove the "myapp." prefix in the DJANGO_SETTINGS_MODULE variable, and don't forget the quotation marks.
I'd like to add an answer for anyone stumbling on this more recently.
I followed the getting started First Steps guide to a tee with Celery 4.4.7, as well as the Daemonization tutorial, without luck.
My initial issue:
celery -A app_name worker -l info works without issue (actual celery configuration is OK).
I could start celeryd as a daemon and the status command would show OK, but it wouldn't receive tasks. Checking the logs, I saw the following:
[2020-11-01 09:33:15,620: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 111] Connection refused.
This was an indication that celeryd was not connecting to my broker (Redis) and was falling back to the default AMQP URL. Given that CELERY_BROKER_URL was already set in my configuration, this meant my Celery app settings were not being pulled in for the daemon process.
I tried sudo C_FAKEFORK=1 sh -x -l -E /etc/init.d/celeryd start to see whether any of my Celery settings were pulled in, and I noticed that the app was set to the default (not the app name specified as CELERY_APP in /etc/default/celeryd).
Since celery -A app_name worker -l info worked, I fixed the issue by exporting CELERY_APP in /etc/default/celeryd, instead of just setting the variable as the documentation suggests.
TL;DR
If celery -A app_name worker -l info works (replace app_name with what you've defined in the Celery first steps guide), and sudo C_FAKEFORK=1 sh -x -l -E /etc/init.d/celeryd start does not show your celery app settings being pulled in, add the following to the end of your /etc/default/celeryd:
export CELERY_APP="app_name"
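After adding the export, restart the daemon and check the worker log again to confirm the broker URL now matches your configuration:
sudo /etc/init.d/celeryd restart
sudo /etc/init.d/celeryd status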
I am using Fabric to deploy a Celery broker (running RabbitMQ) and multiple Celery workers with celeryd daemonized through supervisor. I cannot for the life of me figure out how to reload the tasks.py module short of rebooting the servers.
/etc/supervisor/conf.d/celeryd.conf
[program:celeryd]
directory=/fab-mrv/celeryd
environment=[RabbitMQ credentials here]
command=xvfb-run celeryd --loglevel=INFO --autoreload
autostart=true
autorestart=true
celeryconfig.py
import os
## Broker settings
BROKER_URL = "amqp://%s:%s#hostname" % (os.environ["RMQU"], os.environ["RMQP"])
# List of modules to import when celery starts.
CELERY_IMPORTS = ("tasks", )
## Using the database to store task state and results.
CELERY_RESULT_BACKEND = "amqp"
CELERYD_POOL_RESTARTS = True
Additional information
celery --version 3.0.19 (Chiastic Slide)
python --version 2.7.3
lsb_release -a Ubuntu 12.04.2 LTS
rabbitmqctl status ... 2.7.1 ...
Here are some things I have tried:
The celeryd --autoreload flag
sudo supervisorctl restart celeryd
celery.control.broadcast('pool_restart', arguments={'reload': True})
ps auxww | grep celeryd | grep -v grep | awk '{print $2}' | xargs kill -HUP
And unfortunately, nothing causes the workers to reload the tasks.py module (e.g. after running git pull to update the file). The gist of the relevant fab functions is available here.
The brokers/workers run fine after a reboot.
Just a shot in the dark: with the celeryd --autoreload option, did you make sure you have one of the file system notification backends? The docs recommend pyinotify for Linux, so I'd start by making sure you have that installed.
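If it is missing, installing it into the worker's virtualenv should be enough (assuming the inotify backend is the one in use):
pip install pyinotify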
I faced a similar problem and was able to use Watchdog to reload the tasks.py modules when changes are detected. To install:
pip install watchdog
You can use the Watchdog API programmatically, for example to monitor for directory changes in the file system. Additionally, Watchdog provides an optional shell utility called watchmedo that can be used to execute commands on events. Here is an example that starts the Celery worker via watchmedo and restarts it on any change to .py files, including changes made via git pull:
watchmedo auto-restart --directory=./ --pattern="*.py" --recursive -- celery worker --app=worker.app --concurrency=1 --loglevel=INFO
Using Watchdog's watchmedo, I was able to git pull changes and the respective tasks.py modules were automatically reloaded, without any reboot of the container or server.
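To tie this back to the Supervisor setup from the question, the watchmedo wrapper can replace the bare celeryd invocation in the program entry. A sketch, assuming watchdog is installed in the same environment and that the default app still picks up celeryconfig.py from the working directory, mirroring the original command:
[program:celeryd]
directory=/fab-mrv/celeryd
environment=[RabbitMQ credentials here]
command=xvfb-run watchmedo auto-restart --directory=/fab-mrv/celeryd --pattern="*.py" --recursive -- celery worker --loglevel=INFO
autostart=true
autorestart=true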