How can I automatically reload tasks modules with Celery daemon? - python

I am using Fabric to deploy a Celery broker (running RabbitMQ) and multiple Celery workers with celeryd daemonized through supervisor. I cannot for the life of me figure out how to reload the tasks.py module short of rebooting the servers.
/etc/supervisor/conf.d/celeryd.conf
[program:celeryd]
directory=/fab-mrv/celeryd
environment=[RABBITMQ credentials here]
command=xvfb-run celeryd --loglevel=INFO --autoreload
autostart=true
autorestart=true
celeryconfig.py
import os
## Broker settings
BROKER_URL = "amqp://%s:%s#hostname" % (os.environ["RMQU"], os.environ["RMQP"])
# List of modules to import when celery starts.
CELERY_IMPORTS = ("tasks", )
## Using the database to store task state and results.
CELERY_RESULT_BACKEND = "amqp"
CELERYD_POOL_RESTARTS = True
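tasks.py itself is not shown in the question; with this old-style default-app setup, a minimal placeholder version might look like this (the add task is purely illustrative):
# tasks.py -- illustrative placeholder only
from celery.task import task

@task
def add(x, y):
    # placeholder body; this is the module the workers should re-import
    return x + y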
Additional information
celery --version 3.0.19 (Chiastic Slide)
python --version 2.7.3
lsb_release -a Ubuntu 12.04.2 LTS
rabbitmqctl status ... 2.7.1 ...
Here are some things I have tried:
The celeryd --autoreload flag
sudo supervisorctl restart celeryd
celery.control.broadcast('pool_restart', arguments={'reload': True})
ps auxww | grep celeryd | grep -v grep | awk '{print $2}' | xargs kill -HUP
And unfortunately, nothing causes the workers to reload the tasks.py module (e.g. after running git pull to update the file). The gist of the relevant fab functions is available here.
The brokers/workers run fine after a reboot.

Just a shot in the dark: with the celeryd --autoreload option, did you make sure you have one of the file system notification backends? The docs recommend pyinotify for Linux, so I'd start by making sure you have that installed.

I faced a similar problem and was able to use Watchdog to reload the tasks.py module when changes are detected. To install:
pip install watchdog
You can use the Watchdog API programmatically, for example to monitor a directory tree for changes in the file system. Additionally, Watchdog provides an optional shell utility called watchmedo that can execute commands when events occur. Here is an example that starts the Celery worker via Watchdog and reloads it on any change to .py files, including changes made via git pull:
watchmedo auto-restart --directory=./ --pattern="*.py" --recursive -- celery worker --app=worker.app --concurrency=1 --loglevel=INFO
Using Watchdog's watchmedo I was able to git pull changes and the respective tasks.py modules were auto reloaded without any reboot of the container or server.
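If you prefer the programmatic route mentioned above, a rough sketch using the Watchdog API together with Celery's pool_restart broadcast could look like the following. The broker URL, watched path and module name are placeholders, and pool_restart only takes effect if CELERYD_POOL_RESTARTS = True is set, as in the question's celeryconfig.py:
# watch_and_reload.py -- illustrative sketch only
import time

from celery import Celery
from watchdog.events import PatternMatchingEventHandler
from watchdog.observers import Observer

app = Celery(broker="amqp://user:password@hostname//")  # placeholder credentials

class ReloadHandler(PatternMatchingEventHandler):
    def __init__(self):
        # only react to Python source files
        super(ReloadHandler, self).__init__(patterns=["*.py"])

    def on_modified(self, event):
        # ask every worker to restart its pool and re-import the tasks module
        app.control.broadcast("pool_restart",
                              arguments={"reload": True, "modules": ["tasks"]})

if __name__ == "__main__":
    observer = Observer()
    observer.schedule(ReloadHandler(), path=".", recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()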

Related

Celery worker stops when console is closed [duplicate]

I am running a celery worker like this:
celery worker --app=portalmq --logfile=/tmp/portalmq.log --loglevel=INFO -E --pidfile=/tmp/portalmq.pid
Now I want to run this worker in the background. I have tried several things, including:
nohup celery worker --app=portalmq --logfile=/tmp/portal_mq.log --loglevel=INFO -E --pidfile=/tmp/portal_mq.pid >> /tmp/portal_mq.log 2>&1 </dev/null &
But it is not working. I have checked the celery documentation, and I found this:
Running the worker as a daemon
Running the celery worker server
Especially this comment is relevant:
In production you will want to run the worker in the background as a daemon.
To do this you need to use the tools provided by your platform, or something
like supervisord (see Running the worker as a daemon for more information).
This is too much overhead just to run a process in the background. I would need to install supervisord on my servers and get familiar with it. No go at the moment. Is there a simple way of running a celery worker in the background?
Supervisor is really simple and requires very little work to set up; the same applies to using Celery in combination with Supervisor.
It should not take more than 10 minutes to set it up :)
install supervisor with apt-get
create /etc/supervisor/conf.d/celery.conf config file
paste something like this in the celery.conf file
[program:celery]
directory = /my_project/
command = /usr/bin/python manage.py celery worker
plus (if you need) some optional and useful stuff (with dummy values)
user = celery_user
group = celery_group
stdout_logfile = /var/log/celeryd.log
stderr_logfile = /var/log/celeryd.err
autostart = true
environment=PATH="/some/path/",FOO="bar"
restart supervisor (or do supervisorctl reread; supervisorctl add celery)
after that you get the nice ctl commands to manage the celery process:
supervisorctl start/restart/stop celery
supervisorctl tail [-f] celery [stderr]
celery worker -A app.celery --loglevel=info --detach
For me this one worked; I was using Celery with Django:
celery -A proj_name worker -l INFO --detach
I have faced the same problem; as a lazy solution, you can use & at the end of the command.
For example
celery worker -A <app>.celery --loglevel=info &
The command below, when executed in a terminal, will start Celery as a background process.
celery -A app.celery worker --loglevel=info --detach
In case you want to stop it, run ps aux | grep celery as mentioned by @Kaiss B. in another answer's comment, and kill -9 <process id> to kill the process.
But first of all you need to have Celery installed, for example via
apt install python-celery-common.
Some of you might be wondering why the other, upvoted answers are not working on your system: Celery changed the command syntax from
celery worker -A app.celery --loglevel=info --detach
to
celery -A app.celery worker --loglevel=info --detach
Hope that helps.

Running celery worker + beat in the same container

My Flask app is composed of four containers: web app, Postgres, RabbitMQ and Celery. Since I have Celery tasks that run periodically, I am using celery beat. I've configured my docker-compose file like this:
version: '2'
services:
  rabbit:
    # ...
  web:
    # ...
  postgres:
    # ...
  celery:
    build:
      context: .
      dockerfile: Dockerfile.celery
And my Dockerfile.celery looks like this:
# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-B", "-l", "INFO"]
While I read in the docs that I shouldn't go to production with the -B option, I hastily added it anyway (and forgot about changing it) and quickly learned that my scheduled tasks were running multiple times. For those interested, if you do a ps aux | grep celery from within your celery container, you'll see multiple celery + beat processes running (but there should only be one beat process and however many worker processes). I wasn't sure from the docs why you shouldn't run -B in production but now I know.
So then I changed my Dockerfile.celery to:
# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-l", "INFO"]
CMD ["celery", "-A", "app.tasks.celery", "beat", "-l", "INFO"]
Now when I start my app, the worker processes start but beat does not. When I flip those commands around so that beat is called first, beat starts but the worker processes do not. So my question is: how do I run celery worker + beat together in my container? I have combed through many articles/docs but I'm still unable to figure this out.
EDITED
I changed my Dockerfile.celery to the following:
ENTRYPOINT [ "/bin/sh" ]
CMD [ "./docker.celery.sh" ]
And my docker.celery.sh file looks like this:
#!/bin/sh -ex
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &
However, I'm receiving the error celery_1 exited with code 0
Edit #2
I added the following blocking command to the end of my docker.celery.sh file and all was fixed:
tail -f /dev/null
Docker runs only one CMD, so only one of the two CMD instructions takes effect; the workaround is to create a bash script that executes both worker and beat and use the Docker CMD to execute this script.
I got it working by putting it in the entrypoint as explained above, plus I added &> to send the output to a log file.
my entrypoint.sh
#!/bin/bash
python3 manage.py migrate
python3 manage.py migrate catalog --database=catalog
python manage.py collectstatic --clear --noinput --verbosity 0
# Start Celery Workers
celery worker --workdir /app --app dri -l info &> /log/celery.log &
# Start Celery Beat
celery beat --workdir /app --app dri -l info &> /log/celery_beat.log &
python3 manage.py runserver 0.0.0.0:8000
Starting from the same concept @shahaf highlighted, I solved it starting from this other solution, using bash -c in this way:
command: bash -c "celery -A app.tasks.celery beat & celery -A app.tasks.celery worker --loglevel=debug"
You can use celery beatX for beat. It is allowed (and recommended) to have multiple beatX instances; they use locks to synchronize.
I cannot say if it is production-ready, but it works for me like a charm (with the -B option).
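As an aside, the app.tasks.celery instance referenced in these commands has to define both the tasks and the beat schedule. A rough, hypothetical sketch (module path, broker URL, task and schedule are placeholders, not taken from the question) could be:
# app/tasks.py -- hypothetical module matching the -A app.tasks.celery argument
from celery import Celery
from celery.schedules import crontab

celery = Celery("app.tasks", broker="amqp://rabbit//")  # "rabbit" = compose service name

@celery.task
def cleanup():
    # placeholder periodic job
    pass

celery.conf.CELERYBEAT_SCHEDULE = {
    "cleanup-every-five-minutes": {
        "task": "app.tasks.cleanup",
        "schedule": crontab(minute="*/5"),
    },
}
With a single dedicated beat process (as in the entrypoint script above) or a single worker started with -B, this schedule runs once; with multiple -B workers, each one would run it, which is exactly the duplicate-execution problem described in the question.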

Where does supervisor keep a process's pidfile?

Error explanation:
I have a django-celery project and use supervisor to keep the celery process running.
After a lot of activity it produced an error and now I cannot start a worker. It says:
stale pidfile exists. Removing it.
But I did not set a pidfile path when I configured supervisor.
Question
Where does supervisor keep the process's pidfile by default?
Could someone also tell me the right commands to run so that I can see the tasks and workers in the django admin site? I use these when developing the project:
python manage.py runserver 0.0.0.0:8090
python manage.py celery events --camera=djcelery.snapshot.Camera
python manage.py celerybeat -l INFO
python manage.py celeryd -n worker_1 -l INFO
But when I run them like this under supervisor, with nginx+uwsgi, I see nothing in the django admin site.

How to pip install a celery task module

I have a Django REST API set up on one machine (currently in testing on my local machine, but it will be on a web server eventually). Let's call this machine "client". I also have a computing server to run CPU-intensive tasks that require a long execution time. Let's call this machine "run-server".
"run-server" runs a celery worker connected to a local rabbitmq server. The worker code currently lives in a git repository with this structure:
proj/
    client.py
    cmd.sh
    requirements.txt
    tasks.py
The whole thing runs in a virtualenv, for what it's worth. cmd.sh basically executes celery multi start workername -A tasks -l info on "run-server". client.py is a CLI script that can submit a task to "run-server" manually from the shell on any machine (i.e. the "client").
I want to run the equivalent of the client script from a Django setup without having to copy the tasks.py and client.py code into the Django repository. Ideally I would pip install proj from the Django code and import proj to use it just like the client script does.
How can I package proj to achieve that?
I am used to packaging my own Python modules with a structure roughly like:
proj/
    bin/
        proj
    proj/
        __init__.py
        __main__.py
        script.py
    setup.py
    requirements.txt
I managed to make it work on my own. The structure above just works. Instead of celery multi start workername -A tasks -l info, you simply use celery multi start workername -A proj.tasks -l info and everything works. The same version of the module has to be installed on both the Django side and the worker, because the job queueing is done via duck typing (i.e. the module path and task names must match).
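For reference, a minimal setup.py for that layout could look roughly like this (name, version and entry point are placeholders, not taken from the question):
# setup.py -- illustrative sketch only
from setuptools import setup, find_packages

setup(
    name="proj",
    version="0.1.0",
    packages=find_packages(),       # picks up the inner proj/ package
    install_requires=["celery"],    # plus whatever requirements.txt pins
    entry_points={
        "console_scripts": [
            # hypothetical entry point standing in for bin/proj
            "proj = proj.__main__:main",
        ],
    },
)
After pip install proj (or pip install git+<repo-url>) on the Django side, the tasks can be imported as proj.tasks and submitted with .delay()/.apply_async(), exactly as described above, as long as the broker settings match the worker's.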

Celery: Start Worker Automatically (on boot)

I have tasks (for Celery) defined in /var/tasks/tasks.py.
I have a virtualenv at /var/tasks/venv which should be used to run /var/tasks/tasks.py.
I can manually start a worker to process tasks like this:
cd /var/tasks
. venv/bin/activate
celery worker -A tasks -Q queue_1
Now, I want to daemonize this.
I copied the init.d script from GitHub and am using the following config file in /etc/default/celeryd:
# name(s) of nodes to start
CELERYD_NODES="worker1"
# absolute or relative path to celery binary
CELERY_BIN="/var/tasks/venv/bin/celery"
# app instance
CELERY_APP="tasks"
# change to directory on upstart
CELERYD_CHDIR="/var/tasks"
# options
CELERYD_OPTS="-Q queue_1 --concurrency=8"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# unprivileged user/group
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# create pid and log directories, if missing
CELERY_CREATE_DIRS=1
When I start the service (via the init.d script), it says:
celery init v10.1.
Using config script: /etc/default/celeryd
But, it does not process any tasks from the queue, nor is there anything in the log file.
What am I doing wrong?
Supervisor might be a good option, but if you want to use the Celery init.d script I recommend copying it from the Celery GitHub source.
sudo vim /etc/init.d/celeryd
Copy the code from https://github.com/celery/celery/blob/master/extra/generic-init.d/celeryd into the file. See the daemonizing tutorial for details.
sudo chmod 755 /etc/init.d/celeryd
sudo chown root:root /etc/init.d/celeryd
sudo nano /etc/default/celeryd
Copy and paste the config below and change it accordingly:
#Where your Celery is present
CELERY_BIN="/home/shivam/Desktop/deploy/bin/celery"
# App instance to use
CELERY_APP="app.celery"
# Where to chdir at start
CELERYD_CHDIR="/home/shivam/Desktop/Project/demo/"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# A user/group combination that already exists (e.g., nobody).
CELERYD_USER="shivam"
CELERYD_GROUP="shivam"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
export SECRET_KEY="foobar"
Save and exit
sudo /etc/init.d/celeryd start
sudo /etc/init.d/celeryd status
This will auto-start Celery on boot:
sudo update-rc.d celeryd defaults
In case you use systemd, you should enable a celery service. It will activate your celery daemon on boot.
sudo systemctl enable yourcelery.service
I ended up using Supervisor and a script at /etc/supervisor/conf.d/celery.conf similar to this:
https://github.com/celery/celery/blob/3.1/extra/supervisord/celeryd.conf
This handles daemonization, among other things, quite well and automatically.
