I want to run some tasks in Django, and I'm using Celery to do this. Usually, I run these commands to execute the tasks:
source myvirtualenvpath/bin/activate
nohup python manage.py celeryd -E -B --loglevel=DEBUG < /dev/null &>/dev/null &
I want to do this each time the machine reboots, using a crontab. How can I do this?
Thanks
I solved it by running this:
nohup path_to_virtual_env/bin/python path_to_project/manage.py celeryd -E -B --loglevel=DEBUG < /dev/null &>/dev/null &
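For completeness, the matching crontab entry might look like this (a sketch reusing the placeholder paths above; @reboot is supported by Vixie cron/cronie). Note that cron runs jobs with /bin/sh, where bash's &> redirection doesn't work, so the portable redirection form is used instead:
@reboot nohup path_to_virtual_env/bin/python path_to_project/manage.py celeryd -E -B --loglevel=DEBUG < /dev/null > /dev/null 2>&1 &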
My Flask app is composed of four containers: web app, Postgres, RabbitMQ, and Celery. Since I have Celery tasks that run periodically, I am using Celery beat. I've configured my docker-compose file like this:
version: '2'
services:
  rabbit:
    # ...
  web:
    # ...
  postgres:
    # ...
  celery:
    build:
      context: .
      dockerfile: Dockerfile.celery
And my Dockerfile.celery looks like this:
# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-B", "-l", "INFO"]
While I read in the docs that I shouldn't go to production with the -B option, I hastily added it anyway (and forgot about changing it) and quickly learned that my scheduled tasks were running multiple times. For those interested, if you do a ps aux | grep celery from within your celery container, you'll see multiple celery + beat processes running (but there should only be one beat process and however many worker processes). I wasn't sure from the docs why you shouldn't run -B in production but now I know.
So then I changed my Dockerfile.celery to:
# ...code up here...
CMD ["celery", "-A", "app.tasks.celery", "worker", "-l", "INFO"]
CMD ["celery", "-A", "app.tasks.celery", "beat", "-l", "INFO"]
Now when I start my app, the worker processes start but beat does not. When I flip those commands around so that beat is called first, beat starts but the worker processes do not. So my question is: how do I run celery worker + beat together in my container? I have combed through many articles and docs but I'm still unable to figure this out.
EDITED
I changed my Dockerfile.celery to the following:
ENTRYPOINT [ "/bin/sh" ]
CMD [ "./docker.celery.sh" ]
And my docker.celery.sh file looks like this:
#!/bin/sh -ex
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &
However, I'm receiving the error celery_1 exited with code 0
Edit #2
I added the following blocking command to the end of my docker.celery.sh file and all was fixed (since both celery commands are backgrounded with &, the script was exiting immediately and taking the container down with it):
tail -f /dev/null
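For reference, the complete working script, assembled from the snippets above:
#!/bin/sh -ex
# start beat and the worker in the background
celery -A app.tasks.celery beat -l debug &
celery -A app.tasks.celery worker -l info &
# block forever so the container's main process never exits
tail -f /dev/null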
Docker runs only one CMD, so only one of the two CMDs gets executed. The workaround is to create a shell script that starts both the worker and beat, and use the Docker CMD to execute this script.
I got it working by putting the commands in an entrypoint script as explained above, plus I added &> redirections to send the output to log files.
my entrypoint.sh
#!/bin/bash
python3 manage.py migrate
python3 manage.py migrate catalog --database=catalog
python3 manage.py collectstatic --clear --noinput --verbosity 0
# Start Celery Workers
celery worker --workdir /app --app dri -l info &> /log/celery.log &
# Start Celery Beat (a dedicated beat process; the original used a second
# worker with --beat, which also starts a duplicate worker)
celery beat --workdir /app --app dri -l info &> /log/celery_beat.log &
python3 manage.py runserver 0.0.0.0:8000
Note that runserver is not backgrounded: it runs in the foreground as the container's main process, which is why this script needs no blocking command like tail -f /dev/null at the end.
Starting from the same concept @shahaf highlighted, I solved it starting from this other solution, using bash -c in this way:
command: bash -c "celery -A app.tasks.celery beat & celery -A app.tasks.celery worker --loglevel=debug"
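In docker-compose terms, that might look like the following sketch, reusing the celery service definition from the question:
celery:
  build:
    context: .
    dockerfile: Dockerfile.celery
  command: bash -c "celery -A app.tasks.celery beat & celery -A app.tasks.celery worker --loglevel=debug"
Note that only beat is backgrounded: the worker stays in the foreground as the container's main process, so no extra blocking command is needed.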
You can use celery-beatX for beat. It is allowed (and recommended) to have multiple beatX instances; they use locks to synchronize.
I cannot say whether it is production-ready, but it works for me like a charm (with the -B option).
I am debugging an issue where every scheduled task is run twice. I saw two processes named celery. Is it normal for two celery tasks to be running?
$ ps -ef | grep celery
hgarg 303 32764 0 17:24 ? 00:00:00 /home/hgarg/.pythonbrew/venvs/Python-2.7.3/hgarg_env/bin/python /data/hgarg/current/manage.py celeryd -B -s celery -E --scheduler=djcelery.schedulers.DatabaseScheduler -P eventlet -c 1000 -f /var/log/celery/celeryd.log -l INFO --pidfile=/var/run/celery/celeryd.pid --verbosity=1 --settings=settings
hgarg 307 21179 0 17:24 pts/1 00:00:00 grep celery
hgarg 32764 1 4 17:24 ? 00:00:00 /home/hgarg/.pythonbrew/venvs/Python-2.7.3/hgarg_env/bin/python /data/hgarg/current/manage.py celeryd -B -s celery -E --scheduler=djcelery.schedulers.DatabaseScheduler -P eventlet -c 1000 -f /var/log/celery/celeryd.log -l INFO --pidfile=/var/run/celery/celeryd.pid --verbosity=1 --settings=settings
There were two pairs of Celery processes, the older of which shouldn't have been running. Killing them all and restarting Celery seems to have fixed it. Since there were no other recent changes, it's unlikely that anything else caused it.
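If you hit the same situation, something like this should clear out the stray processes (a sketch; it assumes nothing else on the machine matches 'celeryd'):
# kill every process whose command line matches 'celeryd', then restart celery as usual
pkill -f celeryd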
I have a running supervisor job for my celery server. Now I need to add a new task to it, but unfortunately my celery server command is not configured to track those dynamic changes automatically.
Here is my celery command:
python manage.py celery worker --broker=amqp://username:password@localhost/our_app_vhost
To restart my celery process, I have tried:
sudo supervisorctl -c /etc/supervisor/supervisord.conf restart <process_name>
supervisorctl stop all
supervisorctl start all
service supervisor restart
But none of them worked. How can I restart it?
If you want to manage processes with supervisorctl, you should configure the [unix_http_server], [supervisorctl], and [rpcinterface:supervisor] sections in your configuration file.
Here is a sample configuration file.
sample.conf
[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
[program:my_worker]
command = python manage.py celery worker --broker=amqp://username:password@localhost/our_app_vhost
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
Now start supervisor with
supervisord -c sample.conf
Now if you want to restart your worker you can do it with
supervisorctl -c sample.conf restart my_worker
This restarts your worker. Alternatively, you can drop into the supervisor shell and restart it from there:
sudo supervisorctl -c sample.conf
supervisor> restart my_worker
my_worker: stopped
my_worker: started
Note:
There is an option to autoreload workers in Celery
python manage.py celery worker --autoreload --broker=amqp://username:password@localhost/our_app_vhost
This should be used in development mode only. Using this in production is not recommended.
More about this in the Celery docs.
You can put your Celery program config in /etc/supervisor/conf.d/. Create a new config file for Celery, e.g. celery.conf.
Assuming your virtualenv is venv, your Django project is sample, and your Celery app is defined in _celery.py, the file should look like this:
[program:celery]
command=/home/ubuntu/.virtualenvs/venv/bin/celery --app=sample._celery:app worker --loglevel=INFO
directory=/home/ubuntu/sample/
user=ubuntu
numprocs=1
stdout_logfile=/home/ubuntu/logs/celery-worker.log
stderr_logfile=/home/ubuntu/logs/celery-error.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
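For reference, a minimal _celery.py matching the --app=sample._celery:app path above might look like this (a sketch; the broker URL is a placeholder):
# sample/_celery.py
from celery import Celery

# 'app' is the attribute referenced by --app=sample._celery:app
app = Celery('sample', broker='amqp://guest:guest@localhost//')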
After writing this program config you need to tell supervisor about it.
If you added a new program config, run this to detect it:
$ sudo supervisorctl reread
celery: available
To apply the added or updated program config, run this:
$ sudo supervisorctl update
celery: added process group
To check the status of your celery task
$ sudo supervisorctl status celery
celery RUNNING pid 18020, uptime 0:00:50
To stop the celery task
$ sudo supervisorctl stop celery
celery: stopped
To start the celery task
$ sudo supervisorctl start celery
celery: started
To restart the celery task (this stops and then starts the specified program):
$ sudo supervisorctl restart celery
celery: stopped
celery: started
If tasks are running when you restart, Celery will wait for them to complete. If you don't want to wait, you need to kill all running Celery processes.
Run the following command to kill all Celery processes:
kill -9 $(ps aux | grep celery | grep -v grep | awk '{print $2}' | tr '\n' ' ') > /dev/null 2>&1
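A shorter equivalent, assuming pkill is available (-f matches against the full command line, so this likewise assumes nothing else on the machine matches 'celery'):
pkill -9 -f celery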
Restart celery:
sudo supervisorctl stop all
sudo supervisorctl start all
I use django-celery:
python manage.py celeryd -B
[2013-05-01 23:42:58,583: WARNING/MainProcess] celery@aaa ready.
How do I run it in the background?
python manage.py celeryd -B --detach
You can refer to: http://docs.celeryproject.org/en/latest/reference/celery.bin.celeryev.html?highlight=detach#cmdoption-celery-events--detach
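When detaching, it can be worth passing a log file and pid file as well, so output isn't lost and the process is easy to stop later (a sketch; the paths are placeholders, and the flags are those of the old django-celery CLI shown elsewhere on this page):
python manage.py celeryd -B --detach --logfile=/var/log/celery/worker.log --pidfile=/var/run/celery/worker.pid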
I'm working on a Django website where I have various compilation programs that need to run (Compass/Sass, coffeescript, hamlpy), so I made this shell script for convenience:
#!/bin/bash
SITE=/home/dev/sites/rmx
echo "RMX using siteroot=$SITE"
$SITE/rmx/manage.py runserver &
PIDS[0]=$!
compass watch $SITE/media/compass/ &
PIDS[1]=$!
coffee -o $SITE/media/js -cw $SITE/media/coffee &
PIDS[2]=$!
hamlpy-watcher $SITE/templates/hamlpy $SITE/templates/templates &
PIDS[3]=$!
trap "echo PIDS: ${PIDS[*]} && kill ${PIDS[*]}" SIGINT
wait
Everything except the Django server shuts down nicely on Ctrl+C, because the PID of the server process isn't the PID of the python manage.py runserver command. This means that every time I stop the script, I have to find the running server's PID and shut it down manually.
Here's an example:
$> ./compile.sh
RMX using siteroot....
...
[ctrl+c]
PIDS: 29725 29726 29728 29729
$> ps -A | grep python
29732 pts/2 00:00:00 python
The first PID, 29725, is the initial python manage.py runserver call, but 29732 is the actual dev server process.
Edit: Looks like this is due to Django's auto-reload feature, which can be disabled with the --noreload flag. Since I'd like to keep auto-reload, the question now becomes how to kill the child processes from the bash script. I would have thought killing the initial python runserver command would do it...
SOLVED
Thanks to this SO question, I've changed my script to this:
#!/bin/bash
SITE=/home/dev/sites/rmx
echo "RMX using siteroot=$SITE"
$SITE/rmx/manage.py runserver &
compass watch $SITE/media/compass/ &
coffee -o $SITE/media/js -cw $SITE/media/coffee &
hamlpy-watcher $SITE/templates/hamlpy $SITE/templates/templates &
trap "kill -TERM -$$" SIGINT
wait
A PID preceded by a dash makes kill operate on the whole process group, and $$ refers to the PID of the bash script itself.
Thanks for the help, me!
No problem, self, and hey -- you're awesome.
You can execute this to kill the processes and servers listening on a given port; substitute your PORT number:
$ netstat -tulpn | grep PORT | awk '{print $7}' | cut -d/ -f 1 | xargs kill
OR
$ sudo lsof -i tcp:PORT
$ sudo lsof -i tcp:PORT | awk 'NR>1 {print $2}' | xargs kill
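A more compact equivalent, assuming your lsof supports the -t (terse) flag, which prints only PIDs:
$ sudo lsof -ti tcp:PORT | xargs kill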