How to configure Celery on Amazon Linux 2?

I tried several ways to configure a Celery worker service to run on Beanstalk with Amazon Linux 2. I followed this tutorial, but it doesn't work for me; I get the following error:
Failed at step EXEC spawning /var/app/current:$PYTHONPATH/celery: No such file or directory
This is my 01_create_service_celery.sh:
#!/usr/bin/env bash
# Create the celery systemd service file
echo "[Unit]
Description=Celery service for __
After=network.target
[Service]
Type=simple
Restart=always
RestartSec=30
User=root
WorkingDirectory=/var/app/current
ExecStart=$PYTHONPATH/celery -A notifications worker --loglevel=INFO
ExecReload=$PYTHONPATH/celery -A notifications worker --loglevel=INFO
EnvironmentFile=/opt/elasticbeanstalk/deployment/env
[Install]
WantedBy=multi-user.target
" | tee /etc/systemd/system/celery.service
# Start celery service
systemctl start celery.service
# Enable celery service to load on system start
systemctl enable celery.service
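For what it's worth, this error usually means that $PYTHONPATH does not point at a directory containing the celery executable (if it is set at all, it points at site-packages), so systemd cannot spawn $PYTHONPATH/celery. On Elastic Beanstalk's Amazon Linux 2 platform the application virtualenv normally lives under /var/app/venv/*/bin. Below is a minimal sketch of the hook under that assumption, reusing the notifications app from the question; it resolves the binary path at deploy time instead of relying on the environment variable:
#!/usr/bin/env bash
# Sketch of a fix, assuming the standard Amazon Linux 2 Beanstalk layout:
# resolve the celery binary inside the platform virtualenv at deploy time.
CELERY_BIN=$(find /var/app/venv/*/bin/celery | head -n 1)

echo "[Unit]
Description=Celery worker
After=network.target
[Service]
Type=simple
Restart=always
RestartSec=30
User=root
WorkingDirectory=/var/app/current
ExecStart=${CELERY_BIN} -A notifications worker --loglevel=INFO
EnvironmentFile=/opt/elasticbeanstalk/deployment/env
[Install]
WantedBy=multi-user.target
" | tee /etc/systemd/system/celery.service

# Reload unit files before (re)starting, since the unit was just rewritten
systemctl daemon-reload
systemctl enable celery.service
systemctl restart celery.service
The daemon-reload step matters on redeploys, because the unit file is regenerated on every deployment.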

Related

Why does the "celery" queue appear in SQS when deploying to Beanstalk?

Every time I deploy to Beanstalk, I run celery workers, each of which is configured for a specific queue, and logs are also configured for each worker separately.
The question is: where does this queue come from, and why does everything work? Why do workers that are configured for other queues successfully process tasks from this one?
Below are some examples of my celery services.
echo "[Unit]
Description=Celery
After=network.target
[Service]
Type=simple
Restart=always
RestartSec=1
User=root
EnvironmentFile=/opt/elasticbeanstalk/deployment/env
WorkingDirectory=/var/app/current
ExecStart=$PYTHONPATH/celery -A MyApp worker -Q default-dev -n default-worker \
--logfile=/var/log/celery/path-to-log --loglevel=DEBUG --concurrency=1
[Install]
WantedBy=multi-user.target
" | tee /etc/systemd/system/celery-default.service
## Service 'celery-broadcats' ('broadcasts-dev' queue)
echo "[Unit]
Description=Celery
After=network.target
[Service]
Type=simple
Restart=always
RestartSec=1
User=root
EnvironmentFile=/opt/elasticbeanstalk/deployment/env
WorkingDirectory=/var/app/current
ExecStart=$PYTHONPATH/celery -A MyApp worker -Q broadcasts-dev -n broadcast-worker \
--logfile=/var/log/celery/path-to-logs --loglevel=DEBUG --concurrency=1
[Install]
WantedBy=multi-user.target
" | tee /etc/systemd/system/celery-broadcats.service

Running Celery as a Flask app with Gunicorn

I'm running Celery as a Flask microservice, where tasks.py holds the tasks and manage.py contains the call to run the Flask server.
This is part of the manage.py
class CeleryWorker(Command):
    """Starts the celery worker."""
    name = 'celery'
    capture_all_args = True

    def run(self, argv):
        if "down" in argv:
            ret = subprocess.call(
                ['pkill', '-9', '-f', "my_app.celery"])
            sys.exit(ret)
        else:
            ret = subprocess.call(
                ['celery', 'worker', '-A', 'my_app.celery'] + argv)
            sys.exit(ret)

manager.add_command("celery", CeleryWorker())
I can start the service with either python manage.py runserver or celery worker -A my_app.celery, and it runs perfectly and registers all tasks in tasks.py.
But in production, I want to handle multiple requests to this microservice and want to add gunicorn to serve those requests. How do I do it?
I'm not able to figure out how I can run both my gunicorn command and my celery command together.
Also, I'm running other API services using gunicorn in production from their create_app, since I don't need them to run the celery command.
I recommend using Supervisor, which allows you to control a number of processes.
Step 1: pip install supervisor
Step 2: vi supervisord.conf
[program:flask_wsgi]
command=gunicorn -w 3 --worker-class gevent wsgi:app
directory=$SRC_PATH  ; replace $SRC_PATH with your project path (supervisord does not expand shell variables)
autostart=true
[program:celery]
command=celery worker -A app.celery --loglevel=info
directory=$SRC_PATH  ; replace with your project path
autostart=true
Step 3: run supervisord -c supervisord.conf
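To verify that both programs came up, something like the following should work, assuming the supervisord.conf above (note that supervisorctl needs the [unix_http_server], [supervisorctl], and [rpcinterface:supervisor] sections shown later on this page in order to connect):
supervisorctl -c supervisord.conf status
# expected output, roughly:
# celery        RUNNING   pid 4242, uptime 0:00:30
# flask_wsgi    RUNNING   pid 4241, uptime 0:00:30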

Restart celery beat and worker during Django deployment

I am using celery==4.1.0 and django-celery-beat==1.1.0.
I am running gunicorn + celery + rabbitmq with Django.
This is my config for starting beat and the worker:
celery -A myproject beat -l info -f /var/log/celery/celery.log --detach
celery -A myproject worker -l info -f /var/log/celery/celery.log --detach
During Django deployment I am doing the following:
rm -f celerybeat.pid
rm -f celeryd.pid
celery -A myproject beat -l info -f /var/log/celery/celery.log --detach
celery -A myproject worker -l info -f /var/log/celery/celery.log --detach
service nginx restart
service gunicorn stop
sleep 1
service gunicorn start
I want to restart both celery beat and the worker, and it seems that this logic works. But I noticed that celery starts to use more and more memory during deployment, and after several deployments I hit 100% memory use. I tried different server setups and it seems that it is not related to the server.
RabbitMQ may be to blame for the high memory usage. Can you safely restart rabbit?
Also, can you confirm that after a restart there is the expected number of workers?
You are starting 2 new workers for every deployment without stopping/killing the previous workers.
During deployment, stop the existing workers with
kill -9 $PID
kill -9 `cat /var/run/myProcess.pid`
Alternatively, you can just kill all the workers with
pkill -9 celery
Now you can start workers as usual.
celery -A myproject beat -l info -f /var/log/celery/celery.log --detach
celery -A myproject worker -l info -f /var/log/celery/celery.log --detach
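A gentler variant of the same idea, if you run beat and the worker with pidfiles, is to send SIGTERM, which triggers Celery's warm shutdown and lets in-flight tasks finish, and to start the new processes only once the old ones are gone. The pidfile paths below are illustrative, not from the question:
# Warm shutdown: TERM lets the worker finish its current tasks
kill -TERM "$(cat /var/run/celery/beat.pid)" 2>/dev/null || true
kill -TERM "$(cat /var/run/celery/worker.pid)" 2>/dev/null || true
# Wait until the old processes have exited before starting new ones
while pgrep -f "celery -A myproject" > /dev/null; do sleep 1; done
celery -A myproject beat -l info -f /var/log/celery/celery.log --detach --pidfile=/var/run/celery/beat.pid
celery -A myproject worker -l info -f /var/log/celery/celery.log --detach --pidfile=/var/run/celery/worker.pid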

Celery-Supervisor: How to restart a supervisor job so that newly updated celery tasks work?

I have a running supervisor job for my celery server. Now I need to add a new task to it, but unfortunately my celery server command is not configured to track those dynamic changes automatically.
Here is my celery command:
python manage.py celery worker --broker=amqp://username:password@localhost/our_app_vhost
To restart my celery process, I have tried:
sudo supervisorctl -c /etc/supervisor/supervisord.conf restart <process_name>
supervisorctl stop all
supervisorctl start all
service supervisor restart
But nothing worked. How do I restart it?
If you want to manage processes with supervisorctl, you should configure the supervisorctl and rpcinterface sections in your configuration file.
Here is a sample configuration file.
sample.conf
[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
[program:my_worker]
command = python manage.py celery worker --broker=amqp://username:password@localhost/our_app_vhost
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
Now start supervisor with
supervisord -c sample.conf
Now if you want to restart your worker you can do it with
supervisorctl -c sample.conf restart my_worker
This restarts your worker. Alternatively, you can drop into the supervisor shell and restart it from there:
sudo supervisorctl -c sample.conf
supervisor> restart my_worker
my_worker: stopped
my_worker: started
Note:
There is an option to autoreload workers in Celery
python manage.py celery worker --autoreload --broker=amqp://username:password@localhost/our_app_vhost
This should be used in development mode only. Using this in production is not recommended.
More about this in the Celery docs.
You can put your celery program config in /etc/supervisor/conf.d/. Create a new config file for celery, such as celery.conf.
Assuming your virtualenv is venv, your Django project is sample, and your celery script is in _celery.py, the file should look like this:
[program:celery]
command=/home/ubuntu/.virtualenvs/venv/bin/celery --app=sample._celery:app worker --loglevel=INFO
directory=/home/ubuntu/sample/
user=ubuntu
numprocs=1
stdout_logfile=/home/ubuntu/logs/celery-worker.log
stderr_logfile=/home/ubuntu/logs/celery-error.log
autostart=true
autorestart=true
startsecs=10
; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600
; When resorting to send SIGKILL to the program to terminate it
; send SIGKILL to its whole process group instead,
; taking care of its children as well.
killasgroup=true
; if rabbitmq is supervised, set its priority higher
; so it starts first
priority=998
After writing this supervisor program config, you need to load it into supervisor.
If you added a new program, run this to re-read the config:
$ sudo supervisorctl reread
celery: available
If you added or updated programs, run this to apply the changes:
$ sudo supervisorctl update
celery: added process group
To check the status of your celery task
$ sudo supervisorctl status celery
celery RUNNING pid 18020, uptime 0:00:50
To stop the celery task
$ sudo supervisorctl stop celery
celery: stopped
To start the celery task
$ sudo supervisorctl start celery
celery: started
To restart the celery task (this would stop and again start the specified task)
$ sudo supervisorctl restart celery
celery: stopped
celery: started
If tasks are running, restarting celery waits for them to complete, so if you cannot wait you need to kill all running celery processes first.
Run the following command to kill all celery processes:
kill -9 $(ps aux | grep celery | grep -v grep | awk '{print $2}' | tr '\n' ' ') > /dev/null 2>&1
Restart celery:
sudo supervisorctl stop all
sudo supervisorctl start all
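Note that a plain supervisorctl restart is already fairly graceful: supervisor sends SIGTERM and waits up to stopwaitsecs (600 seconds in the config above) for running tasks to finish, so the kill -9 route is only needed when you cannot wait:
# Graceful restart: SIGTERM, then wait up to stopwaitsecs for running tasks
sudo supervisorctl restart celery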

Upstart script for Celery

I have celeryd daemons working on small tasks. The daemon was configured with this Upstart script:
start on starting cessna
stop on stopping cessna
respawn
script
chdir /home/ubuntu/projects/cessna
exec su -c 'cd /home/ubuntu/projects/cessna; export MAX_POOL_SIZE="50"; export newrelic-admin run-program celeryd -A cessna.celeryconfig --loglevel=info --concurrency=50 --pool=eventlet --queue=cessna_celery -E --pidfile=/tmp/cessna-3.pid >> /home/ubuntu/logs/cessna-worker-3.log 2>> /home/ubuntu/errs/cessna-worker-3.log';
end script
Not long ago I saw a lot of unacked tasks in RabbitMQ, with no crashes in the log files, etc. We moved to the native /etc/init.d/celeryd daemon, and that solved the problem.
So how could this be? Is there any relation between starting Celery with Upstart and unacknowledged tasks in Celery?
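When unacked tasks pile up like this, a useful first check is to watch the per-queue counters in RabbitMQ while restarting the daemon, for example:
# Show ready vs. unacknowledged message counts per queue
rabbitmqctl list_queues name messages_ready messages_unacknowledged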
